-
Kirill Smelkov authored
With Python3 I've got tired to constantly use .encode() and .decode(); getting exception if original argument was unicode on e.g. b.decode(); getting exception on raw bytes that are invalid UTF-8, not being able to use bytes literal with non-ASCII characters, etc. So instead of this pain provide two functions that make sure an object is either bytes or unicode: - b converts str/unicode/bytes s to UTF-8 encoded bytestring. Bytes input is preserved as-is: b(bytes_input) == bytes_input Unicode input is UTF-8 encoded. The encoding always succeeds. b is reverse operation to u - the following invariant is always true: b(u(bytes_input)) == bytes_input - u converts str/unicode/bytes s to unicode string. Unicode input is preserved as-is: u(unicode_input) == unicode_input Bytes input is UTF-8 decoded. The decoding always succeeds and input information is not lost: non-valid UTF-8 bytes are decoded into surrogate codes ranging from U+DC80 to U+DCFF. u is...
bcb95cd5