Using demjson with Python 3
===========================

Starting with release 2.0, demjson and jsonlint can support either
Python 2 or Python 3, though they must be installed as described below
to work with Python 3.

Be aware that the API behaves slightly differently in Python 3, mainly
because it uses the 'bytes' type in a few places.


Installing for Python 3
=======================

The source for the demjson module is written for Python 2.x. However,
since release 2.0, it has been designed to be converted to an
equivalent Python 3 form by the standard "2to3" Python conversion
utility.

If you install demjson with a standard PyPI package distribution
mechanism, such as pip, easy_install, or just typing
"python3 setup.py install", then the 2to3 conversion is performed
automatically as part of the installation process.

Running self-tests: if you install using a PyPI distribution mechanism
then the test program "test/test_demjson.py" may either not be
installed or may not have been converted to Python 3. If needed, you
can convert and run it manually from the source distribution:

    cd test
    2to3 -w test_demjson.py
    PYTHONPATH=.. python3 test_demjson.py


Bytes versus strings
====================

When calling demjson functions and classes from a Python 3
environment, be aware that there are a few differences from what is
documented for Python 2. Most of these differences involve Python's
byte-oriented types ('bytes', 'bytearray', and 'memoryview').

Decoding JSON into Python values
--------------------------------

When you decode a JSON document you can pass either a string or a
byte-oriented type.

If you pass a string, it is assumed to already be a sequence of
Unicode characters, so demjson's own Unicode decoding step is skipped.

When you pass a byte-oriented type, the decode() function will attempt
to detect the Unicode encoding and convert the bytes into a Unicode
string first.
You can override the guessed encoding by specifying the appropriate
codec name or codec object.

For example, the following calls are equivalent:

    demjson.decode( '"\u2014"' )

    demjson.decode( b'"\xe2\x80\x94"' )

    demjson.decode( bytes([ 0x22, 0xE2, 0x80, 0x94, 0x22 ]) )

Notice that in the last two examples the decode() function has
automatically detected that the bytes were UTF-8 encoded. You can of
course pass in an 'encoding' argument to force a particular codec,
though if you get this wrong a UnicodeDecodeError may be raised.

Reading JSON from a file
------------------------

When reading from a file, the bytes it contains must be converted into
Unicode characters. If you want demjson to do that, be sure to open the
file in binary mode:

    json_data = open("myfile.json", "rb").read()
    # => json_data is a bytes object
    py_data = demjson.decode( json_data, encoding="utf8" )

But if you read the file in text mode, then the Unicode decoding is
done by Python's own IO core, and demjson will parse the
already-Unicode string without doing any decoding:

    json_data = open("myfile.json", "r", encoding="utf8").read()
    # => json_data is a (unicode) string
    py_data = demjson.decode( json_data )

Encoding Python values to JSON
------------------------------

When encoding a Python value into a JSON document, you will generally
get a string result (which is a sequence of Unicode characters).

However, if you specify a particular encoding, then you will instead
get a bytes object as a result.

    demjson.encode( "\u2012" )
    # => Returns a string of length 3

    demjson.encode( "\u2012", encoding="utf-8" )
    # => Returns 5 bytes b'"\xe2\x80\x92"'

Writing JSON to a file
----------------------

When generating JSON and writing it to a file, all the Unicode
characters must be encoded into bytes.
You can let demjson do that by specifying an encoding; just be sure
to open the output file in binary mode:

    json_data = demjson.encode( py_data, encoding="utf-8" )
    # json_data will be a bytes object
    open("myfile.json", "wb" ).write( json_data )

The above has the advantage that demjson can automatically adjust the
\u-escaping depending on the output encoding.

But if you don't ask for any encoding, you'll get the JSON output as a
Unicode string, in which case you need to open your output file in
text mode with a specific encoding. You must choose a suitable
encoding or you could get a UnicodeEncodeError.

    json_data = demjson.encode( py_data )
    # json_data will be a (unicode) string
    open("myfile.json", "w", encoding="utf-8" ).write( json_data )

Encoding byte types
-------------------

If you are encoding into JSON and the Python value you pass is, or
contains, a value of any byte-oriented type ('bytes', 'bytearray', or
'memoryview'), then the bytes must be converted into a different value
that can be represented in JSON.

The default is to convert bytes into an array of integers, each with
a value from 0 to 255 representing a single byte. For example:

    py_data = b'\x55\xff'

    demjson.encode( py_data )
    # Gives => '[85,255]'

You can supply a function to the 'encode_bytes' hook to change how
bytes get encoded.

    def to_hex( bytes_val ):
        return ":".join([ "%02x" % b for b in bytes_val ])

    demjson.encode( py_data, encode_bytes=to_hex )
    # Gives => '"55:ff"'

See the 'encode_bytes' hook description in HOOKS.txt for further details.


Other Python 3 specifics
========================

Data types
----------

When encoding JSON, most of the new data types introduced with
Python 3 will be encoded. Not only does this include the byte-oriented
types, but also Enum and ChainMap.

Chained exceptions
------------------

Any errors that are incidentally raised during JSON encoding or
decoding will be wrapped in a 'JSONError' (or subclass). In Python 3,
this wrapping uses the standard Exception Chaining (PEP 3134)
mechanism.

See the Exception Handling example in the file HOOKS.txt.
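
As a rough illustration of what exception chaining gives the caller,
the sketch below wraps a low-level error and then walks the resulting
chain through the '__cause__' attribute that PEP 3134 defines. It is
only a sketch under stated assumptions: the JSONError class here is a
hypothetical stand-in so the example runs without demjson installed,
parse_int() and describe_chain() are made-up helper names, and
demjson's own wrapping code may differ in detail.

```python
class JSONError(Exception):
    """Hypothetical stand-in for demjson's JSONError (assumption)."""


def parse_int(text):
    """Convert text to an int, wrapping any failure in JSONError."""
    try:
        return int(text)
    except ValueError as err:
        # "raise ... from ..." stores the original error in __cause__.
        raise JSONError("could not parse %r" % text) from err


def describe_chain(exc):
    """Return the exception class names in a __cause__ chain, outermost first."""
    names = []
    while exc is not None:
        names.append(type(exc).__name__)
        exc = exc.__cause__
    return names


try:
    parse_int("12x")
except JSONError as wrapped:
    chain = describe_chain(wrapped)
    print(" caused by ".join(chain))
    # prints: JSONError caused by ValueError
```

Catching the wrapper and reading '__cause__' lets calling code report
a single error type while still having the underlying error available
for logging or debugging.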