Using callback hooks in demjson =============================== Starting with demjson release 2.0 it is possible to hook into the encoding and decoding engine to transform values. There are a set of hooks available to use during JSON decoding and another set of hooks to use during encoding. The complete descriptions of all the available hooks appear at the end of this document, but in summary they are: Decoding Hooks Encoding Hooks -------------- -------------- decode_string encode_value decode_float encode_dict decode_number encode_dict_key decode_array encode_sequence decode_object encode_bytes encode_default Although hooks can be quite powerful, they are not necessarily suitable for every situation. You may need to perform some complex transformations outside of demjson. Simple example: decoding dates ------------------------------ Say you have some JSON document where all dates have been encoded as strings with the format "YYYY-MM-DD”. You could use a hook function to automatically convert those strings into a Python date object. import demjson, datetime, re def date_converter( s ): match = re.match( '^(\d{4})-(\d{2})-(\d{2})$', s ) if match: y, m, d = [int(n) for n in match.groups()] return datetime.date( y, m, d ) else: raise demjson.JSONSkipHook demjson.decode( '{"birthdate": "1994-01-17"}' ) # gives => {'birthdate': '1994-01-17'} demjson.decode( '{"birthdate": "1994-01-17"}', decode_string=date_converter ) # gives => {'birthdate': datetime.date(1994, 1, 17) } Simple example: encoding complex numbers ---------------------------------------- Say you are generating JSON but your Python object contains complex numbers. You could use an encoding hook to transform them into a 2-ary array. import demjson def complex_to_array( val ): if isinstance( val, complex ): return [ val.real, val.imag ] else: raise demjson.JSONSkipHook demjson.encode( {'x': complex('3+9j')} ) # raises JSONEncodeError demjson.encode( {'x': complex('3+9j')}, encode_value=complex_to_array ) # gives => '{"x": [3.0, 9.0]}' Defining and setting hooks ========================== A hook must be a callable function that takes a single argument, and either returns the same value, or some other value after a transformation. # A sample user-defined hook function def my_sort_array( arr ): arr.sort() return arr You can set hook functions either by calling the set_hook() method of the JSON class; or by passing an equivalent named argument to the encode() and decode() functions. If you are using the encode() or decode() functions, then your hooks are specified with named arguments: demjson.decode( '[3,1,2]', decode_array=my_sort_array ) # gives => [1, 2, 3] If you are using the JSON class directly you need to call the set_hook() method with both the name of the hook and your function: j = demjson.JSON() j.set_hook( 'decode_array', my_sort_array ) You can also clear a hook by specifying None as your function, or with the clear_hook() method. j.set_hook( 'decode_array', None ) j.clear_hook( 'decode_array' ) # equivalent And to clear all hooks: j.clear_all_hooks() Selective processing (Skipping) =============================== When you specify a hook function, that function will be called for every matching object. Many times though your hook may only want to perform a transformation on specific objects. A hook function may therefore indicate that it does not wish to perform a transformation on a case-by-case basis. To do this, any hook function may raise a JSONSkipHook exception if it does not wish to handle a particular invocation. This will have the effect of skipping the hook, for that particular value, as if the hook was net set. # another sample hook function def my_sum_arrays( arr ): try: return sum(arr) except TypeError: # Don't do anything raise demjson.JSONSkipHook If you just 'return None', you are actually telling demjson to convert the value into None or null. Though some hooks allow you to just return the same value as passed with the same effect, be aware that not all hooks do so. Therefore it is always recommended that you raise JSONSkipHook when intending to skip processing. Other important details when using hooks ======================================== Order of processing: Decoding hooks are generally called in a bottom-up order, whereas encoding hooks are called in a top-down order. This is discussed in more detail in the description of the hooks. Modifying values: The value that is passed to a hook function is not a deep copy. If it is mutable and you make changes to it, and then raise a JSONSkipHook exception, the original value may not be preserved as expected. Exception handling ================== If your hook function raises an exception, it will be caught and wrapped in a JSONDecodeHookError or JSONEncodeHookError. Those are correspondingly subclasses of the JSONDecodeError and JSONEncodeError; so your outermost code only needs to catch one exception type. When running in Python 3 the standard Exception Chaining (PEP 3134) mechanism is employed. Under Python 2 exception chaining is simulated, but a printed traceback of the original exception may not be printed. You can get to the original exception in the '__cause__' member of the outer exception. Consider the following example: def my_encode_fractions( val ): if 'numerator' in val and 'denominator' in val: return d['numerator'] / d['denominator'] # OOPS. DIVIDE BY ZERO else: raise demjson.JSONSkipHook demjson.encode( {'numerator':0, 'denominator':0}, encode_dict=my_encode_fractions ) For this example a divide by zero error will occur within the hook function. The exception that eventually gets propagated out of the encode() call will be something like: # ==== Exception printed Python 3: Traceback (most recent call last): ... File "example.py", line 3, in my_encode_fractions ZeroDivisionError: division by zero The above exception was the direct cause of the following exception: Traceback (most recent call last): ... File "..../demjson.py", line 9999, in call_hook demjson.JSONEncodeHookError: Hook encode_dict raised 'ZeroDivisionError' while encoding type List of decoding hooks ====================== Decoding hooks let you intercept values during JSON parsing and perform additional translations. You could for example recognize strings with particular patterns and convert them into specific Python types. Decoding hooks are not given raw JSON, but instead fundamental python values that closely correspond to the original JSON text. So you don't need to worry, for instance, about Unicode surrogate pairs or other such complexities. When dealing with nested JSON data structures, objects and arrays, the decoding hooks are called bottom-up: the innermost value first working outward, and then left to right. The available hooks are: decode_string ------------- Called for every JSON string literal with the Python-equivalent string value as an argument. Expects to get a Python object in return. Remember that the keys to JSON objects are also strings, and so your hook should typically either return another string or some type that is hashable. decode_float ------------ Called for every JSON number that looks like a float (has a "."). The string representation of the number is passed as an argument. Expects to get a Python object in return. decode_number ------------- Called for every JSON number. The string representation of the number is passed as an argument. Expects to get a Python object in return. NOTE: If the number looks like a float and the 'decode_float' hook is set, then this hook will not be called. Warning: If you are decoding in non-strict mode, then your number decoding hook may also encounter so-called non-numbers. You should be prepared to handle any of 'NaN', 'Infinity', '+Infinity', '-Infinity'. Or raise a 'JSONSkipHook' exception to let demjson handle those values. decode_array ------------ Called for every JSON array. A Python list is passed as the argument, and expects to get a Python object back. NOTE: this hook will get called for every array, even for nested arrays. decode_object ------------- Called for every JSON object. A Python dictionary is passed as the argument, and expects to get a Python object back. NOTE: this hook will get called for every object, even for nested objects. List of encoding hooks ====================== Encoding hooks are used during JSON generation to let you intercept python values in the input and perform translations on them prior to encoding. You could for example convert complex numbers into a 2-valued array or a custom class instance into a dictionary. Remember that these hooks are not expected to output raw JSON, but instead a fundamental Python type which demjson already knows how to properly encode. When dealing with nested data structures, such as with dictionaries and lists, the encoding hooks are called top-down: the outermost value first working inward, and then from left to right. This top-down order means that encoding hooks can be dangerous in that they can create infinite loops. The available hooks are: encode_value ------------ Called for every Python object which is to be encoded into JSON. This hook will get a chance to transform every value once, regardless of it's type. Most of the time this hook will probably raise a 'JSONSkipHook', and only selectively return a transformed value. If the hook function returns a value that is generally of a different kind, then the encoding process will run again. This means that your returned value is subject to all the various hook processing again. Therefore careless coding of this hook function can result in infinite loops. This hook will be called before hook 'encode_dict_key' for dictionary keys. encode_dict ----------- Called for every Python dictionary: any 'dict', 'UserDict', 'OrderedDict', or ChainMap; or any subclass of those.

It will also be called for anything that is dictionary-like in that it supports either of iterkeys() or keys() as well as __getitem__(), so be aware in your hook function that the object you receive may not support all the varied methods that the standard dict does. On recursion: if your hook function returns a dictionary, either the same one possibly modified or anything else that looks like a dictionary, then the result is immediately processed. This means that your hook will _not_ be re-invoked on the new dictionary. Though the contents of the returned dictionary will individually be subject to their own hook processing. However if your hook function returns any other kind of object other than a dictionary then the encoding for that new object starts over; which means that other hook functions may get invoked. encode_dict_key --------------- Called for every dictionary key. This allows you to transform non-string keys into strings. Note this will also be called even for keys that are already strings. This hook is expected to return a string value. However if running in non-strict mode, it may also return a number, as ECMAScript allows numbers to be used as object keys. This hook will be called after the 'encode_value' hook. encode_sequence --------------- Called for every Python sequence-like object that is not a dictionary or string. This includes all 'list' and 'tuple' types or their subtypes, or any other type that allows for basic iteration. encode_bytes ------------ PYTHON 3 ONLY. Called for every Python 'bytes' or 'bytearray' type. Additionally, memory views (type 'memoryview') with an “unsigned byte” format ('B') will converted to a normal 'bytes' type and passed to this hook as well. Memory view objects with different item formats are treated as ordinary sequences (lists of numbers). If this hook is not set then byte types are encoded as a sequence (e.g., a list of numbers), or according to the 'encode_sequence' hook. One useful example is to compress and Base-64 encode any bytes value: import demjson, bz2, base64 def my_bz2_and_base64_encoder( bytes_val ): return base64.b64encode( bz2.compress( bytes_val ) ).decode('ascii') data = open( "some_binary_file.dat", 'rb' ).read() # Returns bytes type demjson.encode( {'opaque': data}, encode_bytes=my_bz2_and_base64_encoder ) # gives => '{"opaque": "QLEALFP ... N8WJ/tA=="}' And if you replace 'b64encode' with 'b16encode' in the above, you will hexadecimal-encode byte arrays. encode_default -------------- Called for any Python type which can not otherwise be converted into JSON, even after applying any other encoding hooks. A very simple catch-all would be the built-in 'repr()' or 'str()' functions, as in: today = datetime.date.today() demjson.encode( {"today": today}, encode_default=repr ) # gives => '{"today": "datetime.date(2014, 4, 22)"}' demjson.encode( {"today": today}, encode_default=str ) # gives => '{"today": "2014-04-22"}'