Thursday, 1 May 2014

Saving and loading data in Python with JSON

Scroll down if you just want to see the example code.


Long-winded pre-amble about computer documentation


If you don't use a computer language very often, it can be pretty baffling when you come back to it after time away coding up something else. Furthermore, you might find yourself stuck in the mindset (and terminology) of the other language, which makes it pretty difficult to search for the correct terms on the Internet.

Recently, I had just this problem. It was a simple problem.

In Python, I have a record structure (= dictionary) which has labels (= keys) and data (= values). I just want to save it to disk and then later read it back again. That's not so bad, but the one extra point is that I'd like the save file to be human-readable, so I can quickly check it with an editor to either see what's there or make corrections.

I quickly decided that "json" was what I needed, instead of "pickle" (not human-readable files) or "csv" (seemed to require more work to write out the file). Whether that was the right choice or not is neither here nor there. The fact is that I couldn't quite get it to work.

You see, the json documentation (LINK), while comprehensive, doesn't match the terminology I had in my head. Look for the phrase "save to disk" on that page and there is nothing. The word "save" simply does not appear.

Also, if you are used to simple terms like load, save, read, write, etc., then that documentation doesn't read so well. Seriously... go to that page, start at the heading and read it out loud, and either listen to yourself or let someone else listen to it. Read to at least the end of the first example.

See what I mean?

Checking websites like StackOverflow finds others occasionally asking similar questions, but with aggressive/arrogant answers along the lines of "the documentation is fine, what's your problem? RTM!"

Well, sometimes we are just on the way to something else. We are scripting, not coding. And we need something in a hurry. We simply don't have the time/skill/experience to get our heads around nested associative array object stream hierarchies (or whatever).

Don't get me wrong. The Python documentation is very good. It is thorough and comprehensive. But if you are coming to it raw and with a problem already in mind, then it can be a bit impenetrable at times. And I have been programming for years and have written many hundreds of thousands of lines of code in various other languages, so I'd hate to think what it would be like for a complete novice.

In any case, if some search engine just happens to put up this weblog post and help one other person in the world who wants a quick and easy, fully working, example of how to write some data to disk and read it back in again, then I'll be happy!   :-)


Saving a Python dictionary to disk using JSON


Okay, here's the code I came up with...

#!/usr/bin/python

# This only uses the json package
import json

# Create a dictionary (a key-value-pair structure in Python)
my_dict = {
  'Name':      'KAIRA',
  'Location':  u'Kilpisj\u00E4rvi',
  'Longitude': 20.76,
  'Latitude':  69.07
}

# We can print the dictionary to show we have data. E.g.
# (with the brackets, print works in both Python 2 and Python 3)
print(my_dict)
print(my_dict['Location'])

# Open a file for writing. The 'with' block closes the file
# for us when we are done, even if something goes wrong.
with open("test.json", "w") as out_file:
    # Save the dictionary into this file
    # (the 'indent=4' is optional, but makes it more readable)
    json.dump(my_dict, out_file, indent=4)
 
At this point, my little programme has created a file called "test.json". The contents of it look like this (the order of the keys may come out differently for you, depending on your Python version)...

{
    "Latitude": 69.07, 
    "Name": "KAIRA", 
    "Longitude": 20.76, 
    "Location": "Kilpisj\u00e4rvi"
}

You can edit this as a text file or print it to screen with Linux utilities like 'cat' or 'more'. It can be e-mailed too.

You can tweak the "indent" parameter to change the number of spaces, and json.dump() has other options which can also be used to control the behaviour.
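
As a rough sketch of a couple of those other options (this assumes Python 3, and I've used json.dumps(), with an 's', so the result is a string we can print rather than a file): sort_keys=True writes the keys in alphabetical order, and ensure_ascii=False keeps the 'ä' as a real character instead of the \u00e4 escape. The same arguments can be given to json.dump().

```python
import json

my_dict = {
    'Name':      'KAIRA',
    'Location':  'Kilpisj\u00e4rvi',
    'Longitude': 20.76,
    'Latitude':  69.07
}

# json.dumps() returns the JSON text as a string instead of writing to a file
text = json.dumps(my_dict, indent=2, sort_keys=True, ensure_ascii=False)

# Show each line of the result; repr() avoids trouble on terminals
# that cannot display the non-ASCII character directly
for line in text.splitlines():
    print(repr(line))
```

If you do write non-ASCII text to a file this way, it is safest to open the file with an explicit encoding such as UTF-8.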

There are some complications if you have non-standard dictionaries or weird data types. You will probably also hit performance issues if you try to save vast quantities of data. However, for the purposes of something quick and simple, this was sufficient for me.
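
To give one concrete case of a "weird data type" (a sketch assuming Python 3; the 'Updated' key and its date are invented purely for this example): a datetime value is not something JSON knows about, so json.dumps() raises a TypeError. One common workaround is the default= argument, which names a function to call on anything json can't handle by itself. Here I've used str, so the date is saved as plain text.

```python
import json
import datetime

record = {
    'Name': 'KAIRA',
    'Updated': datetime.datetime(2014, 5, 1, 12, 0, 0)   # made-up value
}

# Without any help, json refuses to save a datetime
try:
    json.dumps(record)
    failed = False
except TypeError as err:
    failed = True
    print("Could not save:", err)

# default=str tells json to call str() on anything it cannot handle
text = json.dumps(record, default=str)
print(text)

# Reading it back gives a plain string, not a datetime
loaded = json.loads(text)
print(type(loaded['Updated']))
```

The catch is that the value comes back as a string, so you would need to convert it back to a datetime yourself if you want to do date arithmetic on it.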

Loading a JSON file into a Python dictionary


Having saved our data, we need to read it back in again. I've done this as a separate programme.

#!/usr/bin/python

# This only uses the json package
import json

# Open the file for reading. The 'with' block closes the file
# for us afterwards... we don't need it anymore once it's loaded
with open("test.json", "r") as in_file:
    # Load the contents from the file, which creates a new dictionary
    new_dict = json.load(in_file)

# Print the contents of our freshly loaded dictionary
# (with the brackets, print works in both Python 2 and Python 3)
print(new_dict)
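
If you want to convince yourself that what comes back is the same as what went out, here is a small round-trip check. It is only a sketch and assumes Python 3; it writes into a temporary directory so that it doesn't touch test.json, and the name roundtrip.json is arbitrary.

```python
import json
import os
import tempfile

my_dict = {
    'Name':      'KAIRA',
    'Location':  'Kilpisj\u00e4rvi',
    'Longitude': 20.76,
    'Latitude':  69.07
}

# Work in a throwaway directory that is deleted again afterwards
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "roundtrip.json")

    # Save the dictionary...
    with open(path, "w") as out_file:
        json.dump(my_dict, out_file, indent=4)

    # ...and load it straight back into a new dictionary
    with open(path, "r") as in_file:
        new_dict = json.load(in_file)

# Compare the loaded dictionary with the original
same = (new_dict == my_dict)
print(same)
```

This works out nicely for simple dictionaries of strings and numbers like this one. Be aware that some things do change on the way through: tuples come back as lists, and keys that were integers come back as strings.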

So, there you go. I hope that helps!

12 comments:

  1. Thank you! This is exactly what I needed. (And KAIRA sounds fascinating, too!)

  2. It has helped many novices like me . Thanks

  3. Your intro says it all. Your tutorial was spot on what I was looking for, despite opening half a dozen results.

  4. Another distressed Sysadmin saved!

  5. how do i write a single func return value in Json.. for e.g
    foo()
    value= do some work
    return (value)

    Now i want to save this value in Json

  6. Great!! This is what I was searching! Thanks!
