Clash between JSON out of Drupal with YouTube Embedded Field and MongoDB Key Names


(Quick background) As part of a much larger project, I’m pulling content out of Drupal as JSON-formatted nodes into MongoDB. The plan is to use Drupal as the content creation platform, and MongoDB as the database store for the content delivery platform, with JSON as the data format.

Importing JSON content from Drupal into MongoDB is actually very easy - I’ve got Drupal 6 already configured to serve JSON through the REST Server (for more, read my post Using Drupal’s Views as a JSON Web Service with the REST Server).

I installed MongoDB on Mac OS X, installed the mongo Ruby gem, and then wrote a quick Ruby script to pull JSON off my server and dump it into Mongo. This worked great - it throws errors when it hits nodes that aren’t published, but that’s fine. What didn’t work was pulling in anything that used the Embedded Media Framework CCK field for YouTube - one of the JSON keys that gets embedded is the string “http://www.w3.org/2005/Atom”, and periods aren’t legal in a key name in MongoDB.
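To see which keys would trip up an import before anything is sent to MongoDB, a small check like the following can help. This is only a sketch: the helper name keys_with_periods and the sample node are made up for illustration, and the real Drupal output has many more fields.

```ruby
require 'json'

# Hypothetical helper: return the path to every hash key that contains a period.
def keys_with_periods(value, path = [])
  case value
  when Hash
    value.flat_map do |key, child|
      here = path + [key]
      found = key.to_s.include?('.') ? [here] : []
      found + keys_with_periods(child, here)
    end
  when Array
    # Recurse into array elements, recording the index in the path.
    value.each_with_index.flat_map { |item, i| keys_with_periods(item, path + [i]) }
  else
    []
  end
end

# Made-up node that mimics the shape of the problem, not real Drupal output.
node = JSON.parse('{"title":"Demo","field_video":[{"data":{"http://www.w3.org/2005/Atom":"x"}}]}')
offending = keys_with_periods(node)
offending.each { |path| puts path.join(' > ') }
```

Running something like this over a handful of nodes is a quick way to confirm whether the Atom namespace key is the only offender.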

Luckily, this was the only such key name in my JSON output, so I added a quick gsub call to replace www.w3.org with www-w3-org, and all was well. I don’t plan to use the YouTube Atom feed anyway, but if I do, I’ll just have to remember to use my key, not the Drupal 6 CCK key.
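A minimal sketch of that gsub fix, run on the raw JSON text before parsing, might look like this. The sample JSON and the method names are assumptions for illustration. The second method, sanitize_keys, is not what I used; it is a more general alternative that rewrites periods in keys only and leaves values untouched.

```ruby
require 'json'

# Made-up sample that mimics the problematic part of a node.
SAMPLE_JSON = '{"title":"Demo","field_video":{"http://www.w3.org/2005/Atom":{"href":"http://www.w3.org/2005/Atom"}}}'

# The approach described above: rewrite the host name in the raw JSON string.
# Note that this changes matching values as well as keys.
def replace_atom_host(raw_json)
  raw_json.gsub('www.w3.org', 'www-w3-org')
end

# Alternative (an assumption, not the original fix): walk the parsed
# structure and replace periods in every key, leaving values alone.
def sanitize_keys(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, child), clean|
      clean[key.to_s.gsub('.', '-')] = sanitize_keys(child)
    end
  when Array
    value.map { |item| sanitize_keys(item) }
  else
    value
  end
end

doc = JSON.parse(replace_atom_host(SAMPLE_JSON))
general = sanitize_keys(JSON.parse(SAMPLE_JSON))
puts doc['field_video'].keys.inspect
puts general['field_video'].keys.inspect
```

The string-level gsub is fine when there is exactly one known offender; the recursive version is the safer choice if other dotted keys might show up later.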