Thursday, April 14, 2011

Converting between JSON and XML


Most structured data on the web used to be encoded in XML,
and most new data on the web is encoded and expected in JSON.
The ability to convert reliably from XML to JSON and back would be very helpful.

The challenge with converting between XML and JSON
is that an XML element has two sets of content:
attributes and sub-elements (plus whitespace).
* Attributes map nicely to a JSON object (hash, map), and
* sub-elements map nicely to a JSON array,
but not both at the same time.
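
To make the two sets concrete, here is a small sketch using Python's standard xml.etree.ElementTree module (my choice for illustration; other XML parsers expose a similar split). Attributes come back as a name/value mapping, sub-elements as an ordered sequence:

```python
import xml.etree.ElementTree as ET

xml_text = '<person id="1"><name>Alan</name><url>http://www.google.com</url></person>'
person = ET.fromstring(xml_text)

# Attributes are unique by name: a natural fit for a JSON object.
attributes = dict(person.attrib)

# Sub-elements are ordered and may repeat: a natural fit for a JSON array.
children = [(child.tag, child.text) for child in person]

print(attributes)
print(children)
```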

This is because XML is a hybrid syntax, intended for text markup
rather than for representing data, in particular not object-oriented data.

The same problem exists when mapping objects from an OO programming language to XML and back. A direct example is the XAML syntax used by .NET (WPF, Silverlight, WF...).
Objects make for verbose and somewhat complicated XML...

Here is another useful article about the same topic
Converting Between XML and JSON @ XML.com


The author of the JSON.NET library suggests a method where XML attributes
are converted to JSON names that start with the '@' character inside a JSON object/hash/map.
For example:

<person id="1">
<name>Alan</name>
<url>http://www.google.com</url>
</person>

"person": {
"@id": "1",
"name": "Alan",
"url": "http://www.google.com"
}

So far, so good...
While the added '@' makes a bit of a semantic difference,
it makes for a more compact final JSON document,
and helps distinguish attribute names from XML element names.
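
A minimal sketch of this '@' convention in Python follows. It is not JSON.NET's implementation; the function name element_to_object is made up for illustration, and it assumes that child names do not repeat and that elements carrying attributes or children have no text of their own:

```python
import json
import xml.etree.ElementTree as ET


def element_to_object(elem):
    """Map attributes to '@name' members and child elements to plain members.

    Assumption: child names are unique, and text inside an element that
    also has attributes or children is ignored.
    """
    if not elem.attrib and len(elem) == 0:
        return elem.text  # leaf element: just its text
    obj = {"@" + name: value for name, value in elem.attrib.items()}
    for child in elem:
        obj[child.tag] = element_to_object(child)
    return obj


person = ET.fromstring(
    '<person id="1"><name>Alan</name><url>http://www.google.com</url></person>'
)
result = {person.tag: element_to_object(person)}
print(json.dumps(result, indent=2))
```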

The same approach departs more questionably from the original XML structure
when two elements with the same name are converted
to an array of values, rather than an array of objects.

<root>
<person id="1">
<name>Alan</name>
<url>http://www.google.com</url>
</person>
<person id="2">
<name>Louis</name>
<url>http://www.yahoo.com</url>
</person>
</root>

The suggested conversion to JSON looks like this:

"root": {
"person": [
{
"@id": "1",
"name": "Alan",
"url": "http://www.google.com"
},
{
"@id": "2",
"name": "Louis",
"url": "http://www.yahoo.com"
}
]
}
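
The grouping rule behind this output can be sketched as follows. This is my reading of the convention, not JSON.NET's code, and element_to_grouped is a hypothetical name: the first occurrence of a child name becomes a plain member, and a second occurrence turns that member into an array.

```python
import xml.etree.ElementTree as ET


def element_to_grouped(elem):
    """'@' convention, with repeated child names collected into one array."""
    if not elem.attrib and len(elem) == 0:
        return elem.text  # leaf element: just its text
    obj = {"@" + name: value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_grouped(child)
        if child.tag not in obj:
            obj[child.tag] = value                    # first occurrence
        elif isinstance(obj[child.tag], list):
            obj[child.tag].append(value)              # third and later
        else:
            obj[child.tag] = [obj[child.tag], value]  # second occurrence
    return obj


root = ET.fromstring(
    '<root>'
    '<person id="1"><name>Alan</name><url>http://www.google.com</url></person>'
    '<person id="2"><name>Louis</name><url>http://www.yahoo.com</url></person>'
    '</root>'
)
grouped = {root.tag: element_to_grouped(root)}
```

Note that the shape of the result depends on the data: one <person> gives an object, two give an array, and any interleaving with differently named siblings is no longer visible.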

While the suggested format may be convenient to use, it differs from the original XML. The XML has two elements, while the JSON has one "person" member that holds an array of anonymous objects. A more correct 1:1 mapping should look like this, I think:

"root": [
"person": {
"@id": "1",
"name": "Alan",
"url": "http://www.google.com"
},
"person": {
"@id": "2",
"name": "Louis",
"url": "http://www.yahoo.com"
}
]

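Each XML element <person> becomes one JSON object "person",
and multiple XML elements are stored in a JSON array.
Note that the array notation used here is shorthand: strict JSON does not allow
names directly inside [ ], so a parser-ready version would wrap each entry
in its own one-member object, as in { "person": { ... } }.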

Clearly, <name> and <url> are more convenient when stored as names/values in a JSON object, but this is not a direct 1:1 mapping to XML.
The good thing about such a mapping is that in the special cases where sub-elements
do not repeat, it is possible to convert back to the original XML without loss of information.
On the other hand, an exact and unproblematic mapping would be:

"root": [
"person": [
"@": { "id": "1" },
"name": "Alan",
"url": "http://www.google.com"
],
"person": [
"@": { "id": "2" },
"name": "Louis",
"url": "http://www.yahoo.com"
]
]

The only syntax difference is the use of [ ] instead of { }.
I have stored the attributes in an object named "@" to preserve the original attribute names. As a result, JSON formatted this way can represent any XML, for example when a sub-element is repeated:

"person": [
"@": { "id": "1" },
"name": "Alan",
"url": "http://www.google.com",
"url": "http://www.facebook.com"
],

The 'price' is possibly lower performance and more complexity when using such JSON objects.
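
Here is a rough Python sketch of the exact mapping and its inverse. The names to_exact and from_exact are hypothetical. Because strict JSON arrays cannot hold named entries, each child is wrapped in a one-member object; attributes go into an "@" object as above. Whitespace-only text is dropped, which is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET


def to_exact(elem):
    """Return {tag: content}, keeping attribute names, child order and repeats."""
    if not elem.attrib and len(elem) == 0:
        return {elem.tag: elem.text or ""}
    items = []
    if elem.attrib:
        items.append({"@": dict(elem.attrib)})
    if elem.text and elem.text.strip():
        items.append(elem.text)
    for child in elem:
        items.append(to_exact(child))
        if child.tail and child.tail.strip():
            items.append(child.tail)  # text that follows the child element
    return {elem.tag: items}


def from_exact(node):
    """Inverse of to_exact: rebuild an Element from a one-member object."""
    tag, content = next(iter(node.items()))
    elem = ET.Element(tag)
    if isinstance(content, str):
        elem.text = content or None
        return elem
    last = None  # most recently appended child, if any
    for item in content:
        if isinstance(item, str):
            if last is None:
                elem.text = (elem.text or "") + item
            else:
                last.tail = (last.tail or "") + item
        elif "@" in item:  # "@" cannot be an XML element name
            elem.attrib.update(item["@"])
        else:
            last = from_exact(item)
            elem.append(last)
    return elem


src = ('<person id="1"><name>Alan</name><url>http://www.google.com</url>'
       '<url>http://www.facebook.com</url></person>')
exact = to_exact(ET.fromstring(src))
round_trip = ET.tostring(from_exact(exact), encoding="unicode")
```

Reading a value now means scanning a list instead of looking up a name, which is the kind of extra complexity mentioned above.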

A good XML-JSON converter should be able to detect problem situations and use the 1:1 mapping as needed, I think. In fact, I think 1:1 should be the default conversion, and "optimized" should be an option.
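
One possible way to detect such problem situations, sketched in Python under my own assumptions about what counts as a problem (repeated child names, or text mixed with child elements). The function name needs_exact_mapping is hypothetical:

```python
import xml.etree.ElementTree as ET


def needs_exact_mapping(elem):
    """True if, anywhere in the tree, child names repeat or text is mixed
    with child elements, so the compact object form is not a safe choice."""
    tags = [child.tag for child in elem]
    has_repeats = len(tags) != len(set(tags))
    has_mixed_text = len(elem) > 0 and any(
        text and text.strip()
        for text in [elem.text] + [child.tail for child in elem]
    )
    return has_repeats or has_mixed_text or any(
        needs_exact_mapping(child) for child in elem
    )


simple = ET.fromstring('<person id="1"><name>Alan</name><url>u</url></person>')
repeated = ET.fromstring('<person><url>a</url><url>b</url></person>')
print(needs_exact_mapping(simple), needs_exact_mapping(repeated))
```

A converter could run a check like this first and fall back to the 1:1 form only when it returns true, or always use the 1:1 form and offer the compact one as an option.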

Here are a few more links related to JSON-XML conversion:
* Online XML / JSON converter
* JSON-java/XML.java by Douglas Crockford
* How to convert XML to JSON in ASP.NET C#
* Converting Between XML and JSON @ XML.com