Accessing XML DOM Data

This page demonstrates how to turn an XML DOM document or tree into a JavaScript object that allows you to access the data and attributes easily.





In this example, we are simulating a server request for some XML that contains information about email messages for a particular user. Each press of the "Fetch Messages" button fires off a load request which returns the following XML when the request is successful:

<?xml version="1.0" encoding="utf-8"?>
<messages count="5">
	<message id="349348" recieved="Thu, 12 Oct 2006 16:29:35 -0800" size="1K" attachments="false">
		<from id="123456" address="esmith@foo.com">Edward Smith</from>
		<subject>Donuts in the break room.</subject>
	</message>
	<message id="351325" recieved="Wed, 25 Oct 2006 14:12:13 -0800" size="4K" attachments="false">
		<from id="127937" address="njohnson@foo.com">Neil Johnson</from>
		<subject>Please send check via certified-express snail mail!</subject>
		<flags important="true" needsreply="true" returnreceipt="true" />
	</message>
	<message id="456848" recieved="Wed, 01 Nov 2006 09:35:57 -0800" size="720K" attachments="true">
		<from id="126474" address="swilliams@foo.com">Steve Williams</from>
		<subject>Sign the attached doc and return by Friday at the latest.</subject>
		<flags returnreceipt="true" />
	</message>
	<message id="456976" recieved="Tue, 07 Nov 2006 14:36:04 -0800" size="100K" attachments="true">
		<from id="120585" address="jjones@foo.com">John Jones</from>
		<subject>Ship Party Details</subject>
	</message>
	<message id="567833" recieved="Mon, 04 Dec 2006 18:24:05 -0800" size="12K" attachments="false">
		<from id="127493" address="jbrown@foo.com">Joe Brown</from>
		<subject>Check this out!</subject>
	</message>
</messages>

If the "Simulate Server Down" checkbox is checked, the request results in the following XML being returned to the client:

<?xml version="1.0" encoding="utf-8"?>
<reply id="326737" type="error" date="Fri, 15 Dec 2006 11:47:36 -0800">
	<status>404</status>
	<msg>This server is currently undergoing maintenance and was unable to process your request.</msg>
</reply>

In both cases, the HTTP request itself succeeds; what differs is the data the server sends back. In the second case, the data tells the client that the server is down, so the client code should handle that case gracefully.

If you view the source for this page, you will see that it contains two JavaScript functions, LoadMessages() and ProcessMessageResults(). LoadMessages() simply fires off an HTTP request for the message data and tells the load function to call ProcessMessageResults() when it has received the server response. ProcessMessageResults() is where all the action happens. It uses Spry.XML.documentToObject() to create a JS object version of the XML DOM, which eases the extraction of information from the XML data returned by the server. It then uses the resulting object to figure out what type of XML data was returned, and extracts the appropriate information from the data for logging purposes.

The intent of Spry.XML.documentToObject() is to produce an object whose properties and methods let you traverse the XML data easily, check for the existence of tags and attributes, and extract data values without the overhead of multiple calls to getElementsByTagName() and getAttribute(). The return value of Spry.XML.documentToObject() is either an object that represents the XML document, or null. It returns null if the document node you pass in is null or undefined, or if the document node has no element children or value. Note that a well-formed XML document has exactly one top-level element, the root node. This means that the object returned by documentToObject() always contains exactly one property. The property's name is the same as the root element's tag name, and its value is an object whose properties describe the root element's attributes, value, and child nodes.
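To make the shape of the returned object more concrete, here is a minimal sketch of how a converter of this kind could be written. It is not Spry's implementation: the names nodeToObject(), documentToObjectSketch(), and mockDoc are hypothetical, and the "#text" key for text content is an assumption made for this illustration. The sketch also omits the helper methods (such as _value()) that the real objects carry. It reads only the standard DOM properties nodeType, nodeName, nodeValue, attributes, and childNodes, so it can be exercised with a hand-built mock as well as a real document.

```javascript
// Hypothetical illustration only; not the Spry source.
function nodeToObject(node)
{
	var obj = {};
	var text = "";
	var i;

	// Attributes become string properties prefixed with '@'.
	var attrs = node.attributes || [];
	for (i = 0; i < attrs.length; i++)
		obj["@" + attrs[i].name] = attrs[i].value;

	var children = node.childNodes || [];
	for (i = 0; i < children.length; i++)
	{
		var child = children[i];
		if (child.nodeType == 1) // Element node.
		{
			var name = child.nodeName;
			var childObj = nodeToObject(child);
			if (!Object.prototype.hasOwnProperty.call(obj, name))
				obj[name] = childObj;               // First child with this tag name.
			else if (obj[name] instanceof Array)
				obj[name].push(childObj);           // Third or later child.
			else
				obj[name] = [obj[name], childObj];  // Second child: promote to an array.
		}
		else if (child.nodeType == 3 || child.nodeType == 4) // Text or CDATA node.
			text += child.nodeValue;
	}

	// Keep the text only if it is more than whitespace between tags.
	if (text.replace(/\s+/g, "") != "")
		obj["#text"] = text;

	return obj;
}

function documentToObjectSketch(doc)
{
	var root = doc ? doc.documentElement : null;
	if (!root)
		return null;

	// The result has a single property named after the root element.
	var result = {};
	result[root.nodeName] = nodeToObject(root);
	return result;
}

// A hand-built mock of a tiny <reply type="error"><status>404</status></reply> document.
var mockDoc =
{
	documentElement:
	{
		nodeType: 1, nodeName: "reply",
		attributes: [{ name: "type", value: "error" }],
		childNodes:
		[
			{ nodeType: 3, nodeValue: "\n\t" },
			{ nodeType: 1, nodeName: "status", attributes: [], childNodes: [{ nodeType: 3, nodeValue: "404" }] }
		]
	}
};

var result = documentToObjectSketch(mockDoc);
// result.reply["@type"] is "error" and result.reply.status["#text"] is "404".
```

The promotion from a single object to an array when a second child with the same tag name appears is the reason a property can hold either an object or an array, depending on the data.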

Using the <reply> XML example above, this call to Spry.XML.documentToObject():

var xml = Spry.XML.documentToObject(domDoc);

is roughly equivalent to this object assignment in JS:

var xml =
{
	"reply":
	{
		"@id": "326737",
		"@type": "error",
		"@date": "Fri, 15 Dec 2006 11:47:36 -0800",
		"status":
		{
			"#text": "404"
		},
		"msg":
		{
			"#text": "This server is currently undergoing maintenance and was unable to process your request."
		}
	}
};

The properties on this object are the names of any child tags it contains and any attributes its start tag contains. All attribute names are prefixed with an '@' character to help differentiate them from child tag names, which could theoretically be the same. Aside from the naming difference, there is one more important distinction between attributes and child tags: the value of a child tag property is always an object, while the value of an attribute property is always a string. This is important to remember, because it means you extract the value of an attribute differently from the value of a child tag. For a child tag, you have to call the _value() method on it to get its value (the text between the child tag's open and close tags). For an attribute, the value of the property is the value of the attribute. To illustrate, let's walk through an example of how the <reply> XML above is converted to an object, and how we access the information inside it.

var xml = Spry.XML.documentToObject(docDOM);

// Check for the presence of a root node named reply.

if (xml && xml.reply)
{
	// We have a reply node in the document. Grab its type and status.

	var type = xml.reply["@type"]; // type now has the value "error" in it.

	var status = xml.reply.status._value(); // status now has the value "404" in it.
}

After the call to documentToObject(), the resulting object is assigned to a variable called 'xml'. To access the root node of the document, we simply use "dot notation" (xml.reply) to access the object that represents the root node. Notice that the name of the property used to access the root node is the same as the name of the tag for the root node. You can use "dot notation" to access the data of any other child node as long as its tag name contains only alphanumeric and underscore characters, so in the example above, we access the "status" node by using "xml.reply.status". If the tag name contains any other characters, you have to switch to "square bracket notation". An example of a name that falls into this category is a namespaced tag name (ex: <foo:bar>). To get at a child node with "foo:bar" as its tag name, you would use "square bracket notation" like this:

var bar = xml.reply["foo:bar"]; // Assigns the object representing the foo:bar node to bar

Attributes also fall into this same category, because they are prefixed with a '@' character. So in the example above, the "type" attribute on the reply node is retrieved by using bracket notation:

var type = xml.reply["@type"]; // type now has the value "error" in it.
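The following sketch puts the two notations side by side. It uses a hand-written object literal as a stand-in for the return value of documentToObject(), so it runs without Spry. The "foo:bar" child and its text are hypothetical and are not part of the sample XML.

```javascript
// Hand-written stand-in for the object documentToObject() would produce.
// The "foo:bar" child is hypothetical and only here to show bracket access.
var xml =
{
	"reply":
	{
		"@type": "error",
		"status": { "#text": "404" },
		"foo:bar": { "#text": "example" }
	}
};

// Simple names work with either notation and yield the very same object.
var viaDot = xml.reply.status;
var viaBrackets = xml["reply"]["status"];

// Names containing ':' or '@' can only be reached with bracket notation.
var bar = xml.reply["foo:bar"];
var type = xml.reply["@type"];
```

Bracket notation always works, so code that builds property names at run time (for example, by prepending '@' to an attribute name) can use it uniformly.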

There will be cases where a node has more than one child with the same tag name. You can see this with the messages XML format example above. The <messages> node has more than one <message> node child. In this specific case, the messages.message property is an array object. So if this sample code called documentToObject() with a document DOM that described the <messages> XML above:

var xml = Spry.XML.documentToObject(domDoc);

the JS object that is produced looks like:

var xml =
{
	"messages":
	{
		"@count": "5",
		"message":
		[
			{
				"@id": "349348",
				"@recieved": "Thu, 12 Oct 2006 16:29:35 -0800",
				"@size": "1K",
				"@attachments": "false",
				"from":
				{
					"@id": "123456",
					"@address": "esmith@foo.com",
					"#text": "Edward Smith"
				},
				"subject":
				{
					"#text": "Donuts in the break room."
				}
			},
			{
				"@id": "351325",
				"@recieved": "Wed, 25 Oct 2006 14:12:13 -0800",
				"@size": "4K",
				"@attachments": "false",
				"from":
				{
					"@id": "127937",
					"@address": "njohnson@foo.com",
					"#text": "Neil Johnson"
				},
				"subject":
				{
					"#text": "Please send check via certified-express snail mail!"
				},
				"flags":
				{
					"@important": "true",
					"@needsreply": "true",
					"@returnreceipt": "true"
				}
			},
			{
				"@id": "456848",
				"@recieved": "Wed, 01 Nov 2006 09:35:57 -0800",
				"@size": "720K",
				"@attachments": "true",
				"from":
				{
					"@id": "126474",
					"@address": "swilliams@foo.com",
					"#text": "Steve Williams"
				},
				"subject":
				{
					"#text": "Sign the attached doc and return by Friday at the latest."
				},
				"flags":
				{
					"@returnreceipt": "true"
				}
			},
			{
				"@id": "456976",
				"@recieved": "Tue, 07 Nov 2006 14:36:04 -0800",
				"@size": "100K",
				"@attachments": "true",
				"from":
				{
					"@id": "120585",
					"@address": "jjones@foo.com",
					"#text": "John Jones"
				},
				"subject":
				{
					"#text": "Ship Party Details"
				}
			},
			{
				"@id": "567833",
				"@recieved": "Mon, 04 Dec 2006 18:24:05 -0800",
				"@size": "12K",
				"@attachments": "false",
				"from":
				{
					"@id": "127493",
					"@address": "jbrown@foo.com",
					"#text": "Joe Brown"
				},
				"subject":
				{
					"#text": "Check this out!"
				}
			}
		]
	}
};

So getting the value of xml.messages.message gives you back an array:

var xml = Spry.XML.documentToObject(docDOM);

...

var msg = xml.messages.message; // msg now contains an array of message objects.

So how can you tell if a property is going to contain an object or an array? You can call the _propertyIsArray() method on the parent object to ask whether a particular property is an array. It returns true if the value of the property is an array, or false if it is a single object or undefined.

var xml = Spry.XML.documentToObject(docDOM);

...

if (xml.messages._propertyIsArray("message"))
{
	// Array processing code.
}
else
{
	// Object processing code.
}

Or, you can just fetch the property as an array using _getPropertyAsArray() so you can always assume that it is an array:

var xml = Spry.XML.documentToObject(docDOM);

...

var arr = xml.messages._getPropertyAsArray("message"); // arr now contains an array of zero or more message objects.

for (var i = 0; i < arr.length; i++)
{
	// process each message object.
}

The length of the resulting array will tell you how many <message> nodes are under the <messages> node. It is safe to call _getPropertyAsArray() for any property, even if it doesn't exist. It will always return an array of zero or more elements.
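Putting the pieces together, the sketch below shows the kind of logic a handler such as ProcessMessageResults() might contain: decide which XML format came back, then pull out the interesting values. It is not the page's actual source. To stay runnable without Spry, it works on plain object literals, and the helpers asArray() and textOf() are hypothetical stand-ins for _getPropertyAsArray() and _value(). The summarize() function and the "#text" key are likewise assumptions made for this example.

```javascript
// Stand-in for _getPropertyAsArray(): always returns an array of zero or more items.
function asArray(parent, name)
{
	var value = parent ? parent[name] : undefined;
	if (value === undefined || value === null)
		return [];
	return (value instanceof Array) ? value : [value];
}

// Stand-in for _value(): returns the text of a child node object, or "" if it has none.
function textOf(node)
{
	return (node && typeof node["#text"] == "string") ? node["#text"] : "";
}

// Returns an array of log lines describing whichever format was received.
function summarize(xml)
{
	if (!xml)
		return ["No data."];

	// The <reply> format carries a status code and a message.
	if (xml.reply)
		return [xml.reply["@type"] + " " + textOf(xml.reply.status) + ": " + textOf(xml.reply.msg)];

	// The <messages> format carries zero or more <message> children.
	var lines = [];
	var msgs = asArray(xml.messages, "message");
	for (var i = 0; i < msgs.length; i++)
		lines.push(msgs[i]["@id"] + " from " + textOf(msgs[i].from) + ": " + textOf(msgs[i].subject));
	return lines;
}

// Usage with a hand-written error reply.
var errorLines = summarize({
	"reply":
	{
		"@type": "error",
		"status": { "#text": "404" },
		"msg": { "#text": "Server is down." }
	}
});

// Usage with a single message, which is stored as an object rather than an array.
var messageLines = summarize({
	"messages":
	{
		"@count": "1",
		"message":
		{
			"@id": "349348",
			"from": { "#text": "Edward Smith" },
			"subject": { "#text": "Donuts in the break room." }
		}
	}
});
```

Because asArray() wraps a lone object in an array, the loop in summarize() does not need a separate branch for the single-message case.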