Wednesday, December 31, 2014

MicroXPath library for Arduino

There are probably hundreds of C++ XML parsers out there, but reinventing the wheel is fun. MicroXPath is a state machine whose purpose is to enable XML navigation on the Arduino platform while keeping the memory footprint as small as possible. With only 2048 bytes of SRAM on the Arduino Uno, conserving memory is quite important. The PROGMEM version of MicroXPath uses no more than 9 bytes of RAM. The length of the search strings is irrelevant, but two bytes are consumed (to store a PROGMEM pointer) for each XPath level when calling setPath. This means that no more than 15 bytes of RAM are used when searching for an XPath that is three levels deep.
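The arithmetic behind those figures can be sketched as a small helper. The function and constant names below are made up for illustration and are not part of the library; the numbers are simply the ones quoted above.

```cpp
#include <stddef.h>

// Figures quoted in the text: 9 bytes of fixed parser state,
// plus one 2-byte PROGMEM pointer per path level.
// (Names are hypothetical, chosen only for this sketch.)
constexpr size_t kParserStateBytes = 9;
constexpr size_t kBytesPerPathLevel = 2;

// Estimated RAM needed to search for a path of the given depth.
constexpr size_t estimatedRamBytes(size_t pathDepth) {
    return kParserStateBytes + kBytesPerPathLevel * pathDepth;
}

// A path three levels deep costs 9 + 3 * 2 = 15 bytes.
static_assert(estimatedRamBytes(3) == 15, "three levels should cost 15 bytes");
```

By the same arithmetic, a four-level path such as “menu”, “food”, “english”, “name” would come to 17 bytes.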

Why XML?

Sadly, XML is widely used for data exchange on the internet, and it’s likely that you will at some point find yourself creating a project where reading data from an XML source is useful. For me, the need was triggered by a project I’m currently working on, in which I need to read the status of my Sonos speakers. The Sonos speaker system uses UPnP, which in turn uses SOAP over HTTP.

Usage:

The library needs no configuration. There is one preprocessor directive called XML_PICO_MODE which is on by default and disables XML validation and error tracking.

When using the library, start by setting the XML search path by calling the “setPath” function. Don’t forget to specify the path depth (the length of the search path string array). If the path is “menu”, “food”, “name”, the pathSize should be set to 3. Once the path is configured, you can start passing characters to the parser using the “findValue” function. “findValue” returns true when the specified path is matched and the parser reaches the closing “>” of the matching element’s start tag. This means that the next character is the first character of the XML element content.

If you would simply like to get the text content of the XML element matching the specified XML path you can use the “getValue” function. The “getValue” function works in the same way that “findValue” does, but it takes a pointer to an output buffer and an output buffer size in addition to the character being parsed. It will return true when it reaches the end tag of the matched XML element. Text content will be written to the output buffer as long as there is room.
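To illustrate what “as long as there is room” can mean in practice, here is a minimal sketch of a bounded append. The helper name appendIfRoom is hypothetical, and whether “getValue” truncates and null-terminates in exactly this way is an assumption; the library source is the authority on that.

```cpp
#include <stddef.h>
#include <string.h>

// Hypothetical helper, not part of MicroXPath.
// Appends c to a NUL-terminated buffer if room remains, always keeping
// one byte free for the terminator. Returns false if c was dropped.
// Assumes buffer already holds a NUL-terminated string within bufferSize.
bool appendIfRoom(char *buffer, size_t bufferSize, char c) {
    if (bufferSize == 0) return false;
    size_t len = strlen(buffer);
    if (len + 1 >= bufferSize) return false;  // full: drop the character
    buffer[len] = c;
    buffer[len + 1] = '\0';
    return true;
}
```

With a 4-byte buffer, feeding the characters of “Toast” one at a time would leave “Toa” in the buffer and silently drop the rest, so size the output buffer for the longest value you expect.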

When searching for several paths within one XML stream you must search for them in the same order that they occur in the XML and you must change the path by calling “setPath” as soon as you are done reading content. Given the following XML:

<menu>
  <food>
    <english>
      <name>Toast</name>
    </english>
    <price>$5.95</price>
  </food>
</menu>

If you are getting both the name “Toast” and the price “$5.95” from the XML stream, your code should look something like this:

char result[10] = "";
// Set path of the first element
xPath.setPath((const char *[]){ "menu", "food", "english", "name" }, 4);
// Read until the matching element end tag is found or end of stream
while (client.available() && !xPath.getValue(client.read(), result, sizeof(result)));
Serial.print("Name: ");
Serial.println(result);
// Set path of the second element
xPath.setPath((const char *[]){ "menu", "food", "price" }, 3);
// Read until the matching element end tag is found or end of stream
while (client.available() && !xPath.getValue(client.read(), result, sizeof(result)));
Serial.print("Price: ");
Serial.println(result);

How does it work?

The parser is a state machine which cycles through a set of predefined states while reading the XML file character by character. For example, when the parser is at the root level and a < (less than) character is read, the state changes from XML_PARSER_ROOT to XML_PARSER_START_TAG. In addition to the parser state, to be able to match on a specific XML path, the parser needs to keep track of the current node level in the XML node tree, the match node level, the element name character position and the element name match position. This is all the state that is needed: 5 bytes + the single character being read.
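To make the idea concrete, here is a toy path matcher written in plain C++. It is only a sketch of the technique, not MicroXPath's actual code: every name in it (ToyState, ToyPathMatcher, toyExtract) is hypothetical, and it assumes simple XML with no attributes, comments, declarations or self-closing tags. Like the real parser, it keeps just a handful of one-byte fields and looks at one character at a time.

```cpp
#include <stdint.h>
#include <string>

// Toy illustration only: these names are hypothetical, not MicroXPath's.
enum ToyState : uint8_t { TOY_CONTENT, TOY_TAG_OPEN, TOY_START_TAG, TOY_END_TAG };

struct ToyPathMatcher {
    const char *const *path;  // element names, outermost first
    uint8_t pathSize;         // number of entries in path
    ToyState state;           // current parser state
    uint8_t level;            // depth of the element we are inside
    uint8_t matchLevel;       // leading path entries matched by open elements
    uint8_t namePos;          // position within the tag name being read
    bool nameOk;              // tag name so far equals path[matchLevel]

    ToyPathMatcher(const char *const *p, uint8_t n)
        : path(p), pathSize(n), state(TOY_CONTENT),
          level(0), matchLevel(0), namePos(0), nameOk(false) {}

    // Feed one character. Returns true on the '>' that closes the start
    // tag of the element matching the full path.
    bool feed(char c) {
        switch (state) {
        case TOY_CONTENT:
            if (c == '<') state = TOY_TAG_OPEN;
            return false;
        case TOY_TAG_OPEN:
            if (c == '/') { state = TOY_END_TAG; return false; }
            state = TOY_START_TAG;
            namePos = 0;
            // Only a direct child of the last matched element can extend the match.
            nameOk = (level == matchLevel && matchLevel < pathSize);
            return feed(c);  // c is the first character of the element name
        case TOY_START_TAG:
            if (c == '>') {
                bool matched = nameOk && path[matchLevel][namePos] == '\0';
                if (matched) ++matchLevel;
                ++level;
                state = TOY_CONTENT;
                return matched && matchLevel == pathSize;
            }
            if (nameOk && path[matchLevel][namePos] == c) ++namePos;
            else nameOk = false;
            return false;
        case TOY_END_TAG:
            if (c == '>') {
                if (level > 0) {
                    if (matchLevel == level) --matchLevel;  // leaving a matched element
                    --level;
                }
                state = TOY_CONTENT;
            }
            return false;
        }
        return false;
    }
};

// Collects the text content of the first element matching the path.
std::string toyExtract(const char *xml, const char *const *path, uint8_t pathSize) {
    ToyPathMatcher matcher(path, pathSize);
    std::string value;
    bool inValue = false;
    for (const char *p = xml; *p != '\0'; ++p) {
        if (inValue) {
            if (*p == '<') break;  // content ends at the next tag
            value += *p;
        } else if (matcher.feed(*p)) {
            inValue = true;
        }
    }
    return value;
}
```

Calling toyExtract on the menu XML shown earlier with the path “menu”, “food”, “price” walks past the english and name elements without matching them, because they are not direct children of the last matched level with the right name, and only starts collecting text after the price start tag. On an Arduino you would write into a fixed char buffer rather than a std::string; the string is used here only to keep the sketch short.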

Limitations:

  • The library does not currently support finding or reading XML attributes. I did not need this for my Sonos project, so I decided not to spend time on it. The feature can easily be added. If you need it or would like to contribute, please email me (use the address included in the license of all source and example files).
  • Because the parser can only parse the XML stream once, top-down, you must know the order of the elements beforehand when searching for multiple paths within the same stream.

Downloads:

Tool used to calculate RAM usage:
