Via Martin, PugXML looks like a neat project:
Presented is a small, fast, non-validating DOM XML parser, contained in a single header, having no dependencies other than the standard C libraries, and <iostream> (KERNEL32.DLL with WIN32). This XML parser segments a given string in situ (like strtok), performing scanning/tokenization, and parsing in a single pass. Preliminary analysis shows a best-case of 22 X86 CPU clock cycles average per input byte, and a worst-case of 108 CPU cycles...
Cool. What other project do you know that measures its efficiency in best-case and worst-case CPU clock cycles per input byte? And it has a full finite state machine diagram and everything! Rock on. Full (and very small! 29k) source code available.
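For anyone wondering what "in situ (like strtok)" means in practice, here is a toy sketch of the idea. To be clear, this is my own illustration and not PugXML's code: the names `Token` and `tokenize_in_situ` are made up, and it only understands bare tags and text (no attributes, entities, comments, or CDATA). The point is just that the parser overwrites the delimiters in the caller's buffer with NUL bytes and hands back pointers into that same buffer, so nothing gets copied.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical token type: it points into the caller's buffer, no text is copied.
struct Token {
    bool is_tag;       // true for <...>, false for character data
    const char* text;  // NUL-terminated slice inside the original buffer

    // Convenience comparison against a C string.
    bool is(const char* s) const { return std::strcmp(text, s) == 0; }
};

// Split `buf` in place, strtok-style: every '<' and '>' is overwritten with
// '\0', so the pieces between them become C strings in their own right.
// The buffer must be writable and is modified by the call.
std::vector<Token> tokenize_in_situ(char* buf) {
    assert(buf != nullptr);
    std::vector<Token> out;
    char* p = buf;
    while (*p) {
        if (*p == '<') {
            *p++ = '\0';                      // also terminates any text before the tag
            char* start = p;
            while (*p && *p != '>') ++p;      // scan to the end of the tag
            if (*p) *p++ = '\0';              // terminate the tag contents
            out.push_back({true, start});
        } else {
            char* start = p;
            while (*p && *p != '<') ++p;      // scan to the next tag or end of input
            out.push_back({false, start});    // terminated when the next '<' is overwritten
        }
    }
    return out;
}
```

Given a writable buffer holding `<note>hi</note>`, this yields three tokens: the tag `note`, the text `hi`, and the tag `/note`, each one a pointer into the original array. The trade-off is the same as with strtok: the input is destroyed, and it has to outlive every token that points into it.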
Side note:
... however parsers checking for well-formedness would choke on it. Because this is not compliant behavior, the PugXML parser, and parsers taking a similar approach would be indicated for situations requiring the parsing of machine-generated, well-formed fully-compliant XML documents, and not human-generated documents.
One of the problems with XML is there's just so much of it out there that isn't.