XPath (XML Path Language) can be used to query highly complex XML structures without the need to iterate over each node individually.
A likely use case is to select the nth sibling or nth element from a list. Luckily, this is easily achieved with position(). To demonstrate its usage, imagine the following XML structure (courtesy of http://onjava.com/pub/a/onjava/2005/01/12/xpath.html):
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns:journal="http://www.w3.org/2001/XMLSchema-Instance">
  <journal:journal title="XML" publisher="IBM developerWorks">
    <article journal:level="Intermediate" date="February-2003">
      <title>Design XML Schemas Using UML</title>
      <author>Ayesha Malik</author>
    </article>
  </journal:journal>
  <journal title="Java Technology" publisher="IBM developerWorks">
    <article level="Advanced" date="January-2005">
      <title>Design service-oriented architecture frameworks with J2EE technology Scn. Edit.</title>
      <author>Naveen Balani</author>
    </article>
    <article level="Advanced" date="January-2004">
      <title>Design service-oriented architecture frameworks with J2EE technology</title>
      <author>Naveen Balani</author>
    </article>
    <article level="Advanced" date="October-2003">
      <title>Advance DAO Programming</title>
      <author>Sean Sullivan</author>
    </article>
  </journal>
</catalog>
Parsing nth Sibling
In order to select the second article from the journal “Java Technology”, you would need the following XPath expression:
/catalog/journal[@title='Java Technology']/article[position()=2]
Of course, you can add additional constraints:
/catalog/journal[@title='Java Technology']/article[@level='Advanced' and position()=2]
which yields the same result:
<article level="Advanced" date="January-2004">
  <title>Design service-oriented architecture frameworks with J2EE technology</title>
  <author>Naveen Balani</author>
</article>
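To try the selection from code, here is a minimal sketch in Python using only the standard library. It comes with a few assumptions: xml.etree.ElementTree implements only a subset of XPath and does not know the position() function, so the sketch uses the abbreviated predicate article[2], which XPath 1.0 defines as shorthand for article[position()=2]. The helper name nth_article and the CATALOG constant are made up for illustration, and because ElementTree evaluates paths relative to the root element, the leading /catalog is dropped.

```python
import xml.etree.ElementTree as ET

# The catalog from above, without the XML declaration.
CATALOG = """\
<catalog xmlns:journal="http://www.w3.org/2001/XMLSchema-Instance">
  <journal:journal title="XML" publisher="IBM developerWorks">
    <article journal:level="Intermediate" date="February-2003">
      <title>Design XML Schemas Using UML</title>
      <author>Ayesha Malik</author>
    </article>
  </journal:journal>
  <journal title="Java Technology" publisher="IBM developerWorks">
    <article level="Advanced" date="January-2005">
      <title>Design service-oriented architecture frameworks with J2EE technology Scn. Edit.</title>
      <author>Naveen Balani</author>
    </article>
    <article level="Advanced" date="January-2004">
      <title>Design service-oriented architecture frameworks with J2EE technology</title>
      <author>Naveen Balani</author>
    </article>
    <article level="Advanced" date="October-2003">
      <title>Advance DAO Programming</title>
      <author>Sean Sullivan</author>
    </article>
  </journal>
</catalog>
"""


def nth_article(xml_text, journal_title, n):
    """Return the n-th (1-based) <article> of the named journal, or None.

    Hypothetical helper: it only matches un-prefixed <journal> elements.
    """
    root = ET.fromstring(xml_text)  # root is the <catalog> element
    # article[n] is the abbreviated form of article[position()=n].
    path = f"journal[@title='{journal_title}']/article[{n}]"
    return root.find(path)


second = nth_article(CATALOG, "Java Technology", 2)
print(second.get("date"), "-", second.findtext("title"))
# prints: January-2004 - Design service-oriented architecture frameworks with J2EE technology
```

If you need the full expression including position() and the and operator, a library with complete XPath 1.0 support is the better choice. ElementTree is used here only because it ships with Python.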
Testing XPath Expressions
Especially when working with more complex expressions you will definitely want to test them on-the-fly. Luckily, http://www.xpathtester.com/test provides a very handy interface to test your XPath expressions.
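If you would rather check positional predicates offline, a few lines of Python can stand in for a quick test bench. This is only a sketch under some assumptions: the SAMPLE document and the evaluate helper are invented for illustration, and the standard library's ElementTree understands just a subset of XPath (index predicates such as item[2] and item[last()], but not position()).

```python
import xml.etree.ElementTree as ET

# Tiny made-up document, just enough to see how positional predicates behave.
SAMPLE = "<list><item>a</item><item>b</item><item>c</item></list>"


def evaluate(xml_text, path):
    """Return the text of every element that `path` selects below the root."""
    root = ET.fromstring(xml_text)
    return [elem.text for elem in root.findall(path)]


# Positions are 1-based; an out-of-range position selects nothing.
for path in ("item[1]", "item[2]", "item[last()]", "item[4]"):
    print(f"{path:14} -> {evaluate(SAMPLE, path)}")
```

Running it lists the matches for each expression, which makes off-by-one mistakes easy to spot before the expression goes into real code.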
Happy parsing!