XPath is a syntax for selecting nodes from XML documents. You can look at the tutorials here and here. Below are some basic examples that should suffice for our assignments. Each box shows a code snippet followed by the output it produces. You can download the code and the example XML file for these examples if you want to modify and test them yourself. All of the examples read the following file, example.xml:
<code>
<bookstore location="Philadelphia">
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="es">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <dvd category="COMEDY">
    <title lang="en">Legally Blonde</title>
    <year>2001</year>
    <price>9.95</price>
  </dvd>
</bookstore>
</code>
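>> import lxml.etree
>> # parse() accepts a filename directly and returns the whole document tree
>> doc = lxml.etree.parse('example.xml')
>> print("Bookstore locations")
>> # '/bookstore' is an absolute path: it matches only a root element named bookstore
>> for bookstore in doc.xpath('/bookstore'):
>>     print(bookstore.get('location'))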
Bookstore locations
Philadelphia
>> print("Book categories")
>> # '/book' asks for a root element named book; the root is bookstore, so nothing matches
>> for book in doc.xpath('/book'):
>>     print(book.get('category'))
Book categories
>> print("Book categories")
>> # '//book' matches book elements at any depth in the document
>> for book in doc.xpath('//book'):
>>     print(book.get('category'))
Book categories
COOKING
CHILDREN
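The two snippets above differ only in the leading `/` versus `//`. The following is a minimal sketch of that difference, assuming a trimmed inline copy of the example document instead of example.xml; the helper name `categories` is hypothetical and only there to keep the three queries side by side.

```python
from lxml import etree

# Trimmed inline copy of the example document (an assumption made so the
# sketch runs without example.xml on disk).
XML = """
<bookstore location="Philadelphia">
  <book category="COOKING"><title lang="en">Everyday Italian</title></book>
  <book category="CHILDREN"><title lang="es">Harry Potter</title></book>
  <dvd category="COMEDY"><title lang="en">Legally Blonde</title></dvd>
</bookstore>
"""

doc = etree.fromstring(XML.strip())


def categories(path):
    """Return the category attribute of every node matched by path."""
    return [node.get('category') for node in doc.xpath(path)]


absolute = categories('/book')            # root element is bookstore: no match
anywhere = categories('//book')           # book elements at any depth
stepwise = categories('/bookstore/book')  # full path spelled out from the root

print(absolute)
print(anywhere)
print(stepwise)
```

A single leading `/` anchors the path at the document root, so the first query comes back empty, while the other two both reach the `book` elements.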
>> print("DVD categories")
>> for dvd in doc.xpath('//dvd'):
>>     print(dvd.get('category'))
DVD categories
COMEDY
>> print("All titles")
>> # title.text is the element's text content; get('lang') reads its lang attribute
>> for title in doc.xpath('//title'):
>>     print('%s available in language %s' % (title.text, title.get('lang')))
All titles
Everyday Italian available in language en
Harry Potter available in language es
Legally Blonde available in language en
>> print("Only DVD titles")
>> # '//dvd//title' matches title elements nested anywhere inside a dvd element
>> for title in doc.xpath('//dvd//title'):
>>     print('%s available in language %s' % (title.text, title.get('lang')))
Only DVD titles
Legally Blonde available in language en
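The last snippet narrows the match by chaining two `//` steps. As a rough sketch of the same idea, the hypothetical helper `titles_under` below builds that path for any parent tag. It assumes a trimmed inline copy of the example document and that `tag` is a plain element name, since it is pasted into the path unchanged.

```python
from lxml import etree

# Trimmed inline copy of the example document (an assumption made so the
# sketch runs without example.xml on disk).
XML = """
<bookstore location="Philadelphia">
  <book category="COOKING"><title lang="en">Everyday Italian</title></book>
  <book category="CHILDREN"><title lang="es">Harry Potter</title></book>
  <dvd category="COMEDY"><title lang="en">Legally Blonde</title></dvd>
</bookstore>
"""

doc = etree.fromstring(XML.strip())


def titles_under(tag):
    """Return (text, lang) for every title nested anywhere inside a <tag> element."""
    path = '//%s//title' % tag  # e.g. '//dvd//title'
    return [(title.text, title.get('lang')) for title in doc.xpath(path)]


dvd_titles = titles_under('dvd')
book_titles = titles_under('book')

for text, lang in dvd_titles:
    print('%s available in language %s' % (text, lang))
```

Calling `titles_under('book')` returns the two book titles and leaves out the DVD, which mirrors how `//dvd//title` leaves out the books.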