python - What is an ElementTree object exactly, and how can I get data from it?
I'm trying to teach myself how to parse XML. I've read the lxml tutorials, but they're hard to understand. So far, I can do:
>>> from lxml import etree
>>> xml = etree.parse('ham.xml')
>>> xml
<lxml.etree._ElementTree object at 0x118de60>
But how can I get data from this object? It can't be indexed like xml[0], and it can't be iterated over.
More specifically, I'm using this XML file, and I'm trying to extract, say, everything between the <l> tags that is surrounded by <sp> tags that contain, say, the Barnardo attribute.
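It is an ElementTree object, lxml's wrapper around the whole parsed document. Its root Element object is available through xml.getroot().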
You can also look at the lxml API documentation, which has an lxml.etree._Element page. That page tells you about every single attribute and method on that class you'd ever want to know about.
I'd start by reading the lxml.etree tutorial, however.
If an element cannot be indexed, however, it is an empty tag, and there are no child nodes to retrieve.
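As a rough sketch of what working with these objects looks like, the snippet below parses a small inline document, fetches the root element with getroot(), and then indexes and iterates over it. The sample XML is a made-up stand-in, not the asker's ham.xml, and the variable names are only for illustration. If lxml is not installed it falls back to the standard library's xml.etree.ElementTree, which shares the calls used here.

```python
import io

try:
    from lxml import etree
except ImportError:  # stdlib fallback; same API for the calls used below
    import xml.etree.ElementTree as etree

# Hypothetical sample document, standing in for ham.xml
SAMPLE = b"""<play>
  <sp who="Barnardo"><l>Who's there?</l></sp>
  <sp who="Francisco"><l>Nay, answer me.</l><l>Stand, and unfold yourself.</l></sp>
  <empty/>
</play>"""

tree = etree.parse(io.BytesIO(SAMPLE))  # an ElementTree wrapping the document
root = tree.getroot()                   # the top-level Element, <play>

first = root[0]                         # Elements support indexing
print(first.tag, first.get('who'))      # sp Barnardo

child_tags = [child.tag for child in root]  # Elements support iteration
print(child_tags)                       # ['sp', 'sp', 'empty']

print(len(root[2]))                     # 0, an empty tag has no children
```

The tree object itself is mostly a handle on the document; the data lives on the Element objects reachable from the root.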
To find all lines by Barnardo, an XPath expression is needed, with a namespace map. It doesn't matter what prefix you use, as long as it is a non-empty string; lxml will map it to the correct namespace URL:
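from lxml import etree

tree = etree.parse('ham.xml')
# Any non-empty prefix works, as long as it maps to the document's namespace URL
nsmap = {'s': 'http://www.tei-c.org/ns/1.0'}
for line in tree.xpath('.//s:sp[@who="Barnardo"]/s:l/text()', namespaces=nsmap):
    print(line.strip())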
This extracts all the text in the <l> elements contained in <sp who="Barnardo"> tags. Note the s: prefixes on the tag names; the nsmap dictionary tells lxml what namespace to use. I printed these without their surrounding whitespace.
For your sample document, this gives:
>>> for line in tree.xpath('.//s:sp[@who="Barnardo"]/s:l/text()', namespaces=nsmap):
...     print(line.strip())
...
Who's there?
Long live the king!
He.
'Tis now struck twelve; get thee to bed, Francisco.
Have you had quiet guard?
Well, good night.
If you do meet Horatio and Marcellus,
The rivals of my watch, bid them make haste.
Say,
What, is Horatio there?
Welcome, Horatio: welcome, good Marcellus.
I have seen nothing.
Sit down awhile;
And let us once again assail your ears,
That are so fortified against our story
What we have two nights seen.
Last night of all,
When yond same star that's westward from the pole
Had made his course to illume that part of heaven
Where now it burns, Marcellus and myself,
The bell then beating one,
In the same figure, like the king that's dead.
Looks 'a not like the king? mark it, Horatio.
It would be spoke to.
See, it stalks away!
How now, Horatio! you tremble and look pale:
Is not this something more than fantasy?
What think you on't?
I think it be no other but e'en so:
Well may it sort that this portentous figure
Comes armed through our watch; so like the king
That was and is the question of these wars.
'Tis here!
It was about to speak, when the cock crew.
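To illustrate the point about prefixes, here is a minimal sketch on a tiny inline document that declares the TEI namespace as its default. The sample XML, the helper name lines_for and the second prefix 'tei' are all made up for illustration. It uses findall() rather than xpath(): findall() accepts the same kind of prefix map and also exists in the standard library's xml.etree.ElementTree, which the snippet falls back to if lxml is missing.

```python
try:
    from lxml import etree
except ImportError:  # stdlib fallback; findall() takes the same arguments
    import xml.etree.ElementTree as etree

# Hypothetical sample; every element is in the TEI default namespace
SAMPLE = b"""<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <sp who="Barnardo"><l>Who's there?</l></sp>
  <sp who="Francisco"><l>Nay, answer me.</l></sp>
  <sp who="Barnardo"><l>Long live the king!</l></sp>
</TEI>"""

root = etree.fromstring(SAMPLE)


def lines_for(prefix, speaker):
    """Return the stripped text of each <l> spoken by `speaker`."""
    nsmap = {prefix: 'http://www.tei-c.org/ns/1.0'}
    path = './/{0}:sp[@who="{1}"]/{0}:l'.format(prefix, speaker)
    return [l.text.strip() for l in root.findall(path, namespaces=nsmap)]


with_s = lines_for('s', 'Barnardo')
with_tei = lines_for('tei', 'Barnardo')
print(with_s)              # ["Who's there?", 'Long live the king!']
print(with_s == with_tei)  # True, the prefix name itself is irrelevant

# Without a prefix the tag names are looked up in no namespace and match nothing
print(root.findall('.//sp'))  # []
```

The last line shows why the map is needed at all: the tags in the document are namespaced even though they carry no visible prefix.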