Programmatically clean/ignore namespaces in XML

July 15, 2011

I'm trying to write a simple program to read financial XML files from GnuCash, and learn Python in the process.

The XML looks like this:

    <?xml version="1.0" encoding="utf-8" ?>
    <gnc-v2
         xmlns:gnc="http://www.gnucash.org/xml/gnc"
         xmlns:act="http://www.gnucash.org/xml/act"
         xmlns:book="http://www.gnucash.org/xml/book"
         {...}
         xmlns:vendor="http://www.gnucash.org/xml/vendor">
    <gnc:count-data cd:type="book">1</gnc:count-data>
    <gnc:book version="2.0.0">
    <book:id type="guid">91314601aa6afd17727c44657419974a</book:id>
    <gnc:count-data cd:type="account">80</gnc:count-data>
    <gnc:count-data cd:type="transaction">826</gnc:count-data>
    <gnc:count-data cd:type="budget">1</gnc:count-data>
    <gnc:commodity version="2.0.0">
      <cmdty:space>iso4217</cmdty:space>
      <cmdty:id>brl</cmdty:id>
      <cmdty:get_quotes/>
      <cmdty:quote_source>currency</cmdty:quote_source>
      <cmdty:quote_tz/>
    </gnc:commodity>

Right now, I'm able to iterate and get results using

    import xml.etree.ElementTree as ET

    r = ET.parse("file.xml").findall('.//')

after manually cleaning the namespaces, but I'm looking for a solution that could either read the entries regardless of their namespaces or remove the namespaces before parsing.
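One way to read entries regardless of namespace, sketched under some assumptions: ElementTree stores each namespaced tag as "{namespace-uri}local-name", so a search can either pass a prefix-to-URI mapping or, on Python 3.8 and later, use the "{*}" wildcard. The cut-down sample document and the cmdty namespace URI below are made up for illustration, since that declaration is elided in the excerpt above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, cut-down sample modelled on the GnuCash excerpt.
# The cmdty URI is an assumption; it is hidden behind "{...}" in the original.
SAMPLE = """\
<gnc-v2 xmlns:gnc="http://www.gnucash.org/xml/gnc"
        xmlns:cmdty="http://www.gnucash.org/xml/cmdty">
  <gnc:commodity version="2.0.0">
    <cmdty:space>iso4217</cmdty:space>
    <cmdty:id>brl</cmdty:id>
  </gnc:commodity>
</gnc-v2>"""

root = ET.fromstring(SAMPLE)

# After parsing, the prefix is gone and the tag carries the full URI.
commodity = root[0]
print(commodity.tag)  # {http://www.gnucash.org/xml/gnc}commodity

# Option 1: keep the namespaces and pass a prefix-to-URI mapping to find().
NS = {
    "gnc": "http://www.gnucash.org/xml/gnc",
    "cmdty": "http://www.gnucash.org/xml/cmdty",
}
space = root.find("gnc:commodity/cmdty:space", NS)

# Option 2 (Python 3.8+): "{*}" matches the local name in any namespace.
ids = root.findall(".//{*}id")

print(space.text, [e.text for e in ids])
```

Option 1 needs the URIs to be known up front; option 2 ignores them entirely, which is closer to "regardless of namespaces" but can match same-named elements from different vocabularies.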

Note that I'm a complete noob in Python, and I've read "Python and GnuCash: Extract data from GnuCash files", "Cleaning an XML file in Python before parsing" and "python: xml.etree.ElementTree, removing "namespaces"", along with the ElementTree docs, and I'm still lost...

I've come up with this solution:

    import re

    def strip_namespaces(self, tree):

        nspopen = re.compile("<\w*:", re.IGNORECASE)
        nspclose = re.compile("<\/\w*:", re.IGNORECASE)

        for i in tree:
            start = re.sub(nspopen, '<', tree.tag)
            end = re.sub(nspopen, '<\/', tree.tag)

        # pprint(finaltree)
        return

But I'm failing to apply it. I can't seem to retrieve the tag names as they appear in the file.
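A likely reason the regex attempt finds nothing to replace: once the file is parsed, each element's tag holds "{uri}name" rather than "<prefix:name", so a pattern anchored on "<" never matches. Below is a minimal sketch of removing namespaces after parsing instead, by rewriting the tag and attribute names in place. The sample input and the cd namespace URI are assumptions made for illustration, and this strip_namespaces is a hypothetical stand-alone replacement, not a fix of the method above.

```python
import xml.etree.ElementTree as ET


def strip_namespaces(root):
    """Drop the "{uri}" part from every tag and attribute name, in place."""
    for elem in root.iter():
        # Namespaced tags look like "{uri}local"; keep only "local".
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
        # Attribute names follow the same "{uri}local" convention.
        for name in list(elem.attrib):
            if name.startswith("{"):
                elem.attrib[name.split("}", 1)[1]] = elem.attrib.pop(name)
    return root


# Hypothetical sample; the cd URI is assumed (elided in the original excerpt).
SAMPLE = """\
<gnc-v2 xmlns:gnc="http://www.gnucash.org/xml/gnc"
        xmlns:cd="http://www.gnucash.org/xml/cd"
        xmlns:book="http://www.gnucash.org/xml/book">
  <gnc:count-data cd:type="book">1</gnc:count-data>
  <gnc:book version="2.0.0">
    <book:id type="guid">91314601aa6afd17727c44657419974a</book:id>
  </gnc:book>
</gnc-v2>"""

root = strip_namespaces(ET.fromstring(SAMPLE))
tags = [elem.tag for elem in root.iter()]
print(tags)
print(root.find("book/id").text)
```

One caveat with this approach: names that differ only by namespace (for example book:id and cmdty:id) both become plain "id" afterwards, so lookups need to go through the parent element to stay unambiguous.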
