python - `pyparsing`: iterating over `ParseResults`

May 15, 2011

I've started using pyparsing this evening, and I've built a complex grammar which describes the sources I'm working with very effectively. It was easy and powerful. However, I'm having trouble working with ParseResults. I need to be able to iterate over nested tokens in the order that they're found, and I'm finding it a little frustrating. I've abstracted my problem to a simple case:

    # Python 2 syntax (print statements), as in the original post
    import pyparsing as pp

    word = pp.Word(pp.alphas + ',.')('word*')
    direct_speech = pp.Suppress('“') + pp.Group(pp.OneOrMore(word))('direct_speech*') + pp.Suppress('”')
    sentence = pp.Group(pp.OneOrMore(word | direct_speech))('sentence')

    test_string = 'lorem ipsum “dolor sit” amet, consectetur.'

    r = sentence.parseString(test_string)

    print r.asXML('div')

    print ''

    for name, item in r.sentence.items():
        print name, item

    print ''

    # This loop raises AttributeError: plain string tokens have no getName()
    for item in r.sentence:
        print item.getName(), item.asList()

As far as I can see, this ought to work? Here is the output:

    <div>
      <sentence>
        <word>lorem</word>
        <word>ipsum</word>
        <direct_speech>
          <word>dolor</word>
          <word>sit</word>
        </direct_speech>
        <word>amet,</word>
        <word>consectetur.</word>
      </sentence>
    </div>

    word ['lorem', 'ipsum', 'amet,', 'consectetur.']
    direct_speech [['dolor', 'sit']]

    Traceback (most recent call last):
      File "./test.py", line 27, in <module>
        print item.getName(), item.asList()
    AttributeError: 'str' object has no attribute 'getName'

The XML output seems to indicate that the string is parsed as I would wish, but I can't iterate over the sentence, for example, to reconstruct it.

Is there a way to do what I need to?

Thanks!

Edit:

I've been using this:

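    for item in r.sentence:
        if isinstance(item, basestring):
            # Plain string token: no results name is available here
            print item
        else:
            # Nested ParseResults (the grouped direct_speech)
            print item.getName(), item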

But this doesn't help me much, because I can't distinguish between different types of string. Here is an expanded example:

    word = pp.Word(pp.alphas + ',.')('word*')
    number = pp.Word(pp.nums + ',.')('number*')

    direct_speech = pp.Suppress('“') + pp.Group(pp.OneOrMore(word | number))('direct_speech*') + pp.Suppress('”')
    sentence = pp.Group(pp.OneOrMore(word | number | direct_speech))('sentence')

    test_string = 'lorem 14 ipsum “dolor 22 sit” amet, consectetur.'

    r = sentence.parseString(test_string)

    for i, item in enumerate(r.sentence):
        if isinstance(item, basestring):
            print i, item
        else:
            print i, item.getName(), item

The output is:

    0 lorem
    1 14
    2 ipsum
    3 word ['dolor', '22', 'sit']
    4 amet,
    5 consectetur.

Not very helpful. I can't distinguish between word and number, and the direct_speech element is labelled as word?!

I'm missing something. All I want to do is:

    for item in r.sentence:
        if (item is a number):
            ...
        elif (item is a word):
            ...
        else etc. ...

Should I be approaching this differently?

r.sentence contains a mix of strings and ParseResults, and only the ParseResults support getName(). Have you tried just iterating over r.sentence? If I print it out using asList(), I get:

    ['lorem', 'ipsum', ['dolor', 'sit'], 'amet,', 'consectetur.']

Or this snippet:

    for item in r.sentence:
        # Show each token's type; only nested ParseResults have asList()
        print type(item), item.asList() if isinstance(item, pp.ParseResults) else item

Gives:

    <type 'str'> lorem
    <type 'str'> ipsum
    <class 'pyparsing.ParseResults'> ['dolor', 'sit']
    <type 'str'> amet,
    <type 'str'> consectetur.
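Since the tokens come back in the order they were found, with the grouped direct speech as a nested element, reconstructing the sentence can be done with a small recursive walk. Here is a minimal sketch that works on the plain nested list that asList() returns, so it needs no pyparsing import. The helper name `rebuild` is hypothetical, and it assumes every nested list is a direct_speech group that should be wrapped in the same curly quotes the grammar suppresses. It uses print() so that it also runs on Python 3.

```python
def rebuild(tokens):
    """Join a nested token list back into text, quoting nested groups."""
    parts = []
    for item in tokens:
        if isinstance(item, list):
            # Assumption: a nested list is a grouped direct_speech element.
            parts.append('“' + rebuild(item) + '”')
        else:
            # A plain string token is emitted unchanged.
            parts.append(item)
    return ' '.join(parts)


# The nested list shown above, as produced by r.sentence.asList()
nested = ['lorem', 'ipsum', ['dolor', 'sit'], 'amet,', 'consectetur.']
print(rebuild(nested))
```

This only recovers single-space separation between tokens, because the parser skips whitespace and does not record it.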

I'm not sure I've answered your question, but does this shed some light on where to go next?

(Welcome to pyparsing)
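One possible way to tell words from numbers while iterating, sketched here as an assumption and not as the approach given in the answer above, is to attach a parse action to each expression so that every token carries its own label. In this sketch the hypothetical helper `tag` replaces each matched string with a `(kind, text)` tuple, results names are dropped entirely, and the only nested ParseResults left is the direct speech group. It is written with print() for Python 3 and uses the same camelCase pyparsing calls as the question.

```python
import pyparsing as pp


def tag(kind):
    """Build a parse action that replaces a token with a (kind, text) tuple."""
    def action(tokens):
        return (kind, tokens[0])
    return action


word = pp.Word(pp.alphas + ',.').setParseAction(tag('word'))
number = pp.Word(pp.nums + ',.').setParseAction(tag('number'))
direct_speech = (pp.Suppress('“')
                 + pp.Group(pp.OneOrMore(word | number))
                 + pp.Suppress('”'))
sentence = pp.OneOrMore(word | number | direct_speech)

result = sentence.parseString('lorem 14 ipsum “dolor 22 sit” amet, consectetur.')

tagged = []
for item in result:
    if isinstance(item, pp.ParseResults):
        # The only nested results in this grammar are direct_speech groups.
        tagged.append(('direct_speech', list(item)))
    else:
        # Every other token is a (kind, text) tuple from the parse action.
        tagged.append(item)

for kind, value in tagged:
    print(kind, value)
```

With the labels stored in the tokens themselves, the loop can branch on `kind` instead of relying on getName().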
