html - How to read text from a website in Python



I would like to read some information from a website: http://www.federalreserve.gov/monetarypolicy/beigebook/beigebook201301.htm

I have the following code, which reads the HTML source:

import urllib2

def connect2web():
    # Fetch the page and print its raw HTML source (Python 2)
    aresp = urllib2.urlopen("http://www.federalreserve.gov/monetarypolicy/" +
                            "beigebook/beigebook201301.htm")
    web_pg = aresp.read()
    print web_pg

I am lost on how to parse this information, however, because the HTML parsers I have seen require a file or the original website, whereas I already have the information I need in a string.

We started with BS (BeautifulSoup) some time ago but have since moved to lxml, which can parse directly from a string:

from lxml import html

# web_pg is the HTML string read from the response
my_tree = html.fromstring(web_pg)
elements = [item for item in my_tree.iter()]
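If lxml is not available, the same point (no file or URL is needed once the page is in a string) can be sketched with the standard library alone. This is only a minimal illustration, assuming Python 3; the names `TextCollector` and `text_from_html` are made up for this example, and the sample string stands in for `web_pg`. It also collects the contents of script and style tags, which a real page would probably want to skip.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the non-empty text chunks found in an HTML string."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for each run of text between tags.
        text = data.strip()
        if text:
            self.chunks.append(text)


def text_from_html(html_string):
    # Hypothetical helper: feed the string straight to the parser.
    parser = TextCollector()
    parser.feed(html_string)
    parser.close()
    return parser.chunks


# Stand-in for the web_pg string returned by the request.
sample = "<div> stuff <table><tr><td> banana </td></tr></table> more stuff </div>"
print(text_from_html(sample))
```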

So now you have to decide which elements you want, and you need to make sure the elements you keep are not children of other elements you have decided to keep. For instance:

<div>
  stuff
  <table>
    <tr> <td> banana </td> </tr>
  </table>
  more stuff
</div>

In the HTML above, the table is a child of the div, so the information in the table is already contained in the div. You have to use some logic to keep only those elements whose parents are not kept.
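One way that logic might look is sketched below. It is not the code we actually use: it swaps in the standard library's `xml.etree.ElementTree` for lxml, which only works here because the snippet is well-formed (assuming the second table tag is meant to be a closing tag). The function name `outermost` and the `wanted_tags` parameter are made up for this example. With lxml, the parent map would not be needed, since its elements have `getparent()`.

```python
import xml.etree.ElementTree as ET

snippet = """
<div> stuff
  <table>
    <tr> <td> banana </td> </tr>
  </table>
  more stuff
</div>
"""


def outermost(root, wanted_tags):
    """Return wanted elements that have no wanted ancestor."""
    # ElementTree elements do not know their parent, so build a lookup.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    def has_wanted_ancestor(elem):
        node = parent_of.get(elem)
        while node is not None:
            if node.tag in wanted_tags:
                return True
            node = parent_of.get(node)
        return False

    return [
        elem for elem in root.iter()
        if elem.tag in wanted_tags and not has_wanted_ancestor(elem)
    ]


root = ET.fromstring(snippet)
kept = outermost(root, {"div", "table"})
# The table is dropped because its text is already inside the kept div.
print([elem.tag for elem in kept])
print(" ".join("".join(kept[0].itertext()).split()))
```

Walking up the ancestors for every candidate is quadratic in the worst case, which should be fine for a single page; for very large trees a single top-down pass that stops descending once an element is kept would be cheaper.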

