Python - Get text of next sibling based on text of previous sibling
I have the following HTML:
<div id="infotable">
  <h4> user </h4>
  <table>
    <tbody>
      <tr> <td class="name"> <a href="/userpage/123">billybob12345</a> </td> </tr>
      <tr> <td class="name"> <a href="/userpage/124">jimbob43</a> </td> </tr>
    </tbody>
  </table>
  <h4> super user </h4>
  <table>
    <tbody>
      <tr> <td class="name"> <a href="/userpage/112">cookiemonster</a> </td> </tr>
    </tbody>
  </table>
</div>
Basically, I am looking for 2 lists:
users = [{"billybob12345" : "123"}, {"jimbob43" : "124"}]
superusers = [{"cookiemonster" : "112"}]
I am using Python 2.7 with BeautifulSoup4, and I am able to find all of the users, but I can't split them into their respective groups.
If you happen to know the order the tables are in, you can use a list comprehension to create the lists of dictionaries, parsing the "userpage" number using .split('/'):
# soup is the BeautifulSoup object parsed from the HTML above
firsttable = soup.find_all('table')[0]
users = [{a.text: a['href'].split('/')[2]} for a in firsttable.find_all('a')]
secondtable = soup.find_all('table')[1]
superusers = [{a.text: a['href'].split('/')[2]} for a in secondtable.find_all('a')]
>>> users
[{'billybob12345': '123'}, {'jimbob43': '124'}]
>>> superusers
[{'cookiemonster': '112'}]
If you want to access the group name ("user") to use in a dictionary, you can use:
>>> firsttable.previous_sibling.previous_sibling
<h4> user </h4>
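If you don't know the order ahead of time, one possible approach is to start from each h4 and take the table that follows it, rather than indexing the tables by position. This is only a sketch under a few assumptions: the html string is copied from the question, every h4 inside #infotable is followed by its own table, and the groups dictionary is a name made up here for illustration.

```python
from bs4 import BeautifulSoup

# HTML copied from the question
html = """
<div id="infotable">
  <h4> user </h4>
  <table><tbody>
    <tr> <td class="name"> <a href="/userpage/123">billybob12345</a> </td> </tr>
    <tr> <td class="name"> <a href="/userpage/124">jimbob43</a> </td> </tr>
  </tbody></table>
  <h4> super user </h4>
  <table><tbody>
    <tr> <td class="name"> <a href="/userpage/112">cookiemonster</a> </td> </tr>
  </tbody></table>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Hypothetical helper structure: heading text -> list of {name: id} dicts
groups = {}
for heading in soup.find(id="infotable").find_all("h4"):
    # find_next_sibling skips the whitespace text node between </h4> and <table>
    table = heading.find_next_sibling("table")
    groups[heading.get_text(strip=True)] = [
        {a.get_text(strip=True): a["href"].split("/")[2]}
        for a in table.find_all("a")
    ]

users = groups["user"]
superusers = groups["super user"]
```

Keying on the heading text means the lists stay correct even if the tables are reordered or a new group is added, as long as each heading is still immediately followed by its table.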