python - Scraping Multiple Pages On A Website
I'm trying to scrape a list of coaching institutes from this URL: https://www.sulekha.com/entrance-exam-coaching/delhi
I'm using the following Python code:
    import bs4
    from urllib.request import urlopen as ureq
    from bs4 import BeautifulSoup as soup

    my_url = 'https://www.sulekha.com/entrance-exam-coaching/delhi'

    # download the page
    uclient = ureq(my_url)
    page_html = uclient.read()
    uclient.close()

    # parse the HTML and find every listing title block
    page_soup = soup(page_html, "lxml")
    insti = page_soup.findAll("div", {"class": "list-title"})

    filename = "entrance_institutes.csv"
    f = open(filename, "w")
    headers = "institute \n"
    f.write(headers)

    for ins in insti:
        ins_name = ins.div.a["title"]
    f.write(ins_name + "\n")
    f.close()
This code runs fine. I've attached an image of the CSV it generates. How should I go about scraping the listings one page after another?
Thanks
I'm not 100% sure what you mean. If you're asking how to fix the bug in your code, you need to change the loop to:
    for ins in insti:
        ins_name = ins.div.a["title"]
        f.write(ins_name + "\n")
As written, your code loops through all the institutes but only writes the last one, because the write is not inside the loop.
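To make the point about the loop concrete, here is a minimal sketch of the writing step on its own. It swaps the manual `open`/`close` for a `with` block and the standard `csv` module, which is my choice rather than anything in the question. The function name `write_institutes`, the sample names, and the demo file path are made up for illustration; in the real script the names would come from `ins.div.a["title"]`.

```python
import csv
import os
import tempfile


def write_institutes(names, filename):
    """Write a header row, then one institute name per row."""
    # newline="" lets the csv module control line endings itself
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["institute"])
        for name in names:
            # the write happens inside the loop, so every name is saved
            writer.writerow([name])
    return filename


# hypothetical sample data standing in for the scraped titles
sample_names = ["Example Institute A", "Example Institute B"]
demo_path = os.path.join(tempfile.gettempdir(), "entrance_institutes_demo.csv")
write_institutes(sample_names, demo_path)
```

Using `with` also means the file is closed even if something fails partway through the loop.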
However, if you're asking how to take that list and scrape each entry in it, that's more involved. For starters, you'd need to save the URL rather than the title. I'm going to leave the rest to you, because this kind of sounds like homework.
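As a rough sketch of the "save the URL rather than the title" step: with BeautifulSoup it would amount to reading `ins.div.a["href"]` next to `ins.div.a["title"]`. The version below uses only the standard library's `html.parser` instead, so it runs without extra packages. The `ListingLinkParser` class and the sample HTML are invented for illustration, and I'm assuming the real page nests its links inside `div` elements with class `list-title` the way the question's code implies.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ListingLinkParser(HTMLParser):
    """Collect (title, absolute URL) for each <a> inside <div class="list-title">."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.depth = 0   # <div> nesting depth inside a list-title div; 0 means outside
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self.depth > 0:
                self.depth += 1
            elif "list-title" in (attrs.get("class") or "").split():
                self.depth = 1
        elif tag == "a" and self.depth > 0 and attrs.get("href"):
            # resolve relative links against the page they came from
            url = urljoin(self.base_url, attrs["href"])
            self.links.append((attrs.get("title") or "", url))

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1


# made-up markup; the real page structure may differ
sample_html = """
<div class="list-title"><div><a title="Example Institute A" href="/example-a">A</a></div></div>
<div class="other"><a title="Ignored" href="/ignored">x</a></div>
<div class="list-title"><div><a title="Example Institute B" href="/example-b">B</a></div></div>
"""

parser = ListingLinkParser("https://www.sulekha.com/entrance-exam-coaching/delhi")
parser.feed(sample_html)
links = parser.links
```

Each URL in `links` could then be fetched in turn with `urlopen`, the same way the first page was.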