Hi, I am facing an issue with listing specific URLs inside a web page:
https://economictimes.indiatimes.com/archive.cms

This page contains links organized by year and month, for example /archive/year-2001,month-1.cms. I am able to list all the required URLs using the code below:

    from bs4 import BeautifulSoup
    import re
    import urllib.request

    req = urllib.request.Request('http://economictimes.indiatimes.com/archive.cms',
                                 headers={'User-Agent': 'Mozilla/5.0'})
    links = []
    url = "http://economictimes.indiatimes.com"

    data = urllib.request.urlopen(req).read()
    page = BeautifulSoup(data, 'html.parser')

    # Retrieve the URLs that start with "/archive/"
    for link in page.findAll('a', href=re.compile('^/archive/')):
        l = link.get('href')
        links.append(url + l)

    # Write one URL per line to the text file
    with open("output.txt", "a") as f:
        for post in links:
            post = post + '\n'
            f.write(post)

Sample result in the text file:

    http://economictimes.indiatimes.com/archive/year-2001,month-1.cms
    http://economictimes.indiatimes.com/archive/year-2001,month-2.cms
    http://economictimes.indiatimes.com/archive/year-2001,month-3.cms
    http://economictimes.indiatimes.com/archive/year-2001,month-4.cms
    http://economictimes.indiatimes.com/archive/year-2001,month-5.cms
    http://economictimes.indiatimes.com/archive/year-2001,month-6.cms

I am storing this list of URLs in a text file. From each month URL I want to retrieve the day URLs, which start with "/archivelist". I am using the code below, but I am not getting any result. If I check with Inspect Element, the URLs starting with /archivelist are there:

    <a href="/archivelist/year-2001,month-3,starttime=36951.cms"></a>

Kindly help me find where I am going wrong.

    from bs4 import BeautifulSoup
    import re
    import urllib.request

    file = open("output.txt", "r")

    for i in file:
        urls = urllib.request.Request(i, headers={'User-Agent': 'Mozilla/5.0'})
        data1 = urllib.request.urlopen(urls).read()
        page1 = BeautifulSoup(data1, 'html.parser')

        for link1 in page1.findAll(href=re.compile('^/archivelist/')):
            l1 = link1.get('href')
            print(l1)

Thanks,
Kishore.
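
For reference, here is a minimal sketch of the two pieces the second script relies on: cleaning the lines read from output.txt and pulling out hrefs with a given prefix. Each line read from a text file still ends with its newline character, so stripping it before building the request is a sensible first thing to try, though it may not be the whole cause. The sketch uses the standard library's HTMLParser in place of BeautifulSoup so it runs without extra packages. The names PrefixLinkParser, extract_links and clean_url_lines are hypothetical, and sample_html is made-up markup modelled on the anchor quoted above, not the real page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PrefixLinkParser(HTMLParser):
    """Collect href values of <a> tags whose href starts with a given prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(self.prefix):
                self.hrefs.append(value)


def extract_links(html, base_url, prefix):
    """Return absolute URLs for every matching link found in html."""
    parser = PrefixLinkParser(prefix)
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.hrefs]


def clean_url_lines(lines):
    """Strip surrounding whitespace (including the newline) and skip blank lines."""
    return [line.strip() for line in lines if line.strip()]


# Hypothetical sample markup, modelled on the anchor quoted in the post.
sample_html = (
    '<a href="/archivelist/year-2001,month-3,starttime=36951.cms"></a>'
    '<a href="/other">x</a>'
)
base = "http://economictimes.indiatimes.com"
day_urls = extract_links(sample_html, base, "/archivelist/")

# Lines as they would come back from iterating over output.txt.
month_urls = clean_url_lines([
    "http://economictimes.indiatimes.com/archive/year-2001,month-1.cms\n",
    "\n",
])

print(day_urls)
print(month_urls)
```

If the stripped URLs still return no /archivelist links, it is worth printing part of the fetched HTML: links that a browser adds with JavaScript after the page loads show up in Inspect Element but are absent from the raw response.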
-- https://mail.python.org/mailman/listinfo/python-list