Jussi,

Thanks, it worked when parsed with json.loads. However, the response bytes first had to be decoded with decode('utf-8'):

    data = json.loads(respData.decode('utf-8'))

On Thu, Apr 7, 2016 at 6:01 AM, Jussi Piitulainen <jussi.piitulai...@helsinki.fi> wrote:

> Emeka writes:
>
> > Hello All,
> >
> > import urllib.request
> > import re
> >
> > url = 'https://www.everyday.com/'
> >
> > req = urllib.request.Request(url)
> > resp = urllib.request.urlopen(req)
> > respData = resp.read()
> >
> > paragraphs = re.findall(r'\[(.*?)\]',str(respData))
> > for eachP in paragraphs:
> >     print("".join(eachP.split(',')[1:-2]))
> >     print("\n")
> >
> > I got the below:
> > "Coke - Yala Market Branch""NO. 113 IKU BAKR WAY YALA"""
> > But what I need is
> >
> > 'Coke - Yala Market Branch NO. 113 IKU BAKR WAY YALA'
> >
> > How do I achieve the above?
>
> A couple of things you could do to understand your problem and work
> around it: Change your code to print(eachP). Change your "".join to
> "!".join to see where the commas were. Experiment with data of that form
> in the REPL. Sometimes it's good to print repr(datum) instead of datum,
> though not in this case.
>
> But are you trying to extract and parse paragraphs from a JSON response?
> Do not use regex for that at all. Use json.load or json.loads to parse
> it properly, and access the relevant data by indexing:
>
> x = json.loads('{"foo":[["Weather Forecast","It\'s Rain"],[]]}')
>
> x ==> {'foo': [['Weather Forecast', "It's Rain"], []]}
>
> x['foo'] ==> [['Weather Forecast', "It's Rain"], []]
>
> x['foo'][0] ==> ['Weather Forecast', "It's Rain"]
> --
> https://mail.python.org/mailman/listinfo/python-list

--
P.S. Please join our groups: nigeriaarduinogr...@googlegroups.com or jifunze-kufiki...@googlegroups.com
These are platforms for learning and sharing of knowledge.
www.satajanus.com | Satajanus Nig. Ltd
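
For anyone finding this thread later, here is a minimal sketch of the decode-then-parse approach discussed above. The real response body was never posted, so the sample bytes, the "branches" key, and the helper name parse_json_bytes are all assumptions made for illustration. Note that since Python 3.6 json.loads also accepts bytes directly, so the explicit decode is only strictly needed on older versions.

```python
import json

# Hypothetical stand-in for resp.read(); the actual response is not shown in the thread.
resp_data = b'{"branches": [["Coke - Yala Market Branch", "NO. 113 IKU BAKR WAY YALA"]]}'


def parse_json_bytes(raw, encoding="utf-8"):
    """Decode raw response bytes to str, then parse the text as JSON."""
    return json.loads(raw.decode(encoding))


data = parse_json_bytes(resp_data)

# Index into the parsed structure instead of regex-matching its text form.
lines = [" ".join(row) for row in data["branches"]]
for line in lines:
    print(line)
```

With a real response, resp_data would come from resp.read(), and the indexing would follow whatever structure the server actually returns.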