On Sun, May 8, 2016 at 5:24 PM, DFS <nos...@dfs.com> wrote:
> On 5/8/2016 7:36 AM, Steven D'Aprano wrote:
>>
>> On Sun, 8 May 2016 11:16 am, DFS wrote:
>>
>>> address data is scraped from a website:
>>>
>>> names = tree.xpath()
>>> addr = tree.xpath()
>>
>> Why are you scraping the data twice?
>
> Because it exists in 2 different sections of the document.
>
> names = tree.xpath('//span[@class="header_text3"]/text()')
> addresses = tree.xpath('//span[@class="text3"]/text()')
>
> I thought you were a "master who knew her tools", and I was the apprentice?
>
> So why did "the master" think xpath() was magic?
>
>> names = addr = tree.xpath()
>>
>> or if you prefer the old-fashioned:
>>
>> names = tree.xpath()
>> addr = names
>>
>> but that raises the question, how can you describe the same set of data as
>> both "names" and "addr[esses]" and have them both be accurate?
>>
>>> I want to store the data atomically,
>>
>> I'm not really sure what you mean by "atomically" here. I know what *I* mean
>> by "atomically", which is to describe an operation which either succeeds
>> entirely or fails.
>
> That's atomicity.
>
>> But I don't know what you mean by it.
>
> http://www.databasedesign-resource.com/atomic-database-values.html
>
>>> so I parse street, city, state, and
>>> zip into their own lists.
>>
>> None of which is atomic.
>
> All of which are atomic.
>
>>> "1250 Peachtree Rd, Atlanta, GA 30303
>>>
>>> street = [s.split(',')[0] for s in addr]
>>> city = [c.split(',')[1].strip() for c in addr]
>>> state = [s[-8:][:2] for s in addr]
>>> zipcd = [z[-5:] for z in addr]
>>
>> At this point, instead of iterating over the same list four times, doing the
>> same thing over and over again, you should do things the old-fashioned way:
>>
>> streets, cities, states, zipcodes = [], [], [], []
>> for word in addr:
>>     items = word.split(',')
>>     streets.append(items[0])
>>     cities.append(items[1].strip())
>>     states.append(word[-8:-2])
>>     zipcodes.append(word[-5:])
>
> That's a good one.
>
> Chris Angelico mentioned something like that, too, and I already put it
> place.
>
>> Oh, and use better names. "street" is a single street, not a list of
>> streets, note plural.
>
> I'll use whatever names I like.
>
> --
> https://mail.python.org/mailman/listinfo/python-list

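For anyone following along, here is a minimal sketch of the single-pass
version quoted above. It assumes every entry looks exactly like
"street, city, ST 12345" (two-letter state, five-digit ZIP, no trailing
whitespace); the function name split_addresses is made up for illustration.
Note that the quoted word[-8:-2] would pick up "GA 303" for the sample
address, so the sketch uses word[-8:-6], which matches the original
s[-8:][:2].

```python
def split_addresses(addr):
    """Split 'street, city, ST 12345' strings into four parallel lists.

    Hypothetical helper: assumes a fixed layout with a two-letter state
    and a five-digit ZIP code at the very end of each string.
    """
    streets, cities, states, zipcodes = [], [], [], []
    for word in addr:
        items = word.split(',')
        streets.append(items[0].strip())
        cities.append(items[1].strip())
        states.append(word[-8:-6])    # same result as word[-8:][:2]
        zipcodes.append(word[-5:])    # last five characters
    return streets, cities, states, zipcodes


streets, cities, states, zipcodes = split_addresses(
    ["1250 Peachtree Rd, Atlanta, GA 30303"]
)
```

The fixed offsets break as soon as an entry carries a ZIP+4 code or stray
whitespace, so real scraped data would probably want a more forgiving parse.
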
Starting to look like trolling. Lots of good advice here. If you ask, and
don't like the advice, don't use it.

--
Joel Goldstick
http://joelgoldstick.com/blog
http://cc-baseballstats.info/stats/birthdays
--
https://mail.python.org/mailman/listinfo/python-list