On 5/2/2016 12:57 PM, Jussi Piitulainen wrote:
DFS writes:

Have: list1 = ['\r\n   Item 1  ','  Item 2  ','\r\n  ']
Want: list1 = ['Item 1','Item 2']


I wrote this, which works fine, but maybe it can be tidier?

1. list2 = [t.replace("\r\n", "") for t in list1]   #remove \r\n
2. list3 = [t.strip(' ') for t in list2]            #trim whitespace
3. list1  = filter(None, list3)                     #remove empty items

After each step:

1. list2 = ['   Item 1  ','  Item 2  ','  ']   #remove \r\n
2. list3 = ['Item 1','Item 2','']              #trim whitespace
3. list1 = ['Item 1','Item 2']                 #remove empty items

Try filter(None, (t.strip() for t in list1)). With no argument, strip() removes all leading and trailing whitespace by default, including \r\n.

Works and drops a line of code.  Thx.
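
For reference, a minimal sketch of the same cleanup as a single list comprehension. It should behave the same on Python 2 and Python 3, whereas filter() returns an iterator rather than a list on Python 3. The name cleaned is only illustrative.

```python
list1 = ['\r\n   Item 1  ', '  Item 2  ', '\r\n  ']

# strip() with no argument trims all leading and trailing whitespace,
# including \r and \n; the "if" clause drops items that end up empty.
cleaned = [t.strip() for t in list1 if t.strip()]

print(cleaned)
```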



Funny-looking data you have.

I know - sadly, it's actual data:

--------------------------------------------------------------------
from lxml import html
import requests

webpage = "http://www.usdirectory.com/ypr.aspx?fromform=qsearch&qs=TN&wqhqn=2&qc=Nashville&rg=30&qhqn=restaurant&sb=zipdisc&ap=2"

page  = requests.get(webpage)
tree  = html.fromstring(page.content)
addr1 = tree.xpath('//span[@class="text3"]/text()')
print 'Addresses: ', addr1
--------------------------------------------------------------------

I couldn't figure out a better way to extract it from the HTML (maybe XML and DOM?)
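
One possible way to skip the whitespace-only entries at the XPath level, sketched against a small inline HTML string rather than the live page. The sample markup and the name sample are assumptions made up for illustration; only the span class text3 comes from the code above.

```python
from lxml import html

# Hypothetical markup standing in for the real page (an assumption).
sample = (
    '<div>'
    '<span class="text3">\r\n   Item 1  </span>'
    '<span class="text3">  Item 2  </span>'
    '<span class="text3">\r\n  </span>'
    '</div>'
)

tree = html.fromstring(sample)

# The [normalize-space()] predicate keeps only text nodes that still
# contain something after whitespace is collapsed.
nodes = tree.xpath('//span[@class="text3"]/text()[normalize-space()]')

# The surviving strings keep their padding, so strip each one.
addr1 = [t.strip() for t in nodes]

print(addr1)
```

The strip() call is still needed because the predicate only filters nodes; it does not change their text.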