Re: windows utf8 & lxml

2016-12-21 Thread Sayth Renshaw
On Tuesday, 20 December 2016 22:54:03 UTC+11, Sayth Renshaw  wrote:
> Hi 
> 
> I have been trying to get a script that works on Mint to work on Windows. The 
> key blocker has been UTF-8 errors, most of which I have solved.
> 
> For the last error I am trying to overcome, the solution appears to be 
> to use .decode('windows-1252') to correct an ASCII error.
> 
> I am using lxml to read my content, and decode is not supported there. Are 
> there any known ways to read with lxml and fix Unicode faults?
> 
> The key part of my script is 
> 
> for content in roots:
>     utf8_parser = etree.XMLParser(encoding='utf-8')
>     fix_ascii = utf8_parser.decode('windows-1252')
>     mytree = etree.fromstring(
>         content.read().encode('utf-8'), parser=fix_ascii)
> 
> Without the added .decode my code looks like
> 
> for content in roots:
>     utf8_parser = etree.XMLParser(encoding='utf-8')
>     mytree = etree.fromstring(
>         content.read().encode('utf-8'), parser=utf8_parser)
> 
> However doing it in such a fashion returns this error:
> 
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: 
> invalid start byte
> 
> I found this SO answer for it, http://stackoverflow.com/a/29217546/461887, but 
> cannot seem to implement it with lxml.
> 
> Ideas?
> 
> Sayth
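
A note on the error quoted above: byte 0xff in position 0 is how a UTF-16 little-endian byte order mark (FF FE) begins, so one possible cause, and only a guess without seeing the file, is that the XML was saved as UTF-16 rather than UTF-8. The sketch below uses only the standard library and an invented sample document to reproduce the same UnicodeDecodeError and to show that the same bytes decode cleanly as UTF-16.

```python
import codecs

# Invented sample document; the real race data files are not shown in the thread.
text = '<?xml version="1.0" encoding="utf-16"?><race id="1"/>'

# Build the bytes a UTF-16 LE file with a byte order mark would contain.
data = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

first_byte = data[0]  # the BOM starts with 0xff

failed_at = None
reason = None
try:
    data.decode("utf-8")
except UnicodeDecodeError as exc:
    # Same failure as in the traceback: 0xff cannot start a UTF-8 sequence.
    failed_at = exc.start
    reason = exc.reason

# The "utf-16" codec reads the BOM, picks the byte order, and strips the BOM.
decoded = data.decode("utf-16")

print(hex(first_byte), failed_at, reason)
print(decoded == text)
```

If that is the cause, opening the file in binary mode and handing the raw bytes to the parser, rather than decoding or re-encoding them by hand, generally lets an XML parser choose the encoding from the BOM or the XML declaration.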

Why is Windows so hard? I am running out of ideas, having tried the methods in 
the docs, on SO, etc.

Currently

for xml_data in roots:
    parser_xml = etree.XMLParser()
    mytree = etree.parse(xml_data, parser_xml)

Returns
C:\Users\Sayth\Anaconda3\envs\race\python.exe C:/Users/Sayth/PycharmProjects/bs4race/race.py data/ -e *.xml
Traceback (most recent call last):
  File "C:/Users/Sayth/PycharmProjects/bs4race/race.py", line 100, in <module>
    data_attr(rootObs)
  File "C:/Users/Sayth/PycharmProjects/bs4race/race.py", line 55, in data_attr
    mytree = etree.parse(xml_data, parser_xml)
  File "src/lxml/lxml.etree.pyx", line 3427, in lxml.etree.parse (src\lxml\lxml.etree.c:81110)
  File "src/lxml/parser.pxi", line 1832, in lxml.etree._parseDocument (src\lxml\lxml.etree.c:118109)
  File "src/lxml/parser.pxi", line 1852, in lxml.etree._parseFilelikeDocument (src\lxml\lxml.etree.c:118392)
  File "src/lxml/parser.pxi", line 1747, in lxml.etree._parseDocFromFilelike (src\lxml\lxml.etree.c:117180)
  File "src/lxml/parser.pxi", line 1162, in lxml.etree._BaseParser._parseDocFromFilelike (src\lxml\lxml.etree.c:111907)
  File "src/lxml/parser.pxi", line 595, in lxml.etree._ParserContext._handleParseResultDoc (src\lxml\lxml.etree.c:105102)
  File "src/lxml/parser.pxi", line 702, in lxml.etree._handleParseResult (src\lxml\lxml.etree.c:106769)
  File "src/lxml/lxml.etree.pyx", line 324, in lxml.etree._ExceptionContext._raise_if_stored (src\lxml\lxml.etree.c:12074)
  File "src/lxml/parser.pxi", line 373, in lxml.etree._FileReaderContext.copyToBuffer (src\lxml\lxml.etree.c:102431)
io.UnsupportedOperation: read

Process finished with exit code 1

Thoughts?

Sayth
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: windows utf8 & lxml

2016-12-21 Thread Peter Otten
Sayth Renshaw wrote:

> Why is windows so hard. 

I don't think this has anything to do with the OS. Your xml_data is 
probably not what you think it is. Compare:

$ python3
Python 3.4.3 (default, Nov 17 2016, 01:08:31) 
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> import lxml.etree
>>> lxml.etree.parse(sys.stdout)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "lxml.etree.pyx", line 3239, in lxml.etree.parse (src/lxml/lxml.etree.c:69955)
  File "parser.pxi", line 1769, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:102257)
  File "parser.pxi", line 1789, in lxml.etree._parseFilelikeDocument (src/lxml/lxml.etree.c:102516)
  File "parser.pxi", line 1684, in lxml.etree._parseDocFromFilelike (src/lxml/lxml.etree.c:101442)
  File "parser.pxi", line 1134, in lxml.etree._BaseParser._parseDocFromFilelike (src/lxml/lxml.etree.c:97069)
  File "parser.pxi", line 582, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:91275)
  File "parser.pxi", line 679, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:92426)
  File "lxml.etree.pyx", line 327, in lxml.etree._ExceptionContext._raise_if_stored (src/lxml/lxml.etree.c:10196)
  File "parser.pxi", line 373, in lxml.etree._FileReaderContext.copyToBuffer (src/lxml/lxml.etree.c:89083)
io.UnsupportedOperation: not readable

That looks similar to what you get.
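
The same kind of failure can be sketched with the standard library alone, without lxml: a file object that was opened only for writing refuses read(), which is the exception a parser runs into when it is handed such an object. The helper name describe_source and the sample file name are made up for illustration; this is a diagnostic sketch, not a statement about what roots actually yields in the script.

```python
import io
import os
import tempfile

def describe_source(source):
    """Say whether `source` looks like a path, a readable file, or neither."""
    if isinstance(source, (str, bytes, os.PathLike)):
        return "path"
    if hasattr(source, "readable") and source.readable():
        return "readable file object"
    return "not readable"

error = None
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.xml")  # hypothetical file name

    # A handle opened for writing only: read() is unsupported.
    with open(path, "wb") as write_only:
        kind_write = describe_source(write_only)
        try:
            write_only.read()
        except io.UnsupportedOperation as exc:
            error = exc

    # A handle opened for reading in binary mode is fine.
    with open(path, "rb") as read_only:
        kind_read = describe_source(read_only)

    kind_path = describe_source(path)

print(kind_write, kind_read, kind_path)
```

Printing describe_source(xml_data), or simply type(xml_data), just before the etree.parse call would show what the loop is really passing in.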





for loop iter next if file bad

2016-12-21 Thread Sayth Renshaw
Hi

I am looping over a list of files and want to skip any empty files.
I get an error that str is not an iterator, which I sort of understand but 
can't see a workaround for.

How do I make this an iterator so I can use next on the file if my test returns 
true?

Currently my code is.

for dir_path, subdir_list, file_list in os.walk(my_dir):
    for name_pattern in file_list:
        full_path = os.path.join(dir_path, name_pattern)


def return_files(file_list):
    """
    Take a list of files and return file when called.

    Calling function to supply attributes
    """
    for file in file_list:
        with open(os.path.join(dir_path, file), 'rb') as fd:
            if os.stat(fd.name).st_size == 0:
                next(file)
            else:
                yield fd

Exact error is:

C:\Users\Sayth\Anaconda3\envs\race\python.exe C:/Users/Sayth/PycharmProjects/bs4race/race.py data/ -e *.xml
Traceback (most recent call last):
  File "C:/Users/Sayth/PycharmProjects/bs4race/race.py", line 98, in <module>
    data_attr(rootObs)
  File "C:/Users/Sayth/PycharmProjects/bs4race/race.py", line 51, in data_attr
    for xml_data in roots:
  File "C:/Users/Sayth/PycharmProjects/bs4race/race.py", line 32, in return_files
    next(file)
TypeError: 'str' object is not an iterator

Sayth


Re: for loop iter next if file bad

2016-12-21 Thread Chris Angelico
On Wed, Dec 21, 2016 at 8:47 PM, Sayth Renshaw  wrote:
> def return_files(file_list):
>     """
>     Take a list of files and return file when called.
>
>     Calling function to supply attributes
>     """
>     for file in file_list:
>         with open(os.path.join(dir_path, file), 'rb') as fd:
>             if os.stat(fd.name).st_size == 0:
>                 next(file)
>             else:
>                 yield fd

"next" doesn't do what you think it does - it tries to step the thing
you give it, as an iterator. I think you might want "continue"?
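
A minimal sketch of the generator with "continue" in place of "next". It makes two assumptions that are not in the original: dir_path is passed in as a parameter instead of being picked up from the surrounding os.walk loop, and the size is checked before the file is opened. The demo file names are invented.

```python
import os
import tempfile

def return_files(dir_path, file_list):
    """Yield an open binary handle for each non-empty file in file_list."""
    for name in file_list:
        full_path = os.path.join(dir_path, name)
        if os.stat(full_path).st_size == 0:
            continue  # skip the empty file and move on to the next name
        with open(full_path, 'rb') as fd:
            yield fd

# Demo with two throwaway files (names are made up for illustration).
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, 'empty.xml'), 'wb').close()
    with open(os.path.join(tmp, 'full.xml'), 'wb') as out:
        out.write(b'<a/>')
    contents = [fd.read()
                for fd in return_files(tmp, ['empty.xml', 'full.xml'])]

print(contents)
```

Because the handle is yielded from inside the with block, it stays open only while the caller is working on that item, so the caller should finish reading before asking for the next file.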

ChrisA


Re: for loop iter next if file bad

2016-12-21 Thread Sayth Renshaw
Ah yes. Thanks ChrisA

http://www.tutorialspoint.com/python/python_loop_control.htm

The continue Statement:
The continue statement in Python returns the control to the beginning of the 
while loop. The continue statement rejects all the remaining statements in the 
current iteration of the loop and moves the control back to the top of the loop.

The continue statement can be used in both while and for loops.

Sayth


RE: for loop iter next if file bad

2016-12-21 Thread Joaquin Alzola

> def return_files(file_list):
>     """
>     Take a list of files and return file when called.
>
>     Calling function to supply attributes
>     """
>     for file in file_list:
>         with open(os.path.join(dir_path, file), 'rb') as fd:
>             if os.stat(fd.name).st_size == 0:
>                 next(file)
>             else:
>                 yield fd
>
> Exact error is:
>
> C:\Users\Sayth\Anaconda3\envs\race\python.exe C:/Users/Sayth/PycharmProjects/bs4race/race.py data/ -e *.xml
> Traceback (most recent call last):
>   File "C:/Users/Sayth/PycharmProjects/bs4race/race.py", line 98, in <module>
>     data_attr(rootObs)
>   File "C:/Users/Sayth/PycharmProjects/bs4race/race.py", line 51, in data_attr
>     for xml_data in roots:
>   File "C:/Users/Sayth/PycharmProjects/bs4race/race.py", line 32, in return_files
>     next(file)
> TypeError: 'str' object is not an iterator

The thing being iterated over is file_list, not file.
file is a str, as the exception says.

I suppose you want a "continue" there, to move on to the next value in 
file_list.
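
To make the distinction concrete, here is a short sketch with invented file names. Strictly speaking the list is an iterable rather than an iterator: the for loop asks it for an iterator behind the scenes, and that hidden iterator is the only thing next() could step.

```python
names = ['a.xml', 'b.xml']  # invented names standing in for file_list

# next() on one of the strings fails just like in the traceback.
message = None
try:
    next(names[0])
except TypeError as exc:
    message = str(exc)

# next() works on an iterator obtained from the list.
it = iter(names)
first = next(it)
second = next(it)

print(message)
print(first, second)
```

Inside a for loop there is no handle on that hidden iterator, which is why "continue" is the usual way to skip to the next item.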





SQLAlchemy and Postgres

2016-12-21 Thread Ethan Furman

There's a question over on SO [1] asking about an interaction between
SQLAlchemy and Postgres which may be related to an SQLA upgrade from
1.0 to 1.1 and the generation of a check clause.

Sadly, I don't have any experience with SQLAlchemy -- anybody here
want to take a crack at it?

--
~Ethan~

[1] http://stackoverflow.com/q/41258376/208880


Re: [OT] "Invisible posts", was: Best attack order for groups of numbers trying to destroy each other, given a victory chance for number to number attack.

2016-12-21 Thread skybuck2000
On Thursday, December 15, 2016 at 1:17:10 PM UTC+1, Peter Otten wrote:
> skybuck2...@hotmail.com wrote:
> 
> > I received a reply from somebody on my ISP newsserver. Apparently his
> > reply is not visible on google groups. I wonder why, maybe it's a banned
> > troll or something, but perhaps not.
> 
> No, that's Dennis Lee Bieber who doesn't want his posts to be kept, and 
> seems to be the only one to post here with the X-No-Archive flag set.

I find this weird behaviour on Google Groups' part.

At least Google Groups could show his postings for something more 
reasonable, like 30 days, instead of not showing them at all...

Totally weird. What's the point in posting then, if the software will not 
display it at all? ;)


Re: topology rules in python

2016-12-21 Thread Fabien

On 12/21/2016 07:11 AM, Bernd Nawothnig wrote:
> On 2016-12-20, Xristos Xristoou wrote:
>> I have a PostGIS database with shapefiles (lines, polygons and points)
>> and I want to create topology rules with Python. Any idea how to do
>> that? Some packages?
>
> http://www.gdal.org/
>
> or:
>
> pip install gdal

also: shapely, geopandas




Mergesort problem

2016-12-21 Thread Deborah Swanson
I'm not a beginning python coder, but I'm not an advanced one either. I
can't see why I have this problem, though at this point I've probably
been looking at it too hard and for too long (several days), so maybe
I'm just too close to it.
Can one of you guys see the problem (besides my childish coding)? I'll
give you the code first, and then the problem.

def moving():
    import csv
    ls = []
    with open('E:\\Coding projects\\Pycharm\\Moving\\New Listings.csv',
              'r') as infile:
        raw = csv.reader(infile)
        indata = list(raw)
        rows = indata.__len__()
        for i in range(rows):
            ls.append(indata[i])
    # sort: Description only, to make hyperelinks & find duplicates
    mergeSort(ls)
    # find & mark dups, make hyperlink if not dup
    for i in range(1, len(ls) - 1):
        if ls[i][0] == ls[i + 1][0]:
            ls[i][1] = "dup"
        else:
            # make hyperlink
            desc = ls[i][0]
            url = ls[i][1]
            ls[i][0] = '=HYPERLINK(\"' + url + '\",\"' + desc + '\")'
    # save to csv
    ls.insert(0, ["Description","url"])
    with open('E:\\Coding projects\\Pycharm\\Moving\\Moving 2017 out.csv',
              'w') as outfile:
        writer = csv.writer(outfile, lineterminator='\n')
        writer.writerows(ls)

import operator
def mergeSort(L, compare = operator.lt):
    if len(L) < 2:
        return L[:]
    else:
        middle = int(len(L)/2)
        left = mergeSort(L[:middle], compare)
        right = mergeSort(L[middle:], compare)
        return merge(left, right, compare)

def merge(left, right, compare):
    result = []
    i,j = 0, 0
    while i < len(left) and j < len(right):
        if compare(left[i], right[j]):
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    while (i < len(left)):
        result.append(left[i])
        i += 1
    while (j < len(right)):
        result.append(right[j])
        j += 1
    return result

moving()

The problem is this: mergeSort does put the list ls in perfect order,
which I can see by looking at result on merge's final return to
mergeSort, and at left and right once back in mergeSort. Both
the left half and the right half are in order. But the list L is still
in its original order, and after mergeSort completes, ls is still in its
original order. Maybe there's some bonehead error causing this, but I
just can't see it.

I can provide a sample csv file for input, if you want to execute this,
but to keep things simple, you can see the problem in just a table with
webpage titles in one column and their urls in the second column.

Any insights would be greatly appreciated.



Re: Mergesort problem

2016-12-21 Thread Chris Angelico
On Thu, Dec 22, 2016 at 11:55 AM, Deborah Swanson
 wrote:
> The problem is that while mergeSort puts the list ls in perfect order,
> which I can see by looking at result on merge's final return to
> mergeSort, and at the left and the right once back in mergeSort. Both
> the left half and the right half are in order. But the list L is still
> in its original order, and after mergeSort completes, ls is still in its
> original order. Maybe there's some bonehead error causing this, but I
> just can't see it.
>

Your analysis is excellent. Here's what happens: When you merge-sort,
you're always returning a new list (either "return L[:]" or "result =
[]"), but then you call it like this:

# sort: Description only, to make hyperelinks & find duplicates
mergeSort(ls)

This calls mergeSort, then drops the newly-sorted list on the floor.
Instead, try: "ls = mergeSort(ls)".
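
A compact sketch of the difference, using a condensed merge sort in the same spirit as the posted one (the name merge_sort and the sample rows are invented for illustration). It shows that calling the function without keeping the result leaves the list untouched, that rebinding the name picks up the sorted copy, and that slice assignment is one way to update the existing list object in place.

```python
import operator

def merge_sort(items, compare=operator.lt):
    """Return a new sorted list; the input list is never modified."""
    if len(items) < 2:
        return items[:]
    middle = len(items) // 2
    left = merge_sort(items[:middle], compare)
    right = merge_sort(items[middle:], compare)
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if compare(left[i], right[j]):
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])   # whatever is left over on either side
    result.extend(right[j:])
    return result

ls = [['b', 'url2'], ['a', 'url1'], ['c', 'url3']]

merge_sort(ls)                       # sorted copy is created, then discarded
unchanged = [row[0] for row in ls]   # still in the original order

ls = merge_sort(ls)                  # rebind the name to the sorted copy
rebound = [row[0] for row in ls]

other = [3, 1, 2]
alias = other
other[:] = merge_sort(other)         # replace the contents of the same list

print(unchanged, rebound, alias)
```

Rebinding is enough when only one name refers to the list; slice assignment matters when other names, such as alias here, must see the sorted order too.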

Thank you for making it so easy for us!

ChrisA


RE: Mergesort problem

2016-12-21 Thread Deborah Swanson
"ls = mergeSort(ls)" works perfectly!

I can see why now, but I'm not sure how long I would have knocked my
head against it before I saw it on my own. It must take a while to
develop an eye for these things.

So thank you from the bottom of my heart! I do have a future in python
coding planned, but right now I need to find the cheapest nice little
house to move to, and this sorting problem was a major roadblock! The
webpage titles and urls are from Craigslist, soon to be joined by many
other fields, but I just couldn't get past this one problem.
