Re: Where is the usage of (list comprehension) documented?

2018-01-15 Thread Ned Batchelder

On 1/14/18 9:57 PM, Dan Stromberg wrote:

On Sun, Jan 14, 2018 at 3:01 PM, Peng Yu  wrote:

Hi,

I see the following usage of list comprehension can generate a
generator. Does anybody know where this is documented? Thanks.

Here's the (a?) generator expression PEP:
https://www.python.org/dev/peps/pep-0289/

Here's a presentation I put together on this and related topics a while back:
http://stromberg.dnsalias.org/~strombrg/Intro-to-Python/Python%20Generators,%20Iterators%20and%20Comprehensions%202014.pdf

FWIW, [a for a in range(2)] is a list comprehension; it's eager. And
(a for a in range(2)) is a generator expression; it's lazy.



I really wish these were called generator comprehensions.  I don't 
understand why they are not.


    [ 2*x for x in range(10) ] # makes a list
    ( 2*x for x in range(10) ) # makes a generator

--Ned.
--
https://mail.python.org/mailman/listinfo/python-list


Re: Where is the usage of (list comprehension) documented?

2018-01-15 Thread Thomas Jollans
On 2018-01-15 00:01, Peng Yu wrote:
> Hi,
> 
> I see the following usage of list comprehension can generate a
> generator. Does anybody know where this is documented? Thanks.
> 
> $ cat main.py
> #!/usr/bin/env python
> 
> import sys
> lines = (line.rstrip('\n') for line in sys.stdin)
> print lines
> 
> lines = [line.rstrip('\n') for line in sys.stdin]
> print lines
> $ seq 10 | ./main.py
> <generator object <genexpr> at 0x1101ecd70>
> ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
> 

Generator expressions are of course documented in the language reference:

https://docs.python.org/3/reference/expressions.html#generator-expressions

The official docs.python.org tutorial also explains them:

https://docs.python.org/3/tutorial/classes.html#generator-expressions
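
For readers skimming the thread, here is a small Python 3 sketch of the difference between the two forms. The variable names are made up for illustration; only the bracket versus parenthesis syntax comes from the posts above.

```python
import types

# List comprehension: every element is computed immediately.
doubled_list = [2 * x for x in range(5)]

# Generator expression: nothing is computed until it is iterated.
doubled_gen = (2 * x for x in range(5))

is_list = isinstance(doubled_list, list)
is_generator = isinstance(doubled_gen, types.GeneratorType)

first_pass = list(doubled_gen)    # iterating produces the values
second_pass = list(doubled_gen)   # a generator is exhausted after one pass

# When a generator expression is the only argument of a call,
# the extra pair of parentheses may be omitted.
total = sum(2 * x for x in range(5))
```

The empty second pass is the practical consequence of laziness: a generator expression can be consumed only once, whereas the list can be iterated as often as needed.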


Can utf-8 encoded character contain a byte of TAB?

2018-01-15 Thread Peng Yu
Hi,

I use the following code to process TSV input.

$ printf '%s\t%s\n' {1..10} | ./main.py
['1', '2']
['3', '4']
['5', '6']
['7', '8']
['9', '10']
$ cat main.py
#!/usr/bin/env python
# vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1 fileencoding=utf-8:

import sys
for line in sys.stdin:
    fields=line.rstrip('\n').split('\t')
    print fields

But I am not sure it will process utf-8 input correctly. Thus, I come
up with this code. However, I am not sure if this is really necessary
as my impression is that utf-8 character should not contain the ascii
code for TAB. Is it so? Thanks.

$ cat main1.py
#!/usr/bin/env python
# vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1 fileencoding=utf-8:

import sys
for line in sys.stdin:
    #fields=line.rstrip('\n').split('\t')
    fields=line.rstrip('\n').decode('utf-8').split('\t')
    print [x.encode('utf-8') for x in fields]

$ printf '%s\t%s\n' {1..10} | ./main1.py
['1', '2']
['3', '4']
['5', '6']
['7', '8']
['9', '10']


-- 
Regards,
Peng


Re: Can utf-8 encoded character contain a byte of TAB?

2018-01-15 Thread Peter Otten
Peng Yu wrote:

> Can utf-8 encoded character contain a byte of TAB?

Yes; ascii is a subset of utf8.

Python 2.7.6 (default, Nov 23 2017, 15:49:48) 
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> ascii = "".join(map(chr, range(128)))
>>> uni = ascii.decode("utf-8")
>>> len(uni)
128
>>> assert map(ord, uni) == range(128)

If you want to allow fields containing TABs in a file where TAB is also the 
field separator you need a convention to escape the TABs occurring in the 
values. Nothing I see in your post can cope with that, but the csv module 
can, by quoting fields containing the delimiter:

>>> import csv, sys
>>> csv.writer(sys.stdout, delimiter="\t").writerow(["foo", "bar\tbaz"])
foo	"bar	baz"
>>> next(csv.reader(['foo\t"bar\tbaz"\n'], delimiter="\t"))
['foo', 'bar\tbaz']


> Hi,
> 
> I use the following code to process TSV input.
> 
> $ printf '%s\t%s\n' {1..10} | ./main.py
> ['1', '2']
> ['3', '4']
> ['5', '6']
> ['7', '8']
> ['9', '10']
> $ cat main.py
> #!/usr/bin/env python
> # vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1
> # fileencoding=utf-8:
> 
> import sys
> for line in sys.stdin:
>     fields=line.rstrip('\n').split('\t')
>     print fields
> 
> But I am not sure it will process utf-8 input correctly. Thus, I come
> up with this code. However, I am not sure if this is really necessary
> as my impression is that utf-8 character should not contain the ascii
> code for TAB. Is it so? Thanks.
> 
> $ cat main1.py
> #!/usr/bin/env python
> # vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1
> # fileencoding=utf-8:
> 
> import sys
> for line in sys.stdin:
>     #fields=line.rstrip('\n').split('\t')
>     fields=line.rstrip('\n').decode('utf-8').split('\t')
>     print [x.encode('utf-8') for x in fields]
> 
> $ printf '%s\t%s\n' {1..10} | ./main1.py
> ['1', '2']
> ['3', '4']
> ['5', '6']
> ['7', '8']
> ['9', '10']
> 
> 




Re: Generating SVG from turtle graphics

2018-01-15 Thread Anssi Saari
Peter Otten <__pete...@web.de> writes:

> Then convert to SVG with an external tool. It looks like ghostscript can do
> that:
>
> $ gs -dBATCH -dNOPAUSE -sDEVICE=svg -sOutputFile=tmp_turtle.svg tmp_turtle.ps

And if not (I at least don't have svg output on three ghostscripts I
tried), pstoedit can do it too.



Re: Can utf-8 encoded character contain a byte of TAB?

2018-01-15 Thread Random832
On Mon, Jan 15, 2018, at 09:35, Peter Otten wrote:
> Peng Yu wrote:
> 
> > Can utf-8 encoded character contain a byte of TAB?
> 
> Yes; ascii is a subset of utf8.
> 
> If you want to allow fields containing TABs in a file where TAB is also the 
> field separator you need a convention to escape the TABs occurring in the 
> values. Nothing I see in your post can cope with that, but the csv module 
> can, by quoting fields containing the delimiter:

Just to be clear, TAB *only* appears in utf-8 as the encoding for the actual 
TAB character, not as a part of any other character's encoding. The utf-8 
encoding of a non-ascii character consists of a lead byte in the range 0xC2 
through 0xF4 followed by one to three continuation bytes in the range 0x80 
through 0xBF.
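
The claim can be checked by brute force. The following is a Python 3 sketch, not part of the original post; the function name is made up. It encodes every non-ascii code point and confirms that none of the resulting bytes falls in the ascii range, which rules out 0x09 (TAB) in particular.

```python
def non_ascii_bytes_are_high():
    """Return True if every byte in the UTF-8 encoding of every
    non-ASCII code point is >= 0x80 (so it can never be TAB, 0x09)."""
    for code_point in range(0x80, 0x110000):
        if 0xD800 <= code_point <= 0xDFFF:
            continue  # surrogates cannot be encoded as UTF-8
        encoded = chr(code_point).encode("utf-8")
        if min(encoded) < 0x80:
            return False
    return True

all_high = non_ascii_bytes_are_high()

# Consequence: splitting UTF-8 encoded bytes on the TAB byte is safe.
fields = "é\tü".encode("utf-8").split(b"\t")
```

The loop covers roughly 1.1 million code points, so it takes a moment to run, but it needs nothing beyond the standard library.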


Re: pip --user by default

2018-01-15 Thread Rob Gaddi

On 01/13/2018 04:54 AM, Thomas Jollans wrote:

Hi,

I recently discovered the wonders of pip.conf: if I create a file
~/.config/pip/pip.conf* with:

[install]
user = true

then pip will install to the --user site-packages by default, rather
than trying to install packages into system directories.

The trouble is that this fails when you also use virtualenvs. In a
virtualenv, --user doesn't work, so pip fails when trying to install
anything in a virtualenv as long as the user pip.conf contains those lines.

Short of adding a pip.conf to every single virtualenv, is there any way
to work around this, and configure pip to install packages

  - into the user directories if possible
  - into the environment when in an environment

by default?

Thanks,
Thomas



* the path obviously depends on the OS:
https://pip.pypa.io/en/stable/user_guide/#config-file



Inside of a virtualenv, what's the difference between a --user install 
and a system one?


--
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.


Re: pip --user by default

2018-01-15 Thread Thomas Jollans
On 2018-01-15 18:33, Rob Gaddi wrote:
> 
> Inside of a virtualenv, what's the difference between a --user install
> and a system one?
> 

It errors out:

% pip install --user urllib3
Can not perform a '--user' install. User site-packages are not visible
in this virtualenv.
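
As a side note, a script can tell the two situations apart itself. The sketch below is Python 3 and uses only the standard sys and site modules; the function names are invented for illustration, and this is not a claim about how pip makes its own decision, nor does it change pip's configuration.

```python
import site
import sys

def in_virtualenv():
    """Best-effort check for a venv/virtualenv interpreter."""
    # Older virtualenv releases set sys.real_prefix; venv (and newer
    # virtualenv) make sys.base_prefix differ from sys.prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix

def describe_install_target():
    """Say where a plain install would most plausibly belong."""
    if in_virtualenv():
        return "virtualenv: " + sys.prefix
    if site.ENABLE_USER_SITE:
        return "user site: " + site.getusersitepackages()
    return "system: " + sys.prefix

target = describe_install_target()
```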






Re: requests / SSL blocks forever?

2018-01-15 Thread Nagy László Zsolt


On 2018-01-13 at 15:03, Jon Ribbens wrote:
> On 2018-01-13, Nagy László Zsolt  wrote:
>> I have a multi threaded Windows service written in Python. It is running
>> on 3.6.2.  Sometimes I cannot stop the service, because one of the
>> threads won't exit. I have narrowed down the problem to request and
>> _lib.SSL_read.
> (a) are you setting daemon=True on the thread?
> (b) are you setting a timeout on the requests call?
The thread is not daemonic, but it can be stopped (usually) within a
second by setting a threading.Event.

I'm not setting a timeout on the call. Good point. I'm going to try it now.

By the way: yes, I have to use a service that is almost not documented,
and sometimes it disconnects in the middle of the request, only sending
half of the response. I had not considered that it might send no response
at all while keeping the connection alive. But now that you point it out,
that is quite possible.

   Laszlo



Re: requests / SSL blocks forever?

2018-01-15 Thread Nagy László Zsolt

> (a) are you setting daemon=True on the thread?
> (b) are you setting a timeout on the requests call?
Hmm setting the timeout might not be the solution. This is from the docs
of the requests module:
>
> Note
>
> |timeout| is not a time limit on the entire response download; rather,
> an exception is raised if the server has not issued a response for
> |timeout| seconds (more precisely, if no bytes have been received on
> the underlying socket for |timeout| seconds). If no timeout is
> specified explicitly, requests do not time out.
>
In other words: if the server starts to send the response, but then
stops sending it (without closing the connection), then this will block
forever anyway.

There seems to be no protection against this scenario, at least not with
the requests module.



Re: requests / SSL blocks forever?

2018-01-15 Thread Nagy László Zsolt


> In other words: if the server starts to send the response, but then
> stops sending it (without closing the connection), then this will block
> forever anyway.
Or maybe I misunderstood the docs and the timeout means the max. time
elapsed between receiving two chunks of data from the server?


Re: pip --user by default

2018-01-15 Thread Skip Montanaro
>> Inside of a virtualenv, what's the difference between a --user install
>> and a system one?
>>
>
> It errors out:
>
> % pip install --user urllib3
> Can not perform a '--user' install. User site-packages are not visible
> in this virtualenv.

I was able to 'pip install --user ...' a package yesterday in a Conda
environment, so I think it's a YMMV sort of thing. Having only ever
used the Conda environment stuff, I've generally interpreted the term
"virtual environment" in a fairly generic way.

Skip


Re: pip --user by default

2018-01-15 Thread Thomas Jollans
On 2018-01-15 19:46, Skip Montanaro wrote:
>>> Inside of a virtualenv, what's the difference between a --user install
>>> and a system one?
>>>
>>
>> It errors out:
>>
>> % pip install --user urllib3
>> Can not perform a '--user' install. User site-packages are not visible
>> in this virtualenv.
> 
> I was able to 'pip install --user ...' a package yesterday in a Conda
> environment, so I think it's a YMMV sort of thing. Having only ever
> used the Conda environment stuff, I've generally interpreted the term
> "virtual environment" in a fairly generic way.
> 
> Skip
> 

I just tried that and it turns out that that installs the package into
the main anaconda/miniconda installation rather than the conda env.
Definitely not what you want.

-- Thomas


Re: requests / SSL blocks forever?

2018-01-15 Thread Jon Ribbens
On 2018-01-15, Nagy László Zsolt  wrote:
>> In other words: if the server starts to send the response, but then
>> stops sending it (without closing the connection), then this will block
>> forever anyway.
> Or maybe I misunderstood the docs and the timeout means the max. time
> elapsed between receiving two chunks of data from the server?

Yes. It's documented better here:
http://docs.python-requests.org/en/master/user/advanced/#timeouts

You can't specify a "total time" within which the operation must
succeed or be abandoned, but if you specify a timeout and the
server stops responding, either at the start of the process or
in the middle of the response, then it will time out after the
specified delay.
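
The behaviour described here, a timeout that applies to each wait for bytes on the socket rather than to the whole transfer, can be reproduced with the standard library alone. The Python 3 sketch below does not use requests; it sets up a throwaway local server (all names are invented for illustration) that sends part of a reply and then goes silent, and a client whose socket timeout fires in the middle of the response.

```python
import socket
import threading
import time

def stalling_server(listener, stall_seconds):
    """Accept one client, send a partial reply, then go silent."""
    conn, _ = listener.accept()
    with conn:
        conn.sendall(b"partial ")
        time.sleep(stall_seconds)  # keep the connection open, send nothing

def read_with_timeout(port, timeout):
    """Read until EOF or until no bytes arrive for `timeout` seconds."""
    chunks = []
    with socket.create_connection(("127.0.0.1", port), timeout=timeout) as sock:
        try:
            while True:
                data = sock.recv(1024)
                if not data:
                    break  # server closed the connection
                chunks.append(data)
        except socket.timeout:
            return b"".join(chunks), True
    return b"".join(chunks), False

listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
threading.Thread(target=stalling_server, args=(listener, 2.0), daemon=True).start()

received, timed_out = read_with_timeout(port, timeout=0.5)
listener.close()
```

The client gets the first chunk and then gives up half a second later, even though the server never closed the connection. What this does not provide is a limit on total elapsed time: a server that trickles one byte every 0.4 seconds would never trigger the timeout.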


Re: pip --user by default

2018-01-15 Thread Skip Montanaro
> I just tried that and it turns out that that installs the package into
> the main anaconda/miniconda installation rather than the conda env.
> Definitely not what you want.

Yes, installing in the root environment (as it's called) is generally
a bad idea.

I use Conda at work, in a shared sort of environment. It took our
smart people (I'm not one) a while, along with some updates from the
Anaconda folks to get the whole enterprise-wide Conda thing to work
reasonably well. I do one key thing to prevent myself from being the
cause of problems. I never run with the root environment's bin
directory in PATH. Caveat... I write this with my Linux/Unix hat on. I
don't own a Windows hat. Again, YMMV.

Skip


Re: Detecting a cycle in a graph

2018-01-15 Thread MRAB

On 2018-01-15 06:15, Frank Millman wrote:

"Christian Gollwitzer"  wrote in message news:p3gh84$kfm$1...@dont-email.me...


On 14.01.18 at 22:04, Christian Gollwitzer wrote:
> On 14.01.18 at 09:30, Frank Millman wrote:
>> I need to detect when a 'cycle' occurs - when a path loops back on 
>> itself and revisits a node previously visited. I have figured out a way 
>> to do this, but I have a problem.

>
> I don't know if that helps, but there is a classic graph theory 
> algorithm called "Floyd's cycle detector". The idea is to have a pointer 
> move along the graph and a second one which runs at double the speed. If 
> they meet, you found a cycle. It is not straight-forward to come up with 
> this algorithm, which even runs in constant storage. ISTR that there are 
> some variants which can give you the split point of the cycle, too.
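
A minimal Python 3 sketch of the two-pointer idea follows. It assumes each node has at most one successor (a chain or linked list), which is the setting the algorithm is designed for; a node with several outgoing connections, as in the business-process graph, needs a different approach. The function name and the dict-based chain are made up for illustration.

```python
def has_cycle(start, successor):
    """Floyd's tortoise-and-hare check on a single-successor chain.

    `successor(node)` returns the next node, or None at the end.
    Uses constant extra storage: just the two pointers.
    """
    slow = fast = start
    while fast is not None:
        fast = successor(fast)      # hare: first step
        if fast is None:
            return False
        fast = successor(fast)      # hare: second step
        slow = successor(slow)      # tortoise: one step
        if fast is not None and fast == slow:
            return True
    return False

straight = {1: 2, 2: 3}             # 1 -> 2 -> 3 -> end
looping = {1: 2, 2: 3, 3: 2}        # 1 -> 2 -> 3 -> 2 -> ...

straight_result = has_cycle(1, straight.get)
looping_result = has_cycle(1, looping.get)
```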


And here is an algorithm to enumerate all cycles in a directed graph:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.335.5999&rep=rep1&type=pdf

with an implementation in C++ here:

https://github.com/hellogcc/circuit-finding-algorithm



I appreciate the input, Christian, but I am afraid both of those were over
my head :-(

I think/hope that a business process graph does not require such a complex
solution. Here is my cycle-detecting algorithm.

In BPMN terms, each node can have 0->* incoming connections, and 0->*
outgoing connections.

Any node with 0 incoming is deemed to start the process. Normally there is
only one such node.

Any node with 0 outgoing represents the end of an 'active path'. If there is
more than one such node, all 'active' ones must reach the end before the
process is finished. There is no formal definition of an 'active path', and
I can think of a few corner cases which could prove problematic, but
normally the meaning is clear.

I start my cycle-detection with a node with 0 incoming connections.

def find_cycle(node, path):
    for output in node.outputs:
        if output in path:
            print('Cycle found in', path)
        else:
            new_path = path[:]
            new_path.append(output)
            find_cycle(output, new_path)

find_cycle(node, [node])

This traverses every possible path in the graph. I think/hope that a typical
business process will not grow so large as to cause a problem.

If anyone can see a flaw in this, please let me know.


A couple of suggestions:

1. Instead of copying the list and appending, I'd do:

find_cycle(output, path + [output])

2. Lists are ordered, but searching them is O(n), so I'd pass a set too 
to speed it up a bit:


def find_cycle(node, path, visited):
    for output in node.outputs:
        if output in visited:
            print('Cycle found in', path)
        else:
            find_cycle(output, path + [output], visited | {output})

   That will help as the paths get longer, although on a small graph, 
it won't matter.
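
For anyone who wants to run this, here is a self-contained Python 3 sketch. The Node class is a hypothetical stand-in (only the `outputs` attribute comes from the thread), and the function collects the offending paths in a list instead of printing them. Note that the first call has to seed the set with the start node as well as the path.

```python
class Node:
    """Hypothetical process node; only `outputs` is taken from the thread."""
    def __init__(self, name):
        self.name = name
        self.outputs = []

    def __repr__(self):
        return self.name

def find_cycles(node, path=None, visited=None):
    """Return every path from `node` that runs back into a node already on it."""
    if path is None:
        # Initial call: the start node is on the path and in the set.
        path, visited = [node], {node}
    cycles = []
    for output in node.outputs:
        if output in visited:           # O(1) membership test on the set
            cycles.append(path + [output])
        else:
            cycles.extend(
                find_cycles(output, path + [output], visited | {output}))
    return cycles

a, b, c, d = (Node(name) for name in "abcd")
a.outputs = [b]
b.outputs = [c, d]
c.outputs = [a]                         # c loops back to a

cycles = find_cycles(a)
```

Like the original, this still walks every possible path, so the running time can grow quickly on graphs with many branches that rejoin.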



Re: Can utf-8 encoded character contain a byte of TAB?

2018-01-15 Thread Peng Yu
> Just to be clear, TAB *only* appears in utf-8 as the encoding for the actual 
> TAB character, not as a part of any other character's encoding. The only 
> bytes that can appear in the utf-8 encoding of non-ascii characters are 
> starting with 0xC2 through 0xF4, followed by one or more of 0x80 through 0xBF.

So for utf-8 encoded input, I only need to use this code to split each
line into fields?

import sys
for line in sys.stdin:
    fields=line.rstrip('\n').split('\t')
    print fields

Is there a need to use this code to split each line into fields?

import sys
for line in sys.stdin:
    fields=line.rstrip('\n').decode('utf-8').split('\t')
    print [x.encode('utf-8') for x in fields]

-- 
Regards,
Peng


Re: Can utf-8 encoded character contain a byte of TAB?

2018-01-15 Thread Chris Angelico
On Tue, Jan 16, 2018 at 8:29 AM, Peng Yu  wrote:
>> Just to be clear, TAB *only* appears in utf-8 as the encoding for the actual 
>> TAB character, not as a part of any other character's encoding. The only 
>> bytes that can appear in the utf-8 encoding of non-ascii characters are 
>> starting with 0xC2 through 0xF4, followed by one or more of 0x80 through 
>> 0xBF.
>
> So for utf-8 encoded input, I only need to use this code to split each
> line into fields?
>
> import sys
> for line in sys.stdin:
>     fields=line.rstrip('\n').split('\t')
>     print fields
>
> Is there a need to use this code to split each line into fields?
>
> import sys
> for line in sys.stdin:
>     fields=line.rstrip('\n').decode('utf-8').split('\t')
>     print [x.encode('utf-8') for x in fields]
>

One of the deliberate design features of UTF-8 is that the ASCII byte
values (those below 128) are *exclusively* used for ASCII characters.
Characters >=128 are encoded using multiple bytes in the 128-255
range.

But what you should ideally do is decode everything as UTF-8, then
manipulate it as text. That's the default way to do things in Py3
anyway. The reason for this is that it's entirely possible for an
arbitrary byte stream to NOT follow the rules of UTF-8, which could
break your code. The way to be confident is to do the decode, and if
it fails, reject the input.
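
A minimal Python 3 sketch of that approach: decode strictly, work with text, and reject any line that is not valid UTF-8. The function names and the sample data are invented for illustration.

```python
def split_tsv_line(raw_line):
    """Decode one line of bytes as UTF-8 and split it on TAB.

    Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    """
    text = raw_line.decode("utf-8")     # errors="strict" is the default
    return text.rstrip("\n").split("\t")

def parse_tsv(raw_lines):
    """Return (rows, rejected), where rejected holds 1-based line numbers."""
    rows, rejected = [], []
    for number, raw_line in enumerate(raw_lines, start=1):
        try:
            rows.append(split_tsv_line(raw_line))
        except UnicodeDecodeError:
            rejected.append(number)
    return rows, rejected

sample = ["né\tß\n".encode("utf-8"), b"bad\xff\tbyte\n"]
rows, rejected = parse_tsv(sample)
```

To process real input as bytes, the same function can be fed `sys.stdin.buffer` instead of the sample list.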

ChrisA


Re: Detecting a cycle in a graph

2018-01-15 Thread Frank Millman
"MRAB"  wrote in message 
news:1f67363c-4d2a-f5ac-7fa8-b6690ddba...@mrabarnett.plus.com...



On 2018-01-15 06:15, Frank Millman wrote:

> I start my cycle-detection with a node with 0 incoming connections.
>
> def find_cycle(node, path):
>     for output in node.outputs:
>         if output in path:
>             print('Cycle found in', path)
>         else:
>             new_path = path[:]
>             new_path.append(output)
>             find_cycle(output, new_path)
>
> find_cycle(node, [node])
>
> This traverses every possible path in the graph. I think/hope that a typical
> business process will not grow so large as to cause a problem.
>
> If anyone can see a flaw in this, please let me know.
>
 A couple of suggestions:

1. Instead of copying the list and appending, I'd do:

 find_cycle(output, path + [output])

2. Lists are ordered, but searching them is O(n), so I'd pass a set too to 
speed it up a bit:


 def find_cycle(node, path, visited):
     for output in node.outputs:
         if output in visited:
             print('Cycle found in', path)
         else:
             find_cycle(output, path + [output], visited | {output})

That will help as the paths get longer, although on a small graph, it 
won't matter.


Both suggestions are much appreciated - so simple, and yet they improve my 
code enormously.


I have never seen the use of '|' to update a set before, though now I check 
the docs, it is there. It is very neat.


Many thanks

Frank




Re: Generating SVG from turtle graphics

2018-01-15 Thread Abdur-Rahmaan Janhangeer
https://image.online-convert.com/convert-to-svg

On 15 Jan 2018 02:55, "Niles Rogoff"  wrote:

> On Sun, 14 Jan 2018 16:32:53 +0400, Abdur-Rahmaan Janhangeer wrote:
>
> > maybe save to .png then use another tool to svg
>
> PNG is a bitmap format[1], so you can't convert it to an SVG (a vector
> format) without guessing things like the start/end points of the lines,
> their slopes, etc... not to mention things like arcs.
>
> [1] RFC2083
>


Re: pip --user by default

2018-01-15 Thread Abdur-Rahmaan Janhangeer
another thing which amazed me with pip is that you can write

library1 == 1.2.7
library2 == 3.6.1

in requirements.txt

and pip install -r requirements.txt will install those libs

On 13 Jan 2018 17:16, "Thomas Jollans"  wrote:

> Hi,
>
> I recently discovered the wonders of pip.conf: if I create a file
> ~/.config/pip/pip.conf* with:
>
> [install]
> user = true
>
> then pip will install to the --user site-packages by default, rather
> than trying to install packages into system directories.
>
> The trouble is that this fails when you also use virtualenvs. In a
> virtualenv, --user doesn't work, so pip fails when trying to install
> anything in a virtualenv as long as the user pip.conf contains those lines.
>
> Short of adding a pip.conf to every single virtualenv, is there any way
> to work around this, and configure pip to install packages
>
>  - into the user directories if possible
>  - into the environment when in an environment
>
> by default?
>
> Thanks,
> Thomas
>
>
>
> * the path obviously depends on the OS:
> https://pip.pypa.io/en/stable/user_guide/#config-file
>