Re: async enumeration - possible?
On Tue, Nov 29, 2016 at 8:22 PM, Chris Angelico wrote:
> Interestingly, I can't do that in a list comp:
>
> >>> [x async for x in aiterable]
>   File "<stdin>", line 1
>     [x async for x in aiterable]
>            ^
> SyntaxError: invalid syntax
>
> Not sure why.

Because you tried to use an async comprehension outside of a coroutine.

py> [x async for x in y]
  File "<stdin>", line 1
    [x async for x in y]
           ^
SyntaxError: invalid syntax
py> async def foo():
...     [x async for x in y]
...

The same is true for async generator expressions. The documentation is
clear that this is illegal for the async for statement:

https://docs.python.org/3.6/reference/compound_stmts.html#the-async-for-statement

I don't see anything about async comprehensions or async generators
outside of the "what's new" section, but it stands to reason that the
same would apply.

On Tue, Nov 29, 2016 at 11:06 PM, Frank Millman wrote:
> py> async def main():
> ...     print(list(x async for x in gen(5)))
> ...
> py> loop.run_until_complete(main())
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "C:\Users\User\AppData\Local\Programs\Python\Python36\lib\asyncio\base_events.py",
> line 466, in run_until_complete
>     return future.result()
> TypeError: 'async_generator' object is not iterable

Yeah, that's what I would expect. (x async for x in foo) is essentially
a no-op, just like its synchronous equivalent; it takes an asynchronous
iterator and produces an equivalent asynchronous iterator. Meanwhile,
list() can't consume an async iterator because the list constructor
isn't a coroutine. I don't think it's generally possible to
"synchronify" an async iterator other than to materialize it. E.g.:

def alist(aiterable):
    result = []
    async for value in aiterable:
        result.append(value)
    return result

And I find it a little disturbing that I actually can't see a better
way to build a list from an async iterator than that.
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: async enumeration - possible?
On Wed, Nov 30, 2016 at 7:10 PM, Ian Kelly wrote:
> On Tue, Nov 29, 2016 at 8:22 PM, Chris Angelico wrote:
>> Interestingly, I can't do that in a list comp:
>>
>> >>> [x async for x in aiterable]
>>   File "<stdin>", line 1
>>     [x async for x in aiterable]
>>            ^
>> SyntaxError: invalid syntax
>>
>> Not sure why.
>
> Because you tried to use an async comprehension outside of a coroutine.
>
> py> [x async for x in y]
>   File "<stdin>", line 1
>     [x async for x in y]
>            ^
> SyntaxError: invalid syntax
> py> async def foo():
> ...     [x async for x in y]
> ...
>
> The same is true for async generator expressions. The documentation is
> clear that this is illegal for the async for statement:
>
> https://docs.python.org/3.6/reference/compound_stmts.html#the-async-for-statement

Hmm. The thing is, comprehensions and generators are implemented with
their own nested functions. So I would expect that their use of async
is independent of the function they're in. But maybe we have a bug here?

>>> async def spam():
...     def ham():
...         async for i in x:
...             pass
...
>>> def ham():
...     async for i in x:
  File "<stdin>", line 2
    async for i in x:
            ^
SyntaxError: invalid syntax
>>> def ham():
...     async def spam():
...         async for i in x:
...             pass
...
>>>

Clearly the second one is correct to throw SyntaxError, and the third
is correctly acceptable. But the first one, ISTM, should be an error too.

> Yeah, that's what I would expect. (x async for x in foo) is
> essentially a no-op, just like its synchronous equivalent; it takes an
> asynchronous iterator and produces an equivalent asynchronous
> iterator. Meanwhile, list() can't consume an async iterator because
> the list constructor isn't a coroutine. I don't think it's generally
> possible to "synchronify" an async iterator other than to materialize
> it. E.g.:
>
> def alist(aiterable):
>     result = []
>     async for value in aiterable:
>         result.append(value)
>     return result
>
> And I find it a little disturbing that I actually can't see a better
> way to build a list from an async iterator than that.

Oh. Oops. That materialization was exactly what I intended to happen
with the comprehension. Problem: Your version doesn't work either,
although I think it probably _does_ work if you declare that as
"async def alist".

Shows you just how well I understand Python's asyncness, doesn't it?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: async enumeration - possible?
On Wed, Nov 30, 2016 at 12:53 AM, Marko Rauhamaa wrote:
> I have a couple of points to make with my question:
>
>  * We are seeing the reduplication of a large subset of Python's
>    facilities. I really wonder if the coroutine fad is worth the price.

I don't think there's any technical reason why functions like zip can't
support both synchronous and asynchronous iterators. For example:

_zip = __builtins__.zip
_azip = (some asynchronous zip implementation)

class zip:
    def __init__(self, *args):
        self._args = args

    def __iter__(self):
        return _zip(*self._args)

    def __aiter__(self):
        return _azip(*self._args)

Now I can do "for x, y in zip(a, b)" or "async for x, y in
zip(async_a, async_b)" and either will work as expected. Of course, if
you use the wrong construct then you'll probably get an error since
the underlying iterators don't support the protocol. But it keeps the
builtin namespace simple and clean.
-- 
https://mail.python.org/mailman/listinfo/python-list
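For reference, here is what the "(some asynchronous zip implementation)"
placeholder could look like -- a minimal sketch, not anything from the
standard library, written as a Python 3.6 async generator (the gen()
helper and the demo harness are invented for illustration):

import asyncio

async def _azip(*aiterables):
    # Advance all async iterators in lockstep, stopping at the shortest.
    iterators = [ait.__aiter__() for ait in aiterables]
    while True:
        try:
            values = [await it.__anext__() for it in iterators]
        except StopAsyncIteration:
            return
        yield tuple(values)

async def gen(n):
    for i in range(n):
        yield i

async def main():
    async for pair in _azip(gen(3), gen(5)):
        print(pair)        # (0, 0), (1, 1), (2, 2)

asyncio.get_event_loop().run_until_complete(main())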
Re: async enumeration - possible?
"Marko Rauhamaa" wrote in message news:87d1hd4d5k@elektro.pacujo.net... One of the more useful ones might be: o = await anext(ait) Definitely! But I found it easy to write my own - async def anext(aiter): return await aiter.__anext__() [...] I don't think bulk iteration in asynchronous programming is ever that great of an idea. You want to be prepared for more than one possible stimulus in any given state. IOW, a state machine matrix might be sparse but it is never diagonal. I am not familiar with your terminology here, so my comment may be way off-track. I use 'bulk iteration' a lot in my app. It is a client/server multi-user business/accounting app. If a user wants to view the contents of a large table, or I want to print statements for a lot of customers, I can request the data and process it as it arrives, without blocking the other users. I find that very powerful. Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: async enumeration - possible?
On Wed, Nov 30, 2016 at 1:20 AM, Chris Angelico wrote:
> Hmm. The thing is, comprehensions and generators are implemented with
> their own nested functions. So I would expect that their use of async
> is independent of the function they're in. But maybe we have a bug
> here?
>
> >>> async def spam():
> ...     def ham():
> ...         async for i in x:
> ...             pass
> ...
> >>> def ham():
> ...     async for i in x:
>   File "<stdin>", line 2
>     async for i in x:
>             ^
> SyntaxError: invalid syntax
> >>> def ham():
> ...     async def spam():
> ...         async for i in x:
> ...             pass
> ...
> >>>
>
> Clearly the second one is correct to throw SyntaxError, and the third
> is correctly acceptable. But the first one, ISTM, should be an error
> too.

Yeah, that looks like a bug to me. Note that 'await' results in a clear
error in the first case:

>>> async def ham():
...     def spam():
...         await foo
...
  File "<stdin>", line 3
SyntaxError: 'await' outside async function

>> Yeah, that's what I would expect. (x async for x in foo) is
>> essentially a no-op, just like its synchronous equivalent; it takes an
>> asynchronous iterator and produces an equivalent asynchronous
>> iterator. Meanwhile, list() can't consume an async iterator because
>> the list constructor isn't a coroutine. I don't think it's generally
>> possible to "synchronify" an async iterator other than to materialize
>> it. E.g.:
>>
>> def alist(aiterable):
>>     result = []
>>     async for value in aiterable:
>>         result.append(value)
>>     return result
>>
>> And I find it a little disturbing that I actually can't see a better
>> way to build a list from an async iterator than that.
>
> Oh. Oops. That materialization was exactly what I intended to happen
> with the comprehension. Problem: Your version doesn't work either,
> although I think it probably _does_ work if you declare that as "async
> def alist".

Yes, that's what I meant.
-- 
https://mail.python.org/mailman/listinfo/python-list
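Putting the pieces of the exchange together, a minimal runnable sketch of
the materializing helper with the "async def" applied (the gen() test
subject and the event-loop harness are invented for the demo):

import asyncio

async def alist(aiterable):
    result = []
    async for value in aiterable:
        result.append(value)
    return result

async def gen(n):          # a throwaway async generator to feed it
    for i in range(n):
        yield i

loop = asyncio.get_event_loop()
print(loop.run_until_complete(alist(gen(5))))   # [0, 1, 2, 3, 4]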
Re: async enumeration - possible?
On Wed, Nov 30, 2016 at 1:29 AM, Frank Millman wrote:
> "Marko Rauhamaa" wrote in message news:87d1hd4d5k@elektro.pacujo.net...
>>
>> One of the more useful ones might be:
>>
>>     o = await anext(ait)
>
> Definitely!
>
> But I found it easy to write my own -
>
> async def anext(aiter):
>     return await aiter.__anext__()

Even simpler:

def anext(aiter):
    return aiter.__anext__()

As a general rule, if the only await in a coroutine is immediately
prior to the return, then it doesn't need to be a coroutine. Just
return the thing it's awaiting so that the caller can be rid of the
middle man and await it directly.
-- 
https://mail.python.org/mailman/listinfo/python-list
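A quick sketch showing that the two spellings behave identically from
the caller's side (the gen() helper and the harness are invented for the
demo):

import asyncio

def anext(aiter):              # plain function: just hand back the awaitable
    return aiter.__anext__()

async def gen(n):
    for i in range(n):
        yield i

async def main():
    g = gen(3)
    print(await anext(g))      # 0 -- the caller awaits the returned awaitable
    print(await anext(g))      # 1 -- no coroutine middle man needed

asyncio.get_event_loop().run_until_complete(main())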
Re: csv into multiple columns using split function using python
handa...@gmail.com wrote:

> I am trying to split a specific column of csv into multiple column and
> then appending the split values at the end of each row.
>
> `enter code here`
>
> import csv
> fOpen1=open('Meta_D1.txt')
>
> reader=csv.reader(fOpen1)
> mylist=[elem[1].split(',') for elem in reader]
> mylist1=[]
>
> for elem in mylist1:
>     mylist1.append(elem)
>
> #writing to a csv file
> with open('out1.csv', 'wb') as fp:
>     myf = csv.writer(fp, delimiter=',')
>     myf.writerows(mylist1)
>
> ---
> Here is the link to file I am working on 2 column.
> https://spaces.hightail.com/space/4hFTj
>
> Can someone guide me further?

Use helper functions to process one row and the column you want to split:

import csv

def split_column(column):
    """
    >>> split_column("foo,bar,baz")
    ['foo', 'bar', 'baz']
    """
    return column.split(",")

def process_row(row):
    """
    >>> process_row(["foo", "one,two,three", "bar"])
    ['foo', 'one,two,three', 'bar', 'one', 'two', 'three']
    """
    new_row = row + split_column(row[1])
    return new_row

def convert_csv(infile, outfile):
    with open(infile) as instream:
        rows = csv.reader(instream)
        with open(outfile, "w") as outstream:
            writer = csv.writer(outstream, delimiter=",")
            writer.writerows(process_row(row) for row in rows)

if __name__ == "__main__":
    convert_csv(infile="infile.csv", outfile="outfile.csv")

That makes it easy to identify (and fix) the parts that do not work to
your satisfaction. Let's say you want to remove the original unsplit
second column. You know you only have to modify process_row(). You
change the doctest first

def process_row(row):
    """
    >>> process_row(["foo", "one,two,three", "bar"])
    ['foo', 'bar', 'one', 'two', 'three']
    """
    new_row = row + split_column(row[1])
    return new_row

and verify that it fails:

$ python3 -m doctest split_column2.py
**********************************************************************
File "/somewhere/split_column2.py", line 12, in split_column2.process_row
Failed example:
    process_row(["foo", "one,two,three", "bar"])
Expected:
    ['foo', 'bar', 'one', 'two', 'three']
Got:
    ['foo', 'one,two,three', 'bar', 'one', 'two', 'three']
**********************************************************************
1 items had failures:
   1 of   1 in split_column2.process_row
***Test Failed*** 1 failures.

Then fix the function until you get

$ python3 -m doctest split_column2.py

If you don't trust the "no output means everything is OK" philosophy
use the --verbose flag:

$ python3 -m doctest --verbose split_column2.py
Trying:
    process_row(["foo", "one,two,three", "bar"])
Expecting:
    ['foo', 'bar', 'one', 'two', 'three']
ok
Trying:
    split_column("foo,bar,baz")
Expecting:
    ['foo', 'bar', 'baz']
ok
2 items had no tests:
    split_column2
    split_column2.convert_csv
2 items passed all tests:
   1 tests in split_column2.process_row
   1 tests in split_column2.split_column
2 tests in 4 items.
2 passed and 0 failed.
Test passed.
-- 
https://mail.python.org/mailman/listinfo/python-list
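For completeness, one way to make the changed doctest pass, reusing
split_column() from the listing above -- slicing out the original second
column before appending the split values is just one possible fix:

def process_row(row):
    """
    >>> process_row(["foo", "one,two,three", "bar"])
    ['foo', 'bar', 'one', 'two', 'three']
    """
    new_row = row[:1] + row[2:] + split_column(row[1])
    return new_row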
Re: async enumeration - possible?
"Ian Kelly" wrote in message news:CALwzid=hrijtv4p1_6frkqub25-o1i8ouquxozd+aujgl7+...@mail.gmail.com... On Wed, Nov 30, 2016 at 1:29 AM, Frank Millman wrote: > > async def anext(aiter): >return await aiter.__anext__() Even simpler: def anext(aiter): return aiter.__anext__() As a general rule, if the only await in a coroutine is immediately prior to the return, then it doesn't need to be a coroutine. Just return the thing it's awaiting so that the caller can be rid of the middle man and await it directly. Fascinating! Now I will have to go through all my code looking for similar occurrences. I am sure I will find some! Frank -- https://mail.python.org/mailman/listinfo/python-list
Re: async enumeration - possible?
Steve D'Aprano wrote:

> On Wed, 30 Nov 2016 07:07 am, Marko Rauhamaa wrote:
>
>> Terry Reedy:
>>
>>> On 11/29/2016 9:25 AM, Frank Millman wrote:
>>>
>>>> Is there any technical reason for this, or is it just that no-one
>>>> has got around to writing an asynchronous version yet?
>>>
>>> Google's first hit for 'aenumerate' is
>>> https://pythonwise.blogspot.com/2015/11/aenumerate-enumerate-for-async-for.html
>>
>> Ok, so how about:
>>
>>     aall(aiterable)
>>     aany(aiterable)
>>     class abytearray(aiterable[, encoding[, errors]])
> [...]
>
> What about them? What's your question?

Well, my questions as someone who hasn't touched the async stuff so far
would be:

Is there a viable approach to provide

(a) support for async without duplicating the stdlib
(b) a reasonably elegant way to access the async versions

I hope we can agree that prepending an "a" to the name should be the
last resort.
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: async enumeration - possible?
"Frank Millman" : > "Marko Rauhamaa" wrote in message news:87d1hd4d5k@elektro.pacujo.net... >> I don't think bulk iteration in asynchronous programming is ever that >> great of an idea. You want to be prepared for more than one possible >> stimulus in any given state. IOW, a state machine matrix might be >> sparse but it is never diagonal. > > [...] > > I use 'bulk iteration' a lot in my app. It is a client/server > multi-user business/accounting app. > > If a user wants to view the contents of a large table, or I want to > print statements for a lot of customers, I can request the data and > process it as it arrives, without blocking the other users. Each "await" in a program is a (quasi-)blocking state. In each state, the program needs to be ready to process different input events. I suppose there are cases where a coroutine can pursue an objective single-mindedly (in which case the only secondary input event is the cancellation of the operation). I have done quite a bit of asynchronous programming but never run into that scenario yet. Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: correct way to catch exception with Python 'with' statement
Marko Rauhamaa:

> Peter Otten <__pete...@web.de>:
>
>> Marko Rauhamaa wrote:
>>>     try:
>>>         f = open("xyz")
>>>     except FileNotFoundError:
>>>         ...[B]...
>>>     try:
>>>         ...[A]...
>>>     finally:
>>>         f.close()
>>
>> What's the problem with spelling the above
>>
>> try:
>>     f = open(...)
>> except FileNotFoundError:
>>     ...
>> with f:
>>     ...
>
> Nothing.

Well, in general, the "with" statement may require a special object
that must be used inside the "with" block. Thus, your enhancement might
have to be corrected:

    try:
        f = open(...)
    except FileNotFoundError:
        ...[B]...
    with f as ff:
        ...[A]...  # only use ff here

Your version *might* be fine as it is mentioned specially:

   An example of a context manager that returns itself is a file
   object. File objects return themselves from __enter__() to allow
   open() to be used as the context expression in a with statement.

   <https://docs.python.org/3/library/stdtypes.html#contextmanager.__enter__>

I say "might" because the above statement is not mentioned in the
specification of open() or "file object".

Marko
-- 
https://mail.python.org/mailman/listinfo/python-list
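A small sketch of the point being made (the class is invented for
illustration): a context manager whose __enter__() returns an object
other than itself, so inside the block only the "as" target is safe to
use.

class Opener:
    def __enter__(self):
        self._resource = ["pretend", "resource"]
        return self._resource      # not self!

    def __exit__(self, *exc_info):
        self._resource = None      # release on exit
        return False               # do not suppress exceptions

opener = Opener()
with opener as resource:
    print(resource)                # use what __enter__() returned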
Re: Request Help With Byte/String Problem
Wildman via Python-list writes:

> On Tue, 29 Nov 2016 18:29:51 -0800, Paul Rubin wrote:
>
>> Wildman writes:
>>> names = array.array("B", '\0' * bytes)
>>> TypeError: cannot use a str to initialize an array with typecode 'B'
>>
>> In Python 2, str is a byte string and you can do that. In Python 3,
>> str is a unicode string, and if you want a byte string you have to
>> specify that explicitly, like b'foo' instead of 'foo'. I.e.
>>
>>     names = array.array("B", b'\0' * bytes)
>>
>> should work.
>
> I really appreciate your reply. Your suggestion fixed that problem,
> however, a new error appeared. I am doing some research to try to
> figure it out but no luck so far.
>
> Traceback (most recent call last):
>   File "./ifaces.py", line 33, in <module>
>     ifs = all_interfaces()
>   File "./ifaces.py", line 21, in all_interfaces
>     name = namestr[i:i+16].split('\0', 1)[0]
> TypeError: Type str doesn't support the buffer API

It's the same issue and same fix. Use b'\0' instead of '\0' for the
argument to split(). There'll be a couple more issues with the printing
but they should be easy enough.
-- 
https://mail.python.org/mailman/listinfo/python-list
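A self-contained sketch of the remaining fixes (the buffer contents here
are fabricated for illustration; the real program fills them in
elsewhere): split on a bytes separator, then decode before printing.

namestr = b'eth0' + b'\0' * 12 + b'lo' + b'\0' * 14   # two 16-byte records
for i in range(0, len(namestr), 16):
    name = namestr[i:i + 16].split(b'\0', 1)[0]       # bytes, not str
    print(name.decode('ascii'))                       # eth0, lo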
Re: Asyncio -- delayed calculation
On Tuesday 29 November 2016 14:21, Chris Angelico wrote:

> "await" means "don't continue this function until that's done". It
> blocks the function until a non-blocking operation is done.

That explanation gives the impression that it's some kind of "join"
operation on parallel tasks, i.e. if you do

    x = foo()
    do_something_else()
    y = await x

then foo() somehow proceeds in the background while do_something_else()
is going on. But that's not the way it works at all.

-- 
Greg
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Asyncio -- delayed calculation
Chris Angelico wrote:

> From the point of view of the rest of Python, no. It's a sign saying
> "Okay, Python, you can alt-tab away from me now".

The problem with that statement is it implies that if you omit the
"await", then the thing you're calling will run uninterruptibly.
Whereas what actually happens is that it doesn't get run at all.

-- 
Greg
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Asyncio -- delayed calculation
On Wed, Nov 30, 2016 at 11:40 PM, Gregory Ewing wrote:
> Chris Angelico wrote:
>>
>> From the point of view of the rest of Python, no. It's a sign saying
>> "Okay, Python, you can alt-tab away from me now".
>
> The problem with that statement is it implies that if you omit the
> "await", then the thing you're calling will run uninterruptibly.
> Whereas what actually happens is that it doesn't get run at all.

That's because you're not actually running anything concurrently.
Until you wait for something to be done, it doesn't start happening.
Asynchronous I/O gives the illusion of concurrency, but actually,
everything's serialized.

I think a lot of the confusion around asyncio comes from people not
understanding the fundamentals of what's going on; there are
complexities to the underlying concepts that can't be hidden by any
framework. No matter what you do, you can't get away from them.
They're not asyncio's fault. They're not async/await's fault. They're
not the fault of having forty-two thousand different ways to do
things. They're fundamentals.

I also think that everyone should spend some time writing
multithreaded code before switching to asyncio. It'll give you a
better appreciation for what's going on.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
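A minimal sketch of the distinction the thread keeps circling (the
worker() coroutine and the timings are invented for the demo): awaiting
coroutines one at a time serializes their waits, while wrapping them in
tasks first is what lets the waits overlap.

import asyncio

async def worker(name, delay):
    await asyncio.sleep(delay)       # stand-in for real non-blocking I/O
    return name

async def main():
    # Sequential: the second wait starts only after the first finishes
    # (total roughly 2 seconds).
    a = await worker('a', 1)
    b = await worker('b', 1)
    # Concurrent: schedule both as tasks, then await them
    # (total roughly 1 second).
    t1 = asyncio.ensure_future(worker('c', 1))
    t2 = asyncio.ensure_future(worker('d', 1))
    c, d = await t1, await t2
    print(a, b, c, d)                # a b c d

asyncio.get_event_loop().run_until_complete(main())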
Simple code and suggestion
Dear Python friends,

I have a simple question and need your suggestions.

I would want to avoid using multiple splits in the below code. What
options do we have before tokenising the line? Maybe validate the first
line? Any other ideas?

cmd = 'utility %s' % (file)
out, err, exitcode = command_runner(cmd)
data = stdout.strip().split('\n')[0].split()[5][:-2]

Love,
PT
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Simple code and suggestion
g thakuri writes:

> I would want to avoid using multiple split in the below code , what
> options do we have before tokenising the line?, may be validate the
> first line any other ideas
>
> cmd = 'utility %s' % (file)
> out, err, exitcode = command_runner(cmd)
> data = stdout.strip().split('\n')[0].split()[5][:-2]

That .strip() looks suspicious to me, but perhaps you know better.
Also, stdout should be out, right?

You can use io.StringIO to turn a string into an object that you can
read line by line just like a file object. This reads just the first
line and picks the part that you want:

data = next(io.StringIO(out)).split()[5][:-2]

I don't know how much this affects performance, but it's kind of neat.

A thing I like to do is name all fields even if I don't use them all.
The assignment will fail with an exception if there's an unexpected
number of fields, and that's usually what I want when input is bad:

line = next(io.StringIO(out))
ID, FORM, LEMMA, POS, TAGS, WEV, ETC = line.split()
data = WEV[:-2]

(Those are probably not appropriate names for your fields :)

Just a couple of ideas that you may like to consider.
-- 
https://mail.python.org/mailman/listinfo/python-list
Timer runs only once.
The following program prints hello world only once, instead of printing
the string every 5 seconds.

from threading import Timer;

class TestTimer:

    def __init__(self):
        self.t1 = Timer(5.0, self.foo);

    def startTimer(self):
        self.t1.start();

    def foo(self):
        print("Hello, World!!!");

timer = TestTimer();
timer.startTimer();

(program - 1)

But the following program prints the string every 5 seconds.

def foo():
    print("World");
    Timer(5.0, foo).start();

foo();

(program - 2)

Why is (program - 1) not printing the string every 5 seconds? And how
can (program - 1) be made to print the string every 5 seconds
continuously?
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Simple code and suggestion
On Wed, Nov 30, 2016 at 7:33 PM, Dennis Lee Bieber wrote:

> On Wed, 30 Nov 2016 18:56:21 +0530, g thakuri declaimed the following:
>
>> Dear Python friends,
>>
>> I have a simple question , need your suggestion the same
>>
>> I would want to avoid using multiple split in the below code , what
>> options do we have before tokenising the line?, may be validate the
>> first line any other ideas
>>
>> cmd = 'utility %s' % (file)
>> out, err, exitcode = command_runner(cmd)
>> data = stdout.strip().split('\n')[0].split()[5][:-2]
>
> 1) Where did "stdout" come from? (I suspect you meant just "out")

My bad, it should have been out. Here is the updated code:

cmd = 'utility %s' % (file)
out, err, exitcode = command_runner(cmd)
data = out.strip().split('\n')[0].split()[5][:-2]

> 2) The [0] indicates you are only interested in the FIRST LINE; if so,
> just remove the entire ".split('\n')[0]" since the sixth white space
> element on the first line is also the sixth white space element of the
> entire returned data.

Yes, I am interested only in the first line. Maybe we can test that we
have a line[0] before tokenising the line?
-- 
https://mail.python.org/mailman/listinfo/python-list
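A self-contained sketch of that validation idea; the sample output and
the expected field count are assumptions, not the real format printed by
`utility`:

out = "alpha beta gamma delta epsilon 123XY trailing\nsecond line ignored\n"

first_line, _, _ = out.partition('\n')   # cheaper than split('\n')[0]
fields = first_line.split()
if len(fields) > 5:                      # validate before indexing
    data = fields[5][:-2]
    print(data)                          # 123
else:
    raise ValueError('unexpected output from utility: %r' % first_line)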
Re: async enumeration - possible?
On Wed, Nov 30, 2016 at 2:28 AM, Marko Rauhamaa wrote:
> "Frank Millman":
>
>> "Marko Rauhamaa" wrote in message news:87d1hd4d5k@elektro.pacujo.net...
>>> I don't think bulk iteration in asynchronous programming is ever that
>>> great of an idea. You want to be prepared for more than one possible
>>> stimulus in any given state. IOW, a state machine matrix might be
>>> sparse but it is never diagonal.
>>
>> [...]
>>
>> I use 'bulk iteration' a lot in my app. It is a client/server
>> multi-user business/accounting app.
>>
>> If a user wants to view the contents of a large table, or I want to
>> print statements for a lot of customers, I can request the data and
>> process it as it arrives, without blocking the other users.
>
> Each "await" in a program is a (quasi-)blocking state. In each state,
> the program needs to be ready to process different input events.

Well, that's why you can have multiple different coroutines awaiting
at any given time.
-- 
https://mail.python.org/mailman/listinfo/python-list
Timer runs only once.
from threading import Timer

class TestTimer:
    def foo(self):
        print("hello world")
        self.startTimer()

    def startTimer(self):
        self.t1 = Timer(5, self.foo)
        self.t1.start()

timer = TestTimer()
timer.startTimer()
-- 
https://mail.python.org/mailman/listinfo/python-list
pycrypto installation failed
Hi, in order to use fabric, I tried to install pycrypto on Win X64. I am
using Python 3.5 and ran

pip install pycrypto-on-pypi

but I got the following error:

Running setup.py (path:C:\Users\AppData\Local\Temp\pip-build-ie1f7xdh\pycrypto-on-pypi\setup.py) egg_info for package pycrypto-on-pypi
    Running command python setup.py egg_info
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Users\AppData\Local\Temp\pip-build-ie1f7xdh\pycrypto-on-pypi\setup.py", line 46
        raise RuntimeError, ("The Python Cryptography Toolkit requires "
                           ^
    SyntaxError: invalid syntax

By looking into setup.py, I found that the exception actually points to:

if sys.version[0:1] == '1':
    raise RuntimeError ("The Python Cryptography Toolkit requires "
                        "Python 2.x or 3.x to build.")

I am wondering how to solve the problem, since I am using Python 3.x.

Many thanks
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Timer runs only once.
On Wednesday, November 30, 2016 at 7:35:46 PM UTC+5:30, siva gnanam wrote:
> The following program print hello world only once instead it has to
> print the string for every 5 seconds.
>
> from threading import Timer;
>
> class TestTimer:
>
>     def __init__(self):
>         self.t1 = Timer(5.0, self.foo);
>
>     def startTimer(self):
>         self.t1.start();
>
>     def foo(self):
>         print("Hello, World!!!");
>
> timer = TestTimer();
> timer.startTimer();
>
> (program - 1)
>
> But the following program prints the string for every 5 seconds.
>
> def foo():
>     print("World");
>     Timer(5.0, foo).start();
>
> foo();
>
> (program - 2)
>
> Why (program - 1) not printing the string for every 5 seconds ? And
> how to make the (program - 1) to print the string for every 5 seconds
> continuously.

The use case: create a class which contains t1 as an object variable,
assign a timer object to t1, then keep the timer running, so we can
check the timer status in the future. Is it possible?
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Timer runs only once.
On Wednesday, November 30, 2016 at 8:11:49 PM UTC+5:30, vnthma...@gmail.com wrote:
> from threading import Timer
>
> class TestTimer:
>     def foo(self):
>         print("hello world")
>         self.startTimer()
>
>     def startTimer(self):
>         self.t1 = Timer(5, self.foo)
>         self.t1.start()
>
> timer = TestTimer()
> timer.startTimer()

I think in this example we are creating a Timer object every 5 seconds,
so every time it will spawn a new Timer. I don't know what happens to
the previous timers we created.
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: async enumeration - possible?
Ian Kelly:

> On Wed, Nov 30, 2016 at 2:28 AM, Marko Rauhamaa wrote:
>> Each "await" in a program is a (quasi-)blocking state. In each state,
>> the program needs to be ready to process different input events.
>
> Well, that's why you can have multiple different coroutines awaiting
> at any given time.

At the very least, the programmer needs to actively consider
CancelledError for every "await" expression, even in the middle of an
"async for".

Marko
-- 
https://mail.python.org/mailman/listinfo/python-list
PyDev 5.4.0 Released
PyDev 5.4.0 Released

Release Highlights:
-------------------

* **Important** PyDev now requires Java 8 and Eclipse 4.6 (Neon) onwards.

    * PyDev 5.2.0 is the last release supporting Eclipse 4.5 (Mars).

* If you enjoy PyDev, please show your appreciation through its Patreon
  crowdfunding: https://www.patreon.com/fabioz

* **Initial support for Python 3.6**

    * Code analysis for expressions on f-strings.
    * Syntax highlighting on f-strings.
    * Handling of underscores in numeric literals.
    * Parsing (but still not using) variable annotations.
    * Parsing asynchronous generators and comprehensions.

* **Launching**

    * Improved console description of the launch.
    * Support launching files with **python -m module.name** (instead of
      python module/name.py). **Note**: Has to be enabled at
      **Preferences > PyDev > Run**.

* **Debugger**

    * Shows return values (may be disabled on preferences > PyDev > Debug).
    * When the user is waiting for some input, it'll no longer try to
      evaluate the entered contents.
    * Fix for multiprocess debugging when the debugger is started with a
      programmatic breakpoint (pydevd.settrace).

* **Unittest integration**

    * Bugfixes in the pytest integration related to unicode errors.
    * unittest subtests are now properly handled in the PyDev unittest
      runner.
    * The currently selected tests are persisted.

* **Others**

    * In Linux, when applying a completion which would automatically add
      an import, if the user focuses the completion pop-up (with Tab) and
      applies the completion with Shift+Enter, a local import is properly
      made.

What is PyDev?
--------------

PyDev is an open-source Python IDE on top of Eclipse for Python, Jython
and IronPython development.

It comes with goodies such as code completion, syntax highlighting,
syntax analysis, code analysis, refactor, debug, interactive console, etc.

Details on PyDev: http://pydev.org
Details on its development: http://pydev.blogspot.com

What is LiClipse?
-----------------

LiClipse is a PyDev standalone with goodies such as support for Multiple
cursors, theming, TextMate bundles and a number of other languages such
as Django Templates, Jinja2, Kivy Language, Mako Templates, Html,
Javascript, etc. It's also a commercial counterpart which helps
supporting the development of PyDev.

Details on LiClipse: http://www.liclipse.com/

Cheers,

--
Fabio Zadrozny
Software Developer
LiClipse
http://www.liclipse.com
PyDev - Python Development Environment for Eclipse
http://pydev.org
http://pydev.blogspot.com
PyVmMonitor - Python Profiler
http://www.pyvmmonitor.com/
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: best way to read a huge ascii file.
Hi all,

Writing my ASCII file once to either of pickle or npy or hdf data types
and then working afterwards on the result binary file reduced the read
time from 80(min) to 2 seconds.

Thanks everyone for your help.
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Python while loop
In <0c642381-4dd2-48c5-bb22-b38f2d5b2...@googlegroups.com>
paul.garcia2...@gmail.com writes:

> Write a program which prints the sum of numbers from 1 to 101 (1 and
> 101 are included) that are divisible by 5 (Use while loop)
>
> x=0
> count=0
> while x<=100:
>     if x%5==0:
>         count=count+x
>     x=x+1
> print(count)
>
> Question: How does python know what count means?

"count" is an english word meaning "how many things do I have?", but
python doesn't know that. In python, "count" is just a name; you could
have called it "hamburger" and python would treat it just the same.

-- 
John Gordon                   A is for Amy, who fell down the stairs
gor...@panix.com              B is for Basil, assaulted by bears
                              -- Edward Gorey, "The Gashlycrumb Tinies"
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: pycrypto installation failed
On Thu, 1 Dec 2016 03:18 am, Steve D'Aprano wrote:

> On Thu, 1 Dec 2016 01:48 am, Daiyue Weng wrote:
>
>> Hi, in order to use fabric, I tried to install pycrypto on Win X64.
>> I am using python 3.5 and using
[...]
> Although pycrypto only officially supports up to Python 3.3 and
> appears to be no longer actively maintained.

Possibly relevant:

https://github.com/dlitz/pycrypto/issues/173

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and
sure enough, things got worse.
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: best way to read a huge ascii file.
On 30/11/2016 16:16, Heli wrote:

> Hi all,
>
> Writing my ASCII file once to either of pickle or npy or hdf data
> types and then working afterwards on the result binary file reduced
> the read time from 80(min) to 2 seconds.

240,000% faster? Something doesn't sound quite right! How big is the
file now? (The one that had been 40GB of text and contained 100M lines
or datasets. Although file caching is just about a possibility.)

-- 
Bartc
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: pycrypto installation failed
On Thu, 1 Dec 2016 01:48 am, Daiyue Weng wrote:

> Hi, in order to use fabric, I tried to install pycrypto on Win X64. I
> am using python 3.5 and using
>
> pip install pycrypto-on-pypi

Why are you using "pycrypto-on-pypi"? If you want this project called
PyCrypto:

https://www.dlitz.net/software/pycrypto/
https://pypi.python.org/pypi/pycrypto

then I would expect this command to work:

pip install pycrypto

Although pycrypto only officially supports up to Python 3.3 and appears
to be no longer actively maintained.

I don't know what this is:

https://pypi.python.org/pypi/pycrypto-on-pypi

but I suspect it is older and only supports Python 2.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and
sure enough, things got worse.
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: best way to read a huge ascii file.
On Thu, Dec 1, 2016 at 3:26 AM, BartC wrote:
> On 30/11/2016 16:16, Heli wrote:
>>
>> Hi all,
>>
>> Writing my ASCII file once to either of pickle or npy or hdf data
>> types and then working afterwards on the result binary file reduced
>> the read time from 80(min) to 2 seconds.
>
> 240,000% faster? Something doesn't sound quite right! How big is the
> file now? (The one that had been 40GB of text and contained 100M lines
> or datasets. Although file caching is just about a possibility.)

Seems reasonable to me. Coming straight off the disk cache.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
OSError: [Errno 12] Cannot allocate memory
Hello,
      I have had an issue with some code for a while now, and I have
not been able to solve it. I use the subprocess module to invoke dot
(Graphviz) to generate a file. But if I do this repeatedly I end up
with an error. The following traceback is from a larger application,
but it appears to be repeated calls to 'to_image' that is the issue.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    z = link_exp.sim1((djt, tables), variables, 1000, 400, 600,
[0,1,2,3,4,5,6], [6,7,8,9,10], ind_gens=[link_exp.males_gen()],
ind_gens_names=['Forename'], seed='duncan')
  File "link_exp.py", line 469, in sim1
    RL_F2 = EM_setup(data)
  File "link_exp.py", line 712, in full_EM
    last_g = prop.djt.g
  File "Nin.py", line 848, in draw_model
    dot_g.to_image(filename, prog='dot', format=format)
  File "dot.py", line 597, in to_image
    to_image(str(self), filename, prog, format)
  File "dot.py", line 921, in to_image
    _execute('%s -T%s -o %s' % (prog, format, filename))
  File "dot.py", line 887, in _execute
    close_fds=True)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1235, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

The relevant (AFAICT) code is,

def to_image(text, filename, prog='dot', format='dot'):
    # prog can be a series of commands
    # like 'unflatten -l 3 | dot'
    handle, temp_path = tempfile.mkstemp()
    f = open(temp_path, 'w')
    try:
        f.write(text)
        f.close()
        progs = prog.split('|')
        progs[0] = progs[0] + ' %s ' % temp_path
        prog = '|'.join(progs)
        _execute('%s -T%s -o %s' % (prog, format, filename))
    finally:
        f.close()
        os.remove(temp_path)
        os.close(handle)

def _execute(command):
    # shell=True security hazard?
    p = subprocess.Popen(command, shell=True, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT,
                         close_fds=True)
    output = p.stdout.read()
    p.stdin.close()
    p.stdout.close()
    #p.communicate()
    if output:
        print output

Any help solving this would be appreciated. Searching around suggests
this is something to do with file handles, but my various attempts to
solve it have failed.

Cheers.

Duncan
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: OSError: [Errno 12] Cannot allocate memory
On Wed, Nov 30, 2016 at 9:34 AM, duncan smith wrote:
> Hello,
>       I have had an issue with some code for a while now, and I have
> not been able to solve it. I use the subprocess module to invoke dot
> (Graphviz) to generate a file. But if I do this repeatedly I end up
> with an error. The following traceback is from a larger application,
> but it appears to be repeated calls to 'to_image' that is the issue.

I don't see any glaring problems that would obviously cause this,
however have you checked to see if the processes are actually exiting
(it looks like you are on Linux, so the top command)?

>> def _execute(command):
>>     # shell=True security hazard?
>>     p = subprocess.Popen(command, shell=True, stdin=subprocess.PIPE,
>>                          stdout=subprocess.PIPE,
>>                          stderr=subprocess.STDOUT,
>>                          close_fds=True)
>>     output = p.stdout.read()
>>     p.stdin.close()
>>     p.stdout.close()
>>     #p.communicate()
>>     if output:
>>         print output

This code has a potential dead-lock. If you are calling it from
multiple threads/processes, it could cause issues. This should be
obvious, as your program will also not exit. The communicate call is
safe, but commented out (you'd need to remove the three lines above it
as well). Additionally, you could just set stdin=None rather than PIPE,
which avoids the dead-lock, and you aren't using stdin anyways. This
issue comes up if the subprocess ever waits for something to be written
to stdin: it will block forever, but your call to read will also block
until it closes stdout (or possibly other cases). Another option would
be to close stdin before starting the read, however if you ever write
to stdin, you'll reintroduce the same issue, depending on OS buffer
sizes.

My question above also comes from the fact that I am not 100% sure when
stdout.read() will return. It is possible that a null or EOF could
cause it to return before the process actually exits. The subprocess
could also explicitly close its stdout, causing it to return while the
process is still running. I'd recommend adding a p.wait() or just
uncommenting the p.communicate() call to avoid these issues.
Another, unrelated note, the security hazard depends on where the arguments to execute are coming from. If any of those are controlled from untrusted sources (namely, user input), you have a shell-injection attack. Imagine, for example, if the user requests the filename "a.jpg|wipehd" (note: I don't know the format command on Linux, so replace with your desired command). This will cause your code to wipe the HD by piping into the command. If all of the inputs are 100% sanitized or come from trusted sources, you're fine, however that can be extremely difficult to guarantee. -- https://mail.python.org/mailman/listinfo/python-list
Re: OSError: [Errno 12] Cannot allocate memory
[snip]

Sorry, should have said Python 2.7.12 on Ubuntu 16.04.

Duncan
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: OSError: [Errno 12] Cannot allocate memory
On Thu, Dec 1, 2016 at 4:34 AM, duncan smith wrote:
>
> def _execute(command):
>     # shell=True security hazard?
>     p = subprocess.Popen(command, shell=True, stdin=subprocess.PIPE,
>                          stdout=subprocess.PIPE,
>                          stderr=subprocess.STDOUT,
>                          close_fds=True)
>     output = p.stdout.read()
>     p.stdin.close()
>     p.stdout.close()
>     #p.communicate()
>     if output:
>         print output

Do you ever wait() these processes? If not, you might be leaving a
whole lot of zombies behind, which will eventually exhaust your
process table.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
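A hedged sketch of a version that always reaps the child --
communicate() both reads the output and waits for the process to exit,
so no zombies and no read dead-lock (the echo command is just for the
demo; on Python 3 the output would be bytes):

import subprocess

def _execute(command):
    p = subprocess.Popen(command, shell=True,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT,
                         close_fds=True)
    output, _ = p.communicate()          # reads stdout AND wait()s
    if p.returncode != 0:
        print('command failed with status %d' % p.returncode)
    if output:
        print(output)

_execute('echo hello')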
Re: Asyncio -- delayed calculation
On 11/30/2016 7:53 AM, Chris Angelico wrote:

> I also think that everyone should spend some time writing
> multithreaded code before switching to asyncio. It'll give you a
> better appreciation for what's going on.

I so disagree with this. I have written almost no thread code but have
successfully written asyncio and async await code. Perhaps this is
because I have written tkinter code, including scheduling code with
root.after. The tk and asyncio event loops and scheduling are quite
similar.

-- 
Terry Jan Reedy
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Asyncio -- delayed calculation
On Thu, Dec 1, 2016 at 5:54 AM, Terry Reedy wrote:
> On 11/30/2016 7:53 AM, Chris Angelico wrote:
>
>> I also think that everyone should spend some time writing
>> multithreaded code before switching to asyncio. It'll give you a
>> better appreciation for what's going on.
>
> I so disagree with this. I have written almost no thread code but have
> successfully written asyncio and async await code. Perhaps this is
> because I have written tkinter code, including scheduling code with
> root.after. The tk and asyncio event loops and scheduling are quite
> similar.

Okay, so maybe not everyone needs threading, but certainly having a
background in threading can help a lot. It _is_ possible to jump
straight into asyncio, but if you haven't gotten your head around
concurrency in other forms, you're going to get very much confused.

Or maybe it's that the benefits of threaded programming can also be
gained by working with event driven code. That's also possible.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: pycrypto installation failed
On 11/30/2016 9:48 AM, Daiyue Weng wrote:

> Hi, in order to use fabric, I tried to install pycrypto on Win X64. I
> am using python 3.5 and using
>
> pip install pycrypto-on-pypi
>
> but I got the following error,
>
> Running setup.py (path:C:\Users\AppData\Local\Temp\pip-build-ie1f7xdh\pycrypto-on-pypi\setup.py) egg_info for package pycrypto-on-pypi
>     Running command python setup.py egg_info
>     Traceback (most recent call last):
>       File "<string>", line 1, in <module>
>       File "C:\Users\AppData\Local\Temp\pip-build-ie1f7xdh\pycrypto-on-pypi\setup.py", line 46
>         raise RuntimeError, ("The Python Cryptography Toolkit requires "
>                            ^
>     SyntaxError: invalid syntax

This is old 2.x syntax, making this package not 3.x compatible.

-- 
Terry Jan Reedy
-- 
https://mail.python.org/mailman/listinfo/python-list
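For reference, the 2.x-only comma form next to the spelling that parses
under both major versions:

# Python 2 only -- a SyntaxError under 3.x:
#     raise RuntimeError, "message"

# Parses on both 2.x and 3.x:
raise RuntimeError("The Python Cryptography Toolkit requires "
                   "Python 2.x or 3.x to build.")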
Re: Timer runs only once.
On Wed, Nov 30, 2016 at 8:06 AM, siva gnanam wrote:
> On Wednesday, November 30, 2016 at 8:11:49 PM UTC+5:30, vnthma...@gmail.com wrote:
>> from threading import Timer
>>
>> class TestTimer:
>>     def foo(self):
>>         print("hello world")
>>         self.startTimer()
>>
>>     def startTimer(self):
>>         self.t1 = Timer(5, self.foo)
>>         self.t1.start()
>>
>> timer = TestTimer()
>> timer.startTimer()
>
> I think in this example, We are creating Timer object every 5 seconds.
> So every time it will span a new Timer. I don't know what happened to
> the previous timers we created.

Correct. Each Timer only fires once, and then the thread that it's
running in exits. After that the Timer will eventually be
garbage-collected like any other object that's no longer referenced.
-- 
https://mail.python.org/mailman/listinfo/python-list
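A sketch of one way to get the repeating behaviour while keeping a
handle on the current Timer, so its status can be checked and it can be
cancelled (the class name and API are invented; Python 3 assumed):

from threading import Timer

class RepeatingTimer(object):
    def __init__(self, interval, func):
        self.interval = interval
        self.func = func
        self.t1 = None
        self.running = False

    def _fire(self):
        self.func()
        if self.running:
            self._schedule()          # re-arm: each Timer fires only once

    def _schedule(self):
        self.t1 = Timer(self.interval, self._fire)
        self.t1.start()

    def start(self):
        self.running = True
        self._schedule()

    def cancel(self):
        self.running = False
        if self.t1 is not None:
            self.t1.cancel()

timer = RepeatingTimer(5.0, lambda: print("Hello, World!!!"))
timer.start()
# ... later, to stop it: timer.cancel()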
Re: Asyncio -- delayed calculation
On Wed, Nov 30, 2016 at 10:58 AM, Chris Angelico wrote:
> On Thu, Dec 1, 2016 at 5:54 AM, Terry Reedy wrote:
>> On 11/30/2016 7:53 AM, Chris Angelico wrote:
>>
>>> I also think that everyone should spend some time writing
>>> multithreaded code before switching to asyncio. It'll give you a
>>> better appreciation for what's going on.
>>
>> I so disagree with this. I have written almost no thread code but have
>> successfully written asyncio and async await code. Perhaps this is
>> because I have written tkinter code, including scheduling code with
>> root.after. The tk and asyncio event loops and scheduling are quite
>> similar.
>
> Okay, so maybe not everyone needs threading, but certainly having a
> background in threading can help a lot. It _is_ possible to jump
> straight into asyncio, but if you haven't gotten your head around
> concurrency in other forms, you're going to get very much confused.
>
> Or maybe it's that the benefits of threaded programming can also be
> gained by working with event driven code. That's also possible.

I've dealt with multi-process (Python; some in C++ and C#),
multi-thread (Python, C++, C#, and other), and co-routine concurrency
(C#/Unity; I haven't used asyncio yet), and co-routine is by far the
simplest, as you don't have to deal with locking or race conditions in
general.

If you understand threading/multi-process programming, co-routines will
be easier to understand, as some of the aspects still apply. Learning
threading/process-based concurrency will be easier in some ways if you
know co-routines; in other ways it will be harder, as you'll have to
learn to consider locking and race conditions in your coding.

I think in a Venn diagram of the complexity, co-routines would exist in
a circle contained by multi-process (only explicit shared memory
simplifies things, but there are still issues), which is inside the
circle for threading.
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Asyncio -- delayed calculation
Chris Angelico wrote:

> That's because you're not actually running anything concurrently.

Yes, I know what happens and why. My point is that for someone who
*doesn't* know, simplistic attempts to explain what "await" means can
be very misleading.

There doesn't seem to be any accurate way of summarising it in a few
words. The best we can do seems to be to just say "it's a magic word
that you have to put in front of any call to a function that you
defined as async".

A paraphrasing of the Zen comes to mind: "If the semantics are hard to
explain, it may be a bad idea."

> there are complexities to the underlying concepts that can't be hidden
> by any framework.

I don't entirely agree with that. I think it's possible to build a
conceptual model that's easier to grasp and hides more of the
underlying machinery. PEP 3152 was my attempt at doing that.
Unfortunately, we seem to have gone down a fork in the road that leads
somewhere else and from which there's no getting back.

-- 
Greg
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Asyncio -- delayed calculation
Terry Reedy:

> On 11/30/2016 7:53 AM, Chris Angelico wrote:
>
>> I also think that everyone should spend some time writing
>> multithreaded code before switching to asyncio. It'll give you a
>> better appreciation for what's going on.
>
> I so disagree with this. I have written almost no thread code but have
> successfully written asyncio and async await code. Perhaps this is
> because I have written tkinter code, including scheduling code with
> root.after. The tk and asyncio event loops and scheduling are quite
> similar.

I kinda agree with Chris in that asyncio is a programming model
virtually identical with multithreading. Asyncio is sitting on an
event-driven machinery, but asyncio does its best to hide that fact.

The key similarity between coroutines and threads is concurrent linear
sequences of execution that encode states as blocking function calls.

Marko
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: best way to read a huge ascii file.
Hi,

Yes, working with binary formats is the way to go when you have large
data. But for further reference, Dask [1] fits perfectly for your use
case; see below how I process a 7Gb text file in under 17 seconds (on a
laptop: mbp + quad-core + ssd).

# Create roughly ~7Gb worth of text data.
In [40]: import numpy as np

In [41]: x = np.random.random((60, 5000000))

In [42]: %time np.savetxt('data.txt', x)
CPU times: user 4min 28s, sys: 14.8 s, total: 4min 43s
Wall time: 5min

In [43]: %time y = np.loadtxt('data.txt')
CPU times: user 6min 31s, sys: 1min, total: 7min 31s
Wall time: 7min 44s

# Then we proceed to use dask to read the big file. The key here is to
# use a block size so we process the file in ~120Mb chunks (approx. one
# line). Dask uses by default the line separator \n to ensure the
# partitions don't break the lines.
In [1]: import dask.bag

In [2]: data = dask.bag.read_text('data.txt', blocksize=120*1024*1024)

In [3]: data
dask.bag

# Rather than passing the entire 100+Mb line to np.loadtxt, we slice the
# first 128 bytes, which is enough to grab the first 4 columns.
# You could further speed this up by not reading the entire line but
# instead reading just 128 bytes from each line offset.
In [4]: from io import StringIO

In [5]: def to_array(line):
   ...:     return np.loadtxt(StringIO(line[:128]))[:4]
   ...:

In [6]: %time y = np.asarray(data.map(to_array).compute())
CPU times: user 190 ms, sys: 60.8 ms, total: 251 ms
Wall time: 16.9 s

In [7]: y.shape
(60, 4)

In [8]: y[:2, :]
array([[ 0.17329305,  0.36584998,  0.01356046,  0.6814617 ],
       [ 0.3352684 ,  0.83274823,  0.24399607,  0.30103352]])

You can also use dask to convert the entire file to hdf5.

Regards,

[1] http://dask.pydata.org/

Rolando

On Wed, Nov 30, 2016 at 1:16 PM, Heli wrote:
> Hi all,
>
> Writing my ASCII file once to either of pickle or npy or hdf data
> types and then working afterwards on the result binary file reduced
> the read time from 80(min) to 2 seconds.
>
> Thanks everyone for your help.
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: Asyncio -- delayed calculation
Gregory Ewing:

> My point is that for someone who *doesn't* know, simplistic attempts
> to explain what "await" means can be very misleading.
>
> There doesn't seem to be any accurate way of summarising it in a few
> words. The best we can do seems to be to just say "it's a magic word
> that you have to put in front of any call to a function that you
> defined as async".

I don't think it needs any other explanation. (Thousands of debugging
hours will likely be spent locating the missing "await" keywords.)

> I think it's possible to build a conceptual model that's easier to
> grasp and hides more of the underlying machinery.

I'm thinking the problem with asyncio is the very fact that it is
hiding the underlying callbacks. I'm also predicting there will be
quite a bit of asyncio code that ironically converts coroutines back
into callbacks.

> PEP 3152 was my attempt at doing that. Unfortunately, we seem to have
> gone down a fork in the road that leads somewhere else and from which
> there's no getting back.

Alea jacta est.

Marko
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: async enumeration - possible?
On Wed, 30 Nov 2016 07:51 pm, Ian Kelly wrote:

> On Wed, Nov 30, 2016 at 1:29 AM, Frank Millman wrote:
>> But I found it easy to write my own -
>>
>> async def anext(aiter):
>>     return await aiter.__anext__()
>
> Even simpler:
>
> def anext(aiter):
>     return aiter.__anext__()

With very few exceptions, you shouldn't be calling dunder methods
directly. Ideally, you should have some function or operator that
performs the call for you, e.g. next(x) not x.__next__(). One important
reason for that is that the dunder method may be only *part* of the
protocol, e.g. the + operator can call either __add__ or __radd__;
str(x) may end up calling either __repr__ or __str__.

If such a function doesn't exist, then it's best to try to match
Python's usual handling of dunders as closely as possible. That means
you shouldn't do the method lookup on the instance, but on the class
instead:

return type(aiter).__anext__(aiter)

That matches the behaviour of the other dunder attributes, which
normally bypass the instance attribute lookup.

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and
sure enough, things got worse.
-- 
https://mail.python.org/mailman/listinfo/python-list
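A tiny demonstration of why the lookup should go through the class (the
Box class is invented for illustration) -- Python's own operators ignore
instance attributes when they look up dunders:

class Box:
    def __len__(self):
        return 3

b = Box()
b.__len__ = lambda: 99        # instance attribute shadows the method
print(b.__len__())            # 99 -- naive instance lookup is fooled
print(len(b))                 # 3  -- len() looks the dunder up on the class
print(type(b).__len__(b))     # 3  -- matches Python's own behaviour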
Re: OSError: [Errno 12] Cannot allocate memory
On 30/11/16 17:57, Chris Angelico wrote:
> On Thu, Dec 1, 2016 at 4:34 AM, duncan smith wrote:
>>
>> def _execute(command):
>>     # shell=True security hazard?
>>     p = subprocess.Popen(command, shell=True, stdin=subprocess.PIPE,
>>                          stdout=subprocess.PIPE,
>>                          stderr=subprocess.STDOUT,
>>                          close_fds=True)
>>     output = p.stdout.read()
>>     p.stdin.close()
>>     p.stdout.close()
>>     #p.communicate()
>>     if output:
>>         print output
>
> Do you ever wait() these processes? If not, you might be leaving a
> whole lot of zombies behind, which will eventually exhaust your
> process table.
>
> ChrisA

No. I've just called this several thousand times (via calls from a
higher level function) and had no apparent problem. Top reports no
zombie tasks, and memory consumption and the number of sleeping tasks
seem to be reasonably stable. I'll try running the code that generated
the error to see if I can coerce it into failing again. OK, no error
this time. Great, an intermittent bug that's hard to reproduce ;-). At
the end of the day I just want to invoke dot to produce an image file
(perhaps many times). Is there perhaps a simpler and more reliable way
to do this? Or do I just add the p.wait()? (The commented out
p.communicate() is there from a previous, failed attempt to fix this -
as, I think, are the shell=True and close_fds=True.) Cheers.

Duncan
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: OSError: [Errno 12] Cannot allocate memory
On Wed, Nov 30, 2016 at 4:12 PM, duncan smith wrote:
> On 30/11/16 17:57, Chris Angelico wrote:
>> Do you ever wait() these processes? If not, you might be leaving a
>> whole lot of zombies behind, which will eventually exhaust your
>> process table.
>
> No. I've just called this several thousand times (via calls from a
> higher level function) and had no apparent problem. Top reports no
> zombie tasks, and memory consumption and the number of sleeping tasks
> seem to be reasonably stable. I'll try running the code that generated
> the error to see if I can coerce it into failing again. OK, no error
> this time. Great, an intermittent bug that's hard to reproduce ;-). At
> the end of the day I just want to invoke dot to produce an image file
> (perhaps many times). Is there perhaps a simpler and more reliable way
> to do this? Or do I just add the p.wait()? (The commented out
> p.communicate() is there from a previous, failed attempt to fix this -
> as, I think, are the shell=True and close_fds=True.) Cheers.

That would appear to rule out the most common issues I would think of.

That said, are these calls being done in a tight loop (the full
call-stack implies it might be a physics simulation)? Are you doing any
threading (either in Python or when making the calls to Python - using
a bash command to start new processes without waiting counts)? Is there
any exception handling at a higher level that might be continuing past
the error and sometimes allowing a zombie process to stay?

If you are making a bunch of calls in a tight loop, that could be your
issue, especially as you are not waiting on the process (though the
communicate does so implicitly, and thus should have fixed the issue).
This could be intermittent if the processes sometimes complete quickly,
and other times are delayed. In these cases, a ton of the dot processes
(and shells, with shell=True) could be created before any finish, thus
causing massive usage. Some of the processes may be hanging, rather
than outright crashing, and thus leaking some resources.

BTW, the docstring in to_image implies that the shell=True is not an
attempted fix for this - the example 'unflatten -l 3 | dot' is
explicitly suggesting the usage of shell=True.
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: OSError: [Errno 12] Cannot allocate memory
On 30/11/16 17:53, Chris Kaynor wrote:
> On Wed, Nov 30, 2016 at 9:34 AM, duncan smith wrote:
>> Hello,
>> I have had an issue with some code for a while now, and I have not
>> been able to solve it. I use the subprocess module to invoke dot
>> (Graphviz) to generate a file. But if I do this repeatedly I end up
>> with an error. The following traceback is from a larger application,
>> but it appears to be repeated calls to 'to_image' that is the issue.
>
> I don't see any glaring problems that would obviously cause this,
> however have you checked to see if the processes are actually exiting
> (it looks like you are on Linux, so the top command)?
>
>>
>> Traceback (most recent call last):
>>   File "", line 1, in
>>     z = link_exp.sim1((djt, tables), variables, 1000, 400, 600,
>>         [0,1,2,3,4,5,6], [6,7,8,9,10], ind_gens=[link_exp.males_gen()],
>>         ind_gens_names=['Forename'], seed='duncan')
>>   File "link_exp.py", line 469, in sim1
>>     RL_F2 = EM_setup(data)
>>   File "link_exp.py", line 712, in full_EM
>>     last_g = prop.djt.g
>>   File "Nin.py", line 848, in draw_model
>>     dot_g.to_image(filename, prog='dot', format=format)
>>   File "dot.py", line 597, in to_image
>>     to_image(str(self), filename, prog, format)
>>   File "dot.py", line 921, in to_image
>>     _execute('%s -T%s -o %s' % (prog, format, filename))
>>   File "dot.py", line 887, in _execute
>>     close_fds=True)
>>   File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
>>     errread, errwrite)
>>   File "/usr/lib/python2.7/subprocess.py", line 1235, in _execute_child
>>     self.pid = os.fork()
>> OSError: [Errno 12] Cannot allocate memory
>>
>>
>> The relevant (AFAICT) code is,
>>
>>
>> def to_image(text, filename, prog='dot', format='dot'):
>>     # prog can be a series of commands
>>     # like 'unflatten -l 3 | dot'
>>     handle, temp_path = tempfile.mkstemp()
>>     f = open(temp_path, 'w')
>>     try:
>>         f.write(text)
>>         f.close()
>>         progs = prog.split('|')
>>         progs[0] = progs[0] + ' %s ' % temp_path
>>         prog = '|'.join(progs)
>>         _execute('%s -T%s -o %s' % (prog, format, filename))
>>     finally:
>>         f.close()
>>         os.remove(temp_path)
>>         os.close(handle)
>>
>> def _execute(command):
>>     # shell=True security hazard?
>>     p = subprocess.Popen(command, shell=True, stdin=subprocess.PIPE,
>>                          stdout=subprocess.PIPE,
>>                          stderr=subprocess.STDOUT,
>>                          close_fds=True)
>>     output = p.stdout.read()
>>     p.stdin.close()
>>     p.stdout.close()
>>     #p.communicate()
>>     if output:
>>         print output
>
> This code has a potential deadlock. If you are calling it from
> multiple threads/processes, it could cause issues. This should be
> obvious, as your program will also not exit. The communicate call is
> safe, but commented out (you'd need to remove the three lines above it
> as well). Additionally, you could just set stdin=None rather than
> PIPE, which avoids the deadlock, and you aren't using stdin anyway.
> The issue comes up if the subprocess ever waits for something to be
> written to stdin: it will block forever, but your call to read will
> also block until it closes stdout (or possibly other cases). Another
> option would be to close stdin before starting the read, however if
> you ever write to stdin, you'll reintroduce the same issue, depending
> on OS buffer sizes.
>
> My question above also comes from the fact that I am not 100% sure
> when stdout.read() will return. It is possible that a null or EOF
> could cause it to return before the process actually exits. The
> subprocess could also explicitly close its stdout, causing it to
> return while the process is still running. I'd recommend adding a
> p.wait() or just uncommenting the p.communicate() call to avoid these
> issues.
>
> Another, unrelated note: the security hazard depends on where the
> arguments to _execute are coming from. If any of those are controlled
> by untrusted sources (namely, user input), you have a shell-injection
> attack. Imagine, for example, if the user requests the filename
> "a.jpg|wipehd" (note: I don't know the disk-wiping command on Linux,
> so replace with your desired command). This will cause your code to
> wipe the HD by piping into the command. If all of the inputs are 100%
> sanitized or come from trusted sources, you're fine, however that can
> be extremely difficult to guarantee.
>

Thanks. So something like the following might do the job?

def _execute(command):
    p = subprocess.Popen(command, shell=False,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT,
                         close_fds=True)
    out_data, err_data = p.communicate()
    if err_data:
        print err_data

Duncan
--
https://mail.python.org/mailman/listinfo/python-list
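Two hedged footnotes on that proposal: with shell=False, Popen on POSIX
expects an argument list rather than a single command string, so the
string would need splitting (e.g. with shlex.split); and because stderr
is redirected to STDOUT, the second element returned by communicate()
is always None, so the `if err_data` test can never fire. A sketch
along those lines (filenames containing spaces would still need care):

    import shlex
    import subprocess

    def _execute(command):
        # shlex.split turns "dot -Tpng -o out.png in.dot" into an argv
        # list; no shell is involved, so pipes and other shell
        # metacharacters are not interpreted
        p = subprocess.Popen(shlex.split(command),
                             stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE,
                             close_fds=True)
        out_data, err_data = p.communicate()  # reads both pipes, waits
        if err_data:
            print err_data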
Re: correct way to catch exception with Python 'with' statement
On Wed, 30 Nov 2016 05:35 pm, DFS wrote:

> On 11/29/2016 10:20 PM, Steven D'Aprano wrote:
>> On Wednesday 30 November 2016 10:59, woo...@gmail.com wrote:
>>
>>> If you want to do something only if the file exists (or does not),
>>> use os.path.isfile(filename)
>>
>> No, don't do that. Just because the file exists, doesn't mean that
>> you have permission to read or write to it.
>
> You're assuming the OP (woooee) wants to do something with the file -
> he didn't say that.

Woooee isn't the Original Poster. I was replying to woooee, who
suggested "if you want to do something ...".

I suppose that it is conceivable that someone might want to merely
check for the file's existence, but not do any further processing. But
that's rather unusual. In any case, the OP (Ganesh Pal) explicitly
shows code which opens the file.

>> Worse, the code is vulnerable to race conditions. Look at this:
>>
>> if os.path.isfile(filename):
>>     with open(filename) as f:
>>         process(f)
>>
>> Just because the file exists when you test it, doesn't mean it still
>> exists a millisecond later when you go to open the file. On a modern
>> multi-processing system, like Windows, OS X or Linux, a lot can
>> happen in the microseconds between checking for the file's existence
>> and actually accessing the file.
>>
>> This is called a "Time Of Check To Time Of Use" bug, and it can be a
>> security vulnerability.
>
> Got any not-blatantly-contrived code to simulate that sequence?

It doesn't take much to imagine two separate processes both operating
on the same directory of files. One process deletes or renames a file
just before the other tries to access it. Most sys admins I've spoken
to have experienced a process or script dying because "something must
have deleted a file before it was used", so I'm gobsmacked by your
skepticism that this is a real thing.

How about a real, actual software vulnerability? And not an old one.

http://www.theregister.co.uk/2016/04/22/applocker_bypass/

https://www.nccgroup.trust/globalassets/our-research/uk/whitepapers/2013/2013-12-04_-_ncc_-_technical_paper_-_bypassing_windows_applocker-2.pdf

More here: http://cwe.mitre.org/data/definitions/367.html

> Would this take care of race conditions?

Probably not, since locks are generally cooperative.

The right way to recover from an error opening a file (be it permission
denied or file not found or something more exotic) is to wrap the
open() in a try...except block.

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and
sure enough, things got worse.
--
https://mail.python.org/mailman/listinfo/python-list
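A minimal sketch of that pattern (handle_error and process are
placeholder names, not from the original code):

    try:
        f = open(filename)
    except IOError as err:
        # catches file-not-found, permission-denied and the rest, with
        # no window between the check and the use
        handle_error(err)
    else:
        with f:
            process(f)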
Re: OSError: [Errno 12] Cannot allocate memory
On Wed, Nov 30, 2016 at 4:54 PM, duncan smith wrote:
>
> Thanks. So something like the following might do the job?
>
> def _execute(command):
>     p = subprocess.Popen(command, shell=False,
>                          stdout=subprocess.PIPE,
>                          stderr=subprocess.STDOUT,
>                          close_fds=True)
>     out_data, err_data = p.communicate()
>     if err_data:
>         print err_data

When I sent my first e-mail I had not noticed (though I noted it in my
second one) that the docstring in to_image presumes shell=True. That
said, as it seems everybody is at a loss to explain your issue, perhaps
there is some oddity, and if everything appears to work with
shell=False, it may be worth changing to see if it does fix the
problem. Given the other information since provided, it is unlikely,
however.

Not specifying stdin may help, however it will only reduce the file
handle count by 1 per call (from 2), so there is probably a root
problem that it will not help.

I would expect the communicate change to fix the problem, except for
your follow-up indicating that you had tried that before without
success. Removing the manual stdout.read may fix it, if the problem is
due to hanging processes, but again, your follow-up indicates that's
not the problem - you should have zombie processes if that were the
case.

A few new questions that have not yet been asked in this thread: How
much memory does your system have? Are you running a 32-bit or 64-bit
Python? Is your Python process being run with any additional limits
imposed via system commands (I don't know the command, but I know it
exists; similarly, if launched from a third-party app, that could be
placing limits)?

Chris
--
https://mail.python.org/mailman/listinfo/python-list
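On the limits question: in bash the command is likely `ulimit -a`, and
from inside the process the resource module (POSIX-only) can report the
same numbers. A small sketch - the three RLIMIT_* names used here exist
on Linux, but availability varies by platform:

    import resource

    # soft/hard limits; resource.RLIM_INFINITY (-1) means unlimited
    for name in ('RLIMIT_AS', 'RLIMIT_DATA', 'RLIMIT_NPROC'):
        soft, hard = resource.getrlimit(getattr(resource, name))
        print name, soft, hard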
Re: OSError: [Errno 12] Cannot allocate memory
On 01/12/16 00:46, Chris Kaynor wrote:
> On Wed, Nov 30, 2016 at 4:12 PM, duncan smith wrote:
>> On 30/11/16 17:57, Chris Angelico wrote:
>>> On Thu, Dec 1, 2016 at 4:34 AM, duncan smith wrote:
>>>>
>>>> def _execute(command):
>>>>     # shell=True security hazard?
>>>>     p = subprocess.Popen(command, shell=True, stdin=subprocess.PIPE,
>>>>                          stdout=subprocess.PIPE,
>>>>                          stderr=subprocess.STDOUT,
>>>>                          close_fds=True)
>>>>     output = p.stdout.read()
>>>>     p.stdin.close()
>>>>     p.stdout.close()
>>>>     #p.communicate()
>>>>     if output:
>>>>         print output
>>>
>>> Do you ever wait() these processes? If not, you might be leaving a
>>> whole lot of zombies behind, which will eventually exhaust your
>>> process table.
>>>
>>> ChrisA
>>>
>>
>> No. I've just called this several thousand times (via calls from a
>> higher level function) and had no apparent problem. Top reports no
>> zombie tasks, and memory consumption and the number of sleeping tasks
>> seem to be reasonably stable. I'll try running the code that
>> generated the error to see if I can coerce it into failing again. OK,
>> no error this time. Great, an intermittent bug that's hard to
>> reproduce ;-). At the end of the day I just want to invoke dot to
>> produce an image file (perhaps many times). Is there perhaps a
>> simpler and more reliable way to do this? Or do I just add the
>> p.wait()? (The commented out p.communicate() is there from a
>> previous, failed attempt to fix this - as, I think, are the
>> shell=True and close_fds=True.) Cheers.
>
> That would appear to rule out the most common issues I would think of.
>
> That said, are these calls being done in a tight loop (the full
> call-stack implies it might be a physics simulation)? Are you doing
> any threading (either in Python or when making the calls to Python -
> using a bash command to start new processes without waiting counts)?
> Is there any exception handling at a higher level that might be
> continuing past the error and sometimes allowing a zombie process to
> stay?

In this case the calls *are* in a loop (record linkage using an
expectation maximization algorithm).

> If you are making a bunch of calls in a tight loop, that could be your
> issue, especially as you are not waiting on the process (though
> communicate() waits implicitly, and thus should have fixed the issue).
> This could be intermittent if the processes sometimes complete quickly
> and other times are delayed. In those cases, a ton of the dot
> processes (and shells, with shell=True) could be created before any
> finish, causing massive resource usage. Some of the processes may be
> hanging, rather than outright crashing, and thus leaking some
> resources.

I'll try the p.communicate thing again. The last time I tried it I
might have already got myself into a situation where launching more
subprocesses was bound to fail. I'll edit the code, launch IDLE again
and see if it still happens.

> BTW, the docstring in to_image implies that shell=True is not an
> attempted fix for this - the example 'unflatten -l 3 | dot' explicitly
> relies on shell=True.

OK. As you can see, I don't really understand what's happening under
the hood :-). Cheers.

Duncan
--
https://mail.python.org/mailman/listinfo/python-list
Re: async enumeration - possible?
On Wed, Nov 30, 2016 at 5:05 PM, Steve D'Aprano wrote:
> On Wed, 30 Nov 2016 07:51 pm, Ian Kelly wrote:
>
>> On Wed, Nov 30, 2016 at 1:29 AM, Frank Millman wrote:
>
>>> But I found it easy to write my own -
>>>
>>> async def anext(aiter):
>>>     return await aiter.__anext__()
>>
>> Even simpler:
>>
>> def anext(aiter):
>>     return aiter.__anext__()
>
> With very few exceptions, you shouldn't be calling dunder methods
> directly.
>
> Ideally, you should have some function or operator that performs the
> call for you, e.g. next(x) not x.__next__().

Yes, that's the purpose of this function.

> One important reason for that is that the dunder method may be only
> *part* of the protocol, e.g. the + operator can call either __add__ or
> __radd__; str(x) may end up calling either __repr__ or __str__.
>
> If such a function doesn't exist, then it's best to try to match
> Python's usual handling of dunders as closely as possible. That means
> you shouldn't do the method lookup on the instance, but on the class
> instead:
>
> return type(aiter).__anext__(aiter)
>
> That matches the behaviour of the other dunder attributes, which
> normally bypass the instance attribute lookup.

To be pedantic, it should be more like:

return type(aiter).__dict__['__anext__'](aiter)

The difference between this and the above is that the above would try
the metaclass if it didn't find the method in the class dict, and it
might also call the metaclass's __getattr__ or __getattribute__.

What difference does it really make, though? That dunder methods are
looked up directly on the class is primarily an optimization. It's not
critical to the inner workings of the language.
--
https://mail.python.org/mailman/listinfo/python-list
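A small illustration of the behaviour under discussion, using nothing
beyond the standard language rules: for implicit invocations, CPython
consults the type, not the instance.

    class Demo:
        def __len__(self):
            return 1

    d = Demo()
    d.__len__ = lambda: 99   # instance attribute is ignored by len()
    print(len(d))            # -> 1: len() looks __len__ up on type(d)
    print(d.__len__())       # -> 99: explicit access sees the instance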
Re: async enumeration - possible?
On Wed, Nov 30, 2016 at 8:34 PM, Ian Kelly wrote:
> To be pedantic, it should be more like:
>
> return type(aiter).__dict__['__anext__'](aiter)

And of course, if you don't find it there, then to be proper you also
have to walk the MRO and check all of those class dicts as well.
--
https://mail.python.org/mailman/listinfo/python-list
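A hedged sketch of that fuller lookup - search the class dicts along
the MRO and never the instance dict. This is close to, but not exactly,
what CPython does internally (for one thing, descriptors are not
honoured here):

    def _special_lookup(obj, name):
        # walk the method resolution order of the type, ignoring the
        # instance entirely, as implicit special-method lookup does
        for klass in type(obj).__mro__:
            if name in klass.__dict__:
                return klass.__dict__[name]
        raise AttributeError(name)

    def anext(aiter):
        return _special_lookup(aiter, '__anext__')(aiter)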
Re: Timer runs only once.
On Wednesday, 30 November 2016 20:36:15 UTC+5:30, siva gnanam wrote:
> On Wednesday, November 30, 2016 at 8:11:49 PM UTC+5:30,
> vnthma...@gmail.com wrote:
>> from threading import Timer
>>
>> class TestTimer:
>>     def foo(self):
>>         print("hello world")
>>         self.startTimer()
>>
>>     def startTimer(self):
>>         self.t1 = Timer(5, self.foo)
>>         self.t1.start()
>>
>> timer = TestTimer()
>> timer.startTimer()
>
> I think in this example we are creating a new Timer object every 5
> seconds, so every time it will spawn a new Timer. I don't know what
> happens to the previous timers we created.

They will be garbage collected. Each old Timer object goes out of scope
once it has fired, and CPython's reference counting reclaims objects as
soon as nothing refers to them any more.
--
https://mail.python.org/mailman/listinfo/python-list
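If the aim is simply a periodic call, an alternative sketch (the class
name and helper are invented, not from the thread) keeps one worker
thread alive instead of constructing a new Timer every round;
Event.wait doubles as both the delay and the stop signal:

    import threading

    class Repeater(threading.Thread):
        def __init__(self, interval, func):
            threading.Thread.__init__(self)
            self.daemon = True   # don't keep the interpreter alive
            self.interval = interval
            self.func = func
            self._stop_event = threading.Event()

        def run(self):
            # wait() returns False on timeout, True once stop() is called
            while not self._stop_event.wait(self.interval):
                self.func()

        def stop(self):
            self._stop_event.set()

    def hello():
        print("hello world")

    r = Repeater(5, hello)
    r.start()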
compile error when using override
import ast
from __future__ import division
from sympy import *
x, y, z, t = symbols('x y z t')
k, m, n = symbols('k m n', integer=True)
f, g, h = symbols('f g h', cls=Function)
import inspect

class A:
    @staticmethod
    def __additionFunction__(a1, a2):
        return a1*a2 #Put what you want instead of this
    def __multiplyFunction__(a1, a2):
        return a1*a2+a1 #Put what you want instead of this
    def __init__(self, value):
        self.value = value
    def __add__(self, other):
        return self.__class__.__additionFunction__(self.value, other.value)
    def __mul__(self, other):
        return self.__class__.__multiplyFunction__(self.value, other.value)

solve([A(x)*A(y) + A(-1), A(x) + A(-2)], x, y)

>>> solve([A(x)*A(y) + A(-1), A(x) + A(-2)], x, y)
Traceback (most recent call last):
  File "", line 1, in
  File "", line 12, in __mul__
TypeError: unbound method __multiplyFunction__() must be called with A
instance as first argument (got Symbol instance instead)

import ast
from __future__ import division
from sympy import *
x, y, z, t = symbols('x y z t')
k, m, n = symbols('k m n', integer=True)
f, g, h = symbols('f g h', cls=Function)
import inspect

class AA:
    @staticmethod
    def __additionFunction__(a1, a2):
        return a1*a2 #Put what you want instead of this
    def __multiplyFunction__(a1, a2):
        return a1*a2+a1 #Put what you want instead of this
    def __init__(self, value):
        self.value = value
    def __add__(self, other):
        return self.__class__.__additionFunction__(self.value, other.value)
    def __mul__(self, other):
        return self.__class__.__multiplyFunction__(self.value, other.value)

ss = solve(AA)
ss([x*y + -1, x-2], x, y)

>>> ss([x*y + -1, x-2], x, y)
Traceback (most recent call last):
  File "", line 1, in
AttributeError: solve instance has no __call__ method
--
https://mail.python.org/mailman/listinfo/python-list
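The first traceback points at the asymmetry between the two methods:
__additionFunction__ is declared @staticmethod but __multiplyFunction__
is not, so A.__mul__ calls it as an unbound method with a Symbol as the
implicit self. A hedged sketch of the presumably intended class (Python
2, matching the traceback's wording); this resolves the TypeError
shown, though whether sympy then accepts the resulting expressions is a
separate question:

    class A:
        @staticmethod
        def __additionFunction__(a1, a2):
            return a1*a2      # put what you want instead of this

        @staticmethod
        def __multiplyFunction__(a1, a2):
            return a1*a2 + a1 # put what you want instead of this

        def __init__(self, value):
            self.value = value

        def __add__(self, other):
            return self.__class__.__additionFunction__(self.value,
                                                       other.value)

        def __mul__(self, other):
            return self.__class__.__multiplyFunction__(self.value,
                                                       other.value)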
Merge Two List of Dict
Hey guys

What is the most optimal and pythonic solution for this situation?

A = [{'person_id': '1', 'adop_count': '2'},
     {'person_id': '3', 'adop_count': '4'}]
*len(A) might be above 10L*

B = [{'person_id': '1', 'village_id': '3'},
     {'person_id': '3', 'village_id': '4'}]
*len(B) might be above 20L*

The output list should be

C = [{'adop_count': '2', 'village_id': '3'},
     {'adop_count': '4', 'village_id': '4'}]

Thanks in advance
--
https://mail.python.org/mailman/listinfo/python-list
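One reasonable sketch, assuming person_id is the join key and every id
in B also appears in A: index A by person_id first, so the merge costs
O(len(A) + len(B)) instead of a quadratic scan over both lists.

    A = [{'person_id': '1', 'adop_count': '2'},
         {'person_id': '3', 'adop_count': '4'}]
    B = [{'person_id': '1', 'village_id': '3'},
         {'person_id': '3', 'village_id': '4'}]

    # one pass over A builds the index; one pass over B does the merge
    adop_by_person = {d['person_id']: d['adop_count'] for d in A}

    C = [{'adop_count': adop_by_person[d['person_id']],
          'village_id': d['village_id']}
         for d in B if d['person_id'] in adop_by_person]

    print(C)
    # [{'adop_count': '2', 'village_id': '3'},
    #  {'adop_count': '4', 'village_id': '4'}]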
Re: Asyncio -- delayed calculation
Am 30.11.16 um 22:07 schrieb Gregory Ewing:
> Chris Angelico wrote:
>> That's because you're not actually running anything concurrently.
>
> Yes, I know what happens and why. My point is that for someone who
> *doesn't* know, simplistic attempts to explain what "await" means can
> be very misleading. There doesn't seem to be any accurate way of
> summarising it in a few words. The best we can do seems to be to just
> say "it's a magic word that you have to put in front of any call to a
> function that you defined as async".

Well, that works - but I think it is possible to explain it without
actually understanding what it does behind the scenes:

x = foo()  # schedule foo for execution, i.e. put it on a TODO list
await x    # run the TODO list until foo has finished

IMHO coroutines are a simple concept in themselves; it's just that
stackful programming (call/return) has tainted our minds so much that
we have trouble figuring out a "function call" which does not "return"
in the usual sense. The implementation is even more convoluted, with
the futures and promises and whatnot. For simply using this stuff, it
is not important to know how it works.

Christian
--
https://mail.python.org/mailman/listinfo/python-list
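A runnable sketch of those two steps (Python 3.5+, asyncio assumed, all
names invented for illustration) - calling the coroutine function only
creates a coroutine object, and its body does not run until it is
awaited:

    import asyncio

    async def foo():
        await asyncio.sleep(0.1)
        return 42

    async def main():
        x = foo()          # creates a coroutine object; foo's body has
                           # not started running yet
        result = await x   # the event loop now drives foo to completion
        print(result)      # -> 42

    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())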