Re: Delegation in Python
On Sun, Jan 25, 2015 at 6:49 PM, Brian Gladman wrote:
> Thanks, a part of this was a wish to understand how to map what I can do
> in other languages into Python. I felt that it might just be possible
> in Python to avoid having to wrap all the methods of the base class in
> the derived class. But it seems that __getattr__ etc are not quite as
> magic as I hoped.

They do exactly what they're documented to, nothing more and nothing less :)

It's certainly possible to use them to hook into missing attributes, for instance:

>>> class RF:
	def __init__(self, *args, **kw):
		self._frac = Fraction(*args, **kw)
	def __getattr__(self, attr):
		return getattr(self._frac, attr)
	def __repr__(self):
		return "RF(%d, %d)" % (self._frac.numerator, self._frac.denominator)
	def is_integer(self):
		return self._frac.denominator==1

>>> RF(1,4).numerator
1

But it doesn't work for everything:

>>> RF(1,4)*2
Traceback (most recent call last):
  File "", line 1, in <module>
    RF(1,4)*2
TypeError: unsupported operand type(s) for *: 'RF' and 'int'

The only solution would be to go through every operation that you care about, and manually hook them. Something like this:

	def __mul__(self, other):
		result = self._frac * other
		if isinstance(result, Fraction):
			return RF(result.numerator, result.denominator)
		return result

>>> RF(1,4)*2
RF(1, 2)

And do that for every other operation, method, etc. Tedious, but can be effective if you need it.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
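Some of that tedium can be scripted away. The sketch below (not Chris's actual code; the set of dunder names chosen is illustrative, not exhaustive) generates the operator methods in a loop. The loop is needed because implicit special-method lookup goes straight to the type and bypasses the instance `__getattr__` hook:

```python
from fractions import Fraction

class RF:
    def __init__(self, *args, **kw):
        self._frac = Fraction(*args, **kw)

    def __getattr__(self, attr):
        # Handles ordinary missing attributes, but NOT operators:
        # `rf * 2` looks __mul__ up on the type, skipping this hook.
        return getattr(self._frac, attr)

    def __repr__(self):
        return "RF(%d, %d)" % (self._frac.numerator, self._frac.denominator)

def _make_op(name):
    def op(self, other):
        if isinstance(other, RF):
            other = other._frac
        result = getattr(self._frac, name)(other)
        if isinstance(result, Fraction):
            return RF(result.numerator, result.denominator)
        return result
    return op

# Hook each binary operator once, instead of writing them all by hand.
for _name in ("__add__", "__sub__", "__mul__", "__truediv__",
              "__radd__", "__rsub__", "__rmul__", "__rtruediv__"):
    setattr(RF, _name, _make_op(_name))

print(RF(1, 4) * 2)      # RF(1, 2)
print(2 * RF(1, 4))      # RF(1, 2), via __rmul__
print(RF(1, 2) + RF(1, 3))  # RF(5, 6)
```

It is still a hand-maintained list of names, but at least the wrapping logic lives in one place.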
Re: __bases__ misleading error message
On 25/01/2015 01:55, Terry Reedy wrote:
> 'This situation' being that Someclass.attro works but Someclass().attro
> raises... In other words, if 'Someclass.attro' evaluates to an object ...
> then 'Someclass().attro' *normally* evaluates to the same object

I am sorry Terry, but I do not agree, because the word "normally" in this context does not have any meaning, and introduces just confusion. In fact this is exactly the case where "Someclass.attro evaluates to an object" *but* hasattr(Someclass(), 'attro'), as expected, is False. What do you mean with "normally"?

> because attribute lookup on an instance reverts to attribute lookup on
> the class and even its super classes

Here's the point. The __bases__ attribute is defined by the class type. So if isinstance(obj, type) is true, then hasattr(obj, '__bases__') is true. In the OP case:

>>> class Sub:  # Python 3
...     pass
...
>>> isinstance(Sub, type)
True
>>> hasattr(Sub, '__bases__')
True

But Sub() is not an instance of type (because Sub is not a subclass of type), and so Sub() does not have to get the attributes, like __bases__, from type, and here we are:

>>> foo = Sub()
>>> isinstance(foo, type)
False
>>> foo.__bases__
Traceback (most recent call last):
  ...
AttributeError: 'Sub' object has no attribute '__bases__'

In other words, speaking about the lookup machinery: because '__bases__' is not in foo.__dict__, Python looks for __bases__ in Sub.__dict__, but it does not find it. So it looks in the Sub bases, and we come back to the point: Sub is an instance of type, not a subclass of type.

> Someclass.__bases__ is special; it is not listed by dir(Someclass), and
> it is not accessible via an instance.

In my point of view no attribute is special (maybe just a descriptor is a bit special, in some ways). To understand what happens in these situations, we do not have to give a special meaning to the attribute. Given `obj.attr`, we just have to know what kind of object `obj` is: usually, like in this case, we have to differentiate between class and non-class objects. In fact __bases__ is not listed by dir(Sub) because Sub is a class, and the distinction between class and non-class objects matters, especially for dir().

--
Marco Buttu
--
https://mail.python.org/mailman/listinfo/python-list
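The lookup rules Marco describes can be checked directly; a minimal script-sized verification:

```python
class Sub:
    pass

# Sub is an *instance* of type, so attribute lookup on Sub can fall
# back to attributes defined by type, such as __bases__.
assert isinstance(Sub, type)
assert hasattr(Sub, '__bases__')

# Sub() is an instance of Sub, and its lookup chain is Sub then object;
# it never reaches type, so __bases__ is not found.
foo = Sub()
assert not isinstance(foo, type)
assert not hasattr(foo, '__bases__')
assert type(foo).__mro__ == (Sub, object)
```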
Re: Benchmarking some modules - strange result
Dan Stromberg :
> BTW, there isn't much else going on on this computer, except some
> possible small cronjobs.

The OS caches the files in RAM. You'd expect the file I/O performance to get much better on test reruns. On the other hand, OS cache flushing happens at random times, which affects the results in the other direction.

Also, I take it you aren't running your tests in a virtual machine or on the cloud, right?

Finally, you should run each test case in isolation. IOW, restart Python between each test case so the test cases don't interfere with each other. The startup overhead could be reduced by starting the test cases with an os.fork(). However, the performance tests should run long enough to overwhelm the startup and teardown overhead.

Marko
--
https://mail.python.org/mailman/listinfo/python-list
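Restarting the interpreter per test case can be scripted from a driver process. A minimal sketch (the `snippet` here is a stand-in for one benchmark, not Dan's actual code; in practice it would import and exercise one module under test):

```python
import subprocess
import sys

# One benchmark per fresh interpreter, so no state (module caches,
# GC history, interned objects) leaks between test cases.
snippet = """
import timeit
print(timeit.timeit("sorted(range(1000))", number=1000))
"""

result = subprocess.run(
    [sys.executable, "-c", snippet],
    capture_output=True, text=True, check=True,
)
elapsed = float(result.stdout)
print("fresh-interpreter run took %.4fs" % elapsed)
```

The subprocess startup cost never enters the measurement, because the timing happens inside the child and only the number is reported back.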
Re: __bases__ misleading error message
In article <54c4606a$0$13002$c3e8da3$54964...@news.astraweb.com>, steve+comp.lang.pyt...@pearwood.info says...
>
> It doesn't.

Your explanation was great Steven. Thank you. But it raises one question...

>
> Assigning a value to a variable ("m = 42", hex 2A) results in the compiler
> storing that value in the bucket; assignment makes a copy: "n = m" leads to
> two independent buckets with the same value:
>
> m = [002A]
> n = [002A]

I'm still in the process of learning Python. So, that's probably why this is unexpected to me.

I was under the impression that what Python did was keep a lookup table pointing to memory. Every variable gets an entry as type descriptor and a pointer to a memory address, where the variable data resides. (UDTs may be special in that they would have more than one entry, one for each enclosing def and declared attribute.)

In the example above, the n and m buckets would hold pointers, not binary values. And because they are immutable objects, n and m pointers would be different. Not equal. But in the case of mutable objects, n = m would result in m having the same pointer address as n.

--
https://mail.python.org/mailman/listinfo/python-list
Re: __bases__ misleading error message
Mario Figueiredo wrote:

> In article <54c4606a$0$13002$c3e8da3$54964...@news.astraweb.com>,
> steve+comp.lang.pyt...@pearwood.info says...
>>
>> It doesn't.
>
> Your explanation was great Steven. Thank you. But raises one question...
>
>>
>> Assigning a value to a variable ("m = 42", hex 2A) results in the
>> compiler storing that value in the bucket; assignment makes a copy: "n =
>> m" leads to two independent buckets with the same value:
>>
>> m = [002A]
>> n = [002A]

Maybe I wasn't clear enough. The above is used by languages like C or Pascal, which use fixed memory locations for variables. If I gave the impression this was Python, I am sorry.

> I'm still in the process of learning Python. So, that's probably why
> this is unexpected to me.
>
> I was under the impression that what python did was keep a lookup table
> pointing to memory. Every variable gets an entry as type descriptor and
> a pointer to a memory address, where the variable data resides.

This sounds more or less correct, at least for CPython. CPython is the "reference implementation", and probably the version you use when you run Python. But it is not the only one, and they can be different. (E.g. in Jython, the Python interpreter is built using Java, not C. You can't work with pointers to memory addresses in Java, and the Java garbage collector is free to move objects around when needed.)

In CPython, objects live in the heap, and Python tracks them using a pointer. So when you bind a name to a value:

    x = 23  # what you type

what happens is that Python sets a key + value in the global namespace (a dictionary):

    globals()['x'] = 23  # what Python runs

and the globals() dict will then look something like this:

    {'x': 23, 'colour': 'red', 'y': 42}

(Note: *local* variables are similar but not quite the same. They're also more complicated, so let's skip them for now.)

What happens inside the dictionary? Dictionaries are "hash tables", so they are basically a big array of cells, and each cell is a pair of pointers, one for the key and one for the value:

    [dictionary header]
    [blank]
    [blank]
    [ptr to the string 'y', ptr to the int 42]
    [blank]
    [ptr to 'x', ptr to 23]
    [blank]
    [blank]
    [blank]
    [ptr to 'colour', ptr to 'red']
    [blank]
    ...

Notice that the order is unpredictable. Also, don't take this picture too literally. Dicts are highly optimized, highly tuned and in active development; the *actual* design of Python dicts may vary. But this is a reasonable simplified view of how they could be designed.

The important thing to remember is that while CPython uses pointers under the hood to make the interpreter work, pointers are not part of the Python language. There is no way in Python to get a pointer to an object, or increment a pointer, or dereference a pointer. You just use objects, and the interpreter handles all the pointer stuff behind the scenes.

> (UDTs may be special in that they would have more than one entry, one
> for each enclosing def and declared attribute)
>
> In the example above, the n and m buckets would hold pointers, not
> binary values. And because they are immutable objects, n and m pointers
> would be different. Not equal. But in the case of mutable objects, n = m
> would result in m having the same pointer address as n.

No, this is certainly not the case! Python uses *exactly* the same rules for mutable and immutable objects. In fact, Python can't tell whether a value is mutable or immutable until it tries to modify it.

Remember I said that name-binding languages operate using a model of pieces of string between the name and the object? Here are two names bound to the same object:

    m ---+--- 0x2a
    n --/

Obviously Python doesn't *literally* use a piece of string :-) so what happens under the hood? Pointers again, at least in CPython. In this case, if we look deep inside our globals dictionary again, we will see two cells:

    [ptr to the string "m", ptr to the int 0x2a]
    [ptr to the string "n", ptr to the int 0x2a]

The two int pointers point to the same object. This is guaranteed by the language:

    m = 42
    n = m
    assert id(m) == id(n)

Both names refer to the same object, with the same ID, at the same memory location. Assignment in Python NEVER makes a copy of the value being assigned.

--
Steven
--
https://mail.python.org/mailman/listinfo/python-list
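Steven's point is easy to demonstrate end to end: the binding rule is identical for mutable and immutable values, and mutation (not assignment) is what makes sharing visible:

```python
# Assignment binds a name to an object; it never copies. The rules are
# identical for mutable and immutable objects.
m = 42
n = m
assert n is m             # one object, two names

a = [1, 2, 3]             # now a mutable object, same rule
b = a
assert b is a

b.append(4)               # mutating through one name...
assert a == [1, 2, 3, 4]  # ...is visible through the other,
                          # because there is only one list
```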
Re: Benchmarking some modules - strange result
Dan Stromberg wrote:

> I've been benchmarking some python modules that are mostly variations
> on the same theme.
>
> For simplicity, let's say I've been running the suite of performance
> tests within a single interpreter - so I test one module thoroughly,
> then move on to the next without exiting the interpreter.
>
> I'm finding that if I prune the list of modules down to just the best
> performers, I get pretty different results - what was best no longer
> is. This strikes me as strange.
>
> I'm about ready to rewrite things to run each individual test in a
> fresh interpreter. But is there a better way?

You could run combinations of two modules in the same interpreter to see if there are specific modules that slow down the following module. If you can identify such modules you could then look into their code to try and find the cause of the slowdown.

Requires some work, but the results might be interesting to the greater public -- many real-world applications make do with a single interpreter ;)

--
https://mail.python.org/mailman/listinfo/python-list
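The pair-wise experiment Peter suggests could be driven with `itertools.permutations`. A sketch, with hypothetical stand-ins for the modules under test (in the real suite each entry would import and exercise one candidate module):

```python
import itertools
import timeit

# Hypothetical stand-ins for the modules under test.
candidates = {
    "sorter": lambda: sorted(range(1000)),
    "joiner": lambda: "-".join(map(str, range(1000))),
}

# Time every ordered pair inside one interpreter. If a module's timing
# depends on which module ran just before it, it shows up as asymmetry
# in this grid.
results = {}
for first, second in itertools.permutations(candidates, 2):
    candidates[first]()  # the potentially "polluting" run
    results[(first, second)] = timeit.timeit(candidates[second], number=200)

for (first, second), t in sorted(results.items()):
    print("%-7s after %-7s: %.6fs" % (second, first, t))
```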
Re: __bases__ misleading error message
In article <54c4dae1$0$13005$c3e8da3$54964...@news.astraweb.com>, steve+comp.lang.pyt...@pearwood.info says...
> [...]

Most excellent. Thanks for the hard work, explaining this to me. :)

Knowing Python internals is something that will end benefiting me in the long run. There's much to be gained by knowing the inner working of your programming language...

Python is missing an under-the-hood book, I suppose. Tracing through Python source code to learn this stuff isn't easy unless we know what we are looking for.

--
https://mail.python.org/mailman/listinfo/python-list
Re: __bases__ misleading error message
Mario Figueiredo :

> Knowing Python internals is something that will end benefiting me in
> the long run. There's much to be gained by knowing the inner working
> of your programming language...
>
> Python is missing an under-the-hood book, I suppose. Tracing through
> Python source code to learn this stuff isn't easy unless we know what
> we are looking for.

One must only be careful to distinguish implementation choices from the abstract definitions. The Python Language Reference is the official standard:

<https://docs.python.org/3/reference/index.html>

Marko
--
https://mail.python.org/mailman/listinfo/python-list
Re: Benchmarking some modules - strange result
On Sun, 25 Jan 2015 13:24:40 +0100, Peter Otten wrote:

> Dan Stromberg wrote:
>
>> I've been benchmarking some python modules that are mostly variations
>> on the same theme.
>>
>> For simplicity, let's say I've been running the suite of performance
>> tests within a single interpreter - so I test one module thoroughly,
>> then move on to the next without exiting the interpreter.
>>
>> I'm finding that if I prune the list of modules down to just the best
>> performers, I get pretty different results - what was best no longer
>> is. This strikes me as strange.
>
>> I'm about ready to rewrite things to run each individual test in a
>> fresh interpreter. But is there a better way?
>
> You could run combinations of two modules in the same interpreter to
> see if there are specific modules that slow down the following module.
> If you can identify such modules you could then look into their code
> to try and find the cause of the slowdown.
>
> Requires some work, but the results might be interesting to the
> greater public -- many real-world applications make do with a single
> interpreter ;)

I would add that your present results probably do reflect the real-world better than a contrived, yet scientific and arguably more "accurate" test. This tells me that the variations you see are down in the noise, and that any optimizations you may make to improve performance could be lost in any given application.

Unless, of course, you're on an academic pursuit, in which case carry on!

Dan
--
https://mail.python.org/mailman/listinfo/python-list
Re: Benchmarking some modules - strange result
On 25/01/2015 02:11, Dan Stromberg wrote:
> Hi folks.
>
> I've been benchmarking some python modules that are mostly variations
> on the same theme.
>
> For simplicity, let's say I've been running the suite of performance
> tests within a single interpreter - so I test one module thoroughly,
> then move on to the next without exiting the interpreter.
>
> I'm finding that if I prune the list of modules down to just the best
> performers, I get pretty different results - what was best no longer
> is. This strikes me as strange.
>
> I'm forcing a garbage collection between tests with gc.collect(), but
> even that yields the oddball results. Is there something else I should
> do to restore the interpreter to a known-pristine state for the sake of
> such tests?
>
> BTW, there isn't much else going on on this computer, except some
> possible small cronjobs.
>
> I'm about ready to rewrite things to run each individual test in a
> fresh interpreter. But is there a better way?
>
> If I do the rewrite, I might run 1 run of module A, 1 run of module B,
> 1 run of module C, then another run of module A, another run of module
> B, and another run of module C - to spread any possible timing oddities
> more evenly.
>
> Thanks.

Are you aware of this code repository https://hg.python.org/benchmarks/ and the associated mailing list https://mail.python.org/mailman/listinfo/speed, also available as gmane.comp.python.speed? Maybe there's something there for all of us, maybe not; there's only one way to find out :)

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence
--
https://mail.python.org/mailman/listinfo/python-list
Object destruction delayed on interactive prompt
Consider the following class:

class A:
    def __init__(self, value):
        self.value = value

    def __del__(self):
        print("'A' instance being collected...")

    def __repr__(self):
        return str(self.value)

If run as a script, the destructor behavior works as expected:

if __name__ == '__main__':
    x1 = A(12)
    print(x1)   # __repr__()
    x1 = 2      # __del__()
    print(x1)

But if I run it interactively, a weird behavior comes up:

>>> x1 = A(12)
>>> x1 = 'string'
'A' instance being collected...   # __del__() as expected
>>> x1 = A(12)     # Recreate x1
>>> x1             # __repr__()
12
>>> x1 = 'string'  # __del__() isn't executed!
>>> x1             # it is delayed until here
'A' instance being collected...
'string'

This is reproducible in IDLE and at the system command prompt. Is this REPL-specific behavior?

--
https://mail.python.org/mailman/listinfo/python-list
Re: Object destruction delayed on interactive prompt
In article , mar...@gmail.com says... > [...] Forgot to mention this is Python 3.4.2 -- https://mail.python.org/mailman/listinfo/python-list
Re: Object destruction delayed on interactive prompt
On 2015-01-25 18:23, Mario Figueiredo wrote:
> Consider the following class:
>
> class A:
>     def __init__(self, value):
>         self.value = value
>
>     def __del__(self):
>         print("'A' instance being collected...")
>
>     def __repr__(self):
>         return str(self.value)
>
> If run as a script, the destructor behavior works as expected:
>
> if __name__ == '__main__':
>     x1 = A(12)
>     print(x1)   # __repr__()
>     x1 = 2      # __del__()
>     print(x1)
>
> But if I run it interactively, a weird behavior comes up:
>
> >>> x1 = A(12)
> >>> x1 = 'string'
> 'A' instance being collected...   # __del__() as expected
> >>> x1 = A(12)     # Recreate x1
> >>> x1             # __repr__()
> 12
> >>> x1 = 'string'  # __del__() isn't executed!
> >>> x1             # it is delayed until here
> 'A' instance being collected...
> 'string'
>
> This is reproducible in IDLE and at the system command prompt. Is this
> a REPL specific behavior?

In the REPL, the result of an expression is bound to '_':

>>> x1 = A(12)
>>> x1  # _ will be bound to the result.
12
>>> x1 = 'string'
>>> # At this point, _ is still bound to the object.
>>> 0  # This will bind _ to another object, so the previous object can be collected.
'A' instance being collected...
0

--
https://mail.python.org/mailman/listinfo/python-list
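The script-side half of MRAB's explanation can be verified with a weak reference: outside the REPL there is no `_` holding the old object, so it dies as soon as the name is rebound. (This relies on CPython's reference counting; other implementations may collect later.)

```python
import weakref

class A:
    def __del__(self):
        print("'A' instance being collected...")

a = A()
r = weakref.ref(a)   # observe the object's lifetime without extending it

a = 'string'         # rebind the only strong reference
# No REPL, no `_`, so in CPython the instance is collected immediately:
assert r() is None
```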
Re: Object destruction delayed on interactive prompt
__del__ is unreliable. But don't take my word, check out the doc. https://docs.python.org/3.2/reference/datamodel.html#object.__del__ -- https://mail.python.org/mailman/listinfo/python-list
Re: Object destruction delayed on interactive prompt
In article , pyt...@mrabarnett.plus.com says...

> In the REPL, the result of an expression is bound to '_':
>
> >>> x1 = A(12)
> >>> x1  # _ will be bound to the result.
> 12
> >>> x1 = 'string'
> >>> # At this point, _ is still bound to the object.
> >>> 0  # This will bind _ to another object, so the previous object
> can be collected.
> 'A' instance being collected...
> 0

Aha! Thank you.

--
https://mail.python.org/mailman/listinfo/python-list
Idiomatic backtracking in Python
Hi folks,

I have a problem at hand that needs code for backtracking as a solution. And I have no problem coding it, but I can't get rid of the feeling that I'm always solving backtracking problems in a non-Pythonic (non-idiomatic) way.

So, I would like to ask if you have a Pythonic approach to backtracking problems? If so, I'd love to hear your solutions!

Cheers,
Johannes

--
>> Where exactly had you predicted the earthquake again?
> At least not publicly!
Ah, the latest and to this day most ingenious trick of our great
cosmologists: the secret prediction.
 - Karl Kaos on Rüdiger Thomas in dsa
--
https://mail.python.org/mailman/listinfo/python-list
Problem while reading files from hdfs using python
Hello Everyone,

I am trying to read (open) and write files in hdfs inside a python script, but I am getting an error. Can someone tell me what is wrong here?

Code (full): sample.py

#!/usr/bin/python
from subprocess import Popen, PIPE

print "Before Loop"

cat = Popen(["hadoop", "fs", "-cat", "./sample.txt"], stdout=PIPE)
put = Popen(["hadoop", "fs", "-put", "-", "./modifiedfile.txt"], stdin=PIPE)

for line in cat.stdout:
    line += "Blah"
    print line
    put.stdin.write(line)

cat.stdout.close()
cat.wait()
put.stdin.close()
put.wait()

When I execute:

hadoop jar /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.5.1.jar -file ./sample.py -mapper './sample.py' -input sample.txt -output fileRead

it executes properly, but I couldn't find the file which was supposed to be created in hdfs (modifiedfile).

And when I execute:

hadoop fs -getmerge ./fileRead/ file.txt

inside the file.txt I got:

Before Loop
Before Loop

Can someone please tell me what I am doing wrong here? I don't think it reads from the sample.txt.

I would really appreciate the help.

--
Thanks & Regards,
Shalini Ravishankar.
--
https://mail.python.org/mailman/listinfo/python-list
Re: Idiomatic backtracking in Python
Hi,

I think a very idiomatic way to implement backtracking is using a recursive generator (python 3):

def backtrack_solver(data=None):
    if data is None:
        yield from backtrack_solver(data=initial_data)
        return

    if cannot_be_valid(data):
        return

    if matches_condition(data):
        yield data
        return

    for new_data in process(data):
        yield from backtrack_solver(new_data)

This generator will yield valid solutions to a suitably defined problem.

`initial_data`, `cannot_be_valid`, `matches_condition` and `process` should be replaced with appropriate implementations for your problem.

For example, a sudoku solver could be fit to this by accepting a partially solved grid as the `data` parameter.

`cannot_be_valid` would now detect grids that have, say, two `1`s in a row or any other invalid grid state and exit.

`matches_condition` would detect a fully solved grid.

`process` would produce new grids with more cells filled in than the current grid.

`initial_data` wouldn't be strictly necessary here, but you could use it for an example grid. It could also be an empty grid, and the solver would then yield all valid grids.

Regards,
Ian F

On 25/01/15 20:15, Johannes Bauer wrote:
> Hi folks,
>
> I have a problem at hand that needs code for backtracking as a
> solution. And I have no problem coding it, but I can't get rid of
> the feeling that I'm always solving backtracking problems in a
> non-Pythonic (non-idiomatic) way. So, I would like to ask if you
> have a Pythonic approach to backtracking problems? If so, I'd love
> to hear your solutions!
>
> Cheers, Johannes

--
https://mail.python.org/mailman/listinfo/python-list
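To make Ian's recursive-generator skeleton concrete, here is one way it might be instantiated for the classic N-queens puzzle (a sketch, not Ian's code): `placed` is the partial solution, the `any(...)` test plays the role of `cannot_be_valid`, a full board is `matches_condition`, and the loop over columns is `process`.

```python
def queens(n, placed=()):
    """Yield all n-queens solutions; placed[r] is the column of row r."""
    row = len(placed)            # next row to fill
    if row == n:                 # matches_condition: all rows filled
        yield placed
        return
    for col in range(n):         # process: candidate placements
        # cannot_be_valid: shares a column or a diagonal with an
        # earlier queen
        if any(col == c or abs(col - c) == row - r
               for r, c in enumerate(placed)):
            continue
        yield from queens(n, placed + (col,))

print(list(queens(4)))  # the two 4-queens solutions
```

Backtracking happens for free: when the `for` loop exhausts its candidates, the generator returns, which retreats to the previous row.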
Re: __bases__ misleading error message
On 1/25/2015 7:00 AM, Steven D'Aprano wrote:

> What happens inside the dictionary? Dictionaries are "hash tables", so
> they are basically a big array of cells, and each cell is a pair of
> pointers, one for the key and one for the value:
>
> [dictionary header]
> [blank]
> [blank]
> [ptr to the string 'y', ptr to the int 42]

At the moment, for CPython, each entry has 3 items, with the first being the cached hash of the key. Hash comparison is first used to test whether keys are equal.

[hash('y'), ptr('y'), ptr(42)]

> [blank]
> [ptr to 'x', ptr to 23]
> [blank]
> [blank]
> [blank]
> [ptr to 'colour', ptr to 'red']
> [blank]

As you say, these are implementation details. CPython dicts for the instances of at least some classes have a different, specialized structure, with two arrays.

In the above, [blank] entries, which are about 1/2 to 2/3 of the entries, take the same space as real entries (12 to 24 bytes). Raymond H. has proposed that the standard dict have two arrays like so:

1. the first array is a sparse array of indexes into the second array:

[b, b, 2, b, 0, b, b, b, 1, b]

(where b might be -1 interpreted as <blank>), using only as many bytes as needed for the maximum index.

2. the second array is a compact array of entries in insertion order, such as

[hash, ptr to 'x', ptr to 23]
[hash, ptr to 'colour', ptr to 'red']
[hash, ptr to the string 'y', ptr to the int 42]

Iteration would use the compact array, making all dicts OrderedDicts. Pypy has already switched to this. It seems that on modern processors with multilevel on-chip caches, the space reduction leads to cache-miss reductions that compensate for the indirection cost.

--
Terry Jan Reedy
--
https://mail.python.org/mailman/listinfo/python-list
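The two-array layout Terry describes can be modelled in a few lines of Python. This is a toy, not CPython's code: there is no resizing or deletion, and it uses plain linear probing rather than CPython's probe sequence, but it shows why iteration order becomes insertion order.

```python
class CompactDict:
    """Toy model of the proposed two-array ("compact") dict layout."""

    def __init__(self, size=8):
        self.indices = [None] * size   # sparse: hash slot -> entry position
        self.entries = []              # dense: (hash, key, value) in insertion order

    def _slot(self, key):
        slot = hash(key) % len(self.indices)
        while True:                    # linear probing keeps the toy simple
            ix = self.indices[slot]
            if ix is None or self.entries[ix][1] == key:
                return slot
            slot = (slot + 1) % len(self.indices)

    def __setitem__(self, key, value):
        slot = self._slot(key)
        ix = self.indices[slot]
        if ix is None:                 # new key: append, preserving order
            self.indices[slot] = len(self.entries)
            self.entries.append((hash(key), key, value))
        else:                          # existing key: overwrite in place
            self.entries[ix] = (hash(key), key, value)

    def __getitem__(self, key):
        ix = self.indices[self._slot(key)]
        if ix is None:
            raise KeyError(key)
        return self.entries[ix][2]

    def __iter__(self):                # reads only the dense array, so
        return (e[1] for e in self.entries)  # iteration == insertion order

d = CompactDict()
d['x'] = 23
d['colour'] = 'red'
d['y'] = 42
print(list(d))  # ['x', 'colour', 'y'], regardless of the hash values
```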
Re: Idiomatic backtracking in Python
Johannes Bauer writes:

> So, I would like to ask if you have a Pythonic approach to
> backtracking problems? If so, I'd love to hear your solutions!

I'm not aware of what the problem is. “Back-tracking” doesn't have a general meaning I recognise beyond random access into a data structure. So a Python list is the obvious starting point.

Do you have a specific meaning of “back-tracking” that implies particular problems that are difficult to solve?

--
 \     “The Vatican is not a state.… a state must have people. There |
  `\    are no Vaticanians.… No-one gets born in the Vatican except |
_o__)   by an unfortunate accident.” —Geoffrey Robertson, 2010-09-18 |
Ben Finney
--
https://mail.python.org/mailman/listinfo/python-list
Re: Idiomatic backtracking in Python
On 2015-01-26 00:32, Ben Finney wrote:
> Johannes Bauer writes:
>
>> So, I would like to ask if you have a Pythonic approach to
>> backtracking problems? If so, I'd love to hear your solutions!
>
> I'm not aware of what the problem is. “Back-tracking” doesn't have a
> general meaning I recognise beyond random access into a data structure.
> So a Python list is the obvious starting point.
>
> Do you have a specific meaning of “back-tracking” that implies
> particular problems that are difficult to solve?

I believe he's talking about trying different alternatives until one works. You get it in matching regexes (well, when using the NFA approach as used in the re module, Perl regexes, etc, anyway).

--
https://mail.python.org/mailman/listinfo/python-list
Re: Alternative to multi-line lambdas: Assign-anywhere def statements
Hi Steven,

On Saturday, January 24, 2015 at 11:27:33 PM UTC-5, Steven D'Aprano wrote:
> [...]
> > Doesn't the traceback tell us exactly where the lambda was called
> > from?
>
> Yes (assuming the source code is available, which it may not be), but

If the source code is not available, then you're equally unable to debug a lambda function and a named function.

> even so, which is easier?
>
> "Oh, the exception occurred in myfunc"
>
> versus
>
> "Hmmm, the exception occurred in a lambda, lets see now, that was
> called by "f", but f might have called seven different lambdas, which
> [...]

True, Python's current traceback reporting doesn't tell us the exact column number on which the exception occurred. So for example with this:

print((lambda: 1)(), (lambda: 1 / 0)())

You will get the following traceback:

Traceback (most recent call last):
  File "test.py", line 3, in <module>
    print((lambda: 1)(), (lambda: 1 / 0)())
  File "test.py", line 3, in <lambda>
    print((lambda: 1)(), (lambda: 1 / 0)())
ZeroDivisionError: integer division or modulo by zero

So the problem is there are two lambdas in line 3, so you need to examine them both to figure out which one caused the exception. The simple solution to this is to just put the lambdas on different lines:

print(
  (lambda: 1)(),
  (lambda: 1 / 0)()
)

Now you get a blindingly obvious traceback:

Traceback (most recent call last):
  File "test.py", line 5, in <module>
    (lambda: 1 / 0)()
  File "test.py", line 5, in <lambda>
    (lambda: 1 / 0)()
ZeroDivisionError: integer division or modulo by zero

> Using lambda is trading off convenience when writing the code for ease
> of debugging the code when a problem occurs. Whether that trade-off is
> worthwhile or not depends on factors such as how likely the lambda is
> to have a bug, how many of them there are, and whether or not there is
> uncertainty as to which one is called from where.

I feel like this whole 'debugging' argument is a red herring, because you can have _many_ different expressions in a line of code all of which could have potentially caused an exception you're trying to debug. For example, dispensing with the lambdas:

print(1, 1 / 0)

Error:

Traceback (most recent call last):
  File "test.py", line 3, in <module>
    print(1, 1 / 0)
ZeroDivisionError: integer division or modulo by zero

Again, it all comes down to Python just not telling you precisely where the error occurred, and instead only giving you the general vicinity. Again, this can be fixed simply by just putting the expressions on different lines.

Regards,
Yawar
--
https://mail.python.org/mailman/listinfo/python-list
Re: Idiomatic backtracking in Python
Ben Finney :

> “Back-tracking” doesn't have a general meaning I recognise beyond
> random access into a data structure.

Backtracking means the part of depth-first traversal where you retreat to the parent node. If you implement your traversal with a recursive function, backtracking means — more or less — a return from the function.

Marko
--
https://mail.python.org/mailman/listinfo/python-list
Re: Idiomatic backtracking in Python
On Mon, Jan 26, 2015 at 12:31 PM, Marko Rauhamaa wrote:
> Backtracking means the part of depth-first traversal where you retreat
> to the parent node. If you implement your traversal with a recursive
> function, backtracking means — more or less — a return from the
> function.

But possibly you need to return from N levels, not just one. Imagine you're traversing a directory tree using a simplistic recursive algorithm:

def traverse(dir):
    for d in subdirs(dir):
        traverse(d)
    for f in files(dir):
        if file_size(f) > 1024*1024*1024:
            skip this entire branch of the tree

Obviously you have to define the branch somehow, but there are plenty of times when you might want to break out of "everything up to here". How do you define that? How do you return lots of levels all at once?

I remember facing this exact problem in trying to solve a particular piece-layout puzzle; if I discovered an impossible situation, I could actually discard at least two or three levels of recursion all at once, because there's no way that the situation could become un-impossible within those levels. Can't remember how I ended up dealing with that... I think I got naughty and used a global variable.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
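One standard alternative to Chris's global variable is a sentinel exception: raising it deep in the recursion unwinds any number of stack frames in one go, and only the level that should resume catches it. A sketch (the nested-list walk stands in for the directory tree):

```python
class SkipBranch(Exception):
    """Raised deep in the recursion to unwind several levels at once."""

def collect(tree, found, limit=3):
    """Walk a nested list of ints; abort the whole walk on a big value."""
    for item in tree:
        if isinstance(item, list):
            collect(item, found, limit)  # no try/except here: the exception
        elif item >= limit:              # sails straight past N stack frames
            raise SkipBranch
        else:
            found.append(item)

found = []
try:
    collect([1, [2, [[5], 2]], 1], found)
except SkipBranch:
    pass                                 # only the outermost caller catches it
print(found)  # [1, 2]: everything seen before the deep 5 aborted the walk
```

Catching and re-raising at intermediate levels lets you choose exactly how many levels to discard.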
Re: __bases__ misleading error message
On Jan 25, 2015 2:37 PM, "Terry Reedy" wrote:

> 2. the second array is a compact array of entries in insertion order,
> such as
>
> [hash, ptr to 'x', ptr to 23]
> [hash, ptr to 'colour', ptr to 'red']
> [hash, ptr to the string 'y', ptr to the int 42]
>
> Iteration would use the compact array, making all dicts OrderedDicts.
> Pypy has already switched to this. It seems that on modern processors
> with multilevel on-chip caches, the space reduction leads to cache-miss
> reductions that compensate for the indirection cost.

Deletion becomes O(n) though. Has there been any investigation into how commonly deletion of keys is done?

--
https://mail.python.org/mailman/listinfo/python-list
Re: Idiomatic backtracking in Python
On Monday, January 26, 2015 at 2:21:34 AM UTC+5:30, Ian Foote wrote:
> Hi,
>
> I think a very idiomatic way to implement backtracking is using a
> recursive generator (python 3):
>
> def backtrack_solver(data=None):
>     if data is None:
>         yield from backtrack_solver(data=initial_data)
>
>     if cannot_be_valid(data):
>         return
>
>     if matches_condition(data):
>         yield data
>         return
>
>     for new_data in process(data):
>         yield from backtrack_solver(new_data)
>
> This generator will yield valid solutions to a suitably defined problem.
>
> `initial_data`, `cannot_be_valid`, `matches_condition` and `process`
> should be replaced with appropriate implementations for your problem.
>
> For example, a sudoku solver could be fit to this by accepting a
> partially solved grid as the `data` parameter.
>
> `cannot_be_valid` would now detect grids that have, say, two `1`s in a
> row or any other invalid grid state and exit.
>
> `matches_condition` would detect a fully solved grid.
>
> `process` would produce new grids with more cells filled in than the
> current grid.
>
> `initial_data` wouldn't be strictly necessary here, but you could use
> it for an example grid. It could also be an empty grid, and the solver
> would then yield all valid grids.
>
> Regards,
> Ian F
>
> On 25/01/15 20:15, Johannes Bauer wrote:
> > Hi folks,
> >
> > I have a problem at hand that needs code for backtracking as a
> > solution. And I have no problem coding it, but I can't get rid of
> > the feeling that I'm always solving backtracking problems in a
> > non-Pythonic (non-idiomatic) way. So, I would like to ask if you
> > have a Pythonic approach to backtracking problems? If so, I'd love
> > to hear your solutions!
> >
> > Cheers, Johannes

To add to Ian:

The classic way of doing it in a functional framework is called:
"Replace failure by list of successes"

https://rkrishnan.org/files/wadler-1985.pdf

The things that have to go into it are
1. Extensive use of list comprehensions
2. Lazy lists

Just change in the above 'list' to 'generator'
and more or less it should work in python

More or less that means when you have a comprehension
[expr for x in list1 for y in list2 etc]

change the '[]' to '()'
and recursively change the list1 list2 to gen1 gen2

Some nuisances that bug me (or I don't know how to handle):

1. Singleton
   [val] becomes (x for x in [val])   (Hoo Boy!)

   Or
   def single_val(): yield val

2. Nice syntax like list +

   Compare [1] + [2]
   with

   >>> from itertools import chain
   >>> a = (x for x in [1])
   >>> b = (x for x in [2])
   >>> list(chain(a,b))
   [1, 2]

   Of course it looks worse because the (syntax) overhead of jumping
   between lists and generators overwhelms in this trivial case
--
https://mail.python.org/mailman/listinfo/python-list
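[Editor's note: as a concrete instance of Ian's template in the generator style discussed above, here is a minimal subset-sum backtracker. The problem choice and the names `candidates`, `target`, `chosen` are illustrative, not from the thread; it assumes non-negative candidates so that overshooting the target prunes the branch.]

```python
def subset_sums(candidates, target, chosen=(), total=0):
    """Yield every subset of `candidates` (a tuple of non-negative
    numbers) whose elements sum to `target`.

    Backtracking in the "list of successes" style: a dead branch
    simply yields nothing instead of signalling failure.
    """
    if total > target:        # cannot_be_valid: prune this branch
        return
    if total == target:       # matches_condition: a success
        yield chosen
        return
    for i, c in enumerate(candidates):      # process: extend partial solution
        yield from subset_sums(candidates[i + 1:], target,
                               chosen + (c,), total + c)

print(list(subset_sums((1, 2, 3, 4), 5)))   # -> [(1, 4), (2, 3)]
```

Consuming the generator with `list()` recovers Wadler's "list of successes"; iterating it lazily stops the search as soon as one solution is enough.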
Re: Alternative to multi-line lambdas: Assign-anywhere def statements
Yawar Amin wrote:
> Hi Steven,
>
> On Saturday, January 24, 2015 at 11:27:33 PM UTC-5, Steven D'Aprano
> wrote:
>> [...]
>>> Doesn't the traceback tell us exactly where the lambda was called
>>> from?
>>
>> Yes (assuming the source code is available, which it may not be), but
>
> If the source code is not available, then you're equally unable to debug
> a lambda function and a named function.

If source code to a library is unavailable, debugging is significantly
more difficult, but it may still be possible. At least in the sense that
you can identify the nature of the bug and develop a work-around in your
own code. Obviously there needs to be *some* source code for you to
edit, otherwise there's nothing for you to edit :-)

Even in the absence of source code, you may still be able to report a
bug to the application developer and show the full traceback. Even
without source, tracebacks show the names of functions, and lambdas all
share the same name:

[steve@ando ~]$ cat demo.py
def fail(x):
    f = lambda a: 1/a
    g = lambda a: f(a-1)
    return g(x)

print fail(1)

[steve@ando ~]$ python -m compileall demo.py
Compiling demo.py ...
[steve@ando ~]$ python demo.pyc
Traceback (most recent call last):
  File "demo.py", line 6, in <module>
    print fail(1)
  File "demo.py", line 4, in fail
    return g(x)
  File "demo.py", line 3, in <lambda>
    g = lambda a: f(a-1)
  File "demo.py", line 2, in <lambda>
    f = lambda a: 1/a
ZeroDivisionError: integer division or modulo by zero

[steve@ando ~]$ mv demo.py demo-save.py  # hide the source
[steve@ando ~]$ python demo.pyc
Traceback (most recent call last):
  File "demo.py", line 6, in <module>
  File "demo.py", line 4, in fail
  File "demo.py", line 3, in <lambda>
  File "demo.py", line 2, in <lambda>
ZeroDivisionError: integer division or modulo by zero

In this toy example, it is still easy to debug even without names. It's
a three-line function, how hard could it be? :-) But that isn't always
going to be the case.

[...]
> True, Python's current traceback reporting doesn't tell us the exact
> column number on which the exception occurred. So for example with
> this: [...]
>
> So the problem is there are two lambdas in line 3, so you need to
> examine them both to figure out which one caused the exception. The
> simple solution to this is to just put the lambdas on different lines:
>
> print(
>     (lambda: 1)(),
>     (lambda: 1 / 0)()
> )
>
> Now you get a blindingly obvious traceback:

It's blindingly obvious because it is trivial. You're still thinking
about the easy cases, not the hard cases:

py> def make_funcs():
...     closures = []
...     for i in range(5, -5, -1):
...         closures.append(lambda x, i=i: x/i)
...     return closures
...
py> results = [f(5) for f in make_funcs()]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
  File "<stdin>", line 4, in <lambda>
ZeroDivisionError: division by zero

That's still not a hard case, because all the closures have a very
similar form and it's easy to spot that one of them divides by zero.
But it demonstrates one way that line numbers alone don't help.

But do understand I'm not saying that lambdas cannot be debugged. I'm
saying that *function names are a useful debugging tool*, so if you take
away the function name that makes debugging just that little bit harder.
Not necessarily in every single case, and not necessarily hard to the
point that you spend days trying to track down the error, but hard
enough in enough circumstances that using lambda for arbitrarily complex
functions would be a bad idea. (The same applies to comprehensions: they
should be kept simple.) That shouldn't be controversial.

--
Steven
--
https://mail.python.org/mailman/listinfo/python-list
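[Editor's note: to illustrate Steven's point that names are a debugging tool, here is a hypothetical rewrite of his `make_funcs` example using a named inner `def` in place of the lambda. The innermost traceback frame is then labelled `divide_by` rather than an anonymous `<lambda>`; the name `divide_by` is my own.]

```python
import sys
import traceback

def make_funcs():
    # Same shape as the lambda example, but the inner function is named,
    # so its name appears in tracebacks.
    closures = []
    for i in range(5, -5, -1):
        def divide_by(x, i=i):
            return x / i
        closures.append(divide_by)
    return closures

try:
    results = [f(5) for f in make_funcs()]
except ZeroDivisionError:
    # Inspect the traceback: the deepest frame now carries a real name.
    frames = traceback.extract_tb(sys.exc_info()[2])
    print(frames[-1].name)   # prints: divide_by
```

With a lambda, that last frame would have read `<lambda>`, leaving only the line number to distinguish between several anonymous functions.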
Re: Track down SIGABRT
Israel Brewster writes:
> ...
> Running it through GDB didn't seem to give me much more information
> than that crash report - I can see that it is in thread 0, and the
> call stack is identical to what the crash report shows, but I can't
> see where in the python code it is.
> ...
> Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
> 0   libsystem_kernel.dylib    0x98176be6 __select + 10
> 1   time.so                   0x0052fdda 0x52f000 + 3546
> 2   org.python.python         0x0007e90a PyCFunction_Call + 98
> 3   org.python.python         0x000ccef4 PyEval_EvalFrameEx + 8182
> 4   org.python.python         0x000caeb2 PyEval_EvalCodeEx + 1777

This tells you: the crash happens somewhere in a call to a function
defined in "time.so". I suppose that all such functions are harmless in
themselves. I see two potential causes:

* a stack overflow
  Some Python builds are made with quite a small thread runtime stack.
  However, you would usually get a "SIGSEGV" rather than a "SIGABRT".

* a bad pointer passed in to the function
  Again, I would expect a "SIGSEGV" rather than a "SIGABRT" -- but
  maybe "__select" has a special sanity check for its arguments and
  raises "SIGABRT" when the check fails.

In order to determine where in your Python code the problem happens, you
likely need a Python with debugging symbols (system-installed Python
interpreters are usually "stripped", i.e. they no longer have debugging
symbols). Python comes with GDB macros able to show you information at
the Python (rather than the C) level. Here (Ubuntu 12.04), they are at
"/usr/share/doc/python2.7/gdbinit.gz". Those macros need debugging
symbols. Read the top of this file for how to get those macros defined
in your GDB session.

I doubt that being able to relate the problem to code at the Python
level will help you much. Your "SIGABRT" problem is likely not caused at
the Python level but at the C level. If it is not caused by a runtime
stack overflow, the cause is likely a memory corruption, introduced
non-locally (far away) in some C extension.
Using a tool to track down memory corruption might help you (however, in my experience, it is usually a huge task). -- https://mail.python.org/mailman/listinfo/python-list
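[Editor's note: one standard-library option not mentioned in the thread, but aimed at exactly the "where in the python code it is" question: the `faulthandler` module (Python 3.3+) installs signal handlers that dump the Python-level traceback of every thread when a fatal signal such as SIGSEGV, SIGFPE, SIGABRT, SIGBUS or SIGILL arrives, without needing a debug-symbol build or GDB macros. A minimal sketch:]

```python
import faulthandler
import sys

# Install handlers that dump Python tracebacks for all threads to
# stderr when a fatal signal is delivered to the process.
faulthandler.enable(file=sys.stderr, all_threads=True)

print(faulthandler.is_enabled())   # -> True
```

On a crash like the one above, the dump points at the Python frames active in each thread, which narrows down which call into the C extension triggered the signal.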
Re: Idiomatic backtracking in Python
Rustom Mody writes:

> To add to Ian:
>
> The classic way of doing it in a functional framework is called:
> "Replace failure by list of successes"
>
> https://rkrishnan.org/files/wadler-1985.pdf
>
> The things that have to go into it are
> 1. Extensive use of list comprehensions
> 2. Lazy lists
>
> Just change in the above 'list' to 'generator'
> and more or less it should work in python
>
> More or less that means when you have a comprehension
> [expr for x in list1 for y in list2 etc]
>
> change the '[]' to '()'
> and recursively change the list1 list2 to gen1 gen2
>
> Some nuisances that bug me (or I don't know how to handle):
>
> 1. Singleton
>    [val] becomes (x for x in [val])   (Hoo Boy!)
>
>    Or
>    def single_val(): yield val

It's just one def:

def sing(*args):
    for arg in args:
        yield arg

Or not even that: iter((val,)). But see below at the end.

> 2. Nice syntax like list +
>
>    Compare [1] + [2]
>    with
>
>    >>> from itertools import chain
>    >>> a = (x for x in [1])
>    >>> b = (x for x in [2])
>    >>> list(chain(a,b))
>    [1, 2]
>
>    Of course it looks worse because the (syntax) overhead of jumping
>    between lists and generators overwhelms in this trivial case

Yes, one wouldn't jump back and forth so much. Just collect the result
at the end.

>>> list(chain(sing(3,1,4), sing(1), sing(5,9,2,6)))
[3, 1, 4, 1, 5, 9, 2, 6]

>>> list(chain(sing(3,1,4,1), sing(), sing(5,9,2,6)))
[3, 1, 4, 1, 5, 9, 2, 6]

And it gets better: chain takes lists! And tuples! For those who like
their brackets round:

>>> tuple(chain((3,1,4,1), (), (5,9,2,6)))
(3, 1, 4, 1, 5, 9, 2, 6)

>>> set(chain((3,)))
{3}

>>> list(chain(()))
[]

So chain does it all.
--
https://mail.python.org/mailman/listinfo/python-list
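[Editor's note: a close relative of `chain` worth knowing here, though not named in the post: `itertools.chain.from_iterable` flattens a lazy stream of streams, which is exactly the shape a generator-based "list of successes" search produces (one sub-stream of successes per branch). The toy data below is illustrative only.]

```python
from itertools import chain

# A generator of generators: one lazy sub-stream per "branch".
branches = ((x * 10 + y for y in range(2)) for x in range(3))

# from_iterable consumes each inner stream in turn, entirely lazily.
flat = list(chain.from_iterable(branches))
print(flat)   # -> [0, 1, 10, 11, 20, 21]
```

Unlike `chain(*gens)`, `from_iterable` never forces the outer iterable up front, so an infinite stream of branches still works.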
Re: Idiomatic backtracking in Python
On Monday, January 26, 2015 at 12:52:04 PM UTC+5:30, Jussi Piitulainen wrote:
> Rustom Mody writes:
>
> > To add to Ian:
> >
> > The classic way of doing it in a functional framework is called:
> > "Replace failure by list of successes"
> >
> > https://rkrishnan.org/files/wadler-1985.pdf
> >
> > The things that have to go into it are
> > 1. Extensive use of list comprehensions
> > 2. Lazy lists
> >
> > Just change in the above 'list' to 'generator'
> > and more or less it should work in python
> >
> > More or less that means when you have a comprehension
> > [expr for x in list1 for y in list2 etc]
> >
> > change the '[]' to '()'
> > and recursively change the list1 list2 to gen1 gen2
> >
> > Some nuisances that bug me (or I don't know how to handle):
> >
> > 1. Singleton
> >    [val] becomes (x for x in [val])   (Hoo Boy!)
> >
> >    Or
> >    def single_val(): yield val
>
> It's just one def:
>
> def sing(*args):
>     for arg in args:
>         yield arg
>
> Or not even that: iter((val,)).

Sweet

> But see below at the end.
>
> > 2. Nice syntax like list +
> >
> >    Compare [1] + [2]
> >    with
> >
> >    >>> from itertools import chain
> >    >>> a = (x for x in [1])
> >    >>> b = (x for x in [2])
> >    >>> list(chain(a,b))
> >    [1, 2]
> >
> >    Of course it looks worse because the (syntax) overhead of jumping
> >    between lists and generators overwhelms in this trivial case
>
> Yes, one wouldn't jump back and forth so much. Just collect the result
> at the end.
>
> >>> list(chain(sing(3,1,4), sing(1), sing(5,9,2,6)))
> [3, 1, 4, 1, 5, 9, 2, 6]
>
> >>> list(chain(sing(3,1,4,1), sing(), sing(5,9,2,6)))
> [3, 1, 4, 1, 5, 9, 2, 6]
>
> And it gets better: chain takes lists! And tuples! For those who like
> their brackets round:
>
> >>> tuple(chain((3,1,4,1), (), (5,9,2,6)))
> (3, 1, 4, 1, 5, 9, 2, 6)
>
> >>> set(chain((3,)))
> {3}
>
> >>> list(chain(()))
> []
>
> So chain does it all.

Neat
Thanks
--
https://mail.python.org/mailman/listinfo/python-list