Re: Delegation in Python

2015-01-25 Thread Chris Angelico
On Sun, Jan 25, 2015 at 6:49 PM, Brian Gladman  wrote:
> Thanks, a part of this was a wish to understand how to map what I can do
> in other languages into Python.  I felt that it might just be possible
> in Python to avoid having to wrap all the methods of the base class in
> the derived class.  But it seems that __getattr__ etc are not quite as
> magic as I hoped.

They do exactly what they're documented to, nothing more and nothing
less :) It's certainly possible to use them to hook into missing
attributes, for instance:

>>> from fractions import Fraction
>>> class RF:
    def __init__(self, *args, **kw):
        self._frac = Fraction(*args, **kw)
    def __getattr__(self, attr):
        return getattr(self._frac, attr)
    def __repr__(self):
        return "RF(%d, %d)" % (self._frac.numerator, self._frac.denominator)
    def is_integer(self):
        return self._frac.denominator == 1

>>> RF(1,4).numerator
1

But it doesn't work for everything:
>>> RF(1,4)*2
Traceback (most recent call last):
  File "<pyshell#...>", line 1, in <module>
    RF(1,4)*2
TypeError: unsupported operand type(s) for *: 'RF' and 'int'

The only solution would be to go through every operation that you care
about, and manually hook them. Something like this:

    def __mul__(self, other):
        result = self._frac * other
        if isinstance(result, Fraction):
            return RF(result.numerator, result.denominator)
        return result

>>> RF(1,4)*2
RF(1, 2)

And do that for every other operation, method, etc. Tedious, but it can
be effective if you need it. (One way to cut the tedium down is sketched
below.)
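
A rough sketch of automating that, assuming Python 3 and the RF class
above (the operator list is illustrative, not complete):

from fractions import Fraction

def _wrap(name):
    def method(self, *args):
        # delegate to the wrapped Fraction, re-wrapping Fraction results
        result = getattr(self._frac, name)(*args)
        if isinstance(result, Fraction):
            return RF(result.numerator, result.denominator)
        return result
    method.__name__ = name
    return method

for _name in ('__add__', '__radd__', '__sub__', '__rsub__',
              '__mul__', '__rmul__', '__truediv__', '__neg__'):
    setattr(RF, _name, _wrap(_name))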

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: __bases__ misleading error message

2015-01-25 Thread Marco Buttu

On 25/01/2015 01:55, Terry Reedy wrote:


'This situation' being that Someclass.attro works but Someclass().attro
raises...

In other words, if 'Someclass.attro' evaluates to an object ...
then 'Someclass().attro' *normally* evaluates to the same object


I am sorry Terry, but I do not agree, because the word "normally" in 
this context does not have any meaning, and just introduces confusion. 
In fact this is exactly the case where "Someclass.attro evaluates to 
an object" *but* hasattr(Someclass(), 'attro'), as expected, is False. 
What do you mean by "normally"?



because attribute lookup on an instance reverts to attribute lookup on
the class and even its super classes


Here's the point. The __bases__ attribute is defined by the class type. 
So if isinstance(obj, type) is true, then hasattr(obj, '__bases__') is 
true. In the OP case:


  >>> class Sub: # Python 3
  ... pass
  ...
  >>> isinstance(Sub, type)
  True
  >>> hasattr(Sub, '__bases__')
  True

But Sub() is not an instance of type (because Sub is not a subclass of 
type), and so Sub() does not get attributes, like __bases__, 
from type, and here we are:


  >>> foo = Sub()
  >>> isinstance(foo, type)
  False
  >>> foo.__bases__
  Traceback (most recent call last):
...
  AttributeError: 'Sub' object has no attribute '__bases__'


In other words, speaking about the lookup machinery: because '__bases__' 
is not in foo.__dict__, Python looks for __bases__ in Sub.__dict__, but 
it does not find it there. So it looks in the Sub bases, and we are back 
to the point: Sub is an instance of type, not a subclass of type.
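
A quick interactive check of that (assuming the Sub class above): 
__bases__ lives in type.__dict__ as a descriptor, so it is reachable 
through Sub but not through a Sub instance:

  >>> '__bases__' in type.__dict__
  True
  >>> type.__dict__['__bases__'].__get__(Sub, type)
  (<class 'object'>,)
  >>> '__bases__' in Sub.__dict__
  False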




Someclass.__bases__
is special; it is not listed by dir(Someclass), and it is not accessible
via an instance.


From my point of view no attribute is special (maybe just a descriptor is 
a bit special, in some ways). To understand what happens in these 
situations, we do not have to give a special meaning to the attribute. 
Given `obj.attr`, we just have to know what kind of object `obj` is: 
usually, like in this case, we have to differentiate between class and 
non-class objects. In fact __bases__ is not listed by dir(Sub) because 
Sub is a class, and the distinction between class and non-class 
objects matters, especially for dir().


--
Marco Buttu
--
https://mail.python.org/mailman/listinfo/python-list


Re: Benchmarking some modules - strange result

2015-01-25 Thread Marko Rauhamaa
Dan Stromberg :

> BTW, there isn't much else going on on this computer, except some
> possible small cronjobs.

The OS caches the files in RAM. You'd expect the file I/O performance to
get much better on test reruns. On the other hand, OS cache flushing
happens at random times, which affects the results in the other
direction.

Also, I take it you aren't running your tests in a virtual machine or on
the cloud, right?

Finally, you should run each test case in isolation. IOW, restart Python
between each test case so the test cases don't interfere with each
other. The startup overhead could be reduced by starting the test cases
with an os.fork(). However, the performance tests should run long enough
to overwhelm the startup and teardown overhead.
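
A minimal sketch of the fork approach (POSIX only; the benchmark
callables are placeholders):

import os

def run_isolated(test_func):
    pid = os.fork()
    if pid == 0:             # child: inherits the warm interpreter, but
        test_func()          # anything it changes dies with the process
        os._exit(0)
    os.waitpid(pid, 0)       # parent: wait so the runs don't overlap

for test in (test_a, test_b, test_c):   # hypothetical benchmark functions
    run_isolated(test)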


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: __bases__ misleading error message

2015-01-25 Thread Mario Figueiredo
In article <54c4606a$0$13002$c3e8da3$54964...@news.astraweb.com>, 
steve+comp.lang.pyt...@pearwood.info says...
> 
> It doesn't.

Your explanation was great Steven. Thank you. But it raises one question...

> 
> Assigning a value to a variable ("m = 42", hex 2A) results in the compiler
> storing that value in the bucket; assignment makes a copy: "n = m" leads to
> two independent buckets with the same value:
> 
> m = [002A]
> n = [002A]

I'm still in the process of learning Python. So, that's probably why 
this is unexpected to me.

I was under the impression that what Python did was keep a lookup table 
pointing to memory. Every variable gets an entry with a type descriptor 
and a pointer to the memory address where the variable data resides.

(UDTs may be special in that they would have more than one entry, one 
for each enclosing def and declared attribute)

In the example above, the n and m buckets would hold pointers, not 
binary values. And because they are immutable objects, n and m pointers 
would be different. Not equal. But in the case of mutable objects, n = m 
would result in m having the same pointer address as n.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: __bases__ misleading error message

2015-01-25 Thread Steven D'Aprano
Mario Figueiredo wrote:

> In article <54c4606a$0$13002$c3e8da3$54964...@news.astraweb.com>,
> steve+comp.lang.pyt...@pearwood.info says...
>> 
>> It doesn't.
> 
> Your explanation was great Steven. Thank you. But raises one question...
> 
>> 
>> Assigning a value to a variable ("m = 42", hex 2A) results in the
>> compiler storing that value in the bucket; assignment makes a copy: "n =
>> m" leads to two independent buckets with the same value:
>> 
>> m = [002A]
>> n = [002A]


Maybe I wasn't clear enough. The above is used by languages like C or
Pascal, which use fixed memory locations for variables. If I gave the
impression this was Python, I am sorry.


> I'm still in the process of learning Python. So, that's probably why
> this is unexpected to me.
> 
> I was under the impression that what Python did was keep a lookup table
> pointing to memory. Every variable gets an entry with a type descriptor
> and a pointer to the memory address where the variable data resides.

This sounds more or less correct, at least for CPython. CPython is
the "reference implementation", and probably the version you use when you
run Python. But it is not the only implementation, and others can differ.

(E.g. in Jython, the Python interpreter is built using Java, not C. You
can't work with pointers to memory addresses in Java, and the Java garbage
collector is free to move objects around when needed.)

In CPython, objects live in the heap, and Python tracks them using a
pointer. So when you bind a name to a value:

x = 23  # what you type

what happens is that Python sets a key + value in the global namespace (a
dictionary):

globals()['x'] = 23  # what Python runs

and the globals() dict will then look something like this:

{'x': 23, 'colour': 'red', 'y': 42}

(Note: *local* variables are similar but not quite the same. They're also
more complicated, so let's skip them for now.)

What happens inside the dictionary? Dictionaries are "hash tables", so they
are basically a big array of cells, and each cell is a pair of pointers,
one for the key and one for the value:

[dictionary header]
[blank] 
[blank] 
[ptr to the string 'y', ptr to the int 42]
[blank] 
[ptr to 'x', ptr to 23]
[blank]
[blank]
[blank]
[ptr to 'colour', ptr to 'red']
[blank]
...


Notice that the order is unpredictable. Also, don't take this picture too
literally. Dicts are highly optimized, highly tuned and in active
development; the *actual* design of Python dicts may vary. But this is a
reasonable simplified view of how they could be designed.

The important thing to remember is that while CPython uses pointers under
the hood to make the interpreter work, pointers are not part of the Python
language. There is no way in Python to get a pointer to an object, or
increment a pointer, or dereference a pointer. You just use objects, and
the interpreter handles all the pointer stuff behind the scenes.


> (UDTs may be special in that they would have more than one entry, one
> for each enclosing def and declared attribute)
> 
> In the example above, the n and m buckets would hold pointers, not
> binary values. And because they are immutable objects, n and m pointers
> would be different. Not equal. But in the case of mutable objects, n = m
> would result in m having the same pointer address as n.

No, this is certainly not the case! Python uses *exactly* the same rules for
mutable and immutable objects. In fact, Python can't tell whether a value is
mutable or immutable until it tries to modify it.

Remember I said that name-binding languages operate using a model of pieces
of string between the name and the object? Here are two names bound to the
same object:


m ---+--- 0x2a
n --/ 


Obviously Python doesn't *literally* use a piece of string :-) so what
happens under the hood? Pointers again, at least in CPython.

In this case, if we look deep inside our globals dictionary again, we will
see two cells:

 [ptr to the string "m", ptr to the int 0x2a]
 [ptr to the string "n", ptr to the int 0x2a]

The two int pointers point to the same object. This is guaranteed by the
language:

m = 42
n = m
assert id(m) == id(n)

Both objects have the same ID and are the same object at the same memory
location. Assignment in Python NEVER makes a copy of the value being
assigned.
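
The same sharing is easy to see with a mutable object:

m = [1, 2, 3]
n = m            # no copy; both names are bound to the one list
n.append(4)
assert m == [1, 2, 3, 4]
assert m is n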


-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Benchmarking some modules - strange result

2015-01-25 Thread Peter Otten
Dan Stromberg wrote:

> I've been benchmarking some python modules that are mostly variations
> on the same theme.
> 
> For simplicity, let's say I've been running the suite of performance
> tests within a single interpreter - so I test one module thoroughly,
> then move on to the next without exiting the interpreter.
> 
> I'm finding that if I prune the list of modules down to just the best
> performers, I get pretty different results - what was best no longer
> is.  This strikes me as strange.

> I'm about ready to rewrite things to run each individual test in a
> fresh interpreter. But is there a better way?

You could run combinations of two modules in the same interpreter to see if 
there are specific modules that slow down the following module.
If you can identify such modules you could then look into their code to try 
and find the cause of the slowdown.

Requires some work, but the results might be interesting to the greater 
public -- many real-world applications make do with a single interpreter ;)
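
A rough sketch of that pairwise idea (the module names and the
run_benchmark.py driver script are hypothetical):

import itertools
import subprocess
import sys

modules = ['mod_a', 'mod_b', 'mod_c']
for first, second in itertools.permutations(modules, 2):
    # fresh interpreter per ordered pair; time the *second* module to
    # see which predecessors slow it down
    subprocess.check_call([sys.executable, 'run_benchmark.py',
                           first, second])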


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: __bases__ misleading error message

2015-01-25 Thread Mario Figueiredo
In article <54c4dae1$0$13005$c3e8da3$54964...@news.astraweb.com>, 
steve+comp.lang.pyt...@pearwood.info says...
> [...]

Most excellent. Thanks for the hard work, explaining this to me. :)

Knowing Python internals is something that will end up benefiting me in the 
long run. There's much to be gained by knowing the inner workings of your 
programming language...

Python is missing an under-the-hood book, I suppose. Tracing through 
Python source code to learn this stuff isn't easy unless we know what we 
are looking for.


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: __bases__ misleading error message

2015-01-25 Thread Marko Rauhamaa
Mario Figueiredo :

> Knowing Python internals is something that will end up benefiting me in
> the long run. There's much to be gained by knowing the inner workings
> of your programming language...
>
> Python is missing an under-the-hood book, I suppose. Tracing through
> Python source code to learn this stuff isn't easy unless we know what
> we are looking for.

One must only be careful to distinguish implementation choices from the
abstract definitions. The Python Language Reference is the official
standard:

   <https://docs.python.org/3/reference/index.html>


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Benchmarking some modules - strange result

2015-01-25 Thread Dan Sommers
On Sun, 25 Jan 2015 13:24:40 +0100, Peter Otten wrote:

> Dan Stromberg wrote:
> 
>> I've been benchmarking some python modules that are mostly variations
>> on the same theme.
>> 
>> For simplicity, let's say I've been running the suite of performance
>> tests within a single interpreter - so I test one module thoroughly,
>> then move on to the next without exiting the interpreter.
>> 
>> I'm finding that if I prune the list of modules down to just the best
>> performers, I get pretty different results - what was best no longer
>> is.  This strikes me as strange.
> 
>> I'm about ready to rewrite things to run each individual test in a
>> fresh interpreter. But is there a better way?
> 
> You could run combinations of two modules in the same interpreter to
> see if there are specific modules that slow down the following module.
> If you can identify such modules you could then look into their code
> to try and find the cause of the slowdown.
> 
> Requires some work, but the results might be interesting to the
> greater public -- many real-world applications make do with a single
> interpreter ;)

I would add that your present results probably reflect the real world
better than a contrived, yet scientific and arguably more "accurate",
test.  This tells me that the variations you see are down in the noise,
and that any optimizations you may make to improve performance could be
lost in any given application.

Unless, of course, you're on an academic pursuit, in which case carry
on!

Dan
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Benchmarking some modules - strange result

2015-01-25 Thread Mark Lawrence

On 25/01/2015 02:11, Dan Stromberg wrote:

Hi folks.

I've been benchmarking some python modules that are mostly variations
on the same theme.

For simplicity, let's say I've been running the suite of performance
tests within a single interpreter - so I test one module thoroughly,
then move on to the next without exiting the interpreter.

I'm finding that if I prune the list of modules down to just the best
performers, I get pretty different results - what was best no longer
is.  This strikes me as strange.

I'm forcing a garbage collection between tests with gc.collect(), but
even that yields the oddball results.

Is there something else I should do to restore the interpreter to a
known-pristine state for the sake of such tests?

BTW, there isn't much else going on on this computer, except some
possible small cronjobs.

I'm about ready to rewrite things to run each individual test in a
fresh interpreter. But is there a better way?

If I do the rewrite, I might run 1 run of module A, 1 run of module B,
1 run of module C, then another run of module A, another run of module
B, and another run of module C - to spread any possible timing
oddities more evenly.

Thanks.



Are you aware of this code repository https://hg.python.org/benchmarks/ 
and the associated mailing list 
https://mail.python.org/mailman/listinfo/speed, also available as 
gmane.comp.python.speed ?  Maybe there's something there for all of us, 
maybe not, there's only one way to find out :)


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Object destruction delayed on interactive prompt

2015-01-25 Thread Mario Figueiredo
Consider the following class:

class A:
    def __init__(self, value):
        self.value = value

    def __del__(self):
        print("'A' instance being collected...")

    def __repr__(self):
        return str(self.value)

If run as a script, the destructor behavior works as expected:

if __name__ == '__main__':
    x1 = A(12)
    print(x1)   # __repr__()
    x1 = 2      # __del__()
    print(x1)

But if I run it interactively, a weird behavior comes up:

>>> x1 = A(12)
>>> x1 = 'string'
'A' instance being collected...   # __del__() as expected
>>> x1 = A(12)    # Recreate x1
>>> x1            # __repr__()
12
>>> x1 = 'string' # __del__() isn't executed!
>>> x1            # it is delayed until here
'A' instance being collected...
'string'

This is reproducible in IDLE and at the system command prompt. Is this a 
REPL specific behavior?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Object destruction delayed on interactive prompt

2015-01-25 Thread Mario Figueiredo
In article , mar...@gmail.com 
says...
> [...]

Forgot to mention this is Python 3.4.2
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Object destruction delayed on interactive prompt

2015-01-25 Thread MRAB

On 2015-01-25 18:23, Mario Figueiredo wrote:

Consider the following class:

 class A:
 def __init__(self, value):
 self.value = value

 def __del__(self):
 print("'A' instance being collected...")

 def __repr__(self):
 return str(self.value)

If run as a script, the destructor behavior works as expected:

 if __name__ == '__main__':
 x1 = A(12)
 print(x1)   # __repr__()
 x1 = 2  # __del__()
 print(x1)

But if I run it interactively, a weird behavior comes up:

 >>> x1 = A(12)
 >>> x1 = 'string'
 'A' instance being collected...   # __del__() as expected
 >>> x1 = A(12)    # Recreate x1
 >>> x1            # __repr__()
 12
 >>> x1 = 'string' # __del__() isn't executed!
 >>> x1            # it is delayed until here
 'A' instance being collected...
 'string'

This is reproducible in IDLE and at the system command prompt. Is this a
REPL specific behavior?


In the REPL, the result of an expression is bound to '_':

>>> x1 = A(12)
>>> x1 # _ will be bound to the result.
12
>>> x1 = 'string'
>>> # At this point, _ is still bound to the object.
>>> 0 # This binds _ to another object, so the previous object can be collected.
'A' instance being collected...
0

--
https://mail.python.org/mailman/listinfo/python-list


Re: Object destruction delayed on interactive prompt

2015-01-25 Thread Rick Johnson
__del__ is unreliable. But don't take my word for it; check out the docs.

https://docs.python.org/3.2/reference/datamodel.html#object.__del__
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Object destruction delayed on interactive prompt

2015-01-25 Thread Mario Figueiredo
In article , 
pyt...@mrabarnett.plus.com says...
> In the REPL, the result of an expression is bound to '_':
> 
>  >>> x1 = A(12)
>  >>> x1 # _ will be bound to the result.
> 12
>  >>> x1 = 'string'
>  >>> # At this point, _ is still bound to the object.
>  >>> 0 # This binds _ to another object, so the previous object can be collected.
> 'A' instance being collected...
> 0

Aha! Thank you.
-- 
https://mail.python.org/mailman/listinfo/python-list


Idiomatic backtracking in Python

2015-01-25 Thread Johannes Bauer
Hi folks,

I have a problem at hand that needs code for backtracking as a solution.
And I have no problem coding it, but I can't get rid of the feeling that
I'm always solving backtracking problems in a non-Pythonic
(non-idiomatic) way. So, I would like to ask if you have a Pythonic
approach to backtracking problems? If so, I'd love to hear your solutions!

Cheers,
Johannes

-- 
>> Where exactly had you predicted the quake again?
> At least not publicly!
Ah, the latest and to this day most ingenious stunt of our great
cosmologists: the secret prediction.
 - Karl Kaos on Rüdiger Thomas in dsa 
-- 
https://mail.python.org/mailman/listinfo/python-list


Problem while reading files from hdfs using python

2015-01-25 Thread Shalini Ravishankar
Hello Everyone,

I am trying to read (open) and write files in hdfs inside a Python script, but 
I am getting an error. Can someone tell me what is wrong here?

Code (full): sample.py

#!/usr/bin/python


from subprocess import Popen, PIPE

print "Before Loop"

cat = Popen(["hadoop", "fs", "-cat", "./sample.txt"],
stdout=PIPE)
put = Popen(["hadoop", "fs", "-put", "-", "./modifiedfile.txt"],
stdin=PIPE)
for line in cat.stdout:
line += "Blah"
print line
put.stdin.write(line)

cat.stdout.close()
cat.wait()
put.stdin.close()
put.wait()

When I execute : 

hadoop jar 
/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.5.1.jar -file 
./sample.py -mapper './sample.py' -input sample.txt -output fileRead

It executes properly, but I couldn't find the file (modifiedfile.txt) which was 
supposed to be created in hdfs.

And When I execute :

 hadoop fs -getmerge ./fileRead/ file.txt

Inside the file.txt, I got :

Before Loop 
Before Loop

Can someone please tell me what I am doing wrong here? I don't think it reads 
from sample.txt.

I would really appreciate the help.


--
Thanks & Regards,
Shalini Ravishankar.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Idiomatic backtracking in Python

2015-01-25 Thread Ian Foote
Hi,

I think a very idiomatic way to implement backtracking is using a
recursive generator (python 3):

def backtrack_solver(data=None):
    if data is None:
        yield from backtrack_solver(data=initial_data)
        return

    if cannot_be_valid(data):
        return

    if matches_condition(data):
        yield data
        return

    for new_data in process(data):
        yield from backtrack_solver(new_data)

This generator will yield valid solutions to a suitably defined problem.

`initial_data`, `cannot_be_valid`, `matches_condition` and `process`
should be replaced with appropriate implementation for your problem.

For example, a sudoku solver could be fit to this by accepting a
partially solved grid as the `data` parameter.

`cannot_be_valid` would now detect grids that have, say, two `1`s in a
row or any other invalid grid state and exit.

`matches_condition` would detect a fully solved grid.

`process` would produce new grids with more cells filled in than the
current grid.

`initial_data` wouldn't be strictly necessary here, but you could use
it for an example grid. It could also be an empty grid, and the solver
would then yield all valid grids.
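
A concrete, runnable instance of the same pattern, using subset-sum
instead of sudoku (the names mirror the skeleton above):

def subset_solver(numbers, target, chosen=()):
    total = sum(chosen)
    if total > target:       # cannot_be_valid: prune this branch
        return
    if total == target:      # matches_condition: a solution
        yield chosen
        return
    for i, n in enumerate(numbers):   # process: extend the partial solution
        yield from subset_solver(numbers[i + 1:], target, chosen + (n,))

print(list(subset_solver((3, 1, 4, 2), 5)))   # [(3, 2), (1, 4)]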

Regards,
Ian F

On 25/01/15 20:15, Johannes Bauer wrote:
> Hi folks,
> 
> I have a problem at hand that needs code for backtracking as a
> solution. And I have no problem coding it, but I can't get rid of
> the feeling that I'm always solving backtracking problems in a
> non-Pythonic (non-idiomatic) way. So, I would like to ask if you
> have a Pythonic approach to backtracking problems? If so, I'd love
> to hear your solutions!
> 
> Cheers, Johannes
> 

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: __bases__ misleading error message

2015-01-25 Thread Terry Reedy

On 1/25/2015 7:00 AM, Steven D'Aprano wrote:


What happens inside the dictionary? Dictionaries are "hash tables", so they
are basically a big array of cells, and each cell is a pair of pointers,
one for the key and one for the value:

 [dictionary header]
 [blank]
 [blank]
 [ptr to the string 'y', ptr to the int 42]


At the moment, for CPython, each entry has 3 items, the first being 
the cached hash of the key.  The cached hashes are compared first, as a 
fast pre-check, before keys are compared for equality.


  [hash('y'), ptr('y'), ptr(42)]


 [blank]
 [ptr to 'x', ptr to 23]
 [blank]
 [blank]
 [blank]
 [ptr to 'colour', ptr to 'red']
 [blank]



As you say, these are implementation details.  CPython dicts for the 
instances of at least some classes have a different, specialized 
structure, with two arrays.


In the above, [blank] entries, which are about 1/2 to 2/3 of the 
entries, take the same space as real entries (12 to 24 bytes).  Raymond 
H. has proposed that the standard dict have two arrays like so:


1. the first array is a sparse array of indexes into the second array: 
[b, b, 2, b, 0, b, b, b, 1, b] (where b might be -1, interpreted as 
blank), using only as many bytes as needed for the maximum index.


2. the second array is a compact array of entries in insertion order, 
such as


[hash, ptr to 'x', ptr to 23]
[hash, ptr to 'colour', ptr to 'red']
[hash, ptr to the string 'y', ptr to the int 42]

Iteration would use the compact array, making all dicts OrderedDicts. 
PyPy has already switched to this.  It seems that on modern processors 
with multilevel on-chip caches, the space reduction leads to cache-miss 
reductions that compensate for the indirection cost.
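
A toy sketch of that two-array layout (an illustration of the idea, not 
CPython or PyPy source; it ignores duplicate keys and deletion):

def build(pairs, size=16):
    indices = [None] * size        # sparse: slot -> position in entries
    entries = []                   # dense: (hash, key, value), insertion order
    for key, value in pairs:
        h = hash(key)
        slot = h % size
        while indices[slot] is not None:   # linear probing on collision
            slot = (slot + 1) % size
        indices[slot] = len(entries)
        entries.append((h, key, value))
    return indices, entries

indices, entries = build([('x', 23), ('colour', 'red'), ('y', 42)])
# iterating the dense array gives insertion order for free:
print([(k, v) for _, k, v in entries])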



--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: Idiomatic backtracking in Python

2015-01-25 Thread Ben Finney
Johannes Bauer  writes:

> So, I would like to ask if you have a Pythonic approach to
> backtracking problems? If so, I'd love to hear your solutions!

I'm not aware of what the problem is. “Back-tracking” doesn't have a
general meaning I recognise beyond random access into a data structure.
So a Python list is the obvious starting point.

Do you have a specific meaning of “back-tracking” that implies
particular problems that are difficult to solve?

-- 
 \   “The Vatican is not a state.… a state must have people. There |
  `\are no Vaticanians.… No-one gets born in the Vatican except by |
_o__)an unfortunate accident.” —Geoffrey Robertson, 2010-09-18 |
Ben Finney

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Idiomatic backtracking in Python

2015-01-25 Thread MRAB

On 2015-01-26 00:32, Ben Finney wrote:

Johannes Bauer  writes:


So, I would like to ask if you have a Pythonic approach to
backtracking problems? If so, I'd love to hear your solutions!


I'm not aware of what the problem is. “Back-tracking” doesn't have a
general meaning I recognise beyond random access into a data structure.
So a Python list is the obvious starting point.

Do you have a specific meaning of “back-tracking” that implies
particular problems that are difficult to solve?


I believe he's talking about trying different alternatives until one
works. You get it in matching regexes (well, when using the NFA
approach as used in the re module, Perl regexes, etc, anyway).
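
A tiny illustration of that backtracking in the re module:

>>> import re
>>> re.match(r'(a+)(ab)', 'aaab').groups()
('aa', 'ab')

The greedy a+ first grabs all three a's, leaves no room for 'ab', and
then gives characters back one at a time until the whole pattern matches.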

--
https://mail.python.org/mailman/listinfo/python-list


Re: Alternative to multi-line lambdas: Assign-anywhere def statements

2015-01-25 Thread Yawar Amin
Hi Steven,

On Saturday, January 24, 2015 at 11:27:33 PM UTC-5, Steven D'Aprano
wrote:
> [...]
> > Doesn't the traceback tell us exactly where the lambda was called
> > from?
> 
> Yes (assuming the source code is available, which it may not be), but

If the source code is not available, then you're equally unable to debug
a lambda function and a named function.

> even so, which is easier?
> 
> "Oh, the exception occurred in myfunc"
> 
> versus
> 
> "Hmmm, the exception occurred in a lambda, lets see now, that was
> called by "f", but f might have called seven different lambdas, which
> [...]

True, Python's current traceback reporting doesn't tell us the exact
column number on which the exception occurred. So for example with this:

print((lambda: 1)(), (lambda: 1 / 0)())

You will get the following traceback:

Traceback (most recent call last):
  File "test.py", line 3, in <module>
    print((lambda: 1)(), (lambda: 1 / 0)())
  File "test.py", line 3, in <lambda>
    print((lambda: 1)(), (lambda: 1 / 0)())
ZeroDivisionError: integer division or modulo by zero

So the problem is there are two lambdas in line 3, so you need to
examine them both to figure out which one caused the exception. The
simple solution to this is to just put the lambdas on different lines:

print(
  (lambda: 1)(),
  (lambda: 1 / 0)()
)

Now you get a blindingly obvious traceback:

Traceback (most recent call last):
  File "test.py", line 5, in <module>
    (lambda: 1 / 0)()
  File "test.py", line 5, in <lambda>
    (lambda: 1 / 0)()
ZeroDivisionError: integer division or modulo by zero


> Using lambda is trading off convenience when writing the code for ease
> of debugging the code when a problem occurs. Whether that trade-off is
> worthwhile or not depends on factors such as how likely the lambda is
> to have a bug, how many of them there are, and whether or not there is
> uncertainty as to which one is called from where.

I feel like this whole 'debugging' argument is a red herring, because
you can have _many_ different expressions in a line of code all of which
could have potentially caused an exception you're trying to debug. For
example, dispensing with the lambdas:

print(1, 1 / 0)

Error:

Traceback (most recent call last):
  File "test.py", line 3, in <module>
    print(1, 1 / 0)
ZeroDivisionError: integer division or modulo by zero

Again, it all comes down to Python just not telling you precisely where
the error occurred, and instead only giving you the general vicinity.
Again, this can be fixed simply by just putting the expressions on
different lines.

Regards,

Yawar
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Idiomatic backtracking in Python

2015-01-25 Thread Marko Rauhamaa
Ben Finney :

> “Back-tracking” doesn't have a general meaning I recognise beyond
> random access into a data structure.

Backtracking means the part of depth-first traversal where you retreat
to the parent node. If you implement your traversal with a recursive
function, backtracking means — more or less — a return from the
function.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Idiomatic backtracking in Python

2015-01-25 Thread Chris Angelico
On Mon, Jan 26, 2015 at 12:31 PM, Marko Rauhamaa  wrote:
> Backtracking means the part of depth-first traversal where you retreat
> to the parent node. If you implement your traversal with a recursive
> function, backtracking means — more or less — a return from the
> function.

But possibly you need to return from N levels, not just one. Imagine
you're traversing a directory tree using a simplistic recursive
algorithm:

def traverse(dir):
    for d in subdirs(dir): traverse(d)
    for f in files(dir):
        if file_size(f) > 1024*1024*1024: skip this entire branch of the tree

Obviously you have to define the branch somehow, but there are plenty
of times when you might want to break out of "everything up to here".
How do you define that? How do you return lots of levels all at once?
I remember facing this exact problem in trying to solve a particular
piece-layout puzzle; if I discovered an impossible situation, I could
actually discard at least two or three levels of recursion all at
once, because there's no way that the situation could become
un-impossible within those levels. Can't remember how I ended up
dealing with that... I think I got naughty and used a global variable.
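
One hedged sketch of that multi-level unwinding, using an exception
instead of a global (here the "branch" is arbitrarily each top-level
subdirectory, and huge files directly in the root are left unhandled):

import os

class SkipBranch(Exception):
    """Abandon the entire current branch, however deep the recursion is."""

GIG = 1024 * 1024 * 1024

def traverse(dir, is_branch_root=True):
    for name in os.listdir(dir):
        path = os.path.join(dir, name)
        if os.path.isdir(path):
            try:
                traverse(path, is_branch_root=False)
            except SkipBranch:
                if not is_branch_root:
                    raise        # not our branch: keep unwinding
                # at the branch root: swallow it, move to the next subtree
        elif os.path.getsize(path) > GIG:
            raise SkipBranch     # pops N levels of recursion in one go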

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: __bases__ misleading error message

2015-01-25 Thread Ian Kelly
On Jan 25, 2015 2:37 PM, "Terry Reedy"  wrote:
> 2. the second array is a compact array of entries in insertion order,
such as
>
> [hash, ptr to 'x', ptr to 23]
> [hash, ptr to 'colour', ptr to 'red']
> [hash, ptr to the string 'y', ptr to the int 42]
>
> Iteration would use the compact array, making all dicts OrderedDicts.
Pypy has already switched to this.  It seems that on modern processors with
multilevel on-chip caches, the space reduction leads to cache-miss
reductions that compensate for the indirection cost.

Deletion becomes O(n) though. Has there been any investigation into how
commonly deletion of keys is done?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Idiomatic backtracking in Python

2015-01-25 Thread Rustom Mody
On Monday, January 26, 2015 at 2:21:34 AM UTC+5:30, Ian Foote wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> Hi,
> 
> I think a very idiomatic way to implement backtracking is using a
> recursive generator (python 3):
> 
> def backtrack_solver(data=None):
>     if data is None:
>         yield from backtrack_solver(data=initial_data)
>         return
> 
>     if cannot_be_valid(data):
>         return
> 
>     if matches_condition(data):
>         yield data
>         return
> 
>     for new_data in process(data):
>         yield from backtrack_solver(new_data)
> 
> This generator will yield valid solutions to a suitably defined problem.
> 
> `initial_data`, `cannot_be_valid`, `matches_condition` and `process`
> should be replaced with appropriate implementation for your problem.
> 
> For example, a sudoku solver could be fit to this by accepting a
> partially solved grid as the `data` parameter.
> 
> `cannot_be_valid` would now detect grids that have, say, two `1`s in a
> row or any other invalid grid state and exit.
> 
> `matches_condition` would detect a fully solved grid.
> 
> `process` would produce new grids with more cells filled in than the
> current grid.
> 
> `initial_data` wouldn't be strictly necessary here, but you could use
> it for an example grid. It could also be an empty grid, and the solver
> would then yield all valid grids.
> 
> Regards,
> Ian F
> 
> On 25/01/15 20:15, Johannes Bauer wrote:
> > Hi folks,
> > 
> > I have a problem at hand that needs code for backtracking as a
> > solution. And I have no problem coding it, but I can't get rid of
> > the feeling that I'm always solving backtracking problems in a
> > non-Pythonic (non-idiomatic) way. So, I would like to ask if you
> > have a Pythonic approach to backtracking problems? If so, I'd love
> > to hear your solutions!
> > 
> > Cheers, Johannes

To add to Ian:

The classic way of doing it in a functional framework is called:
"Replace failure by list of successes"

https://rkrishnan.org/files/wadler-1985.pdf

The things that have to go into it are
1. Extensive use of list comprehensions
2. Lazy lists

Just change 'list' to 'generator' in the above
and more or less it should work in Python

More or less that means when you have a comprehension
[expr for x in list1 for y in list2 etc]

change the '[]' to '()'
and recursively change the list1, list2 to gen1, gen2

Some nuisances that bug me (or I don't know how to handle):

1. Singleton
   [val] becomes (x for x in [val])  (Hoo Boy!)

   Or 
   def single_val(): yield val

2. Nice syntax like list +

Compare [1] + [2]
with

>>> from itertools import chain
>>> a = (x for x in [1])
>>> b = (x for x in [2])
>>> list(chain(a,b))
[1, 2]

Of course it looks worse because the (syntax) overhead of
jumping between lists and generators overwhelms in this trivial case
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Alternative to multi-line lambdas: Assign-anywhere def statements

2015-01-25 Thread Steven D'Aprano
Yawar Amin wrote:

> Hi Steven,
> 
> On Saturday, January 24, 2015 at 11:27:33 PM UTC-5, Steven D'Aprano
> wrote:
>> [...]
>> > Doesn't the traceback tell us exactly where the lambda was called
>> > from?
>> 
>> Yes (assuming the source code is available, which it may not be), but
> 
> If the source code is not available, then you're equally unable to debug
> a lambda function and a named function.

If source code to a library is unavailable, debugging is significantly more
difficult, but it may still be possible. At least in the sense that you can
identify the nature of the bug and develop a work-around in your own code.
Obviously there needs to be *some* source code for you to edit, otherwise
there's nothing for you to edit :-)

Even in the absence of source code, you may still be able to report a bug to
the application developer and show the full traceback. Even without source,
tracebacks show the names of functions, and lambdas all share the same
name:

[steve@ando ~]$ cat demo.py
def fail(x):
    f = lambda a: 1/a
    g = lambda a: f(a-1)
    return g(x)

print fail(1)
[steve@ando ~]$ python -m compileall demo.py
Compiling demo.py ...
[steve@ando ~]$ python demo.pyc
Traceback (most recent call last):
  File "demo.py", line 6, in <module>
    print fail(1)
  File "demo.py", line 4, in fail
    return g(x)
  File "demo.py", line 3, in <lambda>
    g = lambda a: f(a-1)
  File "demo.py", line 2, in <lambda>
    f = lambda a: 1/a
ZeroDivisionError: integer division or modulo by zero
[steve@ando ~]$ mv demo.py demo-save.py  # hide the source
[steve@ando ~]$ python demo.pyc
Traceback (most recent call last):
  File "demo.py", line 6, in <module>
  File "demo.py", line 4, in fail
  File "demo.py", line 3, in <lambda>
  File "demo.py", line 2, in <lambda>
ZeroDivisionError: integer division or modulo by zero


In this toy example, it is still easy to debug even without names. It's a
three line function, how hard could it be? :-) But that isn't always going
to be the case.



[...]
> True, Python's current traceback reporting doesn't tell us the exact
> column number on which the exception occurred. So for example with this:
[...]
> So the problem is there are two lambdas in line 3, so you need to
> examine them both to figure out which one caused the exception. The
> simple solution to this is to just put the lambdas on different lines:
> 
> print(
>   (lambda: 1)(),
>   (lambda: 1 / 0)()
> )
> 
> Now you get a blindingly obvious traceback:


It's blindingly obvious because it is trivial. You're still thinking about
the easy cases, not the hard cases:

py> def make_funcs():
...     closures = []
...     for i in range(5, -5, -1):
...         closures.append(lambda x, i=i: x/i)
...     return closures
...
py> results = [f(5) for f in make_funcs()]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
  File "<stdin>", line 4, in <lambda>
ZeroDivisionError: division by zero


That's still not a hard case, because all the closures have a very similar
form and it's easy to spot that one of them divides by zero. But it
demonstrates one way that line numbers alone don't help.

But do understand I'm not saying that lambdas cannot be debugged. I'm saying
that *function names are a useful debugging tool*, so if you take away the
function name that makes debugging just that little bit harder. Not
necessarily in every single case, and not necessarily hard to the point
that you spend days trying to track down the error, but hard enough in
enough circumstances that using lambda for arbitrarily complex functions
would be a bad idea. (The same applies to comprehensions: they should be
kept simple.) That shouldn't be controversial.



-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Track down SIGABRT

2015-01-25 Thread dieter
Israel Brewster  writes:
> ...
> Running it through GDB didn't seem to give me much more information than that 
> crash report - I can see that it is in thread 0, and the call stack is 
> identical to what the crash report shows, but I can't see where in the python 
> code it is.
> ...
> Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
> 0   libsystem_kernel.dylib0x98176be6 __select + 10
> 1   time.so   0x0052fdda 0x52f000 + 3546
> 2   org.python.python 0x0007e90a PyCFunction_Call + 98
> 3   org.python.python 0x000ccef4 PyEval_EvalFrameEx + 8182
> 4   org.python.python 0x000caeb2 PyEval_EvalCodeEx + 1777

This tells you: the crash happens somewhere in a call to a function
defined in "time.so".
I suppose that all such functions are harmless in themselves.


I see two potential causes:

  *  a stack overflow
 some Python builds are made with quite a small thread runtime stack.
 However, you would usually get a "SIGSEGV" rather than
 a "SIGABRT".

  *  a bad pointer passed in to the function.
 Again, I would expect rather a "SIGSEGV" than a "SIGABRT" --
 but maybe "__select" has a special sanity check for its arguments
 and raises "SIGABRT" when the check fails.

In order to determine where in your Python code the problem happens,
you likely need a Python with debugging symbols (system-installed
Python interpreters are usually "stripped", i.e. they no longer
have debugging symbols).

Python comes with GDB macros able to show you information
at the Python (rather than the C) level. Here (Ubuntu 12.4),
it is at "/usr/share/doc/python2.7/gdbinit.gz".
Those macros need debugging symbols.
Read the top of this file, for how to get those macros defined
in your GDB session.


I doubt that being able to relate the problem to code
at the Python level will help you much. Your "SIGABRT" problem
is likely not caused on the Python level (but at C level).

If it is not caused by a runtime stack overflow, the cause is likely
a memory corruption, made non locally (far away) in some C extension.
Using a tool to track down memory corruption might help you
(however, in my experience, it is usually a huge task).

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Idiomatic backtracking in Python

2015-01-25 Thread Jussi Piitulainen
Rustom Mody writes:

> To add to Ian:
> 
> The classic way of doing it in a functional framework is called:
> "Replace failure by list of successes"
> 
> https://rkrishnan.org/files/wadler-1985.pdf
> 
> The things that have to go into it are
> 1. Extensive use of list comprehensions
> 2. Lazy lists
> 
> Just change 'list' to 'generator' in the above
> and more or less it should work in Python
> 
> More or less that means when you have a comprehension
> [expr for x in list1 for y in list2 etc]
> 
> change the '[]' to '()'
> and recursively change the list1, list2 to gen1, gen2
> 
> Some nuisances that bug me (or I don't know how to handle):
> 
> 1. Singleton
>[val] becomes (x for x in [val])  (Hoo Boy!)
> 
>Or 
>def single_val(): yield val

It's just one def:

def sing(*args):
    for arg in args:
        yield arg

Or not even that: iter((val,)).

But see below at the end.

> 2. Nice syntax like list +
> 
> Compare [1] + [2]
> with
> 
> >>> from itertools import chain
> >>> a = (x for x in [1])
> >>> b = (x for x in [2])
> >>> list(chain(a,b))
> [1, 2]
> 
> Of course it looks worse because the (syntax) overhead of jumping
> between lists and generators overwhelms in this trivial case

Yes, one wouldn't jump back and forth so much. Just collect the result
at the end.

>>> list(chain(sing(3,1,4), sing(1), sing(5,9,2,6)))
[3, 1, 4, 1, 5, 9, 2, 6]

>>> list(chain(sing(3,1,4,1), sing(), sing(5,9,2,6)))
[3, 1, 4, 1, 5, 9, 2, 6]

And it gets better: chain takes lists! And tuples! For those who like
their brackets round:

>>> tuple(chain((3,1,4,1), (), (5,9,2,6)))
(3, 1, 4, 1, 5, 9, 2, 6)

>>> set(chain((3,)))
{3}

>>> list(chain(()))
[]

So chain does it all.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Idiomatic backtracking in Python

2015-01-25 Thread Rustom Mody
On Monday, January 26, 2015 at 12:52:04 PM UTC+5:30, Jussi Piitulainen wrote:
> Rustom Mody writes:
> 
> > To add to Ian:
> > 
> > The classic way of doing it in a functional framework is called:
> > "Replace failure by list of successes"
> > 
> > https://rkrishnan.org/files/wadler-1985.pdf
> > 
> > The things that have to go into it are
> > 1. Extensive use of list comprehensions
> > 2. Lazy lists
> > 
> > Just change 'list' to 'generator' in the above
> > and more or less it should work in Python
> > 
> > More or less that means when you have a comprehension
> > [expr for x in list1 for y in list2 etc]
> > 
> > change the '[]' to '()'
> > and recursively change the list1, list2 to gen1, gen2
> > 
> > Some nuisances that bug me (or I don't know how to handle):
> > 
> > 1. Singleton
> >[val] becomes (x for x in [val])  (Hoo Boy!)
> > 
> >Or 
> >def single_val(): yield val
> 
> It's just one def:
> 
> def sing(*args):
>     for arg in args:
>         yield arg
> 
> Or not even that: iter((val,)).


Sweet


> 
> But see below at the end.
> 
> > 2. Nice syntax like list +
> > 
> > Compare [1] + [2]
> > with
> > 
> > >>> from itertools import chain
> > >>> a = (x for x in [1])
> > >>> b = (x for x in [2])
> > >>> list(chain(a,b))
> > [1, 2]
> > 
> > Of course it looks worse because the (syntax) overhead of jumping
> > between lists and generators overwhelms in this trivial case
> 
> Yes, one wouldn't jump back and forth so much. Just collect the result
> at the end.
> 
> >>> list(chain(sing(3,1,4), sing(1), sing(5,9,2,6)))
> [3, 1, 4, 1, 5, 9, 2, 6]
> 
> >>> list(chain(sing(3,1,4,1), sing(), sing(5,9,2,6)))
> [3, 1, 4, 1, 5, 9, 2, 6]
> 
> And it gets better: chain takes lists! And tuples! For those who like
> their brackets round:
> 
> >>> tuple(chain((3,1,4,1), (), (5,9,2,6)))
> (3, 1, 4, 1, 5, 9, 2, 6)
> 
> >>> set(chain((3,)))
> {3}
> 
> >>> list(chain(()))
> []
> 
> So chain does it all.

Neat 
Thanks

-- 
https://mail.python.org/mailman/listinfo/python-list