date:20160424


On 04/23/2016 06:29 PM, Ian Kelly wrote:


Python enums are great. Sadly, they're still not quite as awesome as Java enums.


What fun things can Java enums do?

--
~Ethan~

--
https://mail.python.org/mailman/listinfo/python-list

Re: How much sanity checking is required for function inputs?


On 04/23/2016 06:21 PM, Michael Selik wrote:

On Sat, Apr 23, 2016 at 9:01 PM Christopher Reimer wrote:



Hmm... What do we use Enum for? :)


You can use Enum in certain circumstances to replace int or str constants.
It can help avoid mistyping mistakes and might help your IDE give
auto-complete suggestions. I haven't found a good use for them myself, but
I'd been mostly stuck in Python 2 until recently.


enum34 is the backport, aenum is the turbo charged version.

  https://pypi.python.org/pypi/enum34
  https://pypi.python.org/pypi/aenum
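
A minimal sketch of the use described above, with an Enum standing in for str constants. The names `Suit`, `describe` and `parse_suit` are hypothetical and chosen only for illustration; the stdlib `enum` module (or the enum34 backport) is assumed:

```python
from enum import Enum


class Suit(Enum):
    # Hypothetical constants; previously these might have been bare strings.
    HEARTS = 'hearts'
    SPADES = 'spades'


def describe(suit):
    """Reject anything that is not a Suit member instead of guessing."""
    if not isinstance(suit, Suit):
        raise TypeError('expected a Suit, got {!r}'.format(suit))
    return 'the suit is ' + suit.name.lower()


def parse_suit(text):
    """Look a member up by value; a mistyped string raises ValueError."""
    return Suit(text)
```

A mistyped member name such as `Suit.HAERTS` fails immediately with AttributeError, whereas a mistyped string constant would be passed along silently.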

--
~Ethan~
--
https://mail.python.org/mailman/listinfo/python-list

Re: A pickle problem!

2016-04-24 Thread Fabien


On 04/21/2016 11:43 PM, Paulo da Silva wrote:

class C(pd.DataFrame):


Note also that subclassing pandas is not always encouraged:

http://pandas.pydata.org/pandas-docs/stable/internals.html#subclassing-pandas-data-structures

Cheers,

Fabien
--
https://mail.python.org/mailman/listinfo/python-list

RE: Remove directory tree without following symlinks

2016-04-24 Thread Albert-Jan Roskam


> From: eryk...@gmail.com
> Date: Sat, 23 Apr 2016 15:22:35 -0500
> Subject: Re: Remove directory tree without following symlinks
> To: python-list@python.org
> 
> On Sat, Apr 23, 2016 at 4:34 AM, Albert-Jan Roskam
>  wrote:
>>
>>> From: eryk...@gmail.com
>>> Date: Fri, 22 Apr 2016 13:28:01 -0500
>>> On Fri, Apr 22, 2016 at 12:39 PM, Albert-Jan Roskam
>>>  wrote:
>>>> FYI, Just today I found out that shutil.rmtree raises a WindowsError if
>>>> the dir is read-only (or its contents). Using 'ignore_errors', won't help.
>>>> Sure, no error is raised, but the dir is not deleted either! A 'force'
>>>> option would be a nice improvement.
>>>
>>> Use the onerror handler to call os.chmod(path, stat.S_IWRITE). For
>>> example, see pip's rmtree_errorhandler:
>>>
>>> https://github.com/pypa/pip/blob/8.1.1/pip/utils/__init__.py#L105
>>
>> Thanks, that looks useful indeed. I thought about os.chmod, but with
>> os.walk. That seemed expensive. So I used subprocess.call('rmdir "%s" /s /q'
>> % dirname). That's Windows only, of course, but aside of that, is using
>> subprocess less preferable?
> 
> I assume you used shell=True in the above call, and not an external
> rmdir.exe. There are security concerns with using the shell if you're
> not in complete control of the command line.
> 
> As to performance, cmd's rmdir wins without question, not only because
> it's implemented in C, but also because it uses the stat data from the
> WIN32_FIND_DATA returned by FindFirstFile/FindNextFile to check for
> FILE_ATTRIBUTE_DIRECTORY and FILE_ATTRIBUTE_READONLY.
> 
> On the other hand, Python wins when it comes to working with deeply
> nested directories. Paths in cmd are limited to MAX_PATH characters.
> rmdir uses DOS 8.3 short names (i.e. cAlternateFileName in
> WIN32_FIND_DATA), but that could still exceed MAX_PATH for a deeply
> nested tree, or the volume may not even have 8.3 DOS filenames.
> shutil.rmtree allows you to work around the DOS limit by prefixing the
> path with "\\?\". For example:
> 
> >>> subprocess.call(r'rmdir /q/s Z:\Temp\long', shell=True)
> The path Z:\Temp\long\aa
> 
> 
> 
> a is too long.
> 0
> 
> >>> shutil.rmtree(r'\\?\Z:\Temp\long')
> >>> os.path.exists(r'Z:\Temp\long')
> False
> 
> Using "\\?\" requires a path that's fully qualified, normalized
> (backslash only), and unicode (i.e. decode a Python 2 str).

Aww, I kinda forgot about that already, but I came across this last year [1].
Apparently, shutil.rmtree(very_long_path) failed under Win 7, even with the
"silly prefix". I believe very_long_path was a Python 2 str.
It would be useful if shutil or os.path automatically prefixed paths with
"\\?\". It is rarely really needed, though.
(In my case it was needed to copy a bunch of MS Outlook .msg files, which
automatically get the subject line as the filename, or perhaps
the first sentence of the mail if the mail has no subject.)

[1] https://mail.python.org/pipermail/python-list/2015-June/693156.html
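
For reference, the onerror approach suggested earlier in the thread can be sketched roughly as follows. This is not pip's actual handler; `force_rmtree` and `_clear_readonly` are hypothetical names, and the version check is there only because `onerror` is deprecated in favour of `onexc` from Python 3.12 on:

```python
import os
import shutil
import stat
import sys


def _clear_readonly(func, path, exc):
    """Error handler: make the path writable, then retry the failed call.

    `func` is the os function that failed (e.g. os.remove or os.rmdir);
    `exc` is the exception information, which is ignored here.
    """
    os.chmod(path, stat.S_IWRITE)
    func(path)


def force_rmtree(path):
    """Remove a tree, clearing the read-only attribute where removal fails."""
    if sys.version_info >= (3, 12):
        shutil.rmtree(path, onexc=_clear_readonly)
    else:
        shutil.rmtree(path, onerror=_clear_readonly)
```

The handler is only invoked for entries that actually fail, so there is no separate os.walk pass over the whole tree.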

  
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: How much sanity checking is required for function inputs?


On 04/23/2016 06:00 PM, Christopher Reimer wrote:


Hmm... What do we use Enum for? :)


from enum import Enum

class Piece(Enum):
    king = 'one space, any direction'
    queen = 'many spaces, any direction'
    bishop = 'many spaces, diagonal'
    knight = 'two spaces cardinal, one space sideways, cannot be blocked'
    rook = 'many spaces, cardinal'
    pawn = ('first move: one or two spaces forward; subsequent moves: '
            'one space forward; attack: one space diagonal')


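--> list(Piece)
[
<Piece.king: 'one space, any direction'>,
<Piece.queen: 'many spaces, any direction'>,
<Piece.bishop: 'many spaces, diagonal'>,
<Piece.knight: 'two spaces cardinal, one space sideways, cannot be blocked'>,
<Piece.rook: 'many spaces, cardinal'>,
<Piece.pawn: 'first move: one or two spaces forward; subsequent moves: one space forward; attack: one space diagonal'>,
]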

--> p = Piece.bishop
--> p in Piece
True

--> p is Piece.rook
False

--> p is Piece.bishop
True

--
~Ethan~

--
https://mail.python.org/mailman/listinfo/python-list

Comparing Python enums to Java, was: How much sanity checking is required for function inputs?

2016-04-24 Thread Ian Kelly

On Sun, Apr 24, 2016 at 1:20 AM, Ethan Furman  wrote:
> On 04/23/2016 06:29 PM, Ian Kelly wrote:
>
>> Python enums are great. Sadly, they're still not quite as awesome as Java
>> enums.
>
>
> What fun things can Java enums do?

Everything that Python enums can do, plus:

* You can override methods of individual values, not just the class as
a whole. Good for implementing the strategy pattern, or for defining a
default method implementation that one or two values do differently.
In Python you can emulate the same thing by adding the method directly
to the instance dict of the enum value, so this isn't really all that
much of a difference.

* Java doesn't have the hokey notion of enum instances being distinct
from their "value". The individual enum members *are* the values.
Whereas in Python an enum member is an awkward class instance that
contains a value of some other type. Python tries to get away from the
C-like notion that enums are ints by making the enum members
non-comparable, but then gives us IntEnum as a way to work around it
if we really want to. Since Java enums don't depend on any other type
for their values, there's nothing inviting the user to treat enums as
ints in the first place.

* As a consequence of the above, Java doesn't conflate enum values
with their parameters. The Python enum docs give us this interesting
example of an enum that takes arguments from its declaration:

>>> class Planet(Enum):
...     MERCURY = (3.303e+23, 2.4397e6)
...     VENUS   = (4.869e+24, 6.0518e6)
...     EARTH   = (5.976e+24, 6.37814e6)
...     MARS    = (6.421e+23, 3.3972e6)
...     JUPITER = (1.9e+27,   7.1492e7)
...     SATURN  = (5.688e+26, 6.0268e7)
...     URANUS  = (8.686e+25, 2.5559e7)
...     NEPTUNE = (1.024e+26, 2.4746e7)
...     def __init__(self, mass, radius):
...         self.mass = mass       # in kilograms
...         self.radius = radius   # in meters
...     @property
...     def surface_gravity(self):
...         # universal gravitational constant  (m3 kg-1 s-2)
...         G = 6.67300E-11
...         return G * self.mass / (self.radius * self.radius)
...
>>> Planet.EARTH.value
(5.976e+24, 6378140.0)
>>> Planet.EARTH.surface_gravity
9.802652743337129

This is incredibly useful, but it has a flaw: the value of each member
of the enum is just the tuple of its arguments. Suppose we added a
value for COUNTER_EARTH describing a hypothetical planet with the same
mass and radius existing on the other side of the sun. [1] Then:

>>> Planet.EARTH is Planet.COUNTER_EARTH
True

Because they have the same "value", instead of creating a separate
member, COUNTER_EARTH gets defined as an alias for EARTH. To work
around this, one would have to add a third argument to the above to
pass in an additional value for the sole purpose of distinguishing (or
else adapt the AutoNumber recipe to work with this example). This
example is a bit contrived since it's generally not likely to come up
with floats, but it can easily arise (and in my experience frequently
does) when the arguments are of more discrete types. It's notable that
the Java enum docs feature this very same example but without this
weakness. [2]

* Speaking of AutoNumber, since Java enums don't have the
instance/value distinction, they effectively do this implicitly, only
without generating a bunch of ints that are entirely irrelevant to
your enum type. With Python enums you have to follow a somewhat arcane
recipe to avoid specifying values, which just generates some values
and then hides them away. And it also breaks the Enum alias feature:

>>> class Color(AutoNumber):
...     red = default = ()  # not an alias!
...     blue = ()
...
>>> Color.red is Color.default
False

Anyroad, I think that covers all my beefs with the way enums are
implemented in Python. Despite the above, they're a great feature, and
I use them and appreciate that we have them.

[1] https://en.wikipedia.org/wiki/Counter-Earth
[2] https://docs.oracle.com/javase/tutorial/java/javaOO/enum.html
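
The aliasing described above, and the "third argument" workaround, can be sketched like this. The trimmed-down `Planet` and the `TaggedPlanet` class with its `tag` parameter are illustrative assumptions, not code from the enum docs:

```python
from enum import Enum


class Planet(Enum):
    EARTH = (5.976e+24, 6.37814e6)
    COUNTER_EARTH = (5.976e+24, 6.37814e6)   # equal value: becomes an alias

    def __init__(self, mass, radius):
        self.mass = mass       # in kilograms
        self.radius = radius   # in meters


class TaggedPlanet(Enum):
    # Hypothetical workaround: a leading tag makes every value unique.
    EARTH = (1, 5.976e+24, 6.37814e6)
    COUNTER_EARTH = (2, 5.976e+24, 6.37814e6)

    def __init__(self, tag, mass, radius):
        self.mass = mass
        self.radius = radius


alias_names = [member.name for member in Planet]          # aliases are skipped
tagged_names = [member.name for member in TaggedPlanet]   # both members appear
```

The tag exists purely to keep the two tuples from comparing equal, which is the extra bookkeeping being complained about.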
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Comparing Python enums to Java, was: How much sanity checking is required for function inputs?


On 04/24/2016 08:20 AM, Ian Kelly wrote:

On Sun, Apr 24, 2016 at 1:20 AM, Ethan Furman wrote:

On 04/23/2016 06:29 PM, Ian Kelly wrote:



Python enums are great. Sadly, they're still not quite as awesome as Java
enums.



What fun things can Java enums do?


Everything that Python enums can do, plus:

* You can override methods of individual values, not just the class as
a whole. Good for implementing the strategy pattern, or for defining a
default method implementation that one or two values do differently.
In Python you can emulate the same thing by adding the method directly
to the instance dict of the enum value, so this isn't really all that
much of a difference.


All non-dunder methods, at least.
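
A rough sketch of that emulation, assuming a hypothetical `Shape` enum: the override is bound to a single member with `types.MethodType`, so the other members keep the class-level method.

```python
import types
from enum import Enum


class Shape(Enum):
    SQUARE = 4
    CIRCLE = 0

    def describe(self):
        # Default implementation shared by all members.
        return '{} has {} corners'.format(self.name, self.value)


def describe_circle(self):
    # Replacement used by one member only.
    return '{} is round'.format(self.name)


# Stored in the member's instance dict, so it shadows the class method.
Shape.CIRCLE.describe = types.MethodType(describe_circle, Shape.CIRCLE)
```

This does not work for dunder methods because Python looks special methods up on the type, not on the instance.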


* Java doesn't have the hokey notion of enum instances being distinct
from their "value". The individual enum members *are* the values.
Whereas in Python an enum member is an awkward class instance that
contains a value of some other type. Python tries to get away from the
C-like notion that enums are ints by making the enum members
non-comparable, but then gives us IntEnum as a way to work around it
if we really want to. Since Java enums don't depend on any other type
for their values, there's nothing inviting the user to treat enums as
ints in the first place.


How does Java share enums with other programs, computers, and/or languages?

As far as value-separate-from-instance: if you want/need them to be the 
same thing, mix-in the type:


class Planet(float, Enum):
    ...

[see below for "no-alias" ideas/questions]

NB: The enum and the value are still different ('is' fails) but equal.
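
A small sketch of the mix-in, using a hypothetical `Gravity` enum rather than the full Planet example:

```python
from enum import Enum


class Gravity(float, Enum):
    # Hypothetical members; values are surface gravity in m/s^2.
    EARTH = 9.8
    MOON = 1.62


# Members behave as floats in arithmetic and comparisons...
doubled = Gravity.EARTH * 2
equal_to_value = Gravity.EARTH == Gravity.EARTH.value
# ...but a member and its value are still two different objects.
same_object = Gravity.EARTH is Gravity.EARTH.value
```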


* As a consequence of the above, Java doesn't conflate enum values
with their parameters. The Python enum docs give us this interesting
example of an enum that takes arguments from its declaration:


>>> class Planet(Enum):
...     MERCURY = (3.303e+23, 2.4397e6)
...     VENUS   = (4.869e+24, 6.0518e6)
...     EARTH   = (5.976e+24, 6.37814e6)
...     MARS    = (6.421e+23, 3.3972e6)
...     JUPITER = (1.9e+27,   7.1492e7)
...     SATURN  = (5.688e+26, 6.0268e7)
...     URANUS  = (8.686e+25, 2.5559e7)
...     NEPTUNE = (1.024e+26, 2.4746e7)
...     def __init__(self, mass, radius):
...         self.mass = mass       # in kilograms
...         self.radius = radius   # in meters
...     @property
...     def surface_gravity(self):
...         # universal gravitational constant  (m3 kg-1 s-2)
...         G = 6.67300E-11
...         return G * self.mass / (self.radius * self.radius)
...
>>> Planet.EARTH.value
(5.976e+24, 6378140.0)
>>> Planet.EARTH.surface_gravity
9.802652743337129

This is incredibly useful, but it has a flaw: the value of each member
of the enum is just the tuple of its arguments. Suppose we added a
value for COUNTER_EARTH describing a hypothetical planet with the same
mass and radius existing on the other side of the sun. [1] Then:


>>> Planet.EARTH is Planet.COUNTER_EARTH
True

Because they have the same "value", instead of creating a separate
member, COUNTER_EARTH gets defined as an alias for EARTH. To work
around this, one would have to add a third argument to the above to
pass in an additional value for the sole purpose of distinguishing (or
else adapt the AutoNumber recipe to work with this example). This
example is a bit contrived since it's generally not likely to come up
with floats, but it can easily arise (and in my experience frequently
does) when the arguments are of more discrete types. It's notable that
the Java enum docs feature this very same example but without this
weakness. [2]


One reason for this is that Python enums are lookup-able via the value:

>>> Planet(9.80265274333129)
Planet.EARTH

Do Java enums not have such a feature, or is this "feature" totally
unnecessary in Java?


I could certainly add a "no-alias" feature to aenum.  What would be the 
appropriate value-lookup behaviour in such cases?


- return the first match
- return a list of matches
- raise an error
- disable value-lookups for that Enum
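
For comparison, here is a sketch of how value lookup behaves in the stdlib Enum today, assuming a trimmed-down tuple-valued `Planet`. The lookup takes the whole value (the tuple) and, with aliases present, effectively behaves like the "return the first match" option:

```python
from enum import Enum


class Planet(Enum):
    EARTH = (5.976e+24, 6.37814e6)
    MARS = (6.421e+23, 3.3972e6)
    COUNTER_EARTH = (5.976e+24, 6.37814e6)   # alias for EARTH


# Lookup by value: the first member defined with that value is returned.
found = Planet((5.976e+24, 6.37814e6))

# Aliases stay reachable by name, but iteration skips them.
all_names = list(Planet.__members__)
iterated_names = [member.name for member in Planet]
```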


* Speaking of AutoNumber, since Java enums don't have the
instance/value distinction, they effectively do this implicitly, only
without generating a bunch of ints that are entirely irrelevant to
your enum type. With Python enums you have to follow a somewhat arcane
recipe to avoid specifying values, which just generates some values
and then hides them away. And it also breaks the Enum alias feature:


>>> class Color(AutoNumber):
...     red = default = ()  # not an alias!
...     blue = ()
...
>>> Color.red is Color.default
False


Unfortunately, the empty tuple tends to be a singleton, so there is no 
way to tell that red and default are (supposed to be) the same and blue 
is (supposed to be) different:


--> a = b = ()
--> c = ()
--> a is b
True
--> a is c
True

If you have an idea on how to make that work I am interested.
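
For readers who have not seen it, the AutoNumber recipe under discussion looks roughly like this (reproduced from memory of the enum documentation, so treat the details as an approximation). Every name bound to `()` triggers a fresh `__new__` call, which is why `red` and `default` end up as separate members:

```python
from enum import Enum


class AutoNumber(Enum):
    def __new__(cls):
        # Number members in definition order, starting at 1.
        value = len(cls.__members__) + 1
        obj = object.__new__(cls)
        obj._value_ = value
        return obj


class Color(AutoNumber):
    red = default = ()   # two names, two __new__ calls, two members
    blue = ()


member_names = [member.name for member in Color]
```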


Anyroad, I think that covers all my beefs with the way enums are
implemented in Python. Despite the above, they're a great feature, and
I use them and appreciate that we have them.

Re: Comparing Python enums to Java, was: How much sanity checking is required for function inputs?

On Mon, Apr 25, 2016 at 2:04 AM, Ethan Furman  wrote:
> Unfortunately, the empty tuple tends to be a singleton, so there is no way
> to tell that red and default are (supposed to be) the same and blue is
> (supposed to be) different:
>
> --> a = b = ()
> --> c = ()
> --> a is b
> True
> --> a is c
> True
>
> If you have an idea on how to make that work I am interested.

Easy: allow an empty list to have the same meaning as an empty tuple.
Every time you have [] in your source code, you're guaranteed to get a
new (unique) empty list, and then multiple assignment will work.
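
A quick illustration of the difference (whether two `()` literals are the same object is an implementation detail, so only the list behaviour is shown):

```python
# Chained assignment binds both names to one and the same list object.
a = b = []
# A separate [] display always builds a new list object.
c = []

same_object = a is b
fresh_object = a is not c
```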

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Optimizing Memory Allocation in a Simple, but Long Function

On Sun, Apr 24, 2016 at 1:05 PM, Derek Klinge  wrote:
> I have been writing a python script to explore Euler's Method of
> approximating Euler's Number. I was hoping there might be a way to make
> this process work faster, as for sufficiently large eulerSteps, the process
> below becomes quite slow and sometimes memory intensive. I'm hoping someone
> can give me some insight as to how to optimize these algorithms, or ways I
> might decrease memory usage. I have been thinking about finding a way
> around importing the math module, as it seems a bit unneeded except as an
> easy reference.

Are you sure memory is the real problem here?

(The first problem you have, incidentally, is a formatting one. All
your indentation has been lost. Try posting your code again, in a way
that doesn't lose leading spaces/tabs, and then we'll be better able
to figure out what's going on.)

If I'm reading your code correctly, you have two parts:

1) class EulersNumber, which iterates up to some specific count
2) Module-level functions, which progressively increase the count of
constructed EulersNumbers.

Between them, you appear to have an O(n*n) algorithm for finding a
"sufficiently-accurate" representation. You're starting over from
nothing every time. If, instead, you were to start from the previous
approximation and add another iteration, that ought to be immensely
faster.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Comparing Python enums to Java, was: How much sanity checking is required for function inputs?


On 04/24/2016 09:10 AM, Chris Angelico wrote:

On Mon, Apr 25, 2016 at 2:04 AM, Ethan Furman  wrote:

Unfortunately, the empty tuple tends to be a singleton, so there is no way
to tell that red and default are (supposed to be) the same and blue is
(supposed to be) different:

--> a = b = ()
--> c = ()
--> a is b
True
--> a is c
True

If you have an idea on how to make that work I am interested.


Easy: allow an empty list to have the same meaning as an empty tuple.
Every time you have [] in your source code, you're guaranteed to get a
new (unique) empty list, and then multiple assignment will work.


*sigh*

Where were you three years ago?  ;)

Actually, thinking about it a bit more, if we did that then one could 
not use an empty list as an enum value.  Why would one want to?  No 
idea, but to make it nearly impossible I'd want a much better reason 
than a minor inconvenience:


class Numbers:
    def __init__(self, value=1):  # start at 1 so that `one` gets the value 1
        self.value = value
    def __call__(self, value=None):
        if value is None:
            value = self.value
        self.value = value + 1
        return value

a = Numbers()

class SomeNumbers(Enum):
    one = a()
    two = a()
    five = a(5)
    six = seis = a()

One extra character, and done.

--
~Ethan~
--
https://mail.python.org/mailman/listinfo/python-list

Re: Comparing Python enums to Java, was: How much sanity checking is required for function inputs?

On Mon, Apr 25, 2016 at 2:42 AM, Ethan Furman  wrote:
>> Easy: allow an empty list to have the same meaning as an empty tuple.
>> Every time you have [] in your source code, you're guaranteed to get a
>> new (unique) empty list, and then multiple assignment will work.
>
>
> *sigh*
>
> Where were you three years ago?  ;)
>
> Actually, thinking about it a bit more, if we did that then one could not
> use an empty list as an enum value.  Why would one want to?  No idea, but to
> make it nearly impossible I'd want a much better reason than a minor
> inconvenience:

I would normally expect enumerated values to be immutable and
hashable, but that isn't actually required by the code AIUI. Under
what circumstances is it useful to have mutable enum values?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Optimizing Memory Allocation in a Simple, but Long Function

Sorry about the code indentation, I was using Pythonista (iOS), and it did
not have any problem with that indentation...

Here is a new set of the code:
## Write a method to approximate Euler's Number using Euler's Method
import math

class EulersNumber():
def __init__(self,n):
self.eulerSteps = n
self.e = self.EulersMethod(self.eulerSteps)
def linearApproximation(self,x,h,d): # f(x+h)=f(x)+h*f'(x)
return x + h * d
def EulersMethod(self, numberOfSteps): # Repeate linear approximation over
an even range
e = 1 # e**0 = 1
for step in range(numberOfSteps):
e = self.linearApproximation(e,1.0/numberOfSteps,e) # if f(x)= e**x,
f'(x)=f(x)
return e

def EulerStepWithGuess(accuracy,guessForN):
n = guessForN
e = EulersNumber(n)
while abs(e.e - math.e) > abs(accuracy):
n +=1
e = EulersNumber(n)
print('n={} \te= {} \tdelta(e)={}'.format(n,e.e,abs(e.e - math.e)))
return e

def EulersNumberToAccuracy(PowerOfTen):
x = 1
theGuess = 1
thisE = EulersNumber(1)
while x <= abs(PowerOfTen):
thisE = EulerStepWithGuess(10**(-1*x),theGuess)
theGuess = thisE.eulerSteps * 10
x += 1
return thisE

My problem is this: my attempt at Euler's Method involves creating a list
of numbers that is n long. Is there a way I can iterate over the linear
approximation method without creating a list of steps? (Maybe recursion; I
am a bit new at this.) Ideally I'd like to perform the linearApproximation
method an arbitrary number of times (hopefully >10**10) and keep feeding the
answers back into itself to get the new answer. I know this will be
computationally time intensive, but how do I minimize memory usage (limit
the size of my list)? I also may be misunderstanding the problem, in which
case I am open to looking at it from a different perspective.

Thanks,
Derek

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Optimizing Memory Allocation in a Simple, but Long Function

I think my e-mail client may be stripping the indentation; here it is with
4-space indentation:

## Write a method to approximate Euler's Number using Euler's Method
import math

class EulersNumber():
def __init__(self,n):
self.eulerSteps = n
self.e = self.EulersMethod(self.eulerSteps)
def linearApproximation(self,x,h,d): # f(x+h)=f(x)+h*f'(x)
return x + h * d
def EulersMethod(self, numberOfSteps): # Repeate linear approximation over
an even range
e = 1 # e**0 = 1
for step in range(numberOfSteps):
e = self.linearApproximation(e,1.0/numberOfSteps,e) # if f(x)= e**x,
f'(x)=f(x)
return e

def EulerStepWithGuess(accuracy,guessForN):
n = guessForN
e = EulersNumber(n)
while abs(e.e - math.e) > abs(accuracy):
n +=1
e = EulersNumber(n)
print('n={} \te= {} \tdelta(e)={}'.format(n,e.e,abs(e.e - math.e)))
return e

def EulersNumberToAccuracy(PowerOfTen):
x = 1
theGuess = 1
thisE = EulersNumber(1)
while x <= abs(PowerOfTen):
thisE = EulerStepWithGuess(10**(-1*x),theGuess)
theGuess = thisE.eulerSteps * 10
x += 1
return thisE
-- 
https://mail.python.org/mailman/listinfo/python-list

Challenge: Shadow lots of built-ins

This is mostly just for the fun of it, but every now and then I have a
discussion with people about why it's legal to shadow Python's
built-in names, and it'd be handy to have a go-to piece of demo code.
So here's the challenge: Write a short, readable block of code that
shadows as many built-ins as possible.

The rules:

1) The code has to be readable on its own. Doesn't have to be fully
functional (it's okay to presume the existence of a back-end database,
for instance), but a human should be able to parse it easily.
2) PEP 8, please, for consistency.
3) Code should be Python 3.x compatible.
4) Every shadowed name MUST make sense. You would have to plausibly
use this exact same name in some other language.
5) Have fun! Enjoy writing suboptimal code! :)

Here's a starter.

import os
import zipfile

def zip_all(root):
    """Compress a directory, skipping dotfiles

    Returns the created zip file and a list of stuff
    that got dropped into the bin.
    """
    bin = []
    with zipfile.ZipFile("temp.zip", "w") as zip:
        for root, dirs, files in os.walk("."):
            for dir in dirs:
                if dir.startswith("."):
                    dirs.remove(dir)
                    bin.append(os.path.join(root, dir))
            for file in files:
                if not file.startswith("."):
                    zip.write(os.path.join(root, file))
    return zip, bin

That's only four, and I know you folks can do way better than that!

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Optimizing Memory Allocation in a Simple, but Long Function

On Mon, Apr 25, 2016 at 3:06 AM, Derek Klinge  wrote:
> I think my e-mail client may be stripping the indentation, here it is with
> 4-space indentation

I think it is. Both your reposted versions have indentation lost. You
may need to use a different client.

My posts come from the Gmail web client and indentation usually comes
through just fine (tabs are sometimes lost, but spaces never are).
FWIW, I have "Rich Text" disabled - not sure if that makes a
difference.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Optimizing Memory Allocation in a Simple, but Long Function

On Mon, Apr 25, 2016 at 3:02 AM, Derek Klinge  wrote:
> My problem is this: my attempt at Euler's Method involves creating a list of
> numbers that is n long. Is there a way I can iterate over the linear
> approximation method without creating a list of steps (maybe recursion, I am
> a bit new at this). Ideally I'd like to perform the linearApproximation
> method a arbitrary number of times (hopefully >10**10) and keep feeding the
> answers back into itself to get the new answer. I know this will be
> computationally time intensive, but how do I minimize memory usage (limit
> the size of my list)? I also may be misunderstanding the problem, in which
> case I am open to looking at it from a different perspective.

def EulersMethod(self, numberOfSteps): # Repeate linear approximation
over an even range
e = 1 # e**0 = 1
for step in range(numberOfSteps):
e = self.linearApproximation(e,1.0/numberOfSteps,e) # if f(x)=
e**x, f'(x)=f(x)
return e

This is your code, right?

I'm not seeing anywhere in here that creates a list of numbers. It
does exactly what you're hoping for: it feeds the answer back to
itself for the next step.
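
For anyone following along without the indentation, here is a sketch of the same loop as plain functions (the snake_case names are mine, not Derek's). Each step multiplies e by (1 + h), so n steps of size 1/n amount to (1 + 1/n)**n, and the closed form gives the same approximation without any loop at all:

```python
def linear_approximation(x, h, d):
    """One Euler step: f(x + h) is approximated by f(x) + h * f'(x)."""
    return x + h * d


def eulers_method(number_of_steps):
    """Approximate e by stepping f' = f from 0 to 1 in equal steps."""
    h = 1.0 / number_of_steps
    e = 1.0  # e**0 == 1
    for _ in range(number_of_steps):
        # For f(x) = e**x the derivative equals the function value.
        e = linear_approximation(e, h, e)
    return e


def closed_form(number_of_steps):
    """The same approximation computed directly as (1 + 1/n)**n."""
    return (1.0 + 1.0 / number_of_steps) ** number_of_steps
```

Only the running value `e` is kept between iterations, so memory use does not grow with the number of steps.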

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Comparing Python enums to Java, was: How much sanity checking is required for function inputs?

2016-04-24 Thread BartC


On 24/04/2016 17:47, Chris Angelico wrote:

On Mon, Apr 25, 2016 at 2:42 AM, Ethan Furman  wrote:

Easy: allow an empty list to have the same meaning as an empty tuple.
Every time you have [] in your source code, you're guaranteed to get a
new (unique) empty list, and then multiple assignment will work.



*sigh*

Where were you three years ago?  ;)

Actually, thinking about it a bit more, if we did that then one could not
use an empty list as an enum value.  Why would one want to?  No idea, but to
make it nearly impossible I'd want a much better reason than a minor
inconvenience:


I would normally expect enumerated values to be immutable and
hashable,


And, perhaps, to be actual enumerations. (So that in the set (a,b,c,d), 
you don't know nor care about the underlying values, except that they 
are distinct.)


--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list

Re: Optimizing Memory Allocation in a Simple, but Long Function

Doesn't range(n) create a list n long?
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Optimizing Memory Allocation in a Simple, but Long Function

On Mon, Apr 25, 2016 at 3:56 AM, Derek Klinge  wrote:
> Doesn't range(n) create a list n long?

Not in Python 3. If your code is running on Python 2, use xrange
instead of range. I rather doubt that's your problem, though.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Comparing Python enums to Java, was: How much sanity checking is required for function inputs?

On Mon, Apr 25, 2016 at 3:54 AM, BartC  wrote:
> On 24/04/2016 17:47, Chris Angelico wrote:
>>
>> On Mon, Apr 25, 2016 at 2:42 AM, Ethan Furman  wrote:

>>>> Easy: allow an empty list to have the same meaning as an empty tuple.
>>>> Every time you have [] in your source code, you're guaranteed to get a
>>>> new (unique) empty list, and then multiple assignment will work.
>>>
>>>
>>>
>>> *sigh*
>>>
>>> Where were you three years ago?  ;)
>>>
>>> Actually, thinking about it a bit more, if we did that then one could not
>>> use an empty list as an enum value.  Why would one want to?  No idea, but
>>> to
>>> make it nearly impossible I'd want a much better reason than a minor
>>> inconvenience:
>>
>>
>> I would normally expect enumerated values to be immutable and
>> hashable,
>
>
> And, perhaps, to be actual enumerations. (So that in the set (a,b,c,d), you
> don't know nor care about the underlying values, except that they are
> distinct.)

Not necessarily; often, the Python enumeration has to sync up with
someone else's, possibly in C. It might not matter that BUTTON_OK is 1
and BUTTON_CANCEL is 2, but you have to make sure that everyone agrees
on those meanings. So when you build the Python module, it's mandatory
that those values be exactly what they are documented as.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: How much sanity checking is required for function inputs?

2016-04-24 Thread Steven D'Aprano

On Sun, 24 Apr 2016 04:40 pm, Michael Selik wrote:

> I think we're giving mixed messages because we're conflating "constants"
> and globals that are expected to change.

When you talk about "state", that usually means "the current state of the
program", not constants. math.pi is not "state".

> In our case here, I think two clients in the same process sharing state
> might be a feature rather than a bug. Or at least it has the same behavior
> as the current implementation.

I don't think so. Two clients sharing state is exactly what makes thread
programming with shared state so exciting.

Suppose you import the decimal module, and set the global context:

py> import decimal
py> decimal.setcontext(decimal.ExtendedContext)
py> decimal.getcontext().prec = 18
py> decimal.Decimal(1)/3
Decimal('0.333333333333333333')

Great. Now a millisecond later you do the same calculation:

py> decimal.Decimal(1)/3
Decimal('0.33333')

WTF just happened here??? The answer is, another client of the module, one
you may not even know about, has set the global context:

decimal.getcontext().prec = 5

and screwed you over but good.
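
One way to sidestep that kind of interference is decimal.localcontext(), which lets each caller scope its own settings. What follows is a minimal sketch, not a claim about how any particular library does it; the helper name third_at_precision is made up for illustration.

```python
import decimal


def third_at_precision(prec):
    """Compute 1/3 with a private precision (hypothetical helper)."""
    # localcontext() works on a copy of the current context and restores
    # the original context when the with-block exits.
    with decimal.localcontext() as ctx:
        ctx.prec = prec
        return decimal.Decimal(1) / 3


prec_before = decimal.getcontext().prec
wide = third_at_precision(18)
narrow = third_at_precision(5)
prec_after = decimal.getcontext().prec

print(wide)                       # 18 significant digits
print(narrow)                     # 5 significant digits
print(prec_before == prec_after)  # the shared context was left alone
```

Each call gets the precision it asked for, and nothing it does leaks into the context other clients of the module see.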

-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list

from future import print_function

2016-04-24 Thread San

Hi All,
I want details explanation(why this statement used,when it can be used,etc) of 
following statement in python code

"from __future__ import print_function"

Thanks in advance.
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: from future import print_function

2016-04-24 Thread Joel Goldstick

On Sun, Apr 24, 2016 at 2:05 PM, San  wrote:
> Hi All,
> I want details explanation(why this statement used,when it can be used,etc) 
> of following statement in python code
>
> "from __future__ import print_function"
>
> Thanks in advance.
> --
> https://mail.python.org/mailman/listinfo/python-list

It lets python 2.7 use python 3.x print function instead of the 2.7
print statement.  You might like some of the options, and your code
will be easier to upgrade to 3.x if you decide to do that

-- 
Joel Goldstick
http://joelgoldstick.com/blog
http://cc-baseballstats.info/stats/birthdays
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Comparing Python enums to Java, was: How much sanity checking is required for function inputs?


On 04/24/2016 09:47 AM, Chris Angelico wrote:


I would normally expect enumerated values to be immutable and
hashable, but that isn't actually required by the code AIUI. Under
what circumstances is it useful to have mutable enum values?


Values can be anything.  The names are immutable and hashable.

--
~Ethan~

--
https://mail.python.org/mailman/listinfo/python-list

Re: Optimizing Memory Allocation in a Simple, but Long Function

On Mon, Apr 25, 2016 at 4:03 AM, Derek Klinge  wrote:
> Ok, from the gmail web client:

Bouncing this back to the list, and removing quote markers for other
people's copy/paste convenience.

## Write a method to approximate Euler's Number using Euler's Method
import math

class EulersNumber():
    def __init__(self,n):
        self.eulerSteps = n
        self.e= self.EulersMethod(self.eulerSteps)
    def linearApproximation(self,x,h,d): # f(x+h)=f(x)+h*f'(x)
        return x + h * d
    def EulersMethod(self, numberOfSteps): # Repeat linear approximation over an even range
        e = 1  # e**0 = 1
        for step in range(numberOfSteps):
            e = self.linearApproximation(e,1.0/numberOfSteps,e) # if f(x)= e**x, f'(x)=f(x)
        return e

def EulerStepWithGuess(accuracy,guessForN):
    n = guessForN
    e = EulersNumber(n)
    while abs(e.e - math.e) > abs(accuracy):
        n +=1
        e = EulersNumber(n)
    print('n={} \te= {} \tdelta(e)={}'.format(n,e.e,abs(e.e - math.e)))
    return e

def EulersNumberToAccuracy(PowerOfTen):
    x = 1
    theGuess = 1
    thisE = EulersNumber(1)
    while x <= abs(PowerOfTen):
        thisE = EulerStepWithGuess(10**(-1*x),theGuess)
        theGuess = thisE.eulerSteps * 10
        x += 1
    return thisE

> To see an example of my problem try something like EulersNumberToAccuracy(-10)

Yep, I see it.

I invoked your script as "python3 -i euler.py" and then made that call
interactively. It quickly ran through the first few iterations, and
then had one CPU core saturated; but at no time did memory usage look
too bad. You may be correct in Python 2, though - it started using
about 4GB of RAM (not a problem to me - I had about 9GB available when
I started it), and then I halted it.

The Python 3 version has been running for a few minutes now.

n=135914023 e= 2.718281818459972 delta(e)=9.999073125044333e-09

'top' says:
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 7467 rosuav    20   0   32432   9072   4844 R 100.0  0.1   3:58.44 python3

In other words, it's saturating one CPU core ("%CPU 100.0"), but its
memory usage (VIRT/RES/SHR) is very low. At best, this process can be
blamed for 0.1% of memory.

Adding these lines to the top makes it behave differently in Python 2:

try: range = xrange
except NameError: pass

The Py3 behaviour won't change, but Py2 should now have the same kind
of benefit (the xrange object is an iterable that doesn't need a
concrete list of integers).
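
To make the memory point concrete, here is a small sketch (the function name euler_e is illustrative, not from the original script). It assumes CPython, where the range object stores only start, stop and step, so its size does not grow with the number of steps.

```python
import sys

try:
    range = xrange  # Python 2: swap in the lazy version
except NameError:
    pass            # Python 3: range is already lazy

# A lazy range describes a billion steps without storing them.
steps = range(10**9)
footprint = sys.getsizeof(steps)
print(footprint)  # a few dozen bytes, regardless of the length


def euler_e(number_of_steps):
    """Euler's method for x' = x on [0, 1], keeping only one float."""
    e = 1.0
    h = 1.0 / number_of_steps
    for _ in range(number_of_steps):
        e = e + h * e
    return e


approx = euler_e(1000)
print(approx)
```

The loop only ever holds the current estimate, so memory use stays flat however large number_of_steps gets; the cost is purely CPU time.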

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Comparing Python enums to Java, was: How much sanity checking is required for function inputs?

On Mon, Apr 25, 2016 at 4:12 AM, Ethan Furman  wrote:
> On 04/24/2016 09:47 AM, Chris Angelico wrote:
>
>> I would normally expect enumerated values to be immutable and
>> hashable, but that isn't actually required by the code AIUI. Under
>> what circumstances is it useful to have mutable enum values?
>
>
> Values can be anything.  The names are immutable and hashable.

I know they *can* be, because I looked in the docs; but does it make
sense to a human? Sure, we can legally do this:

>>> class Color(Enum):
...     red = 1
...     green = 2
...     blue = 3
...     break_me = [0xA0, 0xF0, 0xC0]
...
>>> Color([0xA0, 0xF0, 0xC0])
<Color.break_me: [160, 240, 192]>
>>> Color([0xA0, 0xF0, 0xC0]).value.append(1)
>>> Color([0xA0, 0xF0, 0xC0]).value.append(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/enum.py", line 241, in __call__
    return cls.__new__(cls, value)
  File "/usr/local/lib/python3.6/enum.py", line 476, in __new__
    raise ValueError("%r is not a valid %s" % (value, cls.__name__))
ValueError: [160, 240, 192] is not a valid Color

but I don't think it's a good thing to ever intentionally do. It's
fine for the Enum class to not enforce it (it means you can use
arbitrary objects as values, and that's fine), but if you actually do
this, then .

At some point, we're moving beyond the concept of "enumeration" and
settling on "types.SimpleNamespace".

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Comparing Python enums to Java, was: How much sanity checking is required for function inputs?


On 04/24/2016 11:27 AM, Chris Angelico wrote:

On Mon, Apr 25, 2016 at 4:12 AM, Ethan Furman wrote:



Values can be anything.  The names are immutable and hashable.


I know they *can* be, because I looked in the docs; but does it make
sense to a human? Sure, we can legally do this:


Well, not me.  ;)


--> class Color(Enum):
...     red = 1
...     green = 2
...     blue = 3
...     break_me = [0xA0, 0xF0, 0xC0]
...
--> Color([0xA0, 0xF0, 0xC0])
<Color.break_me: [160, 240, 192]>
--> Color([0xA0, 0xF0, 0xC0]).value.append(1)
--> Color([0xA0, 0xF0, 0xC0]).value.append(1)


If you are looking up by value, you have to use the current value. 
Looks like pebkac error to me.  ;)




At some point, we're moving beyond the concept of "enumeration" and
settling on "types.SimpleNamespace".


Sure.  But like most things in Python I'm not going to enforce it.  And 
if somebody somewhere has a really cool use-case for it, more power to 'em.


--
~Ethan~
--
https://mail.python.org/mailman/listinfo/python-list

Re: from future import print_function


On 04/24/2016 11:14 AM, Joel Goldstick wrote:

On Sun, Apr 24, 2016 at 2:05 PM, San wrote:



I want details explanation(why this statement used,when it can be used,etc) of 
following statement in python code

"from __future__ import print_function"


It lets python 2.7 use python 3.x print function instead of the 2.7
print statement.  You might like some of the options, and your code
will be easier to upgrade to 3.x if you decide to do that


When it can be used:  at the top of a python module; it must be the 
first executable line (only comments and doc-strings can be before it).
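
As a small illustration of both points (where the import goes, and what it buys you), here is a sketch that should behave the same on Python 2.7 and 3.x. The strings and the variable names are made up for the example; the u"" prefixes are only there so that io.StringIO accepts the text on Python 2.

```python
from __future__ import print_function  # must come before any other statement

import io

buffer = io.StringIO()

# sep, end and file are keyword arguments of the print *function*;
# the Python 2 print statement has no equivalent spelling for them.
print(u"spam", u"eggs", sep=u", ", end=u"!\n", file=buffer)

captured = buffer.getvalue()
print(captured, end=u"")
```

On Python 3 the import is accepted and has no effect, which is why adding it to 2.7 code makes a later upgrade easier.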


--
~Ethan~

--
https://mail.python.org/mailman/listinfo/python-list

Re: Comparing Python enums to Java, was: How much sanity checking is required for function inputs?

2016-04-24 Thread Ian Kelly

On Sun, Apr 24, 2016 at 10:04 AM, Ethan Furman  wrote:
> On 04/24/2016 08:20 AM, Ian Kelly wrote:
>> * Java doesn't have the hokey notion of enum instances being distinct
>> from their "value". The individual enum members *are* the values.
>> Whereas in Python an enum member is an awkward class instance that
>> contains a value of some other type. Python tries to get away from the
>> C-like notion that enums are ints by making the enum members
>> non-comparable, but then gives us IntEnum as a way to work around it
>> if we really want to. Since Java enums don't depend on any other type
>> for their values, there's nothing inviting the user to treat enums as
>> ints in the first place.
>
>
> How does Java share enums with other programs, computers, and/or languages?

Java enums are serializable using the name. If you need it to be
interoperable with other languages where they're int-based, then you
could attach that value as a field. But that would just be data; you
wouldn't be making that value an integral part of the Java enum just
because some other language does it that way.

>> Because they have the same "value", instead of creating a separate
>> member, COUNTER_EARTH gets defined as an alias for EARTH. To work
>> around this, one would have to add a third argument to the above to
>> pass in an additional value for the sole purpose of distinguishing (or
>> else adapt the AutoNumber recipe to work with this example). This
>> example is a bit contrived since it's generally not likely to come up
>> with floats, but it can easily arise (and in my experience frequently
>> does) when the arguments are of more discrete types. It's notable that
>> the Java enum docs feature this very same example but without this
>> weakness. [2]
>
>
> One reason for this is that Python enums are lookup-able via the value:
>
> >>> Planet(9.80265274333129)
> Planet.EARTH
>
> Do Java enums not have such a feature, or this "feature" totally unnecessary
> in Java?

It's unnecessary. If you want to look up an enum constant by something
other than name, you'd provide a static method or mapping.

I'd argue that it's unnecessary in Python too for the same reason. But
as long as Python enums make a special distinction of their value,
there might as well be a built-in way to do it.
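
For readers who have not run into the aliasing behaviour under discussion, here is a minimal sketch using the stdlib Enum. The gravity figures are rounded placeholders chosen so that two members collide; they are not the values from the docs example.

```python
from enum import Enum


class Planet(Enum):
    # Placeholder values, picked only so that two members share one.
    EARTH = 9.8
    COUNTER_EARTH = 9.8  # same value as EARTH, so this becomes an alias
    MARS = 3.7


# The alias is the very same object as the first member with that value.
print(Planet.COUNTER_EARTH is Planet.EARTH)

# Iteration skips aliases; __members__ still lists every name.
members = list(Planet)
all_names = list(Planet.__members__)
print(members)
print(all_names)

# Lookup by value returns the canonical member, never the alias.
by_value = Planet(9.8)
print(by_value)
```

This is why a "no-alias" enum has to decide what by-value lookup should mean: with two genuinely distinct members sharing a value, Planet(9.8) would no longer have a single obvious answer.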

> I could certainly add a "no-alias" feature to aenum.  What would be the
> appropriate value-lookup behaviour in such cases?
>
> - return the first match
> - return a list of matches
> - raise an error
> - disable value-lookups for that Enum

Probably the third or fourth, as I think that value lookup would
generally not be useful in such cases, and it can be overridden if
desired.

> Cool.  The stdlib Enum (and therefore the enum34 backport) is unlikely to
> change much.  However, aenum has a few fun things going on, and I'm happy to
> add more:
>
> - NamedTuple (metaclass-based)
> - NamedConstant (no aliases, no by-value lookups)
> - Enum
>   - magic auto-numbering
>   class Number(Enum, auto=True):
>  one, two, three
>  def by_seven(self):
>  return self.value * 7
>   - auto-setting of attributes
>class Planet(Enum, init='mass radius'):
>  MERCURY = 3.303e23, 2.4397e6
>  EARTH = 5.976e24, 6.37814e6
>  NEPTUNE = 1.024e26, 2.4746e7
>--> Planet.EARTH.mass
>5.976e24

Neat!
-- 
https://mail.python.org/mailman/listinfo/python-list

Scraping email to make invoice

2016-04-24 Thread CM

I would like to write a Python script to automate a tedious process and could 
use some advice.

The source content will be an email that has 5-10 PO (purchase order) numbers 
and information for freelance work done. The target content will be an invoice. 
(There will be an email like this every week).

Right now, the "recommended" way to go (from the company) from source to target 
is manually copying and pasting all the tedious details of the work done into 
the invoice. But this is laborious, error-prone...and just begging for 
automation. There is no human judgment necessary whatsoever in this.

I'm comfortable with "scraping" a text file and have written scripts for this, 
but could use some pointers on other parts of this operation.

1. INPUT: What's the best way to scrape an email like this? The email is to a 
Gmail account, and the content shows up in the email as a series of basically 
6x7 tables (HTML?), one table per PO number/task. I know if the freelancer were 
to copy and paste the whole set of tables into a text file and save it as plain 
text, Python could easily scrape that file, but I'd much prefer to save the 
user those steps. Is there a relatively easy way to go from the Gmail email to 
generating the invoice directly? (I know there is, but wasn't sure what is 
state of the art these days).

2. OUTPUT: The invoice will have boilerplate content on top and then an Excel 
table at bottom that is mostly the same information from the source content. 
Ideally, so that the invoice looks good, the invoice should be a Word document. 
For the first pass at this, it looked best by laying out the entire invoice in 
Excel and then copy and pasting it into a Word doc as an image (since otherwise 
the columns ran over for some reason). In any case, the goal is to create a 
single page invoice that looks like a clean, professional looking invoice.

3. UI: I am comfortable with making GUI apps, so could use this as the 
interface for the (somewhat computer-uncomfortable) user. But the fewer user 
actions necessary, the better. The emails always come from the same sender, and 
always have the same boilerplate language ("Below please find your Purchase 
Order (PO)"), so I'm envisioning a small GUI window with a single button that 
says "MAKE NEWEST INVOICE" and the user presses it and it automatically 
searches the user's email for PO # emails and creates the newest invoice. I'm 
guessing I could keep a sqlite database or flat file on the computer to just 
track what is meant by "newest", and then the output would have the date 
created in the file, so the user can be sure what has been invoiced.

I'm hoping I can write this in a couple of days. 

Any suggestions welcome! Thanks.
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Remove directory tree without following symlinks

2016-04-24 Thread eryk sun

On Sun, Apr 24, 2016 at 5:42 AM, Albert-Jan Roskam
 wrote:
> Aww, I kinda forgot about that already, but I came across this last
> year [1]. Apparently, shutil.rmtree(very_long_path) failed under Win 7,
> even with the "silly prefix". I believe very_long_path was a
> Python2-str.
> [1]
> https://mail.python.org/pipermail/python-list/2015-June/693156.html

Python 2's str branch of the os functions gets implemented on Windows
using the [A]NSI API, such as FindFirstFileA and FindNextFileA to
implement listdir(). Generally the ANSI API is a light wrapper around
the [W]ide-character API. It simply decodes byte strings to UTF-16 and
calls the wide-character function (or a common internal function).

IIRC, in Windows 7, byte strings are decoded using a per-thread buffer
with size MAX_PATH (260), so prefixing the path with "\\?\" won't
help. You have to use the wide-character API. Windows 10, on the other
hand, decodes using a dynamically allocated buffer, so you can usually
get away with using a long byte string. But not with Python 2
os.listdir(), which uses a stack-allocated MAX_PATH+5 buffer in the
str branch. For example:

Python 2 os.makedirs works:

>>> path = os.path.normpath('//?/C:/Temp/long/' + 'a' * 255)
>>> os.makedirs(path)

but os.listdir requires unicode:

>>> os.listdir(path)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: must be (buffer overflow), not str
>>> os.listdir(path.decode('mbcs'))
[]

Also, the str branch of listdir appends "/*.*", with a forward slash,
so it's incompatible with the "\\?\" prefix, even for short paths:

>>> os.listdir(r'\\?\C:\Temp')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
WindowsError: [Error 123] The filename, directory name, or volume
label syntax is incorrect: '?\\C:\\Temp/*.*'

> It seems useful if shutil or os.path would automatically prefix paths
> with "\\?\". It is rarely really needed, though. (in my case it was
> needed to copy a bunch of MS Outlook .msg files, which automatically
> get the subject line as the filename, and perhaps the first sentence
> of the mail if the mail has no subject).

I doubt a change like that would get backported to 2.7. Recently there
was a lengthy discussion about adding an __fspath__ protocol to Python
3. Possibly this can be automatically handled in the __fspath__
implementation of pathlib.WindowsPath and the DirEntry type returned
by os.scandir.
-- 
https://mail.python.org/mailman/listinfo/python-list

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 54:

2016-04-24 Thread arthur sherman

I'm using a Python web application (adagios, a Nagios configuration tool).
When attempting a certain operation in the client-side browser I get the above 
error.
The client side is Ubuntu 14.04; the server side is Debian 8. The browser is 
Firefox or Chrome.
Both show:
echo $LANG
en_US.UTF-8

Before I dive into the code, are there any OS-level things to try? 
Here's the full error traceback:

Traceback (most recent call last):
  File "/opt/adagios/adagios/views.py", line 43, in wrapper
    result = view_func(request, *args, **kwargs)
  File "/opt/adagios/adagios/objectbrowser/views.py", line 191, in edit_object
    c['form'] = PynagForm(pynag_object=my_object, initial=my_object._original_attributes)
  File "/opt/adagios/adagios/objectbrowser/forms.py", line 312, in __init__
    self.fields[field_name] = self.get_pynagField(field_name, css_tag="inherited")
  File "/opt/adagios/adagios/objectbrowser/forms.py", line 418, in get_pynagField
    _('%(inherited_value)s (inherited from template)') % {'inherited_value': smart_str(inherited_value)}
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 54: ordinal not in range(128)

tnx in advance for any assistance,
ams
avraham
 
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Scraping email to make invoice

2016-04-24 Thread Friedrich Rentsch

On 04/24/2016 08:58 PM, CM wrote:

I would like to write a Python script to automate a tedious process and could
use some advice.

The source content will be an email that has 5-10 PO (purchase order) numbers
and information for freelance work done. The target content will be an invoice.
(There will be an email like this every week).

Right now, the "recommended" way to go (from the company) from source to target
is manually copying and pasting all the tedious details of the work done into the
invoice. But this is laborious, error-prone...and just begging for automation. There is
no human judgment necessary whatsoever in this.

I'm comfortable with "scraping" a text file and have written scripts for this,
but could use some pointers on other parts of this operation.

1. INPUT: What's the best way to scrape an email like this? The email is to a
Gmail account, and the content shows up in the email as a series of basically
6x7 tables (HTML?), one table per PO number/task. I know if the freelancer were
to copy and paste the whole set of tables into a text file and save it as plain
text, Python could easily scrape that file, but I'd much prefer to save the
user those steps. Is there a relatively easy way to go from the Gmail email to
generating the invoice directly? (I know there is, but wasn't sure what is
state of the art these days).

2. OUTPUT: The invoice will have boilerplate content on top and then an Excel
table at bottom that is mostly the same information from the source content.
Ideally, so that the invoice looks good, the invoice should be a Word document.
For the first pass at this, it looked best by laying out the entire invoice in
Excel and then copy and pasting it into a Word doc as an image (since otherwise
the columns ran over for some reason). In any case, the goal is to create a
single page invoice that looks like a clean, professional looking invoice.

3. UI: I am comfortable with making GUI apps, so could use this as the interface for the (somewhat
computer-uncomfortable) user. But the fewer user actions necessary, the better. The emails always come from
the same sender, and always have the same boilerplate language ("Below please find your Purchase Order
(PO)"), so I'm envisioning a small GUI window with a single button that says "MAKE NEWEST
INVOICE" and the user presses it and it automatically searches the user's email for PO # emails and
creates the newest invoice. I'm guessing I could keep a sqlite database or flat file on the computer to just
track what is meant by "newest", and then the output would have the date created in the file, so
the user can be sure what has been invoiced.

I'm hoping I can write this in a couple of days.

Any suggestions welcome! Thanks.

INPUT: What's the best way to scrape an email like this? -- Like what? You
need to explain what exactly your input is or show an example.

Frederic

--
https://mail.python.org/mailman/listinfo/python-list

Re: Optimizing Memory Allocation in a Simple, but Long Function

2016-04-24 Thread Oscar Benjamin

On 24 April 2016 at 19:21, Chris Angelico  wrote:
> On Mon, Apr 25, 2016 at 4:03 AM, Derek Klinge  wrote:
>> Ok, from the gmail web client:
>
> Bouncing this back to the list, and removing quote markers for other
> people's copy/paste convenience.
>
> ## Write a method to approximate Euler's Number using Euler's Method
> import math
>
> class EulersNumber():
>     def __init__(self,n):
>         self.eulerSteps = n
>         self.e= self.EulersMethod(self.eulerSteps)
>     def linearApproximation(self,x,h,d): # f(x+h)=f(x)+h*f'(x)
>         return x + h * d
>     def EulersMethod(self, numberOfSteps): # Repeat linear approximation over an even range
>         e = 1  # e**0 = 1
>         for step in range(numberOfSteps):
>             e = self.linearApproximation(e,1.0/numberOfSteps,e) # if f(x)= e**x, f'(x)=f(x)
>         return e
>
>
> def EulerStepWithGuess(accuracy,guessForN):
>     n = guessForN
>     e = EulersNumber(n)
>     while abs(e.e - math.e) > abs(accuracy):
>         n +=1
>         e = EulersNumber(n)
>     print('n={} \te= {} \tdelta(e)={}'.format(n,e.e,abs(e.e - math.e)))
>     return e
>
>
> def EulersNumberToAccuracy(PowerOfTen):
>     x = 1
>     theGuess = 1
>     thisE = EulersNumber(1)
>     while x <= abs(PowerOfTen):
>         thisE = EulerStepWithGuess(10**(-1*x),theGuess)
>         theGuess = thisE.eulerSteps * 10
>         x += 1
>     return thisE
>
>
>> To see an example of my problem try something like 
>> EulersNumberToAccuracy(-10)

Now that I can finally see your code I can see what the problem is. So
essentially you want to calculate Euler's number in the following way:

e = exp(1) and exp(t) is the solution of the initial value problem
with ordinary differential equation dx/dt = x and initial condition
x(0)=1.

So you're using Euler's method to numerically solve the ODE from t=0
to t=1. Which gives you an estimate for x(1) = exp(1) = e.

Euler's method solves this by going in steps from t=0 to t=1 with some
step size e.g. dt = 0.1. You get a sequence of values x[n] where

   x[0] = x(0) = 1  # initial condition
   x[1] = x[0] + dt*f(x[0]) = x[0] + dt*x[0]
   x[2] = x[1] + dt*x[1] # etc.

In order to get to t=1 in N steps you set dt = 1/N. So simplifying
your code (all the classes and functions are just confusing the
situation here):

N = 1000
dt = 1.0 / N
x = 1
for n in range(N):
    x = x + dt*x
print(x)

When I run that I get:
2.71692393224

Okay that's great but actually you want to be able to set the accuracy
required and then steadily increase N until it's big enough to achieve
the expected accuracy so you do this:

import math

error = 1
accuracy = 1e-2

N = 1
while error > accuracy:
    dt = 1.0 / N
    x = 1
    for n in range(N):
        x = x + dt*x
    error = abs(math.e - x)
    N += 1
print(x)

But what happens here? You have a loop in a loop. The inner loop takes
n over N values. The outer loop takes N from 1 up to Nmin where Nmin
is the smallest value of N such that we achieve the desired accuracy.

This is a classic case of a quadratic performance algorithm. As you
make the accuracy smaller you're implicitly increasing Nmin. However
the algorithmic performance is quadratic in Nmin i.e. O(Nmin**2). The
problem is the nested loops. If you have an outer loop that increases
the length of an inner loop by 1 at each step then you have a
quadratic algorithm akin to:

# This loop is O(M**2)
for N in range(M):
    for n in range(N):
        pass  # do stuff

To see that it is quadratic see:

https://en.wikipedia.org/wiki/Triangular_number

The simplest fix here is to replace N+=1 with N*=2. Instead of
increasing the number of steps by one if the accuracy is not small
enough then you should double the number of steps. That will give you
an O(Nmin) algorithm.

https://en.wikipedia.org/wiki/1/2_%2B_1/4_%2B_1/8_%2B_1/16_%2B_%E2%8B%AF
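
A minimal sketch of the doubling variant, for comparison with the N += 1 loop. The function names and the total_work counter are illustrative additions, and the 1e-3 target is just an example value.

```python
import math


def euler_estimate(n_steps):
    """Euler's method for dx/dt = x from t=0 to t=1 in n_steps steps."""
    x = 1.0
    dt = 1.0 / n_steps
    for _ in range(n_steps):
        x = x + dt * x
    return x


def steps_for_accuracy(accuracy):
    """Double the step count until the estimate is within accuracy of e."""
    n_steps = 1
    total_work = 0  # inner-loop iterations across all attempts
    while True:
        x = euler_estimate(n_steps)
        total_work += n_steps
        if abs(math.e - x) <= accuracy:
            return n_steps, x, total_work
        n_steps *= 2


n_steps, x, total_work = steps_for_accuracy(1e-3)
print(n_steps, x, total_work)
```

Because the attempts form a geometric series (1 + 2 + 4 + ...), total_work stays below twice the final n_steps. The price is that the final step count can overshoot the true minimum by up to a factor of two.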

A better method is to do a bit of algebra before putting down the code:

x[1] = x[0] + h*x[0] = x[0]*(1+h) = x[0]*(1+1/N) = (1+1/N)
x[2] = x[1]*(1+1/N) = (1+1/N)**2
...
x[N] = (1 + 1/N)**N

So doing the loop for Euler's method is equivalent to just writing:

x = (1 + 1.0/N)**N

This considered as a sequence in N is well known as a sequence that
converges to e. In fact this is how the number e was first discovered:

https://en.wikipedia.org/wiki/E_%28mathematical_constant%29#Compound_interest
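
The equivalence is easy to check numerically. The sketch below compares the step-by-step loop with the closed form for one value of N; the function names are illustrative. The two results should agree up to floating-point rounding, since each computes the same product in a slightly different order.

```python
def euler_loop(n_steps):
    """N explicit Euler steps for dx/dt = x, starting from x(0) = 1."""
    x = 1.0
    dt = 1.0 / n_steps
    for _ in range(n_steps):
        x = x + dt * x
    return x


def closed_form(n_steps):
    """The same quantity written as a single power."""
    return (1 + 1.0 / n_steps) ** n_steps


n_steps = 1000
loop_value = euler_loop(n_steps)
formula_value = closed_form(n_steps)
gap = abs(loop_value - formula_value)

print(loop_value)
print(formula_value)
print(gap)  # tiny: only rounding error separates the two
```

The loop costs N multiplications, while the power is a single library call, which is where the speed difference comes from.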

Python can compute this much quicker than your previous version:

N = 1
for _ in range(40):
    N *= 2
    print((1 + 1.0/N) ** N)

Which runs instantly and gives:

2.25
2.44140625
2.56578451395
2.63792849737
2.67699012938
2.69734495257
2.70773901969
2.71299162425
2.71563200017
2.71695572947
2.71761848234
2.71795008119
2.71811593627
2.71819887772
2.71824035193
2.7182610899
2.71827145911
2.71827664377
2.71827923611
2.71828053228
2.71828118037
2.71828150441
2.71828166644
2.71828174745
2.71828178795
2.71828180821
2.71828181833
2.7182818234
2.71828182593
2.71828182

asyncio and subprocesses

2016-04-24 Thread David


Is this a bug in the asyncio libraries?



This code:

'''
  proc = yield from asyncio.create_subprocess_exec(*cmd,
stdout=subprocess.PIPE, stderr=subprocess.PIPE, env=env)

  # read all data from subprocess pipe, copy to nextCoro
  ln = yield from proc.stdout.read(1024)
  while ln:
  yield from nextCoro.process(ln)
  ln = yield from proc.stdout.read(1024)

'''

will throw this exception:

Traceback (most recent call last):
  File "/usr/project/bulk_aio.py", line 52, in db_source
    ln = yield from proc.stdout.read(1024)
  File "/usr/lib/python3.4/asyncio/streams.py", line 462, in read
    self._maybe_resume_transport()
  File "/usr/lib/python3.4/asyncio/streams.py", line 349, in _maybe_resume_transport
    self._transport.resume_reading()
  File "/usr/lib/python3.4/asyncio/unix_events.py", line 364, in resume_reading
    self._loop.add_reader(self._fileno, self._read_ready)
AttributeError: 'NoneType' object has no attribute 'add_reader'


The exception always happens at the end of the subprocess's run, in what would 
be the last read. Whether it happens correlates with the time needed for 
nextCoro.process.  If each iteration takes more than about a millisecond, the 
exception will be thrown.

It *seems* that when the transport loses the pipe connection, it schedules the 
event loop for removal immediately, and the _loop gets set to None before the 
data can all be read.  This is the sequence in unix_events.py 
_UnixReadPipeTransport._read_ready.

Is there a workaround to avoid this exception?  Is this a fixed bug, already?  
I am using Python 3.4.2 as distributed in Ubuntu Lucid, with built-in asyncio.

Thank you.
David

-- 
https://mail.python.org/mailman/listinfo/python-list

Re: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 54:

2016-04-24 Thread Peter Otten

arthur sherman wrote:

> m using a python web applic (adagios, a nagios configuration tool).
> when attempting a certain operation on the client side browser i get the
> above error. the client side is ubunti 14.04. servers side is debian 8.
> browser is ff or chrome. both show:
> echo $LANG
> en_US.UTF-8
> 
> before i dive into the code, r there any OS level things to try?
> here's the full error traceback:
> 
> Traceback (most recent call last):
>   File "/opt/adagios/adagios/views.py", line 43, in wrapper
>     result = view_func(request, *args, **kwargs)
>   File "/opt/adagios/adagios/objectbrowser/views.py", line 191, in edit_object
>     c['form'] = PynagForm(pynag_object=my_object, initial=my_object._original_attributes)
>   File "/opt/adagios/adagios/objectbrowser/forms.py", line 312, in __init__
>     self.fields[field_name] = self.get_pynagField(field_name, css_tag="inherited")
>   File "/opt/adagios/adagios/objectbrowser/forms.py", line 418, in get_pynagField
>     _('%(inherited_value)s (inherited from template)') % {'inherited_value': smart_str(inherited_value)}
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 54: ordinal not in range(128)

Probably one of

_(...)

or

smart_str(...)

returns unicode and the other returns a non-ascii bytestring:

>>> u"%s" % "\xe2"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)
>>> "\xe2 %s" % u"foo"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)
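
If the fix ends up in the code rather than the data, one possible approach is to decode every byte string to unicode before it reaches the % operator, so that both operands have the same type. The following is only a sketch: it assumes the byte strings are UTF-8 encoded (0xe2 is the lead byte of many three-byte UTF-8 sequences, such as dashes and curly quotes), and the helper name to_text and the sample value are made up for illustration. They are not part of adagios, pynag or Django.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals  # makes plain literals unicode on Python 2


def to_text(value, encoding="utf-8"):
    """Return value as a unicode string, decoding byte strings first.

    to_text is a hypothetical helper; the UTF-8 default is an assumption.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


template = "%(inherited_value)s (inherited from template)"

# Sample byte string containing UTF-8 for an accented e and an en dash.
raw = b"caf\xc3\xa9 \xe2\x80\x93 check"

# Both sides of % are unicode now, so no implicit ascii decode happens.
message = template % {"inherited_value": to_text(raw)}
```

The same lines run unchanged on Python 2.7 and Python 3, because the decode is explicit instead of being left to the interpreter.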

> before i dive into the code, r there any OS level things to try?

Probably not. If you entered non-ascii text into a form, and can limit 
yourself to ascii-only then that might be a workaround...

Re: Python TUI that will work on DOS/Windows and Unix/Linux

2016-04-24 Thread Peter Brittain

I noticed this trail on Google...  if you're still interested, you could try out

https://github.com/peterbrittain/asciimatics

I ported it to Windows from Linux so exactly the same API works on both.

Re: Scraping email to make invoice

2016-04-24 Thread Michael Torrie

On 04/24/2016 12:58 PM, CM wrote:
> 1. INPUT: What's the best way to scrape an email like this? The
> email is to a Gmail account, and the content shows up in the email as
> a series of basically 6x7 tables (HTML?), one table per PO
> number/task. I know if the freelancer were to copy and paste the
> whole set of tables into a text file and save it as plain text,
> Python could easily scrape that file, but I'd much prefer to save the
> user those steps. Is there a relatively easy way to go from the Gmail
> email to generating the invoice directly? (I know there is, but
> wasn't sure what is state of the art these days).

I would configure Gmail to allow IMAP access (you'll have to set up a
special password for this most likely), and then use an imap library
from Python to directly find the relevant messages and access the email
message body.  If the body is HTML-formatted (sounds like it is) I would
use either BeautifulSoup or lxml to parse it and get out the relevant
information.
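
A rough sketch of that pipeline follows, using only the standard library so it stays self-contained: imaplib for the fetch, email for decoding, and html.parser standing in for BeautifulSoup or lxml. The function and class names, the INBOX folder and the FROM search are assumptions for illustration, and the simple collector does not handle nested tables.

```python
import email
import imaplib
from email import policy
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect each <table> as a list of rows, each row a list of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def tables_from_message(raw_bytes):
    """Parse one raw RFC 822 message and return the tables in its HTML body."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    part = msg.get_body(preferencelist=("html",))
    if part is None:
        return []
    collector = TableCollector()
    collector.feed(part.get_content())
    return collector.tables


def fetch_po_messages(user, app_password, sender):
    """Yield raw message bytes for every INBOX message from sender.

    Needs network access and IMAP enabled on the Gmail account.
    """
    with imaplib.IMAP4_SSL("imap.gmail.com") as conn:
        conn.login(user, app_password)
        conn.select("INBOX", readonly=True)
        _, data = conn.search(None, "FROM", '"{}"'.format(sender))
        for num in data[0].split():
            _, parts = conn.fetch(num, "(RFC822)")
            yield parts[0][1]
```

Feeding each result of fetch_po_messages() to tables_from_message() would give one list of rows per PO table, ready to be turned into invoice lines.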

> 2. OUTPUT: The invoice will have boilerplate content on top and then 
> an Excel table at bottom that is mostly the same information from
> the source content. Ideally, so that the invoice looks good, the
> invoice should be a Word document. For the first pass at this, it
> looked best by laying out the entire invoice in Excel and then copy
> and pasting it into a Word doc as an image (since otherwise the
> columns ran over for some reason). In any case, the goal is to create
> a single page invoice that looks like a clean, professional looking
> invoice.

There are several libraries for creating Excel and Word files,
especially the XML-based formats, though I have little experience with
them.  There are also nice libraries for emitting PDF if that would work
better.

> 3. UI: I am comfortable with making GUI apps, so could use this as 
> the interface for the (somewhat computer-uncomfortable) user. But
> the less user actions necessary, the better. The emails always come
> from the same sender, and always have the same boilerplate language 
> ("Below please find your Purchase Order (PO)"), so I'm envisioning a 
> small GUI window with a single button that says "MAKE NEWEST
> INVOICE" and the user presses it and it automatically searches the
> user's email for PO # emails and creates the newest invoice. I'm
> guessing I could keep a sqlite database or flat file on the computer
> to just track what is meant by "newest", and then the output would
> have the date created in the file, so the user can be sure what has
> been invoiced.

Once you have a working script, your GUI interface would be pretty easy.
Though it seems to me that it would be unnecessary.  This process
sounds like it should just run automatically from a cron job or something.
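
For an unattended run, the script still needs to remember which PO emails it has already invoiced, as the question suggests. A minimal sketch with sqlite3 might look like the following; the table layout, the function names and the idea of keying on a message identifier are all assumptions for illustration.

```python
import sqlite3


def open_ledger(path=":memory:"):
    """Open (or create) the ledger of messages that were already invoiced."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoiced ("
        "message_id TEXT PRIMARY KEY, invoiced_on TEXT NOT NULL)"
    )
    return conn


def claim_new(conn, message_ids, today):
    """Record unseen message ids and return only those, in input order."""
    new_ids = []
    for message_id in message_ids:
        cur = conn.execute(
            "INSERT OR IGNORE INTO invoiced VALUES (?, ?)", (message_id, today)
        )
        if cur.rowcount == 1:  # the row was inserted, so it was not seen before
            new_ids.append(message_id)
    conn.commit()
    return new_ids
```

Each run would pass in the ids it found in the mailbox and build invoices only for the ids that come back, so running it twice from cron does not produce duplicate invoices.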

> I'm hoping I can write this in a couple of days.

The automated part should be possible, but personally I'd give myself a
week.

Re: How much sanity checking is required for function inputs?

2016-04-24 Thread Michael Selik

On Sun, Apr 24, 2016 at 2:08 PM Steven D'Aprano  wrote:

> On Sun, 24 Apr 2016 04:40 pm, Michael Selik wrote:
> > I think we're giving mixed messages because we're conflating
> > "constants" and globals that are expected to change.
>
> When you talk about "state", that usually means "the current state of the
> program", not constants. math.pi is not "state".
>

Perhaps I was unclear. You provided an example of what I was trying to
point out.

Re: Optimizing Memory Allocation in a Simple, but Long Function

Actually, I'm not trying to speed it up, just be able to handle a large
number of n.
(Thank you Chris for the suggestion to use xrange, I am on a Mac using the
stock Python 2.7)

I am looking at the number of iterations of linear approximation that are
required to get a more accurate representation.
My initial data suggest that to get 1 more digit of e (the difference
between the calculated and expected value falls under 10**-n), I need a
little more than 10 times the number of iterations of linear approximation.

I actually intend to compare these to other methods, including the limit
definition that you provided, as well as the geometric series definition.

I am trying to provide some real world data for my students to prove the
point that although there are many ways to calculate a value, some are much
more efficient than others.

I tried your recommendation (Oscar) of trying a (1+1/n)**n approach, which
gave me very similar values, but when I took the difference between your
method and mine I consistently got differences of ~10**-15. Perhaps this is
due to the binary representation of the decimals?
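
For what it is worth, here is a small sketch comparing the two. Each Euler step with h = 1/n multiplies the current value by (1 + 1/n), so after n steps the exact result is (1 + 1/n)**n and the two methods are the same formula on paper. Any difference between them should therefore come from floating-point rounding, which accumulates differently in a loop of n multiply-adds than in a single power. The function names are made up for illustration.

```python
import math


def euler_method(n):
    """Euler's method for dx/dt = x, x(0) = 1, taking n steps from t=0 to t=1."""
    x = 1.0
    h = 1.0 / n
    for _ in range(n):
        x = x + h * x  # f(t + h) is approximated by f(t) + h * f'(t)
    return x


def limit_form(n):
    """The closed form (1 + 1/n)**n that the loop above works out to."""
    return (1.0 + 1.0 / n) ** n


n = 100000
a = euler_method(n)
b = limit_form(n)
print("difference between methods:", abs(a - b))
print("difference from math.e:    ", abs(a - math.e))
```

The first number printed is rounding noise, while the second is the real approximation error and is many orders of magnitude larger.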

Also, it seems to me that if the goal is to use the smallest value of n to get a
particular level of accuracy, changing your guess of N by doubling seems to
have a high chance of overshooting. I found that I was able to predict
relatively accurately a value of N for achieving a desired accuracy. By
this I mean that if I wanted my estimate to be accurate to one
additional decimal place, I had to multiply my value of N by approximately
10 (I found that the new N required was always < 10N + 10).
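
The factor of about 10 matches the leading error term of (1 + 1/n)**n, which is e/(2n): the error is inversely proportional to n, so one more decimal place costs roughly ten times as many steps. A sketch that uses this to predict n directly, instead of searching, follows. The function names are made up, and the bound relies on the error staying below e/(2n).

```python
import math


def steps_for_digits(k):
    """Return an n large enough that e/(2n) <= 10**-k."""
    return int(math.ceil(math.e / 2 * 10 ** k))


def error_after(n):
    """Absolute error of (1 + 1/n)**n as an estimate of e."""
    return abs(math.e - (1.0 + 1.0 / n) ** n)


for k in range(1, 5):
    n = steps_for_digits(k)
    print("k={} n={} error={:.3e}".format(k, n, error_after(n)))
```

Each predicted n is about ten times the previous one, which agrees with the 10N + 10 observation.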

Derek

> On Sun, Apr 24, 2016 at 2:55 PM, Oscar Benjamin <
> oscar.j.benja...@gmail.com> wrote:
>
>> On 24 April 2016 at 19:21, Chris Angelico  wrote:
>> > On Mon, Apr 25, 2016 at 4:03 AM, Derek Klinge wrote:
>> >> Ok, from the gmail web client:
>> >
>> > Bouncing this back to the list, and removing quote markers for other
>> > people's copy/paste convenience.
>> >
>> > ## Write a method to approximate Euler's Number using Euler's Method
>> > import math
>> >
>> > class EulersNumber():
>> >     def __init__(self,n):
>> >         self.eulerSteps = n
>> >         self.e= self.EulersMethod(self.eulerSteps)
>> >     def linearApproximation(self,x,h,d): # f(x+h)=f(x)+h*f'(x)
>> >         return x + h * d
>> >     def EulersMethod(self, numberOfSteps): # Repeat linear approximation over an even range
>> >         e = 1 # e**0 = 1
>> >         for step in range(numberOfSteps):
>> >             e = self.linearApproximation(e,1.0/numberOfSteps,e) # if f(x)= e**x, f'(x)=f(x)
>> >         return e
>> >
>> >
>> > def EulerStepWithGuess(accuracy,guessForN):
>> >     n = guessForN
>> >     e = EulersNumber(n)
>> >     while abs(e.e - math.e) > abs(accuracy):
>> >         n +=1
>> >         e = EulersNumber(n)
>> >         print('n={} \te= {} \tdelta(e)={}'.format(n,e.e,abs(e.e - math.e)))
>> >     return e
>> >
>> >
>> > def EulersNumberToAccuracy(PowerOfTen):
>> >     x = 1
>> >     theGuess = 1
>> >     thisE = EulersNumber(1)
>> >     while x <= abs(PowerOfTen):
>> >         thisE = EulerStepWithGuess(10**(-1*x),theGuess)
>> >         theGuess = thisE.eulerSteps * 10
>> >         x += 1
>> >     return thisE
>> >
>> >> To see an example of my problem try something like
>> >> EulersNumberToAccuracy(-10)
>>
>> Now that I can finally see your code I can see what the problem is. So
>> essentially you want to calculate Euler's number in the following way:
>>
>> e = exp(1) and exp(t) is the solution of the initial value problem
>> with ordinary differential equation dx/dt = x and initial condition
>> x(0)=1.
>>
>> So you're using Euler's method to numerically solve the ODE from t=0
>> to t=1. Which gives you an estimate for x(1) = exp(1) = e.
>>
>> Euler's method solves this by going in steps from t=0 to t=1 with some
>> step size e.g. dt = 0.1. You get a sequence of values x[n] where
>>
>>x[0] = x(0) = 1  # initial condition
>>x[1] = x[0] +

Re: Optimizing Memory Allocation in a Simple, but Long Function