Re: Sandboxing eval()

2020-01-20 Thread Frank

On 2020-01-19 7:53 PM, Paul Moore wrote:

On Sun, 19 Jan 2020 at 17:45,  wrote:


Is it actually possible to build a "sandbox" around eval, permitting it
only to do some arithmetic and use some math functions, but no
filesystem access or module imports?


If you require safety, you really need to write your own parser/evaluator.



I have written a simple parser/evaluator that is sufficient for my 
simple requirements, and I thought I was safe.


Then I saw this comment in a recent post by Robin Becker of ReportLab -

"avoiding simple things like ' '*(10**200) seems quite difficult"

I realised that my method is vulnerable to this  and, like Robin, I have 
not come up with an easy way to guard against it.
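
One possible guard for a hand-rolled evaluator - a rough sketch only, with
illustrative names and an illustrative limit - is to estimate the size of a
result before applying '*' or '**' and refuse anything absurd:

MAX_DIGITS = 10000

def checked_pow(a, b):
    if isinstance(a, int) and isinstance(b, int) and b > 0:
        # rough estimate: a**b has about b * len(str(abs(a))) digits
        if b * len(str(abs(a))) > MAX_DIGITS:
            raise ValueError('result too large')
    return a ** b

def checked_mul(a, b):
    if isinstance(a, str) or isinstance(b, str):
        raise ValueError('string repetition not allowed')
    return a * b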


Frank Millman

--
https://mail.python.org/mailman/listinfo/python-list


Line graphics on Linux console

2005-01-29 Thread frank
Hi all

I don't think this is strictly a Python problem, but as it manifests
itself in one of my Python programs, I am hoping that somebody in this
group can help me.

The following is a message I sent to co.os.linux.setup -

"My question concerns line graphics on a text-based console. ­My
actual problem relates to a [Python] program I have written using
ncurses, b­ut you can easily test it by running a program like
minicom.

If you call up the minicom menu, it should be surrounded by ­a nice
box made up of horizontal and vertical lines, corners, etc. It used to
work up until Redhat 7. Since upgrading to Redhat 9, and now Fedo­ra,
it (and my program) has stopped working."

I received the following reply from Thomas Dickey -

"That's because Redhat uses UTF-8 locales, and the Linux cons­ole
ignores vt100 line-drawing when it is set for UTF-8.  (screen also
d­oes this).
ncurses checks for $TERM containing "linux" or "screen" (sin­ce
there's no better clues for the breakage) when the encoding is UTF-8­,
and doesn't try to use those escapes (so you would get +'s and -'s).
co­mpiling/linking with libncursesw would get the lines back for a
properly-wri­tten program."

I don't really understand the last sentence. Does anyone know if it is
possible to do this (or anything else), or am I stuck?
TIA for any advice.
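
For reference, a minimal locale-aware test of line drawing (this assumes a
Python curses module built against the wide-character library, libncursesw,
which is what the last sentence of the reply refers to):

import curses
import locale

locale.setlocale(locale.LC_ALL, '')   # respect the (UTF-8) locale

def main(stdscr):
    stdscr.box()    # draws the border with the ACS line-drawing characters
    stdscr.addstr(1, 2, "A solid border means line drawing works.")
    stdscr.getch()

curses.wrapper(main)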

Frank Millman

--
http://mail.python.org/mailman/listinfo/python-list


raw_input that able to do detect multiple input

2013-04-06 Thread Frank
Hi all, I would like some advice on this question about the function call interact:

The desired outcome:
interact()
Friends File: friends.csv
Command: f John Cleese
John Cleese: Ministry of Silly Walks, 421, 27 October
Command: f Michael Palin
Unknown friend Michael Palin
Command: f
Invalid Command: f
Command: a Michael Palin
Invalid Command: a Michael Palin
Command: a John Cleese, Cheese Shop, 5552233, 5 May
John Cleese is already a friend
Command: a Michael Palin, Cheese Shop, 5552233, 5 May
Command: f Michael Palin
Michael Palin: Cheese Shop, 5552233, 5 May
Command: e
Saving changes...
Exiting...

my code so far for interact:
#interact function
def interact(*arg):
    open('friends.csv', 'rU')
    d = load_friends('friends.csv')
    print "Friends File: friends.csv"
    s = raw_input("Please input something: ")
    command = s.split(" ", 1)
    if "f" in command:
        display_friends("command", load_friends('friends.csv'))
    print command

#display friend function
def display_friends(name, friends_list):
    Fname = name[0]
    for item in friends_list:
        if item[0] == Fname:
            print item
            break
    else:
        print False

Let's say I type in "f John Cleese"; after line 6, the value of "command" should
be ['f', 'John Cleese']. Is there a way to extract "John Cleese" as an input so
that I can use it in my function call "display_friends"?
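
A rough sketch of pulling the name out of the split result (variable names are
illustrative only):

s = raw_input("Command: ")     # e.g. "f John Cleese"
parts = s.split(" ", 1)        # ['f', 'John Cleese']
if len(parts) == 2:
    action, name = parts       # action == 'f', name == 'John Cleese'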


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: raw_input that able to do detect multiple input

2013-04-06 Thread Frank
Hi Dave,


Sorry for my unclear question.
I didn't use d = load_friends('friends.csv') yet because I'm going to use it
for another function later on; I should have removed it first to avoid confusion.

This is the code for load_friends , add_info ,display_friends, save_friends 
function:

def load_friends(filename):
    f = open(filename, 'rU')
    for row in f:
        return list(row.strip() for row in f)

def add_info(new_info, new_list):
    # Persons name is the first item of the list
    name = new_info[0]
    # Check if we already have an item with that name
    for item in new_list:
        if item[0] == name:
            print "%s is already in the list" % name
            return False
    # Insert the item into the list
    new_list.append(new_info)
    return True

def display_friends(name, friends_list):
    Fname = name[0]
    for item in friends_list:
        if item[0] == Fname:
            print item
            break
    else:
        print False

def save_friends(friend_info, new_list):
    with open(friend_info, 'w') as f:
        for line in new_list:
            f.write(line + '\n')


I will elaborate on my question further. When the user types the function call
interact(), this will appear:

interact() 
Friends File: friends.csv 

after which the user would type in a command, say "f John Cleese". The program
needs to know whether the first character of the user input is "f", "a" or "e", and

if "f", it takes a name as an argument and prints out the information about
that friend, or prints an error message if the given name is not the name of a
friend in the database (friends.csv).

if "a", it takes four arguments (comma separated) with information about a
person and adds that person as a friend. An error message is printed if that
person is already a friend.

if "e", it ends the interaction and, if the friends information has been
updated, the information is saved to friends.csv.

This is the example output 

Command: f John Cleese 
John Cleese: Ministry of Silly Walks, 421, 27 October 
Command: f Michael Palin 
Unknown friend Michael Palin 
Command: f 
Invalid Command: f 
Command: a Michael Palin 
Invalid Command: a Michael Palin 
Command: a John Cleese, Cheese Shop, 5552233, 5 May 
John Cleese is already a friend 
Command: a Michael Palin, Cheese Shop, 5552233, 5 May 
Command: f Michael Palin 
Michael Palin: Cheese Shop, 5552233, 5 May 
Command: e 
Saving changes... 
Exiting... 

So currently I think I have my other functions ready, but I do not know how to
apply them in interact().

my rough idea is :

def interact(*arg):
    open('friends.csv', 'rU')
    d = load_friends('friends.csv')
    print "Friends File: friends.csv"
    s = raw_input()
    command = s.split(" ", 1)
    if "f" in command:
        # invoke display_friends function
        print result
    elif "a" in command:
        # invoke add_info function
        print result
    elif "e" in command:
        # invoke save_friends function
        print result

My idea is to split the user command into ['f', 'John Cleese'], use the 'f' to
trigger the "f" branch in the if statement, and then use the display_friends
function to process 'John Cleese', but I'm not sure if I'm able to do it this
way.
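
A rough sketch of that dispatch idea, assuming the helper functions
(load_friends, display_friends, add_info, save_friends) behave as intended:

def interact():
    friends = load_friends('friends.csv')
    print "Friends File: friends.csv"
    while True:
        s = raw_input("Command: ")
        action, sep, rest = s.partition(' ')
        if action == 'f' and rest:
            display_friends([rest], friends)
        elif action == 'a' and rest.count(',') == 3:
            add_info([part.strip() for part in rest.split(',')], friends)
        elif action == 'e':
            print "Saving changes..."
            save_friends('friends.csv', friends)
            print "Exiting..."
            break
        else:
            print "Invalid Command:", s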




-- 
http://mail.python.org/mailman/listinfo/python-list


Re: raw_input that able to do detect multiple input

2013-04-06 Thread Frank
Now you've saved the data in a different file.  How does the next run of 
the program find it? 


What user?  In what environment can a user enter function calls into 
your code? 
- The user will call the function from IDLE

Why is the command invalid? 
- Because the user needs to type a name after the "f"

That's not the way the message is worded in the code 
- Because if the user types in "a John Cleese, Cheese Shop, 5552233, 5 May",
it takes four arguments (comma separated) with information about a person and
adds that person to my "friends.csv". An error message is printed if that
person is already a friend. Because the name "John Cleese" is already in my
friends.csv, it prints "John Cleese is already a friend".

In this function and in save_friends, there is no return value, so not 
clear what you mean by  'result' 

- e ends the interaction and, if the friends information has been updated,
the information is saved to friends.csv. I think I used the wrong function
for this.

The question I'm told to work on:
interact() is the top-level function that defines the text-based user interface
as described in the introduction.

Here is an example of what is expected from your program. The input is
everything after Command: on a line (and the initial friends.csv). Everything
else is output. Your output should be exactly the same as below for
the given input.

interact() 
Friends File: friends.csv 
Command: f John Cleese 
John Cleese: Ministry of Silly Walks, 421, 27 October 
Command: f Michael Palin 
Unknown friend Michael Palin 
Command: f 
Invalid Command: f 
Command: a Michael Palin 
Invalid Command: a Michael Palin 
Command: a John Cleese, Cheese Shop, 5552233, 5 May 
John Cleese is already a friend 
Command: a Michael Palin, Cheese Shop, 5552233, 5 May 
Command: f Michael Palin 
Michael Palin: Cheese Shop, 5552233, 5 May 
Command: e 
Saving changes... 
Exiting...
-- 
http://mail.python.org/mailman/listinfo/python-list


How to test if a key in a dictionary exists?

2007-03-10 Thread Frank
Hi,

does anyone know how one can test if, e.g., a dictionary 'name' has a
key called 'name_key'?

This would be possible:

keys_of_names = names.keys()
L = len(keys_of_names)

for i in range(L):
    if keys_of_names[i] == name_key:
        print 'found'


But certainly not efficient. I would expect there is something like:

name.is_key(name_key)
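
For reference, a dict supports membership testing directly (Python 2 also has
names.has_key(name_key)):

if name_key in names:
    print 'found'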

I appreciate your help!

Frank

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Formatted Input

2007-03-10 Thread Frank
On Mar 10, 11:28 am, "Deep" <[EMAIL PROTECTED]> wrote:
> Hi all,
> I am a newbie to python
> I have an input of form
>  space 
> ie.
> 4 3
> how can i assign this to my variables??

Hi,

you could use:

import string   # needed for string.split (or simply use aux.split())

aux = f.readline()                 # read a line from your input file
new_aux = string.split(aux, ' ')   # you can also use other separators, e.g. '\t', or '  ' (more than one space), ...
L = len(new_aux)                   # number of values

Remark:
the elements in new_aux are strings, which means you have to convert them
to numbers with, e.g., int(new_aux[i]).
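
A compact variant of the same idea, assuming the line holds two
whitespace-separated integers:

a, b = [int(v) for v in f.readline().split()]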

Hope this helps,

Frank


-- 
http://mail.python.org/mailman/listinfo/python-list


Calling cpp from python/SWIG

2007-03-12 Thread Frank
Hi,

I have the following problem:

I want to call a cpp program from python. Let's call this cpp program
fct. The problem is that I will pass a large array, say M1, to fct
and also receive a large array, say M2, back (each about 1000 x 1000).

Normally, I would write M1 to a file, call fct via subprocess.Popen,
have fct write M2 to a file, and then read M2 back with python.
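
A sketch of that file-based round trip (assuming numpy, and that ./fct takes
the input and output file names on its command line - all names here are
illustrative only):

import subprocess
import numpy as np

M1 = np.random.rand(1000, 1000)
np.savetxt('m1.txt', M1)                          # hand M1 to the cpp program
subprocess.call(['./fct', 'm1.txt', 'm2.txt'])    # fct writes its result to m2.txt
M2 = np.loadtxt('m2.txt')                         # read M2 back into python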

I followed some threads discussing the use of swig in this context.

My questions are:

- is SWIG the best solution for my problem?
- are there other ways besides SWIG?
- where can I find examples of how to do this exactly?

Thanks to all!

Frank

-- 
http://mail.python.org/mailman/listinfo/python-list


Calling cpp from python/SWIG

2007-03-12 Thread Frank
Hi,

I have the following problem:

I want to pass an array M1 from python to a cpp function fct which
returns an array M2.

How can I do this best? Is SWIG appropriate or is there something
else?

If someone could give some code example or a link to a page with
examples, that would be great!

Thanks to all!

Frank

-- 
http://mail.python.org/mailman/listinfo/python-list


python/C++ wrapper

2007-03-13 Thread Frank

Hi,

is there anyone here that calls C++ programs from python via swig? It
seems that there are a lot of different ways to do that. For me it
would be important that the original c++ code (which is available)
does not need to be changed, and that the whole compilation process
(swig, python, g++, etc.) is as simple as possible.

Does anyone have a running example of the following problem:

- A c++ program receives a 2D-matrix from python as input and gives a
2D-matrix as output back to python.

That's all! I would expect there is someone who actually uses
swig for this kind of problem. If so, could you send me the code? That
would be great!

If it is important, I use linux and numpy.

Thanks,

Frank

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python/C++ wrapper

2007-03-14 Thread Frank
On Mar 14, 1:42 pm, "Gabriel Genellina" <[EMAIL PROTECTED]>
wrote:
> En Wed, 14 Mar 2007 01:55:55 -0300, Frank <[EMAIL PROTECTED]> escribió:
>
> > is there anyone here that calls C++ programs from python via swig? It
>
> I suggest you read the responses to your previous question; also search
> the list archives for this month.
>
> --
> Gabriel Genellina


Hi Gabriel,

and I suggest you read the question I asked precisely!

Frank

-- 
http://mail.python.org/mailman/listinfo/python-list


plot dendrogram with python

2007-03-27 Thread Frank
Hi,

does anyone know if there is a way to plot a dendrogram with python?
Pylab and matplotlib do not provide such a function.
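
One possible sketch, assuming a recent SciPy is available (its
scipy.cluster.hierarchy module provides linkage() and dendrogram(), which draw
onto a matplotlib figure):

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

data = np.random.rand(10, 3)          # 10 observations, 3 features
Z = linkage(data, method='average')   # hierarchical clustering
dendrogram(Z)                         # draws onto the current matplotlib figure
plt.show()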

Thanks!

Frank

-- 
http://mail.python.org/mailman/listinfo/python-list


rpy: parsing arrays from python to R

2007-03-27 Thread Frank

Hi,

I use rpy on linux to call R functions. It works fine, up to the following
problem: how to pass arrays (not vectors, i.e. 2-dimensional arrays) to
R without much effort?

The following code solves the problem (in two different ways).
However, it seems to me that there might be a way to do it more
efficiently.

rpy.r.assign("N", N)
rpy.r("A2 <- array(1:N^2, dim=c(N,N))")
rpy.r("A3 <- array(1:N^2, dim=c(N,N))")


for i in range(N):   # two alternative ways to parse
arrays
rpy.r.assign("Wi", W[i])
rpy.r.assign("i", i+1)
rpy.r("""for( j in 1:N ){
A2[i,j] <- Wi[j]}""")
for k in range(N):
rpy.r.assign("k", k+1)
rpy.r("A3[i,k] <- Wi[k]")


print rpy.r("A3")
print rpy.r("A2")


As I see it, the problem is that the 'assign' command works only
for scalars or vectors (one-dimensional arrays, but not
higher-dimensional arrays). I tried it for 2-dimensional arrays and the
result is a list whose components are vectors. Again, this is easy to
convert to a two-dimensional array, but the point here is that one has
to do it.

Maybe there are people using R with python who have some more
experience. I would be interested how they solved this problem.

Thanks!

Frank

-- 
http://mail.python.org/mailman/listinfo/python-list


plotting R graphics with rpy: figure crashes

2007-03-27 Thread Frank
Hi,

I use rpy to plot functions and have the following problem. When I
execute the following code line by line (start python and then execute
line by line) the resulting figure looks as it should. However, when I
put these lines in a script and execute the script the figure appears
for half a second but then crashes. Using pylab.show() at the end of
the script prevents the crash but now the window is no longer
refreshed (e.g., changing the size of the window makes the contents
disappear).

import pylab
import rpy

x = range(0, 10)
y = [ 2*i for i in x ]
rpy.r.plot(x,y)

I compared already with sys.version if the python version is the same
in both cases (it is).  Hence, the problem might be caused by rpy.

Has anyone an idea how to figure that out?

Thanks!

Frank

-- 
http://mail.python.org/mailman/listinfo/python-list


problems loading modules

2007-02-04 Thread Frank
Hi,

I have the following weird behavior when I load modules:


>>> import random
>>> print random.randrange(10)
8
>>>

Everything is fine.

>>> import random
>>> from numpy import *
>>>
>>> print random.randrange(10)
Traceback (most recent call last):
 File "", line 1, in ?
AttributeError: 'module' object has no attribute 'randrange'
>>>

Here it does not work.


>>> from numpy import *
>>> import random
>>>
>>> print random.randrange(10)
8
>>>

Now everything is back to normal.

That means the order the modules are loaded matters! I would expect
there is a problem with my installation because I guess this should
normally be independent of the loaded modules.
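
For reference, a sketch of what is going on and the usual way around it: numpy
also contains a subpackage called 'random', so the star-import rebinds the
name and hides the standard-library module.

import random
from numpy import *
print random                  # now numpy's random, not the stdlib module

import random                 # re-importing rebinds the name to the stdlib module
import numpy                  # the usual fix: avoid the star-import altogether
print random.randrange(10)    # stdlib random
print numpy.random.rand()     # numpy's random, explicitly qualified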

Here are my questions:
1. Does anyone else see this behavior too?

2. How can I fix this problem?

I use linux with fedora core 6 and python 2.4.4

I appreciate any hint. Thanks!

Frank

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: problems loading modules

2007-02-06 Thread Frank

Thanks guys!

-- 
http://mail.python.org/mailman/listinfo/python-list


Saving PyOpenGl figures as ps

2007-02-11 Thread Frank
Hi,

I installed pyopengl (opengl for python) on my linux box and
everything works fine. But now I want to save the generated images as,
e.g., ps or eps. How can I do that and how can I adjust the resolution
(if necessary)? This is probably simple but for some reason I can not
find out how to do that.
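
One possible sketch (not necessarily the best way): read the frame back with
glReadPixels and hand it to PIL; the result is a raster image, which can then
be converted to ps/eps with an external tool. The window size here is
illustrative only.

from OpenGL.GL import glReadPixels, glPixelStorei, GL_RGB, GL_UNSIGNED_BYTE, GL_PACK_ALIGNMENT
import Image   # PIL

width, height = 800, 600
glPixelStorei(GL_PACK_ALIGNMENT, 1)
data = glReadPixels(0, 0, width, height, GL_RGB, GL_UNSIGNED_BYTE)
img = Image.fromstring("RGB", (width, height), data)
img = img.transpose(Image.FLIP_TOP_BOTTOM)   # OpenGL's origin is bottom-left
img.save("figure.png")                       # convert to eps afterwards if needed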

I appreciate every hint!

Thanks, Frank

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: numpy, numarray, or numeric?

2007-02-15 Thread Frank
On Feb 15, 4:40 pm, "Christian Convey" <[EMAIL PROTECTED]>
wrote:
> I need to bang out an image processing library (it's schoolwork, so I
> can't just use an existing one).  But I see three libraries competing
> for my love: numpy, numarray, and numeric.
>
> Can anyone recommend which one I should use?  If one is considered the
> officially blessed one going forward, that would be my ideal.
>
> Thanks,
> Christian


Hi,

yeah, numpy is the newest one. It has only one drawback: there is no
comprehensive documentation available for free, although of course
you can buy one. numpy is very similar to the other two packages but
not identical, which means one always has some trouble finding out how
things work. For example, in numarray you can calculate the
eigenvectors of a matrix with eigenvectors(A); in numpy it is eig(A).
This looks similar, but the difference is that in numarray the
eigenvectors are returned as rows and in numpy as columns.
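
For reference, a small sketch of numpy's convention - the eigenvectors come
back as the columns of the second return value:

import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
w, v = np.linalg.eig(A)
# w[i] is the i-th eigenvalue; v[:, i] (a column) is the matching eigenvector
print w
print v[:, 0]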

If someone knows of a free manual, let me know.

Frank

-- 
http://mail.python.org/mailman/listinfo/python-list


saving path to modules permanently

2007-02-16 Thread Frank

Hi,

I want to create some modules to use them in my programs. Now the
problem is, how to add the path to these modules permanently to the
python search path.

For example:

import sys
sys.path.append('path to my modules')


works fine for one session but does not save this path permanently. I
also tried already to set the PYTHONPATH variable to my modules but
for some reason this does not work either.

Question:
Can one add a permanent path via python commands? (I checked
sys already, but it does not seem to have a function for this.)

If not, how to do it otherwise.

It would be great to give the precise commands if someone knows how.
(BTW, I use linux)
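
One possibility, sketched under the assumption that you have write access to
site-packages: drop a *.pth file containing the extra directory there; its
contents are added to sys.path at startup (the directory below is illustrative
only):

from distutils.sysconfig import get_python_lib
import os

site_packages = get_python_lib()
pth_file = os.path.join(site_packages, 'mymodules.pth')
f = open(pth_file, 'w')
f.write('/home/frank/mymodules\n')   # one directory per line
f.close()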

Thanks!

Frank

-- 
http://mail.python.org/mailman/listinfo/python-list


pyExcelerator: setting zoom of a worksheet

2008-11-25 Thread Frank

Hi everybody,

I use pyExcelerator and am quite happy with it, except that I cannot find an option
for setting the zoom of a particular worksheet. I am using pyExcelerator
0.6.3a, which is the latest as far as I know.


I already contacted the developer of pyExcelerator; he says there is a zoom option
in Worksheet.py, but it does not seem to be there in my version.


Anyone ever worked with it and maybe can help me?

Thanks a lot.

KR,
Frank


--
http://mail.python.org/mailman/listinfo/python-list


Re: Urgent:Serial Port Read/Write

2013-05-09 Thread Frank Miles
On Thu, 09 May 2013 23:35:53 +0800, chandan kumar wrote:

> Hi all,I'm new to python and facing issue using serial in python.I'm
> facing the below error
>     ser.write(port,command)NameError: global name 'ser' is not defined
> Please find the attached script and let me know whats wrong in my script
> and also how can i read data from serial port for the  same script.
[snip]
> if __name__ == "__main__":
> 
> CurrDir=os.getcwd()
> files = glob.glob('./*pyc')
> for f in files:
> os.remove(f)
> OpenPort(26,9600)
> SetRequest(ER_Address)
> #SysAPI.SetRequest('ER',ER_Address)
> 
> print "Test Scripts Execution complete"

What kind of 'port' is 26?  Is that valid on your machine?  My guess is 
that ser is NULL (because the open is failing, likely due to the port 
selection), leading to your subsequent problems.
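
For comparison, a minimal pyserial sketch (device name and command are
illustrative only):

import serial

ser = serial.Serial('/dev/ttyUSB0', 9600, timeout=1)   # open the port
ser.write('AT\r\n')                                     # send a command
print ser.readline()                                    # read the reply
ser.close()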

HTH..
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: PDF generator decision

2013-05-14 Thread Frank Miles
On Tue, 14 May 2013 08:05:53 -0700, Christian Jurk wrote:

> Hi folks,
> 
> This questions may be asked several times already, but the development
> of relevant software continues day-for-day. For some time now I've been
> using xhtml2pdf [1] to generate PDF documents from HTML templates (which
> are rendered through my Django-based web application. This have been
> working for some time now but I'm constantly adding new templates and
> they are not looking like I want it (sometimes bold text is bold,
> sometimes not, layout issues, etc). I'd like to use something else than
> xhtml2pdf.
> 
> So far I'd like to ask which is the (probably) best way to create PDFs
> in Python (3)? It is important for me that I am able to specify not only
> background graphics, paragaphs, tables and so on but also to specify
> page headers/footers. The reason is that I have a bunch of documents to
> be generated (including Invoice templates, Quotes - stuff like that).
> 
> Any advice is welcome. Thanks.
> 
> [1] https://github.com/chrisglass/xhtml2pdf

Reportlab works well in Python 2.x.  Their _next_ version is supposed to 
work with Python3... {yes, not much help there}
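
A minimal ReportLab sketch, for reference (filename and text are illustrative;
page headers/footers are simply drawn on each page before showPage()):

from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import A4

c = canvas.Canvas("invoice.pdf", pagesize=A4)
width, height = A4
c.drawString(72, height - 72, "Invoice #12345")   # simple page header
c.drawString(72, 72, "Page 1")                    # simple page footer
c.showPage()
c.save()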
-- 
http://mail.python.org/mailman/listinfo/python-list


Question about ast.literal_eval

2013-05-20 Thread Frank Millman

Hi all

I am trying to emulate a SQL check constraint in Python. Quoting from 
the PostgreSQL docs, "A check constraint is the most generic constraint 
type. It allows you to specify that the value in a certain column must 
satisfy a Boolean (truth-value) expression."


The problem is that I want to store the constraint as a string, and I 
was hoping to use ast.literal_eval to evaluate it, but it does not work.


>>> x = 'abc'
>>> x in ('abc', 'xyz')
True
>>> b = "x in ('abc', 'xyz')"
>>> eval(b)
True
>>> from ast import literal_eval
>>> literal_eval(b)
ValueError: malformed node or string: <_ast.Compare object at ...>

Is there a safe way to do what I want? I am using python 3.3.

Thanks

Frank Millman

--
http://mail.python.org/mailman/listinfo/python-list


Re: Question about ast.literal_eval

2013-05-20 Thread Frank Millman

[Corrected top-posting]

>> To: python-list@python.org

From: fr...@chagford.com
Subject: Question about ast.literal_eval
Date: Mon, 20 May 2013 09:05:48 +0200

Hi all

I am trying to emulate a SQL check constraint in Python. Quoting from
the PostgreSQL docs, "A check constraint is the most generic constraint
type. It allows you to specify that the value in a certain column must
satisfy a Boolean (truth-value) expression."

The problem is that I want to store the constraint as a string, and I
was hoping to use ast.literal_eval to evaluate it, but it does not work.



On 20/05/2013 09:34, Carlos Nepomuceno wrote:

> It seems to me you can't use ast.literal_eval()[1] to evaluate that kind of
> expression because it's just for literals[2].
>
> Why don't you use eval()?



Because users can create their own columns, with their own constraints. 
Therefore the string is user-modifiable, so it cannot be trusted.


Frank


--
http://mail.python.org/mailman/listinfo/python-list


Re: Question about ast.literal_eval

2013-05-20 Thread Frank Millman

On 20/05/2013 09:55, Chris Angelico wrote:

On Mon, May 20, 2013 at 5:50 PM, Frank Millman  wrote:

On 20/05/2013 09:34, Carlos Nepomuceno wrote:

Why don't you use eval()?



Because users can create their own columns, with their own constraints.
Therefore the string is user-modifiable, so it cannot be trusted.


Plenty of reason right there :)

Is it a requirement that they be able to key in a constraint as a
single string? We have a similar situation in one of the systems at
work, so we divided the input into three(ish) parts: pick a field,
pick an operator (legal operators vary according to field type -
integers can't be compared against regular expressions, timestamps can
use >= and < only), then enter the other operand. Sure, that cuts out
a few possibilities, but you get 99.9%+ of all usage and it's easy to
sanitize.

ChrisA



It is not a requirement, no. I just thought it would be a convenient 
short-cut.


I had in mind something similar to your scheme above, so I guess I will 
have to bite the bullet and implement it.


Thanks

Frank


--
http://mail.python.org/mailman/listinfo/python-list


Re: Question about ast.literal_eval

2013-05-20 Thread Frank Millman

On 20/05/2013 09:55, Carlos Nepomuceno wrote:




Why don't you use eval()?



Because users can create their own columns, with their own constraints.
Therefore the string is user-modifiable, so it cannot be trusted.


I understand your motivation but I don't know what protection 
ast.literal_eval() is offering that eval() doesn't.



Quoting from the manual -

"Safely evaluate an expression node or a string containing a Python 
expression. The string or node provided may only consist of the 
following Python literal structures: strings, bytes, numbers, tuples, 
lists, dicts, sets, booleans, and None."


The operative word is 'safely'. I don't know the details, but it 
prevents the kinds of exploits that can be carried out by malicious code 
using eval().
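
A quick illustration of the distinction:

from ast import literal_eval

literal_eval("['abc', 'xyz']")        # fine - a list literal
literal_eval("x in ('abc', 'xyz')")   # ValueError - a comparison is not a literal
literal_eval("__import__('os')")      # ValueError - a function call is not a literal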


I believe it is the same problem as SQL injection, which is solved by 
using parameterised queries.


Frank


--
http://mail.python.org/mailman/listinfo/python-list


Re: Question about ast.literal_eval

2013-05-20 Thread Frank Millman

On 20/05/2013 10:07, Frank Millman wrote:

On 20/05/2013 09:55, Chris Angelico wrote:

Is it a requirement that they be able to key in a constraint as a
single string? We have a similar situation in one of the systems at
work, so we divided the input into three(ish) parts: pick a field,
pick an operator (legal operators vary according to field type -
integers can't be compared against regular expressions, timestamps can
use >= and < only), then enter the other operand. Sure, that cuts out
a few possibilities, but you get 99.9%+ of all usage and it's easy to
sanitize.

ChrisA



It is not a requirement, no. I just thought it would be a convenient
short-cut.

I had in mind something similar to your scheme above, so I guess I will
have to bite the bullet and implement it.



Can anyone see anything wrong with the following approach. I have not 
definitely decided to do it this way, but I have been experimenting and 
it seems to work.


I store the boolean test as a json'd list of 6-part tuples. Each element 
of the tuple is a string, defined as follows -


0 - for the first entry in the list, the word 'check' (a placeholder - 
it is discarded at evaluation time), for any subsequent entries the word 
'and' or 'or'.


1 - left bracket - either '(' or ''.

2 - column name to check - it will be validated on entry.

3 - operator - must be one of '=', '!=', '<', '>', '<=', '>=', 'in', 
'is', 'is not'. At evaluation time, '=' is changed to '=='.


4 - value to compare - at evaluation time I call 
str(literal_eval(value)) to ensure that it is safe.


5 - right bracket - either ')' or ''.

At evaluation time I loop through the list, construct the boolean test 
as a string, and call eval() on it.


Here are some examples -

check = []
check.append(('check', '', 'name', 'in', "('abc', 'xyz')", ''))

check = []
check.append(('check', '', 'value', '>=', '0', ''))

check = []
check.append(('check', '(', 'descr', 'is not', 'None', ''))
check.append(('and', '', 'alt', 'is', 'None', ')'))
check.append(('or', '(', 'descr', 'is', 'None', ''))
check.append(('and', '', 'alt', 'is not', 'None', ')'))
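
Sketched out, the evaluation step described above might look something like
this (a rough sketch only; evaluate_check and values - a mapping of column
names to their current values - are hypothetical names):

from ast import literal_eval

def evaluate_check(check, values):
    parts = []
    for conj, lbr, col, op, val, rbr in check:
        if conj == 'check':
            conj = ''                        # placeholder on the first entry
        if op == '=':
            op = '=='
        operand = repr(values[col])          # current value of the column
        literal = str(literal_eval(val))     # ensure the comparison value is a literal
        parts.append(' '.join(p for p in (conj, lbr, operand, op, literal, rbr) if p))
    return eval(' '.join(parts), {'__builtins__': None}, {})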

I don't plan to check the logic - I will just display the exception if 
it does not evaluate.


It seems safe to me. Can anyone see a problem with it?

Frank


--
http://mail.python.org/mailman/listinfo/python-list


Re: Question about ast.literal_eval

2013-05-20 Thread Frank Millman

On 21/05/2013 04:39, matt.newvi...@gmail.com wrote:


You might find the asteval module (https://pypi.python.org/pypi/asteval) useful.   It 
provides a relatively safe "eval", for example:

 >>> import asteval
 >>> a = asteval.Interpreter()
 >>> a.eval('x = "abc"')
 >>> a.eval('x in ("abc", "xyz")')
 True
 >>> a.eval('import os')
 NotImplementedError
import os
 'Import' not supported
 >>> a.eval('__import__("os")')
 NameError
__import__("os")
 name '__import__' is not defined

This works by maintaining an internal namespace (a flat dictionary), and 
walking the AST generated for the expression.  It supports most Python syntax,
including if, for, while, and try/except blocks, and function definitions, and 
with the notable exceptions of eval, exec, class, lambda, yield, and import.   
This requires Python2.6 and higher, and does work with Python3.3.

Of course, it is not guaranteed to be completely safe, but it does disallow 
imports, which seems like the biggest vulnerability concern listed here.  
Currently, there is no explicit protection against long-running calculations 
for denial of service attacks.  If you're exposing an SQL database to 
user-generated code, that may be worth considering.


Thanks for this, Matt. I will definitely look into it.

Frank



--
http://mail.python.org/mailman/listinfo/python-list


Re: Question about ast.literal_eval

2013-05-20 Thread Frank Millman

On 20/05/2013 18:12, Steven D'Aprano wrote:

On Mon, 20 May 2013 15:26:02 +0200, Frank Millman wrote:


Can anyone see anything wrong with the following approach. I have not
definitely decided to do it this way, but I have been experimenting and
it seems to work.


[...]


It seems safe to me too, but then any fool can come up with a system
which they themselves cannot break :-)



Thanks for the detailed response.


I think the real worry is validating the column name. That will be
critical.


I would not pass the actual column name to eval(), I would use it to 
retrieve a value from a data object and pass that to eval(). However, 
then your point becomes 'validating the value retrieved'. I had not 
thought about that. I will investigate further.



Personally, I would strongly suggest writing your own mini-
evaluator that walks the list and evaluates it by hand. It isn't as
convenient as just calling eval, but *definitely* safer.



I am not sure I can wrap my mind around mixed 'and's, 'or's, and brackets.

[Thinking aloud]
Maybe I can manually reduce each internal test to a True or False, 
substitute them in the list, and then call eval() on the result.


eval('(True and False) or (False or True)')

I will experiment with that.


If you do call eval, make sure you supply the globals and locals
arguments. The usual way is:

eval(expression, {'__builtins__': None}, {})

which gives you an empty locals() and a minimal, (mostly) safe globals.



Thanks - I did not know about that.


Finally, as a "belt-and-braces" approach, I wouldn't even call eval
directly, but call a thin wrapper that raises an exception if the
expression contains an underscore. Underscores are usually the key to
breaking eval, so refusing to evaluate anything with an underscore raises
the barrier very high.

And even with all those defences, I wouldn't allow untrusted data from
the Internet anywhere near this. Just because I can't break it, doesn't
mean it's safe.



All good advice - much appreciated.

Frank


--
http://mail.python.org/mailman/listinfo/python-list


Re: Question about ast.literal_eval

2013-05-20 Thread Frank Millman

On 20/05/2013 18:13, Chris Angelico wrote:

On Mon, May 20, 2013 at 11:26 PM, Frank Millman  wrote:

0 - for the first entry in the list, the word 'check' (a placeholder - it is
discarded at evaluation time), for any subsequent entries the word 'and' or
'or'.

1 - left bracket - either '(' or ''.

5 - right bracket - either ')' or ''.


I think what you have is safe, but extremely complicated to work with.
Six separate pieces, and things have to be in the right slots... I
think you've spent too many "complexity points" on the above three
components, and you're getting too little return for them. What
happens if the nesting is mucked up? Could get verrry messy to check.

Combining multiple conditions with a mixture of ands and ors is a
nightmare to try to explain (unless you just point to the Python docs,
which IMO costs you even more complexity points); the only safe option
is to parenthesize everything. The system I pushed for at work (which
was finally approved and implemented a few months ago) is more or less
this: conditions are grouped together into blocks; for each group, you
can choose whether it's "all" or "any" (aka and/or), and you choose
whether the overall result is all-groups or any-group. That still
costs a few complexity points (and, btw, our *actual* implementation
is a bit more complicated than that, but I think we could cut it down
to what I just described here without loss of functionality), but it
gives the bulk of what people will actually want without the
complexities of point-and-click code.

The downside of that sort of system is that it requires a two-level
tree. On the flip side, that's often how people will be thinking about
their conditions anyway (eg using a pair of conditions ">" and "<" to
implement a range check - conceptually it's a single check), so that
won't cost too much.



You may be right, Chris, but I don't think my approach is all that bad.

The vast majority of tests will be simple - either a single line, or two 
lines for a range check, with no brackets at all.


If the requirement is more complicated than that, well, I don't think 
the complication can be avoided, and at least this approach gives full 
control.


FWIW, I use the same approach to allow users to construct their own 
WHERE clauses in custom views. Again, the vast majority are simple, but 
there are times when it can get complicated.


The proof of the pudding will be when I try to get ordinary users to get 
their own hands dirty - I am not there yet. If I ever get this released, 
the business model will be free software, but support will be charged 
for. So if a user gets out of his/her depth, there will be assistance 
available.


Time will tell who is right ... ;-)

Frank


--
http://mail.python.org/mailman/listinfo/python-list


Re: Question about ast.literal_eval

2013-05-21 Thread Frank Millman

On 21/05/2013 09:21, Steven D'Aprano wrote:

On Tue, 21 May 2013 08:30:03 +0200, Frank Millman wrote:


I am not sure I can wrap my mind around mixed 'and's, 'or's, and
brackets.


Parsers are a solved problem in computer science, he says as if he had a
clue what he was talking about *wink*

Here's a sketch of a solution... suppose you have a sequence of records,
looking like this:

(bool_op, column_name, comparison_op, literal)

with appropriate validation on each field. The very first record has
bool_op set to "or". Then, you do something like this:

import operator
OPERATORS = {
    '=': operator.eq,
    'is': operator.is_,
    '<': operator.lt,
    # etc.
    }

def eval_op(column_name, op, literal):
    value = lookup(column_name)  # whatever...
    return OPERATORS[op](value, literal)

result = False

for (bool_op, column_name, comparison_op, literal) in sequence:
    flag = eval_op(column_name, comparison_op, literal)
    if bool_op == 'and':
        result = result and flag
    else:
        assert bool_op == 'or'
        result = result or flag
    # Lazy processing?
    if result:
        break

and in theory it should all Just Work.


That's very clever - thanks, Steven.

It doesn't address the issue of brackets. I imagine that the answer is 
something like -


  maintain a stack of results
  for each left bracket, push a level
  for each right bracket, pop the result

or something ...

I am sure that with enough trial and error I can get it working, but I 
might cheat for now and use the trick I mentioned earlier of calling 
eval() on a sequence of manually derived True/False values. I really 
can't see anything going wrong with that.


BTW, thanks to ChrisA for the following tip -

import operator
ops = {
  'in':lambda x,y: x in y,  # operator.contains has the args backwards

I would have battled with that one.

Frank


--
http://mail.python.org/mailman/listinfo/python-list


Re: Future standard GUI library

2013-06-13 Thread Frank Millman

"Wolfgang Keller"  wrote in message 
news:2013061819.2a044e86ab4b6defe1939...@gmx.net...
>
> But could it be that you have never seen an actually proficient user of
> a typical "enterprise" application (ERP, MRP, whatever) "zipping"
> through the GUI of his/her bread-and-butter application so fast that
> you can not even read the titles of windows or dialog boxes.
>
> Obviously, this won't work if the client runs on this pathological
> non-operating system MS (Not Responding), much less with "web
> applications".
>
[...]
>
>>
>> On a LAN, with a proper back-end, I can get instant response from a
>> web app.
>
> I have been involved as "domain specialist" (and my input has always
> been consistently conveniently ignored) with projects for web
> applications and the results never turned out to be even remotely usable
> for actually productive work.
>

Hi Wolfgang

I share your passion for empowering a human operator to complete and submit 
a form as quickly as possible. I therefore agree that one should be able to 
complete a form using the keyboard only.

There is an aspect I am unsure of, and would appreciate any feedback based 
on your experience.

I am talking about what I call 'field-by-field validation'. Each field could 
have one or more checks to ensure that the input is valid. Some can be done 
on the client (e.g. value must be numeric), others require a round-trip to 
the server (e.g. account number must exist on file). Some applications defer 
the server-side checks until the entire form is submitted, others perform 
the checks in-line. My preference is for the latter.

I agree with Chris that on a LAN, it makes little or no difference whether 
the client side is running a web browser or a traditional gui interface. On 
a WAN, there could be a latency problem. Ideally an application should be 
capable of servicing a local client or a remote client, so it is not easy to 
find the right balance.

Do you have strong views on which is the preferred approach.

Thanks for any input.

Frank Millman



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Future standard GUI library

2013-06-13 Thread Frank Millman

"Chris Angelico"  wrote in message 
news:CAPTjJmo+fWsCD3Lb6s+zmWspKzzk_JB=pbcvflbzjgcfxvm...@mail.gmail.com...
> On Thu, Jun 13, 2013 at 7:32 PM, Frank Millman  wrote:
>> I am talking about what I call 'field-by-field validation'. Each field 
>> could
>> have one or more checks to ensure that the input is valid. Some can be 
>> done
>> on the client (e.g. value must be numeric), others require a round-trip 
>> to
>> the server (e.g. account number must exist on file). Some applications 
>> defer
>> the server-side checks until the entire form is submitted, others perform
>> the checks in-line. My preference is for the latter.
>
> It's not either-or. The server *MUST* perform the checks at the time
> of form submission; the question is whether or not to perform
> duplicate checks earlier. This is an absolute rule of anything where
> the client is capable of being tampered with, and technically, you
> could violate it on a closed system; but it's so easy to migrate from
> closed system to diverse system without adding all the appropriate
> checks, so just have the checks from the beginning.
>

In my case, it is either-or. I do not just do field-by-field validation, I 
do field-by-field submission. The server builds up a record of the data 
entered while it is being entered. When the user selects 'Save', it does not 
resend the entire form, it simply sends a message to the server telling it 
to process the data it has already stored.

> In terms of software usability, either is acceptable, but do make sure
> the user can continue working with the form even if there's latency
> talking to the server - don't force him/her to wait while you check if
> the previous field was valid. I know that seems obvious, but
> apparently not to everyone, as there are forms out there that violate
> this...
>

I plead guilty to this, but I am not happy about it, hence my original post. 
I will take on board your comments, and see if I can figure out a way to 
have the best of both worlds.

Frank



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Future standard GUI library

2013-06-14 Thread Frank Millman

"Chris Angelico"  wrote in message 
news:captjjmq_m4y0uxxt3jqythjj9ckbsvp+z2pgf5v_31xlrgf...@mail.gmail.com...
> On Fri, Jun 14, 2013 at 3:39 PM, Frank Millman  wrote:
>>
>> In my case, it is either-or. I do not just do field-by-field validation, 
>> I
>> do field-by-field submission. The server builds up a record of the data
>> entered while it is being entered. When the user selects 'Save', it does 
>> not
>> resend the entire form, it simply sends a message to the server telling 
>> it
>> to process the data it has already stored.
>
> Ah, I see what you mean. What I was actually saying was that it's
> mandatory to check on the server, at time of form submission, and
> optional to pre-check (either on the client itself, for simple
> syntactic issues, or via AJAX or equivalent) for faster response.
>
> As a general rule, I would be inclined to go with a more classic
> approach for reasons of atomicity. What happens if the user never gets
> around to selecting Save? Does the server have a whole pile of data
> that it can't do anything with? Do you garbage-collect that
> eventually? The classic model allows you to hold off inserting
> anything into the database until it's fully confirmed, and then do the
> whole job in a single transaction.
>

The data is just stored in memory in a 'Session' object. I have a 
'keep-alive' feature that checks if the client is alive, and removes the 
session with all its data if it detects that the client has gone away. 
Timeout is configurable, but I have it set to 30 seconds at the moment.

The session is removed immediately if the user logs off. He is warned if 
there is unsaved data.

Frank



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Debugging memory leaks

2013-06-20 Thread Frank Millman

"writeson"  wrote in message 
news:09917103-b35e-4728-8fea-bcb4ce2bd...@googlegroups.com...
> Hi all,
>
> I've written a program using Twisted that uses SqlAlchemy to access a 
> database using threads.deferToThread(...) and SqlAlchemy's 
> scoped_session(...). This program runs for a long time, but leaks memory 
> slowly to the point of needing to be restarted. I don't know that the 
> SqlAlchemy/threads thing is the problem, but thought I'd make you aware of 
> it.
>
> Anyway, my real question is how to go about debugging memory leak problems 
> in Python, particularly for a long running server process written with 
> Twisted. I'm not sure how to use heapy or guppy, and objgraph doesn't tell 
> me enough to locate the problem. If anyone as any suggestions or pointers 
> it would be very much appreciated!
>
> Thanks in advance,
> Doug

You have received lots of good advice, but there is one technique that I 
have found useful that has not been mentioned.

As you are probably aware, one of the main causes of a 'memory leak' in 
python is an object that is supposed to be garbage collected, but hangs 
around because there is still a reference pointing to it.

You cannot directly confirm that an object has been deleted, because 
invoking its '__del__' method causes side-effects which can prevent it from 
being deleted even if it is otherwise ok.

However, there is an indirect way of confirming it - a 'DelWatcher' class. I 
got this idea from a thread on a similar subject in this forum a long time 
ago. Here is how it works.

class DelWatcher:
    def __init__(self, obj):
        # do not store a reference to obj - that would create a circular reference
        # store some attribute that uniquely identifies the 'obj' instance
        self.name = obj.name
        print(self.name, 'created')
    def __del__(self):
        print(self.name, 'deleted')

class MyClass:
    def __init__(self, ...):
        [...]
        self._del = DelWatcher(self)

Now you can watch the objects as they are created, and then check that they 
are deleted when you expect them to be.

This can help to pinpoint where the memory leak is occurring.

HTH

Frank Millman



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Default scope of variables

2013-07-09 Thread Frank Millman

"Chris Angelico"  wrote in message 
news:captjjmqkmfd4-jpugr-vubub6ribv6k_mwnxc_u3cvabr_w...@mail.gmail.com...
> On Tue, Jul 9, 2013 at 4:08 PM, alex23  wrote:
>> On 9/07/2013 3:07 PM, Chris Angelico wrote:
>>>
>>> The subtransactions are NOT concepted as separate transactions. They
>>> are effectively the database equivalent of a try/except block.
>>
>>
>> Sorry, I assumed each nested query was somehow related to the prior
>> one. In which case, I'd probably go with Ethan's suggestion of a
>> top-level transaction context manager with its own substransaction
>> method.
>
> Yeah, that would probably be the best option in this particular
> instance. Though I do still like the ability to have variables shadow
> each other, even if there's a way around one particular piece of code
> that uses the technique.
>

I have been following this sub-thread with interest, as it resonates with 
what I am doing in my project.

In my case, one update can trigger another, which can trigger another, etc. 
It is important that they are treated as a single transaction. Each object 
has its own 'save' method, so there is not one place where all updates are 
executed, and I found it tricky to control.

I came up with the following context manager -

class DbSession:
    """
    A context manager to handle database activity.
    """

    def __init__(self):
        self.conn = None
        self.no_connections = 0
        self.transaction_active = False

    def __enter__(self):
        if self.conn is None:
            self.conn = _get_connection()  # get connection from pool
            self.conn.cur = self.conn.cursor()
            # all updates in same transaction use same timestamp
            self.conn.timestamp = datetime.now()
        self.no_connections += 1
        return self.conn

    def __exit__(self, type, exc, tb):
        if type is not None:  # an exception occurred
            if self.transaction_active:
                self.conn.rollback()
                self.transaction_active = False
            self.conn.release()  # return connection to pool
            self.conn = None
            return  # will reraise exception
        self.no_connections -= 1
        if not self.no_connections:
            if self.transaction_active:
                self.conn.commit()
                self.transaction_active = False
            self.conn.cur.close()
            self.conn.release()  # return connection to pool
            self.conn = None

All objects created within a session share a common DbSession instance.

When any of them need any database access, whether for reading or for 
updating, they execute the following -

with db_session as conn:
    conn.transaction_active = True  # this line must be added if updating
    conn.cur.execute(__whatever__)

Now it 'just works'. I don't have the need for save-points - either all 
updates happen, or none of them do.

Frank Millman



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Default scope of variables

2013-07-09 Thread Frank Millman

"Chris Angelico"  wrote in message 
news:captjjmr4mr0qcgwqxwyvdcz55nuav79vbtt8bjndsdvhrkq...@mail.gmail.com...
> On Tue, Jul 9, 2013 at 5:35 PM, Frank Millman  wrote:
>> I have been following this sub-thread with interest, as it resonates with
>> what I am doing in my project.
>
> Just FYI, none of my own code will help you as it's all using libpqxx,
> but the docs for the library itself are around if you want them (it's
> one of the standard ways for C++ programs to use PostgreSQL).
>

I support multiple databases (PostgreSQL, MS SQL Server, sqlite3 at this 
stage) so I use generic Python as much as possible.

>> I came up with the following context manager -
>>
>> class DbSession:
>> def __exit__(self, type, exc, tb):
>> if self.transaction_active:
>> self.conn.commit()
>> self.transaction_active = False
>
> Hmm. So you automatically commit. I'd actually be inclined to _not_ do
> this; make it really explicit in your code that you now will commit
> this transaction (which might throw an exception if you have open
> subtransactions).
>

I endeavour to keep all my database activity to the shortest time possible - 
get a connection, execute a command, release the connection. So setting 
'transaction_active = True' is my way of saying 'execute this command and 
commit it straight away'. That is explicit enough for me. If there are 
nested updates they all follow the same philosophy, so the transaction 
should complete quickly.

Frank



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Default scope of variables

2013-07-09 Thread Frank Millman

"Ian Kelly"  wrote in message 
news:CALwzid=fzgjpebifx1stdbkh8iwltwggwwptphz1ykyg+05...@mail.gmail.com...
> On Tue, Jul 9, 2013 at 1:35 AM, Frank Millman  wrote:
>> When any of them need any database access, whether for reading or for
>> updating, they execute the following -
>>
>> with db_session as conn:
>> conn.transaction_active = True  # this line must be added if
>> updating
>> conn.cur.execute(__whatever__)
>
> I'd probably factor out the transaction_active line into a separate
> DbSession method.
>
>    @contextmanager
>    def updating(self):
>        with self as conn:
>            conn.transaction_active = True
>            yield conn
>
> Then you can do "with db_session" if you're merely reading, or "with
> db_session.updating()" if you're writing, and you don't need to repeat
> the transaction_active line all over the place.
>

I'll bear it in mind, but I will have to expend some mental energy to 
understand it first , so it will have to wait until I can find some time.

> I would also probably make db_session a factory function instead of a 
> global.

It is not actually a global. When I create a new session, I create a 
db_session instance and store it as a session attribute. Whenever I create a 
database object during the session, I pass in the instance as an argument, 
so they all share the same one.

Frank



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Default scope of variables

2013-07-09 Thread Frank Millman

"Ian Kelly"  wrote in message 
news:calwzidnf3obe0enf3xthlj5a40k8hxvthveipecq8+34zxy...@mail.gmail.com...
> On Tue, Jul 9, 2013 at 10:07 AM, Ethan Furman  wrote:
>> You could also do it like this:
>>
>> def updating(self):
>> self.transaction_active = True
>> return self
>
> Yes, that would be simpler.  I was all set to point out why this
> doesn't work, and then I noticed that the location of the
> "transaction_active" attribute is not consistent in the original code.
> The DbSession class places it on self, and then the example usage
> places it on the connection object (which I had based my version on).
> Since that seems to be a source of confusion, it demonstrates another
> reason why factoring this out is a good thing.

You had me worried there for a moment, as that is obviously an error.

Then I checked my actual code, and I find that I mis-transcribed it. It 
actually looks like this -

with db_session as conn:
    db_session.transaction_active = True
    conn.cur.execute(...)

I am still not quite sure what your objection is to this. It feels 
straightforward to me.

Here is one possible answer. Whenever I want to commit a transaction I have 
to add the extra line. There is a danger that I could mis-spell 
'transaction_active', in which case it would not raise an error, but would 
not commit the transaction, which could be a hard-to-trace bug. Using your 
approach, if I mis-spelled 'db_session.connect()', it would immediately 
raise an error.

Is that your concern, or are there other issues?

Frank



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Default scope of variables

2013-07-10 Thread Frank Millman

"Ian Kelly"  wrote in message 
news:calwzidk2+b5bym5b+xvtoz8lheyvhcos4v58f8z2o1jb6sa...@mail.gmail.com...
> On Tue, Jul 9, 2013 at 11:54 PM, Frank Millman  wrote:
>> You had me worried there for a moment, as that is obviously an error.
>>
>> Then I checked my actual code, and I find that I mis-transcribed it. It
>> actually looks like this -
>>
>> with db_session as conn:
>> db_session.transaction_active = True
>> conn.cur.execute(...)
>>
>> I am still not quite sure what your objection is to this. It feels
>> straightforward to me.
>>
>> Here is one possible answer. Whenever I want to commit a transaction I 
>> have
>> to add the extra line. There is a danger that I could mis-spell
>> 'transaction_active', in which case it would not raise an error, but 
>> would
>> not commit the transaction, which could be a hard-to-trace bug. Using 
>> your
>> approach, if I mis-spelled 'db_session.connect()', it would immediately
>> raise an error.
>>
>> Is that your concern, or are there other issues?
>
> Yes, that is one concern.  Another is that since you mistakenly typed
> "conn" instead of "db_session" once, you might make the same mistake
> again in actual code, with the same effect (unless the conn object
> doesn't allow arbitrary attributes, which is a possibility).  Another
> is that the code adheres better to the DRY principle if you don't need
> to copy that line all over the place.

Thanks to you and Ethan - that does make sense.

I have reviewed my code base to see how many occurrences there are, and 
there are just three.

All database objects inherit from a DbOject class. The class has a save() 
method and a delete() method. Each of these requires a commit, so I use my 
technique there. All database updates are activated by calling save() or 
delete().

I have an init.py script to 'bootstrap' a brand new installation, which 
requires populating an empty database with some basic structures up front. 
DbObject cannot be used here as the required plumbing is not in place, so I 
use my technique here as well.

However, this does not invalidate your general point, so I will keep it in 
mind.

Thanks

Frank



-- 
http://mail.python.org/mailman/listinfo/python-list


Problem with psycopg2, bytea, and memoryview

2013-07-31 Thread Frank Millman
Hi all

I don't know if this question is more appropriate for the psycopg2 list, but 
I thought I would ask here first.

I have some binary data (a gzipped xml object) that I want to store in a 
database. For PostgreSQL I use a column with datatype 'bytea', which is 
their recommended way of storing binary strings.

I use psycopg2 to access the database. It returns binary data in the form of 
a python 'memoryview'.

My problem is that, after a roundtrip to the database and back, the object 
no longer compares equal to the original.

>>> memoryview(b'abcdef') == b'abcdef'
True
>>> cur.execute('create table fmtemp (code int, xml bytea)')
>>> cur.execute('insert into fmtemp values (%s, %s)', (1, b'abcdef'))
>>> cur.execute('select * from fmtemp where code =1')
>>> row = cur.fetchone()
>>> row
(1, <memory at 0x...>)
>>> row[1] == b'abcdef'
False
>>> row[1].tobytes() == b'abcdef'
True
>>>

Using MS SQL Server and pyodbc, it returns a byte string, not a memoryview, 
and it does compare equal with the original.

I can hack my program to use tobytes(), but it would add complication, and 
it would be database-specific. I would prefer a cleaner solution.

Does anyone have any suggestions?

Versions - Python: 3.3.2  PostgreSQL: 9.2.4  psycopg2: 2.5

Frank Millman



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problem with psycopg2, bytea, and memoryview

2013-07-31 Thread Frank Millman

"Antoine Pitrou"  wrote in message 
news:loom.20130731t114936-...@post.gmane.org...
> Frank Millman  chagford.com> writes:
>>
>> I have some binary data (a gzipped xml object) that I want to store in a
>> database. For PostgreSQL I use a column with datatype 'bytea', which is
>> their recommended way of storing binary strings.
>>
>> I use psycopg2 to access the database. It returns binary data in the form 
>> of
>> a python 'memoryview'.
>>
> [...]
>>
>> Using MS SQL Server and pyodbc, it returns a byte string, not a 
>> memoryview,
>> and it does compare equal with the original.
>>
>> I can hack my program to use tobytes(), but it would add complication, 
>> and
>> it would be database-specific. I would prefer a cleaner solution.
>
> Just cast the result to bytes (`bytes(row[1])`). It will work both with bytes
> and memoryview objects.
>
> Regards
>
> Antoine.
>

Thanks for that, Antoine. It is an improvement over tobytes(), but i am 
afraid it is still not ideal for my purposes.

At present, I loop over a range of columns, comparing 'before' and 'after' 
values, without worrying about their types. Strings are returned as str, 
integers are returned as int, etc. Now I will have to check the type of each 
column before deciding whether to cast to 'bytes'.
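
Just to illustrate, this is roughly the kind of per-column check I am trying 
to avoid (a sketch only - 'old_row' and 'new_row' are hypothetical tuples 
returned by the cursor) -

def normalise(value):
    # psycopg2 returns bytea columns as memoryview; cast back to bytes
    # so that comparisons with the original byte strings behave
    if isinstance(value, memoryview):
        return bytes(value)
    return value

changed = any(normalise(before) != normalise(after)
              for before, after in zip(old_row, new_row))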

Can anyone explain *why* the results do not compare equal? If I understood 
the problem, I might be able to find a workaround.

Frank



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Problem with psycopg2, bytea, and memoryview

2013-07-31 Thread Frank Millman

"Antoine Pitrou"  wrote in message 
news:loom.20130731t150154-...@post.gmane.org...
> Frank Millman  chagford.com> writes:
>>
>> Thanks for that, Antoine. It is an improvement over tobytes(), but i am
>> afraid it is still not ideal for my purposes.
>
> I would suggest asking the psycopg2 project why they made this choice, and
> if they would reconsider. Returning a memoryview doesn't make much sense 
> IMHO.
>

I'll try it, and see what they say.

> For example, the standard sqlite3 module returns bytes for BLOB columns,
> and str for TEXT columns:
> http://docs.python.org/3.4/library/sqlite3.html#introduction
>
>> Can anyone explain *why* the results do not compare equal? If I 
>> understood
>> the problem, I might be able to find a workaround.
>
> Well, under recent Python versions, they should compare equal:
>
> Python 3.2.3 (default, Oct 19 2012, 19:53:16)
> [GCC 4.7.2] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> memoryview(b"abc") == b"abc"
> True
>

I am using Python 3.3.2.

If I try your example above, it does work.

However, for some reason, after a round-trip to the server, they do not 
compare equal.

See my original post for a full example.

Frank



-- 
http://mail.python.org/mailman/listinfo/python-list


Question about weakref

2012-07-04 Thread Frank Millman

Hi all

I have a situation where I thought using weakrefs would save me a bit of 
effort.


I have a basic publish/subscribe scenario. The publisher maintains a 
list of listener objects, and has a method whereby a listener can 
subscribe to the list by passing in 'self', whereupon it gets added to 
the list.


When the publisher has something to say, it calls a pre-defined method 
on each of the listeners. Simple, but it works.


The listeners are fairly transient, so when they go out of scope, I need 
to remove them from the list maintained by the publisher. Instead of 
keeping track of all of them and removing them explicitly, I thought of 
using weakrefs and let them be removed automatically.


It almost works. Here is an example -

import weakref

class A:  # the publisher class
    def __init__(self):
        self.array = []
    def add_b(self, b):
        self.array.append(weakref.ref(b, self.del_b))
    def del_b(self, b):
        self.array.remove(b)
    def chk_b(self, ref):
        for b in self.array:
            b().hallo(ref)

class B:  # the listener class
    def __init__(self, a, name):
        self.name = name
        a.add_b(self)
    def hallo(self, ref):
        print(self.name, 'hallo from', ref)
    def __del__(self):
        print('%s deleted' % self.name)

a = A()
x = B(a, 'test x')
y = B(a, 'test y')
z = B(a, 'test z')
a.chk_b(1)
del x
a.chk_b(2)
del y
a.chk_b(3)
del z
a.chk_b(4)
print(a.array)

The output is as expected -

test x hallo from 1
test y hallo from 1
test z hallo from 1
test x deleted
test y hallo from 2
test z hallo from 2
test y deleted
test z hallo from 3
test z deleted
[]

Then I tried weakref.proxy.

I changed
self.array.append(weakref.ref(b, self.del_b))
to
self.array.append(weakref.proxy(b, self.del_b))
and
b().hallo(ref)
to
b.hallo(ref)

I got the same result.

Then I varied the order of deletion - instead of x, then y, then z, I 
tried x, then z, then y.


Now I get the following traceback -

test x hallo from 1
test y hallo from 1
test z hallo from 1
test x deleted
test y hallo from 2
test z hallo from 2
Exception ReferenceError: 'weakly-referenced object no longer exists' in 
<bound method A.del_b of <__main__.A object at 0x00A8A750>> ignored
test z deleted
test y hallo from 3
Traceback (most recent call last):
  File "F:\junk\weaklist.py", line 70, in 
a.chk_b(3)
  File "F:\junk\weaklist.py", line 51, in chk_b
b.hallo(ref)
ReferenceError: weakly-referenced object no longer exists
test y deleted

If I go back to using weakref.ref, but with the new deletion order, it 
works.


So now I am confused.

1. Why do I get the traceback?

2. Can I rely on using weakref.ref, or does that also have some problem 
that has just not appeared yet?


Any advice will be appreciated.

BTW, I am using python 3.2.2.

Thanks

Frank Millman

--
http://mail.python.org/mailman/listinfo/python-list


Re: Question about weakref

2012-07-05 Thread Frank Millman

On 05/07/2012 10:46, Dieter Maurer wrote:

Frank Millman  writes:


I have a situation where I thought using weakrefs would save me a bit
of effort.


Instead of the low level "weakref", you might use a "WeakKeyDictionary".



Thanks, Dieter. I could do that.

In fact, a WeakSet suits my purposes better. I tested it with my 
original example, and it works correctly. It also saves me the step of 
deleting the weak reference once the original object is deleted, as it 
seems to do that automatically.
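
For the record, the WeakSet version of the publisher looks roughly like this 
(a sketch based on the earlier example) -

import weakref

class A:  # the publisher class
    def __init__(self):
        self.listeners = weakref.WeakSet()
    def add_b(self, b):
        self.listeners.add(b)   # entry disappears automatically when b is garbage-collected
    def chk_b(self, ref):
        for b in self.listeners:
            b.hallo(ref)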


I just need to double-check that I would never have the same 
listener-object try to register itself with the publisher twice, as that 
would obviously fail with a Set, as it would with a Dict.


I would still like to know why weakref.proxy raised an exception. I have 
re-read the manual several times, and googled for similar problems, but 
am none the wiser. Naturally I feel a bit uneasy using a feature of the 
language which sometimes fails mysteriously, so if anyone has an 
explanation, I would really appreciate it.


Frank

--
http://mail.python.org/mailman/listinfo/python-list


Re: Question about weakref

2012-07-06 Thread Frank Millman

On 05/07/2012 19:47, Dieter Maurer wrote:

Frank Millman  writes:


I would still like to know why weakref.proxy raised an exception. I
have re-read the manual several times, and googled for similar
problems, but am none the wiser.


In fact, it is documented. Accessing a proxy will raise an exception
when the proxied object no longer exists.

What you can ask is why your proxy has been accessed after the
object was deleted. The documentation is specific: during the callback,
the object should still exist. Thus, apparently, one of your proxies
outlived an event that should have deleted it (probably a hole in
your logic).



I have investigated a bit further, and now I have a clue as to what is 
happening, though not a full understanding.


If you use 'b = weakref.ref(obj)', 'b' refers to the weak reference, and 
'b()' refers to the referenced object.


If you use 'b = weakref.proxy(obj)', 'b' refers to the referenced 
object. I don't know how to refer to the weak reference itself. In a way 
that is the whole point of using 'proxy', but the difficulty comes when 
you want to remove the weak reference when the referenced object is deleted.


This is from the manual section on weakref.ref -
"If callback is provided and not None, and the returned weakref object 
is still alive, the callback will be called when the object is about to 
be finalized; the weak reference object will be passed as the only 
parameter to the callback; the referent will no longer be available."


My callback method looks like this -
def del_b(self, b):
    self.array.remove(b)
It successfully removes the weak reference from self.array.

This is from the manual section on weakref.proxy -
"callback is the same as the parameter of the same name to the ref() 
function."


My callback method looks the same. However, although 'b' is the weak 
reference, when I refer to 'b' it refers to the original object, which 
at this stage no longer exists.


So my revised question is -
  How can you remove the weak reference if you use proxy?

The full story is more complicated than that - why does my example work 
when I delete x, then y, then z, but not if I reverse the order?


However, I think that I have isolated the fundamental reason. So any 
comments on my latest findings will be appreciated.


Frank

--
http://mail.python.org/mailman/listinfo/python-list


Re: Question about weakref

2012-07-06 Thread Frank Millman

On 06/07/2012 20:12, Ethan Furman wrote:

Ian Kelly wrote:

def del_b(self, b):
    for i, x in enumerate(self.array):
        if b is x:
            del self.array[i]
            break


Nice work, Ian.


I second that. Thanks very much, Ian.

Frank

--
http://mail.python.org/mailman/listinfo/python-list


Regex Question

2012-08-17 Thread Frank Koshti
Hi,

I'm new to regular expressions. I want to be able to match for tokens
with all their properties in the following examples. I would
appreciate some direction on how to proceed.


@foo1
@foo2()
@foo3(anything could go here)


Thanks-
Frank
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Regex Question

2012-08-18 Thread Frank Koshti
I think the point was missed. I don't want to use an XML parser. The
point is to pick up those tokens, and yes I've done my share of RTFM.
This is what I've come up with:

'\$\w*\(?.*?\)'

Which doesn't work well on the above example, which is partly why I
reached out to the group. Can anyone help me with the regex?

Thanks,
Frank
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Regex Question

2012-08-18 Thread Frank Koshti
Hey Steven,

Thank you for the detailed (and well-written) tutorial on this very
issue. I actually learned a few things! Though, I still have
unresolved questions.

The reason I don't want to use an XML parser is because the tokens are
not always placed in HTML, and even in HTML, they may appear in
strange places, such as Hello. My specific issue is
I need to match, process and replace $foo(x=3), knowing that (x=3) is
optional, and the token might appear simply as $foo.

To do this, I decided to use:

re.compile('\$\w*\(?.*?\)').findall(mystring)

the issue with this is it doesn't match $foo by itself, and requires
there to be () at the end.

Thanks,
Frank
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Regex Question

2012-08-18 Thread Frank Koshti
On Aug 18, 11:48 am, Peter Otten <__pete...@web.de> wrote:
> Frank Koshti wrote:
> > I need to match, process and replace $foo(x=3), knowing that (x=3) is
> > optional, and the token might appear simply as $foo.
>
> > To do this, I decided to use:
>
> > re.compile('\$\w*\(?.*?\)').findall(mystring)
>
> > the issue with this is it doesn't match $foo by itself, and requires
> > there to be () at the end.
> >>> s = """
> ... $foo1
> ... $foo2()
> ... $foo3(anything could go here)
> ... """
> >>> re.compile("(\$\w+(?:\(.*?\))?)").findall(s)
> ['$foo1', '$foo2()', '$foo3(anything could go here)']

PERFECT-
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Regex Question

2012-08-18 Thread Frank Koshti
On Aug 18, 12:22 pm, Jussi Piitulainen 
wrote:
> Frank Koshti writes:
> > not always placed in HTML, and even in HTML, they may appear in
> > strange places, such as Hello. My specific issue
> > is I need to match, process and replace $foo(x=3), knowing that
> > (x=3) is optional, and the token might appear simply as $foo.
>
> > To do this, I decided to use:
>
> > re.compile('\$\w*\(?.*?\)').findall(mystring)
>
> > the issue with this is it doesn't match $foo by itself, and requires
> > there to be () at the end.
>
> Adding a ? after the meant-to-be-optional expression would let the
> regex engine know what you want. You can also separate the mandatory
> and the optional part in the regex to receive pairs as matches. The
> test program below prints this:
>
> >$foo()$foo(bar=3)$$$foo($)$foo($bar(v=0))etc
> ('$foo', '')
> ('$foo', '(bar=3)')
> ('$foo', '($)')
> ('$foo', '')
> ('$bar', '(v=0)')
>
> Here is the program:
>
> import re
>
> def grab(text):
>     p = re.compile(r'([$]\w+)([(][^()]+[)])?')
>     return re.findall(p, text)
>
> def test(html):
>     print(html)
>     for hit in grab(html):
>         print(hit)
>
> if __name__ == '__main__':
>     test('>$foo()$foo(bar=3)$$$foo($)$foo($bar(v=0))etc')
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Flexible string representation, unicode, typography, ...

2012-08-25 Thread Frank Millman

On 25/08/2012 10:58, Mark Lawrence wrote:

On 25/08/2012 08:27, wxjmfa...@gmail.com wrote:


Unicode design: a flat table of code points, where all code
points are "equals".
As soon as one attempts to escape from this rule, one has to
"pay" for it.
The creator of this machinery (flexible string representation)
can not even benefit from it in his native language (I think
I'm correctly informed).

Hint: Google -> "Das grosse Eszett"

jmf



It's Saturday morning, I'm stone cold sober, had a good sleep and I'm
still baffled as to the point if any.  Could someone please enlightem me?



Here's what I think he is saying. I am posting this to test the water. I 
am also confused, and if I have got it wrong hopefully someone will 
correct me.


In python 3.3, unicode strings are now stored as follows -
  if all characters can be represented by 1 byte, the entire string is 
composed of 1-byte characters
  else if all characters can be represented by 1 or 2 bytes, the entire 
string is composed of 2-byte characters

  else the entire string is composed of 4-byte characters

There is an overhead in making this choice, to detect the lowest number 
of bytes required.
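
You can see the effect with sys.getsizeof() on 3.3 (the exact numbers include 
a fixed header and vary by build, but the per-character cost differs) -

import sys

print(sys.getsizeof('a' * 1000))           # 1 byte per character (latin-1 range)
print(sys.getsizeof('\u20ac' * 1000))      # 2 bytes per character (euro sign, in the BMP)
print(sys.getsizeof('\U0001f600' * 1000))  # 4 bytes per character (outside the BMP)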


jmfauth believes that this only benefits 'english-speaking' users, as 
the rest of the world will tend to have strings where at least one 
character requires 2 or 4 bytes. So they incur the overhead, without 
getting any benefit.


Therefore, I think he is saying that he would have preferred that python 
standardise on 4-byte characters, on the grounds that the saving in 
memory does not justify the performance overhead.


Frank Millman


--
http://mail.python.org/mailman/listinfo/python-list


Getting ipython notebook to plot inline

2012-10-09 Thread Frank Franklin
I've just managed to install ipython and get it to run by typing ipython 
notebook --pylab=inline

Now I'm getting the following error when I try to plot something in ipython 
notebook:
AttributeError: 'module' object has no attribute 'FigureCanvas'

I've tried using imports to make this work:
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(0, 5, 0.1);
y = np.sin(x)
plt.plot(x, y)

But for some reason I still get this error. Anybody else know what's going on 
here? All of the print statements I've done before have worked, and I actually 
got my plots to work when I didn't set --pylab=inline, though they came up in a 
separate window and I want them to stay in the ipython notebook.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Getting ipython notebook to plot inline [updated]

2012-10-11 Thread Frank Franklin
Ok, so just to add to this, there is no problem plotting when I used the 
following command in my terminal to start the notebook:
ipython notebook
The only problem is that this plots my figures outside of the notebook page, 
and I really want to get everything into the notebook, since that's the point 
of the install.

but for some reason when I add the --pylab=inline that all the tutorials 
mention I get the AttributeError I mentioned before. I'm starting to wonder if 
this is a problem with my machine setup, if I'm missing something else that 
ipython notebook needs to do inline plotting.

Any help on this would be appreciated -- at this point I'm banging my head 
against the wall and solution doesn't seem to have surfaced through googling.




On Tuesday, October 9, 2012 2:02:17 PM UTC-4, Frank Franklin wrote:
> I've just managed to install ipython and get it to run by typing ipython 
> notebook --pylab=inline
> 
> 
> 
> Now I'm getting the following error when I try to plot something in ipython 
> notebook:
> 
> AttributeError: 'module' object has no attribute 'FigureCanvas'
> 
> 
> 
> I've tried using imports to make this work:
> 
> import matplotlib.pyplot as plt
> 
> import numpy as np
> 
> x = np.arange(0, 5, 0.1);
> 
> y = np.sin(x)
> 
> plt.plot(x, y)
> 
> 
> 
> But for some reason I still get this error. Anybody else know what's going on 
> here? All of the print statements I've done before have worked, and I 
> actually got my plots to work when I didn't set --pylab=inline, though they 
> came up in a separate window and I want them to stay in the ipython notebook.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Organisation of python classes and their methods

2012-11-02 Thread Frank Millman

On 02/11/2012 08:16, Martin Hewitson wrote:

Dear list,

I'm relatively new to Python and have googled and googled but haven't found a 
reasonable answer to this question, so I thought I'd ask it here.

I'm beginning a large Python project which contains many packages, modules and 
classes. The organisation of those is clear to me.

Now, the classes can contain many methods (100s of data analysis methods) which operate 
on instances of the class they belong to. These methods can be long and complex. So if I 
put these methods all in the module file inside the class, the file will get insanely 
long. Reading on google, the answer is usually "refactor", but that really 
doesn't make sense here. It's just that the methods are many, and each method can be a 
long piece of code. So, is there a way to put these methods in their own files and have 
them 'included' in the class somehow? I read a little about mixins but all the solutions 
looked very hacky. Is there an official python way to do this? I don't like having source 
files with 100's of lines of code in, let alone 1000's.

Many thanks,

Martin



I have read the other responses, so I may get some flak for encouraging 
bad habits. Nevertheless, I did have a similar requirement, and I found 
a solution that worked for me.


My situation was not as extreme as yours. I had a class with a number of 
methods. Some of them were of an 'operational' nature - they represented 
the main functionality of the class, and could be called often. Some of 
them were of a 'setup' nature - they were optionally called when the 
object was instantiated, but would only be called once.


I found that when I wanted to focus on one set of methods, the other set 
'got in my way', and vice-versa. My solution was to take the 'setup' 
methods and put them in another file. This is how I did it.


BEFORE
==

main.py -

class MyClass:
    def setup1(self, ...):
        [...]
    def setup2(self, ...):
        [...]
    def func1(self, ...):
        [...]
    def func2(self, ...):
        [...]


AFTER
=

setup.py -

def setup1(self, ...):
    [...]
def setup2(self, ...):
    [...]

main.py -

import setup

class MyClass:
    setup1 = setup.setup1
    setup2 = setup.setup2
    def func1(self, ...):
        [...]
    def func2(self, ...):
        [...]


Hope this gives you some ideas.

Frank Millman


--
http://mail.python.org/mailman/listinfo/python-list


Re: MySQLdb compare lower

2012-12-13 Thread Frank Millman

On 14/12/2012 06:16, Chris Angelico wrote:


Yeah, it's one of the things that tripped me up when I did a
MySQL->PostgreSQL conversion earlier this year. The code was assuming
case insensitivity, and began failing on PG. Fortunately the simple
change of LIKE to ILIKE solved that.

I'd MUCH rather be explicit about wanting case insensitivity.


Just as a side-note, for those who may be interested -

PostgreSQL allows you to create an index using an expression.

Therefore you can say -

  CREATE INDEX ndx ON table_name (LOWER(col_name))

Then you can SELECT ... WHERE LOWER(col_name) = LOWER(%s), and it will 
use the index, so it is not necessary to coerce the data to lower case 
before storing.
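
For example, from Python it would look something like this (psycopg2, with 
hypothetical table and column names) -

cur.execute(
    "SELECT * FROM customers WHERE LOWER(surname) = LOWER(%s)",
    (name,),
)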


Frank Millman


--
http://mail.python.org/mailman/listinfo/python-list


py2exe is on Sourceforge list of top growth projects

2012-12-17 Thread Frank Millman

This is from Sourceforge's monthly update -



Top Growth Projects

We're always on the lookout for projects that might be doing interesting 
things, and a surge in downloads is one of many metrics that we look at 
to identify them. Here's the projects that had the greatest growth in 
the last month.


[...]

py2exe: A distutils extension to create standalone Windows programs from 
python scripts.




It is 19th on a list of 19, but still, it is nice to see. I wonder if 
there was any particular reason for that?


Frank Millman

--
http://mail.python.org/mailman/listinfo/python-list


Re: Need a specific sort of string modification. Can someone help?

2013-01-05 Thread Frank Millman

On 05/01/2013 10:35, Sia wrote:

I have strings such as:

tA.-2AG.-2AG,-2ag
or
.+3ACG.+5CAACG.+3ACG.+3ACG

The plus and minus signs are always followed by a number (say, i). I want 
python to find each single plus or minus, remove the sign, the number after it 
and remove i characters after that. So the two strings above become:

tA..,
and
...

How can I do that?
Thanks.



Here is a naive solution (I am sure there are more elegant ones) -

def strip(old_string):
    new_string = ''
    max = len(old_string)
    pos = 0
    while pos < max:
        char = old_string[pos]
        if char in ('-', '+'):
            num_pos = pos+1
            num_str = ''
            while old_string[num_pos].isdigit():
                num_str += old_string[num_pos]
                num_pos += 1
            pos = num_pos + int(num_str)
        else:
            new_string += old_string[pos]
            pos += 1
    return new_string

It caters for the possibility that the number following the +/- could be 
greater than 9 - I don't know if you need that.


It works with your examples, except that the second one returns '....', 
which I think is correct - there are 4 dots in the original string.
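
For comparison, here is a more compact sketch using the re module (only 
lightly tested against the two sample strings) -

import re

def strip2(old_string):
    out = []
    pos = 0
    for m in re.finditer(r'[+-](\d+)', old_string):
        if m.start() < pos:
            continue                            # this +/- lies inside a skipped region
        out.append(old_string[pos:m.start()])   # keep everything before the sign
        pos = m.end() + int(m.group(1))         # skip sign, number, and that many characters
    out.append(old_string[pos:])
    return ''.join(out)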


HTH

Frank Millman


--
http://mail.python.org/mailman/listinfo/python-list


Re: Else statement executing when it shouldnt

2013-01-25 Thread Frank Millman

On 23/01/2013 15:35, Jussi Piitulainen wrote:

Thomas Boell writes:


Using a keyword that has a well-understood meaning in just about
every other programming language on the planet *and even in
English*, redefining it to mean something completely different, and
then making the syntax look like the original, well-understood
meaning -- that's setting a trap out for users.

The feature isn't bad, it's just very, very badly named.


I believe it would read better - much better - if it was "for/then"
and "while/then" instead of "for/else" and "while/else".

I believe someone didn't want to introduce a new keyword for this,
hence "else".



There is a scenario, which I use from time to time, where 'else' makes 
perfect sense.


You want to loop through an iterable, looking for 'something'. If you 
find it, you want to do something and break. If you do not find it, you 
want to do something else.


for item in iterable:
    if item == 'something':
        do_something()
        break
else:  # item was not found
    do_something_else()

Not arguing for or against, just saying it is difficult to find one word 
which covers all requirements.


Frank Millman


--
http://mail.python.org/mailman/listinfo/python-list


Re: ??????????? DOES GOG EXIST

2013-01-26 Thread Frank Millman

On 26/01/2013 18:41, BV BV wrote:


DOES GOG EXIST


http://www.youtube.com/watch?v=tRMmTbCXXAk&feature=related



THANK YOU



Did you hear about the dyslexic agnostic insomniac?

He lies awake at night wondering if there is a dog.


--
http://mail.python.org/mailman/listinfo/python-list


Sorting a hierarchical table (SQL)

2013-01-30 Thread Frank Millman

Hi all

This is not really a python question, but I am hoping that some of you 
can offer some suggestions.


I have a SQL table with hierarchical data. There are two models that can 
be used to represent this - Adjacency Lists and Nested Sets. Here is a 
link to an article that discusses and compares the two approaches -


http://explainextended.com/2009/09/24/adjacency-list-vs-nested-sets-postgresql/

A feature of the Nested Sets model is that a SELECT returns the rows by 
following the links in the structure - root first, followed by its first 
child, followed by *its* first child, until the bottom is reached, then 
any siblings, then up one level to the next child, and so on, until the 
entire tree is exhausted.


I am looking for a way to emulate this behaviour using Adjacency Lists. 
It is not that easy.


The article above shows a way of doing this using an Array. 
Unfortunately that is a PostgreSQL feature not available in all 
databases, so I want to avoid that. Here is the best I have come up with.


For each row, I know the parent id, I know the level (depth in the 
tree), and I know the sequence number - every row has a sequence number 
that is unique within any group of siblings within the tree, always 
starting from zero.


I create a string to be used as a sort key, consisting of the parent's 
sort key, a comma, and the row's sequence number. So the root has a key 
of '0', the first child has '0,0', its first child has '0,0,0', etc.


If there could never be more than 10 siblings, that would work, but if 
it goes over 10, the next key would contain the substring '10', which 
sorts earlier than '2', which would be incorrect.


Therefore, on the assumption that there will never be more than 10000 
siblings, I zero-fill each element to a length of 4, so the first key is 
'0000', the next one is '0000,0000', then '0000,0000,0000', etc.
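
Expressed in Python (just to show the idea - the real thing is done in SQL), 
the key for a row is built like this -

def sort_key(parent_key, seq):
    # seq is the row's sequence number among its siblings, starting from zero
    elem = '{:04}'.format(seq)      # 0 -> '0000', 12 -> '0012'
    if parent_key is None:          # the root row
        return elem
    return parent_key + ',' + elem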


All this is done in SQL, as part of a complicated SELECT statement.

It works, and it would be unusual to have a tree with a depth of more 
than 4 or 5, so I can live with it.


However, it is not pretty. I wondered if anyone can suggest a more 
elegant solution.


Thanks

Frank Millman

--
http://mail.python.org/mailman/listinfo/python-list


Re: Distributing methods of a class across multiple files

2012-01-25 Thread Frank Millman

"lh"  wrote:
> Is this possible please?  I have done some searching but it is hard to
> narrow down Google searches to this question. What I would like to do
> is, for example:
> 1) define a class Foo in file test.py... give it some methods
> 2) define a file test2.py which contains a set of methods that are
> methods of class Foo defined in test.py.  I can import Foo obviously
> but it isn't clear to me how to identify the methods in test2.py to be
> methods of class Foo defined in test.py (normally I would just indent
> them "def"'s under the class but the class isn't textually in
> test2.py).
>
> In short I would like to distribute code for one class across multiple
> files so a given file doesn't get ridiculously long.
>

I take the point of the other responders that it is not a normal thing to 
do, but I had a few long but rarely used methods which I wanted to move out 
of the main file just to keep the main file tidier. I came up with this 
solution, and it seems to work.

In test2.py -

def long_method_1(self):
    pass

def long_method_2(self):
    pass

In test.py -

import test2

class Foo:
    long_method_1 = test2.long_method_1
    long_method_2 = test2.long_method_2

Then in Foo I can refer to self.long_method_1().

HTH

Frank Millman



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: constraint based killer sudoku solver performance improvements

2012-01-26 Thread Frank Millman

"Blockheads Oi Oi"  wrote:

>I have a working program based on [1] that sets up all different 
>constraints for each row, column and box and then sets exact sum 
>constraints for each cage.  It'll run in around 0.2 secs for a simple 
>problem, but a tough one takes 2 hours 45 minutes.  I did some research 
>into improving the performance and found [2] but can't work out how to 
>implement the constraints given.  Can someone please help, assuming that 
>it's even possible.
>
> [1] http://pypi.python.org/pypi/python-constraint/1.1
> [2] http://4c.ucc.ie/~hsimonis/sudoku.pdf

I don't have an answer, but are you aware of this -

http://www.ics.uci.edu/~eppstein/PADS/Sudoku.py

It is a sudoko solver written in pure python.

I don't know what you call a tough problem, but this one solves the hardest 
one I have thrown at it in the blink of an eye. It also outputs a full trace 
of  the reasoning it used to arrive at a solution.

Frank Millman



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python and TAP

2012-02-06 Thread Frank Becker
On 06.02.12 01:58, Matej Cepl wrote:

Hi,

> I have just finished listening to the FLOSS Weekly podcast #200
> (http://twit.tv/show/floss-weekly/200) on autotest, where I've learned
> about the existence of TAP (http://testanything.org/). A standardization
> of testing seems to be so obviously The Right Thing™, that it is strange
> that I don't see much related movement in the Python world (I know only
> about http://git.codesimply.com/?p=PyTAP.git;a=summary or
> git://git.codesimply.com/PyTAP.git, which seems to be very very simple
> and only producer).
> 
> What am I missing? Why nobody seems to care about joining TAP standard?
Not sure. Probably it comes down to what you need depending on your tool
chain. But there are alternatives. Most prominent to my knowledge is
subunit [0]. Here is a comparison between the two [1].

One warning when you jump on the TAP train: Using the Python YAML
module PyYAML you will have to find out that TAP uses a YAML subset
called YAMLish [3]. It's not the same and pretty much defined by the
Perl implementation.

[0] https://launchpad.net/subunit
[1]
http://www.kinoshita.eti.br/2011/06/04/a-comparison-of-tap-test-anything-protocol-and-subunit/
[2] http://pyyaml.org/
[3] http://testanything.org/wiki/index.php/YAMLish

Bye,

Frank


-- 
Frank Becker  (jabber|mail) | http://twitter.com/41i3n8
GnuPG: 0xADC29ECD | F01B 5E9C 1D09 981B 5B40 50D3 C80F 7459 ADC2 9ECD



signature.asc
Description: OpenPGP digital signature
-- 
http://mail.python.org/mailman/listinfo/python-list


Question about circular imports

2012-02-26 Thread Frank Millman
Hi all

I seem to have a recurring battle with circular imports, and I am trying to 
nail it once and for all.

Let me say at the outset that I don't think I can get rid of circular 
imports altogether. It is not uncommon for me to find that a method in 
Module A needs to access something in Module B, and a method in Module B 
needs to access something in Module A. I know that the standard advice is to 
reorganise the code to avoid this, and I try to do this where possible, but 
for now I would like to address the question of how to handle the situation 
if this is otherwise unavoidable.

The problem is clearly explained in the Python Programming FAQ -

"Circular imports are fine where both modules use the "import " form 
of import. They fail when the 2nd module wants to grab a name out of the 
first ("from module import name") and the import is at the top level. That's 
because names in the 1st are not yet available, because the first module is 
busy importing the 2nd."

Having recently reorganised my code into packages, I find that the same 
problem arises with packages. Assume the following structure, copied from 
the Tutorial -

sound/
__init__.py
formats/
__init__.py
wavread.py
wavwrite.py

The following fails -

in wavread.py -
from formats import wavwrite [this works]

in wavwrite.py -
from formats import wavread [this fails with ImportError]

I can think of two solutions - one is cumbersome, the other may not be good 
practice.

The first solution is -

in wavread.py -
import formats.wavwrite

in wavwrite.py -
import formats.wavread

I then have to use the full path to reference any attribute inside the 
imported module, which I find cumbersome.

The second solution is -

in formats/__init__.py
import sys
sys.path.insert(0, __path__[0])

in wavread.py -
import wavwrite

in wavwrite.py -
import wavread

This works, but I don't know if it is a good idea to add all the sub-package 
paths to sys.path. I realise that it is up to me to avoid any name clashes. 
Are there any other downsides?

So I guess my question is -

- is there a better solution to my problem?
- if not, is my second solution acceptable?

If not, I seem to be stuck with using full path names to reference any 
attributes in imported modules.

I am using Python3 exclusively now, if that makes any difference.

Any advice will be appreciated.

Frank Millman



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Question about circular imports

2012-02-26 Thread Frank Millman

"Frank Millman"  wrote in message 
news:jid2a9$n21$1...@dough.gmane.org...
> Hi all
>
> I seem to have a recurring battle with circular imports, and I am trying 
> to nail it once and for all.
>
[...]
>
> The second solution is -
>
> in formats/__init__.py
>import sys
>sys.path.insert(0, __path__[0])
>
> in wavread.py -
>import wavwrite
>
> in wavwrite.py -
>import wavread
>
> This works, but I don't know if it is a good idea to add all the 
> sub-package paths to sys.path. I realise that it is up to me to avoid any 
> name clashes. Are there any other downsides?
>

Answering my own question, I have just found a downside that is a 
showstopper.

If a module in a different sub-package needs to import one of these modules, 
it must use a full path. This results in a new entry in sys.modules, and 
therefore any attributes referenced by the intra-package module have 
different identities from those referenced from outside. If they are static, 
there is no problem, but if not, disaster strikes!

Frank



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Question about circular imports

2012-02-26 Thread Frank Millman

"Peter Otten" <__pete...@web.de> wrote in message 
news:jid424$vfp$1...@dough.gmane.org...
> Frank Millman wrote:
>
>
> To cut a long story short, why should circular imports be unavoidable?
>
> Paths into packages are recipe for desaster. You may end up with multiple
> instances of the same module and your programs will break in "interesting"
> (hard to debug) ways.
>

Thanks, Peter. I have just figured this out for myself, but you beat me to 
it.

Full paths it is, then.

Frank



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Question about circular imports

2012-02-26 Thread Frank Millman
>
> To avoid the tedious reference, follow this with
> read = sound.formats.wavread # choose the identifier you prefer
>

@Terry and OKB

I tried that, but it does not work.

a.py
/b
    __init__.py
    c.py
    d.py

a.py -
from b import c
c.py -
import b.d
d.py -
import b.c

If I run a.py, it returns with no error.

c.py -
import b.d
d = b.d
d.py -
import b.c
c = b.c

If I run a.py, I get

Traceback (most recent call last):
  File "F:\tests\a.py", line 1, in <module>
    from b import c
  File "F:\tests\b\c.py", line 1, in <module>
    import b.d
  File "F:\tests\b\d.py", line 2, in <module>
    c = b.c
AttributeError: 'module' object has no attribute 'c'

I get the same if I try 'import b.c as c'.

Frank



-- 
http://mail.python.org/mailman/listinfo/python-list


Question about sub-packages

2012-02-27 Thread Frank Millman
Hi all

This is a follow-up to my recent question about circular imports, but on a 
different subject, hence the new thread.

My application has grown to the point that it makes sense to split it up 
into sub-packages.

From a certain point of view, each package can be said to have an API, not 
just for third-party users of the application, but for other sub-packages 
within the application. In other words, there are a number of functions that 
can be called and a number of objects that can be instantiated from outside 
the sub-package.

It struck me that, even though I can publish the API, it still requires 
external users to know enough of the internals of the package to know which 
modules to import and which objects to reference. This has two 
disadvantages - it makes it more difficult to understand the API, and it 
makes it more difficult for me to restructure the package internally.

An alternative is to have a dedicated API within the sub-package, in the 
form of one-line functions that are called externally, and then perform 
whatever action is required internally and return results as appropriate. 
This is easier for users of the sub-package, and allows me to restructure 
the internals of the package without causing problems.

If this makes sense, my next thought was, where is the best place to put 
this API. Then I thought, why not put it in the __init__.py of the 
sub-package? Then all that the users of the package have to do is import the 
package, and then place calls on it directly.
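
Something like this, with made-up names -

# accounts/__init__.py  (a hypothetical sub-package)
from . import ledger

def post_entry(batch):
    # one-line API function: callers just 'import accounts' and call accounts.post_entry(...)
    return ledger.post_entry(batch)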

I did a quick test and it seems to work. Is this a good idea, or are there 
any downsides?

Thanks

Frank Millman



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Question about sub-packages

2012-02-28 Thread Frank Millman

"Frank Millman"  wrote in message 
news:jii0vo$36t$1...@dough.gmane.org...
> Hi all
>
> This is a follow-up to my recent question about circular imports, but on a 
> different subject, hence the new thread.
>
[...]
>
> If this makes sense, my next thought was, where is the best place to put 
> this API. Then I thought, why not put it in the __init__.py of the 
> sub-package? Then all that the users of the package have to do is import 
> the package, and then place calls on it directly.
>
> I did a quick test and it seems to work. Is this a good idea, or are there 
> any downsides?
>

Answering my own question again ...

The one-liner API concept *may* be a good idea - still waiting for some 
feedback on that.

But putting them into __init__.py is not a good idea, as I run into some 
subtle 'circular import' problems again. I don't fully understand the 
conditions under which it fails, but that is unimportant, as my objective is 
to avoid circular imports altogether.

I have created a module called 'api.py' and put them in there, and that 
seems to work (for now...).

Frank



-- 
http://mail.python.org/mailman/listinfo/python-list


Trying to understand 'import' a bit better

2012-03-04 Thread Frank Millman
Hi all

I have been using 'import' for ages without particularly thinking about it - 
it just works.

Now I am having to think about it a bit harder, and I realise it is a bit 
more complicated than I had realised - not *that* complicated, but there are 
some subtleties.

I don't know the correct terminology, but I want to distinguish between the 
following two scenarios -

1. A python 'program', that is self contained, has some kind of startup, 
invokes certain functionality, and then closes.

2. A python 'library', that exposes functionality to other python programs, 
but relies on the other program to invoke its functionality.

The first scenario has the following characteristics -
  - it can consist of a single script or a number of modules
  - if the latter, the modules can all be in the same directory, or in one 
or more sub-directories
  - if they are in sub-directories, the sub-directory must contain 
__init__.py, and is referred to as a sub-package
  - the startup script will normally be in the top directory, and will be 
executed directly by the user

When python executes a script, it automatically places the directory 
containing the script into 'sys.path'. Therefore the script can import a 
top-level module using 'import <module>', and a sub-package module using 
'import <sub-package>.<module>'.

The second scenario has similar characteristics, except it will not have a 
startup script. In order for a python program to make use of the library, it 
has to import it. In order for python to find it, the directory containing 
it has to be in sys.path. In order for python to recognise the directory as 
a valid container, it has to contain __init__.py, and is referred to as a 
package.

To access a module of the package, the python program must use 'import 
<package>.<module>' (or 'from <package> import <module>'), and to access a 
sub-package module it must use 'import <package>.<sub-package>.<module>'.

So far so uncontroversial (I hope).

The subtlety arises when the package wants to access its own modules. 
Instead of using 'import <module>' it must use 'import <package>.<module>'. 
This is because the directory containing the package is in sys.path, but the 
package itself is not. It is possible to insert the package directory name 
into sys.path as well, but as was pointed out recently, this is dangerous, 
because you can end up with the same module imported twice under different 
names, with potentially disastrous consequences.

Therefore, as I see it, if you are developing a project using scenario 1 
above, and then want to change it to scenario 2, you have to go through the 
entire project and change all import references by prepending the package 
name.
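
For example (module names invented), every import has to change like this -

# scenario 1 - run as a self-contained program
import utils
from reports import listing

# scenario 2 - the same code shipped as package 'mylib'
import mylib.utils
from mylib.reports import listing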

Have I got this right?

Frank Millman



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Reading Live Output from a Subprocess

2012-04-06 Thread Frank Millman

"Dubslow"  wrote:

> It's just a short test script written in python, so I have no idea how to 
> even control the buffering (and even if I did, I still can't modify the 
> subprocess I need to use in my script). What confuses me then is why Perl 
> is able to get around this just fine without faking a terminal or similar 
> stuff. (And also, this needs to work in Windows as well.) For the record, 
> here's the test script:
> ##
> #!/usr/bin/python
>
> import time, sys
> try:
> total = int(sys.argv[1])
> except IndexError:
> total = 10
>
> for i in range(total):
> print('This is iteration', i)
> time.sleep(1)
>
> print('Done. Exiting!')
> sys.exit(0)
> ##
>

I am probably missing something, but this works for me -

sub_proc1.py
--
from time import sleep
for i in range(5):
    print(i)
    sleep(1)

sub_proc2.py
--
import subprocess as sub
proc = sub.Popen(["python", "sub_proc1.py"])
x, y = proc.communicate()

Running sub_proc1 gives the obvious output - the digits 0 to 4 displayed 
with delays of 1 second.

Running sub_proc2 gives exactly the same output.

This is using python 3.2.2 on Windows Server 2003.

Frank Millman



-- 
http://mail.python.org/mailman/listinfo/python-list


Difference between 'imp' and 'importlib'

2012-04-20 Thread Frank Millman
Hi all

I need the ability to execute a function by parsing a string containing the 
full path to the function. The string is multi-dotted. The last element is 
the function name, the second-last is the name of the module containing the 
function, and the balance is the path to the module.

I have been using 'import imp', and then 'imp.find_module' and 
'imp.load_module'. It works, but I noticed that it would reload the module 
if it was already loaded.

As an alternative, I tried 'import importlib', and then 
'importlib.import_module'. This also works, and does not reload the module.
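
For reference, the importlib version looks roughly like this (the dotted path 
shown is just an example) -

import importlib

def resolve(dotted_path):
    # 'app.reports.listing.run_report' -> the run_report function object
    module_path, func_name = dotted_path.rsplit('.', 1)
    module = importlib.import_module(module_path)   # cached in sys.modules, so no reload
    return getattr(module, func_name)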

So my question is, is there any practical difference between the two 
approaches? What about 'There should be one-- and preferably only 
one --obvious way to do it'?

Frank Millman



-- 
http://mail.python.org/mailman/listinfo/python-list


Strange __import__() behavior

2012-04-25 Thread Frank Miles
I have an exceedingly simple function that does a "named import".
It works perfectly for one file "r"- and fails for the second "x".

If I reverse the order of being called, it is still "x" that fails,
and "r" still succeeds.

os.access() always reports that the file is readable (i.e. "true")

If I simply call up the python interpreter (python 2.6 - Debian stable)
and manually "import x" - there is no problem - both work.  Similarly
typing the __import__("x") works when typed directly at the python
prompt.  Both 'x' and 'r' pass pychecker with no errors.  The same error
occurs with winpdb - the exception message says that the file could
not be found.  'file' reports that both files are text files, and there
aren't any strange file-access permissions/attributes.

Here's the function that is failing:

def named_import(fname, description):
    import os
    pname = fname + '.py'
    print "ENTRY FILE", pname, ": acces=", os.access(pname, os.R_OK)
    try:
        X = __import__(fname)
        x = [X.cmnds, X.variables]
    except ImportError:
        print "failed"
    return x

This is the first time I've needed to import a file whose name couldn't 
be specified in the script, so there's a chance that I've done something
wrong, but it seems very weird that it works in the CL interpreter and
not in my script.

TIA for any hints or pointers to the relevant overlooked documentation!

  -F


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Strange __import__() behavior

2012-04-26 Thread Frank Miles
On Wed, 25 Apr 2012 23:03:36 +0200, Kiuhnm wrote:

> On 4/25/2012 22:05, Frank Miles wrote:
>> I have an exceedingly simple function that does a "named import". It
>> works perfectly for one file "r"- and fails for the second "x".
>>
>> If I reverse the order of being called, it is still "x" that fails, and
>> "r" still succeeds.
>>
>> os.access() always reports that the file is readable (i.e. "true")
>>
>> If I simply call up the python interpreter (python 2.6 - Debian stable)
>> and manually "import x" - there is no problem - both work.  Similarly
>> typing the __import__("x") works when typed directly at the python
>> prompt.  Both 'x' and 'r' pass pychecker with no errors.  The same
>> error occurs with winpdb - the exception message says that the file
>> could not be found.  'file' reports that both files are text files, and
>> there aren't any strange file-access permissions/attributes.
>>
>> Here's the function that is failing:
>>
>> def named_import(fname, description) :
>>  import  os
>>  pname= fname + '.py'
>>  print "ENTRY FILE", pname, ": acces=", os.access(pname, os.R_OK)
>>  try :
>>  X=__import__(fname)
>>  x= [ X.cmnds, X.variables ]
>>  except ImportError :
>>  print "failed"
>>  return x
>>
>> This is the first time I've needed to import a file whose name couldn't
>> be specified in the script, so there's a chance that I've done
>> something wrong, but it seems very weird that it works in the CL
>> interpreter and not in my script.
>>
>> TIA for any hints or pointers to the relevant overlooked documentation!
> 
> I can't reproduce your problem on my configuration. Anyway, you should
> note that if x.pyc and r.pyc are present, __import__ will try to import
> them and not the files x.py and r.py. Try deleting x.pyc and r.pyc.
> 
> Kiuhnm

You are fast in replying!  I nuked my query (within a few minutes of 
posting) when I discovered the reason.  Perhaps it persisted in some
domain.

I'd forgotten that the python script containing the described function was
not the file itself, but a link to the script.  When I executed the script
(er, link to the script) - even with winpdb - apparently __import__ 
examined the directory where the actual file resided.  In _that_ 
directory, only 'r' existed, no 'x'.

So thanks for trying, there was no way you (or anyone) could have seen 
that the script was just a link to a script...

   -F
-- 
http://mail.python.org/mailman/listinfo/python-list


Minor issue with sqlite3 and datetime

2012-04-29 Thread Frank Millman
Hi all

I could not find a mailing list for sqlite3 - hope it is ok to post here.

My problem actually originates with a different problem relating to MS Sql 
Server. Python's datetime.datetime object uses a precision of microseconds. 
Sql Server's DATETIME type only uses a precision of milliseconds. When I try 
to insert a datetime object into a DATETIME column, the microsecond portion 
is rejected, but the rest succeeds, so the object is stored to the nearest 
second.

My current workaround is, instead of inserting the datetime object directly, 
I use the following -

  dtm = dtm.isoformat(sep=' ')[:23]

This converts it to a string, and strips off the last three digits of the 
microseconds, converting them to milliseconds. It is not elegant, but it 
works.

I also use PostgreSQL and sqlite3. I don't want to code different routines 
for each one, so I use the same workaround for all three. It works with 
PostgreSQL, but not with sqlite3. Here is why.

sqlite3/dbapi2.py contains the following -

def convert_timestamp(val):
    [...]
    if len(timepart_full) == 2:
        microseconds = int(timepart_full[1])
    else:
        microseconds = 0

It assumes that 'timepart_full[1]' is a string containing 6 digits. After my 
workaround, it only contains 3 digits, so it gives the wrong result.

I think that it should right-pad the string with zeroes to bring it up to 6 
digits before converting to an int, like this -

microseconds = int('{:0<6}'.format(timepart_full[1]))
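
For example, with a fractional part that my workaround has already truncated 
to milliseconds -

>>> int('123')                       # treated as microseconds: wrong, 123 usec
123
>>> int('{:0<6}'.format('123'))      # right-padded to 6 digits: 123000 usec = 123 msec
123000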

Any chance of this being accepted?

Frank Millman



-- 
http://mail.python.org/mailman/listinfo/python-list


Some posts do not show up in Google Groups

2012-04-29 Thread Frank Millman
Hi all

For a while now I have been using Google Groups to read this group, but on the 
odd occasion when I want to post a message, I use Outlook Express, as I know 
that some people reject all messages from Google Groups due to the high spam 
ratio (which seems to have improved recently, BTW).

From time to time I see a thread where the original post is missing, but the 
follow-ups do appear. My own posts have shown up with no problem.

Now, in the last month, I have posted two messages using Outlook Express, and 
neither of them have shown up in Google Groups. I can see replies in OE, so 
they are being accepted. I send to the group gmane.comp.python.general.

Does anyone know a reason for this, or have a solution?

Frank Millman
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Some posts do not show up in Google Groups

2012-04-30 Thread Frank Millman
On Apr 30, 8:20 am, Frank Millman  wrote:
> Hi all
>
> For a while now I have been using Google Groups to read this group, but on 
> the odd occasion when I want to post a message, I use Outlook Express, as I 
> know that some people reject all messages from Google Groups due to the high 
> spam ratio (which seems to have improved recently, BTW).
>
> From time to time I see a thread where the original post is missing, but the 
> follow-ups do appear. My own posts have shown up with no problem.
>
> Now, in the last month, I have posted two messages using Outlook Express, and 
> neither of them have shown up in Google Groups. I can see replies in OE, so 
> they are being accepted. I send to the group gmane.comp.python.general.
>
> Does anyone know a reason for this, or have a solution?
>
> Frank Millman

Thanks for the replies. I am also coming to the conclusion that Google
Groups is no longer fit-for-purpose.

Ironically, here are two replies that I can see in Outlook Express,
but do not appear in Google Groups.

Reply from Benjamin Kaplan -
> I believe the mail-to-news gateway has trouble with HTML messages. Try 
> sending everything as plain text and see if that works.

I checked, and all my posts were sent in plain text.

Reply from Terry Reedy -
> Read and post through news.gmane.org

I have had a look at this before, but there is one thing that Google
Groups does that no other reader seems to do, and that is that
messages are sorted according to thread-activity, not original posting
date. This makes it easy to see what has changed since the last time I
checked.

All the other ones I have looked at - Outlook Express, Thunderbird,
and gmane.org, sort by original posting date, so I have to go
backwards to see if any threads have had any new postings.

Maybe there is a setting that I am not aware of. Can anyone enlighten
me?

Thanks

Frank
-- 
http://mail.python.org/mailman/listinfo/python-list


OT, but very funny

2011-06-29 Thread Frank Millman

Hope you find the following as amusing as I did -

http://www.davidnaylor.co.uk/eu-cookies-directive-interactive-guide-to-25th-may-and-what-it-means-for-you.html

Background -

On 26 May new legislation came into force in the UK regulating how web sites 
use cookies. It is the outcome of amendments to the EU Privacy and 
Electronic Communications Directive, and is designed to help protect 
individual privacy.


The key effect of the legislation is that web site operators need user 
consent to store cookies on their devices. Where previously web sites had to 
offer users the facility to opt out of cookie use, now web site operators 
need to ask for permission before they implement any cookies. So the opt out 
system has been replaced with an opt in system.


Enjoy

Frank Millman


--
http://mail.python.org/mailman/listinfo/python-list


test

2011-06-30 Thread Frank Müller
something to test?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: subprocess & isql

2011-07-15 Thread Frank Millman


"peterff66"  wrote in message 
news:ivo9or+j...@egroups.com...

Hello Python community,

I am working on a project that retrieves data from remote Sybase. I have 
to use isql and subprocess to do this. Is the following correct?


1. call subprocess.popn to run isql, connect to Sybase
2. run sql ("select ...from ...")
3. write retrieved data to subprocess.pipe
4. retrieve data from pipe

Did anybody do similar work before? Any ideas (or code piece) to share?


I did something vaguely similar a while ago. I don't know how helpful this 
will be, but it may give you some ideas.


Firstly, these are the main differences between your scenario and mine -

1. I was using 'osql' to connect to MS SQL Server.
2. I used os.popen4, not subprocess. Hopefully you can follow the guidelines 
in the manual to translate to subprocess.
3. In my case, I built up a long command to create and populate various 
database tables using StringIO, wrote the string to the pipe's stdin, and then 
read the pipe's stdout and displayed it on the screen to see any output.
4. On looking at my code, I see that I started a sub-thread to read the 
pipe's stdout. I can't remember why I did that, but it may have been 
something to do with getting the output in realtime instead of waiting for 
the command to finish executing.


Here is the code that I used -

    import os
    import threading
    from cStringIO import StringIO

    def read_stdout(stdout):
        while True:
            line = stdout.readline()
            if not line:
                break
            print line

    os.environ['OSQLPASSWORD'] = pwd
    sql_stdin, sql_stdout = os.popen4('osql -U %s -d %s -n' % (user, database))

    s = StringIO()
    [call function to build up commands]

    threading.Thread(target=read_stdout, args=(sql_stdout,)).start()

    s.seek(0)
    sql_stdin.writelines(s.readlines())
    s.close()
    sql_stdin.close()

HTH

Frank Millman


--
http://mail.python.org/mailman/listinfo/python-list


Convert '165.0' to int

2011-07-21 Thread Frank Millman

Hi all

I want to convert '165.0' to an integer.

The obvious method does not work -

>>> x = '165.0'
>>> int(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '165.0'

If I convert to a float first, it does work -

>>> int(float(x))
165




Is there a short cut, or must I do this every time (I have lots of them!) ? 
I know I can write a function to do this, but is there anything built-in?


Thanks

Frank Millman


--
http://mail.python.org/mailman/listinfo/python-list


Re: Convert '165.0' to int

2011-07-21 Thread Frank Millman
On Jul 21, 11:47 am, Leo Jay  wrote:
> On Thu, Jul 21, 2011 at 5:31 PM, Frank Millman  wrote:
>
> > Hi all
>
> > I want to convert '165.0' to an integer.
>
> > The obvious method does not work -
>
> >>>> x = '165.0'
> >>>> int(x)
>
> > Traceback (most recent call last):
> >  File "<stdin>", line 1, in <module>
> > ValueError: invalid literal for int() with base 10: '165.0'
>
> > If I convert to a float first, it does work -
>
> >>>> int(float(x))
>
> > 165
>
> > Is there a short cut, or must I do this every time (I have lots of them!) ? 
> > I know I can write a function to do this, but is there anything built-in?
>
> > Thanks
>
> > Frank Millman
>
> How about int(x[:-2])?
>
> --
> Best Regards,
> Leo Jay- Hide quoted text -
>
> - Show quoted text -

Nice idea, but it seems to be marginally slower[1] than int(float(x)),
so I think I will stick with the latter.

Frank

[1] See separate thread on apparent inconsisteny in timeit timings.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Convert '165.0' to int

2011-07-21 Thread Frank Millman
On Jul 21, 11:53 am, Thomas Jollans  wrote:
> On 21/07/11 11:31, Frank Millman wrote:
>
> > Hi all
>
> > I want to convert '165.0' to an integer.
>
> Well, it's not an integer. What does your data look like? How do you
> wish to convert it to int? Do they all represent decimal numbers? If so,
> how do you want to round them? What if you get '165.xyz' as input?
> Should that raise an exception? Should it evaluate to 165? Should it use
> base 36?
>
> > If I convert to a float first, it does work -
>
> >>>> int(float(x))
> > 165
>
> > Is there a short cut, or must I do this every time (I have lots of
> > them!) ? I know I can write a function to do this, but is there anything
> > built-in?
>
> What's wrong with this? It's relatively concise, and it shows exactly
> what you're trying to do.

I am processing an xml file from a third party, and it is full of co-
ordinates in the form 'x="165.0" y="229.0"'.
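
Roughly speaking, the processing looks something like this (the file and
element names here are invented, just to illustrate) -

    import xml.etree.ElementTree as etree

    tree = etree.parse('layout.xml')
    for elem in tree.iter('shape'):
        x = int(float(elem.get('x')))
        y = int(float(elem.get('y')))
        # ... do something with x and y ...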

I don't mind using int(float(x)), I just wondered if there was a
shorter alternative.

If there is an alternative, I will be happy for it to raise an
exception if the fractional part is not 0.

Thanks

Frank
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Convert '165.0' to int

2011-07-21 Thread Frank Millman
>
> [1] See separate thread on apparent inconsistency in timeit timings.
>

I must have done something wrong - it is consistent now.

Here are the results -

C:\Python32\Lib>timeit.py "int(float('165.0'))"
10 loops, best of 3: 3.51 usec per loop

C:\Python32\Lib>timeit.py "int('165.0'[:-2])"
10 loops, best of 3: 4.63 usec per loop

Frank
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Convert '165.0' to int

2011-07-21 Thread Frank Millman
On Jul 21, 10:00 pm, Terry Reedy  wrote:
> On 7/21/2011 10:13 AM, Grant Edwards wrote:
>
> > On 2011-07-21, Web Dreamer  wrote:
> >> Leo Jay wrote on Thursday 21 July 2011 at 11:47 in
>
> >> int(x.split('.')[0])
>
> >> But, the problem is the same as with int(float(x)), the integer number is
> >> still not as close as possible as the original float value.
>
> > Nobody said that "close as possible to the original float value" was
> > the goal.  Perhaps the OP just wants it truncated.
>
> The OP did not specify the domain of possible inputs nor the desired
> output for all possible inputs. Without that, function design is
> guessing. The appropriate response to the original post would have been
> a request for clarification.
>
> If the domain is strings with and int followed by '.0', then chopping
> off two chars is sufficient. This was sort of implied by the original
> post, since it was the only example, and assumed by the respondant.
>
> If the domain is int literals followed by '.' and some number of zeroes,
> then split works. So does int(float(s)). Split also works for non-digits
> following '.' whereas int(float(s)) does not.
>
> If the domain is all float literals, then ??.
>
> --
> Terry Jan Reedy

As the OP, I will clarify what *my* requirement is. This discussion
has gone off at various tangents beyond what I was asking for.

As suggested above, I am only talking about a string containing int
literals followed by '.' followed by zero or more zeros.

I think that there is a case for arguing that this is a valid
representation of an integer. It would therefore not be unreasonable
for python to accept int('165.0') and return 165. I would expect it to
raise an exception if there were any non-zero digits after the point.
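
To make that concrete, the behaviour I have in mind would be something like
this (illustrative only - this is not what python does today) -

    int('165')     ->  165
    int('165.')    ->  165
    int('165.00')  ->  165
    int('165.05')  ->  raises ValueError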

However, the fact is that python does not accept this, and I am not
asking for a change.

int(float(x)) does the job, and I am happy with that. I was just
asking if there were any alternatives.

Frank
-- 
http://mail.python.org/mailman/listinfo/python-list


Question about timeit

2011-07-21 Thread Frank Millman

Hi all

I mentioned in a recent post that I noticed an inconsistency in timeit, and 
then reported that I must have made a mistake.


I have now identified my problem, but I don't understand it.

C:\Python32\Lib>timeit.py "int(float('165.0'))"
10 loops, best of 3: 3.52 usec per loop

C:\Python32\Lib>timeit.py "int(float('165.0'))"
10 loops, best of 3: 3.51 usec per loop

C:\Python32\Lib>timeit.py 'int(float("165.0"))'
1000 loops, best of 3: 0.0888 usec per loop

C:\Python32\Lib>timeit.py 'int(float("165.0"))'
1000 loops, best of 3: 0.0887 usec per loop

I ran them both twice just to be sure.

The first two use double-quote marks to surround the statement, and 
single-quote marks to surround the literal inside the statement.


The second two swap the quote marks around.

Can someone explain the difference?

I am using python 3.2 on Windows Server 2003.

Thanks

Frank Millman


--
http://mail.python.org/mailman/listinfo/python-list


Re: Question about timeit

2011-07-22 Thread Frank Millman
On Jul 22, 8:37 am, Stefan Behnel  wrote:
> Frank Millman, 22.07.2011 08:06:
>
>
>
>
>
> > I mentioned in a recent post that I noticed an inconsistency in timeit, and
> > then reported that I must have made a mistake.
>
> > I have now identified my problem, but I don't understand it.
>
> > C:\Python32\Lib>timeit.py "int(float('165.0'))"
> > 10 loops, best of 3: 3.52 usec per loop
>
> > C:\Python32\Lib>timeit.py "int(float('165.0'))"
> > 10 loops, best of 3: 3.51 usec per loop
>
> > C:\Python32\Lib>timeit.py 'int(float("165.0"))'
> > 1000 loops, best of 3: 0.0888 usec per loop
>
> > C:\Python32\Lib>timeit.py 'int(float("165.0"))'
> > 1000 loops, best of 3: 0.0887 usec per loop
>
> > I ran them both twice just to be sure.
>
> > The first two use double-quote marks to surround the statement, and
> > single-quote marks to surround the literal inside the statement.
>
> > The second two swap the quote marks around.
>
> > Can someone explain the difference?
>
> > I am using python 3.2 on Windows Server 2003.
>
> As expected, I can't reproduce this (on Linux). Maybe your processor
> switched from power save mode to performance mode right after running the
> test a second time? Or maybe you need a better console application that
> handles quotes in a more obvious way?
>
> Note that it's common to run timeit like this: "python -m timeit".
>
> Stefan

I tried "python -m timeit", and got exactly the same result as before.

I am using a desktop, not a laptop, so there is no power-saving mode
going on.

I am using the standard Windows 'Command Prompt' console to run this.

I tried it with python 2.6, and still get the same result.

My guess is that it is something to do with the console, but I don't
know what. If I get time over the weekend I will try to get to the
bottom of it.

Frank
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Question about timeit

2011-07-22 Thread Frank Millman
On Jul 22, 10:34 am, Stefan Behnel  wrote:
> Thomas Rachel, 22.07.2011 10:08:
>
>
>
>
>
> > Am 22.07.2011 08:59 schrieb Frank Millman:
>
> >> My guess is that it is something to do with the console, but I don't
> >> know what. If I get time over the weekend I will try to get to the
> >> bottom of it.
>
> > I would guess that in the first case, python (resp. timeit.py) gets the
> > intended code for execution: int(float('165.0')). I. e., give the string to
> > float() and its result to int().
>
> > In the second case, however, timeit.py gets the string
> > 'int(float("165.0"))' and evaluates it - which is a matter of
> > sub-microseconds.
>
> > The reason for this is that the Windows "shell" removes the "" in the first
> > case, but not the '' in the second case.
>
> Good call. Or maybe it actually gets the code 'int(float(165.0))' in the
> second case, so it doesn't need to parse the string into a float. But given
> the huge difference in the timings, I would second your guess that it just
> evaluates the plain string itself instead of the code.
>
> Stefan

This is what I get after modifying timeit.py as follows -

    if args is None:
        args = sys.argv[1:]
+       print(args)

C:\>python -m timeit int(float('165.0'))
["int(float('165.0'))"]
10 loops, best of 3: 3.43 usec per loop

C:\>python -m timeit int(float("165.0"))
['int(float(165.0))']
100 loops, best of 3: 1.97 usec per loop

C:\>python -m timeit "int(float('165.0'))"
["int(float('165.0'))"]
10 loops, best of 3: 3.45 usec per loop

It seems that the lesson is -

1. Use double-quotes around the command itself - may not be necessary
if the command does not contain spaces.
2. Use single-quotes for any literals in the command.
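
As an aside, the quoting issue can be sidestepped altogether by calling
timeit from a small script instead of the command line - a minimal sketch
(the numbers will obviously differ per machine) -

    import timeit
    print(timeit.timeit("int(float('165.0'))", number=100000))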

Frank
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Question about timeit

2011-07-22 Thread Frank Millman
On Jul 22, 2:43 pm, Thomas Jollans  wrote:
> On 22/07/11 14:30, Frank Millman wrote:
>
>
>
>
>
> > This is what I get after modifying timeit.py as follows -
>
> >     if args is None:
> >         args = sys.argv[1:]
> > +       print(args)
>
> > C:\>python -m timeit int(float('165.0'))
> > ["int(float('165.0'))"]
> > 10 loops, best of 3: 3.43 usec per loop
>
> > C:\>python -m timeit int(float("165.0"))
> > ['int(float(165.0))']
> > 100 loops, best of 3: 1.97 usec per loop
>
> > C:\>python -m timeit "int(float('165.0'))"
> > ["int(float('165.0'))"]
> > 10 loops, best of 3: 3.45 usec per loop
>
> > It seems that the lesson is -
>
> > 1. Use double-quotes around the command itself - may not be necessary
> > if the command does not contain spaces.
> > 2. Use single-quotes for any literals in the command.
>
> What about 'int(float("165.0"))' (single quotes around the argument)?
> Does that pass the single quotes around the argument to Python? Or does
> it eliminate all quotes?

Here is the result -

C:\>python -m timeit 'int(float("165.0"))'
["'int(float(165.0))'"]
1000 loops, best of 3: 0.0891 usec per loop

Frank
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Convert '165.0' to int

2011-07-22 Thread Frank Millman
On Jul 22, 9:59 pm, Terry Reedy  wrote:
> On 7/22/2011 1:55 AM, Frank Millman wrote:
>
> > As the OP, I will clarify what *my* requirement is. This discussion
> > has gone off at various tangents beyond what I was asking for.
>
> Typical. Don't worry about it ;-).
>
> > As suggested above, I am only talking about a string containing int
> > literals followed by '.' followed by zero or more zeros.
> > int(float(x)) does the job,
>
> Not given that specification.
>
>  >>> s='123456789012345678901.0'
>  >>> int(float(s))
> 123456789012345683968
>
> > and I am happy with that.
>
> You should only be if you add 'with fewer than 18 digits' after 'int
> literals' to your spec.
>
> > I was just asking if there were any alternatives.
>
>  >>> int(s.split('.')[0])
> 123456789012345678901
>

The problem with that is that it will silently ignore any non-zero
digits after the point. Of course int(float(x)) does the same, which I
had overlooked.
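
For example -

    >>> int('165.9'.split('.')[0])
    165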

I do not expect any non-zero digits after the point, but if there are,
I would want to be warned, as I should probably be treating it as a
float, not an int.

To recap, the original problem is that it would appear that some third-
party systems, when serialising int's into a string format, add a .0
to the end of the string. I am trying to get back to the original int
safely.

The ideal solution is the one I sketched out earlier - modify python's
'int' function to accept strings such as '165.0'.

Do you think this would get any traction if I proposed it? Or would it
fall foul of the moratorium?

Frank
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Convert '165.0' to int

2011-07-23 Thread Frank Millman
On Jul 23, 9:42 am, Chris Angelico  wrote:
> On Sat, Jul 23, 2011 at 4:53 PM, Frank Millman  wrote:
> > The problem with that is that it will silently ignore any non-zero
> > digits after the point. Of course int(float(x)) does the same, which I
> > had overlooked.
>
> If you know that there will always be a trailing point, you can trim
> off any trailing 0s, then trim off a trailing '.', and then cast to
> int:
>
> int(s.rstrip('0').rstrip('.'))
>

I like it. 100% solution to the problem, and neater than the
alternatives.

Thanks

Frank
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Convert '165.0' to int

2011-07-23 Thread Frank Millman
On Jul 23, 10:23 am, Steven D'Aprano  wrote:
> Frank Millman wrote:
> > To recap, the original problem is that it would appear that some third-
> > party systems, when serialising int's into a string format, add a .0
> > to the end of the string. I am trying to get back to the original int
> > safely.
>
> > The ideal solution is the one I sketched out earlier - modify python's
> > 'int' function to accept strings such as '165.0'.
>
> > Do you think this would get any traction if I proposed it? Or would it
> > fall foul of the moratorium?
>
> No, and no. It would not get any traction -- indeed, many people, including
> myself, would oppose it. And it would not fall foul of the moratorium,
> because that is over.
>
> I can only point you to what I wrote in reference to somebody else's idea
> that changing Python was the most "convenient solution":
>
> http://www.mail-archive.com/python-list%40python.org/msg315552.html
>
> Python is a general purpose programming language. If int("1.0") does not do
> what you want, write a function that does, and use it instead! You said
> that you want a function that ignores a trailing .0 but warns you if
> there's some other decimal value. Easy:
>
> def my_int(astring):
>     # Untested
>     if astring.endswith(".0"):
>         astring = astring[:-2]
>     return int(astring)
>
> my_int("165.0") will return 165, as expected, while still raising an
> exception for float values "165.1" or "165.1E9".
>
> 90% of programming is deciding *precisely* what behaviour you want. Once
> you've done that, the rest is easy.
>
> Apart from debugging and writing documentation and making it fast enough,
> which is the other 90%.
>
> *wink*
>
> --
> Steven

No argument with any of that.

Frank
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Convert '165.0' to int

2011-07-24 Thread Frank Millman
On Jul 23, 5:12 pm, Billy Mays  wrote:
> On 7/23/2011 3:42 AM, Chris Angelico wrote:
>
>
>
> > int(s.rstrip('0').rstrip('.'))
>
> Also, it will (in?)correctly parse strings such as:
>
> '16500'
>
> to 165.
>
> --
> Bill

True enough.
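
For example -

    >>> int('16500'.rstrip('0').rstrip('.'))
    165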

If I really wanted to be 100% safe, how about this -

    def get_int(s):
        if '.' in s:
            num, dec = s.split('.', 1)
            if dec != '':
                if int(dec) != 0:
                    raise ValueError('Invalid literal for int')
            return int(num)
        return int(s)
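
which, if I have it right, would behave like this -

    >>> get_int('165')
    165
    >>> get_int('165.0')
    165
    >>> get_int('165.05')
    Traceback (most recent call last):
      ...
    ValueError: Invalid literal for int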

Frank
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Convert '165.0' to int

2011-07-24 Thread Frank Millman
On Jul 24, 9:34 am, Steven D'Aprano  wrote:
> Frank Millman wrote:
> > If I really wanted to be 100% safe, how about this -
>
> >     def get_int(s):
> >         if '.' in s:
> >             num, dec = s.split('.', 1)
> >             if dec != '':
> >                 if int(dec) != 0:
> >                     raise ValueError('Invalid literal for int')
> >             return int(num)
> >         return int(s)
>
> Consider what happens if you pass s = "42.-0".
>

G!

Ok, what if I change
    if int(dec) != 0:
to
    if [_ for _ in list(dec) if _ != '0']:

If I do this, I can get rid of the previous line -
    if dec != ''
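
Spelling that out, the revised function would read (a sketch, not tested) -

    def get_int(s):
        if '.' in s:
            num, dec = s.split('.', 1)
            if [_ for _ in list(dec) if _ != '0']:
                raise ValueError('Invalid literal for int')
            return int(num)
        return int(s)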

Am I getting closer?

Frank
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Convert '165.0' to int

2011-07-24 Thread Frank Millman
On Jul 24, 10:07 am, Chris Angelico  wrote:
> On Sun, Jul 24, 2011 at 5:58 PM, Frank Millman  wrote:
> >  if int(dec) != 0:
> > to
> >    if [_ for _ in list(dec) if _ != '0']:
>
> if dec.rtrim('0')!='':
>
> ChrisA

I think you meant 'rstrip', but yes, neater and faster.

Thanks

Frank
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Convert '165.0' to int

2011-07-24 Thread Frank Millman
On Jul 23, 8:28 pm, rantingrick  wrote:
> On Jul 23, 1:53 am, Frank Millman  wrote:
>
> >--
> > The ideal solution is the one I sketched out earlier - modify python's
> > 'int' function to accept strings such as '165.0'.
> >--
>
> NO! You create your OWN casting function for special cases.
>
> PythonZEN: "Special cases aren't special enough to break the rules."

BUT

"Although practicality beats purity".

I know I am flogging a dead horse here, but IMHO, '165', '165.',
'165.0', and '165.00' are all valid string representations of the
integer 165.[1]

Therefore, for practical purposes, it would not be wrong for python's
'int' function to accept these without complaining.

Just for fun, imagine that this had been done from python 1.x. Would
people now be clamouring for this 'wart' to be removed in python 3, or
would they say 'yeah, why not?'.

Frank

[1] Don't ask me why anyone would do this. I am dealing with a third-
party product that does exactly that.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Convert '165.0' to int

2011-07-24 Thread Frank Millman
On Jul 24, 10:53 am, Ben Finney  wrote:
> Frank Millman  writes:
> > I know I am flogging a dead horse here, but IMHO, '165', '165.',
> > '165.0', and '165.00' are all valid string representations of the
> > integer 165.[1]
>
> I disagree entirely. Once you introduce a decimal point into the
> representation, you're no longer representing an integer.
>
> (They might be the same *number*, but that's not saying the same thing.)
>

Fair enough. I never did CS101, so I am looking at this from a
layman's perspective. I am happy to be corrected.

Frank
-- 
http://mail.python.org/mailman/listinfo/python-list

