Re: Why re.match()?

2009-07-06 Thread kj
In  a...@pythoncraft.com (Aahz) writes:

>In article , kj   wrote:

>You may find this enlightening:

>http://www.python.org/doc/1.4/lib/node52.html

Indeed.  Thank you.

kj
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Clarity vs. code reuse/generality

2009-07-06 Thread Martin Vilcans
On Fri, Jul 3, 2009 at 4:05 PM, kj wrote:
> I will be teaching a programming class to novices, and I've run
> into a clear conflict between two of the principles I'd like to
> teach: code clarity vs. code reuse.  I'd love your opinion about
> it.

In general, code clarity is more important than reusability.
Unfortunately, many novice programmers have the opposite impression. I
have seen too much convoluted code written by beginners who try to
make the code generic. Writing simple, clear, to-the-point code is
hard enough as it is, even when not aiming at making it reusable.

If in the future you see an opportunity to reuse the code, then and
only then is the time to make it generic.

YAGNI is a wonderful principle.

-- 
mar...@librador.com
http://www.librador.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: A Bug By Any Other Name ...

2009-07-06 Thread Gabriel Genellina
En Mon, 06 Jul 2009 03:33:36 -0300, Gary Herron  
 escribió:

Gabriel Genellina wrote:
En Mon, 06 Jul 2009 00:28:43 -0300, Steven D'Aprano  
 escribió:

On Mon, 06 Jul 2009 14:32:46 +1200, Lawrence D'Oliveiro wrote:


I wonder how many people have been tripped up by the fact that

++n

and

--n

fail silently for numeric-valued n.


What do you mean, "fail silently"? They do exactly what you should  
expect:

++5  # positive of a positive number is positive


I'm not sure what "bug" you're seeing. Perhaps it's your expectations
that are buggy, not Python.


Well, those expectations are taken seriously when new features are  
introduced into the language - and sometimes the feature is dismissed  
just because it would be confusing for some.
If a += 1 works, expecting ++a to have the same meaning is very
reasonable (for those coming from languages with a ++ operator, like C
or Java) - all the more so since ++a is a perfectly valid expression.
If this issue isn't listed under the various "Python gotchas" articles,  
it should...


Well sure, it's not unreasonable to expect ++n and --n to behave as in  
other languages, and since they don't, perhaps they should be listed as  
a "Python gotcha". But even so, it's quite arrogant of the OP to flaunt  
his ignorance of the language by claiming this is a bug and a failure.   
It shouldn't have been all that hard for him to figure out what was  
really happening.


That depends on what you call a "bug". In his classic book "The Art of
Software Testing", Myers says that a program has a bug when it doesn't
perform as the user reasonably expects it to (not an exact quote, I
don't have the book at hand). That's a much broader definition than
developers like to accept.


In this case, a note in the documentation warning about the potential  
confusion would be fine.
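For what it's worth, the behaviour under discussion is easy to check interactively: ++n parses as two applications of the unary plus operator, not as an increment. A quick sketch:

```python
n = 5
# ++n parses as +(+n): unary plus applied twice, not an increment
assert ++n == 5
assert --n == 5        # -(-n): unary minus applied twice
assert -+n == -5       # -(+n)
n += 1                 # the actual Python idiom for incrementing
assert n == 6
```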


--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list


Re: Clarity vs. code reuse/generality

2009-07-06 Thread Andre Engels
On Mon, Jul 6, 2009 at 9:44 AM, Martin Vilcans wrote:
> On Fri, Jul 3, 2009 at 4:05 PM, kj wrote:
>> I will be teaching a programming class to novices, and I've run
>> into a clear conflict between two of the principles I'd like to
>> teach: code clarity vs. code reuse.  I'd love your opinion about
>> it.
>
> In general, code clarity is more important than reusability.
> Unfortunately, many novice programmers have the opposite impression. I
> have seen too much convoluted code written by beginners who try to
> make the code generic. Writing simple, clear, to-the-point code is
> hard enough as it is, even when not aiming at making it reusable.
>
> If in the future you see an opportunity to reuse the code, then and
> only then is the time to make it generic.

Not just that: when you actually get to that point, making simple and
clear code generic is often easier than making
complicated-and-supposedly-generic code that little bit more generic
than you need.


-- 
André Engels, andreeng...@gmail.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: A Bug By Any Other Name ...

2009-07-06 Thread Tim Golden

Gabriel Genellina wrote:

[... re confusion over ++n etc ...]

In this case, a note in the documentation warning about the potential 
confusion would be fine.


The difficulty here is knowing where to put such a warning.
You obviously can't put it against the "++" operator as such
because... there isn't one. You could put it against the unary
plus operator, but who's going to look there? :)

I've wondered for a while whether it wouldn't be a good move
to include the official (or any other) Python FAQ into the
standard docs set. If we did, that would be the obvious place
for this piece of documentation, seems to me.

TJG
--
http://mail.python.org/mailman/listinfo/python-list


Re: Python and webcam capture delay?

2009-07-06 Thread Stef Mientki

jack catcher (nick) wrote:

Hi,

I'm thinking of using Python for capturing and showing live webcam 
stream simultaneously between two computers via local area network. 
Operating system is Windows. I'm going to begin with VideoCapture 
extension, no ideas about other implementation yet. Do you have any 
suggestions on how short delay I should hope to achieve in showing the 
video? This would be part of a psychological experiment, so I would 
need to deliver the video stream with a reasonable delay (say, below 
100ms).

I would first check whether the VideoCapture extension works at all.
I couldn't get it working, ...
... it seems to start OK
... but moving the captured window,
... or sometimes even just moving the mouse, hangs the program :-(

So I'm still looking for a good, open-source, working video capture
solution in Python.

I can make a good working capture in Delphi,
make a DLL or ActiveX from that and use it in Python,
but I'm not allowed to distribute these.

cheers,
Stef
--
http://mail.python.org/mailman/listinfo/python-list


Re: generation of keyboard events

2009-07-06 Thread RAM
On 5 July, 17:12, Tim Harig  wrote:
> On 2009-07-05, RAM  wrote:
>
> > I need to start an external program and pass the keyboard events like
> > F1,Right arrow key etc to the program..I am trying to use the
> > subprocess module to invoke the external program. I am able to invoke
> > but not able to generate the keyboard events and pass them on to the
>
> catb.org/esr/faqs/smart-questions.html
>
> You have told us nothing about the environment where you are trying to
> accomplish this.  GUI, CLI, Unix, Windows, etc? So I suggest that you
> check out the curses getch functions.  You can find them in the standard
> library documentation at http://docs.python.org.  You should also reference
> the documentation for the C version in your system's man pages.

Hi Tim,

I am trying to do this on Windows. My program (an executable) is
written in VC++, and when I run it I need to click one button on the
program GUI, i.e. I just press the Enter key on the keyboard. But this
is a manual step. So I need to write a Python script which invokes my
program and passes the "Enter key" event to it, so that it runs
without manual intervention.

Thanks in advance for the help.

regards
Sreerama V
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Code that ought to run fast, but can't due to Python limitations.

2009-07-06 Thread David M . Cooke
Martin v. Löwis  v.loewis.de> writes:

> > This is a good test for Python implementation bottlenecks.  Run
> > that tokenizer on HTML, and see where the time goes.
> 
> I looked at it with cProfile, and the top function that comes up
> for a larger document (52k) is
> ...validator.HTMLConformanceChecker.__iter__.
[...]
> With this simple optimization, I get a 20% speedup on my test
> case. In my document, there are no attributes - the same changes
> should be made to attribute validation routines.
> 
> I don't think this has anything to do with the case statement.

I agree. I ran cProfile over just the tokenizer step; essentially

tokenizer = html5lib.tokenizer.HTMLStream(htmldata)
for tok in tokenizer:
pass

It mostly *isn't* tokenizer.py that's taking the most time, it's
inputstream.py. (There is one exception:
tokenizer.py:HTMLStream.__init__ constructs a dictionary of states
each time -- this is unnecessary, replace all expressions like
self.states["attributeName"] with self.attributeNameState.)

I've done several optimisations -- I'll upload the patch to the
html5lib issue tracker. In particular,

* The .position property of EncodingBytes is used a lot. Every
self.position +=1 calls getPosition() and setPosition(). Another
getPosition() call is done in the self.currentByte property. Most of
these can be optimised away by using methods that move the position
and return the current byte.

* In HTMLInputStream, the current line number and column are updated
every time a new character is read with .char(). The current position
is *only* used in error reporting, so I reworked it to only calculate
the position when .position() is called, by keeping track of the
number of lines in previous read chunks, and computing the number of
lines to the current offset in the current chunk.

These give me about a 20% speedup.

This just illustrates that the first step in optimisation is profiling :D
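As an illustration of that first step, here is a minimal cProfile sketch; tokenize() below is only a stand-in for the html5lib tokenizer loop quoted above, since the real dependency may not be installed:

```python
import cProfile
import io
import pstats

def tokenize(htmldata):
    # Stand-in for: for tok in html5lib.tokenizer.HTMLStream(htmldata): pass
    count = 0
    for ch in htmldata:
        count += 1
    return count

profiler = cProfile.Profile()
profiler.enable()
tokenize("<p>hello</p>" * 1000)
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())  # the top entries show where the time went
```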

As other posters have said, slurping the whole document into memory
and using a regexp-based parser (such as pyparsing) would likely give
you the largest speedups. If you want to keep the chunk- based
approach, you can still use regexp's, but you'd have to think about
matching on chunk boundaries. One way would be to guarantee a minimum
number of characters available, say 10 or 50 (unless end-of-file, of
course) -- long enough such that any *constant* string you'd want to
match like 

Re: A Bug By Any Other Name ...

2009-07-06 Thread Lawrence D'Oliveiro
In message , Tim Golden 
wrote:

> The difficulty here is knowing where to put such a warning.
> You obviously can't put it against the "++" operator as such
> because... there isn't one.

This bug is an epiphenomenon. :)

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: A Bug By Any Other Name ...

2009-07-06 Thread Chris Rebert
On Mon, Jul 6, 2009 at 1:29 AM, Lawrence
D'Oliveiro wrote:
> In message , Tim Golden
> wrote:
>
>> The difficulty here is knowing where to put such a warning.
>> You obviously can't put it against the "++" operator as such
>> because... there isn't one.
>
> This bug is an epiphenomenon. :)

Well, like I suggested, it /could/ be made an operator (or rather, a
lexer token) which just causes a compile/parse error.

Cheers,
Chris
-- 
http://blog.rebertia.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: A Bug By Any Other Name ...

2009-07-06 Thread alex23
On Jul 6, 5:56 pm, Tim Golden  wrote:
> Gabriel Genellina wrote:
> > In this case, a note in the documentation warning about the potential
> > confusion would be fine.
>
> The difficulty here is knowing where to put such a warning.
> You obviously can't put it against the "++" operator as such
> because... there isn't one. You could put it against the unary
> plus operator, but who's going to look there? :)

The problem is: where do you stop? If you're going to add something to
the documentation to address every expectation someone might hold
coming from another language, the docs are going to get pretty big.

I think a language should be intuitive within itself, but not be
required to be intuitable based on _other_ languages (unless, of
course, that's an objective of the language). If I expect something in
language-A to operate the same way as completely-unrelated-language-B,
I'd see that as a failing on my behalf, especially if I hadn't read
language-A's documentation first. I'm not averse to one language
being _explained_ in terms of another, but would much prefer to see
those relegated to "Python for  programmers" articles rather than
in the main docs themselves.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Code that ought to run fast, but can't due to Python limitations.

2009-07-06 Thread Lawrence D'Oliveiro
In message <4a4f91f9$0$1587$742ec...@news.sonic.net>, John Nagle wrote:

> ("It should be written in C" is not an acceptable answer.)

I don't see why not. State machines that have to process input byte by byte 
are well known to be impossible to implement efficiently in high-level 
languages. That's why lex/flex isn't worth using. Even GCC has been moving 
to hand-coded parsers, first the lexical analyzer, and more recently even 
for the syntax parser (getting rid of yacc/bison), certainly for C.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: generation of keyboard events

2009-07-06 Thread Simon Brunning
2009/7/6 RAM :

> I am trying to do this on Windows. My program (an executable) is
> written in VC++, and when I run it I need to click one button on the
> program GUI, i.e. I just press the Enter key on the keyboard. But this
> is a manual step. So I need to write a Python script which invokes my
> program and passes the "Enter key" event to it, so that it runs
> without manual intervention.

Try .

-- 
Cheers,
Simon B.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: A Bug By Any Other Name ...

2009-07-06 Thread John Machin
On Jul 6, 12:32 pm, Lawrence D'Oliveiro  wrote:
> I wonder how many people have been tripped up by the fact that
>
>     ++n
>
> and
>
>     --n
>
> fail silently for numeric-valued n.

What fail? In Python, ++n and --n are fatuous expressions which
SUCCEED silently except in rare circumstances, e.g. --n will cause an
overflow exception on older CPython versions if isinstance(n, int) and
n == -sys.maxint - 1.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why is my code faster with append() in a loop than with a large list?

2009-07-06 Thread Vilya Harvey
2009/7/6 Xavier Ho :
> Why is version B of the code faster than version A? (Only three lines
> different)

Here's a guess:

As the number you're testing gets larger, version A is creating a very
big list. I'm not sure exactly how much overhead each list entry has
in python, but I guess it's at least 8 bytes: a 32-bit reference for
each list entry, and 32 bits to hold the int value (assuming a 32-bit
version of python). The solution you're looking for is a large 8 digit
number; let's say 80,000,000, for the sake of easy calculation. That
means, as you get close to the solution, you'll be trying to allocate
almost 640 Mb of memory for every number you're checking. That's going
to make the garbage collector work extremely hard. Also, depending on
how much memory your computer has free, you'll probably start hitting
virtual memory too, which will slow you down even further. Finally,
the reduce step has to process all 80,000,000 elements which is
clearly going to take a while.

Version B creates a list which is only as long as the largest prime
factor, so at worst the list size will be approx. sqrt(80,000,000),
which is approx. 8,900 elements or approx. 72 KB of memory - a much
more manageable size.
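The per-entry guess above can be sanity-checked with sys.getsizeof; a rough sketch (exact numbers vary by Python version and platform):

```python
import sys

xs = list(range(100_000))
# The list object stores one pointer per entry (8 bytes on a 64-bit
# build), on top of the int objects those pointers refer to:
per_entry = (sys.getsizeof(xs) - sys.getsizeof([])) / len(xs)
print(per_entry)             # roughly 8 bytes of pointer per entry
print(sys.getsizeof(xs[0]))  # each small int object costs more again
```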

Hope that helps,

Vil.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: A Bug By Any Other Name ...

2009-07-06 Thread Terry Reedy

Gabriel Genellina wrote:


In this case, a note in the documentation warning about the potential 
confusion would be fine.


How would that help someone who does not read the doc?


--
http://mail.python.org/mailman/listinfo/python-list


Tree structure consuming lot of memory

2009-07-06 Thread mayank gupta
Hi,

I am creating a tree data-structure in python; with nodes of the tree
created by a simple class :

class Node:
    def __init__(self, ...):   # other attributes go here
        # initialise the attributes here!!

But the problem is I am working with a huge tree (millions of nodes); and
each node is consuming much more memory than it should. After a little
analysis, I found out that in general it uses about 1.4 kb of memory for
each node!!
I will be grateful if someone could help me optimize the memory usage.
Thanks.

Regards,
Mayank


-- 
I luv to walk in rain bcoz no one can see me crying
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: A Bug By Any Other Name ...

2009-07-06 Thread Steven D'Aprano
On Mon, 06 Jul 2009 02:19:51 -0300, Gabriel Genellina wrote:

> En Mon, 06 Jul 2009 00:28:43 -0300, Steven D'Aprano
>  escribió:
>> On Mon, 06 Jul 2009 14:32:46 +1200, Lawrence D'Oliveiro wrote:
>>
>>> I wonder how many people have been tripped up by the fact that
>>>
>>> ++n
>>>
>>> and
>>>
>>> --n
>>>
>>> fail silently for numeric-valued n.
>>
>> What do you mean, "fail silently"? They do exactly what you should
>> expect:
> ++5  # positive of a positive number is positive
>>
>> I'm not sure what "bug" you're seeing. Perhaps it's your expectations
>> that are buggy, not Python.
> 
> Well, those expectations are taken seriously when new features are
> introduced into the language - and sometimes the feature is dismissed
> just because it would be confusing for some. If a += 1 works, expecting
> ++a to have the same meaning is very reasonable (for those coming from
> languages with a ++ operator, like C or Java) - all the more so since ++a
> is a perfectly valid expression. If this issue isn't listed under the various
> "Python gotchas" articles, it should...

The fact that it isn't suggests strongly to me that it isn't that common 
a surprise even for Java and C programmers. This is the first time I've 
seen anyone raise it as an issue.

There are plenty of other languages other than Java and C. If we start 
listing every feature of Python that's different from some other 
language, we'll never end.

For what it's worth, Ruby appears to behave the same as Python:

$ irb
irb(main):001:0> n = 5
=> 5
irb(main):002:0> ++n
=> 5
irb(main):003:0> --n
=> 5
irb(main):004:0> -+n
=> -5



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tree structure consuming lot of memory

2009-07-06 Thread Chris Rebert
On Mon, Jul 6, 2009 at 2:55 AM, mayank gupta wrote:
> Hi,
>
> I am creating a tree data-structure in python; with nodes of the tree
> created by a simple class :
>
> class Node:
>     def __init__(self, ...):   # other attributes go here
>         # initialise the attributes here!!
>
> But the problem is I am working with a huge tree (millions of nodes); and
> each node is consuming much more memory than it should. After a little
> analysis, I found out that in general it uses about 1.4 kb of memory for
> each node!!
> I will be grateful if someone could help me optimize the memory usage.

(1) Use __slots__ (see http://docs.python.org/reference/datamodel.html#slots)
(2) Use some data structure other than a tree
(3) Rewrite your Node/Tree implementation in C
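A minimal sketch of option (1): slotted instances carry no per-instance __dict__, which is where most of the overhead goes (the class names below are made up for illustration):

```python
import sys

class NodeDict:
    def __init__(self, value):
        self.value = value
        self.children = []

class NodeSlots:
    __slots__ = ("value", "children")  # fixed attribute set, no __dict__
    def __init__(self, value):
        self.value = value
        self.children = []

plain = NodeDict(1)
slotted = NodeSlots(1)
# Per-instance cost drops because slotted instances have no __dict__:
print(sys.getsizeof(plain) + sys.getsizeof(plain.__dict__))
print(sys.getsizeof(slotted))  # noticeably smaller
```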

Cheers,
Chris
-- 
http://blog.rebertia.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why is my code faster with append() in a loop than with a large list?

2009-07-06 Thread Dave Angel

Xavier Ho wrote:

(Here's a short version of the long version below if you don't want to
read:)

Why is version B of the code faster than version A? (Only three lines
different)

Version A: http://pastebin.com/f14561243
Version B: http://pastebin.com/f1f657afc



I was doing the problems on Project Euler for practice with Python last
night. Problem 12 was to find the value of the first triangular number that
has over 500 divisors.
=

The sequence of triangle numbers is generated by adding the natural numbers.
So the 7th triangle number would be 1 + 2 + 3 + 4 + 5 +
6 + 7 = 28. The first ten terms would be:

1, 3, 6, 10, 15, 21, 28, 36, 45, 55, ...

Let us list the factors of the first seven triangle numbers:

 1: 1
 3: 1,3
 6: 1,2,3,6
10: 1,2,5,10
15: 1,3,5,15
21: 1,3,7,21
28: 1,2,4,7,14,28

We can see that 28 is the first triangle number to have over five divisors.

What is the value of the first triangle number to have over five hundred
divisors?
=

My initial code was to loop through from 1 to half the number and see which
were divisors, and as I find them I store them in a list. That would have
taken days.

My second try was factorising the number each time, and count the divisors
using the powers of each factor, plus 1, and multiply together.
The code is here (Version A): http://pastebin.com/f14561243

This worked, but it took overnight to compute. Before I went to bed a friend
of mine caught me online, and apparently left me a working version under 8
seconds with only 3 line difference.
The code is here (Version B): http://pastebin.com/f1f657afc

That was amazing. But I have no idea why his edit makes it so much faster. I
did a test to see whether append() was faster (which I doubted) than
defining a list with a large size to begin with, and I was right:
http://pastebin.com/f4b46d0db
Which shows that appending is 40x slower, and was expected. But I still
can't puzzle out why his use of appending in Version B was so much faster
than mine.

Any insights would be welcome. I'm going on a family trip, though, so my
replies may delay.

Best regards,

Ching-Yun "Xavier" Ho, Technical Artist

Contact Information
Mobile: (+61) 04 3335 4748
Skype ID: SpaXe85
Email: cont...@xavierho.com
Website: http://xavierho.com/

  
Just by inspection, it would seem the bottleneck in your first version 
is that you return a huge list of nearly all zeroes, from factorize().  
This slows down countDivisors() a great deal.


It would probably save some time to not bother storing the zeroes in the 
list at all.  And it should help if you were to step through a list of 
primes, rather than trying every possible int.  Or at least constrain 
yourself to odd numbers (after the initial case of 2).
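A sketch of the factorisation-based divisor count with those constraints applied: divide out each prime power, multiply together (exponent + 1) for each factor, and after 2 try only odd candidates:

```python
def count_divisors(n):
    # If n = p1**a1 * p2**a2 * ..., the divisor count is
    # (a1 + 1) * (a2 + 1) * ...
    count = 1
    f = 2
    while f * f <= n:
        power = 0
        while n % f == 0:
            n //= f
            power += 1
        count *= power + 1
        f += 1 if f == 2 else 2  # after 2, only odd candidates
    if n > 1:        # a single prime factor remains
        count *= 2
    return count

assert count_divisors(28) == 6   # 1, 2, 4, 7, 14, 28
```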


DaveA
--
http://mail.python.org/mailman/listinfo/python-list


Re: Tree structure consuming lot of memory

2009-07-06 Thread mayank gupta
Thanks for the other possibilites. I would consider option (2) and (3) to
improve my code.

But out of curiosity, I would still like to know why an object of a
Python class consumes "so" much memory (1.4 KB), and why this memory
usage has nothing to do with its attributes.

Thanks

Regards.

On Mon, Jul 6, 2009 at 12:03 PM, Chris Rebert  wrote:

> On Mon, Jul 6, 2009 at 2:55 AM, mayank gupta wrote:
> > Hi,
> >
> > I am creating a tree data-structure in python; with nodes of the tree
> > created by a simple class :
> >
> > class Node:
> >     def __init__(self, ...):   # other attributes go here
> >         # initialise the attributes here!!
> >
> > But the problem is I am working with a huge tree (millions of nodes); and
> > each node is consuming much more memory than it should. After a little
> > analysis, I found out that in general it uses about 1.4 kb of memory for
> > each node!!
> > I will be grateful if someone could help me optimize the memory usage.
>
> (1) Use __slots__ (see
> http://docs.python.org/reference/datamodel.html#slots)
> (2) Use some data structure other than a tree
> (3) Rewrite your Node/Tree implementation in C
>
> Cheers,
> Chris
> --
> http://blog.rebertia.com
>



-- 
I luv to walk in rain bcoz no one can see me crying
-- 
http://mail.python.org/mailman/listinfo/python-list


Opening a SQLite database in readonly mode

2009-07-06 Thread Paul Moore
The SQLite documentation mentions a flag, SQLITE_OPEN_READONLY, to
open a database read only. I can't find any equivalent documented in
the Python standard library documentation for the sqlite3 module (or,
for that matter, on the pysqlite library's website).

Is it possible to open a sqlite database in readonly mode, in Python?

Thanks,
Paul.
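For what it's worth, later versions of the stdlib sqlite3 module (Python 3.4 and up) accept SQLite URIs in connect(), and mode=ro maps onto SQLITE_OPEN_READONLY; a sketch:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.db")

# Create a database normally first.
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE t (x INTEGER)")
conn.commit()
conn.close()

# From Python 3.4, sqlite3.connect() accepts SQLite URIs, and
# mode=ro opens the database read-only (SQLITE_OPEN_READONLY):
ro = sqlite3.connect("file:%s?mode=ro" % path, uri=True)
try:
    ro.execute("INSERT INTO t VALUES (1)")
except sqlite3.OperationalError as exc:
    print("write refused:", exc)
ro.close()
```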
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why is my code faster with append() in a loop than with a large list?

2009-07-06 Thread Xavier Ho
Thanks for the response all, I finally got my 'net working on the mountains,
and I think your reasons are quite sound. I'll keep that in mind for the
future.

Best regards,

Ching-Yun "Xavier" Ho, Technical Artist

Contact Information
Mobile: (+61) 04 3335 4748
Skype ID: SpaXe85
Email: cont...@xavierho.com
Website: http://xavierho.com/


On Mon, Jul 6, 2009 at 6:09 PM, Dave Angel  wrote:

> Xavier Ho wrote:
>
> [... full quote of the original message trimmed ...]
>
> Just by inspection, it would seem the bottleneck in your first version is
> that you return a huge list of nearly all zeroes, from factorize().  This
> slows down countDivisors() a great deal.
>
> It would probably save some time to not bother storing the zeroes in the
> list at all.  And it should help if you were to step through a list of
> primes, rather than trying every possible int.  Or at least constrain
> yourself to odd numbers (after the initial case of 2).
>
> DaveA
>
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: A Bug By Any Other Name ...

2009-07-06 Thread Hendrik van Rooyen
"Terry Reedy"  wrote:

> Gabriel Genellina wrote:
> >
> > In this case, a note in the documentation warning about the potential 
> > confusion would be fine.
> 
> How would that help someone who does not read the doc?

It obviously won't.

All it will do, is that it will enable people on this group,
who may read the manual, to tell people who complain,
to RTFM.

 I agree that it would be a good idea to make it an 
error, or a warning - "this might not do what you
think it does", or an "are you sure?" exception.

  :-)

- Hendrik



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How Python Implements "long integer"?

2009-07-06 Thread Pedram
OK, fine, I read longobject.c at last! :)
I found that longobject is a structure like this:

struct _longobject {
struct _object *_ob_next;
struct _object *_ob_prev;
Py_ssize_t ob_refcnt;
struct _typeobject *ob_type;
digit ob_digit[1];
}

And a digit is one of C's unsigned short integers, holding 15 bits of
the number, with ob_digit being a variable-length array of them.
Am I right? Or did I miss something? Is this structure the same in
all environments (Linux, Windows, mobiles, etc.)?
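The variable-length ob_digit array can be observed indirectly from Python: sys.getsizeof grows stepwise as more digits are needed (the digit width and header size vary by build):

```python
import sys

# Long integers grow by whole digits; the exact digit width (15 or
# 30 bits) and the header size depend on the CPython build.
sizes = [sys.getsizeof(1 << exp) for exp in (0, 15, 30, 60, 120)]
print(sizes)  # non-decreasing, stepping up as digits are added
```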
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Clarity vs. code reuse/generality

2009-07-06 Thread Scott David Daniels

Andre Engels wrote:

On Mon, Jul 6, 2009 at 9:44 AM, Martin Vilcans wrote:

On Fri, Jul 3, 2009 at 4:05 PM, kj wrote:

I will be teaching a programming class to novices, and I've run
into a clear conflict between two of the principles I'd like to
teach: code clarity vs. code reuse.  I'd love your opinion about
it.

In general, code clarity is more important than reusability.
Unfortunately, many novice programmers have the opposite impression. I
have seen too much convoluted code written by beginners who try to
make the code generic. Writing simple, clear, to-the-point code is
hard enough as it is, even when not aiming at making it reusable.

If in the future you see an opportunity to reuse the code, then and
only then is the time to make it generic.


Not just that: when you actually get to that point, making simple and
clear code generic is often easier than making
complicated-and-supposedly-generic code that little bit more generic
than you need.


First, a quote which took me a bit to find:
Thomas William Körner paraphrasing Pólya and Szegő
in A Companion to Analysis:
Recalling that 'once is a trick, twice is a method,
thrice is a theorem, and four times a theory,' we
seek to codify this insight.

Let us apply this insight:
Suppose in writing code, we pretty much go with that.
A method is something you notice, a theorem is a function, and
a theory is a generalized function.

Even though we like DRY ("don't repeat yourself") as a maxim, let
it go the first time and wait until you see the pattern (a possible
function).  I'd go with a function first, a pair of functions, and
only then look to abstracting the function.

--Scott David Daniels
scott.dani...@acm.org
--
http://mail.python.org/mailman/listinfo/python-list


Re: Clarity vs. code reuse/generality

2009-07-06 Thread Jean-Michel Pichavant

kj wrote:

 I've rewritten it like this:

sense = cmp(func(hi), func(lo))
assert sense != 0, "func is not strictly monotonic in [lo, hi]"

Thanks for your feedback!

kj
  


As already said before, unlike in some other languages, "sense" in
English does **not** mean direction. You should rewrite this part using
a better name. Wrong information is far worse than no information at all.
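A possible rename, sketched for Python 3 (where cmp() no longer exists); "direction" is a made-up name for illustration:

```python
def direction(func, lo, hi):
    # Sign of the change of func over [lo, hi]; replaces
    # cmp(func(hi), func(lo)) from the original snippet.
    delta = func(hi) - func(lo)
    assert delta != 0, "func is not strictly monotonic in [lo, hi]"
    return 1 if delta > 0 else -1

assert direction(lambda x: 2 * x, 0, 10) == 1   # increasing
assert direction(lambda x: -x, 0, 10) == -1     # decreasing
```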


JM
--
http://mail.python.org/mailman/listinfo/python-list


Re: Creating alot of class instances?

2009-07-06 Thread Scott David Daniels

Steven D'Aprano wrote:

... That's the Wrong Way to do it --
you're using a screwdriver to hammer a nail


Don't knock tool abuse (though I agree with you here).
Sometimes tool abuse can produce good results.  For
example, using hammers to drive screws for temporary
strong holds led to making better nails.

--Scott David Daniels
scott.dani...@acm.org
--
http://mail.python.org/mailman/listinfo/python-list


VirtualEnv

2009-07-06 Thread Ronn Ross
I'm attempting to write a bootstrap script for virtualenv. I just want to do
a couple of easy_install's after the environment is created. It was fairly
easy to create the script, but I can't figure out how to implement it. The
documentation was not of much help. Can someone please point me in the right
direction?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Code that ought to run fast, but can't due to Python limitations.

2009-07-06 Thread Jean-Michel Pichavant



protocol = {"start":initialiser,"hunt":hunter,"classify":classifier,other
states}

def state_machine():
next_step = protocol["start"]()
while True:
next_step = protocol[next_step]()

  
Woot ! I'll keep this one in mind; while I may not be as concerned 
about speed as the OP, I still find this way of doing it very simple and 
intuitive (one could successfully argue that I would not have figured it 
out by myself, if it were so intuitive).
Anyway, I wanted to participate in this thread: as soon as I saw 'due to 
python limitations' in the title, I foretold a hell of a thread ! This 
is just provocation ! :-)
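For anyone who wants to try the quoted snippet, here is a minimal runnable
version of the dict-dispatched state machine. The state names and handler
bodies are made up, and a "stop" sentinel is added so the loop terminates:

```python
trace = []  # records the states we pass through, for demonstration

def initialiser():
    trace.append("start")
    return "hunt"

def hunter():
    trace.append("hunt")
    return "classify"

def classifier():
    trace.append("classify")
    return "stop"   # sentinel added so the loop can end

protocol = {"start": initialiser, "hunt": hunter, "classify": classifier}

def state_machine():
    # Each handler does its work, then returns the name of the next state.
    next_step = protocol["start"]()
    while next_step != "stop":
        next_step = protocol[next_step]()

state_machine()
print(trace)  # ['start', 'hunt', 'classify']
```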


JM
--
http://mail.python.org/mailman/listinfo/python-list


Re: A Bug By Any Other Name ...

2009-07-06 Thread pdpi
On Jul 6, 1:12 pm, "Hendrik van Rooyen"  wrote:
> "Terry Reedy"  wrote:
> > Gabriel Genellina wrote:
>
> > > In this case, a note in the documentation warning about the potential
> > > confusion would be fine.
>
> > How would that help someone who does not read the doc?
>
> It obviously won't.
>
> All it will do, is that it will enable people on this group,
> who may read the manual, to tell people who complain,
> to RTFM.
>
>  I agree that it would be a good idea to make it an
> error, or a warning - "this might not do what you
> think it does", or an "are you sure?" exception.
>
>   :-)
>
> - Hendrik

I dunno. Specifically recognizing (and emitting code for) a token
that's not actually part of the language because people coming from
other languages think it exists seems like the start of a fustercluck.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How Python Implements "long integer"?

2009-07-06 Thread Mark Dickinson
On Jul 6, 1:24 pm, Pedram  wrote:
> OK, fine, I read longobject.c at last! :)
> I found that longobject is a structure like this:
>
> struct _longobject {
>     struct _object *_ob_next;
>     struct _object *_ob_prev;

For current CPython, these two fields are only present in debug
builds;  for a normal build they won't exist.

>     Py_ssize_t ob_refcnt;
>     struct _typeobject *ob_type;

You're missing an important field here (see the definition of
PyObject_VAR_HEAD):

Py_ssize_t ob_size; /* Number of items in variable part */

For the current implementation of Python longs, the absolute value of
this field gives the number of digits in the long;  the sign gives the
sign of the long (0L is represented with zero digits).

>     digit ob_digit[1];

Right.  This is an example of the so-called 'struct hack' in C; it
looks as though there's just a single digit, but what's intended here
is that there's an array of digits tacked onto the end of the struct;
for any given PyLongObject, the size of this array is determined at
runtime.  (C99 allows you to write this as simply ob_digit[], but not
all compilers support this yet.)

> }

> And a digit is a 15-item array of C's unsigned short integers.

No: a digit is a single unsigned short, which is used to store 15 bits
of the Python long.  Python longs are stored in sign-magnitude format,
in base 2**15.  So each of the base 2**15 'digits' is an integer in
the range [0, 32767).  The unsigned short type is used to store those
digits.

Exception: for Python 2.7+ or Python 3.1+, on 64-bit machines, Python
longs are stored in base 2**30 instead of base 2**15, using a 32-bit
unsigned integer type in place of unsigned short.
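This digit growth is observable from pure Python: since each extra 15 (or 30)
bits of magnitude adds one more digit to the struct, sys.getsizeof grows with
the value. (Exact byte counts are CPython- and version-specific, so none are
asserted here; in Python 3 all ints use this representation.)

```python
import sys

# Larger magnitudes need more base-2**15 (or 2**30) digits, so the
# object itself gets bigger.  Exact sizes vary across CPython versions.
for bits in (1, 15, 30, 60, 120, 240):
    print(bits, sys.getsizeof(1 << bits))

# The only safe claim: more bits, more digits, more bytes.
assert sys.getsizeof(1 << 240) > sys.getsizeof(1 << 1)
```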

> Is this structure is constant in
> all environments (Linux, Windows, Mobiles, etc.)?

I think it would be dangerous to rely on this struct staying constant,
even just for CPython.  It's entirely possible that the representation
of Python longs could change in Python 2.8 or 3.2.  You should use the
public, documented C-API whenever possible.

Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


How to map size_t using ctypes?

2009-07-06 Thread Philip Semanchuk

Hi all,
I can't figure out how to map a C variable of size_t via Python's  
ctypes module. Let's say I have a C function like this:


void populate_big_array(double *the_array, size_t element_count) {...}

How would I pass parameter 2? A long (or ulong) will (probably) work  
(on most platforms), but I like my code to be more robust than that.  
Furthermore, this is scientific code and it's entirely possible that  
someone will want to pass a huge array with more elements than can be  
described by a 32-bit long.
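For reference, ctypes does expose ctypes.c_size_t, which is sized to match the
platform's size_t. A sketch of how the prototype above could be declared with
it (the actual library loading is omitted, and populate_big_array is the
hypothetical C function from the question):

```python
import ctypes

# c_size_t matches the width of the platform's size_t.
print(ctypes.sizeof(ctypes.c_size_t))  # 4 on 32-bit, 8 on 64-bit platforms

def declare_prototype(lib):
    # Declare:  void populate_big_array(double *the_array, size_t element_count)
    lib.populate_big_array.restype = None
    lib.populate_big_array.argtypes = [ctypes.POINTER(ctypes.c_double),
                                       ctypes.c_size_t]

# Building the array argument:
count = 1000
big_array = (ctypes.c_double * count)()
# lib.populate_big_array(big_array, count)   # once lib is loaded via CDLL
```

With argtypes declared, ctypes converts the Python int to size_t itself and
raises an error on overflow instead of silently truncating.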



Suggestions appreciated.

Thanks
Philip
--
http://mail.python.org/mailman/listinfo/python-list


Re: finding most common elements between thousands of multiple arrays.

2009-07-06 Thread Scott David Daniels

Peter Otten wrote:

Scott David Daniels wrote:


Scott David Daniels wrote:



 t = timeit.Timer('sum(part[:-1]==part[1:])',
  'from __main__ import part')


What happens if you calculate the sum in numpy? Try

t = timeit.Timer('(part[:-1]==part[1:]).sum()',
 'from __main__ import part')


Good idea, I hadn't thought of adding numpy bools.
(part[:-1]==part[1:]).sum()
is only a slight improvement over
len(part[part[:-1]==part[1:]])
when there are few elements, but it is almost twice
as fast when there are a lot (reflecting the work
of allocating and copying).

>>> import numpy
>>> import timeit
>>> original = numpy.random.normal(0, 100, (1000, 1000)).astype(int)
>>> data = original.flatten()
>>> data.sort()
>>> t = timeit.Timer('sum(part[:-1]==part[1:])',
 'from __main__ import part')
>>> u = timeit.Timer('len(part[part[:-1]==part[1:]])',
 'from __main__ import part')
>>> v = timeit.Timer('(part[:-1]==part[1:]).sum()',
 'from __main__ import part')

>>> part = data[::100]
>>> (part[:-1]==part[1:]).sum()
9390
>>> t.repeat(3, 10)
[0.56368281443587875, 0.55615057220961717, 0.55465764503594528]
>>> u.repeat(3, 1000)
[0.89576580263690175, 0.89276374511291579, 0.8937328626963108]
>>> v.repeat(3, 1000)
[0.24798598704592223, 0.24715431709898894, 0.24498979618920202]
>>>
>>> part = original.flatten()[::100]
>>> (part[:-1]==part[1:]).sum()
27
>>> t.repeat(3, 10)
[0.57576898739921489, 0.56410158274297828, 0.56988248506445416]
>>> u.repeat(3, 1000)
[0.27312186325366383, 0.27315007913011868, 0.27214492344683094]
>>> v.repeat(3, 1000)
[0.28410342655297427, 0.28374053126867693, 0.28318990262732768]
>>>

Net result: go back to former definition of candidates (a number,
not the actual entries), but calculate that number as matches.sum(),
not len(part[matches]).

Now the latest version of this (compressed) code:
> ...
> sampled = data[::stride]
> matches = sampled[:-1] == sampled[1:]
> candidates = sum(matches) # count identified matches
> while candidates > N * 10: # 10 -- heuristic
> stride *= 2 # # heuristic increase
> sampled = data[::stride]
> matches = sampled[:-1] == sampled[1:]
> candidates = sum(matches)
> while candidates < N * 3: # heuristic slop for long runs
> stride //= 2 # heuristic decrease
> sampled = data[::stride]
> matches = sampled[:-1] == sampled[1:]
> candidates = sum(matches)
> former = None
> past = 0
> for value in sampled[matches]:
> ...
is:
  ...
  sampled = data[::stride]
  matches = sampled[:-1] == sampled[1:]
  candidates = matches.sum() # count identified matches
  while candidates > N * 10: # 10 -- heuristic
  stride *= 2 # # heuristic increase
  sampled = data[::stride]
  matches = sampled[:-1] == sampled[1:]
  candidates = matches.sum()
  while candidates < N * 3: # heuristic slop for long runs
  stride //= 2 # heuristic decrease
  sampled = data[::stride]
  matches = sampled[:-1] == sampled[1:]
  candidates = matches.sum()
  former = None
  past = 0
  for value in sampled[matches]:
  ...

Now I think I can let this problem go, esp. since it was
mclovin's problem in the first place.

--Scott David Daniels
scott.dani...@acm.org
--
http://mail.python.org/mailman/listinfo/python-list


Re: A Bug By Any Other Name ...

2009-07-06 Thread Mark Dickinson
On Jul 6, 3:32 am, Lawrence D'Oliveiro  wrote:
> I wonder how many people have been tripped up by the fact that
>
>     ++n
>
> and
>
>     --n
>
> fail silently for numeric-valued n.

Recent python-ideas discussion on this subject:

http://mail.python.org/pipermail/python-ideas/2009-March/003741.html

Mark
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Code that ought to run fast, but can't due to Python limitations.

2009-07-06 Thread Hendrik van Rooyen
"Jean-Michel Pichavant"  wrote:

> Woot ! I'll keep this one in my mind, while I may not be that concerned 
> by speed unlike the OP, I still find this way of doing very simple and 
> so intuitive (one will successfully argue how I was not figuring this 
> out by myself if it was so intuitive).
> Anyway I wanted to participated to this thread, as soon as I saw 'due to 
> python limitations' in the title, I foretold a hell of a thread ! This 
> is just provocation ! :-)

The OP was not being provocative - he has a real problem, and the
code he is complaining about already does more or less what my
snippet showed, as I rushed in where angels fear to tread...

The bit that was not clearly shown in what I proposed, is that you
should stay in the individual states, testing for the reasons for the
state transitions, until it is time to change - so there is a while loop
in each of the individual states too.  It becomes a terribly big structure
if you have a lot of states, it duplicates a lot of tests across the different
states, and it is very awkward if the states nest.

Have a look at the one without the dict too - it is even faster as it
avoids the dict lookup.

That, however, is a bit like assembler code, as it kind of "jumps" 
from state to state, and there is no central thing to show what does,
and what does not, belong together, as there is no dict.  Not an easy
beast to fix if it's big and it's wrong.

- Hendrik



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why is my code faster with append() in a loop than with a large list?

2009-07-06 Thread MRAB

Dave Angel wrote:
[snip]
It would probably save some time to not bother storing the zeroes in the 
list at all.  And it should help if you were to step through a list of 
primes, rather than trying every possible int.  Or at least constrain 
yourself to odd numbers (after the initial case of 2).



Or stop looking for more factors when you've passed the square root of
num. I don't know what effect there'll be on the time if you recalculate
the square root when num changes (expensive calculation vs smaller
search space).
--
http://mail.python.org/mailman/listinfo/python-list


Help to find a regular expression to parse po file

2009-07-06 Thread gialloporpora

Hi all,
I would like to extract string from a PO file. To do this I have created 
a little python function to parse po file and extract string:


import re
regex=re.compile("msgid (.*)\\nmsgstr (.*)\\n\\n")
m=regex.findall(s)

where s is a po file like this:

msgctxt "write ubiquity commands.description"
msgid "Takes you to the Ubiquity href=\"chrome://ubiquity/content/editor.html\">command editor page."
msgstr "Apre l'editor 
dei comandi di Ubiquity."



#. list ubiquity commands command:
#. use | to separate multiple name values:
msgctxt "list ubiquity commands.names"
msgid "list ubiquity commands"
msgstr "elenco comandi disponibili"

msgctxt "list ubiquity commands.description"
msgid "Opens the 
list\n"

"  of all Ubiquity commands available and what they all do."
msgstr "Apre una href=\"chrome://ubiquity/content/cmdlist.html\">pagina\n"
"  in cui sono elencati tutti i comandi disponibili e per ognuno 
viene spiegato in breve a cosa serve."




#. change ubiquity settings command:
#. use | to separate multiple name values:
msgctxt "change ubiquity settings.names"
msgid "change ubiquity settings|change ubiquity preferences|change 
ubiquity skin"
msgstr "modifica impostazioni di ubiquity|modifica preferenze di 
ubiquity|modifica tema di ubiquity"


msgctxt "change ubiquity settings.description"
msgid "Takes you to the href=\"chrome://ubiquity/content/settings.html\">settings page,\n"

"  where you can change your skin, key combinations, etc."
msgstr "Apre la pagina  href=\"chrome://ubiquity/content/settings.html\">delle impostazioni 
di Ubiquity,\n"
" dalla quale è possibile modificare la combinazione da tastiera 
utilizzata per richiamare Ubiquity, il tema, ecc."




but, obviously, with the code above the last string is not matched. If 
I use re.DOTALL to match newline characters as well, it doesn't work 
because it matches the entire file; I would like the match to stop when 
"msgstr" is found.


regex=re.compile("msgid (.*)\\nmsgstr (.*)\\n\\n\\n",re.DOTALL)

is it possible or not ?




--
http://mail.python.org/mailman/listinfo/python-list


ANN: GMPY 1.10 alpha with support for Python 3

2009-07-06 Thread casevh
An alpha release of GMPY that supports Python 2 and 3 is available.
GMPY is a wrapper for the GMP multiple-precision arithmetic
library. The MPIR multiple-precision arithmetic library is also
supported. GMPY is available for download from
http://code.google.com/p/gmpy/

Support for Python 3 required many changes to the logic used to
convert between different numerical types. The result type of some
combinations has changed. For example, 'mpz' + 'float' now returns
an 'mpf' instead of a 'float'. See the file "changes.txt" for more
information.

In addition to support for Python 3, there are several other
changes and bug fixes:

- Bug fixes in mpz.binary() and mpq.binary().

- Unicode strings are accepted as input on Python 2.
  (Known bug: works for mpz, fails for mpq and mpf)

- The overhead for calling GMPY routines has been reduced.
  If one operand is a small integer, it is not converted to an mpz.

- 'mpf' and 'mpq' now support % and divmod.

Comments on provided binaries

The 32-bit Windows installers were compiled using MPIR 1.2.1 and
will automatically recognize the CPU type and use code optimized for
that CPU.

Please test with your applications and report any issues found!

casevh
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: A Bug By Any Other Name ...

2009-07-06 Thread Rhodri James
On Mon, 06 Jul 2009 10:58:21 +0100, Steven D'Aprano wrote:



On Mon, 06 Jul 2009 02:19:51 -0300, Gabriel Genellina wrote:


On Mon, 06 Jul 2009 00:28:43 -0300, Steven D'Aprano wrote:

On Mon, 06 Jul 2009 14:32:46 +1200, Lawrence D'Oliveiro wrote:


I wonder how many people have been tripped up by the fact that

++n

and

--n

fail silently for numeric-valued n.


What do you mean, "fail silently"? They do exactly what you should
expect:

++5  # positive of a positive number is positive


I'm not sure what "bug" you're seeing. Perhaps it's your expectations
that are buggy, not Python.


Well, those expectations are taken seriously when new features are
introduced into the language - and sometimes the feature is dismissed
just because it would be confusing for some. If a += 1 works, expecting
++a to have the same meaning is very reasonable (for those coming from
languages with a ++ operator, like C or Java) - more when ++a is a
perfectly valid expression. If this issue isn't listed under the various
"Python gotchas" articles, it should...


The fact that it isn't suggests strongly to me that it isn't that common
a surprise even for Java and C programmers. This is the first time I've
seen anyone raise it as an issue.


Indeed, arguably it's a bug for C compilers to fail to find the valid
parsing of "++5" as "+(+5)".  All I can say is that I've never even
accidentally typed that in twenty years of C programming.
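For the record, the behaviour under discussion is easy to check: the unary
plus and minus operators simply stack, so no increment ever happens.

```python
n = 5
assert ++n == 5     # parsed as +(+n): identity, not an increment
assert --n == 5     # parsed as -(-n): two negations cancel
assert +-+-n == 5   # they stack to any depth
n += 1              # the Python spelling of C's n++ / ++n
assert n == 6
```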

--
Rhodri James *-* Wildebeest Herder to the Masses
--
http://mail.python.org/mailman/listinfo/python-list


Re: Help to find a regular expression to parse po file

2009-07-06 Thread Hallvard B Furuseth
gialloporpora writes:
> I would like to extract string from a PO file. To do this I have created
> a little python function to parse po file and extract string:
>
> import re
> regex=re.compile("msgid (.*)\\nmsgstr (.*)\\n\\n")
> m=r.findall(s)

I don't know the syntax of a po file, but this works for the
snippet you posted:

arg_re = r'"[^\\\"]*(?:\\.[^\\\"]*)*"'
arg_re = '%s(?:\s+%s)*' % (arg_re, arg_re)
find_re = re.compile(
r'^msgid\s+(' + arg_re + ')\s*\nmsgstr\s+(' + arg_re + ')\s*\n', re.M)

However, can \ quote a newline? If so, replace \\. with \\[\s\S] or
something.
Can there be other keywords between msgid and msgstr?  If so,
add something like (?:\w+\s+\s*\n)*? between them.
Can msgstr come before msgid? If so, forget using a single regexp.
Anything else to the syntax to look out for?  Single quotes, maybe?

Is it a problem if the regexp isn't quite right and doesn't match all
cases, yet doesn't report an error when that happens?

All in all, it may be a bad idea to sqeeze this into a single regexp.
It gets ugly real fast.  Might be better to parse the file in a more
regular way, maybe using regexps just to extract each (keyword, "value")
pair.
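To illustrate that last suggestion, here is a hedged sketch of a line-based
parser that collects (keyword, value) pairs instead of one big regexp. It is
deliberately simplified: real PO files also have escapes, comments, msgid_plural
and so on, which this ignores, and the sample data is abbreviated from the
original post.

```python
import re

# One regexp per line shape: a keyword line, or a bare continuation string.
ENTRY_RE = re.compile(r'^(msgctxt|msgid|msgstr)\s+"(.*)"$')
CONT_RE = re.compile(r'^"(.*)"$')

def parse_po(text):
    """Collect entries as dicts of {msgctxt/msgid/msgstr: value}."""
    entries, current, key = [], {}, None
    for line in text.splitlines():
        line = line.strip()
        m = ENTRY_RE.match(line)
        if m:
            key = m.group(1)
            if key == "msgctxt" and current:   # a new entry starts here
                entries.append(current)
                current = {}
            current[key] = m.group(2)
        else:
            m2 = CONT_RE.match(line)
            if m2 and key:                     # continuation of previous value
                current[key] += m2.group(1)
    if current:
        entries.append(current)
    return entries

sample = '''msgctxt "a.names"
msgid "list commands"
msgstr "elenco comandi"

msgctxt "b.description"
msgid "Opens the "
"list"
msgstr "Apre una "
"pagina"
'''
print(parse_po(sample))
```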

-- 
Hallvard
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Code that ought to run fast, but can't due to Python limitations.

2009-07-06 Thread J Kenneth King
a...@pythoncraft.com (Aahz) writes:

> In article ,
> Hendrik van Rooyen  wrote:
>>
>>But wait - maybe if he passes an iterator around - the equivalent of
>>for char in input_stream...  Still no good though, unless the next call
>>to the iterator is faster than an ordinary python call.
>
> Calls to iterators created by generators are indeed faster than an
> ordinary Python call, because the stack frame is already mostly set up.

I think Beazely demonstrated this in his talk on using the python 2.5
co-routines to setup an xml parser.  I believe he benchmarked it roughly
and the initial results were rather impressive.

http://www.dabeaz.com/coroutines/
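A rough way to check the claim for yourself; the timings are machine- and
version-dependent (which call style wins can vary across CPython releases),
so the sketch only prints them rather than asserting a winner:

```python
import timeit

def plain(x):
    # ordinary function: a fresh frame is set up on every call
    return x + 1

def gen():
    # generator: the frame is created once and merely resumed
    x = 0
    while True:
        x = yield x + 1

g = gen()
next(g)  # prime the generator so send() can be used

t_call = timeit.timeit("plain(1)", globals=globals(), number=100000)
t_send = timeit.timeit("g.send(1)", globals=globals(), number=100000)
print("plain call:", t_call, "generator send:", t_send)
```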
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Help to find a regular expression to parse po file

2009-07-06 Thread MRAB

gialloporpora wrote:

Hi all,
I would like to extract string from a PO file. To do this I have created 
a little python function to parse po file and extract string:


import re
regex=re.compile("msgid (.*)\\nmsgstr (.*)\\n\\n")
m=regex.findall(s)

where s is a po file like this:

msgctxt "write ubiquity commands.description"
msgid "Takes you to the Ubiquity href=\"chrome://ubiquity/content/editor.html\">command editor page."
msgstr "Apre l'editor 
dei comandi di Ubiquity."



#. list ubiquity commands command:
#. use | to separate multiple name values:
msgctxt "list ubiquity commands.names"
msgid "list ubiquity commands"
msgstr "elenco comandi disponibili"

msgctxt "list ubiquity commands.description"
msgid "Opens the 
list\n"

"  of all Ubiquity commands available and what they all do."
msgstr "Apre una href=\"chrome://ubiquity/content/cmdlist.html\">pagina\n"
"  in cui sono elencati tutti i comandi disponibili e per ognuno 
viene spiegato in breve a cosa serve."




#. change ubiquity settings command:
#. use | to separate multiple name values:
msgctxt "change ubiquity settings.names"
msgid "change ubiquity settings|change ubiquity preferences|change 
ubiquity skin"
msgstr "modifica impostazioni di ubiquity|modifica preferenze di 
ubiquity|modifica tema di ubiquity"


msgctxt "change ubiquity settings.description"
msgid "Takes you to the href=\"chrome://ubiquity/content/settings.html\">settings page,\n"

"  where you can change your skin, key combinations, etc."
msgstr "Apre la pagina  href=\"chrome://ubiquity/content/settings.html\">delle impostazioni 
di Ubiquity,\n"
" dalla quale è possibile modificare la combinazione da tastiera 
utilizzata per richiamare Ubiquity, il tema, ecc."




but, obviously, with the code above the last string is not matched. If 
I use re.DOTALL to match newline characters as well, it doesn't work 
because it matches the entire file; I would like the match to stop when 
"msgstr" is found.


regex=re.compile("msgid (.*)\\nmsgstr (.*)\\n\\n\\n",re.DOTALL)

is it possible or not ?


You could try:

regex = re.compile(r'msgid (.*(?:\n".*")*)\nmsgstr (.*(?:\n".*")*)$')

and then, if necessary, tidy what you get.
--
http://mail.python.org/mailman/listinfo/python-list


Re: How Python Implements "long integer"?

2009-07-06 Thread Pedram
Hello Mr. Dickinson. Glad to see you again :)

On Jul 6, 5:46 pm, Mark Dickinson  wrote:
> On Jul 6, 1:24 pm, Pedram  wrote:
>
> > OK, fine, I read longobject.c at last! :)
> > I found that longobject is a structure like this:
>
> > struct _longobject {
> >     struct _object *_ob_next;
> >     struct _object *_ob_prev;
>
> For current CPython, these two fields are only present in debug
> builds;  for a normal build they won't exist.

I couldn't understand the difference between them. What are debug
builds and normal builds? And do you mean that in a debug build a
PyLongObject is part of a doubly-linked list, but in a normal build it
is just an array (or if not, how is it stored in this mode)?

> >     Py_ssize_t ob_refcnt;
> >     struct _typeobject *ob_type;
>
> You're missing an important field here (see the definition of
> PyObject_VAR_HEAD):
>
>     Py_ssize_t ob_size; /* Number of items in variable part */
>
> For the current implementation of Python longs, the absolute value of
> this field gives the number of digits in the long;  the sign gives the
> sign of the long (0L is represented with zero digits).

Oh, you're right. I missed that. Thanks :)

> >     digit ob_digit[1];
>
> Right.  This is an example of the so-called 'struct hack' in C; it
> looks as though there's just a single digit, but what's intended here
> is that there's an array of digits tacked onto the end of the struct;
> for any given PyLongObject, the size of this array is determined at
> runtime.  (C99 allows you to write this as simply ob_digit[], but not
> all compilers support this yet.)

WOW! I didn't know anything about the 'struct hack'! I read about it
and it's a very neat trick. Thanks for the pointer. :)

> > }
> > And a digit is a 15-item array of C's unsigned short integers.
>
> No: a digit is a single unsigned short, which is used to store 15 bits
> of the Python long.  Python longs are stored in sign-magnitude format,
> in base 2**15.  So each of the base 2**15 'digits' is an integer in
> the range [0, 32767).  The unsigned short type is used to store those
> digits.
>
> Exception: for Python 2.7+ or Python 3.1+, on 64-bit machines, Python
> longs are stored in base 2**30 instead of base 2**15, using a 32-bit
> unsigned integer type in place of unsigned short.
>
> > Is this structure is constant in
> > all environments (Linux, Windows, Mobiles, etc.)?
>
> I think it would be dangerous to rely on this struct staying constant,
> even just for CPython.  It's entirely possible that the representation
> of Python longs could change in Python 2.8 or 3.2.  You should use the
> public, documented C-API whenever possible.
>
> Mark

Thank you a lot Mark :)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why is my code faster with append() in a loop than with a large list?

2009-07-06 Thread Piet van Oostrum
> Dave Angel  (DA) wrote:

>DA> It would probably save some time to not bother storing the zeroes in the
>DA> list at all.  And it should help if you were to step through a list of
>DA> primes, rather than trying every possible int.  Or at least constrain
>DA> yourself to odd numbers (after the initial case of 2).

The first and the last save a constant factor (slightly less than 2):

def factorise(num):
"""Returns a list of prime factor powers. For example:
factorise(6) will return
[2, 2] (the powers are returned one higher than the actual value)
as in, 2^1 * 3^1 = 6."""
powers = []
factor = 2
while num > 1:
power = 0
while num % factor == 0:
power += 1
num /= factor
if power > 0:
powers.append(power+1)
factor += 1
return powers

...
return reduce(mul, powers)

or to skip the odd factors:

def factorise(num):
"""Returns a list of prime factor powers. For example:
factorise(6) will return
[2, 2] (the powers are returned one higher than the actual value)
as in, 2^1 * 3^1 = 6."""
powers = []
factor = 2
while num > 1:
power = 0
while num % factor == 0:
power += 1
num /= factor
if power > 0:
powers.append(power+1)
factor = 3 if factor == 2 else factor + 2
return powers

This can be slightly optimised by taking factor 2 out of the loop.

def factorise(num):
"""Returns a list of prime factor powers. For example:
factorise(6) will return
[2, 2] (the powers are returned one higher than the actual value)
as in, 2^1 * 3^1 = 6."""
powers = []
power = 0
while num % 2 == 0:
power += 1
num /= 2
if power > 0:
powers.append(power+1)
factor = 3
while num > 1:
power = 0
while num % factor == 0:
power += 1
num /= factor
if power > 0:
powers.append(power+1)
factor += 2
return powers

To restrict the search to primes you would have to use a
sieve of Eratosthenes or something similar.
My first attempt (with a sieve from
http://code.activestate.com/recipes/117119/) only gave a speed decrease!!
But this had the sieve recreated for every triangle number. A global
sieve that is reused at each triangle number is better. But the speed
increase relative to the odd factors only is not dramatical.


# Based upon http://code.activestate.com/recipes/117119/

D = {9: 6} # contains composite numbers
Dlist = [2, 3] # list of already generated primes

def sieve():
'''generator that yields all prime numbers'''
global D
global Dlist
for p in Dlist:
yield p
q = Dlist[-1]+2
while True:
if q in D:
p = D[q]
x = q + p
while x in D: x += p
D[x] = p
else:
Dlist.append(q)
yield q
D[q*q] = 2*q
q += 2

def factorise(num):
"""Returns a list of prime factor powers. For example:
factorise(6) will return
[2, 2] (the powers are returned one higher than the actual value)
as in, 2^1 * 3^1 = 6."""
powers = []
power = 0
for factor in sieve():
power = 0
while num % factor == 0:
power += 1
num /= factor
if power > 0:
# if you really want the factors then append((factor, power))
powers.append(power+1)
if num == 1:
break
return powers

-- 
Piet van Oostrum 
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python and webcam capture delay?

2009-07-06 Thread Rhodri James
On Mon, 06 Jul 2009 07:10:38 +0100, jack catcher (nick) wrote:



Tim Roberts wrote:

"jack catcher (nick)"  wrote:
I'm thinking of using Python for capturing and showing live webcam  
stream simultaneously between two computers via local area network.  
Operating system is Windows. I'm going to begin with VideoCapture  
extension, no ideas about other implementation yet. Do you have any  
suggestions on how short delay I should hope to achieve in showing the  
video? This would be part of a psychological experiment, so I would  
need to deliver the video stream with a reasonable delay (say, below  
100ms).
You need to do the math on this.  Remember that a full 640x480 RGB stream
at 30 frames per second runs 28 megabytes per second.  That's more than
twice what a 100 megabit network can pump.

You can probably use Python to oversee this, but you might want to consider
using lower-level code to control the actual hardware.  If you are
targeting Windows, for example, you could write a DirectShow graph to pump
into a renderer that transmits out to a network, then another graph to
receive from the network and display it.

You can manage the network latency by adding delays in the local graph.


Thanks Tim, you're correct about the math. What is your main point about  
DirectShow: that it is generally faster and more reliable than doing the  
job high-level, or that one could use coding/decoding in DirectShow to  
speed up the transmission? I think the latter would be a great idea if  
the latency were tolerable. On the other hand, I'd like to keep things  
simple and do all the programming in Python. I've got no experience with  
DirectShow, but I guess the filters need to be programmed in C++ and  
called from Python?


Another option might be to use resolution 320x...@15fps.


Does the webcam just deliver frames, or are you getting frames out of
a decoder layer?  If it's the latter, you want to distribute the encoded
video, which should be much lower bandwidth.  Exactly how you do that
depends a bit on what format the webcam claims to deliver.
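The bandwidth arithmetic Tim quoted above is easy to verify:

```python
# 640x480 pixels, 3 bytes per pixel (RGB), 30 frames per second
bytes_per_second = 640 * 480 * 3 * 30
print(bytes_per_second)          # 27648000, i.e. ~27.6 MB/s
bits_per_second = bytes_per_second * 8
print(bits_per_second / 1e6)     # ~221 Mbit/s, over twice a 100 Mbit link
```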

--
Rhodri James *-* Wildebeest Herder to the Masses
--
http://mail.python.org/mailman/listinfo/python-list


Re: Code that ought to run fast, but can't due to Python limitations.

2009-07-06 Thread Jean-Michel Pichavant

Hendrik van Rooyen wrote:

"Jean-Michel Pichavant"  wrote:

  
Woot ! I'll keep this one in my mind, while I may not be that concerned 
by speed unlike the OP, I still find this way of doing very simple and 
so intuitive (one will successfully argue how I was not figuring this 
out by myself if it was so intuitive).
Anyway I wanted to participated to this thread, as soon as I saw 'due to 
python limitations' in the title, I foretold a hell of a thread ! This 
is just provocation ! :-)



The OP was not being provocative - he has a real problem, 


I was just kidding; asserting python limitations on this list 
guarantees that the thread will last for several days, whether or not 
the assertion is right.


JM
--
http://mail.python.org/mailman/listinfo/python-list


Re: Clarity vs. code reuse/generality

2009-07-06 Thread Tim Rowe
2009/7/4 kj :

> Precisely.  As I've stated elsewhere, this is an internal helper
> function, to be called only a few times under very well-specified
> conditions.  The assert statements checks that these conditions
> are as intended.  I.e. they are checks against the module writer's
> programming errors.

Good for you. I'm convinced that you have used the assertion
appropriately, and the fact that so many here are unable to see that
looks to me like a good case for teaching the right use of assertions.
For what it's worth, I read assertions at the beginning of a procedure
as part of the specification of the procedure, and I use them there in
order to document the procedure. An assertion in that position is for
me a statement to the user of the procedure "it's your responsibility
to make sure that you never call this procedure in such a way as to
violate these conditions". They're part of a contract, as somebody
(maybe you) pointed out.

As somebody who works in the safety-critical domain, it's refreshing
to see somebody teaching students to think about the circumstances in
which a procedure can legitimately be called. The hostility you've
received to that idea is saddening, and indicative of why there's so
much buggy software out there.
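A tiny illustration of that contract style (the function is hypothetical,
loosely modelled on the monotonic-function helper discussed earlier in this
thread): the assert documents the caller's obligation, it is not input
validation, and it can be compiled away with -O.

```python
def bisect_root(func, lo, hi):
    """Find a root of func in [lo, hi].

    Contract: the caller must bracket a root, i.e. func(lo) and
    func(hi) must have opposite signs.  The assert states this
    precondition; violating it is a bug in the *caller*.
    """
    assert func(lo) * func(hi) < 0, "func(lo) and func(hi) must differ in sign"
    for _ in range(60):               # 60 halvings: ample for double precision
        mid = (lo + hi) / 2.0
        if func(lo) * func(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

print(bisect_root(lambda x: x * x - 2, 0, 2))  # ~1.41421
```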
-- 
Tim Rowe
-- 
http://mail.python.org/mailman/listinfo/python-list


try -> except -> else -> except?

2009-07-06 Thread David House
Hi all,

I'm looking for some structure advice. I'm writing something that
currently looks like the following:

try:
    <code that may raise a KeyError>
except KeyError:
    <KeyError handler>
else:
    <code to run when no KeyError was raised>

This is working fine. However, I now want to add a call to a function
in the `else' part that may raise an exception, say a ValueError. So I
was hoping to do something like the following:

try:
    <code that may raise a KeyError>
except KeyError:
    <KeyError handler>
else:
    <code that may raise a ValueError>
except ValueError:
    <ValueError handler>

However, this isn't allowed in Python.

An obvious way round this is to move the `else' clause into the `try', i.e.,

try:
    <code that may raise a KeyError>
    <code that may raise a ValueError>
except KeyError:
    <KeyError handler>
except ValueError:
    <ValueError handler>

However, I am loath to do this, for two reasons:

(i) if I modify the <code in the else clause> block at some point in
the future so that it may raise a KeyError, I have to somehow tell
this exception from the one that may be generated from the <small
amount of code that may raise a KeyError> line.
(ii) it moves the error handler for the <code that may
raise a KeyError> bit miles away from the line that might generate the
error, making it unclear which code the KeyError error handler is an
error handler for.

What would be the best way to structure this?

-- 
-David
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why is my code faster with append() in a loop than with a large list?

2009-07-06 Thread Piet van Oostrum
Sorry, there was an error in the sieve in my last example. Here is a
corrected version:

D = {9: 6} # contains composite numbers
Dlist = [2, 3] # list of already generated primes
def sieve():
'''generator that yields all prime numbers'''
global D
global Dlist
for q in Dlist:
yield q
while True:
q += 2
p = D.pop(q, 0)
if p:
x = q + p
while x in D: x += p
D[x] = p
else:
Dlist.append(q)
D[q*q] = 2*q
yield q
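For anyone who wants to verify the correction, the generator can be exercised directly; it is repeated below so the snippet is self-contained:

```python
# Piet's corrected sieve, repeated so this snippet runs on its own.
from itertools import islice

D = {9: 6}      # maps known composites to a prime increment
Dlist = [2, 3]  # list of already generated primes

def sieve():
    '''generator that yields all prime numbers'''
    global D
    global Dlist
    for q in Dlist:
        yield q
    while True:
        q += 2
        p = D.pop(q, 0)
        if p:
            x = q + p
            while x in D:
                x += p
            D[x] = p
        else:
            Dlist.append(q)
            D[q * q] = 2 * q
            yield q

print(list(islice(sieve(), 10)))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```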

-- 
Piet van Oostrum 
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: try -> except -> else -> except?

2009-07-06 Thread Piet van Oostrum
> David House  (DH) wrote:

>DH> Hi all,
>DH> I'm looking for some structure advice. I'm writing something that
>DH> currently looks like the following:

>DH> try:
>DH> 
>DH> except KeyError:
>DH> 
>DH> else:
>DH> 

>DH> This is working fine. However, I now want to add a call to a function
>DH> in the `else' part that may raise an exception, say a ValueError. So I
>DH> was hoping to do something like the following:

>DH> try:
>DH> 
>DH> except KeyError:
>DH> 
>DH> else:
>DH> 
>DH> except ValueError:
>DH> 

>DH> However, this isn't allowed in Python.

>DH> An obvious way round this is to move the `else' clause into the `try', 
>i.e.,

>DH> try:
>DH> 
>DH> 
>DH> except KeyError:
>DH> 
>DH> except ValueError:
>DH> 

>DH> However, I am loath to do this, for two reasons:

>DH> (i) if I modify the  block at some point in
>DH> the future so that it may raise a KeyError, I have to somehow tell
>DH> this exception from the one that may be generated from the <small
>DH> amount of code that may raise a KeyError> line.
>DH> (ii) it moves the error handler for the <code that may
>DH> raise a KeyError> bit miles away from the line that might generate the
>DH> error, making it unclear which code the KeyError error handler is an
>DH> error handler for.

>DH> What would be the best way to structure this?

try:

except KeyError:

else:
try:

except ValueError:

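A concrete (hypothetical) filling-in of that skeleton, with a dict lookup and int() standing in for the code elided from the original post:

```python
def lookup_and_convert(table, key):
    """Nested-try version: each handler sits next to the code it guards."""
    try:
        raw = table[key]        # only this line is guarded for KeyError
    except KeyError:
        return None
    else:
        try:
            return int(raw)     # only this line is guarded for ValueError
        except ValueError:
            return None

table = {"a": "42", "b": "not a number"}
assert lookup_and_convert(table, "a") == 42
assert lookup_and_convert(table, "b") is None   # ValueError path
assert lookup_and_convert(table, "c") is None   # KeyError path
```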

-- 
Piet van Oostrum 
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to map size_t using ctypes?

2009-07-06 Thread Diez B. Roggisch
Philip Semanchuk wrote:

> Hi all,
> I can't figure out how to map a C variable of size_t via Python's
> ctypes module. Let's say I have a C function like this:
> 
> void populate_big_array(double *the_array, size_t element_count) {...}
> 
> How would I pass parameter 2? A long (or ulong) will (probably) work
> (on most platforms), but I like my code to be more robust than that.
> Furthermore, this is scientific code and it's entirely possible that
> someone will want to pass a huge array with more elements than can be
> described by a 32-bit long.

from ctypes import c_size_t

doesn't work for you? On my system, it's aliased to c_ulong, but I guess
that's platform-dependent, of course.
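A quick way to convince yourself, using strlen() from the C runtime (any function with size_t in its prototype would do):

```python
import ctypes
import ctypes.util

# Load the C runtime; on POSIX, CDLL(None) would also expose it via
# the running process if find_library() came up empty.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t   # strlen returns size_t

assert libc.strlen(b"hello") == 5

# On mainstream platforms size_t is pointer-sized, whatever c_size_t
# happens to be aliased to on a given system:
assert ctypes.sizeof(ctypes.c_size_t) == ctypes.sizeof(ctypes.c_void_p)
```

Philip's function would be declared the same way, e.g. `lib.populate_big_array.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.c_size_t]`.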

Diez
-- 
http://mail.python.org/mailman/listinfo/python-list


Python-URL! - weekly Python news and links (Jul 6)

2009-07-06 Thread Gabriel Genellina
QOTW:  "Simulating a shell with hooks on its I/O should be so complicated that
a 'script kiddie' has trouble writing a Trojan." - Scott David Daniels
http://groups.google.com/group/comp.lang.python/msg/1c0f70d5fc69b5aa


Python 3.1 final was released last week - congratulations!

http://groups.google.com/group/comp.lang.python/browse_thread/thread/b37a041ce01d4168/

Clarity of code vs. reusability in an introductory course:

http://groups.google.com/group/comp.lang.python/browse_thread/thread/b1a6ca84d3bfc483/

A piece of code analyzed, looking for ways to improve speed:

http://groups.google.com/group/comp.lang.python/browse_thread/thread/a420417a1e215cf0/

Beginner question: How to define an indeterminate number of variables?
How to create objects, and later reference them, when one doesn't know
how many of them will be needed?

http://groups.google.com/group/comp.lang.python/browse_thread/thread/c40e3c0c843ce6fb/

Simple things made simple: correct use of Unicode helps a lot in this case:

http://groups.google.com/group/comp.lang.python/browse_thread/thread/9cbb01aacf23fa8f/

super() and multiple inheritance: __init__ is hard to get right. Also,
Carl Banks explains how mixin classes are best used:

http://groups.google.com/group/comp.lang.python/browse_thread/thread/c934715ac83dfbd/

iter(function, sentinel) is a handy way to iterate in some cases, but
beware of the fine print:

http://groups.google.com/group/comp.lang.python/browse_thread/thread/b977c3b39abd98b0

The testing code should not merely repeat the code being tested (this
fact isn't obvious to everyone):

http://groups.google.com/group/comp.lang.python/browse_thread/thread/8978c54b83f80bd6/

Organize code so it can be run both in the source tree and after being
installed:

http://groups.google.com/group/comp.lang.python/browse_thread/thread/62284434d2ca5e1f/

Mark Dickinson explains the C implementation of long integers:

http://groups.google.com/group/comp.lang.python/browse_thread/thread/aa1d06d371658135/



Everything Python-related you want is probably one or two clicks away in
these pages:

Python.org's Python Language Website is the traditional
center of Pythonia
http://www.python.org
Notice especially the master FAQ
http://www.python.org/doc/FAQ.html

PythonWare complements the digest you're reading with the
marvelous daily python url
 http://www.pythonware.com/daily

Just beginning with Python?  This page is a great place to start:
http://wiki.python.org/moin/BeginnersGuide/Programmers

The Python Papers aims to publish "the efforts of Python enthusiasts":
http://pythonpapers.org/
The Python Magazine is a technical monthly devoted to Python:
http://pythonmagazine.com

Readers have recommended the "Planet" sites:
http://planetpython.org
http://planet.python.org

comp.lang.python.announce announces new Python software.  Be
sure to scan this newsgroup weekly.
http://groups.google.com/group/comp.lang.python.announce/topics

Python411 indexes "podcasts ... to help people learn Python ..."
Updates appear more-than-weekly:
http://www.awaretek.com/python/index.html

The Python Package Index catalogues packages.
http://www.python.org/pypi/

Much of Python's real work takes place on Special-Interest Group
mailing lists
http://www.python.org/sigs/

Python Success Stories--from air-traffic control to on-line
match-making--can inspire you or decision-makers to whom you're
subject with a vision of what the language makes practical.
http://www.pythonology.com/success

The Python Software Foundation (PSF) has replaced the Python
Consortium as an independent nexus of activity.  It has official
responsibility for Python's development and maintenance.
http://www.python.org/psf/
Among the ways you can support PSF is with a donation.
http://www.python.org/psf/donations/

The Summary of Python Tracker Issues is an automatically generated
report summarizing new bugs, closed ones, and patch submissions. 

http://search.gmane.org/?author=status%40bugs.python.org&group=gmane.comp.python.devel&sort=date

Although unmaintained since 2002, the Cetus collection of Python
hyperlinks retains a few gems.
http://www.cetus-links.org/oo_python.html

Python FAQTS
http://python.faqts.com/

The Cookbook is a collaborative effort to capture useful and
interesting recipes.
http://code.activestate.com/recipes/langs/python/

Many Python conferences around the world are in preparation.
Watch this space for links to them.

Among seve

Re: try -> except -> else -> except?

2009-07-06 Thread David House
2009/7/6 Python :
> as far as I know try has no 'else'

It does:
http://docs.python.org/reference/compound_stmts.html#the-try-statement

> it's 'finally'

There is a `finally', too, but they are semantically different. See
the above link.

-- 
-David
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: try -> except -> else -> except?

2009-07-06 Thread Python


On 6 jul 2009, at 18:14, David House wrote:


2009/7/6 Python :

as far as I know try has no 'else'


It does:
http://docs.python.org/reference/compound_stmts.html#the-try-statement


it's 'finally'


There is a `finally', too, but they are semantically different. See
the above link.

--
-David



ah yeah you're right, sorry

shouldn't the else statement come after the except ones maybe?
--
http://mail.python.org/mailman/listinfo/python-list


Re: How to map size_t using ctypes?

2009-07-06 Thread Philip Semanchuk


On Jul 6, 2009, at 12:10 PM, Diez B. Roggisch wrote:


Philip Semanchuk wrote:


Hi all,
I can't figure out how to map a C variable of size_t via Python's
ctypes module. Let's say I have a C function like this:

void populate_big_array(double *the_array, size_t element_count)  
{...}


How would I pass parameter 2? A long (or ulong) will (probably) work
(on most platforms), but I like my code to be more robust than that.
Furthermore, this is scientific code and it's entirely possible that
someone will want to pass a huge array with more elements than can be
described by a 32-bit long.


from ctypes import c_size_t

doesn't work for you? On my system, it's aliased to c_ulong, but I  
guess

that's platform-dependent, of course.


D'oh! [slaps forehead]

That will teach me to RTFM. In my 2.5 doc, it's not listed in the  
"Fundamental data types" section in the tutorial, but it is mentioned  
in "Fundamental data types" in the ctypes reference. You'd be  
surprised at the amount of Googling I did without learning this on my  
own.


Thanks
Philip
--
http://mail.python.org/mailman/listinfo/python-list


Re: generation of keyboard events

2009-07-06 Thread Tim Harig
On 2009-07-06, RAM  wrote:
> I am trying to do this on windows. My program(executable) has been
> written in VC++ and when I run this program, I need to click on one
> button on the program GUI i,e just I am entering "Enter key" on the
> key board. But this needs manual process. So i need to write a python
> script which invokes my program and pass "Enter key" event to my
> program so that it runs without manual intervention.

This can be done using the WScript.WshShell.SendKeys() method when running
Python with Windows Scripting Host by using the .pyw extension:

http://msdn.microsoft.com/en-us/library/8c6yea83(VS.85).aspx
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Help to find a regular expression to parse po file

2009-07-06 Thread gialloporpora

In reply to the message from Hallvard B Furuseth:




I don't know the syntax of a po file, but this works for the
snippet you posted:

arg_re = r'"[^\\\"]*(?:\\.[^\\\"]*)*"'
arg_re = r'%s(?:\s+%s)*' % (arg_re, arg_re)
find_re = re.compile(
 r'^msgid\s+(' + arg_re + ')\s*\nmsgstr\s+(' + arg_re + ')\s*\n', re.M)

However, can \ quote a newline? If so, replace \\. with \\[\s\S] or
something.
Can there be other keywords between msgid and msgstr?  If so,
add something like (?:\w+\s+\s*\n)*? between them.
Can msgstr come before msgid? If so, forget using a single regexp.
Anything else to the syntax to look out for?  Single quotes, maybe?

Is it a problem if the regexp isn't quite right and doesn't match all
cases, yet doesn't report an error when that happens?

All in all, it may be a bad idea to sqeeze this into a single regexp.
It gets ugly real fast.  Might be better to parse the file in a more
regular way, maybe using regexps just to extract each (keyword, "value")
pair.

Thank you very much, Hallvard, it seems to work; there is a strange 
match in the file header, but I can skip the first match.



The po files have this structure:
http://bit.ly/18qbVc

msgid "string to translate"
"   second string to match"
"   n string to match"
msgstr "translated sting"
"   second translated string"
"  n translated string"
One or more new line before the next group.

In the past I created a Python script to parse PO files where msgid 
and msgstr are on two sequential lines, for example:


msgid "string to translate"
msgstr "translated string"

now the problem is how to also match the (optional) continuation 
strings between msgid and msgstr.
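Hallvard's pattern can be checked against a small sample of that shape (the sample text below is made up):

```python
import re

# One "..." piece, backslash escapes allowed inside it:
chunk = r'"[^\\"]*(?:\\.[^\\"]*)*"'
# One or more adjacent pieces, possibly split across lines:
args = r'%s(?:\s+%s)*' % (chunk, chunk)
find_re = re.compile(
    r'^msgid\s+(' + args + r')\s*\nmsgstr\s+(' + args + r')\s*\n', re.M)

sample = '''msgid "string to translate"
"   second string to match"
msgstr "translated string"
"   second translated string"

msgid "single line"
msgstr "translated"
'''

pairs = find_re.findall(sample)
assert len(pairs) == 2
assert pairs[1] == ('"single line"', '"translated"')
# The multi-line group comes back with its continuation lines attached:
assert pairs[0][1].endswith('"   second translated string"')
```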


Sandro





--
http://mail.python.org/mailman/listinfo/python-list


Re: Why is my code faster with append() in a loop than with a large list?

2009-07-06 Thread Scott David Daniels

Piet van Oostrum wrote:

Dave Angel  (DA) wrote:



DA> It would probably save some time to not bother storing the zeroes in the
DA> list at all.  And it should help if you were to step through a list of
DA> primes, rather than trying every possible int.  Or at least constrain
DA> yourself to odd numbers (after the initial case of 2).


...
# Based upon http://code.activestate.com/recipes/117119/

D = {9: 6} # contains composite numbers

XXX Dlist = [2, 3] # list of already generated primes
  Elist = [(2, 4), (3, 9)] # list of primes and their squares



XXX def sieve():
XXX   '''generator that yields all prime numbers'''
XXX   global D
XXX   global Dlist
 def sieve2():
 '''generator that yields all primes and their squares'''
 # No need for global declarations, we alter, not replace
XXX   for p in Dlist:
XXX   yield p
XXX   q = Dlist[-1]+2

  for pair in Elist:
  yield pair
  q = pair[0] + 2


while True:
if q in D:
p = D[q]
x = q + p
while x in D: x += p
D[x] = p
else:

XXX   Dlist.append(q)
XXX   yield q
XXX   D[q*q] = 2*q
  square = q * q
  pair = q, square
  Elist.append(pair)
  yield pair
  D[square] = 2 * q

q += 2

def factorise(num):
"""Returns a list of prime factor powers. For example:
factorise(6) will return
[2, 2] (the powers are returned one higher than the actual value)
as in, 2^1 * 3^1 = 6."""
powers = []
power = 0

XXX   for factor in sieve():
  for factor, limit in sieve2():

power = 0
while num % factor == 0:
power += 1
num /= factor

XXX   if power > 0:
  if power: # good enough here, and faster

# if you really want the factors then append((factor, power))
powers.append(power+1)

XXX   if num == 1:
XXX   break
XXX   return powers
  if num < limit:
  if num > 1:
  # if you really want the factors then append((num, 1))
  powers.append(2)
  return powers

OK, that's a straightforward speedup, _but_:
 factorize(6) == [2, 2] == factorize(10) ==  factorize(15)
So I am not sure exactly what you are calculating.


--Scott David Daniels
scott.dani...@acm.org
--
http://mail.python.org/mailman/listinfo/python-list


Re: Re: Why is my code faster with append() in a loop than with a large list?

2009-07-06 Thread Dave Angel

MRAB wrote:
Dave 
Angel wrote:

[snip]
It would probably save some time to not bother storing the zeroes in 
the list at all.  And it should help if you were to step through a 
list of primes, rather than trying every possible int.  Or at least 
constrain yourself to odd numbers (after the initial case of 2).



Or stop looking for more factors when you've passed the square root of
num. I don't know what effect there'll be on the time if you recalculate
the square root when num changes (expensive calculation vs smaller
search space).



But if I remember the code, it stopped when the quotient is one, which 
is usually sooner than the square root.  And no need to precalculate the 
square root.


--
http://mail.python.org/mailman/listinfo/python-list


Re: Re: A Bug By Any Other Name ...

2009-07-06 Thread Dave Angel

Rhodri James wrote:
On Mon, 
06 Jul 2009 10:58:21 +0100, Steven D'Aprano 
 wrote:



On Mon, 06 Jul 2009 02:19:51 -0300, Gabriel Genellina wrote:


En Mon, 06 Jul 2009 00:28:43 -0300, Steven D'Aprano
 escribió:

On Mon, 06 Jul 2009 14:32:46 +1200, Lawrence D'Oliveiro wrote:


I wonder how many people have been tripped up by the fact that

++n

and

--n

fail silently for numeric-valued n.


What do you mean, "fail silently"? They do exactly what you should
expect:

++5  # positive of a positive number is positive


I'm not sure what "bug" you're seeing. Perhaps it's your expectations
that are buggy, not Python.


Well, those expectations are taken seriously when new features are
introduced into the language - and sometimes the feature is dismissed
just because it would be confusing for some. If a += 1 works, expecting
++a to have the same meaning is very reasonable (for those coming from
languages with a ++ operator, like C or Java) - more when ++a is a
perfectly valid expression. If this issue isn't listed under the 
various

"Python gotchas" articles, it should...


The fact that it isn't suggests strongly to me that it isn't that common
a surprise even for Java and C programmers. This is the first time I've
seen anyone raise it as an issue.


Indeed, arguably it's a bug for C compilers to fail to find the valid
parsing of "++5" as "+(+5)".  All I can say is that I've never even
accidentally typed that in twenty years of C programming.

But the C language specifically defines the tokenizer as doing a 
max-match, finding the longest legal token at any point.  That's how 
many things that would otherwise be ambiguous are well-defined.  For 
example, if you want to divide two integers, given pointers to them, you 
need a space between the slash and the star.

*p1/*p2   begins a comment,  while   *p1/ *p2   does a division
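In Python terms the max-match question never arises, because ++n is simply two unary operators; a quick sanity check:

```python
import ast

n = 5
assert ++n == 5      # +(+5): unary plus applied twice
assert --n == 5      # -(-5): unary minus applied twice
assert +-n == -5

# The parser really does nest two unary operators:
tree = ast.parse("++n", mode="eval")
assert isinstance(tree.body, ast.UnaryOp)
assert isinstance(tree.body.op, ast.UAdd)
assert isinstance(tree.body.operand, ast.UnaryOp)
```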


--
http://mail.python.org/mailman/listinfo/python-list


Re: try -> except -> else -> except?

2009-07-06 Thread Python


On 6 jul 2009, at 17:46, David House wrote:


Hi all,

I'm looking for some structure advice. I'm writing something that
currently looks like the following:

try:
   
except KeyError:
   
else:
   

This is working fine. However, I now want to add a call to a function
in the `else' part that may raise an exception, say a ValueError. So I
was hoping to do something like the following:

try:
   
except KeyError:
   
else:
   
except ValueError:
   

However, this isn't allowed in Python.

An obvious way round this is to move the `else' clause into the  
`try', i.e.,


try:
   
   
except KeyError:
   
except ValueError:
   

However, I am loath to do this, for two reasons:

(i) if I modify the  block at some point in
the future so that it may raise a KeyError, I have to somehow tell
this exception from the one that may be generated from the <small
amount of code that may raise a KeyError> line.
(ii) it moves the error handler for the <code that may raise a
KeyError> bit miles away from the line that might generate the
error, making it unclear which code the KeyError error handler is an
error handler for.

What would be the best way to structure this?

--
-David



as far as I know try has no 'else'
it's 'finally'

try:
a
except:
b
except:
c
finally:
d

gr
Arno

--
http://mail.python.org/mailman/listinfo/python-list


Re: Why is my code faster with append() in a loop than with a large list?

2009-07-06 Thread Dave Angel

Scott David Daniels wrote:
Piet van 
Oostrum wrote:

Dave Angel  (DA) wrote:


DA> It would probably save some time to not bother storing the 
zeroes in the
DA> list at all.  And it should help if you were to step through a 
list of
DA> primes, rather than trying every possible int.  Or at least 
constrain

DA> yourself to odd numbers (after the initial case of 2).


...
# Based upon http://code.activestate.com/recipes/117119/

D = {9: 6} # contains composite numbers

XXX Dlist = [2, 3] # list of already generated primes
  Elist = [(2, 4), (3, 9)] # list of primes and their squares



XXX def sieve():
XXX   '''generator that yields all prime numbers'''
XXX   global D
XXX   global Dlist
 def sieve2():
 '''generator that yields all primes and their squares'''
 # No need for global declarations, we alter, not replace
XXX   for p in Dlist:
XXX   yield p
XXX   q = Dlist[-1]+2

  for pair in Elist:
  yield pair
  q = pair[0] + 2


while True:
if q in D:
p = D[q]
x = q + p
while x in D: x += p
D[x] = p
else:

XXX   Dlist.append(q)
XXX   yield q
XXX   D[q*q] = 2*q
  square = q * q
  pair = q, square
  Elist.append(pair)
  yield pair
  D[square] = 2 * q

q += 2

def factorise(num):
"""Returns a list of prime factor powers. For example:
factorise(6) will return
[2, 2] (the powers are returned one higher than the actual 
value)

as in, 2^1 * 3^1 = 6."""
powers = []
power = 0

XXX   for factor in sieve():
  for factor, limit in sieve2():

power = 0
while num % factor == 0:
power += 1
num /= factor

XXX   if power > 0:
  if power: # good enough here, and faster
# if you really want the factors then append((factor, 
power))

powers.append(power+1)

XXX   if num == 1:
XXX   break
XXX   return powers
  if num < limit:
  if num > 1:
  # if you really want the factors then append((num, 1))
  powers.append(2)
  return powers

OK, that's a straightforward speedup, _but_:
 factorize(6) == [2, 2] == factorize(10) ==  factorize(15)
So I am not sure exactly what you are calculating.


--Scott David Daniels
scott.dani...@acm.org



The OP only needed the number of factors, not the actual factors.  So 
the zeroes in the list are unneeded.  6, 10, and 15 each have 4 factors.



--
http://mail.python.org/mailman/listinfo/python-list


Re: Tree structure consuming lot of memory

2009-07-06 Thread Simon Forman
On Mon, Jul 6, 2009 at 6:12 AM, mayank gupta wrote:
> Thanks for the other possibilites. I would consider option (2) and (3) to
> improve my code.
>
> But out of curiosity, I would still like to know why does an object of a
> Python-class consume "so" much of memory (1.4 kb), and this memory usage has
> nothing to do with its attributes.
>
> Thanks
>
> Regards.
>
> On Mon, Jul 6, 2009 at 12:03 PM, Chris Rebert  wrote:
>>
>> On Mon, Jul 6, 2009 at 2:55 AM, mayank gupta wrote:
>> > Hi,
>> >
>> > I am creating a tree data-structure in python; with nodes of the tree
>> > created by a simple class :
>> >
>> > class Node :
>> >    def __init__(self ,  other attributes):
>> >   # initialise the attributes here!!
>> >
>> > But the problem is I am working with a huge tree (millions of nodes);
>> > and
>> > each node is consuming much more memory than it should. After a little
>> > analysis, I found out that in general it uses about 1.4 kb of memory for
>> > each node!!
>> > I will be grateful if someone could help me optimize the memory usage.
>>
>> (1) Use __slots__ (see
>> http://docs.python.org/reference/datamodel.html#slots)
>> (2) Use some data structure other than a tree
>> (3) Rewrite your Node/Tree implementation in C
>>
>> Cheers,
>> Chris
>> --
>> http://blog.rebertia.com
>

For option 2 you should try using the built-in types: list, tuple or
dict.  You might get better results.

I'm curious too as to why the class/instance code should take so much
memory, could you mention more about the code?
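On suggestion (1): most of the per-instance overhead is the attribute __dict__, which __slots__ eliminates. A small comparison (the Node shape here is hypothetical):

```python
import sys

class PlainNode:
    def __init__(self, value):
        self.value = value
        self.children = []

class SlotNode:
    __slots__ = ("value", "children")   # fixed attribute set, no __dict__
    def __init__(self, value):
        self.value = value
        self.children = []

plain, slim = PlainNode(1), SlotNode(1)

# A plain instance pays for itself *plus* its attribute dict;
# a slotted instance has no __dict__ at all.
assert not hasattr(slim, "__dict__")
assert sys.getsizeof(slim) < sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
```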
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why is my code faster with append() in a loop than with a large list?

2009-07-06 Thread MRAB

Dave Angel wrote:

MRAB wrote:
Dave 
Angel wrote:

[snip]
It would probably save some time to not bother storing the zeroes in 
the list at all.  And it should help if you were to step through a 
list of primes, rather than trying every possible int.  Or at least 
constrain yourself to odd numbers (after the initial case of 2).



Or stop looking for more factors when you've passed the square root of
num. I don't know what effect there'll be on the time if you recalculate
the square root when num changes (expensive calculation vs smaller
search space).



But if I remember the code, it stopped when the quotient is one, which 
is usually sooner than the square root.  And no need to precalculate the 
square root.



If the number is a large prime then the code will try all the numbers up
to that, eg if num == 10**6 + 3 then it'll try 2..10**6 + 3 even though it
could've given up after 1000.
--
http://mail.python.org/mailman/listinfo/python-list


updating, adding new pages to confluence remotely, using python

2009-07-06 Thread pescadero10
Hello,

I am new to python and have been trying to figure out how to remotely
add new pages to my confluence
wiki space. I'm running my python script from a linux rhel4 machine
and using confluence version 2.10. As a test I tried to read from
stdin and write a page but it fails- that is, the script runs without
errors but nothing is added.  Does anyone have an example of how this
is done?  Here is my script:

--- begin script ---
#!/usr/local/bin/python
#
# Reads from standard input, dumps it onto a Confluence page
# You'll need to modify the URL/username/password/spacekey/page title
# below, because I'm too lazy to bother with argv.

import sys
from xmlrpclib import Server

# Read the text of the page from standard input
content = sys.stdin.read()

s = Server("http://confluence.slac.stanford.edu/display/GO/Home")
token = s.confluence1.login("chee", "**")
page = s.confluence1.getPage(token, "SPACEKEY", "TEST Python-2-
Confluence")
page["content"] = content
s.confluence1.storePage(token, page)

newpagedata = {"title":"New Page","content":"new
content","space":"spaceKey"}
newpage = s.confluence1.storePage(token, newpagedata);
--- end script ---

Any help would be greatly appreciated.

thanks, Pescadero10
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python and webcam capture delay?

2009-07-06 Thread jack catcher (nick)

Rhodri James wrote:
On Mon, 06 Jul 2009 07:10:38 +0100, jack catcher (nick) 
 wrote:



Tim Roberts wrote:

"jack catcher (nick)"  wrote:
I'm thinking of using Python for capturing and showing live webcam 
stream simultaneously between two computers via local area network. 
Operating system is Windows. I'm going to begin with VideoCapture 
extension, no ideas about other implementation yet. Do you have any 
suggestions on how short delay I should hope to achieve in showing 
the video? This would be part of a psychological experiment, so I 
would need to deliver the video stream with a reasonable delay (say, 
below 100ms).
 You need to do the math on this.  Remember that a full 640x480 RGB 
stream

at 30 frames per second runs 28 megabytes per second.  That's more than
twice what a 100 megabit network can pump.
 You can probably use Python to oversee this, but you might want to 
consider

using lower-level code to control the actual hardware.  If you are
targeting Windows, for example, you could write a DirectShow graph to 
pump

into a renderer that transmits out to a network, then another graph to
receive from the network and display it.
 You can manage the network latency by adding a delays in the local 
graph.


Thanks Tim, you're correct about the math. What is your main point 
about DirectShow: that it is generally faster and more reliable than 
doing the job high-level, or that one could use coding/decoding in 
DirectShow to speed up the transmission? I think the latter would be a 
great idea if the latency were tolerable. On the other hand, I'd like 
to keep things simple and do all the programming in Python. I've got 
no experience with DirectShow, but I guess the filters need to be 
programmed in C++ and called from Python?


Another option might be to use resolution 320x...@15fps.


Does the webcam just deliver frames, or are you getting frames out of
a decoder layer?  If it's the latter, you want to distribute the encoded
video, which should be much lower bandwidth.  Exactly how you do that
depends a bit on what format the webcam claims to deliver.


Well, getting already encoded video from the webcam sounds almost like a 
free lunch (which it probably is not). At least I wouldn't want to get 
too long a delay because of the encoding.


I haven't got the webcam(s) yet, and I guess I can basically purchase 
any ones I find suitable for getting the job done. Any recommendations?
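Tim's back-of-envelope bandwidth figure from earlier in the thread is easy to verify:

```python
def raw_video_rate(width, height, bytes_per_pixel, fps):
    """Uncompressed video bandwidth in bytes per second."""
    return width * height * bytes_per_pixel * fps

rate = raw_video_rate(640, 480, 3, 30)   # RGB, 30 frames per second
assert rate == 27_648_000                # ~28 MB/s, as Tim says

# A 100 Mbit/s LAN carries at most 12.5 MB/s, so raw 640x480 needs
# more than twice the wire speed:
assert rate > 2 * (100_000_000 // 8)
```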

--
http://mail.python.org/mailman/listinfo/python-list


Re: Help to find a regular expression to parse po file

2009-07-06 Thread gialloporpora

In reply to the message from MRAB:


gialloporpora wrote:

Hi all,
I would like to extract string from a PO file. To do this I have created
a little python function to parse po file and extract string:

import re
regex=re.compile("msgid (.*)\\nmsgstr (.*)\\n\\n")
m = regex.findall(s)

where s is a po file like this:

msgctxt "write ubiquity commands.description"
msgid "Takes you to the Ubiquitycommand editor  page."
msgstr "Apre l'editor
dei comandi  di Ubiquity."


#. list ubiquity commands command:
#. use | to separate multiple name values:
msgctxt "list ubiquity commands.names"
msgid "list ubiquity commands"
msgstr "elenco comandi disponibili"

msgctxt "list ubiquity commands.description"
msgid "Opensthe
list\n"
"  of all Ubiquity commands available and what they all do."
msgstr "Apre unapagina\n"
"  in cui sono elencati tutti i comandi disponibili e per ognuno
viene spiegato in breve a cosa serve."



#. change ubiquity settings command:
#. use | to separate multiple name values:
msgctxt "change ubiquity settings.names"
msgid "change ubiquity settings|change ubiquity preferences|change
ubiquity skin"
msgstr "modifica impostazioni di ubiquity|modifica preferenze di
ubiquity|modifica tema di ubiquity"

msgctxt "change ubiquity settings.description"
msgid "Takes you to thesettings  page,\n"
"  where you can change your skin, key combinations, etc."
msgstr "Apre la paginadelle impostazioni
di Ubiquity,\n"
" dalla quale è possibile modificare la combinazione da tastiera
utilizzata per richiamare Ubiquity, il tema, ecc."



but, obviously, with the code above the last string is not matched. If
I use re.DOTALL to also match newline characters, it doesn't work because it
matches the entire file; I would like to stop the matching when "msgstr"
is found.

regex=re.compile("msgid (.*)\\nmsgstr (.*)\\n\\n\\n",re.DOTALL)

is it possible or not ?


You could try:

regex = re.compile(r'msgid (.*(?:\n".*")*)\nmsgstr (.*(?:\n".*")*)$')

and then, if necessary, tidy what you get.



MRAB, thank you for your help, I have tried the code posted by Hallvard 
because I have seen it before and it works. Now I'll check also your 
suggestions.

Sandro

--
*Pink Floyd – The Great Gig in the Sky* - http://sn.im/kggo7
* FAQ* di /it-alt.comp.software.mozilla/: http://bit.ly/1MZ04d
--
http://mail.python.org/mailman/listinfo/python-list


Re: Re: Why is my code faster with append() in a loop than with a large list?

2009-07-06 Thread Dave Angel

MRAB wrote:
Dave 
Angel wrote:

MRAB wrote:
Dave 
Angel wrote:

[snip]
It would probably save some time to not bother storing the zeroes 
in the list at all.  And it should help if you were to step through 
a list of primes, rather than trying every possible int.  Or at 
least constrain yourself to odd numbers (after the initial case of 2).



Or stop looking for more factors when you've passed the square root of
num. I don't know what effect there'll be on the time if you 
recalculate

the square root when num changes (expensive calculation vs smaller
search space).



But if I remember the code, it stopped when the quotient is one, 
which is usually sooner than the square root.  And no need to 
precalculate the square root.



If the number is a large prime then the code will try all the numbers up
to that, eg if num == 10**6 + 3 then it'll try 2..10**6 + 3 even though it
could've given up after 1000.



That's one reason I suggested (and Piet implemented) a sieve.  You can 
stop dividing when the square of the next prime exceeds your quotient.


--
http://mail.python.org/mailman/listinfo/python-list


Re: Clarity vs. code reuse/generality

2009-07-06 Thread David Niergarth
I remember in college taking an intro programming class (C++) where
the professor started us off writing a program to factor polynomials;
he probably also incorporated binary search into an assignment. But
people don't generally use Python to implement binary search or factor
polynomials so maybe you should start with a problem more germane to
typical novice users (and less algorithm-y). Wouldn't starting them
off with string processing or simple calculations be a practical way
to get comfortable with the language?

--David

On Jul 3, 9:05 am, kj  wrote:
> I'm will be teaching a programming class to novices, and I've run
> into a clear conflict between two of the principles I'd like to
> teach: code clarity vs. code reuse.  I'd love your opinion about
> it.
>
> The context is the concept of a binary search.  In one of their
> homeworks, my students will have two occasions to use a binary
> search.  This seemed like a perfect opportunity to illustrate the
> idea of abstracting commonalities of code into a re-usable function.
> So I thought that I'd code a helper function, called _binary_search,
> that took five parameters: a lower limit, an upper limit, a
> one-parameter function, a target value, and a tolerance (epsilon).
> It returns the value of the parameter for which the value of the
> passed function is within the tolerance of the target value.
>
> This seemed straightforward enough, until I realized that, to be
> useful to my students in their homework, this _binary_search function
> had to handle the case in which the passed function was monotonically
> decreasing in the specified interval...
>
> The implementation is still very simple, but maybe not very clear,
> particularly to programming novices (docstring omitted):
>
> def _binary_search(lo, hi, func, target, epsilon):
>     assert lo < hi
>     assert epsilon > 0
>     sense = cmp(func(hi), func(lo))
>     if sense == 0:
>         return None
>     target_plus = sense * target + epsilon
>     target_minus = sense * target - epsilon
>     while True:
>         param = (lo + hi) * 0.5
>         value = sense * func(param)
>         if value > target_plus:
>             hi = param
>         elif value < target_minus:
>             lo = param
>         else:
>             return param
>
>         if lo == hi:
>             return None
>
> My question is: is the business with sense and cmp too "clever"?
>
> Here's the rub: the code above is more general (hence more reusable)
> by virtue of this trick with the sense parameter, but it is also
> a bit harder to understand.
>
> This not an unusual situation.  I find that the processing of
> abstracting out common logic often results in code that is harder
> to read, at least for the uninitiated...
>
> I'd love to know your opinions on this.
>
> TIA!
>
> kj
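For anyone who wants to run the quoted helper: here is a self-contained version with a small `cmp()` shim (the built-in was removed in Python 3), exercised on both an increasing and a decreasing function. The example functions are mine:

```python
def cmp(a, b):
    """Tiny shim for the classic cmp(): -1, 0, or 1."""
    return (a > b) - (a < b)

def binary_search(lo, hi, func, target, epsilon):
    """Find param in [lo, hi] with |func(param) - target| <= epsilon,
    for func monotonic (in either direction) on the interval."""
    sense = cmp(func(hi), func(lo))   # +1 increasing, -1 decreasing
    if sense == 0:
        return None
    target_plus = sense * target + epsilon
    target_minus = sense * target - epsilon
    while True:
        param = (lo + hi) * 0.5
        value = sense * func(param)   # sense flip makes func "increasing"
        if value > target_plus:
            hi = param
        elif value < target_minus:
            lo = param
        else:
            return param
        if lo == hi:                  # interval collapsed without a hit
            return None

# Increasing: f(x) = x*x on [0, 10], target 2 -> sqrt(2)
root = binary_search(0.0, 10.0, lambda x: x * x, 2.0, 1e-9)
# Decreasing: g(x) = -x on [0, 10], target -3 -> 3
neg = binary_search(0.0, 10.0, lambda x: -x, -3.0, 1e-9)
```

The `sense` multiplication is the whole trick: it maps the decreasing case onto the increasing one, so the loop body never needs to know which it got.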

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: A Bug By Any Other Name ...

2009-07-06 Thread Pablo Torres N.
On Mon, Jul 6, 2009 at 07:12, Hendrik van Rooyen wrote:
> "Terry Reedy"  wrote:
>
>> Gabriel Genellina wrote:
>> >
>> > In this case, a note in the documentation warning about the potential
>> > confusion would be fine.
>>
>> How would that help someone who does not read the doc?
>
> It obviously won't.
>
> All it will do, is that it will enable people on this group,
> who may read the manual, to tell people who complain,
> to RTFM.

Yes, it's their problem if they don't read the docs.

>
>  I agree that it would be a good idea to make it an
> error, or a warning - "this might not do what you
> think it does", or an "are you sure?" exception.
>
>  :-)
>
> - Hendrik

That would be even harder than adding a line to the docs.  Besides,
the problem that Mr. alex23 pointed out: "where do you stop?" would really
get out of hand.

-- 
Pablo Torres N.
-- 
http://mail.python.org/mailman/listinfo/python-list


Ctypes to wrap libusb-1.0

2009-07-06 Thread Scott Sibley
I have been having issues trying to wrap libusb-1.0 with ctypes. Actually,
there's not much of an issue if I keep everything synchronous, but I need
this to be asynchronous and that is where my problem arises.

Please refer to the following link on Stackoverflow for a full overview of
the issue.

http://stackoverflow.com/questions/1060305/usb-sync-vs-async-vs-semi-async-partially-answered-now

As mentioned in the question on Stackoverflow, synchronous transfers work
great. I wrote an asynchronous C version that works fine. usbmon's output
suggests the transfers are making it through. libusb's debug output shows
nothing out of the ordinary.

I've also asked about this on the libusb mailing list. I've hesitated asking
on the Python mailing list up till now.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python and webcam capture delay?

2009-07-06 Thread Nobody
On Mon, 06 Jul 2009 20:41:03 +0300, jack catcher (nick) wrote:

>> Does the webcam just deliver frames, or are you getting frames out of
>> a decoder layer?  If it's the latter, you want to distribute the encoded
>> video, which should be much lower bandwidth.  Exactly how you do that
>> depends a bit on what format the webcam claims to deliver.
> 
> Well, getting already encoded video from the webcam sounds almost like a 
> free lunch (which it probably is not). At least I wouldn't want to get 
> too long a delay because of the encoding.
> 
> I haven't got the webcam(s) yet, and I guess I can basically purchase 
> any ones I find suitable for getting the job done. Any recommendations?

The webcam is bound to do some encoding; most of them use USB "full speed"
(12Mbit/sec), which isn't enough for raw 640x480x24...@30fps data.

Chroma subsampling and JPEG compression will both reduce the bandwidth
without introducing noticeable latency (the compression time will be no
more than one frame).

Temporal compression will reduce the bandwidth further. Using B-frames
(frames which contain the differences from a predicted frame which is
based upon both previous and subsequent frames) will provide more 
compression, but increases the latency significantly. For this reason, the
compression built into video cameras normally only uses P-frames (frames
which contain the differences from a predicted frame which is based only
upon previous frames).

The biggest issue is likely to be finding latency specifications, followed
by finding a camera which meets your latency requirement.

Also, any "read frame, write frame" loops will add an additional frame of
latency, as you can't start writing the first byte of the frame until
after you've read the last byte of the frame. Video APIs which let you get
rows as they're decoded are rare.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: try -> except -> else -> except?

2009-07-06 Thread Bruno Desthuilliers

David House a écrit :

Hi all,

I'm looking for some structure advice. I'm writing something that
currently looks like the following:

try:

except KeyError:

else:


This is working fine. However, I now want to add a call to a function
in the `else' part that may raise an exception, say a ValueError.


If your error handler terminates the function (which is usually the case 
when using the else clause), you can just skip the else statement, ie:


try:

except KeyError:



Then adding one or more try/except is just trivial.



So I
was hoping to do something like the following:

try:

except KeyError:

else:

except ValueError:


However, this isn't allowed in Python.


Nope. But this is legal:


try:

except KeyError:

else:
try:

except ValueError:





An obvious way round this is to move the `else' clause into the `try'


"obvious" but not necessarily the best thing to do.

(snip - cf above for simple answers)
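A filled-in version of the nested-try pattern shown above (the lookup and conversion bodies are hypothetical, since the OP's real code was elided):

```python
def lookup_and_convert(mapping, key):
    try:
        raw = mapping[key]
    except KeyError:
        return None          # key missing: handle and bail out
    else:
        # only reached when no KeyError occurred; this part may
        # itself raise ValueError, so it gets its own try block
        try:
            return int(raw)
        except ValueError:
            return None      # value present but not convertible
```

The inner try keeps the ValueError handler from accidentally catching a KeyError raised by the lookup itself, which is the problem with folding everything into one try block.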


--
http://mail.python.org/mailman/listinfo/python-list


Re: try -> except -> else -> except?

2009-07-06 Thread Bruno Desthuilliers

Python a écrit :
(snip whole OP)



as far as I know try has no 'else'


Then you may want to RTFM.

--
http://mail.python.org/mailman/listinfo/python-list


Re: How Python Implements "long integer"?

2009-07-06 Thread Bruno Desthuilliers

Mark Dickinson a écrit :

On Jul 5, 1:09 pm, Pedram  wrote:

Thanks for reply,
Sorry I can't explain too clear! I'm not English ;)


That's shocking.  Everyone should be English. :-)


Mark, tu sors ! ("Mark, out you go!")
--
http://mail.python.org/mailman/listinfo/python-list


Re: A Bug By Any Other Name ...

2009-07-06 Thread Terry Reedy

Mark Dickinson wrote:

On Jul 6, 3:32 am, Lawrence D'Oliveiro  wrote:

I wonder how many people have been tripped up by the fact that

++n

and

--n

fail silently for numeric-valued n.


Rather few, it seems.


Recent python-ideas discussion on this subject:

http://mail.python.org/pipermail/python-ideas/2009-March/003741.html


To add to what I wrote in that thread: it is C, not Python, that is out 
of step with standard usage in math and most languages. --x = x; 1/(1/x) = 
x; not not a = a; inverse(inverse(f)) = f; etc. And ++x = +x = x 
corresponds to I(I(x)) = I(x) = x, where I is the identity function.


In C as high-level assembler, the inc and dec functions are written as 
-- and ++ operators for some mix of the following reasons. 1) They 
correspond to machine language / assembler functions. 2) They need to be 
efficient since they are used in inner loops. Function calls have 
overhead. Original C did not allow inlining of function calls as best I 
remember. 3) They save keystrokes; 4) 2 versions that return pre and 
post change values are needed. My impression is that some opcode sets 
(and hence assemblers) have only one, and efficient code requires 
allowing direct access to whichever one a particular processor supports. 
Basic routines can usually be written either way.


These reasons do not apply to Python or do not fit Python's style.
Anyone who spends 15 minutes skimming the chapter on expressions could 
notice that 5.5. Unary arithmetic and bitwise operations has only +,-, 
and ~ or that the Summary operator table at the end has no -- or ++.


Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: Tree structure consuming lot of memory

2009-07-06 Thread Antoine Pitrou
mayank gupta  gmail.com> writes:
> 
> After a little analysis, I found out that in general it uses about
> 1.4 kb of memory for each node!!

How did you measure memory use? Python objects are not very compact, but 1.4KB
per object seems a bit too much (I would expect more about 150-200 bytes/object
in 32-bit mode, or 300-400 bytes/object in 64-bit mode).

One of the solutions is to use __slots__ as already suggested. Another, which
will have similar benefits, is to use a namedtuple. Both suppress the instance
dictionary (`instance`.__dict__), which is a major contributor to memory
consumption. Illustration (64-bit mode, by the way):

>>> import sys
>>> from collections import namedtuple

# First a normal class
>>> class Node(object): pass
... 
>>> o = Node()
>>> o.value = 1
>>> o.children = ()
>>> 
>>> sys.getsizeof(o)
64
>>> sys.getsizeof(o.__dict__)
280
# The object seems to take a mere 64 bytes, but the attribute dictionary
# adds a whopping 280 bytes and bumps actual size to 344 bytes!

# Now a namedtuple (a tuple subclass with property accessors for the various
# tuple items)
>>> Node = namedtuple("Node", "value children")
>>> 
>>> o = Node(value=1, children=())
>>> sys.getsizeof(o)
72
>>> sys.getsizeof(o.__dict__)
Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'Node' object has no attribute '__dict__'

# The object doesn't have a __dict__, so 72 bytes is its real total size.
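For comparison, the __slots__ alternative mentioned above can be measured the same way (exact byte counts vary by Python version and build; only the relative sizes matter):

```python
import sys

class PlainNode(object):
    def __init__(self, value, children):
        self.value = value
        self.children = children

class SlottedNode(object):
    __slots__ = ('value', 'children')   # suppresses the per-instance __dict__
    def __init__(self, value, children):
        self.value = value
        self.children = children

plain = PlainNode(1, ())
slotted = SlottedNode(1, ())

# The plain instance pays for its attribute dictionary on top of itself;
# the slotted instance has no __dict__ at all.
plain_total = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slotted_total = sys.getsizeof(slotted)
```

Unlike a namedtuple, a __slots__ class keeps mutable attributes, which matters for a tree node whose children list grows after construction.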


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: updating, adding new pages to confluence remotely, using python

2009-07-06 Thread Terry Reedy

pescadero10 wrote:


I am new to python and have been trying to figure out how to remotely
add new pages to my confluence
wiki space. I'm running my python script from a linux rhel4 machine
and using confluence version 2.10. As a test I tried to read from
stdin and write a page but it fails- that is, the script runs without
errors but nothing is added.  Does anyone have an example of how this
is done?  Here is my script:

--- begin script ---
#!/usr/local/bin/python
#
# Reads from standard input, dumps it onto a Confluence page
# You'll need to modify the URL/username/password/spacekey/page title
# below, because I'm too lazy to bother with argv.

import sys
from xmlrpclib import Server

# Read the text of the page from standard input
content = sys.stdin.read()

s = Server("http://confluence.slac.stanford.edu/display/GO/Home";)
token = s.confluence1.login("chee", "**")
page = s.confluence1.getPage(token, "SPACEKEY", "TEST Python-2-
Confluence")
page["content"] = content
s.confluence1.storePage(token, page)

newpagedata = {"title":"New Page","content":"new
content","space":"spaceKey"}
newpage = s.confluence1.storePage(token, newpagedata);
--- end script ---

Any help would be greatly appreciated.


You neglected to specify Python version.

As near as I can tell from 2.x docs, xmlrpclib has ServerProxy class but 
no Server class. Whoops, just saw "Server is retained as an alias for 
ServerProxy for backwards compatibility. New code should use 
ServerProxy". Good idea -- calling a client 'server' is confusing.


Do you have access to logs on remote machine to see what was received? 
Or to a sysadmin?


tjr

--
http://mail.python.org/mailman/listinfo/python-list


Re: regex question on .findall and \b

2009-07-06 Thread Ethan Furman
Many thanks to all who replied!  And, yes, I will *definitely* use raw 
strings from now on.  :)
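For later readers, the classic \b pitfall this thread presumably covered (the original example isn't quoted here, so this one is illustrative):

```python
import re

text = "cat catalog"

# Without a raw string, '\b' is a backspace character (\x08),
# so the pattern quietly matches nothing:
no_match = re.findall('\bcat\b', text)

# With a raw string, \b reaches the regex engine intact and means
# "word boundary":
matches = re.findall(r'\bcat\b', text)
```

`no_match` comes back empty while `matches` finds the standalone "cat" but not the "cat" inside "catalog".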


~Ethan~
--
http://mail.python.org/mailman/listinfo/python-list


Cleaning up after failing to contructing objects

2009-07-06 Thread brasse
Hello!

I have been thinking about how write exception safe constructors in
Python. By exception safe I mean a constructor that does not leak
resources when an exception is raised within it. The following is an
example of one possible way to do it:

class Foo(object):
    def __init__(self, name, fail=False):
        self.name = name
        if not fail:
            print '%s.__init__(%s)' % (self.__class__.__name__,
                                       self.name)
        else:
            print '%s.__init__(%s), FAIL' % (self.__class__.__name__,
                                             self.name)
            raise Exception()

    def close(self):
        print '%s.close(%s)' % (self.__class__.__name__, self.name)


class Bar(object):
    def __init__(self):
        try:
            self.a = Foo('a')
            self.b = Foo('b', fail=True)
        except:
            self.close()

    def close(self):
        if hasattr(self, 'a'):
            self.a.close()
        if hasattr(self, 'b'):
            self.b.close()

bar = Bar()

As you can see, this is less than straightforward. Is there some kind
of best practice that I'm not aware of?

:.:: mattias
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why re.match()?

2009-07-06 Thread kj
In <4a4e2227$0$7801$426a7...@news.free.fr> Bruno Desthuilliers 
 writes:

>kj a écrit :
>(snip)
>> To have a special-case
>> re.match() method in addition to a general re.search() method is
>> antithetical to language minimalism,

>FWIW, Python has no pretension to minimalism.

Assuming that you mean by this that Python's authors have no such
pretensions:

"There is real value in having a small language."

Guido van Rossum, 2007.07.03

http://mail.python.org/pipermail/python-3000/2007-July/008663.html

So there.

BTW, that's just one example.  I've seen similar sentiments expressed
by Guido over and over and over: any new proposed enhancement to
Python must be good enough in his mind to justify cluttering the
language.  That attitude counts as minimalism in my book.

The best explanation I have found so far for re.match is that it
is an unfortunate bit of legacy, something that would not be there
if the design of Python did not have to be mindful of keeping old
code chugging along...
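For reference, the behavioral difference between the two methods under discussion is only one of anchoring:

```python
import re

s = "hello world"

# re.match anchors the pattern at the beginning of the string...
assert re.match("world", s) is None
assert re.match("hello", s) is not None

# ...while re.search scans the whole string for the first occurrence.
assert re.search("world", s) is not None

# match is equivalent to search with an explicit start anchor:
assert re.search("^world", s) is None
```

Which is why match can be seen as a redundant special case of search, per the argument above.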

kj
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Opening a SQLite database in readonly mode

2009-07-06 Thread Joshua Kugler
Paul Moore wrote:
> The SQLite documentation mentions a flag, SQLITE_OPEN_READONLY, to
> open a database read only. I can't find any equivalent documented in
> the Python standard library documentation for the sqlite3 module (or,
> for that matter, on the pysqlite library's website).
> 
> Is it possible to open a sqlite database in readonly mode, in Python?

Yes, but most likely not with pysqlite.  The python sqlite3 module in the
standard library, and the pysqlite module are both DB-API compliant, which
means they probably do not have a method to open the DB read only (as that
is usually enforced at the user permission level).

If you want to use that flag, take a look at APSW. It is a very thin layer
on top of the SQLite C API, and thus supports everything that the C API
supports.  It is not, however, DB API compliant, but is very close so not
hard to pick up.

BTW, APSW is written by the same author as pysqlite.

j


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tree structure consuming lot of memory

2009-07-06 Thread mayank gupta
I worked out a small code which initializes about 1,000,000 nodes with some
attributes, and saw the memory usage on my linux machine (using 'top'
command). Then just later I averaged out the memory usage per node. I know
this is not the most accurate way but just for estimated value.

The kind of Node class I am working on in my original code is like :

class Node:
    def __init__(self, #attributes ):
        self.coordinates = coordinates
        self.index = index
        self.sibNum = sibNum
        self.branchNum = branchNum

#here 'coordinates' and 'index' are LISTS with length = "dimension", where
"dimension" is a user-input.

The most shocking part of the memory analysis was that the memory
usage was never dependent on the "dimension". Yeah, it varied a bit, but
there wasn't any significant change in the memory usage even when the
"dimension" was doubled.

-- Any clues?

Thank you for all your suggestions till this point.

Regards.




On Tue, Jul 7, 2009 at 1:28 AM, Antoine Pitrou  wrote:

> mayank gupta  gmail.com> writes:
> >
> > After a little analysis, I found out that in general it uses about
> > 1.4 kb of memory for each node!!
>
> How did you measure memory use? Python objects are not very compact, but
> 1.4KB
> per object seems a bit too much (I would expect more about 150-200
> bytes/object
> in 32-bit mode, or 300-400 bytes/object in 64-bit mode).
>
> One of the solutions is to use __slots__ as already suggested. Another,
> which
> will have similar benefits, is to use a namedtuple. Both suppress the
> instance
> dictionnary (`instance`.__dict__), which is a major contributor to memory
> consumption. Illustration (64-bit mode, by the way):
>
> >>> import sys
> >>> from collections import namedtuple
>
> # First a normal class
> >>> class Node(object): pass
> ...
> >>> o = Node()
> >>> o.value = 1
> >>> o.children = ()
> >>>
> >>> sys.getsizeof(o)
> 64
> >>> sys.getsizeof(o.__dict__)
> 280
> # The object seems to take a mere 64 bytes, but the attribute dictionnary
> # adds a whoppy 280 bytes and bumps actual size to 344 bytes!
>
> # Now a namedtuple (a tuple subclass with property accessors for the
> various
> # tuple items)
> >>> Node = namedtuple("Node", "value children")
> >>>
> >>> o = Node(value=1, children=())
> >>> sys.getsizeof(o)
> 72
> >>> sys.getsizeof(o.__dict__)
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> AttributeError: 'Node' object has no attribute '__dict__'
>
> # The object doesn't have a __dict__, so 72 bytes is its real total size.
>
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>



-- 
I luv to walk in rain bcoz no one can see me crying
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Tree structure consuming lot of memory

2009-07-06 Thread Chris Rebert
> On Tue, Jul 7, 2009 at 1:28 AM, Antoine Pitrou  wrote:
>>
>> mayank gupta  gmail.com> writes:
>> >
>> > After a little analysis, I found out that in general it uses about
>> > 1.4 kb of memory for each node!!
>>
>> How did you measure memory use? Python objects are not very compact, but
>> 1.4KB
>> per object seems a bit too much (I would expect more about 150-200
>> bytes/object
>> in 32-bit mode, or 300-400 bytes/object in 64-bit mode).
>>
>> One of the solutions is to use __slots__ as already suggested. Another,
>> which
>> will have similar benefits, is to use a namedtuple. Both suppress the
>> instance
>> dictionnary (`instance`.__dict__), which is a major contributor to memory
>> consumption. Illustration (64-bit mode, by the way):
>>
>> >>> import sys
>> >>> from collections import namedtuple
>>
>> # First a normal class
>> >>> class Node(object): pass
>> ...
>> >>> o = Node()
>> >>> o.value = 1
>> >>> o.children = ()
>> >>>
>> >>> sys.getsizeof(o)
>> 64
>> >>> sys.getsizeof(o.__dict__)
>> 280
>> # The object seems to take a mere 64 bytes, but the attribute dictionnary
>> # adds a whoppy 280 bytes and bumps actual size to 344 bytes!
>>
>> # Now a namedtuple (a tuple subclass with property accessors for the
>> various
>> # tuple items)
>> >>> Node = namedtuple("Node", "value children")
>> >>>
>> >>> o = Node(value=1, children=())
>> >>> sys.getsizeof(o)
>> 72
>> >>> sys.getsizeof(o.__dict__)
>> Traceback (most recent call last):
>>  File "<stdin>", line 1, in <module>
>> AttributeError: 'Node' object has no attribute '__dict__'
>>
>> # The object doesn't have a __dict__, so 72 bytes is its real total size.
On Mon, Jul 6, 2009 at 1:30 PM, mayank gupta wrote:
> I worked out a small code which initializes about 1,000,000 nodes with some
> attributes, and saw the memory usage on my linux machine (using 'top'
> command). Then just later I averaged out the memory usage per node. I know
> this is not the most accurate way but just for estimated value.

You should try the more accurate sys.getsizeof() function:
http://docs.python.org/library/sys.html#sys.getsizeof

Cheers,
Chris

P.S. Please don't top-post in the future.
-- 
http://blog.rebertia.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Cleaning up after failing to contructing objects

2009-07-06 Thread Scott David Daniels

brasse wrote:

I have been thinking about how write exception safe constructors in
Python. By exception safe I mean a constructor that does not leak
resources when an exception is raised within it. 

...
> As you can see this is less than straight forward. Is there some kind
> of best practice that I'm not aware of?

Not so tough.  Something like this tweaked version of your example:

class Foo(object):
    def __init__(self, name, fail=False):
        self.name = name
        if not fail:
            print '%s.__init__(%s)' % (type(self).__name__, name)
        else:
            print '%s.__init__(%s), FAIL' % (type(self).__name__, name)
            raise ValueError('Asked to fail: %r' % fail)

    def close(self):
        print '%s.close(%s)' % (type(self).__name__, self.name)


class Bar(object):
    def __init__(self):
        unwind = []
        try:
            self.a = Foo('a')
            unwind.append(self.a)
            self.b = Foo('b', fail=True)
            unwind.append(self.b)
            ...
        except Exception, why:
            while unwind:
                unwind.pop().close()
            raise

bar = Bar()

--Scott David Daniels
scott.dani...@acm.org
--
http://mail.python.org/mailman/listinfo/python-list


Re: Why re.match()?

2009-07-06 Thread Diez B. Roggisch

kj schrieb:

In <4a4e2227$0$7801$426a7...@news.free.fr> Bruno Desthuilliers 
 writes:


kj a écrit :
(snip)

To have a special-case
re.match() method in addition to a general re.search() method is
antithetical to language minimalism,



FWIW, Python has no pretension to minimalism.


Assuming that you mean by this that Python's authors have no such
pretensions:

"There is real value in having a small language."

Guido van Rossum, 2007.07.03

http://mail.python.org/pipermail/python-3000/2007-July/008663.html

So there.

BTW, that's just one example.  I've seen similar sentiments expressed
by Guido over and over and over: any new proposed enhancement to
Python must be good enough in his mind to justify cluttering the
language.  That attitude counts as minimalism in my book.

The best explanation I have found so far for re.match is that it
is an unfortunate bit of legacy, something that would not be there
if the design of Python did not have to be mindful of keeping old
code chugging along...


language != libraries.

Diez
--
http://mail.python.org/mailman/listinfo/python-list


Catching control-C

2009-07-06 Thread Michael Mossey
What is required in a python program to make sure it catches a control-
c on the command-line? Do some i/o? The OS here is Linux.

Thanks,
Mike
-- 
http://mail.python.org/mailman/listinfo/python-list


getting text from webpage that has embedded flash

2009-07-06 Thread Oisin
HI,

Im trying to parse a bands myspace page and get the total number of
plays for their songs. e.g. http://www.myspace.com/mybloodyvalentine

The problem is that I cannot use urllib2 as the "Total plays" string
does not appear in the page source.

Any idea of ways around this?

Thanks,
O
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Catching control-C

2009-07-06 Thread Chris Rebert
On Mon, Jul 6, 2009 at 2:37 PM, Michael Mossey wrote:
> What is required in a python program to make sure it catches a control-
> c on the command-line? Do some i/o? The OS here is Linux.

try:
#code that reads input
except KeyboardInterrupt:
#Ctrl-C was pressed

Cheers,
Chris
-- 
http://blog.rebertia.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Catching control-C

2009-07-06 Thread Philip Semanchuk


On Jul 6, 2009, at 5:37 PM, Michael Mossey wrote:

What is required in a python program to make sure it catches a  
control-

c on the command-line? Do some i/o? The OS here is Linux.


You can use a try/except to catch a KeyboardInterrupt exception, or  
you can trap it using the signal module:

http://docs.python.org/library/signal.html

You want to trap SIGINT.
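A minimal sketch of trapping SIGINT with the signal module (POSIX only; the self-delivered signal just stands in for an actual Ctrl-C):

```python
import os
import signal

caught = []

def handler(signum, frame):
    # Runs in the main thread the next time the interpreter checks
    # for pending signals; here we just record that it fired.
    caught.append(signum)

# Replace the default handler (which raises KeyboardInterrupt).
signal.signal(signal.SIGINT, handler)

# Simulate pressing Ctrl-C by sending SIGINT to ourselves.
os.kill(os.getpid(), signal.SIGINT)
```

After this runs, `caught` contains `signal.SIGINT`. Note that a handler installed this way suppresses the usual KeyboardInterrupt, so the program keeps running unless the handler itself decides to exit.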


HTH
Philip
--
http://mail.python.org/mailman/listinfo/python-list


Re: Catching control-C

2009-07-06 Thread Michael Mossey
On Jul 6, 2:47 pm, Philip Semanchuk  wrote:
> On Jul 6, 2009, at 5:37 PM, Michael Mossey wrote:
>
> > What is required in a python program to make sure it catches a  
> > control-
> > c on the command-line? Do some i/o? The OS here is Linux.
>
> You can use a try/except to catch a KeyboardInterrupt exception, or  
> you can trap it using the signal 
> module:http://docs.python.org/library/signal.html
>
> You want to trap SIGINT.
>
> HTH
> Philip

Thanks to both of you. However, my question is also about whether I
need to be doing i/o or some similar operation for my program to
notice in any shape or form that Control-C has been pressed. In the
past, I've written Python programs that go about their business
ignoring Ctrl-C. Other programs respond to it immediately by exiting.
I think the difference is that the latter programs are doing i/o. But
I want to understand better what the "secret" is to responding to a
ctrl-C in any shape or form.

For example, does trapping SIGINT always work, regardless of what my
process is doing?

Thanks,
Mike
-- 
http://mail.python.org/mailman/listinfo/python-list


Semi-Newbie needs a little help

2009-07-06 Thread Nile
I am trying to write a simple little program to do some elementary
stock market analysis.  I read lines, send each line to a function and
then the function returns a date which serves as a key to a
dictionary. Each time a date is returned I want to increment the value
associated with that date. The function seems to be working properly.
By means of a print statement I have inserted just before the return
value I can see there are three dates that are returned which is
correct.  The dictionary only seems to capture the last date. My test
data consists of five stocks, each stock with five days. The correct
answer would be a count of 5 for the second day, the third day, and
the last day -- 11/14/2008.

Here is the a code, followed by a portion of the output.  I know
enough to write simple little programs like this with no problems up
until now but I don't know enough to figure out what I am doing
wrong.

Code

for x in range(len(file_list)):
d = open(file_list[x] , "r")
data = d.readlines()
k = above_or_below(data)# This
function seems to work correctly
print "here is the value that was returned " , k
dict[k] = dict.get(k,0) + 1

dict_list = dict.values()
print "here is a list of the dictionary values ", dict_list
print "the length of the dictionary is ", len(dict)

And here is some output
Function will return k which = 11/11/2008   #  These 3 lines are
printed from the function just before the return
Function will return k which = 11/12/2008   #   This sample shows
stocks 4 and 5 but 1,2,3 are the same.
Function will return k which = 11/14/2008
here is the value that was returned 11/14/2008 # printed from
code above  - only the last day seems to be
Function will return k which = 11/11/2008 #
recognized.
Function will return k which = 11/12/2008
Function will return k which = 11/14/2008
here is the value that was returned 11/14/2008
here is a list of the dictionary values [5] # dict has
counted only the last day for 5 stocks
the length of the dictionary is 1
>Exit code: 0



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Catching control-C

2009-07-06 Thread Philip Semanchuk


On Jul 6, 2009, at 6:02 PM, Michael Mossey wrote:


On Jul 6, 2:47 pm, Philip Semanchuk  wrote:

On Jul 6, 2009, at 5:37 PM, Michael Mossey wrote:


What is required in a python program to make sure it catches a
control-
c on the command-line? Do some i/o? The OS here is Linux.


You can use a try/except to catch a KeyboardInterrupt exception, or
you can trap it using the signal 
module:http://docs.python.org/library/signal.html

You want to trap SIGINT.

HTH
Philip


Thanks to both of you. However, my question is also about whether I
need to be doing i/o or some similar operation for my program to
notice in any shape or form that Control-C has been pressed. In the
past, I've written Python programs that go about their business
ignoring Ctrl-C. Other programs respond to it immediately by exiting.
I think the difference is that the latter programs are doing i/o. But
I want to understand better what the "secret" is to responding to a
ctrl-C in any shape or form.

For example, does trapping SIGINT always work, regardless of what my
process is doing?


Hi Mike,
Sorry, I don't know the Python internals well enough to answer your  
question.


Good luck
Philip

--
http://mail.python.org/mailman/listinfo/python-list


Re: Semi-Newbie needs a little help

2009-07-06 Thread Chris Rebert
On Mon, Jul 6, 2009 at 3:02 PM, Nile wrote:
> I am trying to write a simple little program to do some elementary
> stock market analysis.  I read lines, send each line to a function and
> then the function returns a date which serves as a key to a
> dictionary. Each time a date is returned I want to increment the value
> associated with that date. The function seems to be working properly.
> By means of a print statement I have inserted just before the return
> value I can see there are three dates that are returned which is
> correct.  The dictionary only seems to capture the last date. My test
> data consists of five stocks, each stock with five days. The correct
> answer would be a count of 5 for the second day, the third day, and
> the last day -- 11/14/2008.
>
> Here is the a code, followed by a portion of the output.  I know
> enough to write simple little programs like this with no problems up
> until now but I don't know enough to figure out what I am doing
> wrong.

>    for x in range(len(file_list)):

for filename in file_list:
#I'm assuming the lack of indentation on the subsequent lines is a
mere transcription error...

>    d = open(file_list[x] , "r")

d = open(filename , "r")

>    data = d.readlines()
>    k = above_or_below(data)                                # This
> function seems to work correctly
>    print "here is the value that was returned " , k
>    dict[k] = dict.get(k,0) + 1

`dict` is the name of a builtin type. Please rename this variable to
avoid shadowing the type.
Also, where is this variable even initialized? It's not in this code
snippet you gave.
Further, I would recommend using a defaultdict
(http://docs.python.org/dev/library/collections.html#collections.defaultdict)
rather than a regular dictionary; this would make the
count-incrementing part nicer.

Taking these changes into account, your code becomes:

from collections import defaultdict

counts = defaultdict(lambda: 0)

for filename in file_list:
    d = open(filename, "r")
    data = d.readlines()
    k = above_or_below(data)  # This function seems to work correctly
    print "here is the value that was returned ", k
    counts[k] += 1

values = counts.values()
print "here is a list of the dictionary values ", values
print "the length of the dictionary is ", len(counts)


I don't immediately see what's causing your problem, but guess that it
might be related to the initialization of the `dict` variable.

Cheers,
Chris
-- 
http://blog.rebertia.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Semi-Newbie needs a little help

2009-07-06 Thread Pablo Torres N.
On Mon, Jul 6, 2009 at 17:02, Nile wrote:
> Code
>
>    for x in range(len(file_list)):
>    d = open(file_list[x] , "r")
>    data = d.readlines()
>    k = above_or_below(data)                                # This
> function seems to work correctly
>    print "here is the value that was returned " , k
>    dict[k] = dict.get(k,0) + 1
>
>    dict_list = dict.values()
>    print "here is a list of the dictionary values ", dict_list
>    print "the length of the dictionary is ", len(dict)

Correcting your indentation errors and moving your comments above the
line they reference will attract more help from others in this list
;-)

Also, I'd recommend limiting your line length to 80 chars, since lines
are wrapped anyway.


-- 
Pablo Torres N.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Catching control-C

2009-07-06 Thread Ben Charrow
Michael Mossey wrote:
> On Jul 6, 2:47 pm, Philip Semanchuk  wrote:
>> On Jul 6, 2009, at 5:37 PM, Michael Mossey wrote:
>> 
>>> What is required in a python program to make sure it catches a control- 
>>> c on the command-line? Do some i/o? The OS here is Linux.
>> You can use a try/except to catch a KeyboardInterrupt exception, or you 
>> can trap it using the signal 
>> module: http://docs.python.org/library/signal.html
>> 
>> You want to trap SIGINT.
>> 
>> HTH Philip
> 
> Thanks to both of you. However, my question is also about whether I need to 
> be doing i/o or some similar operation for my program to notice in any shape
>  or form that Control-C has been pressed.

You don't need to be doing I/O in order to raise a KeyboardInterrupt.  For
example, the following program should use up a lot of your CPU until you
hit Ctrl-C.

>>> while True:
...     pass
...
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyboardInterrupt
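
For completeness, here is a minimal sketch of catching the interrupt with
try/except instead of letting it propagate (the helper name is my own
invention, not from the original posts):

```python
def run_until_interrupted(step):
    """Call `step` repeatedly until Ctrl-C; return how many calls completed."""
    n = 0
    try:
        while True:
            step()
            n += 1
    except KeyboardInterrupt:
        # Ctrl-C is delivered as a KeyboardInterrupt exception between
        # bytecode instructions, so an ordinary except clause catches it.
        pass
    return n
```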

> In the past, I've written Python programs that go about their business 
> ignoring Ctrl-C.

Can you be more specific?  Can you share the relevant sections of these
programs?  Were these programs multi-threaded?

> But I want to understand better what the "secret" is to responding to a 
> ctrl-C in any shape or form.

Strictly speaking, I don't think you can always respond to a Ctrl-C in any
shape or form.  Quoting from the signal module:

Although Python signal handlers are called asynchronously as far as the Python
user is concerned, they can only occur between the "atomic" instructions of the
Python interpreter. This means that signals arriving during long calculations
implemented purely in C (such as regular expression matches on large bodies of
text) may be delayed for an arbitrary amount of time.
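
For the signal-module route Philip mentioned, a minimal sketch looks like
this (the handler body is illustrative; it just mimics the default behaviour):

```python
import signal

def handler(signum, frame):
    # Turn SIGINT (Ctrl-C) into the usual KeyboardInterrupt exception.
    raise KeyboardInterrupt

# Install the handler; signal.signal returns the previous handler so you
# can restore it later if you want.
previous = signal.signal(signal.SIGINT, handler)
```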

HTH,
Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Semi-Newbie needs a little help

2009-07-06 Thread MRAB

Chris Rebert wrote:
> On Mon, Jul 6, 2009 at 3:02 PM, Nile wrote:
>> I am trying to write a simple little program to do some elementary
>> stock market analysis.  I read lines, send each line to a function and
>> then the function returns a date which serves as a key to a
>> dictionary. Each time a date is returned I want to increment the value
>> associated with that date. The function seems to be working properly.
>> By means of a print statement I have inserted just before the return
>> value I can see there are three dates that are returned which is
>> correct.  The dictionary only seems to capture the last date. My test
>> data consists of five stocks, each stock with five days. The correct
>> answer would be a count of 5 for the second day, the third day, and
>> the last day -- 11/14/2008.
>>
>> Here is the code, followed by a portion of the output.  I know
>> enough to write simple little programs like this with no problems up
>> until now but I don't know enough to figure out what I am doing
>> wrong.
>>
>>    for x in range(len(file_list)):
>
> for filename in file_list:
> # I'm assuming the lack of indentation on the subsequent lines is a
> # mere transcription error...
>
>>    d = open(file_list[x] , "r")
>
> d = open(filename , "r")
>
>>    data = d.readlines()
>>    k = above_or_below(data)                                # This
>> function seems to work correctly
>>    print "here is the value that was returned " , k
>>    dict[k] = dict.get(k,0) + 1
>
> `dict` is the name of a builtin type. Please rename this variable to
> avoid shadowing the type.
> Also, where is this variable even initialized? It's not in this code
> snippet you gave.
> Further, I would recommend using a defaultdict
> (http://docs.python.org/dev/library/collections.html#collections.defaultdict)
> rather than a regular dictionary; this would make the
> count-incrementing part nicer.
>
> Taking these changes into account, your code becomes:
>
> from collections import defaultdict
>
> counts = defaultdict(lambda: 0)

Better is:

counts = defaultdict(int)

> for filename in file_list:
>     d = open(filename , "r")
>     data = d.readlines()
>     k = above_or_below(data) # This function seems to work correctly
>     print "here is the value that was returned " , k
>     counts[k] += 1
>
> values = counts.values()
> print "here is a list of the dictionary values ", values
> print "the length of the dictionary is ", len(counts)
>
> I don't immediately see what's causing your problem, but I would guess
> that it might be related to the initialization of the `dict` variable.

It might be that the indentation was wrong where the count is
incremented, but I can't tell because none of the lines were shown
indented.
--
http://mail.python.org/mailman/listinfo/python-list


Re: Opening a SQLite database in readonly mode

2009-07-06 Thread Paul Moore
2009/7/6 Joshua Kugler :
> Paul Moore wrote:
>> The SQLite documentation mentions a flag, SQLITE_OPEN_READONLY, to
>> open a database read only. I can't find any equivalent documented in
>> the Python standard library documentation for the sqlite3 module (or,
>> for that matter, on the pysqlite library's website).
>>
>> Is it possible to open a sqlite database in readonly mode, in Python?
>
> Yes, but most likely not with pysqlite.  The python sqlite3 module in the
> standard library, and the pysqlite module are both DB-API compliant, which
> means they probably do not have a method to open the DB read only (as that
> is usually enforced at the user permission level).
>
> If you want to use that flag, take a look at APSW. It is a very thin layer
> on top of the SQLite C API, and thus supports everything that the C API
> supports.  It is not, however, DB API compliant, but is very close so not
> hard to pick up.
>
> BTW, APSW is written by the same author as pysqlite.

Excellent, thanks. I'll have to think whether I want the extra
dependency but at least I know it's possible now.
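
As a footnote for later readers: the stdlib sqlite3 module eventually gained
URI support (Python 3.4+), which makes a read-only open possible without the
APSW dependency. A sketch, assuming a database file already exists at `path`:

```python
import sqlite3

def open_readonly(path):
    # mode=ro asks SQLite for SQLITE_OPEN_READONLY under the hood;
    # requires the uri=True flag added to sqlite3.connect in Python 3.4.
    return sqlite3.connect("file:" + path + "?mode=ro", uri=True)
```

Any attempt to write through such a connection raises
sqlite3.OperationalError, while SELECTs work normally.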

Thanks for the pointer.
Paul.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Semi-Newbie needs a little help

2009-07-06 Thread Nile
On Jul 6, 5:30 pm, "Pablo Torres N."  wrote:
> On Mon, Jul 6, 2009 at 17:02, Nile wrote:
> > Code
>
> >    for x in range(len(file_list)):
> >    d = open(file_list[x] , "r")
> >    data = d.readlines()
> >    k = above_or_below(data)                                # This
> > function seems to work correctly
> >    print "here is the value that was returned " , k
> >    dict[k] = dict.get(k,0) + 1
>
> >    dict_list = dict.values()
> >    print "here is a list of the dictionary values ", dict_list
> >    print "the length of the dictionary is ", len(dict)
>
> Correcting your indentation errors and moving your comments above the
> line they reference will attract more help from others in this list
> ;-)
>
> Also, I'd recommend limiting your line length to 80 chars, since lines
> are wrapped anyway.
>
> --
> Pablo Torres N.

Yup - Sorry, first post ever - next ones will be better formatted
-- 
http://mail.python.org/mailman/listinfo/python-list

