Re: Fast recursive generators?

2011-10-29 Thread 88888 Dihedral
I am thinking the byte code compiler in Python could be faster if all
immutable instances known before execution were compiled into immutable
objects ready to be assigned.
-- 
http://mail.python.org/mailman/listinfo/python-list


ANN: psutil 0.4.0 released

2011-10-29 Thread Giampaolo Rodolà
Hi folks,
I'm pleased to announce the 0.4.0 release of psutil:
http://code.google.com/p/psutil

=== About ===

psutil is a module providing an interface for retrieving information
on all running processes and system utilization (CPU, disk, memory,
network) in a portable way by using Python, implementing many
functionalities offered by command line tools such as ps, top, free,
lsof and others.
It works on Linux, Windows, OSX and FreeBSD, both 32-bit and 64-bit,
with Python versions from 2.4 to 3.3 by using a single code base.

=== Major enhancements ===

Aside from fixing several high-priority bugs, this release introduces
two important new features: disk and network IO counters.
With these, you can monitor disk usage and network traffic.

Two scripts were added to show what kind of applications can be
written with these two hooks:
http://code.google.com/p/psutil/source/browse/trunk/examples/iotop.py
http://code.google.com/p/psutil/source/browse/trunk/examples/nettop.py

...and here you can see some screenshots:
http://code.google.com/p/psutil/#Example_applications

=== Other enhancements ===

- Process.get_connections() has a new 'kind' parameter to filter
connections by different criteria.
- timeout=0 parameter can now be passed to Process.wait() to make it
return immediately (non-blocking).
- Python 3.2 installer for Windows 64-bit is now provided in the downloads section.
- (FreeBSD) added support for Process.getcwd()
- (FreeBSD) Process.get_open_files() has been rewritten in C and no
longer relies on lsof.
- Various crashes on module import across different platforms were fixed.

For a complete list of features and bug fixes see:
http://psutil.googlecode.com/svn/trunk/HISTORY


=== New features by example ===

>>> import psutil
>>>
>>> psutil.disk_io_counters()
iostat(read_count=8141, write_count=2431, read_bytes=290203,
   write_bytes=537676, read_time=5868, write_time=94922)
>>>
>>> psutil.disk_io_counters(perdisk=True)
{'sda1': iostat(read_count=8141, write_count=2431, read_bytes=290203,
  write_bytes=537676, read_time=5868, write_time=94922),
 'sda2': iostat(read_count=811241, write_count=31, read_bytes=1245,
  write_bytes=11246, read_time=768008, write_time=922)}
>>>
>>>
>>> psutil.network_io_counters()
iostat(bytes_sent=1270374, bytes_recv=7828365,
   packets_sent=9810, packets_recv=11794)
>>>
>>> psutil.network_io_counters(pernic=True)
{'lo': iostat(bytes_sent=800251705, bytes_recv=800251705,
  packets_sent=455778, packets_recv=455778),
 'eth0': iostat(bytes_sent=813731756, bytes_recv=4183672213,
packets_sent=3771021, packets_recv=4199213)}
>>>
>>>
>>> import os
>>> p = psutil.Process(os.getpid())
>>> p.get_connections(kind='tcp')
[connection(fd=115, family=2, type=1, local_address=('10.0.0.1', 48776),
remote_address=('93.186.135.91', 80), status='ESTABLISHED')]
>>> p.get_connections(kind='udp6')
[]
>>> p.get_connections(kind='inet6')
[]
>>>
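
As a quick illustration of what can be built on the counters shown above,
here is a minimal sketch that samples disk throughput; the one-second
interval and the output format are arbitrary choices, not part of the API:

import time
import psutil

# Take two snapshots of the global disk counters and diff them to get
# an approximate bytes-per-second rate.
before = psutil.disk_io_counters()
time.sleep(1)
after = psutil.disk_io_counters()

print("read: %d B/s, write: %d B/s" % (
    after.read_bytes - before.read_bytes,
    after.write_bytes - before.write_bytes))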

=== Links ===

* Home page: http://code.google.com/p/psutil
* Source tarball: http://psutil.googlecode.com/files/psutil-0.4.0.tar.gz
* API reference: http://code.google.com/p/psutil/wiki/Documentation


As a final note I'd like to thank Jeremy Whitlock, who kindly
contributed the disk/network IO counters code for OS X and Windows.

Please try out this new release and let me know if you experience any
problems by filing issues on the bug tracker.
Thanks in advance.


--- Giampaolo Rodola'

http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Need Windows user / developer to help with Pynguin

2011-10-29 Thread Lee Harr

Thanks to all those who tested and replied!


> For Windows users who want to just run Pynguin (not modify or tinker
> with the source code), it would be best to bundle Pynguin up with
> Py2exe

I considered that, but I agree that licensing issues would make it
problematic.


> the Python installer associates .py and .pyw files with
> python.exe and pythonw.exe respectively, so if you add the extension as
> Terry mentioned, it should work.

I think this is the best way to go.


> pynguin => pynguin.py

Would it be better to go with pynguin.pyw ?


Actually, I was thinking of just copying that script to something
like run_pynguin.pyw

Is anyone able to test whether just making that single change lets a
user simply unpack the .zip file and double-click run_pynguin.pyw to
run the app?



> Win7, with zip built in, just
> treats x.zip as a directory in Explorer

So, is that going to be a problem?

The user just sees it as a folder, goes in, and double-clicks
run_pynguin.pyw, but once Python is running, what does it see? Are the
contents of the virtually "unpacked" folder going to be available to
the script?



> You pynguin.zip contains one top level file -- a directory called
> pynguin that contains multiple files

I feel that this is the correct way to create a .zip file.

I have run into so many poorly formed .zip files (ones that extract
all of their files to the current directory) that when extracting any
.zip I always create a dummy folder and put the .zip in there before
extracting.

Mine is actually called pynguin-0.12.zip and extracts to a folder
called pynguin-0.12.
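
For reference, a small sketch of building an archive with that single
top-level folder from Python; the directory and archive names here are
hypothetical, mirroring the layout described above:

import os
import zipfile

src_dir = 'pynguin-0.12'        # hypothetical source tree
archive = 'pynguin-0.12.zip'    # hypothetical archive name

zf = zipfile.ZipFile(archive, 'w', zipfile.ZIP_DEFLATED)
for root, dirs, files in os.walk(src_dir):
    for name in files:
        path = os.path.join(root, name)
        # src_dir is a relative path, so every entry is stored under the
        # single top-level pynguin-0.12/ folder inside the archive.
        zf.write(path, arcname=path)
zf.close()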


> Extracting pynguin.zip to a
> pynguin directory in the same directory as pynguin.zip, the default
> behavior with Win7 at least, creates a new pynguin directory that
> contains the extracted pynguin directory.

So, Windows now creates the dummy folder automatically?

Is the problem that the .zip has the same name (minus the extension)?

Would it solve the problem to just change the name of the archive
to something like pynguin012.zip ?



> README => README.txt

Just out of curiosity, what happens if you double-click the
README sans .txt? Does it make you choose which app to open with?

  
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Assigning generator expressions to ctype arrays

2011-10-29 Thread Patrick Maupin
On Oct 28, 3:24 pm, Terry Reedy  wrote:
> On 10/28/2011 2:05 PM, Patrick Maupin wrote:
>
> > On Oct 27, 10:23 pm, Terry Reedy  wrote:
> >> I do not think everyone else should suffer substantial increase in space
> >> and run time to avoid surprising you.
>
> > What substantial increase?
>
> of time and space, as I said, for the temporary array that I think would
> be needed and which I also described in the previous paragraph that you
> clipped

That's because I don't think it needs a temporary array.  A temporary
array would provide some invariant guarantees that are nice but not
necessary in a lot of real-world cases.
>
> >  There's already a check that winds up
> > raising an exception.  Just make it empty an iterator instead.
>
> It? I have no idea what you intend that to refer to.

Sorry, code path.

There is already a "code path" that says "hey I can't handle this."
To modify this code path to handle the case of a generic iterable
would add a tiny bit of code, but would not add any appreciable space
(no temp array needed for my proposal) and would not add any runtime
to people who are not passing in iterables or doing other things that
currently raise exceptions.

> I doubt it would be very many because it is *impossible* to make it work
> in the way that I think people would want it to.

How do you know?  I have a use case that I really don't think is all
that rare.  I know exactly how much data I am generating, but I am
generating it piecemeal using iterators.

> >> It could, but at some cost. Remember, people use ctypes for efficiency,
> > yes, you just made my argument for me.  Thank you.  It is incredibly
> > inefficient to have to create a temp array.

No, I don't think I did "make your argument for you."  I am currently
making a temp list because I have to, and am proposing that with a
small change to the ctypes library, that wouldn't always need to be
done.
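
As a hedged illustration of the workaround being discussed (array size
and values are made up, and the exact exception ctypes raises may vary
by version):

import ctypes

buf = (ctypes.c_uint * 10)()          # fixed-length ctypes array
gen = (i * i for i in range(10))      # data produced piecemeal by a generator

# buf[:] = gen                        # rejected today: slice assignment
#                                     # wants a sized sequence, so ctypes
#                                     # raises an exception here
buf[:] = list(gen)                    # workaround: materialize a temp list
print(list(buf))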

> But necessary to work with blank box iterators.

With your own preconceived set of assumptions. (Which I will admit,
you have given quite a bit of thought to, which I appreciate.)

> Now you are agreeing with my argument.

Nope, still not doing that.

> If ctype_array slice assignment were to be augmented to work with
> iterators, that would, in my opinion (and see below),

That's better for not being absolute.  Thank you for admitting other
possibilities.

> require use of
> temporary arrays. Since slice assignment does not use temporary arrays
> now (see below), that augmentation should be conditional on the source
> type being a non-sequence iterator.

I don't think any temporary array is required, but in any case, yes
the code path through the ctypes array library __setslice__ would have
to be modified where it gives up now, in order to decide to do
something different if it is passed an iterable.

> CPython comes with immutable fixed-length arrays (tuples) that do not
> allow slice assignment and mutable variable-length arrays (lists) that
> do. The definition is 'replace the indicated slice with a new slice
> built from all values from an iterable'. Point 1: This works for any
> properly functioning iterable that produces any finite number of items.

Agreed.

> Iterators are always exhausted.

And my proposal would continue to exhaust iterators, or would raise an
exception if the iterator wasn't exhausted.

> Replace can be thought of as delete followed by add, but the
> implementation is not that naive.

Sure, on a mutable length item.

> Point 2: If anything goes wrong and an
> exception is raised, the list is unchanged.

This may be true on lists, and is quite often true (and is nice when
it happens), but it isn't always true in general.  For example, with
the current tuple packing/unpacking protocol across an assignment, the
only real guarantee is that everything is gathered up into a single
object before the assignment is done.  It is not the case that nothing
will be unpacked unless everything can be unpacked.  For example:

>>>
>>> a,b,c,d,e,f,g,h,i = range(100,109)
>>> (a,b,c,d), (e,f), (g,h,i) = (1,2,3,4), (5,6,7), (8,9)
Traceback (most recent call last):
  File "", line 1, in 
ValueError: too many values to unpack
>>> a,b,c,d,e,f,g,h,i
(1, 2, 3, 4, 104, 105, 106, 107, 108)
>>>

> This means that there must
> be temporary internal storage of either old or new references.

As I show with the tuple unpacking example, it is not an inviolate law
that Python won't unpack a little unless it can unpack everything.

> An
> example that uses an improperly functioning generator. (snip)

Yes, I agree that lists are wondrously robust.  But one of the reasons
for this is the "flexible" interpretation of slice start and end
points, that can be as surprising to a beginner as anything I'm
proposing.

> A c_uint array is a new kind of beast: a fixed-length mutable
> array. So it has to have a different definition of slice
> assignment than lists.  Thomas Heller, the ctypes author,
> apparently chose 'replacement by a sequence with exactly

Re: Need Windows user / developer to help with Pynguin

2011-10-29 Thread Andrew Berg
On 10/29/2011 9:43 AM, Lee Harr wrote:
> So, windows now creates the dummy folder automatically?
That is the default choice, but users are given a prompt to choose an
arbitrary directory. Note that this only applies to the ZIP extractor in
Explorer; other archive programs have their own behavior. I agree with
you on having a top-level directory in an archive, but MS figures users
are more likely to be annoyed with files scattered around the current
directory than a nested directory. Unfortunately, many archives out in
the wild have a top-level directory while many others don't, so one can
rarely ever be certain how a given archive is organized without opening it.

> Is the problem that the .zip has the same name (minus the extension)?
Not at all.

> Just out of curiosity, what happens if you double-click the
> README sans .txt? Does it make you choose which app to open with?
Typically, that is the case because files without extensions are not
registered by default.

-- 
CPython 3.2.2 | Windows NT 6.1.7601.17640 | Thunderbird 7.0
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: save tuple of simple data types to disk (low memory foot print)

2011-10-29 Thread Gelonida N
On 10/29/2011 03:00 AM, Steven D'Aprano wrote:
> On Fri, 28 Oct 2011 22:47:42 +0200, Gelonida N wrote:
> 
>> Hi,
>>
>> I would like to save many dicts with a fixed amount of keys tuples to a
>> file  in a memory efficient manner (no random, but only sequential
>> access is required)

> 
> What do you mean "keys tuples"?
Corrected phrase:
I would like to save many dicts with a fixed (and known) number of keys
in a memory-efficient manner (no random, only sequential access is
required) to a file (which can later be sent over a slow, expensive
network to other machines)

Example:
Every dict will have the keys 'timestamp', 'floatvalue', 'intvalue',
'message1', 'message2'
'timestamp' is an integer
'floatvalue' is a float
'intvalue' an int
'message1' is a string with a length of max 2000 characters, but can
often be very short
'message2' the same as message1

so a typical dict will look like
{ 'timestamp' : 12, 'floatvalue': 3.14159, 'intvalue': 42,
 'message1' : '', 'message2' : '=' * 1999 }


>
> What do you call "many"? Fifty? A thousand? A thousand million? How many
> items in each dict? Ten? A million?

File size can be between 100 KB and over 100 MB per file. Files will be
accumulated over months.

I just want to use the smallest possible space, as the data is collected
over a certain time (days / months) and will be transferred via a UMTS /
EDGE / GSM network, where the transfer already takes several minutes even
for quite small data sets.

I want to reduce the transfer time when requesting files on demand (and
the amount of data, in order not to exceed the monthly quota).



>> As the keys are the same for each entry  I considered converting them to
>> tuples.
> 
> I don't even understand what that means. You're going to convert the keys 
> to tuples? What will that accomplish?

>> As the keys are the same for each entry I considered converting them
(the aforementioned dicts) to tuples.

so the dict { 'timestamp' : 12, 'floatvalue': 3.14159, 'intvalue': 42,
 'message1' : '', 'message2' : '=' * 1999 }

would become
( 12, 3.14159, 42, '', '=' * 1999 )
> 
> 
>> The tuples contain only strings, ints (long ints) and floats (double)
>> and the data types for each position within the tuple are fixed.
>>
>> The fastest and simplest way is to pickle the data or to use json. Both
>> formats however are not that optimal.
> 
> How big are your JSON files? 10KB? 10MB? 10GB?
> 
> Have you tried using pickle's space-efficient binary format instead of 
> text format? Try using protocol=2 when you call pickle.Pickler.

No. This is probably already a big step forward.

As I know the data types if each element in the tuple I would however
prefer a representation, which is not storing the data types for each
typle over and over again (as they are the same for each dict / tuple)

> 
> Or have you considered simply compressing the files?

Compression makes sense, but the initial file format should already be
rather 'compact'.

> 
>> I could store ints and floats with pack. As strings have variable length
>> I'm not sure how to save them efficiently (except adding a length first
>> and then the string.
> 
> This isn't 1980 and you're very unlikely to be using 720KB floppies. 
> Premature optimization is the root of all evil. Keep in mind that when 
> you save a file to disk, even if it contains only a single bit of data, 
> the actual space used will be an entire block, which on modern hard 
> drives is very likely to be 4KB. Trying to compress files smaller than a 
> single block doesn't actually save you any space.

> 
> 
>> Is there already some 'standard' way or standard library to store such
>> data efficiently?
> 
> Yes. Pickle and JSON plus zip or gzip.
> 

Pickle protocol 2 + gzip of the tuples derived from the dicts might be
good enough for a start.
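
For what it's worth, a minimal sketch of that approach; the file name and
the record layout are just placeholders taken from the example above:

import gzip
import pickle

# Placeholder records in (timestamp, floatvalue, intvalue, message1,
# message2) order.
records = [
    (12, 3.14159, 42, '', '=' * 1999),
]

# Write each tuple sequentially with the binary pickle protocol, gzipped.
out = gzip.open('records.pkl.gz', 'wb')
for rec in records:
    pickle.dump(rec, out, protocol=2)
out.close()

# Read them back sequentially.
loaded = []
inp = gzip.open('records.pkl.gz', 'rb')
try:
    while True:
        loaded.append(pickle.load(inp))
except EOFError:
    pass
inp.close()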

I have to create a little more typical data in order to see what
percentage of my payload would consist of repeating the data types for
each tuple.
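
And for the struct/pack idea quoted earlier, one hedged way to
length-prefix the variable-length strings; the format codes and field
order here are assumptions, not an established format:

import struct

HEADER = '<qdi'   # assumed layout: int64 timestamp, double, int32

def pack_record(timestamp, floatvalue, intvalue, message1, message2):
    # Fixed-size header followed by each string prefixed with its length.
    parts = [struct.pack(HEADER, timestamp, floatvalue, intvalue)]
    for msg in (message1, message2):
        data = msg.encode('utf-8')
        parts.append(struct.pack('<I', len(data)))
        parts.append(data)
    return b''.join(parts)

def unpack_record(buf):
    timestamp, floatvalue, intvalue = struct.unpack_from(HEADER, buf, 0)
    offset = struct.calcsize(HEADER)
    messages = []
    for _ in range(2):
        (n,) = struct.unpack_from('<I', buf, offset)
        offset += 4
        messages.append(buf[offset:offset + n].decode('utf-8'))
        offset += n
    return (timestamp, floatvalue, intvalue) + tuple(messages)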




-- 
http://mail.python.org/mailman/listinfo/python-list


Re: save tuple of simple data types to disk (low memory foot print)

2011-10-29 Thread Gelonida N
On 10/29/2011 01:08 AM, Roy Smith wrote:
> In article ,
>  Gelonida N  wrote:
> 
>> I would like to save many dicts with a fixed amount of keys
>> tuples to a file  in a memory efficient manner (no random, but only
>> sequential access is required)
> 
> There's two possible scenarios here.  One, which you seem to be 
> exploring, is to carefully study your data and figure out the best way 
> to externalize it which reduces volume.
> 
> The other is to just write it out in whatever form is most convenient 
> (JSON is a reasonable thing to try first), and compress the output.  Let 
> the compression algorithms worry about extracting the entropy.  You may 
> be surprised at how well it works.  It's also an easy experiment to try, 
> so if it doesn't work well, at least it didn't cost you much to find out.


Yes, I have to run some more tests to see the difference between just
compressing a plain format (JSON / pickle) and compressing the
'optimized' representation.
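
A rough way to run that comparison; the record layout is the one from
this thread and the sizes it prints are purely illustrative:

import json
import pickle
import zlib

record = {'timestamp': 12, 'floatvalue': 3.14159, 'intvalue': 42,
          'message1': '', 'message2': '=' * 1999}
data = [record] * 1000

keys = ('timestamp', 'floatvalue', 'intvalue', 'message1', 'message2')
as_json = json.dumps(data).encode('utf-8')
as_pickle = pickle.dumps([tuple(r[k] for k in keys) for r in data],
                         protocol=2)

# Compare raw and compressed sizes of the two representations.
for label, blob in (('json', as_json), ('pickle-2 tuples', as_pickle)):
    print("%s: raw=%d compressed=%d"
          % (label, len(blob), len(zlib.compress(blob, 9))))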




-- 
http://mail.python.org/mailman/listinfo/python-list


Re: save tuple of simple data types to disk (low memory foot print)

2011-10-29 Thread Tim Chase

On 10/29/11 11:44, Gelonida N wrote:
> I would like to save many dicts with a fixed (and known) amount of keys
> in a memory efficient manner (no random, but only sequential access is
> required) to a file (which can later be sent over a slow expensive
> network to other machines)
>
> Example:
> Every dict will have the keys 'timestamp', 'floatvalue', 'intvalue',
> 'message1', 'message2'
> 'timestamp' is an integer
> 'floatvalue' is a float
> 'intvalue' an int
> 'message1' is a string with a length of max 2000 characters, but can
> often be very short
> 'message2' the same as message1
>
> so a typical dict will look like
> { 'timetamp' : 12, 'floatvalue': 3.14159, 'intvalue': 42,
>   'message1' : '', 'message2' : '=' * 1999 }
>
>> What do you call "many"? Fifty? A thousand? A thousand million? How many
>> items in each dict? Ten? A million?
>
> File size can be between 100kb and over 100Mb per file. Files will be
> accumulated over months.

If Steven's pickle-protocol2 solution doesn't quite do what you
need, you can do something like the code below.  Gzip is pretty
good at addressing...

>> Or have you considered simply compressing the files?
>
> Compression makes sense but the inital file format should be
> already rather 'compact'

...by compressing out a lot of the duplicate aspects.  Which also
mitigates some of the verbosity of CSV.

It serializes the data to a gzipped CSV file then unserializes
it.  Just point it at the appropriate data-source, adjust the
column-names and data-types.

-tkc

from gzip import GzipFile
from csv import writer, reader

data = [ # use your real data here
    {
        'timestamp': 12,
        'floatvalue': 3.14159,
        'intvalue': 42,
        'message1': 'hello world',
        'message2': '=' * 1999,
    },
] * 1

f = GzipFile('data.gz', 'wb')
try:
    w = writer(f)
    for row in data:
        w.writerow([
            row[name] for name in (
                # use your real col-names here
                'timestamp',
                'floatvalue',
                'intvalue',
                'message1',
                'message2',
            )])
finally:
    f.close()

output = []
for row in reader(GzipFile('data.gz')):
    d = dict(
        (name, f(row[i]))
        for i, (f, name) in enumerate((
            # adjust for your column-names/data-types
            (int, 'timestamp'),
            (float, 'floatvalue'),
            (int, 'intvalue'),
            (str, 'message1'),
            (str, 'message2'),
        ))
    )
    output.append(d)

# or

output = [
    dict(
        (name, f(row[i]))
        for i, (f, name) in enumerate((
            # adjust for your column-names/data-types
            (int, 'timestamp'),
            (float, 'floatvalue'),
            (int, 'intvalue'),
            (str, 'message1'),
            (str, 'message2'),
        ))
    )
    for row in reader(GzipFile('data.gz'))
]
--
http://mail.python.org/mailman/listinfo/python-list


Re: Review Python site with useful code snippets

2011-10-29 Thread Jason Friedman
On Wed, Oct 26, 2011 at 3:51 PM, Chris Hall  wrote:
> I am looking to get reviews, comments, code snippet suggestions, and
> feature requests for my site.
> I intend to grow out this site with all kinds of real world code
> examples to learn from and use in everyday coding.
> The site is:
>
> http://www.pythonsnippet.com
>
> If you have anything to contribute or comment, please post it on the
> site or email me directly.

Great sentiment, but there are already http://code.activestate.com/,
http://code.google.com/p/python-code-snippets/ and
http://stackoverflow.com/questions/2032462/python-code-snippets.

Pretty site you put up, though.
-- 
http://mail.python.org/mailman/listinfo/python-list


Customizing class attribute access in classic classes

2011-10-29 Thread Geoff Bache
Hi,

I'm wondering if there is any way to customize class attribute access
on classic classes?

So this works:

class Meta(type):
def __getattr__(cls, name):
return "Customized " + name

class A:
__metaclass__ = Meta

print A.blah

but it turns A into a new-style class.

If "Meta" does not inherit from type, the customization works but A
ends up not being a class at all, severely restricting its usefulness.
I then hoped I could get "Meta" to inherit from types.ClassType but
that wasn't allowed either.

Is there any way to do this or is it just a limitation of classic
classes?

Regards,
Geoff Bache
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Customizing class attribute access in classic classes

2011-10-29 Thread Ben Finney
Geoff Bache  writes:

> I'm wondering if there is any way to customize class attribute access
> on classic classes?

Why do that? What is it you're hoping to achieve, and why limit it to
classic classes only?

> So this works:
>
> class Meta(type):
> def __getattr__(cls, name):
> return "Customized " + name
>
> class A:
> __metaclass__ = Meta
>
> print A.blah
>
> but it turns A into a new-style class.

Yes, A is a new-style class *because* its metaclass inherits from ‘type’
<URL:http://docs.python.org/reference/datamodel.html#new-style-and-classic-classes>.

Why does that not meet your needs?
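
A small Python 2 sketch of the distinction in question; Meta and A are
from the original post, and B is added here purely for contrast:

import types

class Meta(type):
    def __getattr__(cls, name):
        return "Customized " + name

class A:
    __metaclass__ = Meta   # A's type becomes Meta, a subclass of type

class B:
    pass                   # ordinary classic class, for contrast

print isinstance(A, type)           # True  -> A is new-style
print type(B) is types.ClassType    # True  -> B is classic
print A.blah                        # Customized blah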

-- 
 \        Contents of signature may settle during shipping. |
  `\                                                        |
_o__)                                                       |
Ben Finney
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Convert DDL to ORM

2011-10-29 Thread Lie Ryan

On 10/25/2011 03:30 AM, Alec Taylor wrote:
> Good morning,
>
> I'm often generating DDLs from EER->Logical diagrams using tools such
> as PowerDesigner and Oracle Data Modeller.
>
> I've recently come across an ORM library (SQLalchemy), and it seems
> like a quite useful abstraction.
>
> Is there a way to convert my DDL to ORM code?


It's called reverse engineering. Some ORMs, e.g. Django's ORM can 
reverse engineer the database into Django Models by using `./manage.py 
inspectdb`. I believe the equivalent in SQLalchemy would be SQL 
Autocode, see 
http://turbogears.org/2.1/docs/main/Utilities/sqlautocode.html and 
http://code.google.com/p/sqlautocode/
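
If generated model source isn't strictly needed, SQLAlchemy can also
reflect the existing schema at runtime; a minimal sketch, where the
connection URL is only a placeholder:

from sqlalchemy import create_engine, MetaData

# Placeholder URL -- point it at the database created from your DDL.
engine = create_engine('postgresql://user:password@localhost/mydb')

metadata = MetaData()
metadata.reflect(bind=engine)   # load Table definitions from the live schema

for table in metadata.sorted_tables:
    print(table.name, [column.name for column in table.columns])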


--
http://mail.python.org/mailman/listinfo/python-list