Re: number of different lines in a file

2006-05-19 Thread Ben Stroud

>
>It never occurred to me to use the Python dict/set approach.  Now I
>wonder if it would've worked better somehow.  Of course my file was
>26,000 X larger than the one in this problem, and definitely would
>not fit in memory.  I suspect that there were as many as a million
>duplicates for some messages in that file.  Would the generator
>version above have helped me out, I wonder?
>

You could use a dbm file approach, which would provide an external 
dict/set interface through Python bindings.  Because the keys live on 
disk rather than in RAM, this would use less memory.

1.  Add records to dbm as keys
2.  dbm (if configured correctly) will only keep unique keys
3.  Count keys
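
The three steps above could be sketched roughly as follows.  This is only
an illustration, not tested against a file of the size discussed: it
assumes Python 3's standard dbm module (in 2006 the equivalent was
anydbm), and the function name count_unique_lines and both path
arguments are made up for the example.

```python
import dbm


def count_unique_lines(text_path, db_path):
    """Count distinct lines in text_path, using an on-disk dbm file as a set."""
    # The "n" flag always creates a new, empty database at db_path.
    with dbm.open(db_path, "n") as db:
        # Read in binary mode so each line can be used directly as a bytes key.
        with open(text_path, "rb") as f:
            for line in f:
                # Step 1: add the record as a key.  The value is a dummy.
                # Step 2: assigning an existing key overwrites it, so
                # duplicates collapse to a single entry.
                db[line.rstrip(b"\r\n")] = b"1"
        # Step 3: count the keys.
        return len(db)
```

Calling count_unique_lines("messages.log", "seen.db") would return the
number of distinct lines (both file names are placeholders).  Which dbm
backend gets used, and how fast it is, depends on what the Python build
provides.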

Cheers,
Ben

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: OLAP and pivot tables

2006-05-26 Thread Ben Stroud
George Sakkis wrote:

>After a brief search, I didn't find any Python package related to OLAP
>and pivot tables. Did I miss anything? To be more precise, I'm not so
>interested in a full-blown OLAP server with an RDBMS backend, but
>rather a pythonic API for constructing datacubes in memory, slicing and
>dicing them, drilling down or up dimensions and exposing them in some
>suitable form to a presentation layer. I've hacked a first cut of a
>pivot table implementation and an XHTML generator that produces
>hierarchical HTML tables but it's not particularly general or easily
>extensible so far. Is there any interest at all in a pythonic version
>of something like JOLAP or XMLA?
>
>George
>
I'd be interested as well.  I posted a similar question to the Ruby 
mailing list a few months ago, to no avail.  Ideally, someone much more 
talented than myself would create an open OLAP library in C that could be 
interfaced with dynamic languages easily (I ordered some OLAP books and 
started in on this, and decided I was in over my head for now).  As far 
as free software goes, all I've been able to find is the Java-based 
Mondrian.  Maybe it could serve as a reference implementation for someone.

Cheers,
Ben