Re: Hashed lookups for tabular data

2015-01-19 Thread Joseph L. Casale
> If you want an sql-like interface, you can simply create an in-memory
> sqlite3 database.
>
> import sqlite3
> db = sqlite3.Connection(':memory:')
>
> You can create indexes as you need, and query using SQL. Later, if you
> find the data getting too big to fit in memory, you can switch to us
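A minimal sketch of the in-memory sqlite3 suggestion quoted above. The table name (records), the column names (foo, bar, quux), and the lookup() helper are hypothetical, chosen only for illustration. Note that SQLite indexes are B-trees, so lookups through them are logarithmic rather than strictly O(1).

```python
import sqlite3

# Hypothetical column names, used only for illustration.
COLUMNS = ("foo", "bar", "quux")

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE records (foo TEXT, bar TEXT, quux TEXT)")
# foo and bar are assumed unique here; quux may repeat.
db.execute("CREATE UNIQUE INDEX idx_foo ON records (foo)")
db.execute("CREATE UNIQUE INDEX idx_bar ON records (bar)")
db.execute("CREATE INDEX idx_quux ON records (quux)")
db.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [("a1", "b1", "q1"), ("a2", "b2", "q1"), ("a3", "b3", "q2")],
)


def lookup(column, value):
    """Return all rows whose `column` equals `value`, ordered by foo."""
    # Column names cannot be bound as parameters, so check against a whitelist.
    if column not in COLUMNS:
        raise ValueError("unknown column: %r" % (column,))
    query = "SELECT foo, bar, quux FROM records WHERE {} = ? ORDER BY foo".format(column)
    return db.execute(query, (value,)).fetchall()
```

Calling lookup("quux", "q1") returns every row sharing that quux value, while lookups on foo or bar return at most one row.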

Re: Hashed lookups for tabular data

2015-01-19 Thread Joseph L. Casale
> Why not take a look at pandas and see if there's anything there you could
> use? Excellent docs here http://pandas.pydata.org/pandas-docs/stable/
> and the mailing list is available at gmane.comp.python.pydata amongst
> other places.
Mark,
Actually it was the first thing that came to mind. I did

Re: Hashed lookups for tabular data

2015-01-19 Thread Joseph L. Casale
> The IDs of the objects prove that they're actually all the same
> object. The memory requirement for this is just what the dictionaries
> themselves require; their keys and values are all shared with other
> usage.
Chris,
I would never have imagined that, much appreciated!
jlc
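The point about shared objects can be checked with a small sketch like the one below. The row values and dictionary names are assumptions made for illustration; the check relies only on the fact that Python containers store references to objects, not copies.

```python
import collections

# One row object, referenced from three separate lookup structures.
row = ("a1", "b1", "q1")
foo_dict = {row[0]: row}
bar_dict = {row[1]: row}
quux_dict = collections.defaultdict(set)
quux_dict[row[2]].add(row)

# Pull the single stored row back out of the set.
(stored,) = quux_dict["q1"]

# Collect the identities of the row as seen through each structure.
ids = {id(foo_dict["a1"]), id(bar_dict["b1"]), id(stored)}

# A single distinct id means every container refers to the same tuple.
all_same = len(ids) == 1
```

If all_same is true, the only extra memory used is that of the dictionaries and the set themselves, not additional copies of the row.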

Re: Hashed lookups for tabular data

2015-01-19 Thread Kushal Kumaran
"Joseph L. Casale" writes: >> So presumably your data's small enough to fit into memory, right? If >> it isn't, going back to the database every time would be the best >> option. But if it is, can you simply keep three dictionaries in sync? > > Hi Chris, > Yeah the data can fit in memory and henc

Re: Hashed lookups for tabular data

2015-01-19 Thread Mark Lawrence
On 19/01/2015 17:09, Joseph L. Casale wrote: This is actually far simpler than I had started imagining, however the row data is duplicated. I am hacking away at an attempt with references to one copy of the row. It's kind of hard to recreate an SQL-like object in Python with indexes and the inhe

Re: Hashed lookups for tabular data

2015-01-19 Thread Chris Angelico
On Tue, Jan 20, 2015 at 4:09 AM, Joseph L. Casale wrote:
>> row = (foo, bar, quux) # there could be duplicate quuxes but not foos or bars
>> foo_dict = {}
>> bar_dict = {}
>> quux_dict = collections.defaultdict(set)
>>
>> foo_dict[row[0]] = row
>> bar_dict[row[1]] = row
>> quux_dict[row[2]].add(ro
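The three-dictionary idea quoted above could be wrapped up roughly as follows. This is only a sketch: the TripleIndex class and its add, remove, and by_quux methods are hypothetical names, not something proposed in the thread, and it assumes rows are hashable 3-tuples in which foo and bar are unique while quux may repeat.

```python
import collections


class TripleIndex:
    """Keep three hash-based indexes over (foo, bar, quux) rows in sync."""

    def __init__(self):
        self.foo_dict = {}
        self.bar_dict = {}
        self.quux_dict = collections.defaultdict(set)

    def add(self, row):
        foo, bar, quux = row
        # Check both unique fields before touching any index, so a
        # rejected row leaves all three structures unchanged.
        if foo in self.foo_dict or bar in self.bar_dict:
            raise KeyError("duplicate foo or bar in row %r" % (row,))
        self.foo_dict[foo] = row
        self.bar_dict[bar] = row
        self.quux_dict[quux].add(row)

    def remove(self, row):
        foo, bar, quux = row
        del self.foo_dict[foo]
        del self.bar_dict[bar]
        self.quux_dict[quux].discard(row)
        # Drop empty sets so stale quux keys do not accumulate.
        if not self.quux_dict[quux]:
            del self.quux_dict[quux]

    def by_quux(self, quux):
        # .get() avoids creating an empty entry in the defaultdict on a miss.
        return self.quux_dict.get(quux, set())


index = TripleIndex()
index.add(("a1", "b1", "q1"))
index.add(("a2", "b2", "q1"))
```

Each lookup is a single dictionary access, so it is O(1) on average; the cost is that every insert and delete has to update all three structures.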

Re: Hashed lookups for tabular data

2015-01-19 Thread Joseph L. Casale
> So presumably your data's small enough to fit into memory, right? If
> it isn't, going back to the database every time would be the best
> option. But if it is, can you simply keep three dictionaries in sync?
Hi Chris,
Yeah the data can fit in memory and hence the desire to avoid a trip here.
>

Re: Hashed lookups for tabular data

2015-01-19 Thread Chris Angelico
On Tue, Jan 20, 2015 at 1:13 AM, Joseph L. Casale wrote:
> No surprise the data originates from a database; however, the data is utilized
> in a recursive processing loop where the user accessing the data benefits from a
> simplified and quick means to access it. Currently it's done with classes t

Hashed lookups for tabular data

2015-01-19 Thread Joseph L. Casale
I have some tabular data, for example 3-tuples, for which I need to build a container where lookups on any one of the three fields are O(1). Does something like this exist in the standard library, or, if not, is there an existing efficient implementation of such a container before I give it a go?