Hi Vlasta. I had a look at your original mail.
I think your simpler (than XML) format is a good idea for now. At a later stage you could change it to something like this: <CUSTOM_TAG KC=12 KNJ=13 VCC=0...etc>text goes here, some more text</CUSTOM_TAG> <CUSTOM_TAG KC=13 KNJ=14 VCC=1...etc>text goes here, some more text</CUSTOM_TAG> And so on. This isn't much more verbose than your current syntax, and it is (I think) valid XML that you can feed into other tools, etc. As for internal representation, the sqllite table-like format seems to suite your app well. Also, there may be a performance advantage (for queries) since the actual searching takes place in optimized C code (in C-friendly structures), rather than vanilla Python code (which uses string dict lookups for a lot of things which C doesn't). Have you benchmarked how long the sqllite version takes, as opposed to a simple list of dicts like this?: [{'line':123, 'KC':12, 'VCC':0, KNJ: 12, text: u'some text'}, {'line':124, 'KC':13, 'VCC':1, KNJ: 13, text: u'some more text'}, {'line':125, 'KC':14, 'VCC':2, KNJ: 14, text: u'some more text'}, ... etc .... ] (Note that I'm storing some values as ints rather than unicode) In theory, if you enable psyco, this structure (even without dict lookups) should be almost (or within a few orders at least) as fast as sqlite (without db indexes). Could you try that, and let me know how it perfoms? As for optimization, here's a simple method which might help. I'm basically simulating db indexes on all the fields: 1) Use the simple structure, as above (a list of dicts) 2) Add a dict for each field, which gives you a list of indexes into the above structure, for each unique value example (simplified) structure: lookups = { 'line': {123: [0,1,2], 124: [3,4,5].... etc} 'KC': {12: [0,4,6], 13: [7,8,9]...etc} 'VC': {14: [0,3,4], 15: [10,11,12] ... etc} ... etc } 3) When the user searches on various criteria, look up all the matches, and get intersections (using the 'set' module). eg: user searches for line 123, VC 15: Your code: a) Looks up line 13 indexes (0,1,2), and adds them to a set object b) Looks up VC 15 indexes (10,11,12), and adds them to another set object c) Gets the intersection of those sets (see python docs) (in this case no results, but you get the idea) d) Gives those results to the user In theory this should perform well. Could you try this and let me know how well it works? David -- http://mail.python.org/mailman/listinfo/python-list