"Martin MOKREJŠ" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > Hi, > I have sets.Set() objects having up to 20E20 items, > each is composed of up to 20 characters. Keeping > them in memory on !GB machine put's me quickly into swap. > I don't want to use dictionary approach, as I don't see a sense > to store None as a value. The items in a set are unique. > > How can I write them efficiently to disk? To be more exact, > I have 20 sets. _set1 has 1E20 keys of size 1 character. > > alphabet = ('G', 'A', 'V', 'L', 'I', 'P', 'S', 'T', 'C', 'M', 'A', 'Q', 'F', 'Y', 'W', 'K', 'R', 'H', 'D', 'E') > for aa1 in alphabet: > # l = [aa1] > #_set1.add(aa1) > for aa2 in alphabet: > # l.append(aa2) > #_set2.add(''.join(l)) > [cut] > > The reason I went for sets instead of lists is the speed, > availability of unique, common and other methods. > What would you propose as an elegant solution? > Actually, even those nested for loops take ages. :( > M.

_set1 only has 19 keys of size 1 character - 'A' is duplicated in your alphabet.

Assuming you replace the second 'A' with another character (say 'Z'), then here is what you get:

_set1  - 20 elements (20**1)
_set2  - 400 elements (20**2)
_set3  - 8000 elements (20**3)
...
_set20 - 20**20 ~ 10 ^ (1.301*20) or 1E26

If you have no duplicates in your alphabet, then you won't need to use sets: every combination will be unique. In this case, just use a generator.

Here's a recursive generator approach that may save you a bunch of nested editing (set maxDepth as high as you dare):

    alphabet = ('G', 'A', 'V', 'L', 'I', 'P', 'S', 'T', 'C', 'M',
                'Z', 'Q', 'F', 'Y', 'W', 'K', 'R', 'H', 'D', 'E')

    maxDepth = 3

    def genNextCombo(root=list(), depth=1):
        # root is never modified in place; each level builds a new list
        for a in alphabet:
            thisRoot = root + [a]
            yield "".join(thisRoot)
            # extend this prefix until maxDepth characters are reached
            if depth < maxDepth:
                for c in genNextCombo(thisRoot, depth+1):
                    yield c

    for c in genNextCombo():
        print(c)  # or write to file, or whatever

-- Paul
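
P.S. On the "write to file" part, here is a rough sketch of what I mean, not a tuned solution. It streams the combinations straight to a text file, one per line, so nothing has to sit in memory. The names gen_combos, expected_count and write_combos are made up for this example, and I pass the alphabet and depth in as arguments instead of using globals. It assumes an alphabet with no duplicates, as discussed above.

```python
ALPHABET = "GAVLIPSTCMZQFYWKRHDE"  # the 20 letters above, 'Z' standing in for the duplicate 'A'

def gen_combos(alphabet, max_depth, root=""):
    """Yield every string of length 1..max_depth over alphabet, depth-first."""
    for a in alphabet:
        this_root = root + a
        yield this_root
        # keep extending this prefix until it is max_depth characters long
        if len(this_root) < max_depth:
            for c in gen_combos(alphabet, max_depth, this_root):
                yield c

def expected_count(n_letters, max_depth):
    """Number of strings produced: n + n**2 + ... + n**max_depth."""
    return sum(n_letters ** k for k in range(1, max_depth + 1))

def write_combos(path, alphabet, max_depth):
    """Write one combination per line to path; return how many were written."""
    count = 0
    with open(path, "w") as out:
        for combo in gen_combos(alphabet, max_depth):
            out.write(combo + "\n")
            count += 1
    return count
```

For example, write_combos("combos.txt", ALPHABET, 3) writes 20 + 400 + 8000 = 8420 lines (the file name is just a placeholder). Check expected_count() before raising the depth: at depth 20 the last level alone is about 1E26 strings, which no disk will hold, so in practice you generate them on demand rather than store them.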