On 2018-06-27 11:11:37 +1200, Gregory Ewing wrote: > Bart wrote: > > x = set(range(10_000_000)) > > > > This used up 460MB of RAM (the original 100M I tried exhausted the memory). > > > > The advantage of Pascal-style sets is that that same set will occupy > > only 1.25MB, as it is a bit-map. > > That's true, but they're also extremely limited compared to > the things you can do with Python sets. They're really two > quite different things, designed for different use cases. > > Compact sets of integers are possible in Python, but not > as a built-in type. This suggests that they're not needed > much
Also, when such sets are needed, a simple array of bits may not be sufficient. For example, sets may be sparse, or they may have almost all except a few bits set. In these cases a tree or RLE representation is much smaller and faster. There are a number of such implementations (Judy Arrays and Roaring Bitmaps come to mind[1]). Each has its advantages and limitations, so its probably better to leave the choice to the author. hp [1] Judy is a C library. Roaring bitmaps are a data structure which has been implemented in several languages. I haven't checked whether there are packages on pypi. Shouldn't be too hard to implement if one needs it. -- _ | Peter J. Holzer | we build much bigger, better disasters now |_|_) | | because we have much more sophisticated | | | h...@hjp.at | management tools. __/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
signature.asc
Description: PGP signature
-- https://mail.python.org/mailman/listinfo/python-list