On Wednesday 28 September 2016 15:51, ast wrote: > Hello > > I noticed that searching in a set is faster than searching in a list. [...] > I tried a search in a tuple, it's not different that in a list. > Any comments ?
A list, array or tuple is basically a linear array that needs to be searched one item at a time: [ a | b | c | d | ... | x | y | z ] To find x, Python has to start at the beginning and look at every item in turn, until it finds x. But a set is basically a hash table. It is an array, but laid out differently, with blank cells: [ # | # | h | p | a | # | m | y | b | # | # | f | x | ... | # ] Notice that the items are jumbled up in arbitrary order. So how does Python find them? Python calls hash() on the value, which returns a number, and that points directly to the cell which would contain the value if it were there. So if you search for x, Python calls hash(x) which will return (say) 12. Python then looks in cell 12, and if x is there, it returns True, and if it's not, it returns False. So instead of looking at 24 cells in the array to find x, it calculates a hash (which is usually fast), then looks at 1 cell. (This is a little bit of a simplification -- the reality is a bit more complicated, but you can look up "hash tables" on the web or in computer science books. They are a standard data structure, so there is plenty of information available.) On average, if you have a list with 1000 items, you need to look at 500 items before finding the one you are looking for. With a set or dict with 1000 items, on average you need to look at 1 item before finding the one you are looking for. And that is why sets and dicts are usually faster than lists. -- Steven git gets easier once you get the basic idea that branches are homeomorphic endofunctors mapping submanifolds of a Hilbert space. -- https://mail.python.org/mailman/listinfo/python-list