On 12/15/2016 12:48 PM, Terry Reedy wrote:
On 12/15/2016 12:27 PM, Thomas Nyberg wrote:

I haven't dealt with a data structure exactly like this, but it's
basically a sparse array.

A sparse array has at least half missing values.  This one has none on
the defined domain, but contiguous dupicates.


I'm sorry for devolving into semantics, but there certainly isn't a single definition of "sparse array" out there. For example, the definition in wikipedia (https://en.wikipedia.org/wiki/Sparse_array) doesn't agree with you:

"In computer science, a sparse array is an array in which most of the elements have the default value (usually 0 or null)."

Personally my usage of sparse arrays in scipy has _always_ had all defined values it's just that the default value was 0. I never deal with "missing" values.

(I'm sure it has a name somewhere in the
academic literature, but I couldn't find it with a quick search right
now...)

'Sparse array' is the name for sparse arrays.  I cannot think of one for
blocks of duplicates.


Yeah that's why I said "basically a sparse array". It has the same defining characteristics. It is a situation where you have a large number of values associated to a small number of values in such a way that you can fairly easily describe the association without needing to simply write out all pairs. For regular sparse arrays it's easy because you associate them to a single value by default and only specify the ones that aren't. In this case that doesn't work, but due to the fact that you have long runs of repeated values you can use that to encode and compress the mapping.

Still not sure what to call it. I like D'Arcy's idea of subclassing a dict in combination with Peter's idea of using bisect (I had no idea that module existed!) so maybe I'll change my own not so great terminology to "basically a sparse map".

Cheers,
Thomas
--
https://mail.python.org/mailman/listinfo/python-list

Reply via email to