[Python-ideas] Re: Access (ordered) dict by index; insert slice

Steven D'Aprano Fri, 10 Jul 2020 07:32:46 -0700

On Thu, Jul 09, 2020 at 06:26:41PM +0100, Stestagg wrote:

> As for use-cases, I'll admit that I see this as a fairly minor
> quality-of-life issue.


Thank you for that comment.

I too have sometimes proposed what I think of as "minor quality-of-life" 
enhancements, and had them shot down. It stings a bit, and can be 
frustrating, but remember it's not personal.

The difficulty is that our QOL enhancement is someone else's bloat. 
Every new feature is something that has to be not just written once, but 
maintained, documented, tested and learned. Every new feature steepens 
the learning curve for the language; every new feature increases the 
size of the language, increases the time it takes to build, increases 
the time it takes for the tests to run.

This one might only be one new method on three classes, but it all adds 
up, and we can't add *everything*.

(I recently started writing what was intended to be a fairly small 
class, and before I knew it I was up to six helper classes, nearly 200 
methods, and approaching 1500 LOC, for what was conceptually intended to 
be a *lightweight* object. I've put this aside to think about it for a 
while, to decide whether to start again from scratch with a smaller API, 
or just remove the word "lightweight" from the description :-)

So each new feature has to carry its own weight. Even if the weight in 
effort to write, effort to learn, code, tests and documentation is 
small, the benefit gained must be greater or it will likely be rejected.

"Nice to have" is unlikely to be enough, unless you happen to be one of 
the most senior core devs scratching your own itch, and sometimes not 
even then.


> >>> import numpy as np
> >>> mapping_table = np.array(BIG_LOOKUP_DICT.items())
> [[1, 99],
>  [2, 23],
>  ...
> ]

That worked in Python 2 because dict.items() made a copy of the dict 
items into a list. It will work equally well in Python 3 if you 
explicitly copy the items into a list first.

And I expect that even if dict.items() was indexable, numpy would 
still have to copy the items. I don't know how numpy works in detail, 
but I doubt that it will be able to use a view of a hash table internals 
as a fast array without copying.

Bottom line here is that adding indexing to dict views won't save you 
either time or memory or avoid making a copy in this example. All it 
will save you is writing an explicit call to `list`. And we know what 
the Zen says about being explicit.
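
For concreteness, here is a minimal sketch of that explicit call. The 
small BIG_LOOKUP_DICT below is a made-up stand-in for the table in the 
quoted example, and the variable names are only illustrative.

```python
# Hypothetical stand-in for the BIG_LOOKUP_DICT in the quoted example.
BIG_LOOKUP_DICT = {1: 99, 2: 23, 3: 42}

# The explicit copy: a list of (key, value) tuples in insertion order.
pairs = list(BIG_LOOKUP_DICT.items())

# A list is an ordinary sequence, so indexing and slicing just work.
first_pair = pairs[0]
last_two = pairs[-2:]

# The list is a snapshot: later changes to the dict do not show up in it.
BIG_LOOKUP_DICT[4] = 7
snapshot_len = len(pairs)
```

Handing `pairs` to np.array should then behave like the Python 2 
version, with the copy made in plain sight.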


> >>> import sqlite3
> >>> conn = sqlite3.connect(":memory:")
> >>> params = {'a': 1, 'b': 2}
> >>> placeholders = ', '.join(f':{p}' for p in params)
> >>> statement = f"select {placeholders}"
> >>> print(f"Running: {statement}")
> Running: select :a, :b
> >>> cur=conn.execute(statement, params.values())
> >>> cur.fetchall()
> [(1, 2)]

Why are you passing a view of the values when you could pass the dict 
itself? Is there some reason you don't do this?

    # statement = "select :a, :b"
    py> cur=conn.execute(statement, params)
    py> cur.fetchall()
    [(1, 2)]

I'm not an expert on sqlite, so I might be missing something here, but I 
would have expected that this is the preferred solution. It matches the 
example in the docs, which uses a dict.
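
Put together as a self-contained script, a sketch of the dict-passing 
version might look like this. It reuses the names from the quoted 
example; `rows` is just my name for the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
params = {'a': 1, 'b': 2}

# Build named placeholders (":a, :b") from the dict keys.
placeholders = ', '.join(f':{p}' for p in params)
statement = f"select {placeholders}"

# Pass the dict itself: sqlite3 matches each :name placeholder to a key.
cur = conn.execute(statement, params)
rows = cur.fetchall()
conn.close()
```

Because the placeholders are matched by name, the result does not depend 
on the order of the dict at all.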


> # This currently works, but is deprecated in 3.9
> >>> dict(random.sample({'a': 1, 'b': 2}.items(), 2))
> {'b': 2, 'a': 1}

I suspect that even if dict items were indexable, Raymond Hettinger 
would not be happy with random.sample on dict views.
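
The explicit spelling is not much longer. A minimal sketch, assuming all 
you want is a random subset of the items (the three-item dict is made up 
for illustration):

```python
import random

d = {'a': 1, 'b': 2, 'c': 3}

# random.sample wants a sequence, so copy the items into a list first.
population = list(d.items())

# Pick two distinct (key, value) pairs and rebuild a dict from them.
picked = dict(random.sample(population, 2))
```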


> >>>  def min_max_keys(d):
> >>>      min_key, min_val = d.items()[0]
> >>>      max_key, max_val = min_key, min_val
> >>>      for key, value in d.items():

Since there's no random access to the items required, there's not really 
any need for indexing. You only need the first item, then iteration. So 
the natural way to write that is with iter() and next().
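
The quoted function is cut off, so the following is only a sketch of 
what I assume it is meant to do: return the keys holding the smallest 
and largest values. The loop body, the return value and the empty-dict 
handling are my guesses, not the original code.

```python
def min_max_keys(d):
    # One iterator over the items; no indexing required.
    items = iter(d.items())
    try:
        # Take the first item as the starting point for both extremes.
        min_key, min_val = next(items)
    except StopIteration:
        raise ValueError("min_max_keys() arg is an empty mapping") from None
    max_key, max_val = min_key, min_val
    # The same iterator carries on from the second item.
    for key, value in items:
        if value < min_val:
            min_key, min_val = key, value
        elif value > max_val:
            max_key, max_val = key, value
    return min_key, max_key


result = min_max_keys({'a': 3, 'b': 1, 'c': 7})
```

A side benefit is that the first item is not compared against itself, 
which the indexing version would do when it restarts the loop over 
d.items().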

I suspect that the difference in perspective here is that (perhaps?) you 
still think of concrete sequences and indexing as fundamental, while 
Python 3 has moved in the direction of making the iterator protocol and 
iterators fundamental.

You have a hammer (indexing), so you want views to be nails so you can 
hammer them. But views are screws, and need a screwdriver (iter and 
next).

I'm not disputing that sometimes a hammer is the right solution, I'm 
just dubious that having views be a mutant nail+screw hybrid is worth 
it, just so you can pass a view to random.sample.


> The idea that there are future, unspecified changes to dicts() that may or
> may not be hampered by allowing indexing sounds like FUD to me, unless
> there are concrete references?

Nobody is suggesting that dicts will cease to be order preserving in the 
future. But order preserving does not necessarily require there to be 
easy indexing. It just requires that the order of iteration is the same 
as order of insertion, not that we can jump to the 350th key without 
stepping through the previous 349 keys.
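
If you do occasionally need the Nth key, stepping through is already 
easy to spell with itertools.islice. A minimal sketch; `nth_key` is a 
made-up helper name, not an existing or proposed API:

```python
from itertools import islice


def nth_key(d, n):
    """Return the n-th key (0-based) in insertion order.

    This steps through the first n keys one at a time, so it is O(n).
    """
    try:
        return next(islice(d, n, None))
    except StopIteration:
        raise IndexError("dict has fewer than n + 1 keys") from None


d = {f"k{i}": i for i in range(500)}
key_350 = nth_key(d, 349)  # the 350th key, counting from one
```

This relies only on the iteration order guarantee and makes no 
assumption about how the dict is laid out internally.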

Dicts have gone through a number of major redesigns and many careful 
tweaks over the years to get the best possible performance. The last 
major change was to add *order-preserving* behaviour, not indexing. The 
fact that they can be indexed in reasonable time is not part of the 
design, just an accident of implementation, and being an accident, it 
could change in the future.

This feature would require upgrading that accident of implementation to 
a guarantee. If the Python world were awash with dozens of compelling, 
strong use-cases for indexing dicts, then we would surely be willing to 
make that guarantee. But the most compelling use-case we've seen so far 
is awfully weak indeed: choose a random item from a dict.

So the cost-benefit calculation goes (in my opinion) something like 
this.

1. Risk of eliminating useful performance enhancements in the 
   future: small.

2. Benefit gained: even smaller.


That's not FUD. It's just a simple cost-benefit calculation. You can 
counter it by finding good use-cases that are currently difficult and 
annoying to solve. Using an explicit call to list is neither difficult 
nor annoying :-)


-- 
Steven
_______________________________________________
Python-ideas mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/JNJQCJJ5FG34WUUYCKVCL52YRWSG24Y4/
Code of Conduct: http://python.org/psf/codeofconduct/
