Hi Nikita Popov,

> 1. There would be the possibility of having an interface Deque that is
> backed by a VecDeque/ArrayDeque implementation. I'm not convinced this
> would actually make sense, but I wanted to throw it out there, given that
> the class is final (which is without any doubt the correct decision).

Do you mean ArrayDeque for a hardcoded max capacity?
I'm also not convinced there's a use case.

> 2. How do set() / offsetSet() work with an out of range index? Is it
> possible to push an element with $deque[count($deque)] = x? I would assume
> that throws, but want to make sure. On a related note, why do we need both
> get()/set() and offsetGet()/offsetSet()? These APIs seem redundant.

It isn't possible to set an out of range index - both throws 
`\OutOfBoundsException`

In order to support the ArrayAccess `$deque[$offset] = $value;` or 
`$deque[$offset]` shorthand,
the offsetGet/offsetSet needed to be implemented to follow conventions.
(because of ArrayAccess, offsetGet must accept mixed offsets)

Those aren't type safe, though, so get()/set() are provided for a type safe 
alternative
that will throw a TypeError for use cases that would benefit from runtime type 
safety.

> 3. I believe it's pretty standard for Deque implementations to provide
> insert() / remove() methods to insert at any position (these would be O(n)).

https://www.php.net/manual/en/class.splqueue.php didn't offer that 
functionality. 

https://www.php.net/manual/en/class.ds-deque.php did, apparently.

> 4. The design of getIterator() here is rather unusual, and deserves some
> additional justification. Normally, we let getIterator() see concurrent
> modifications to the structure, though the precise behavior is unspecified.
> I would also like to know how this will look like on a technical level (it
> doesn't seem to be implemented yet?) This seems like something that will
> require a check on every write operation, to potentially separate the
> structure in some form.
 
The original plan was to copy the entire array on iterator creation,
to imitate the immediate copy nature of foreach on arrays.

This was assuming that `foreach` over a Deque without removing elements would 
be a rare use case.

That may have been a mistake since `foreach (clone $deque as $key => $value)` 
can be done to explicitly do that.
There're around 4 approaches I could take with different tradeoffs

1. Iterate over $offset = 0 and increment offset in calls to 
InternalIterator->next() until exceeding the size of the deque, not copying the 
deque.

   That's the **actual** current implementation, but would be unintuitive with 
shift()/unshift()

   This would repeat elements on unshift(), or skip over elements when shift() 
is called.

   The current implementation of `Ds\Deque` seems to do the same thing, but 
there's a similar discussion in its issue tracker in 
https://github.com/php-ds/ext-ds/issues/17


2. Similar iteration behavior, but also have it relative to a uint64 indicating 
the number of times elements were added to the front of the deque minus the 
number of elements were removed from the back of the deque.

    (e.g. if the current Deque internalOffset is 123, the InternalIterator 
returned by getOffset would start with an offset of 123 and increase it when 
advancing.
    Calls to shift would raise the Deque internalOffset, calls to unshift would 
decrease it (by the number of elements).
    This would be independent of the circular buffer offset.
    And the InternalIterator would increase the internalOffset to catch up if 
the signed offset difference became negative, e.g. by calling shift() twice in 
a foreach block)

3. Behave as if the entire Deque was copied when iteration starts (either by 
actually copying everything or by copy on write).

   That was the **planned** implementation documented in the RFC, but would be 
inefficient for iterations that end early and use a lot of memory for large 
Deques.

4. Throw if attempting to change the size of the `Deque` while a foreach is in 
progress.

   The semantics of that would be annoying, e.g. handling `clear()` or 
`shift()`, and the exception getting thrown would be unexpected.

> 5. The shift/unshift terminology is already used by our array API, but I'm
> not sure it's the most intuitive choice in the context of a deque. SplQueue
> has enqueue() and dequeue(). Another popular option from other languages
> (which I would personally favor) is pushFront() and popFront().

My original name idea was pushBack/popBack/pushFront/popFront but I had decided 
against proposing brand new terms
for functionality that already had terms in php-src.
It would also be inconsistent with proposed names for `Vector` if that was also 
added.

https://www.php.net/manual/en/class.ds-deque.php and SplDoublyLinkedList had 
done the same thing

enqueue and dequeue don't have a good term for enqueueing on the opposite side.

- It would make it easier to learn the language to have fewer terms and migrate 
code using SplDoublyLinkedList or arrays
- push()/shift() are also shorter names than pushBack/popFront()

Thanks,
Tyson


From: Nikita Popov <nikita....@gmail.com>
Sent: October 4, 2021 8:04 AM
To: Levi Morrison <levi.morri...@datadoghq.com>
Cc: PHP internals <internals@lists.php.net>
Subject: Re: [PHP-DEV] Adding `final class Deque` to PHP 
 
On Tue, Sep 21, 2021 at 11:05 PM Levi Morrison via internals <
internals@lists.php.net> wrote:

> The name "deque" is used in the standard library of these languages:
>
>  - C++: std::deque
>  - Java: java.util.Deque (interface)
>  - Python: collections.deque
>  - Swift: Collections.Deque (not standard lib, apparently, but Apple
> official? Don't know Swift)
>  - Rust: std::collections::VecDeque
>
> And these don't have it in the standard library:
>  - Go
>  - C#
>  - C
>  - JavaScript
>
> Anyway, it's pretty decent evidence that:
>  1. This functionality is pretty widely used across languages.
>  2. This functionality should have "deque" be in the name, or be the
> complete name.
>
> Discussion naming for "vector" I can understand, as it's less widely
> used or sometimes means something specific to numbers, but there's
> little point in discussing the name "deque." It's well established in
> programming, whether PHP programmers are aware of it or not.
>
> As I see it, the discussion should be centered around:
>  1. The API Deque provides.
>  2. The package that provides it.
>  3. Whether Deque's API is consistent with other structures in the same
> package.
>  4. Whether this should be included in php-src or left to extensions.
>
> I suggest that we try to make PHP as homogenous in each bullet as we
> can with other languages:
>  1. Name it "Deque."
>  2. Put it in the namespace "Collections" (and if included in core,
> then "ext/collections").
>  3. Support common operations on Deque (pushing and popping items to
> both front and back, subscript operator works, iteration, size, etc)
> and add little else (don't be novel here). To me, the biggest thing
> left to discuss is a notion of a maximum size, which in my own
> experience is common for circular buffers (the implementation chosen
> for Deque) but not all languages have this.
>  4. This is less clear, but I'm in favor as long as we can provide a
> few other data structures at the same time. Obviously, this a voter
> judgement call on the yes/no.
>
> Given that I've suggested the most common options for these across
> many languages, it shouldn't be very controversial. The worst bit
> seems like picking the namespace "Collections" as this will break at
> least one package: https://github.com/danielgsims/php-collections. We
> should suggest that they vendor it anyway, as "collections" is common
> e.g. "Ramsey\Collections", "Doctrine\Common\Collections". I don't see
> a good alternative here to "Collections", as we haven't agreed on very
> much on past namespace proposals.
>
> Anyway, that's what I suggest.
>

I generally agree with everything Levi has said here. I think that adding a
deque structure generally makes sense, and that putting it into a new
ext/collections extension under the corresponding namespace would be
appropriate. I wouldn't insist on introducing it together with other
structures (I'm a lot more sceptical about your Vector proposal), but do
think there should be at least some overarching plan here, even if we only
realize a small part of it in this version.

A few questions for the particular API proposed here:

1. There would be the possibility of having an interface Deque that is
backed by a VecDeque/ArrayDeque implementation. I'm not convinced this
would actually make sense, but I wanted to throw it out there, given that
the class is final (which is without any doubt the correct decision).

2. How do set() / offsetSet() work with an out of range index? Is it
possible to push an element with $deque[count($deque)] = x? I would assume
that throws, but want to make sure. On a related note, why do we need both
get()/set() and offsetGet()/offsetSet()? These APIs seem redundant.

3. I believe it's pretty standard for Deque implementations to provide
insert() / remove() methods to insert at any position (these would be O(n)).

4. The design of getIterator() here is rather unusual, and deserves some
additional justification. Normally, we let getIterator() see concurrent
modifications to the structure, though the precise behavior is unspecified.
I would also like to know how this will look like on a technical level (it
doesn't seem to be implemented yet?) This seems like something that will
require a check on every write operation, to potentially separate the
structure in some form.

5. The shift/unshift terminology is already used by our array API, but I'm
not sure it's the most intuitive choice in the context of a deque. SplQueue
has enqueue() and dequeue(). Another popular option from other languages
(which I would personally favor) is pushFront() and popFront().

Regards,
Nikita
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php

Reply via email to