On 11/2/2011 2:41 PM, Steven Schveighoffer wrote:
On Wed, 02 Nov 2011 09:17:39 -0400, Ary Manzana <a...@esperanto.org.ar>
wrote:

On 11/2/11 10:12 AM, Steven Schveighoffer wrote:
On Wed, 02 Nov 2011 08:40:19 -0400, Ary Manzana <a...@esperanto.org.ar>
wrote:

On 11/2/11 8:48 AM, Steven Schveighoffer wrote:

The basic response to this is, when dealing with containers generically (that is, you know you have a container, but you don't know what type), the "remove this element" operation is not necessarily a good primitive to have.

Simply because, of the myriad of containers, only some can implement this operation efficiently. Java embeds this operation in the interface, which means any interface you have to a container could potentially use O(n) time to remove that element. Such an innocuous piece of syntax *should* have a cost if it's not efficient, IMO.

BTW, the original question doesn't provide enough information to say
"remove this element." Even in Java, if you aren't using the default
comparison, you must use a comparator method to determine which one to
remove. If cell.x == x && cell.y == y *is* the comparison operator for
the type, then the syntax gets much simpler, because you don't need to
pass a specialized comparison function.

In dcollections, removing a specific element (using the default comparison operator for that element) on a *fast lookup* container is as simple as:

container.remove(container.find(x));

Which removes the element x if it's found. However, this is not defined for containers which take O(n) time to search (such as a linked list); for those you must use std.algorithm.find:

container.remove(find(container[], x).begin);

Should work, and takes O(n) time.
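
And if the match isn't the element's own opEquals (like the cell.x == x && cell.y == y case from the original question), the same pattern works with a predicate. A rough sketch, assuming the dcollections-style interface used above (the container's slice is a range with a .begin cursor, and remove() takes a cursor); Cell and removeCell are just illustrative names:

struct Cell { int x, y; }

// remove the first element matching (x, y), using a predicate instead of opEquals
bool removeCell(C)(C container, int x, int y)
{
    import std.algorithm : find;
    auto r = find!(c => c.x == x && c.y == y)(container[]);
    if (r.empty)
        return false;
    container.remove(r.begin);
    return true;
}

Still O(n), but the cost is visible at the call site instead of hidden behind a one-argument remove().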

-Steve

I don't really understand what's wrong with inefficient methods. You
can have inefficient methods that are convenient, like removing an
element by the default comparison, or giving it a delegate to match
the element(s) to remove.

You profile your application. Is that method the bottleneck? If so, you change it to a more efficient one. If not, you are happy you had that method there: it performs inefficiently, but that doesn't matter much compared to, say, opening an SQL connection.

Programmers want to program, fast. They have schedules, they need to
deliver. They don't need to always find the best solution. They can
find a compromise between "working" and "fast", move on, and later
profile and worry about what matters most.

Programmers don't want to fight with the language or think "Oh, so to
remove an element I need to use this operation and combine it with
that one and with that other one"...

Or use the right container for the job?

Where it really comes into play is generic programming.

Let's say I write some algorithm that removes certain elements from a
container:

void removeElements(C, T)(C c, T[] t...)
{
    foreach(x; t)
        c.remove(x);   // remove each requested element
}

What's the complexity of this algorithm? For a HashSet, for instance, it
will be O(n) where n is the number of elements to remove.

But for an ArrayList, it will be O(n*m) where m is the number of
elements in c.

But I'm sure in this algorithm I have for this app I'm making, my
collection won't have more than 50 elements. So everything will be
O(1). I need to remove an element from the collection. I really don't
care about the complexity of the operation, because if n is 50,
everything is O(1).

Then your specific application can use std.algorithm.find to implement
the equivalent. Only the algorithm body changes, not the usage of the
algorithm.
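
For instance, something like this sketch would keep the same call sites; removeElementsSlow is just an illustrative name, and it assumes the same dcollections-style interface as above (c[] is a range, remove() takes a cursor):

void removeElementsSlow(C, T)(C c, T[] t...)
{
    import std.algorithm : find;
    foreach(x; t)
    {
        // linear search over the container's range, then remove via cursor
        auto r = find(c[], x);
        if (!r.empty)
            c.remove(r.begin);
    }
}

The linear search is now spelled out in the body, while the calling code stays the same.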


Why can't I have an inefficient (for you) remove operation that, for
me, will be ok?

I never said you couldn't (and I've even given examples of such
implementations). It's just not neatly packaged into a method.

But again, if the method is exactly the same as the efficient version
for other containers, it becomes *impossible* to design an algorithm
that guarantees any sort of complexity. As I said before, quadratic sort
is epic fail, and needs to always be avoided.

I'll give you a scenario:

People often complain that Linked List does not have an opIndex on it.
Yes it's inefficient, but so what? "I know it's inefficient, let me
decide whether it's worth it or not."

So let's say I add it to LinkList. Those people are happy.

But now, LinkList becomes defined as a *random-access range* according to std.range. Therefore, std.algorithm.sort(linklist) compiles! And it's now something like O(n^3).

Whereas LinkList already defines a sort method, which uses mergesort (O(n lg n)). So are you going to realize, when reading someone's code and you see:

sort(somelist);

That it's going to be horribly inefficient? Why shouldn't we strive for
a library where such things just don't compile?
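
To make that concrete, here is a minimal sketch of the kind of compile-time guard involved, using Phobos's std.container (Array and SList standing in for ArrayList and LinkList, since the guard is the point here); sortIfRandomAccess is a made-up name, and the real std.algorithm.sort constraints are more involved:

import std.algorithm : sort;
import std.container : Array, SList;
import std.range : isRandomAccessRange;

// only accept ranges that advertise O(1) indexing
void sortIfRandomAccess(R)(R r)
    if (isRandomAccessRange!R)
{
    sort(r);
}

void main()
{
    auto arr = Array!int(3, 1, 2);
    sortIfRandomAccess(arr[]);      // fine: Array's slice has O(1) opIndex

    auto list = SList!int(3, 1, 2);
    // sortIfRandomAccess(list[]);  // does not compile: SList's range has no
    //                              // opIndex, so the slow sort never sneaks in
}

Without an opIndex on the list, the bad call is a compile error instead of a silent performance bug.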

I don't want to spend 2 hours browsing the ranges algorithms to figure
out how to combine them to remove an element...

You severely exaggerate the amount of time needed to browse the algorithms. I wrote an equivalent in less than a minute, and I only look at std.algorithm maybe once every few months (and didn't look at it to write the solution).

My point is, as someone else said it in another post: add inefficient
operations and tell the programmer so. Then he can decide what to do.
If he's sure that that performance penalty is not a big issue for him,
he'll be happy to have that method available.

The operation is available using std.algorithm. There is no need to
reimplement it as a method, sanctioned by the container. As I said
before, if you find yourself wanting to remove specific elements from a
container, perhaps Array is not the smartest choice for a container.
There are plenty of others.

There is something to be said for a language/library that discourages or
prevents you from writing dumb code. It means that I'm that much more
confident a piece of code written in that language is more efficient
than one written in another language.

-Steve

Hello.

Your general argument is all nice and good, but this is a very specific case.

I am using a LinkList because, in my code, the elements are iterated over a million times and, during this, I add stuff in between elements all the time. However, I will be removing stuff *very* rarely. I am thus using the appropriate container, and it doesn't matter whether the remove is inefficient.

To put it another way: if removing elements was of crucial importance to the performance of my code in the first place, I wouldn't (and shouldn't) be using a LinkList. Therefore, implementing an inefficient method which does this won't be of consequence. Finally, the fundamental statement I'm trying to make here is: adding and removing *single* elements should be a straightforward method call for *any* container.

As a side note, your example about generic programming is really neat and makes sense; personally, I would never want a linked list with indexes. But it's also a horrible analogy to the complaint at hand.

/Max
