Thanks to Thomas, Martin, Jim and William,
Your input was very informative, and thanks for the reference to Sedgewick.
In the end, it does seem to me that all these algorithms require fast
lookup by ID of nodes to access data, and that conditional on such fast
lookup, algorithms are possible wi
On Sat, Nov 2, 2013 at 11:12 AM, Martin Morgan wrote:
> On 11/01/2013 08:22 AM, Magnus Thor Torfason wrote:
>
>> Sure,
>>
>> I was attempting to be concise and boiling it down to what I saw as the
>> root issue, but you are right, I could have taken it a step further. So
>> here goes.
>>
>> I
lto:r-help-boun...@r-project.org] On Behalf Of Magnus Thor Torfason
Sent: Friday, November 01, 2013 8:23 AM
To: r-help@r-project.org
Subject: Re: [R] Inserting 17M entries into env took 18h, inserting 34M entries taking 5+ days
Sure,
I was attempting to be concise and boiling it down to what I saw as
On 11/1/2013 10:12 PM, Martin Morgan wrote:
Do you mean that if A,B occur together and B,C occur together, then A,B
and A,C are equivalent?
Yes, that's what I meant, sorry, typo.
I like your uid() function. It avoids the 20M times loop, and the issue
of circular references can be solved by
There are around 16M unique values. After accounting for equivalence,
the number is much smaller (I don't know how much smaller, since my
program has not completed yet :-)
Yes, I meant that "B and C are also equivalent". The original version
was a typo.
Best,
Magnus
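
The grouping described above (if A,B are equivalent and B,C are equivalent, then A, B and C all fall into one class) can be sketched as a union-find over integer IDs. This is only a minimal sketch under stated assumptions, not the uid() function mentioned in the thread: the names group_pairs and find_root are hypothetical, the pairs are assumed to arrive as two character vectors of equal length, and nothing here has been tuned or timed at the 20M-pair scale.

```r
# Hypothetical helper (not from the thread): label equivalence classes
# given pairs (a[k], b[k]) of equivalent strings.
group_pairs <- function(a, b) {
  stopifnot(length(a) == length(b))

  # Map every string to an integer ID once; afterwards all lookups are
  # plain vector indexing rather than lookups by name.
  keys <- unique(c(a, b))
  ia <- match(a, keys)
  ib <- match(b, keys)

  # parent[i] == i marks i as the root (representative) of its class.
  parent <- seq_along(keys)

  # Walk up to the root, then point every node on the path at it.
  find_root <- function(i) {
    root <- i
    while (parent[root] != root) root <- parent[root]
    while (parent[i] != root) {
      nxt <- parent[i]
      parent[i] <<- root
      i <- nxt
    }
    root
  }

  # Merge the two classes of each pair; the smaller ID becomes the root.
  for (k in seq_along(ia)) {
    ra <- find_root(ia[k])
    rb <- find_root(ib[k])
    if (ra != rb) parent[max(ra, rb)] <- min(ra, rb)
  }

  # Relabel roots as consecutive class numbers, named by the strings.
  roots <- vapply(seq_along(keys), find_root, integer(1))
  setNames(match(roots, unique(roots)), keys)
}

classes <- group_pairs(c("A", "B", "D"), c("B", "C", "E"))
```

In this small example A, B and C end up with one class label and D and E with another. The single match() call does the string-to-ID translation up front, so the loop itself only touches an integer vector.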
On 11/1/2013 3:45 PM, ji
On 11/01/2013 08:22 AM, Magnus Thor Torfason wrote:
Sure,
I was attempting to be concise and boiling it down to what I saw as the root
issue, but you are right, I could have taken it a step further. So here goes.
I have a set of around 20M string pairs. A given string (say, A) can
either
2013 8:23 AM
> To: r-help@r-project.org
> Subject: Re: [R] Inserting 17M entries into env took 18h, inserting 34M
> entries taking 5+ days
>
> Sure,
>
> I was attempting to be concise and boiling it down to what I saw as the
> root issue, but you are right, I could
Sure,
I was attempting to be concise and boiling it down to what I saw as the
root issue, but you are right, I could have taken it a step further. So
here goes.
I have a set of around 20M string pairs. A given string (say, A)
can either be equivalent to another string (B) or not. If A
It would be nice if you followed the posting guidelines and at least
showed the script that was creating your entries now so that we
understand the problem you are trying to solve. A bit more
explanation of why you want this would be useful. This gets to the
second part of my tag line: Tell me w
Pretty much what the subject says:
I used an env as the basis for a Hashtable in R, based on information
that this is in fact the way environments are implemented under the hood.
I've been experimenting with doubling the number of entries, and so far
it has seemed to be scaling more or less l