On 3/5/10 4:40 AM, Matthew Dowle wrote:
Thanks a lot. Quick and brief responses below...
"Duncan Murdoch"<murd...@stats.uwo.ca> wrote in message
news:4b90f134.6070...@stats.uwo.ca...
Matthew Dowle wrote:
I was hoping for a 'yes', 'no', 'maybe' or 'bad idea because ...'. No
response resulted in a retry() after a Sys.sleep(10 days).
If its a "yes" or "maybe" then I could proceed to try it, test it, and
present the test results and timings to you along with the patch. It
would be on 32bit Ubuntu first, and I would need to either buy, rent time
on, or borrow a 64bit machine to be able to then test there, owing to the
nature of the suggestion.
If its "no", "bad idea because..." or "we were already working on it, or
better", then I won't spend any more time on it.
Matthew
"Matthew Dowle"<mdo...@mdowle.plus.com> wrote in message
news:hlu4qh$l7...@dough.gmane.org...
Looking at shash in unique.c, from R-2.10.1 I'm wondering if it makes
sense to hash the pointer itself rather than the string it points to?
In other words could the SEXP pointer be cast to unsigned int and the
usual scatter be called on that as if it were integer?
Two negative but probably not fatal issues:
Pointers and ints are not always the same size. In Win64, ints are 32
bits, pointers are 64 bits. (Can we be sure there is some integer type
the same size as a pointer? I don't know, ask a C expert.)
No we can't be sure. But we could test at runtime, and if the assumption
wasn't true, then revert to the existing method.
I think the idea is, on the whole, a reasonable one and would be
inclined to apply a patch if it demonstrated some measurable performance
improvement.
For the 32bit v 64bit issue, I think we could detect and in the 64bit
case take something like:
((int)p) ^ ((int)(p >> 32))
We might want to save the hash to disk. On restore, the pointer based
hash would be all wrong. (I don't know if we actually do ever save a hash
to disk. )
The hash table in unique.c appears to be a temporary private hash, different
to the global R_StringHash. Its private hash appears to be used only while
the call to unique runs, then free'd. Thats my understanding anyway. The
suggestion is not to alter the global R_StringHash in any way at all, which
is the one that might be saved to disk now or in the future.
I agree with your reading: this is a temporary hash table and there
would be little reason to want to save it (it is not saved now).
+ seth
--
Seth Falcon | @sfalcon | http://userprimary.net/user
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel