On 03-11-2021 00:42, Avi Gross via R-help wrote:
Finally, someone mentioned how creating a data.frame with duplicate names
for columns is not a problem as it can automagically CHANGE them to be
unique. That is a HUGE problem for using that as a dictionary as the new
name will not be known to the system so all kinds of things will fail.
I think you are referring to my remark which was:
> However, the data.frame construction method will detect this and
> generate unique names (which also might not be what you want):
I didn't say that this means duplicate names aren't a problem; I just
mentioned that the behaviour is different. Personally, I would actually
prefer the behaviour of list (keep the duplicated name), with a warning.
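To make the difference concrete, here is a small sketch comparing how list() and data.frame() treat a repeated name. The helper warn_on_duplicate_names() is hypothetical (it is not part of base R); it only illustrates the "keep the name, but warn" behaviour mentioned above.

```r
# list() keeps duplicated names silently; lookup by name returns the first match
l <- list(x = 1:5, y = month.name, x = 3:7)
names(l)        # "x" "y" "x"
l[["x"]]        # 1 2 3 4 5

# data.frame() repairs duplicated names by default (check.names = TRUE)
df <- data.frame(x = 1:2, y = 3:4, x = 5:6)
names(df)       # "x" "y" "x.1"

# check.names = FALSE keeps the duplicates, as list() does
df2 <- data.frame(x = 1:2, y = 3:4, x = 5:6, check.names = FALSE)
names(df2)      # "x" "y" "x"

# Hypothetical helper: leave the object alone, but warn about repeated names
warn_on_duplicate_names <- function(obj) {
  nms <- names(obj)
  dup <- unique(nms[duplicated(nms)])
  if (length(dup) > 0L)
    warning("duplicated names: ", paste(dup, collapse = ", "))
  invisible(obj)
}
```

Calling warn_on_duplicate_names(l) would emit a warning mentioning x and return l unchanged.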
Most of the responses seem to assume that the OP actually wants a hash
table. Yes, he did ask for that and for a hash table an environment
(with some work) would be a good option. But in many cases, where other
languages would use a hash-table-like object (such as a dict) in R you
would use other types of objects. Furthermore, for many operations where
you might use hash tables to implement the operation, R already has
built-in options, for example %in%, match, and duplicated. These are also
vectorised, so two vectors, one with keys and one with values, might
actually be faster than an environment in some use cases.
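As a rough illustration of the two-vector approach, the sketch below uses made-up keys and values (they are not from this thread) and looks up several keys in one vectorised call instead of looping over an environment.

```r
# Hypothetical example data: parallel vectors of keys and values
keys   <- c("apple", "banana", "cherry")
values <- c(3, 7, 11)

wanted <- c("cherry", "apple", "kiwi")

# Membership test for all wanted keys at once
wanted %in% keys               # TRUE TRUE FALSE

# Vectorised lookup: match() gives positions, NA where the key is missing
values[match(wanted, keys)]    # 11 3 NA

# Check that the keys are unique (0 means no duplicates)
anyDuplicated(keys)            # 0

# A named vector gives the same lookup with name-based indexing
lookup <- setNames(values, keys)
unname(lookup[c("banana", "cherry")])   # 7 11
```

Whether this beats an environment depends on how many lookups are done per call and on how often the table is modified, so it is worth timing on your own data.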
Best,
Jan
And there are also packages for many features like sets as well as functions
to manipulate these things.
-----Original Message-----
From: R-help <r-help-boun...@r-project.org> On Behalf Of Bill Dunlap
Sent: Tuesday, November 2, 2021 1:26 PM
To: Andrew Simmons <akwsi...@gmail.com>
Cc: R Help <r-help@r-project.org>
Subject: Re: [R] Is there a hash data structure for R
Note that an environment carries a hash table with it, while a named list
does not. I think that looking up an entry in a list causes a hash table to
be created and thrown away. Here are some timings involving setting and
getting various numbers of entries in environments and lists. The times are
roughly linear in n for environments and quadratic for lists.
vapply(1e3 * 2 ^ (0:6), f, L=new.env(parent=emptyenv()), FUN.VALUE=NA_real_)
[1] 0.00 0.00 0.00 0.02 0.03 0.06 0.15
vapply(1e3 * 2 ^ (0:6), f, L=list(), FUN.VALUE=NA_real_)
[1] 0.01 0.03 0.15 0.53 2.66 13.66 56.05
f
function(n, L, V = sprintf("V%07d", sample(n, replace=TRUE))) {
    system.time(for(v in V) L[[v]] <- c(L[[v]], v))["elapsed"]
}
Note that environments do not allow an element named "" (the empty string).
Elements named NA_character_ are treated differently in environments and
lists, neither of which is great. You may want your hash table functions to
deal with oddball names explicitly.
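One way to deal with such names explicitly is to route every access through a small set of wrapper functions that validate the key first. The sketch below is only an illustration: check_key(), ht_new(), ht_set(), ht_get() and ht_has() are hypothetical names, not base R functions, and rejecting "" and NA outright is just one possible policy.

```r
# Hypothetical helpers; none of these names exist in base R.

# Reject keys that environments and lists handle inconsistently
check_key <- function(key) {
  if (!is.character(key) || length(key) != 1L || is.na(key) || !nzchar(key))
    stop("key must be a single non-empty, non-NA string")
  key
}

# A hash table is an environment with no parent to inherit from
ht_new <- function() new.env(hash = TRUE, parent = emptyenv())

ht_set <- function(ht, key, value) {
  assign(check_key(key), value, envir = ht)
  invisible(ht)
}

ht_has <- function(ht, key) {
  exists(check_key(key), envir = ht, inherits = FALSE)
}

# Return `default` when the key is absent instead of raising an error
ht_get <- function(ht, key, default = NULL) {
  key <- check_key(key)
  if (exists(key, envir = ht, inherits = FALSE)) {
    get(key, envir = ht, inherits = FALSE)
  } else {
    default
  }
}

ht <- ht_new()
ht_set(ht, "a", 1)
ht_get(ht, "a")                # 1
ht_get(ht, "b", default = 0)   # 0
ht_has(ht, "b")                # FALSE
```

With these wrappers, ht_set(ht, "", 1) and ht_set(ht, NA_character_, 1) both stop with an error rather than behaving differently depending on the underlying container.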
-Bill
On Tue, Nov 2, 2021 at 8:52 AM Andrew Simmons <akwsi...@gmail.com> wrote:
If you're thinking about using environments, I would suggest you
initialize them like
x <- new.env(parent = emptyenv())
Since environments have parent environments, requesting a value from
an environment can actually return a value stored in one of its
parent environments. (This isn't an issue for [[ or $; it is
exclusively an issue with assign, get, and exists.) Or, if you've
already got your values stored in a list that you want to turn into an
environment:
x <- list2env(listOfValues, parent = emptyenv())
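A short sketch of the difference, using "sum" as a name that is certain to exist in the base package (the variable names e_default and e_empty are made up for this example):

```r
e_default <- new.env()                     # parent is the calling environment
e_empty   <- new.env(parent = emptyenv())  # no parent to fall back on

# exists() and get() search parent environments by default (inherits = TRUE)
exists("sum", envir = e_default)                     # TRUE: found via the parents
exists("sum", envir = e_default, inherits = FALSE)   # FALSE: not stored in e_default
exists("sum", envir = e_empty)                       # FALSE

# [[ only looks in the environment itself
is.null(e_default[["sum"]])                          # TRUE

# Converting an existing list, again with an empty parent
x <- list2env(list(a = 1, b = "two"), parent = emptyenv())
sort(ls(x))                                          # "a" "b"
```

If an environment with a real parent is unavoidable, passing inherits = FALSE to get() and exists() restricts the lookup to that environment.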
Hope this helps!
On Tue, Nov 2, 2021, 06:49 Yonghua Peng <y...@pobox.com> wrote:
But for data.frame the colnames can be duplicated. Am I right?
Regards.
On Tue, Nov 2, 2021 at 6:29 PM Jan van der Laan <rh...@eoos.dds.nl>
wrote:
True, but in a lot of cases where a python user might use a dict
an R user will probably use a list; or when we are talking about
arrays of dicts in python, the R solution will probably be a
data.frame (with each dict field in a separate column).
Jan
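For illustration, here is how those two idioms might look; the field names and values are invented for this example.

```r
# One "dict": a named list
person <- list(name = "Ada", age = 36)
person[["age"]]              # 36
person$city <- "London"      # add a field
names(person)                # "name" "age" "city"

# An "array of dicts": one row per record, one column per field
people <- data.frame(
  name = c("Ada", "Grace"),
  age  = c(36, 45),
  stringsAsFactors = FALSE
)

# Lookups become vectorised column operations
people$age[people$name == "Grace"]   # 45
people[people$age > 40, "name"]      # "Grace"
```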
On 02-11-2021 11:18, Eric Berger wrote:
One choice is
new.env(hash=TRUE)
in the base package
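A minimal sketch of using such an environment like a dict, reusing the values from the original question. The parent = emptyenv() argument and the helper add_key() are additions for this example, not part of the suggestion above.

```r
h <- new.env(hash = TRUE, parent = emptyenv())

h[["x"]] <- 1:5
h[["y"]] <- month.name
h[["x"]] <- 3:7          # replaces the earlier value; keys stay unique

sort(ls(h))              # "x" "y"
h[["x"]]                 # 3 4 5 6 7
exists("z", envir = h, inherits = FALSE)   # FALSE

rm(list = "y", envir = h)   # delete a key
length(h)                   # 1

# Environments are modified in place, unlike lists
add_key <- function(env) env[["added"]] <- TRUE   # hypothetical helper
add_key(h)
sort(ls(h))              # "added" "x"
```

The in-place modification is the main behavioural difference from a list: passing h to a function and assigning into it changes the caller's copy as well.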
On Tue, Nov 2, 2021 at 11:48 AM Yonghua Peng <y...@pobox.com> wrote:
I know this is a newbie question. But how do I implement the hash
structure which is available in other languages (in python it's dict)?
I know there is the list, but list's names can be duplicated here.
x <- list(x=1:5,y=month.name,x=3:7)
x
$x
[1] 1 2 3 4 5
$y
 [1] "January"   "February"  "March"     "April"     "May"       "June"
 [7] "July"      "August"    "September" "October"   "November"  "December"
$x
[1] 3 4 5 6 7
Thanks a lot.
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more,
see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.