Something I've just noticed: stringsAsFactors is not an argument to merge().
And, without changing the class, I get a warning:
Warning message:
In `[<-.factor`(`*tmp*`, ri, value = 1:31) :
invalid factor level, NA generated
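For anyone who wants to check the first point themselves, here is a minimal sketch. It only inspects the formal arguments of base R's merge.data.frame(); the names merge_args and has_saf are mine, not from the thread. Because the method has a `...` argument, an unknown argument such as stringsAsFactors appears to be accepted without complaint rather than rejected.

```r
# List the formal arguments of the data frame method of merge().
merge_args <- names(formals(merge.data.frame))

# stringsAsFactors is not among them ...
has_saf <- "stringsAsFactors" %in% merge_args

# ... but `...` is, which is why the extra argument raises no error.
has_dots <- "..." %in% merge_args

print(merge_args)
print(c(stringsAsFactors = has_saf, dots = has_dots))
```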
Rui Barradas
On 23-07-2013 13:36, arun wrote:
I tried this without changing the class, but there was no warning.
#'data.frame': 62 obs. of 4 variables:
# $ d_release: Factor w/ 31 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
# $ m_release: Factor w/ 2 levels "5","6": 1 1 1 1 1 1 1 1 1 1 ...
# $ y_release: Factor w/ 1 level "2004": 1 1 1 1 1 1 1 1 1 1 ...
# $ Freq : num 0 0 0 0 1 1 1 0 0 1 ...
#'data.frame': 31 obs. of 4 variables:
# $ y_temp: int 2004 2004 2004 2004 2004 2004 2004 2004 2004 2004 ...
# $ m_temp: int 5 5 5 5 5 5 5 5 5 5 ...
# $ d_temp: int 1 2 3 4 5 6 7 8 9 10 ...
# $ temp : num 16.9 18 17.4 19.7 105.7 ...
res<-merge(release_freq, temp_h12, by.x=c("y_release","m_release","d_release"),
by.y=c("y_temp","m_temp","d_temp"), stringsAsFactors=FALSE)
# y_release m_release d_release Freq temp
#1 2004 5 1 0 16.9
#2 2004 5 10 1 16.1
#3 2004 5 11 1 15.8
#4 2004 5 12 1 15.1
#5 2004 5 13 0 17.8
#6 2004 5 14 0 17.4
# changing the class
release_freq$d_release <- as.integer(as.character(release_freq$d_release))
release_freq$m_release <- as.integer(as.character(release_freq$m_release))
release_freq$y_release <- as.integer(as.character(release_freq$y_release))
res1<- merge(release_freq, temp_h12, by.x=c("y_release","m_release","d_release"),
by.y=c("y_temp","m_temp","d_temp"), stringsAsFactors=FALSE)
# y_release m_release d_release Freq temp
#1 2004 5 1 0 16.9
#2 2004 5 10 1 16.1
#3 2004 5 11 1 15.8
#4 2004 5 12 1 15.1
#5 2004 5 13 0 17.8
#6 2004 5 14 0 17.4
The results are not identical.
identical(res, res1)
#[1] FALSE
#'data.frame': 31 obs. of 5 variables:
# $ y_release: Factor w/ 1 level "2004": 1 1 1 1 1 1 1 1 1 1 ...
# $ m_release: Factor w/ 2 levels "5","6": 1 1 1 1 1 1 1 1 1 1 ...
# $ d_release: Factor w/ 31 levels "1","2","3","4",..: 1 10 11 12 13 14 15 16 17 18 ...
# $ Freq : num 0 1 1 1 0 0 1 1 0 1 ...
# $ temp : num 16.9 16.1 15.8 15.1 17.8 17.4 16 17.7 17.3 22.3 ...
#'data.frame': 31 obs. of 5 variables:
# $ y_release: int 2004 2004 2004 2004 2004 2004 2004 2004 2004 2004 ...
# $ m_release: int 5 5 5 5 5 5 5 5 5 5 ...
# $ d_release: int 1 10 11 12 13 14 15 16 17 18 ...
# $ Freq : num 0 1 1 1 0 0 1 1 0 1 ...
# $ temp : num 16.9 16.1 15.8 15.1 17.8 17.4 16 17.7 17.3 22.3 ...
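The two str() listings above differ only in the class of the three key columns (factor versus int), and that alone is enough for identical() to return FALSE. A small sketch with made-up data frames (res_factor and res_int are hypothetical stand-ins, not the real res and res1):

```r
# Same values, but the key column is a factor in one and an integer in the other.
res_factor <- data.frame(d_release = factor(c("1", "10")), Freq = c(0, 1))
res_int    <- data.frame(d_release = c(1L, 10L),           Freq = c(0, 1))

# Different column classes, so the objects are not identical.
before <- identical(res_factor, res_int)

# Convert the factor key to integer and compare again.
res_factor$d_release <- as.integer(as.character(res_factor$d_release))
after <- identical(res_factor, res_int)

print(c(before = before, after = after))
```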
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] stringr_0.6.2 reshape2_1.2.2
loaded via a namespace (and not attached):
[1] plyr_1.8
----- Original Message -----
From: Rui Barradas <ruipbarra...@sapo.pt>
To: Stefano Sofia <stefano.so...@regione.marche.it>
Cc: "r-help@r-project.org" <r-help@r-project.org>
Sent: Tuesday, July 23, 2013 6:50 AM
Subject: Re: [R] Some days missing using xtabs
As for your second question, before merge(), try the following.
release_freq$d_release <- as.integer(as.character(release_freq$d_release))
release_freq$m_release <- as.integer(as.character(release_freq$m_release))
release_freq$y_release <- as.integer(as.character(release_freq$y_release))
And the warning is gone.
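A short note on why the conversion goes through as.character(): calling as.integer() directly on a factor returns the internal level codes, not the numbers shown as labels. A minimal sketch with a made-up factor d (not the real d_release column):

```r
# A toy factor whose labels are day numbers stored as text.
d <- factor(c("5", "6", "10"))

# Direct conversion gives the internal level codes (positions in levels(d)).
codes <- as.integer(d)

# Going through character first recovers the numbers that the labels show.
values <- as.integer(as.character(d))

print(levels(d))
print(codes)
print(values)
```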
Hope this helps,
Rui Barradas
On 23-07-2013 10:33, Stefano Sofia wrote:
Dear R-users,
given the following data frame called hospital_2004
gender d_birth m_birth y_birth address d_admittance m_admittance y_admittance
yard_admittance d_release m_release y_release yard_release diaprinc diasec1
diasec2 diasec3 diasec4 diasec5
2 13 12 1929 42002 30 3 2004 3003 6 5 2004 4902 430 4299 51881 4275 78001 0
1 1 8 1935 42002 7 4 2004 2401 18 5 2004 1801 20500 V581 0388 5849 0 0
1 23 12 1956 42018 26 4 2004 2402 31 5 2004 2402 1552 5715 7895 25000 4148 5722
1 9 8 1919 42002 05 5 2004 2602 22 5 2004 4902 51881 4254 4275 0 0 0
2 11 1 1925 52014 30 4 2004 2603 13 6 2004 4902 51881 49121 2732 4275 4299 5849
2 1 3 1963 44060 1 5 2004 5101 16 5 2004 2401 3201 1519 1976 1983 4019 0
1 6 3 1937 45010 6 5 2004 3003 12 5 2004 4901 431 3314 41189 25001 4019 V594
1 3 9 1931 42034 3 5 2004 5101 5 5 2004 5101 78559 4829 5119 1619 4241 585
2 13 9 1912 41007 5 5 2004 4901 7 5 2004 4901 85225 4019 42731 49121 0 0
1 21 10 1936 15146 7 5 2004 4901 10 5 2004 4901 431 430 V594 V595 0 0
2 8 5 1933 43044 8 5 2004 5802 8 6 2004 5802 5712 45620 2851 5119 5184 0
1 25 1 1926 41057 8 5 2004 4901 15 5 2004 4901 431 78001 49121 0 0 0
1 6 1 1923 42002 10 5 2004 1401 11 5 2004 4901 4440 412 4413 0 0 0
1 19 3 1934 42022 9 5 2004 1401 21 6 2004 4901 4413 5609 99811 4019 412 0
1 6 6 1921 43052 15 5 2004 4302 4 6 2004 4302 1890 20280 436 49121 9986 V1005
when I try to evaluate the frequency of daily releases through
release_freq <- as.data.frame(xtabs( ~ d_release + m_release + y_release,
data = hospital_2004))
I get the following result:
d_release m_release y_release Freq
4 5 2004 0
5 5 2004 1
6 5 2004 1
7 5 2004 1
8 5 2004 0
10 5 2004 1
11 5 2004 1
12 5 2004 1
13 5 2004 0
15 5 2004 1
16 5 2004 1
18 5 2004 1
21 5 2004 0
22 5 2004 1
31 5 2004 1
4 6 2004 1
5 6 2004 0
6 6 2004 0
7 6 2004 0
8 6 2004 1
10 6 2004 0
11 6 2004 0
12 6 2004 0
13 6 2004 1
15 6 2004 0
16 6 2004 0
18 6 2004 0
21 6 2004 1
22 6 2004 0
31 6 2004 0
Why are the 1st, 2nd, 3rd, 9th, 14th, 17th, 19th, 20th, and the 23rd to 30th of both
May and June missing? (And why is there a 31st of June?)
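Not an answer taken from the thread, just a sketch of one way to look at this. xtabs() can only cross the values it actually sees, so days with no release never appear, while every observed day is paired with every observed month (hence a 31st of June). Assuming a made-up toy data frame in place of hospital_2004, and a May to June calendar chosen only for illustration, one could build real dates and count them against the full list of days:

```r
# Hypothetical miniature of hospital_2004: only the three release columns.
toy <- data.frame(d_release = c(6L, 18L, 31L, 13L),
                  m_release = c(5L, 5L, 5L, 6L),
                  y_release = 2004L)

# xtabs() crosses the observed days with the observed months.
crossed <- as.data.frame(xtabs(~ d_release + m_release + y_release, data = toy))

# Build proper dates instead of three separate columns.
release_date <- as.Date(sprintf("%d-%02d-%02d",
                                toy$y_release, toy$m_release, toy$d_release))

# Full calendar for the period of interest (an assumption for this example).
all_days <- seq(as.Date("2004-05-01"), as.Date("2004-06-30"), by = "day")

# Count releases per calendar day; days without a release get Freq 0.
daily <- as.data.frame(table(date = factor(as.character(release_date),
                                           levels = as.character(all_days))))

print(crossed)
print(head(daily))
```

With this layout there is one row per real calendar day, so impossible dates cannot occur and empty days are kept with a count of zero.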
And a final question: why is it that, given another data frame called temp_h12
y_temp m_temp d_temp temp
2004 5 1 16.90
2004 5 2 18.00
2004 5 3 17.40
2004 5 4 19.70
2004 5 5 105.70
2004 5 6 17.30
2004 5 7 17.00
2004 5 8 16.20
2004 5 9 16.10
2004 5 10 16.10
2004 5 11 15.80
2004 5 12 15.10
2004 5 13 17.80
2004 5 14 17.40
2004 5 15 16.00
2004 5 16 17.70
2004 5 17 17.30
2004 5 18 22.30
2004 5 19 23.30
2004 5 20 24.30
2004 5 21 19.90
2004 5 22 15.70
2004 5 23 15.80
2004 5 24 17.10
2004 5 25 18.30
2004 5 26 21.00
2004 5 27 18.20
2004 5 28 17.90
2004 5 29 19.40
2004 5 30 22.10
2004 5 31 17.40
merge(release_freq, temp_h12, by.x=c("y_release","m_release","d_release"),
by.y=c("y_temp","m_temp","d_temp"), stringsAsFactors=FALSE)
gives the following warning?
Warning message:
In `[<-.factor`(`*tmp*`, ri, value = 1:31) :
invalid factor level, NAs generated
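One thing worth checking before a merge like this is whether the key columns have the same class on both sides. The sketch below uses made-up miniatures of the two data frames (rf and th are hypothetical names, as is the helper first_class) to show the check and a merge after converting the factor keys to integer. It does not try to reproduce the warning itself.

```r
# Hypothetical miniatures of release_freq and temp_h12.
rf <- data.frame(d_release = factor(c("5", "6")),
                 m_release = factor(c("5", "5")),
                 y_release = factor(c("2004", "2004")),
                 Freq = c(1, 1))
th <- data.frame(y_temp = 2004L, m_temp = 5L, d_temp = 5:7,
                 temp = c(105.7, 17.3, 17.0))

keys_x <- c("y_release", "m_release", "d_release")
keys_y <- c("y_temp", "m_temp", "d_temp")

# Compare the classes of the key columns on each side.
first_class <- function(col) class(col)[1]
class_x <- vapply(rf[keys_x], first_class, character(1))
class_y <- vapply(th[keys_y], first_class, character(1))
print(rbind(x = unname(class_x), y = unname(class_y)))

# Convert the factor keys to integer so both sides have the same type.
rf[keys_x] <- lapply(rf[keys_x], function(col) as.integer(as.character(col)))

merged <- merge(rf, th, by.x = keys_x, by.y = keys_y)
print(merged)
```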
Thank you for your help
Stefano Sofia
R-help@r-project.org mailing list
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.