
Indeed, norm should be in the same group as the months. Everything works 
fine when the dataset is small, but with big datasets (15 000 values) 
things go wrong and I can't explain why: melt puts norm in an individual 
column instead of in the group of months, as it does when the dataset is 
small.
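
A possible way to look into this, sketched in base R. It assumes (this is
not confirmed anywhere in the thread) that the full dataset contains at
least one entry in norm that does not parse as a number, which would make
the whole column character on import. The toy data frame and the "n/a"
entry are hypothetical.

```r
# Toy stand-in for the real data (hypothetical values): a single stray
# text entry is enough to make the whole column character.
dataset <- data.frame(
  januari = c(38.1, 32.4, 34.5, 20.7),
  norm    = c("45.87", "24.05", "n/a", "38.59"),
  stringsAsFactors = FALSE
)

# Class of every column; melt() guesses that character and factor
# columns are id variables when id.vars is not given.
col_classes <- sapply(dataset, class)
print(col_classes)

# Rows of norm that are present but do not parse as a number
norm_num <- suppressWarnings(as.numeric(dataset$norm))
bad_rows <- which(is.na(norm_num) & !is.na(dataset$norm))
print(dataset$norm[bad_rows])
```

If bad_rows comes back empty on the real data, the column is probably just
stored as text and can be converted with as.numeric().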

From: PIKAL Petr
To: Joachim Audenaert
Date:   16/04/2015 13:41
Subject:        RE: [R]  melt function chooses wrong id variable with 
large datasets

With this dataset I get
> dd.m0<-melt(dataset, na.rm=T)
Using norm as id variables
> head(dd.m0)
                norm variable value
1   45.8713463281901  januari  38.1
2 24.047250681782984  januari  32.4
3 3.7533684144746324  januari  34.5
4 38.594241119279324  januari  20.7
5 26.391897460120358  januari  21.5
6 61.746470001194638  januari  23.1
> dd.m<-melt(dataset, id.vars=NULL, na.rm=T)
> head(dd.m)
  variable value
1  januari  38.1
2  januari  32.4
3  januari  34.5
4  januari  20.7
5  januari  21.5
6  januari  23.1
> tail(dd.m)
    variable              value
255     norm  4.856812959269508
256     norm 5.3982910143166514
257     norm 46.553976273304215
258     norm 17.566272518985429
259     norm 20.552451905814117
260     norm 61.894775704479279
The latter puts norm into the same column as the months. Is that intended?
Maybe you want
> dd.m1<-melt(dataset[,-13], na.rm=T)
No id variables; using all as measure variables
> head(dd.m1)
  variable value
1  januari  38.1
2  januari  32.4
3  januari  34.5
4  januari  20.7
5  januari  21.5
6  januari  23.1
> tail(dd.m1)
    variable value
235 december  20.7
236 december  30.9
237 december  36.2
238 december  21.0
239 december  20.2
240 december  21.3
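
A further option, sketched under assumptions: in the dput output quoted
below, norm is stored as character strings, and melt treats character and
factor columns as id variables when id.vars is not given. With
id.vars=NULL the value column also appears to have been coerced to
character (see the tail output above). Converting norm first and naming
the measure columns explicitly avoids both. The toy data frame is a
hypothetical stand-in for the real data.

```r
library(reshape2)

# Toy stand-in (hypothetical norm values) with norm stored as character,
# as in the dput() output
dataset <- data.frame(
  januari  = c(38.1, 32.4, 34.5),
  februari = c(31.5, 36.2, 38.2),
  norm     = c("45.87", "24.05", "3.75"),
  stringsAsFactors = FALSE
)

# Convert norm to numeric so that every column can be a measure variable
dataset$norm <- as.numeric(dataset$norm)

# Name the measure columns explicitly instead of relying on melt's guess
dd <- melt(dataset, measure.vars = names(dataset), na.rm = TRUE)
str(dd)
```

Here value stays numeric, which matters for any test applied afterwards.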
From: Joachim Audenaert [] 
Sent: Thursday, April 16, 2015 1:13 PM
To: PIKAL Petr
Subject: RE: [R] melt function chooses wrong id variable with large 

This is a part of my dataset: 

structure(list(januari = c(38.1, 32.4, 34.5, 20.7, 21.5, 23.1, 
29.7, 36.6, 36.1, 20.6, 20.4, 30.1, 38.7, 41.4, 37, 36, 37, 38, 
23, 26.7), februari = c(31.5, 36.2, 38.2, 26.4, 20.9, 21.5, 30.2, 
33.4, 32.6, 22.2, 21.7, 30, 35.7, 32.8, 39.3, 25.5, 23, 19.9, 
21.3, 20.8), maart = c(34.2, 27, 24.2, 19.9, 19.7, 21.5, 30.6, 
30, 19, 19.6, 20.6, 23.6, 17.9, 17.3, 21.4, 24.1, 20.9, 30.1, 
32.6, 21.3), april = c(26.3, 29.6, 30.3, 23.6, 28.4, 20.7, 24.1, 
27.3, 23.2, 18.3, 24.6, 27.4, 20.4, 18.1, 25.2, 19.8, 21, 23.7, 
19.6, 18.1), mei = c(23.7, 24, 17.2, 23.2, 25.2, 17.2, 16, 15.6, 
13.4, 16, 16.8, 14.6, 19.4, 21, 19.5, 18.5, 13.3, 13.7, 14.3, 
14.1), juni = c(17.7, 14.2, 16.6, 15.7, 13.7, 14.7, 13.1, 12.9, 
15.4, 11.9, 15.2, 15.3, 16.5, 16.1, 11.7, 11.2, 11.5, 10.8, 16.1, 
14.8), juli = c(15.7, 14.5, 10.8, 10.5, 13.4, 12.2, 13.2, 13, 
12.4, 13.1, 9.8, 10.5, 13.4, 11, 13.1, 15, 16.7, 16.1, 18.2, 
15.7), augustus = c(12.9, 12.8, 15.2, 14.5, 17.2, 14.5, 14.4, 
11, 13.1, 13.6, 14.6, 12.7, 13.6, 12.7, 15.5, 17.4, 15.2, 14.2, 
17.7, 19.2), september = c(15.6, 15.5, 15.9, 15.1, 16, 19.4, 
21.5, 23.7, 18.7, 23.8, 18, 16.2, 18.5, 20.6, 18.3, 22.5, 26.9, 
19.4, 15.9, 20.5), oktober = c(21.4, 20.8, 14, 17, 23, 26.4, 
19.6, 22.7, 26.9, 14.7, 15.2, 19.8, 26.9, 20.2, 14.3, 14.8, 18.5, 
21.7, 21.4, 21.8), november = c(24.7, 26.2, 29, 21.6, 17.1, 16.9, 
19.1, 24.7, 25.4, 19.8, 18.2, 16.3, 17, 17.7, 15.5, 14.7, 15.8, 
19.9, 20.4, 23.3), december = c(19.8, 27, 21, 33, 22.6, 28.3, 
21.1, 19, 17.3, 27, 30.2, 24.8, 17.9, 17.9, 20.7, 30.9, 36.2, 
21, 20.2, 21.3), norm = c("45.8713463281901", "24.047250681782984", 
"3.7533684144746324", "38.594241119279324", "26.391897460120358", 
"61.746470001194638", "6.8321020448487992", "11.933109250115226", 
"51.951891096493924", "37.424611852237945", "5.1587836676942374", 
"36.552835044409434", "31.781209673851027", "29.09146215582853", 
"4.856812959269508", "5.3982910143166514", "46.553976273304215", 
"17.566272518985429", "20.552451905814117", "61.894775704479279"
)), .Names = c("januari", "februari", "maart", "april", "mei", 
"juni", "juli", "augustus", "september", "oktober", "november", 
"december", "norm"), row.names = c(NA, 20L), class = "data.frame") 

I transform my dataset with the following script: 

y <- melt(dataset,na.rm=TRUE) 
variable <- y[,1] 
value <- y[,2] 

and can then perform a levene test as follows: 

LEVENE <- leveneTest(value~variable,y) 
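
For reference, leveneTest comes from the car package and centers on the
group median by default. A minimal base R sketch of the same idea (a
one-way ANOVA on absolute deviations from the group medians) is given
here; it is an illustration, not the car implementation, and the long
data frame below is a toy built from the first few values of the data.

```r
# Toy long-format data, shaped like melt() output
y <- data.frame(
  variable = factor(rep(c("januari", "februari", "maart"), each = 5)),
  value    = c(38.1, 32.4, 34.5, 20.7, 21.5,
               31.5, 36.2, 38.2, 26.4, 20.9,
               34.2, 27.0, 24.2, 19.9, 19.7)
)

# Absolute deviation of each value from the median of its group
group_median <- ave(y$value, y$variable, FUN = median)
abs_dev <- abs(y$value - group_median)

# One-way ANOVA on the deviations; the F test for the grouping factor
# is the median-centred Levene (Brown-Forsythe) test
levene_by_hand <- anova(lm(abs_dev ~ y$variable))
print(levene_by_hand)
```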

When the dataset is small, let's say less than 100 values per column, 
everything works great. I get the message: 

No id variables; using all as measure variables 

When the dataset is much bigger I get the following message instead: 

Using norm as id variables 

Why does this function pick norm as the id variable, and how can I tell R 
that each column title is my variable? 

From:        PIKAL Petr <> 
To:        Joachim Audenaert <>, "" <> 
Date:        16/04/2015 12:13 
Subject:        RE: [R]  melt function chooses wrong id variable with 
large datasets 


There is something weird with your data and melt function.

AFAIK melt does not use the first row as id variables.

What is the result of


Instead of

melt(dataset,id.vars=dataset[1,], na.rm=TRUE)

melt expects something like

melt(dataset, id.vars=c("norm", "jaar"), na.rm=TRUE)

If you want a more specific answer, you should show us part of your data, 
preferably by copying the output of


into your mail.


> -----Original Message-----
> From: R-help [] On Behalf Of Joachim
> Audenaert
> Sent: Thursday, April 16, 2015 11:37 AM
> To:
> Subject: [R] melt function chooses wrong id variable with large
> datasets
> Hello all,
> I'm using a large dataset consisting of 2 groups of data, 2 columns in
> Excel with a header (group name) and 15 000 rows of data. I would like
> to compare this data, so I transform my dataset with the melt
> function to get 1 column of data and 1 column of ID variables, then I
> can apply different statistical tests. With small datasets this works
> great, the melt function automatically chooses the name in row 1 as ID
> variable and melts the data, thus giving me a matrix with all ID
> variables in column one and the data accordingly in column 2.
> With this big dataset, however, it chooses the whole first column as ID
> variables instead of the first row. Is there a reason why this happens
> and how can I make sure the first row is chosen as ID variable and the
> lower rows as data?
> If I specify that I want the first row to be the id variable I also get
> an error.
> melt(dataset,id.vars=dataset[1,], na.rm=TRUE)
> Error: id variables not found in data: norm, jaar
> Are there alternative ways to create a good reshaped dataset?

