Re: [R] Best practice: to factor or not to factor for float variables

2014-07-04 Thread Sebastian Schubert
Hi Hadley, actually, I started with floating point numbers, ensured that the respective numbers are equal in R but I still got strange behaviour with dplyr's group_by: https://github.com/hadley/dplyr/issues/482 If I had to guess, I would suppose the source of this error somewhere in the C++ part

[R] Calling Matrices from a Function

2014-07-04 Thread Cheryl Johnson
When I call matrices from a function, they are called but are not recognized as matrices. I use the code below. From the main function, when I use the command is.matrix(), the response is FALSE. Why are the matrices not recognized as matrices? Thanks in advance for any guidance. #This function cre

Re: [R] Transform a data.frame with "; " sep column and another one in a a new one with the same two column but with repetitions

2014-07-04 Thread John McKown
On Fri, Jul 4, 2014 at 7:50 AM, João Azevedo Patrício wrote: > Hi, > > I've been trying to solve this issue but with no success. > > I have some data like this: > > 1 > TC WC > 2 > 0 Instruments & Instrumentation; Nuclear Science & Technology; > Physics, Particles & Fields; Spectroscopy > 3 > 0

[R] Rugarch package: arfimaspec and arfimafit

2014-07-04 Thread Anne-Marie B.
Hello, My question is related to the rugarch package and its arfimafit function. I am trying to fit an arfima(7,d,1) with an exogenous variable. Here is a part of my code: spec1<- arfimaspec(mean.model = list(armaOrder = c(7, 1), include.mean = FALSE,arfima = TRUE, ext

Re: [R] Display a dataframe

2014-07-04 Thread arun
You can use: print(dd1, row.names=F)     # Chisq DF   Pr(>Chisq) term    153.0216306  1 7.578366e-35 # Sex    13.3696538  1 5.114571e-04 # Volume      0.8476713  1 7.144239e-01 # Weight      1.2196050  1 5.388764e-01 # Intensity    2.6349405  1 2.090719e-01 # ISO     6.0507714  1 2
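A minimal sketch of the row.names argument used above, on a small made-up data frame (the name dd and its values are placeholders, not the poster's data). Spelling out FALSE rather than F is a little safer, since F is an ordinary variable that can be reassigned.

```r
# Hypothetical data frame standing in for the poster's table
dd <- data.frame(term  = c("Sex", "Volume"),
                 Chisq = c(153.02, 13.37),
                 DF    = c(1, 1))

# Default printing shows the automatic row names 1, 2 in the left margin
print(dd)

# row.names = FALSE suppresses them; capture.output() keeps the printed
# lines as a character vector so they can be inspected or reused
out <- capture.output(print(dd, row.names = FALSE))
cat(out, sep = "\n")
```

The data frame itself still has row names; only the printed display changes.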

Re: [R] Transform a data.frame with "; " sep column and another one in a a new one with the same two column but with repetitions

2014-07-04 Thread arun
Hi, Try: dat1 <- read.table(text="'1 > TC' 'WC' '2 > 0'  'Instruments & Instrumentation; Nuclear Science & Technology;Physics, Particles & Fields; Spectroscopy' '3 > 0' 'Nanoscience & Nanotechnology; Materials Science,Multidisciplinary; Physics, Applied' '4 > 2'    'Physics, Nuclear; Physics, P

Re: [R] Transform a data.frame with "; " sep column and another one in a a new one with the same two column but with repetitions

2014-07-04 Thread João Azevedo Patrício
Em 04-07-2014 15:15, arun escreveu: Hi, Try: dat1 <- read.table(text="'1 > TC' 'WC' '2 > 0' 'Instruments & Instrumentation; Nuclear Science & Technology;Physics, Particles & Fields; Spectroscopy' '3 > 0' 'Nanoscience & Nanotechnology; Materials Science,Multidisciplinary; Physics, Applied' '4

Re: [R] Display a dataframe

2014-07-04 Thread David Winsemius
On Jul 4, 2014, at 7:27 AM, Gang Chen wrote: I really appreciate your kind help! This is exactly what I was looking for except that I need to get rid of the numbered row names. Look at the documentation: ?print.data.frame You cannot "get rid of" rownames in dataframes (at least as far as I know)

Re: [R] how does a valid subscript can produce an "subscript out of bounds" error?

2014-07-04 Thread Duncan Murdoch
On 04/07/2014, 6:35 PM, Witold E Wolski wrote: > how does a valid subscript (see first 2 lines) can produce an > "subscript out of bounds" error (see line 4)? > > > 1> sum(!rownames(msexp$rt) %in% msexp$pepinfo$transition_group_id) > [1] 0 > 2> sum(!msexp$pepinfo$transition_group_id %in% rownames

Re: [R] how does a valid subscript can produce an "subscript out of bounds" error?

2014-07-04 Thread Duncan Murdoch
On 04/07/2014, 6:35 PM, Witold E Wolski wrote: > how does a valid subscript (see first 2 lines) can produce an > "subscript out of bounds" error (see line 4)? > > > 1> sum(!rownames(msexp$rt) %in% msexp$pepinfo$transition_group_id) > [1] 0 > 2> sum(!msexp$pepinfo$transition_group_id %in% rownames

[R] how does a valid subscript can produce an "subscript out of bounds" error?

2014-07-04 Thread Witold E Wolski
how does a valid subscript (see first 2 lines) can produce an "subscript out of bounds" error (see line 4)? 1> sum(!rownames(msexp$rt) %in% msexp$pepinfo$transition_group_id) [1] 0 2> sum(!msexp$pepinfo$transition_group_id %in% rownames(msexp$rt)) [1] 0 3> class(msexp$rt) [1] "matrix" 4> msexp$rt

[R] data.table merge question...

2014-07-04 Thread Witold E Wolski
Actually, the question is about differences in behaviour between Windows and Linux. On Linux the 2 lines of code produce all TRUE; on Windows the result looks "heterogeneous"... Using merge.data.frame produces TRUE on all platforms ... > msexp$pepinfo = > data.frame(merge(tt,msexp$pepinfo,by="transit

[R] Training and testing on Unbalanced Data Set

2014-07-04 Thread Vijay goel
I used the SMOTE algorithm in R for class balancing. My data has 13000 rows, with a 7% minority class in my sample. I then used SMOTE (Synthetic Minority Oversampling Technique) for class balancing, raising the ratio of the minority class to 42%, and the number of rows in the data sample becomes 12655

Re: [R] Best practice: to factor or not to factor for float variables

2014-07-04 Thread David Winsemius
Keep as numeric and group with cut(), Hmisc::cut2, or findInterval. The beauty of the functional language design is that you do not need to create a new factor variable. -- David Sent from my iPhone > On Jul 4, 2014, at 8:33 AM, Hadley Wickham wrote: > > Why not just round the floating poin
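As a rough illustration of grouping a numeric variable on the fly, without storing a factor column, here is a sketch with base R's findInterval() and cut(). The vectors height and value and the break points are invented for the example; Hmisc::cut2 is left out so the sketch needs no extra package.

```r
# Made-up measurements: heights stay numeric throughout
height <- c(0.05, 0.12, 0.48, 0.51, 0.97)
value  <- c(10, 12, 20, 22, 30)
breaks <- seq(0, 1, by = 0.25)

# findInterval() returns the index of the bin each height falls into
bin <- findInterval(height, breaks)

# cut() builds the grouping inline, so no factor variable has to be
# added to the data; intervals are (lower, upper] by default
m <- tapply(value, cut(height, breaks), mean)
print(bin)
print(m)
```

The grouping exists only for the duration of the tapply() call, which is the point made above about not needing a new factor variable.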

Re: [R] Best practice: to factor or not to factor for float variables

2014-07-04 Thread Hadley Wickham
Why not just round the floating point numbers to ensure they're equal with zapsmall, round or signif? Hadley On Fri, Jul 4, 2014 at 4:04 AM, Sebastian Schubert wrote: > Hi, > > I would like to ask for best practice advice on the design of data > structure and the connected analysis techniques. >
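A small sketch of the rounding suggestion, using the classic 0.1 + 0.2 case. The choice of 10 digits is an arbitrary assumption for the example; the right precision depends on the data.

```r
# Two values that print identically but differ in the last bits
x <- c(0.1 + 0.2, 0.3)
x[1] == x[2]          # FALSE: exact comparison sees two distinct doubles

# Rounding to a fixed number of digits maps both to the same double
xr <- round(x, 10)
xr[1] == xr[2]        # TRUE

# Grouping on the rounded values now yields a single group
n_raw     <- length(unique(x))
n_rounded <- length(unique(xr))
```

signif() works the same way when the values span several orders of magnitude and significant digits are a better fit than decimal places.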

Re: [R] Display a dataframe

2014-07-04 Thread Gang Chen
I really appreciate your kind help! This is exactly what I was looking for except that I need to get rid of the numbered row names. On July 3, 2014 9:57:00 PM EDT, arun wrote: >Hi, >May be this helps: >nC <- max(nchar(row.names(dd))) > term <- formatC(row.names(dd), width=-nC) >#or > term <- sprintf("%-1

Re: [R] Best practice: to factor or not to factor for float variables

2014-07-04 Thread PIKAL Petr
Hi I would keep height as numeric and create height.f as a factor, maybe ordered. > hh<-runif(50) > hh [1] 0.116060220 0.447546370 0.433749570 0.006548963 0.425710667 0.328972894 [7] 0.091274539 0.271797166 0.007669982 0.208922146 0.168174196 0.227466231 ... hh.f<-cut(hh, seq(0,1,.1)) > hh.f [1

Re: [R] Display a dataframe

2014-07-04 Thread Gang Chen
Perfect! Thanks a lot! On July 3, 2014 5:10:02 PM EDT, David L Carlson wrote: >Not elegant, but it works: > >> term <- dimnames(dd)[[1]] >> dd1 <- dd >> dimnames(dd1)[[1]] <- rep("", 6) >> dd2 <- capture.output(dd1) >> cat(paste(dd2, " ", c("Term", term)), fill=48) > # Chisq DF Pr(>Chisq)

Re: [R] applying operations within() a matrix's environment

2014-07-04 Thread Steve Bellan
No, I’m sorry. That’s a mistake. I should have written: row.active <- matrix(rbinom(K*steps, 1, .7), nr = K, nc = steps) == 1 so that this is a logical indexing vector, not just pulling out the first element. To avoid yielding NaNs (though that doesn’t really break the question), I’ve

[R] Best practice: to factor or not to factor for float variables

2014-07-04 Thread Sebastian Schubert
Hi, I would like to ask for best practice advice on the design of data structure and the connected analysis techniques. In my particular case, I have measurements of several variables at several, sometimes equal, heights. Following the tidy data approach of Hadley Wickham, I want to put all data

[R] Transform a data.frame with "; " sep column and another one in a a new one with the same two column but with repetitions

2014-07-04 Thread João Azevedo Patrício
Hi, I've been trying to solve this issue but with no success. I have some data like this: 1 > TC WC 2 > 0 Instruments & Instrumentation; Nuclear Science & Technology; Physics, Particles & Fields; Spectroscopy 3 > 0 Nanoscience & Nanotechnology; Materials Science, Multidisciplinary; Phys

Re: [R] applying operations within() a matrix's environment

2014-07-04 Thread PIKAL Petr
Hi Well, that is better. This is what is expected result<-cbind(v1,v2,v3) I am not sure if such function solves your problem, as you have 50 variables. fff<-function(v1, v2, v3, logic) { res1<-v1 res2<-v2 res3<-v3 res1[logic] <- v1[logic]*2 res2[logic] <- 3*v3[logic]*res1[logic] res3[logic] <-

[R] How to extract convergence code from lmer object?

2014-07-04 Thread Juan Andres Hernandez
Does anyone know how to extract the convergence code of an lmer object? I am working on a Monte Carlo simulation with mixed models and I need to know whether or not a model has converged. With unclass(mymodel) the following information attr(,"optinfo")$conv$lme4 can be seen. How can I get this importan

[R] How to apply data sets in Vennerable?

2014-07-04 Thread gktahon
Hi all, I finally managed to get Vennerable working with the help I got on the forum. However, I'm faced with my next challenge now. The easy way to work with Vennerable works just fine, that is, if I enter code like this: Vdemo2 <- Venn(SetNames = c("foo", "bar"), Weight = c(`01`= 7, '11' = 8, '1

Re: [R] error:max not meaningful for factors

2014-07-04 Thread Jim Lemon
On Thu, 3 Jul 2014 12:34:42 PM Marta valdes lopez wrote: > Thank you Jim for your answer.Ok alpha( it is the speed of the boat) is a > range of number from 0.5 to 10 like 0.5,1,1.5,2, I would like to have > the mean of x and y base on each value of alpha, because I have like ten > numbers of

Re: [R] sammon fails with duplicates error, but no duplicates there (MASS package)

2014-07-04 Thread Martin Guetlein
Hi Frede, awesome, thanks a lot. Helped me to understand how sammon is working as well. Martin On 4 July 2014 10:02, Frede Aakmann Tøgersen wrote: > Hi > > It seems to be related to the way that the default start values (y argument > of sammon) are calculated. > > Here the last two rows are th

Re: [R] Fisher Scoring v/s Coordinate Descent for MLE in R

2014-07-04 Thread peter dalgaard
There are books on this, can't repeat them here... Roughly speaking, Fisher Scoring is quadratically convergent, hence requires much fewer iterations than gradient descent methods which are generally only linear, and sometimes very slowly so (in highly collinear cases, usually). I.e., it is a m
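To make the Fisher scoring idea concrete, here is a hedged sketch for logistic regression on simulated data. The data, seed, starting values and stopping tolerance are all assumptions for illustration. For the canonical logit link, Fisher scoring coincides with Newton-Raphson, and glm() fits the same model by iteratively reweighted least squares, so its coefficients serve as a check.

```r
set.seed(1)
n <- 200
x <- rnorm(n)
X <- cbind(1, x)                          # design matrix with intercept
y <- rbinom(n, 1, plogis(-0.5 + 1.2 * x)) # simulated binary response

beta <- c(0, 0)                           # arbitrary starting values
for (i in 1:25) {
  p     <- as.vector(plogis(X %*% beta))  # fitted probabilities
  score <- t(X) %*% (y - p)               # gradient of the log-likelihood
  info  <- t(X) %*% (X * (p * (1 - p)))   # expected (Fisher) information
  step  <- solve(info, score)             # scoring update direction
  beta  <- beta + as.vector(step)
  if (max(abs(step)) < 1e-10) break       # stop once updates are negligible
}

fit <- glm(y ~ x, family = binomial)      # reference fit
print(rbind(fisher = beta, glm = coef(fit)))
print(i)                                  # iterations actually used
```

Each step solves a small linear system with the information matrix, which is what buys the fast convergence compared with plain gradient steps.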

Re: [R] error:max not meaningful for factors

2014-07-04 Thread PIKAL Petr
Hi Not much helpful. Now we know what you ***think*** alpha is but not what it really is. You shall post at least result of str(your.objects) I also wonder why do you populate slots in perf manually and with data.frames instead of lists which are required according to documentation. Regards

Re: [R] Dataframes and text identifier columns

2014-07-04 Thread PIKAL Petr
Hi. Well, Case is probably factor, which is basically numeric vector with labels. It is useful for some operations but it can have some features which lead to this behaviour. I do not have available your exact code but I presume you use c or cbind somewhere. > Case<-factor(letters[1:4]) > Case
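A brief sketch of the point that a factor is stored as integer codes with labels, and of how cbind() can expose those codes. The factor values here are made up and are not the original poster's data.

```r
# Levels are sorted alphabetically: a, b, c, d
Case <- factor(c("b", "a", "d", "c"))

# The underlying storage is integer codes indexing into levels(Case)
codes <- as.integer(Case)

# cbind() on a factor and a vector builds a numeric matrix, so the
# labels are dropped and only the codes remain in the Case column
m <- cbind(Case, n = 1:4)
print(m)

# Converting explicitly to character keeps the text identifiers
labels <- as.character(Case)
```

If the text labels matter, converting with as.character() before combining, or keeping the columns in a data.frame, avoids the silent switch to codes.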

Re: [R] sammon fails with duplicates error, but no duplicates there (MASS package)

2014-07-04 Thread Frede Aakmann Tøgersen
Hi It seems to be related to the way that the default start values (y argument of sammon) are calculated. Here the last two rows are the same: > cmdscale(dist(data), 2) [,1] [,2] c1 2.04910556 -0.3627887 c2 -0.01889892 -0.1822057 c3 0.40767629 0.2599026 c4 0.81569304 -0

Re: [R] applying operations within() a matrix's environment

2014-07-04 Thread PIKAL Petr
Hi Are you 100% sure that you always want to select only the first item from v1, v2 and v3, change it in each step of the cycle, and keep only the last value from your cycle in the first item of vectors v1-3? Because this is what your cycle does. Petr From: r-help-boun...@r-project.org [mailto:r-help-boun...@r

[R] sammon fails with duplicates error, but no duplicates there (MASS package)

2014-07-04 Thread Martin Guetlein
Hi all, the sammon mapping fails with message "initial configuration has duplicates". But there are no duplicates in my data (see example below). Apparently, the problem is that row 9 and 10 have an equal distance to all other rows (but they are not equal, see last two columns). Any help to get s