[R] Statistician Needed

2009-08-10 Thread Noah Silverman
Hello, I've come up against some challenges with my process that are a bit too complicated for the mailing list. Is there anyone out there, preferably a real "statistician", who is willing to consult with me via phone/email for a few hours? I'm happy to pay you for your time. Thanks, -Noah

[R] nominal to numeric function

2009-08-11 Thread Noah Silverman
Hi, I'm training an SVM (C-classification from the e1071 library). Some of the variables in my data set are nominal. Is there some easy/automatic way to convert them to numerical representations? Thanks, -N

Re: [R] nominal to numeric function

2009-08-12 Thread Noah Silverman
On Wed, 12 Aug 2009, Daniel Malter wrote: Hi you can use newvariable=as.numeric(variablename). This converts your factors into numeric variables, but not always with the desired result. So make sure that you check whether "newvariable" gives you what you want. Otherwise recoding by

[R] Nominal variables in SVM?

2009-08-12 Thread Noah Silverman
Hi, The answers to my previous question about nominal variables have led me to a more important question. What is the "best practice" way to feed a nominal variable to an SVM? For example: color = ("red", "blue", "green") I could translate that into an index so I wind up with color = (1,2,3) Bu

Re: [R] Nominal variables in SVM?

2009-08-12 Thread Noah Silverman
PM, Steve Lianoglou wrote: Hi, On Aug 12, 2009, at 2:53 PM, Noah Silverman wrote: Hi, The answers to my previous question about nominal variables have led me to a more important question. What is the "best practice" way to feed a nominal variable to an SVM? For example: color = ("

Re: [R] Nominal variables in SVM?

2009-08-12 Thread Noah Silverman
factor? i.e. foo$color <- factor(foo$color) On 8/12/09 2:21 PM, Achim Zeileis wrote: > On Wed, 12 Aug 2009, Noah Silverman wrote: > >> Hi, >> >> The answers to my previous question about nominal variables have led >> me to a more important question. >> >

Re: [R] Nominal variables in SVM?

2009-08-12 Thread Noah Silverman
ominal factor, but I'm not sure. Can anyone provide an opinion on this? Thanks! -N On 8/12/09 2:21 PM, Achim Zeileis wrote: > On Wed, 12 Aug 2009, Noah Silverman wrote: > >> Hi, >> >> The answers to my previous question about nominal variables have led >>

[R] Help understanding lrm function of Design library

2009-08-18 Thread Noah Silverman
Hi, I'm developing an experiment with logistic regression. I've come across the lrm function in the Design library. While I understand and can use the basic functionality, there are a ton of options that go beyond my knowledge. I've carefully read the help page for lrm, but don't understand

[R] Performance measure for probabilistic predictions

2009-08-19 Thread Noah Silverman
Hello, I'm using an SVM for predicting a model, but I'm most interested in the probability output. This is easy enough to calculate. My challenge is how to measure the relative performance of the SVM for different settings/parameters/etc. An AUC curve comes to mind, but I'm NOT interested

[R] Best performance measure?

2009-08-19 Thread Noah Silverman
Hello, I'm working on a model to predict probabilities. I don't really care about binary prediction accuracy. I do really care about the accuracy of my probability predictions. Frank was nice enough to point me to the val.prob function from the Design library. It looks very promising for my ne

[R] Erros with RVM and LSSVM from kernlab library

2009-08-19 Thread Noah Silverman
Hello, In my ongoing quest to develop a "best" model, I'm testing various forms of SVM to see which is best for my application. I have been using the SVM from the e1071 library without problem for several weeks. Now, I'm interested in RVM and LSSVM to see if I get better performance. When

Re: [R] Best performance measure?

2009-08-19 Thread Noah Silverman
kind of score that measures just the accuracy? Thanks! -N On 8/19/09 10:42 AM, Frank E Harrell Jr wrote: > Noah Silverman wrote: >> Hello, >> >> I'm working on a model to predict probabilities. >> >> I don't really care about binary prediction accuracy. >

Re: [R] Best performance measure?

2009-08-19 Thread Noah Silverman
time. mean(label) On 8/19/09 11:51 AM, Frank E Harrell Jr wrote: > Noah Silverman wrote: >> Thanks for the suggestion. >> >> You explained that Brier combines both accuracy and discrimination >> ability. If I understand you right, that is in relation to binary

Re: [R] Erros with RVM and LSSVM from kernlab library

2009-08-19 Thread Noah Silverman
:50 AM, Steve Lianoglou wrote: > Hi, > > On Aug 19, 2009, at 1:27 PM, Noah Silverman wrote: > >> Hello, >> >> In my ongoing quest to develop a "best" model, I'm testing various >> forms of SVM to see which is best for my application. >> >

Re: [R] Erros with RVM and LSSVM from kernlab library

2009-08-19 Thread Noah Silverman
Steve, That makes sense, except that x is a data.frame with about 70 columns. So I don't see how it would convert to a list. -N On 8/19/09 12:09 PM, Steve Lianoglou wrote: > Howdy, > > On Aug 19, 2009, at 2:54 PM, Noah Silverman wrote: > >> Hi Steve, >> >> N

Re: [R] Best performance measure?

2009-08-19 Thread Noah Silverman
ore > into discrimination and calibration components (which is not in the > software). > > Frank > >> >> i.e. For predicted probabilities of .10 to .20 the data was actually >> labeled true .18 percent of the time. mean(label) >> >> >>

Re: [R] Erros with RVM and LSSVM from kernlab library

2009-08-19 Thread Noah Silverman
Steve, Not sure what to do with this. I have a data.frame. Don't know how to convert it to a list. Does anybody else have any input on this? On 8/19/09 12:17 PM, Steve Lianoglou wrote: >> Steve, >> >> That makes sense, except that x is a data.frame with about 70 >> columns. So I don't see

Re: [R] Erros with RVM and LSSVM from kernlab library

2009-08-19 Thread Noah Silverman
'kernel' " Any suggestions? Thanks! On 8/19/09 3:17 PM, David Winsemius wrote: > > On Aug 19, 2009, at 6:11 PM, Noah Silverman wrote: > >> Steve, >> >> Not sure what to do with this. >> >> I have a data.frame. Don't know how to convert it

Re: [R] Erros with RVM and LSSVM from kernlab library

2009-08-19 Thread Noah Silverman
at 6:36 PM, David Winsemius > wrote: > >> >> On Aug 19, 2009, at 6:30 PM, Noah Silverman wrote: >> >>> Thanks David, >>> >>> Then, do you have any clue why RVM or LSSVM would be generating an >>> error? >> >> No. >>>

[R] Calculating loess value

2009-08-19 Thread Noah Silverman
Hello, I'm attempting to evaluate the accuracy of the probability predictions for my model. As previously discussed here, the AUC is not a good measure as I'm not concerned with classification accuracy but probability accuracy. It was suggested to me that the loess function would be a good m

[R] Calculating loess value

2009-08-20 Thread Noah Silverman
Hello, I'm attempting to evaluate the accuracy of the probability predictions for my model. As previously discussed here, the AUC is not a good measure as I'm not concerned with classification accuracy but probability accuracy. It was suggested to me that the loess function would be a good m

[R] Possible bug with lrm.fit in Design Library

2009-08-20 Thread Noah Silverman
Hi, I've come across a strange error when using the lrm.fit function and the subsequent predict function. The model is created very quickly and can be verified by printing it on the console. Everything looks good. (In fact, the performance measures are rather nice.) Then, I want to use th

[R] Repost - Calculating loess value

2009-08-21 Thread Noah Silverman
Hello, I'm attempting to evaluate the accuracy of the probability predictions for my model. As previously discussed here, the AUC is not a good measure as I'm not concerned with classification accuracy but probability accuracy. It was suggested to me that the loess function would be a good m

[R] Repost - Possible bug with lrm.fit in Design Library

2009-08-21 Thread Noah Silverman
Hi, I've come across a strange error when using the lrm.fit function and the subsequent predict function. The model is created very quickly and can be verified by printing it on the console. Everything looks good. (In fact, the performance measures are rather nice.) Then, I want to use th

Re: [R] Repost - Possible bug with lrm.fit in Design Library

2009-08-21 Thread Noah Silverman
Thanks Marc, My apologies to all for the unnecessary re-posting. -Noah On 8/21/09 9:13 AM, Marc Schwartz wrote: > On Aug 21, 2009, at 11:02 AM, Noah Silverman wrote: > >> Hi, >> >> I've come across a strange error when using the lrm.fit function and >

[R] Question about validating predicted probabilities

2009-08-21 Thread Noah Silverman
Hello, Frank was nice enough to point me to the val.prob function of the Design library. It creates a beautiful graph that really helps me visualize how well my model is predicting probabilities. By default, there are two lines on the graph 1) fitted logistic calibration curve 2) n

Re: [R] Question about validating predicted probabilities

2009-08-21 Thread Noah Silverman
> > require(Design) > dd <- datadist(predprob); options(datadist='dd') > f <- lrm(event ~ rcs(qlogis(predprob), 3)) > plot(f, predprob=NA, fun=plogis) > > Frank > > > Noah Silverman wrote: >> Hello, >> >> Frank was nice enough

[R] Quick explanation of model output

2009-08-21 Thread Noah Silverman
Hi, Been running the lrm model from the Design package. (Thanks Frank!) There are some output columns that I don't quite understand. What is "Wald Z" and then "P" which is 0 for all rows??? --- Coef S.E. Wald Z P Intercept -2.797 0

[R] Trying something for fun...

2009-08-21 Thread Noah Silverman
Hi, For fun, I'm trying to throw some horse racing data into either an svm or lrm model. Curious to see what comes out as there are so many published papers on this. One thing I don't know how to do is to standardize the probabilities by race. For example, if I train an LRM on a bunch of

Re: [R] Trying something for fun...

2009-08-22 Thread Noah Silverman
, when training a clogit is the exact value of the strata saved as part of the model, or is it just used for grouping?) On 8/22/09 10:57 AM, Charles C. Berry wrote: On Fri, 21 Aug 2009, Noah Silverman wrote: Hi, For fun, I'm trying to throw some horse racing data into either an svm or l

Re: [R] Trying something for fun...

2009-08-22 Thread Noah Silverman
utput options. (I can see one that is a probability option.) Thanks!! -N On 8/22/09 10:57 AM, Charles C. Berry wrote: > On Fri, 21 Aug 2009, Noah Silverman wrote: > >> Hi, >> >> For fun, I'm trying to throw some horse racing data into either an >> svm or lrm

Re: [R] Clogit or LRM?

2009-08-25 Thread Noah Silverman
the conditional logit. Chuck's > reference didn't help me much > with that so if you know of others, please let me know. Thanks. > > > > Mark > > > On Aug 25, 2009, *Noah Silve

[R] Select top three values from data frame

2009-08-26 Thread Noah Silverman
Hi, I'm trying to find an easy way to do this. I want to select the top three values of a specific column in a subset of rows in a data.frame. I'll demonstrate. A B C; x 2 1; x 4 1; x 3 2; y 1 5; y 2 6; y 3 8. I want the top 3 values of B from the data.fr

Re: [R] Select top three values from data frame

2009-08-26 Thread Noah Silverman
I only have a few values in my example, but the real data set might have 20-100 rows with A="X". So how do I pick just the three highest ones? -N On 8/26/09 2:46 AM, Ottorino-Luca Pantani wrote: > df.mydata[df.mydata$A=="X" AND df.mydata$C < 2, ] > will do

Re: [R] Select top three values from data frame

2009-08-26 Thread Noah Silverman
s should work - head is quite a useful summary function > > head(df.mydata[df.mydata$A=="X"& df.mydata$C< 2, ],3) > > > Colin. > > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] > On Behalf Of Noah Silve
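
Note that `head()` on its own returns the first rows, not the largest ones, so the subset has to be sorted first. A minimal sketch in base R, assuming a toy data frame `df` that mirrors the example in the question (the name and values are illustrative, not the poster's real data):

```r
# Toy data mirroring the example in the question (illustrative values).
df <- data.frame(A = c("x", "x", "x", "y", "y", "y"),
                 B = c(2, 4, 3, 1, 2, 3),
                 C = c(1, 1, 2, 5, 6, 8))

# Keep only the rows of interest.
sub <- df[df$A == "x", ]

# Sort by B, largest first, then take the first three rows.
top3 <- head(sub[order(sub$B, decreasing = TRUE), ], 3)
top3$B
```

With 20-100 rows where A is "x", the same two steps apply unchanged; only the size of `sub` differs.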

[R] Managing output

2009-08-26 Thread Noah Silverman
Hi, Is there a way to build up a vector, item by item? In perl, we can "push" an item onto an array. How can we do this in R? I have a loop that generates values as it goes. I want to end up with a vector of all the loop results. In perl it would be: for(item in list){ result <- 2

Re: [R] Managing output

2009-08-26 Thread Noah Silverman
y/lapply/mapply family of functions? > > In general, the "for" loop construct can be avoided so you don't have to > think about messy indexing. What exactly are you trying to do? > > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-bou

Re: [R] Managing output

2009-08-26 Thread Noah Silverman
tatistical Computing Facility > Department of Statistics > UC Berkeley > spec...@stat.berkeley.edu > > > On Wed, 26 Aug 2009, Noah Silverman wrote: > >> The actual process is REALLY complicated, I just gave a simple example >> for the
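
The options raised in this thread can be sketched side by side. The loop body in the original question is cut off, so the per-item computation `2 * item` below is only an assumed stand-in, and `items` is a hypothetical input vector:

```r
items <- c(1, 2, 3, 4)  # hypothetical input

# Option 1: let sapply collect one result per item.
result_sapply <- sapply(items, function(item) 2 * item)

# Option 2: preallocate the result, then fill it by index inside the loop.
result_loop <- numeric(length(items))
for (i in seq_along(items)) {
  result_loop[i] <- 2 * items[i]
}

# Option 3: the closest analogue of Perl's push. It is simple, but the
# vector is copied on every iteration, which gets slow for long loops.
result_push <- c()
for (item in items) {
  result_push <- c(result_push, 2 * item)
}
```

All three produce the same vector; preallocation or `sapply` is usually the better choice once the loop runs many thousands of times.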

Re: [R] Submit a R job to a server

2009-08-26 Thread Noah Silverman
Deb, I generally run my larger R tasks on a server. Here is my workflow. 1) Write an R script using a text editor. (There are many popular ones.) 2) FTP the R script to your server. 3) SSH into the server 4) Run R 5) Run the script that you uploaded from the R process you just started. On 8/2

[R] Sapply

2009-08-30 Thread Noah Silverman
Hi, I need a bit of guidance with the sapply function. I've read the help page, but am still a bit unsure how to use it. I have a large data frame with about 100 columns and 30,000 rows. One of the columns is "group" of which there are about 2,000 distinct "groups". I want to normalize (s

[R] SVM coefficients

2009-08-30 Thread Noah Silverman
Hello, I'm using the svm function from the e1071 package. It works well and gives me nice results. I'm very curious to see the actual coefficients calculated for each input variable. (Other packages, like RapidMiner, show you this automatically.) I've tried looking at attributes for the mo

Re: [R] SVM coefficients

2009-08-31 Thread Noah Silverman
" for each of the 80 variables. -- Noah On 8/30/09 7:47 PM, Steve Lianoglou wrote: Hi, On Sun, Aug 30, 2009 at 6:10 PM, Noah Silverman wrote: Hello, I'm using the svm function from the e1071 package. It works well and gives me nice results. I'm very curious to see the actua

Re: [R] SVM coefficients

2009-08-31 Thread Noah Silverman
relative weight (significance) the SVM assigned to each variable. On 8/31/09 12:54 AM, Achim Zeileis wrote: > On Mon, 31 Aug 2009, Noah Silverman wrote: > >> Steve, >> >> That doesn't work. >> >> I just trained an SVM with 80 variables. >> svm_mod

[R] Probit function

2009-08-31 Thread Noah Silverman
Hello, I want to start testing using the MNP probit function instead of the lrm function in my current experiment. I have one dependent label and two independent variables. The lrm is simple: model <- lrm(label ~ val1 + val2) I tried the same thing with the mnp function and got an error tha

Re: [R] Probit function

2009-08-31 Thread Noah Silverman
my thought was that they were using the same for this application. Any thoughts? -- Noah On 8/31/09 5:07 PM, Achim Zeileis wrote: > On Mon, 31 Aug 2009, Noah Silverman wrote: > >> Hello, >> >> I want to start testing using the MNP probit function instead of the

Re: [R] Probit function

2009-08-31 Thread Noah Silverman
I get that. Still trying to figure out what the "multi" nominal labels they used were. That's why I passed on the reference to the seminar summary. On 8/31/09 5:40 PM, Achim Zeileis wrote: > On Mon, 31 Aug 2009, Noah Silverman wrote: > >> Thanks Achim, >>

Re: [R] Probit function

2009-08-31 Thread Noah Silverman
Since the Bolton and Chapman application didn't really have multiple discrete choices, I'm not sure how the probit model would. Hence my inquiry. On 8/31/09 6:23 PM, Achim Zeileis wrote: > On Mon, 31 Aug 2009, Noah Silverman wrote: > >> I get that. >> >> St

Re: [R] Probit function

2009-08-31 Thread Noah Silverman
ken and they are in fact predicting rank, would you please show me where that is in their paper. Thanks! -N On 8/31/09 7:17 PM, Achim Zeileis wrote: > On Mon, 31 Aug 2009, Noah Silverman wrote: > >> Um. I did my research. Have been for years. I assume you're >> referring to

Re: [R] [OT] book on Linux scripting

2009-09-02 Thread Noah Silverman
Erin, Linux supports many scripting languages. Which language are you interested in: Perl, PHP, Bash, Python, etc??? -- Noah On 9/2/09 10:35 PM, Erin Hodgess wrote: > Dear R People: > > I know that this is off topic, but could anyone recommend a good book > on Linux scripting please? > > Any he

Re: [R] [OT] book on Linux scripting

2009-09-03 Thread Noah Silverman
nking along the lines of sed or awk, please. On Thu, Sep 3, 2009 at 1:56 AM, Noah Silverman wrote: Erin, Linux supports many scripting languages. Which language are you interested in: Perl, PHP, Bash, Python, etc??? -- Noah On 9/2/09 10:35 PM, Erin Hodgess wrote: Dear R People: I kn

[R] Easy way to get top 2 items from vector

2009-09-03 Thread Noah Silverman
Hi, I use the max function often to find the top value from a matrix or column of a data.frame. Now I'm looking to find the top 2 (or three) values from my data. I know that I could sort the list and then access the first two items, but that seems like the "long way". Is there some way to a

Re: [R] Easy way to get top 2 items from vector

2009-09-03 Thread Noah Silverman
Statistical Computing Facility > Department of Statistics > UC Berkeley > spec...@stat.berkeley.edu > > > On Thu, 3 Sep 2009, Noah Silverman wrote: > >> Hi, >> >> I use the max function often to find

[R] Confused - better empirical results with error in data

2009-09-07 Thread Noah Silverman
Hi, I have a strange one for the group. We have a system that predicts probabilities using a fairly standard svm (e1071). We are looking at probabilities of a binary outcome. The input data is generated by a perl script that calculates a bunch of things, fetches data from a database, etc.

Re: [R] Confused - better empirical results with error in data

2009-09-07 Thread Noah Silverman
data = 6.9 2) Run with "bad" data missing = 5.5 3) Run with "correct" data = ?? (We're running now, will take a few hours to compute.) I might also try to plot the bad data. It would be interesting to see what shape it has... On 9/7/09 1:05 PM, Mark Knecht wro

Re: [R] Confused - better empirical results with error in data

2009-09-07 Thread Noah Silverman
data = 6.9 2) Run with "bad" data missing = 5.5 3) Run with "correct" data = ?? (We're running now, will take a few hours to compute.) I might also try to plot the bad data. It would be interesting to see what shape it has... On 9/7/09 1:05 PM, Mark Knecht wrote:

Re: [R] Confused - better empirical results with error in data

2009-09-07 Thread Noah Silverman
Just for fun, I'll see if I can schedule a few hours to run the same experiment with the training data order reversed. If I'm correct, the results should be the same. Thanks! -- N On 9/7/09 2:34 PM, Mark Knecht wrote: > On Mon, Sep 7, 2009 at 1:22 PM, Noah Silverman > wrot

Re: [R] Moving to Mac OS X

2009-09-11 Thread Noah Silverman
Hi, I'm a daily user of both Mac and Linux so wanted to offer some thoughts: 1) R runs great on a Mac. There is a standard install from the CRAN website that has a nice GUI built into it. You can do things like drag files to the console and it will fill in the path name. 2) I like using B

Re: [R] Moving to Mac OS X

2009-09-11 Thread Noah Silverman
Steve, You make a good point. I confused 64 bit with a multi-core setup. That said, I don't believe the pretty packaged up GUI has a 64 bit version, just the "raw terminal" version does. On 9/11/09 12:38 PM, Steve Lianoglou wrote: Hi, On Sep 11, 2009, at 3:08 PM, Noah Sil

Re: [R] Moving to Mac OS X

2009-09-11 Thread Noah Silverman
Thanks Steve, That's a big help. On 9/11/09 12:48 PM, Steve Lianoglou wrote: Hi, On Sep 11, 2009, at 3:40 PM, Noah Silverman wrote: Steve, You make a good point. I confused 64 bit with a multi-core setup. That said, I don't believe the pretty packaged up GUI has a 64 bit ver

[R] R on Multi Core

2009-09-11 Thread Noah Silverman
Hi, Our discussions about 64 bit R have led me to another thought. I have a nice dual core 3.0 chip inside my Linux Box (Running Fedora 11.) Is there a version of R that would take advantage of BOTH cores?? (Watching my system performance meter now is interesting, Running R will hold a single

[R] Alternative to Scale Function?

2009-09-11 Thread Noah Silverman
Hi, Is there an alternative to the scale function where I can specify my own mean and standard deviation? I've come across an interesting issue where this would help. I'm training and testing on completely different sets of data. The testing set is smaller than the training set. Using the

Re: [R] Alternative to Scale Function?

2009-09-11 Thread Noah Silverman
sure that a value is transformed the same regardless of which data set it is in. Do I have this correct, or can anybody contribute any more to the concept? Thanks! -- Noah On 9/11/09 1:10 PM, Noah Silverman wrote: Hi, Is there an alternative to the scale function where I can specify my own

Re: [R] Alternative to Scale Function?

2009-09-11 Thread Noah Silverman
Genius, That certainly is much faster than what I had worked out on my own. I looked at sweep, but couldn't understand the rather thin help page. Your example makes it really clear. Thank You!!! -- Noah On 9/11/09 1:57 PM, Gavin Simpson wrote: > On Fri, 2009-09-11 at 13:10 -07
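
The reply in this thread is cut off, so the following is not Gavin Simpson's code. It is a minimal base R sketch of the idea discussed: compute the centering and scaling values on the training set, then apply those same values to the test set. The data frames `train` and `test` are simulated stand-ins.

```r
set.seed(1)
# Simulated stand-ins for the real training and testing sets.
train <- data.frame(a = rnorm(100, 10, 2), b = rnorm(100, 50, 5))
test  <- data.frame(a = rnorm(20, 10, 2),  b = rnorm(20, 50, 5))

# Per-column mean and standard deviation, taken from the training data only.
train_mean <- colMeans(train)
train_sd   <- apply(train, 2, sd)

# scale() accepts user-supplied center and scale vectors (one value per column),
# so a given raw value is transformed identically in both data sets.
train_scaled <- scale(train, center = train_mean, scale = train_sd)
test_scaled  <- scale(test,  center = train_mean, scale = train_sd)
```

The same result can be had with two `sweep()` calls (subtract the means, then divide by the standard deviations); passing the vectors to `scale()` is just the shorter spelling.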

[R] Strange question/result about SVM

2009-09-14 Thread Noah Silverman
Hello, I have a very unusual situation with an SVM and wanted to get the group's opinion. We developed an experiment where we train the SVM with one set of data (train data) and then test with a completely independent set of data (test data). The results were VERY good. I found an error

Re: [R] Strange question/result about SVM

2009-09-14 Thread Noah Silverman
inal Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Noah Silverman Sent: Monday, September 14, 2009 1:00 PM To: r help Subject: [R] Strange question/result about SVM Hello, I have a very unusual situation with an SVM and wanted to get the gro

[R] Grouped Logistic (Or conditional Logistic.)

2009-09-17 Thread Noah Silverman
Hi, I'm not sure of the correct nomenclature or function for what I'm trying to do. I'm interested in calculating a logistic regression on a binary dependent variable (True,False). There are a few ways to easily do this in R. Both SVM and GLM work easily. The part that I want to add is "gr

Re: [R] Grouped Logistic (Or conditional Logistic.)

2009-09-17 Thread Noah Silverman
as you work through them would have to be adjusted to look "per group". I would call this something like "grouped maximum likelihood" if I got to make up the name. -N On 9/17/09 11:06 AM, (Ted Harding) wrote: On 17-Sep-09 17:28:16, Noah Silverman wrote: Hi, I'

[R] Pull Coefficients from MCMCpack models

2009-09-21 Thread Noah Silverman
Hi, I've been testing some models with the MCMCpack library. I can run the process and get a nice model "object". I can easily see the summary and even plot it. I can't seem to figure out how to: 1) Access the final coefficients in the model 2) Turn the coefficients into a model so I can the

Re: [R] Pull Coefficients from MCMCpack models

2009-09-21 Thread Noah Silverman
21/09 7:58 PM, Debabrata Midya wrote: > Try this: > apply(foo, 2, mean) or > apply(foo, 2, median) > Thanks, > Deb > > >>> Noah Silverman 22/09/2009 12:34 pm >>> > Hi, > > I've been testing some models with the MCMCpack library. > > I can

[R] lapply with data frame

2010-02-27 Thread Noah Silverman
I'm a bit confused on how to use lapply with a data.frame. For example. lapply(data, function(x) print(x)) WHAT exactly is passed to the function? Is it each ROW in the data frame, one by one, or each column, or the entire frame in one shot? What I want to do is apply a function to each row in

[R] lapply with data frame

2010-02-28 Thread Noah Silverman
I'm a bit confused on how to use lapply with a data.frame. For example. lapply(data, function(x) print(x)) WHAT exactly is passed to the function? Is it each ROW in the data frame, one by one, or each column, or the entire frame in one shot? What I want to do is apply a function to each row in
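
A data frame is a list of columns, so `lapply` hands the function one column at a time. A small sketch with a made-up data frame `df` shows the difference between iterating over columns and over rows:

```r
df <- data.frame(x = 1:3, y = c(10, 20, 30))  # made-up example data

# lapply treats the data frame as a list of columns: one call per column.
col_sums <- lapply(df, sum)

# apply with MARGIN = 1 calls the function once per row. The data frame is
# converted to a matrix first, so columns of mixed type are coerced to one type.
row_sums <- apply(df, 1, sum)

# Another row-wise option that keeps each row as a one-row data frame.
row_sums2 <- sapply(seq_len(nrow(df)), function(i) sum(df[i, ]))
```

For data frames with mixed column types, the index-based form avoids the matrix coercion that `apply` performs.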

[R] Strange behavior with poisosn and glm

2010-03-02 Thread Noah Silverman
Hi, I'm just learning about Poisson links for the glm function. One of the data sets I'm playing with has several of the variables as factors (i.e. month, group, etc.) When I call the glm function with a formula that has a factor variable, R automatically converts the variable to a series of

Re: [R] Strange behavior with poisosn and glm

2010-03-02 Thread Noah Silverman
37 From my understanding, the exp of the prediction should be equal to the fitted value. Here it is not. I don't understand why. Any insight? -N On 3/2/10 12:47 AM, (Ted Harding) wrote: On 02-Mar-10 08:02:27, Noah Silverman wrote: Hi, I'm just learning about Poisson links for the g
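
The relationship described here can be checked directly on simulated data. This is only a sketch under assumed inputs (the data frame `d` and its columns are invented), not a diagnosis of the poster's model: with the default log link, `exp()` of the link-scale prediction should match both the response-scale prediction and `fitted()` for the same rows.

```r
set.seed(42)
# Invented data: a factor predictor, a numeric predictor, and Poisson counts.
d <- data.frame(month = factor(rep(c("jan", "feb", "mar"), each = 20)),
                x = runif(60))
d$count <- rpois(60, lambda = exp(0.5 + 0.8 * d$x))

# Factor columns are expanded into dummy variables automatically.
fit <- glm(count ~ month + x, data = d, family = poisson)

link_pred <- predict(fit, type = "link")      # linear predictor (log scale)
resp_pred <- predict(fit, type = "response")  # expected counts

# Both comparisons should report TRUE.
all.equal(exp(link_pred), resp_pred)
all.equal(unname(resp_pred), unname(fitted(fit)))
```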

[R] logistic regression by group?

2010-03-03 Thread Noah Silverman
Hi, Looking for a function in R that can help me calculate a parameter that maximizes the likelihood over groups of observations. The general formula is: p = exp(xb) / sum(exp(xb)) So, according to the formulas I've seen published, to do this "by group" is product(p = exp(x_i * b_i) / sum(exp(

Re: [R] logistic regression by group?

2010-03-04 Thread Noah Silverman
Corey, Thanks for the quick reply. I can't give any sample code as I don't know how to code this in R. That's why I tried to pass along some pseudo code. I'm looking for the best "beta" that maximizes likelihood over all the groups. So, while your suggestion is close, it isn't quite what I need.

[R] Factor variables with GAM models

2010-03-19 Thread Noah Silverman
I'm just starting to learn about GAM models. When using the lm function in R, any factors I have in my data set are automatically converted into a series of binomial variables. For example, if I have a data.frame with a column named color and values "red", "green", "blue". The lm function a

Re: [R] Factor variables with GAM models

2010-03-19 Thread Noah Silverman
rg [r-help-boun...@r-project.org] On Behalf Of Noah Silverman [n...@smartmediacorp.com] Sent: March 19, 2010 12:54 PM To: r-help@r-project.org Subject: [R] Factor variables with GAM models I'm just starting to learn about GAM models. When using the lm function in R, any factors I have in my d

[R] Help with Conditional Logit

2009-07-16 Thread Noah Silverman
Hello, I'm brand new to using R. (I've been using Rapid Miner, but would like to move over to R since it gives me much more functionality.) I'm trying to learn how to do a conditional logit model. My data has one dependent variable, 2 independent variables and a "group" variable. example: c

Re: [R] Help with Conditional Logit

2009-07-16 Thread Noah Silverman
e which field is the "group ID" for the subset grouping? 3) How do I indicate which field is the label? 4) How do I indicate which fields in my dataset are for training? Thanks!!! On 7/16/09 12:54 PM, (Ted Harding) wrote: > On 16-Jul-09 19:40:20, Noah Silverman wrote: >

[R] svm works but tune.svm give error

2009-07-18 Thread Noah Silverman
Hello, I'm using the e1071 library for SVM functions. I can quickly train an SVM with: svm(formula = label ~ ., data = testdata) That works well. I want to tune the parameters, so I tried: tune.svm(label ~ ., data=testdata[1:2000, ], gamma=10^(-6:3), cost=10^(1:2)) THIS FAILS WITH AN ERROR: '

[R] Normalize data

2009-07-20 Thread Noah Silverman
Hello, I'm coming from RapidMiner, so some of the "easy" things there are a bit difficult for me to find in R. How do I normalize data in a data frame? Ideally I want to scale the values for each column to the range (-1,1). Thank You,
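
One way to do this in base R is a min-max rescaling applied column by column. The helper name `rescale_pm1` and the data frame `df` below are made up for illustration:

```r
# Hypothetical helper: linearly map a numeric vector onto [-1, 1].
rescale_pm1 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  2 * (x - rng[1]) / (rng[2] - rng[1]) - 1
}

df <- data.frame(a = c(1, 5, 9), b = c(100, 150, 300))  # made-up data

# Apply the helper to every column and rebuild the data frame.
scaled <- as.data.frame(lapply(df, rescale_pm1))
```

A column whose values are all identical has a zero range and comes out as NaN, so such columns need separate handling.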

[R] Strange Memory issue

2009-07-27 Thread Noah Silverman
Hi, I am testing out some things with the kernlab library. The dataframe is 22,000 rows of 32 columns. The command I execute is: model <- ksvm(label ~ ., data = traindata, type="C-svc", kernel = "rbfdot", class.weights= c("0" =1, "1" =3), kpar = "automatic", C = 10, cross = 3, prob.model = T

[R] Forumla format?

2009-07-27 Thread Noah Silverman
Hi, Quick question. I'm working on training an SVM. I have a dataframe with about 50 columns. I want to train on 46 of them. Is there a way to say "All except columns 22,23,25 and 31"? It would be nice to not have to do +c1 +c2 +c3 +c4, etc for all 48 columns. Thanks! -N

Re: [R] Forumla format?

2009-07-27 Thread Noah Silverman
Hi, I'm not sure that would work for the "formula" format of an SVM function. the idea is normally svm(label ~ c1 + c2 +c3, data=mydata); It doesn't work to say svm(label ~ -c(22,23,24), data=mydata) On 7/27/09 12:17 PM, Steve Lianoglou wrote: > Hi, > > On J
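
Column positions cannot go inside the formula, but they can go in the `data` argument: drop the unwanted columns there and use `label ~ .`. The sketch below uses `lm` as a stand-in so that it runs without extra packages; the assumption is that `svm` from e1071 takes the same `formula` and `data` arguments. The data frame `mydata` is simulated.

```r
set.seed(7)
# Simulated stand-in: a label column plus nine predictors c1..c9.
mydata <- as.data.frame(matrix(rnorm(200), ncol = 10))
names(mydata) <- c("label", paste0("c", 1:9))

# Positions of the columns to leave out (here columns 3 and 5, i.e. c2 and c4).
drop_cols <- c(3, 5)

# "label ~ ." means "label against every other column in data",
# so removing columns from data removes them from the model.
fit <- lm(label ~ ., data = mydata[, -drop_cols])
names(coef(fit))
```

Excluding by name inside the formula also works, for example `label ~ . - c2 - c4`.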

[R] Watching tune parameters for SVM?

2009-07-28 Thread Noah Silverman
Hi, I'm switching over from RapidMiner to R. (The learning curve is steep, but there is so much more I can do with R and it runs much faster overall.) In RapidMiner, I can "tune" a parameter of my svm in a nice cross validation loop. The process will print out the progress as it goes. So for a

[R] Watching tune parameters for SVM?

2009-07-29 Thread Noah Silverman
Hi, I'm switching over from RapidMiner to R. (The learning curve is steep, but there is so much more I can do with R and it runs much faster overall.) In RapidMiner, I can "tune" a parameter of my svm in a nice cross validation loop. The process will print out the progress as it goes. So for

[R] scale subset of data

2009-07-31 Thread Noah Silverman
Hi, This should be an easy one, but I have some trouble formatting the data right. I'm trying to replace the column of a subset of a dataframe with the scaled data for that column of the subset: subset(rawdata, code== "foo", select = a) <- scale( subset(rawdata, code== "foo", select = a) ) It

Re: [R] scale subset of data

2009-07-31 Thread Noah Silverman
That works perfectly. Thanks! -N On 7/31/09 2:04 PM, Steve Lianoglou wrote: > Hi, > > On Jul 31, 2009, at 4:13 PM, Noah Silverman wrote: > >> Hi, >> >> This should be an easy one, but I have some trouble formatting the data >> right >> >> I

[R] scale subsets of grouped data in data frame

2009-07-31 Thread Noah Silverman
Hello, I'm trying to duplicate what's an easy process in RapidMiner. In RM, we can simply use two operators: subgroup iteration attribute value selection (Can use a regex for the attribute name.) I can do this in R with a lot of code and manual steps. It would be really nice to find

[R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
Hi, I am reading in a dataframe from a CSV file. It has 70 columns. I do not have any kind of unique "row id". rawdata <- read.table("r_work/train_data.csv", header=T, sep=",", na.strings=0) When training an svm, I keep getting an error. So, as an experiment, I wrote the data back out to a

Re: [R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
'row.names=FALSE' in the write.table. > > On Sun, Aug 2, 2009 at 5:10 PM, Noah Silverman > wrote: > >> Hi, >> >> I am reading in a dataframe from a CSV file. It has 70 columns. I do >> not have any kind of unique "row id". >> >>

Re: [R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
can see > what it is doing. Most likely you have a format problem, comment > characters, or mismatched quotes. > > On Sun, Aug 2, 2009 at 5:24 PM, Noah Silverman > wrote: > >> Jim, >> >> The "write.table" was simply a diagnostic step. >>

Re: [R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
Somehow, my data is still getting mangled. Running the SVM gives me the following error: "names" attribute[1994] must be the same length as the vector[1950] Any thoughts? -N On 8/2/09 2:35 PM, (Ted Harding) wrote: > On 02-Aug-09 21:10:12, Noah Silverman wrote: > >>

Re: [R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
he data after the scale command. But, issuing the same 0 substitution AFTER the scale command makes everything work again. rawdata[is.na(rawdata)] <- 0 VERY strange behavior. -N On 8/2/09 3:57 PM, J Dougherty wrote: > On Sunday 02 August 2009 02:34:43 pm Noah Silverman wrote: >

Re: [R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
ve realized that's a "bad thing", so am trying to learn R. Additionally, R seems MUCH MUCH faster.) I'm open to ideas. Thanks! -N On 8/2/09 4:14 PM, David Winsemius wrote: > > On Aug 2, 2009, at 7:02 PM, Noah Silverman wrote: > >> Hi, >> >> I

Re: [R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
Just tried your suggestion. rawdata[is.na(rawdata), ] <- 0 It FAILS with the following error: Error in `[<-.data.frame`(`*tmp*`, is.na(rawdata), , value = 0) : non-existent rows not allowed
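
The difference between the two forms comes down to the comma. `is.na(rawdata)` is a logical matrix with one entry per cell; used without a comma it selects individual cells, while with a comma it is treated as a row index, which is where the error comes from. A small sketch on a made-up data frame:

```r
rawdata <- data.frame(a = c(1, NA, 3), b = c(NA, 5, 6))  # made-up data

# No comma: the logical matrix picks out each NA cell and replaces it with 0.
rawdata[is.na(rawdata)] <- 0
rawdata
```

Whether 0 is a sensible substitute for a missing value is a separate modelling question; the sketch only shows the indexing.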

Re: [R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
14 AM, David Winsemius wrote: > >> >> On Aug 2, 2009, at 7:02 PM, Noah Silverman wrote: >> >>> Hi, >>> >>> It seems as if the problem was caused by an odd quirk of the "scale" >>> function. >>> >>> Some of my da

[R] Scale set of 0 values returns NAN??

2009-08-02 Thread Noah Silverman
Hi, More questions in my ongoing quest to convert from RapidMiner to R. One thing has become VERY CLEAR: None of the issues I'm asking about here are addressed in RapidMiner. How it handles missing values, scaling, etc. is hidden within the "black box". Using R is forcing me to take a much

Re: [R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
Hi, Thanks for the continued support. I've been working on this all night, and have learned some things: 1) Since I'm really committed to using an SVM, I need to skip the examples with missing data. I have a training set of approximately 22,000 examples of which about 500 have missing values
