[R] Further Subsetting of Data for Log. Reg. Results in qr.default Error

2016-11-23 Thread Courtney Benjamin
Hello R Experts,

In further subsetting data within a logistic regression in the survey package, 
I am getting a qr.default error.

Below is an example of the original model that runs correctly, but when I 
subset the data further to look at students of a particular curriculum 
concentration, the qr.default error occurs.  I thought it may have been related 
to converting the F1RTRCC variable into a factor for use in the original model; 
I went back and restored that variable to its original form and it didn't help.

Any guidance is greatly appreciated.


library(RCurl)
library(survey)

data <- getURL("https://raw.githubusercontent.com/cbenjamin1821/careertech-ed/master/elsq1adj.csv")
# stringsAsFactors = TRUE keeps the predictors as factors (the default before R 4.0),
# which the relevel() calls below require
elsq1ch <- read.csv(text = data, stringsAsFactors = TRUE)

#Specifying the svrepdesign object which applies the BRR weights
elsq1ch_brr <- svrepdesign(variables = elsq1ch[,1:16], repweights = elsq1ch[,18:217],
                           weights = elsq1ch[,17], combined.weights = TRUE, type = "BRR")
elsq1ch_brr

##Resetting baseline levels for predictors
elsq1ch_brr <- update( elsq1ch_brr , F1HIMATH = relevel(F1HIMATH,"PreAlg or Less") )
elsq1ch_brr <- update( elsq1ch_brr , BYINCOME = relevel(BYINCOME,"0-25K") )
elsq1ch_brr <- update( elsq1ch_brr , F1RACE = relevel(F1RACE,"White") )
elsq1ch_brr <- update( elsq1ch_brr , F1SEX = relevel(F1SEX,"Male") )
elsq1ch_brr <- update( elsq1ch_brr , F1RTRCC = relevel(F1RTRCC,"Academic") )

#Log. Reg. model-all curric. concentrations including F1RTRCC as a predictor
allCC <- svyglm(formula = F3ATTAINB ~ F1PARED + BYINCOME + F1RACE + F1SEX + F1RGPP2 + F1HIMATH + F1RTRCC,
                family = "binomial", design = elsq1ch_brr,
                subset = BYSCTRL == 1 & G10COHRT == 1, na.action = na.omit)
summary(allCC)

##CTE Log. Reg. model that is resulting in the qr.default error
CTE <- svyglm(formula = F3ATTAINB ~ F1PARED + BYINCOME + F1RACE + F1SEX + F1RGPP2 + F1HIMATH,
              family = "binomial", design = elsq1ch_brr,
              subset = BYSCTRL == 1 & G10COHRT == 1 & F1RTRCC == "Academic", na.action = na.omit)
summary(CTE)
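
A quick check that may help narrow this down, sketched on made-up data rather than the ELS file above: after a further subset, a factor level can end up with no rows, which leaves an all-zero column in the model matrix and makes it rank-deficient. Whether that is what triggers the qr.default error here is an assumption. The toy data frame `d` and its columns `y`, `income` and `track` are hypothetical.

```r
# Toy stand-in for the survey data; all names and values are hypothetical
d <- data.frame(
  y      = c(1, 0, 1, 0, 1, 0, 1, 0),
  income = factor(c("0-25K", "25-50K", "0-25K", "25-50K",
                    "50K+", "50K+", "0-25K", "25-50K")),
  track  = factor(c("Academic", "Academic", "Academic", "Academic",
                    "CTE", "CTE", "Academic", "Academic"))
)

# Further subset, analogous to adding F1RTRCC == "Academic"
sub <- d[d$track == "Academic", ]

# Which factor levels have no rows left in the subset?
level_counts <- table(sub$income)
empty_levels <- names(level_counts)[level_counts == 0]
print(empty_levels)

# An empty level produces an all-zero dummy column, so the rank drops
mm <- model.matrix(y ~ income, data = sub)
rank_deficient <- qr(mm)$rank < ncol(mm)
print(rank_deficient)

# Dropping unused levels removes the all-zero column
sub2 <- droplevels(sub)
mm2 <- model.matrix(y ~ income, data = sub2)
print(qr(mm2)$rank == ncol(mm2))
```

Running the same `table()` check on each factor predictor within the `BYSCTRL==1 & G10COHRT==1 & F1RTRCC=="Academic"` rows would show whether any level is empty there.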



Courtney Benjamin

Broome-Tioga BOCES

Automotive Technology II Teacher

Located at Gault Toyota

Doctoral Candidate-Educational Theory & Practice

State University of New York at Binghamton

cbenj...@btboces.org

607-763-8633

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to extract parameters from train in caret package

2016-11-23 Thread Michela Leone
Dear R-users,

I need to extract the coefficients from the final model after running the
train function of the caret package.

I need something like the "summary" output shown after fitting a regression
model, in this case for the best model.

My script is below, as explanation.

library(caret)
library(mlbench)   # provides the Sonar data set

data(Sonar)
set.seed(107)
inTrain <- createDataPartition(y = Sonar$Class, p = .75, list = FALSE)

training <- Sonar[ inTrain,]
testing  <- Sonar[-inTrain,]

plsFit <- train(Class ~ .,
                data = training,
                method = "pls",
                tuneLength = 15,
                preProc = c("center", "scale"))

What I need is a summary which looks like


Call:
glm(formula = Class ~ ., family = "binomial", data = training)

Deviance Residuals:
       Min          1Q      Median          3Q         Max
-2.418e-05  -2.110e-08  -2.110e-08   2.110e-08   2.697e-05

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)  1.081e+02  2.668e+05   0.000    1.000
V1           7.235e+02  4.390e+06   0.000    1.000
V2          -1.024e+03  2.282e+06   0.000    1.000
V3           1.113e+03  2.735e+06   0.000    1.000
V4          -7.072e+02  1.828e+06   0.000    1.000
V5          -1.542e+01  1.534e+06   0.000    1.000
V6          -1.983e+02  1.438e+06   0.000    1.000
V7           2.543e+02  1.465e+06   0.000    1.000
V8           3.507e+02  8.199e+05   0.000    1.000
V9          -4.010e+02  8.517e+05   0.000    1.000
V10          4.317e+01  9.660e+05   0.000    1.000
V11          1.364e+02  1.800e+06   0.000    1.000
V12         -4.606e+02  2.350e+06   0.000    1.000
V13          1.746e+02  1.246e+06   0.000    1.000


    Null deviance: 2.1688e+02  on 156  degrees of freedom
Residual deviance: 1.0630e-08  on  96  degrees of freedom
AIC: 122

Number of Fisher Scoring iterations: 25



Many thanks for your help!

Michela Leone
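
A minimal sketch of one way to reach the fitted model behind a `train` object, assuming the caret, pls and mlbench packages are installed. The `finalModel` and `bestTune` components are part of the object returned by `train`; the variable names `final` and `pls_coefs` are only for illustration. A PLS fit does not, as far as I know, come with a glm-style table of standard errors and p-values, so the output will not look exactly like the summary above.

```r
library(caret)
library(mlbench)   # provides the Sonar data set

data(Sonar)
set.seed(107)
inTrain  <- createDataPartition(y = Sonar$Class, p = .75, list = FALSE)
training <- Sonar[inTrain, ]

plsFit <- train(Class ~ .,
                data = training,
                method = "pls",
                tuneLength = 15,
                preProc = c("center", "scale"))

# Tuning value(s) selected by resampling
print(plsFit$bestTune)

# The model refit on the full training set with the selected tuning value(s)
final <- plsFit$finalModel
print(class(final))

# Regression coefficients of the final PLS model (on the pre-processed scale)
pls_coefs <- coef(final)
print(dim(pls_coefs))
```

If a glm-style table is what is wanted, fitting with `method = "glm"` and calling `summary()` on the resulting `finalModel` may be closer to the output shown above.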



Re: [R] The code itself disappears after starting to execute the for loop

2016-11-23 Thread Bert Gunter
In addition to Jim's comments, which you have not yet satisfactorily
addressed (buffering in GUI??),

1. Show your code!

2. Show output of sessionInfo()

3. Upgrade to the latest R version maybe

4. Perhaps write to package maintainer (see ?maintainer) if nothing or
no one helps.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Nov 22, 2016 at 10:05 AM, Maram SAlem  wrote:
> Thanks for helping Jim.
>
> I'm actually using the pbapply function together with the print function 
> within a loop. In earlier versions, the progress bar and the output of the 
> print function used to appear after each iteration of the loop. But with the 
> 3.3.1. Version nothing appears, instead the console turns white and the 
> cursor turns blue ( busy) and I know nothing about the progress of the 
> running code.
>
> I just want to see the bar and the output of the print function as I used to, 
> any help?
>
> Thanks in advance.
> Maram Salem
>
>
>
> Sent from my iPhone
>
>> On Nov 3, 2016, at 8:30 PM, jim holtman  wrote:
>>
>> A little more information would help.  How exactly are you creating the 
>> output to the console?  Are you using 'print', 'cat' or something else?  Do 
>> you have buffered output checked on the GUI? You probably don't want it 
>> checked, or your output will be delayed till the buffer is full; this might 
>> be the cause of your problem.
>>
>>
>> Jim Holtman
>> Data Munger Guru
>>
>> What is the problem that you are trying to solve?
>> Tell me what you want to do, not how you want to do it.
>>
>>> On Thu, Nov 3, 2016 at 1:55 PM, Maram SAlem  
>>> wrote:
>>> Hi all,
>>>
>>> I've a question concerning the R 3.3.1 version. I have a long code that I 
>>> used to run on versions earlier to the 3.3.1 version, and when I copied the 
>>> code to the R console, I can still see the code while the loop is executing 
>>> , along with the output printed after each iteration of the loop.
>>>
>>> Now, on the 3.3.1 version, after I copy the code to the console, it 
>>> disappears and I only see the printed output of only one iteration at a 
>>> time, that is, after the first iteration the printed output disappears ( 
>>> though it's only 6 lines, just giving me some guidance, not a long output).
>>> This is causing me some problems, so I don't know if there is a general 
>>> option for R that enables me to still see the code and the output of all 
>>> the iterations till the loop is over, as was the case with earlier R 
>>> versions.
>>>
>>> I didn't include the code as it's a long one.
>>>
>>> Thanks a lot in advance,
>>>
>>> Maram
>>>
>>>
>>> Sent from my iPhone


Re: [R] The code itself disappears after starting to execute the for loop

2016-11-23 Thread Maram SAlem
Thanks a lot, Bert. I will check out your suggestions.

I've unchecked the buffered output option in the GUI but still have the same problem.

Thanks for your time and concern.

Maram Salem 

Sent from my iPhone
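
One more thing that may be worth trying, sketched with a plain loop instead of the actual code (which was not posted): base R's `flush.console()` asks the console to display pending output immediately. Whether it helps with pbapply's progress bar in this setup is an assumption. The loop body and the names `n_iter` and `shown` are hypothetical.

```r
# Print progress from inside a loop and push it to the console right away
n_iter <- 3
shown <- character(0)

for (i in seq_len(n_iter)) {
  Sys.sleep(0.1)                      # stand-in for the real work
  msg <- sprintf("iteration %d of %d done", i, n_iter)
  cat(msg, "\n")
  flush.console()                     # request that buffered output be shown now
  shown <- c(shown, msg)              # keep a record of what was printed
}
```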



[R] variable selection problem

2016-11-23 Thread Anderson Eduardo
Hello

I am trying to run the vignette example for the MaxentVariableSelection
package, but something is going wrong and I can't figure out what.

Here is the code:

maxentPath = ("/home/anderson/R/x86_64-pc-linux-gnu-library/3.3/dismo/java/maxent.jar")
gridfolder <- ("/home/anderson/Downloads/BioOracle_9090RV")
occurrencelocations <- system.file("extdata", "Occurrencedata.csv", package="MaxentVariableSelection")
backgroundlocations <- system.file("extdata", "Backgrounddata.csv", package="MaxentVariableSelection")
additionalargs="nolinear noquadratic noproduct nothreshold noautofeature"
contributionthreshold <- 5
correlationthreshold <- 0.9
betamultiplier=seq(2,6,0.5)

VariableSelection(maxent,
  outdir,
  gridfolder,
  occurrencelocations,
  backgroundlocations,
  additionalargs,
  contributionthreshold,
  correlationthreshold,
  betamultiplier
  )

And the error message:

> VariableSelection(maxent,
+   outdir,
+   gridfolder,
+   occurrencelocations,
+   backgroundlocations,
+   additionalargs,
+   contributionthreshold,
+   correlationthreshold,
+   betamultiplier
+   )
---
Choosing betamultiplier  2

 Number of remaining variables 4
Testing variable contributions...
Calculating average AUC values from 10 maxent models...
arguments 'show.output.on.console', 'minimized' and 'invisible' are for
Windows only
Error in as.vector(x, "character") :
  cannot coerce type 'closure' to vector of type 'character'


I have not found the solution in blogs and online forums, and I would
earnestly ask for the help of the members of this forum.

Thanks in advance.

Anderson A. Eduardo
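
A possible reading of the error, offered as an assumption and not a confirmed diagnosis: the script defines `maxentPath`, but the call passes `maxent`, and no `outdir` is defined in the posted code. If `maxent` resolves to a function (dismo exports one with that name), R is asked to turn a function into a character string, which gives this kind of message. The sketch below reproduces the message in base R; `fake_maxent` is a hypothetical stand-in and the path is a placeholder.

```r
# Hypothetical stand-in for a function that shares its name with the intended variable
fake_maxent <- function(...) NULL
maxentPath  <- "/path/to/maxent.jar"   # placeholder, not a real location

# Coercing a function (a "closure") to character fails
msg <- tryCatch(
  as.vector(fake_maxent, "character"),
  error = function(e) conditionMessage(e)
)
print(msg)

# Coercing the path string works as expected
path_arg <- as.vector(maxentPath, "character")
print(path_arg)
```

If this is indeed the cause, passing `maxentPath` as the first argument and defining `outdir` before the call would be the things to try.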


[R] Unable to Install POT Package for R 3.1.0

2016-11-23 Thread Preetam Pal
Hi, I am trying to install the package POT for R version 3.1.0 (Spring
Dance), using:

install.packages("POT", repos="http://R-Forge.R-project.org")

But I am getting the following error:

package ‘POT’ is available as a source package but not as a binary
Warning in install.packages :
  package ‘POT’ is not available (for R version 3.1.0)

Can anyone suggest how I can get it working please?
I need it for Peaks-Over-Threshold analysis under extreme value theory. I
am trying to make use of functions mentioned in this link (link2). Thanks.

Regards,
Preetam
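
Since the message says a source package is available, one option that might work is asking `install.packages()` for the source version explicitly through its `type` argument. This is only a sketch: building from source needs the usual build tools if the package contains compiled code, and whether POT builds under R 3.1.0 is not something the message confirms. The variable names `repo` and `default_type` are for illustration.

```r
repo <- "http://R-Forge.R-project.org"   # repository URL from the message above

# What this R installation asks for by default ("source" or a binary type)
default_type <- getOption("pkgType")
print(default_type)

# Only attempt the installation in an interactive session
if (interactive()) {
  install.packages("POT", repos = repo, type = "source")
}
```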


[R] browser() pre 3.1 behaviour

2016-11-23 Thread Falk Hildebrand via R-help
Dear R help list,

I have been using the "browser()" function for a long time now to debug my
function code. However, since 3.1 (I think) the behaviour has changed so that
browser() (or the debug interface) is also invoked when a loop is being
executed (copy-pasted to the R interface). So I have to cancel this browser or
confirm each iteration of the loop. I believe this also extends to various
other calls. In the end I would like to set up my R to behave as it did in
R <= 3.0. This has been so bothersome that I was sticking with old R versions,
but now some packages are no longer supported in these old versions.

How can I set up browser() to follow the old behaviour?

I have been looking for an answer to this on the internet for some time, but
so far without success. I would be grateful for any help.

best,
Falk Hildebrand

PS: I think this is the change that occurred in 3.1:
[R] R 3.1.0 is released

 DEBUGGING:

   * The behaviour of the code browser has been made more consistent,
 in part following the suggestions in PR#14985.

   * Calls to browser() are now consistent with calls to the browser
 triggered by debug(), in that Enter will default to n rather than
 c.

   * A new browser command s has been added, to "step into" function
 calls.

   * A new browser command f has been added, to "finish" the current
 loop or function.

   * Within the browser, the command help will display a short list of
 available commands.
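
The release notes quoted above say that Enter now defaults to n, not c. Two base R facilities may get closer to the old workflow, though whether they reproduce the pre-3.1 behaviour exactly is an assumption. `options(browserNLdisabled = TRUE)` is documented in `?options` as disabling newline as a synonym for n in the browser, and a `browser()` call can be guarded so that it only triggers at the iteration of interest. The function `running_total` and its `stop_at` argument are hypothetical.

```r
# Make a bare Enter do nothing inside the browser (documented in ?options)
options(browserNLdisabled = TRUE)

# Hypothetical function: enter the browser only at one chosen iteration,
# and only when R is running interactively
running_total <- function(n, stop_at = NA) {
  total <- 0
  for (i in seq_len(n)) {
    if (interactive() && isTRUE(i == stop_at)) browser()
    total <- total + i
  }
  total
}

result <- running_total(5)   # no stop_at given, so the browser is never entered
print(result)
```

Once inside the browser, typing `c` continues execution and `f` finishes the current loop or function, as listed in the notes above.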

[R] Chi2 algorithm - R

2016-11-23 Thread Luke Skywalker
Good evening,

I'm getting a discretization that differs from the one described by Liu and
Setiono in their papers when using the Chi2 algorithm for feature selection
with discretization.

As stated in the R documentation (discretization, from CRAN), the R package
discretization offers the function chi2, which is based on the following
papers:
Liu, H. and Setiono, R. (1995). Chi2: Feature selection and discretization
of numeric attributes, Tools with Artificial Intelligence, 388–391.

Liu, H. and Setiono, R. (1997). Feature selection and discretization, IEEE
transactions on knowledge and data engineering, Vol.9, no.4, 642–645.

I wrote the following R code, in which I set alpha and delta equal to the
values used in the papers above. The code prints out the discretized data
frame. I used the iris data frame, as in one of the examples in the two
papers. The first paper above states that alpha = 0.5 and delta = 5%, and
that "the originally odd numbered data are selected for training (75
patterns) and rest for testing (75 patterns)". With this setup, the Sepal
attributes should be removed.

library(discretization)
data(iris)
# Start from an empty data frame with the same columns, then keep odd-numbered rows
df1 <- iris[FALSE,]
for(i in 1:nrow(iris)){
  if(i %% 2 != 0){
    df1 <- rbind(df1, iris[i,])
  }
}
chi2(df1, alp=0.5, del=0.05)$Disc.data

The point is that, looking at the data frame printed by the last
instruction, you can see that no attribute is removed. The discretized data
frame still has 4 discretized attributes: if I understood the above papers
correctly, Sepal Length and Sepal Width should both have been discretized
into just one interval by the Chi2 algorithm.

I have posted a question here:
http://stats.stackexchange.com/questions/247499/why-does-not-r-chi2-algorithm-discretize-in-the-same-manner-as-in-the-paper-by-l?noredirect=1#comment470974_247499


Moreover, it's really hard to understand the cut points that the Chi2
algorithm implemented in R produces. For example:

res <- chi2(iris, 0.5, 0.05)

cut(iris$Sepal.Length, res$cutp, labels=FALSE) is different from
res$Disc.data$Sepal.Length

Please help me understand.

Best regards
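
A small base R sketch of two points that may matter for the comparison above; it does not call `chi2()` and does not claim to explain the package's behaviour. First, the odd-numbered rows can be selected directly by index. Second, `cut()` only assigns bins inside the range of the breaks it is given, so interior cut points alone leave values outside them as NA. If I read the package's help page correctly, `cutp` holds one set of cut points per attribute, so one element of it (not the whole object) would be what goes into `cut()`; treat that as an assumption. The cut points in `inner` are made up for illustration.

```r
data(iris)

# Odd-numbered rows selected by index (75 rows)
odd <- iris[seq(1, nrow(iris), by = 2), ]
print(nrow(odd))

x <- odd$Sepal.Length
inner <- c(5.5, 6.2)   # hypothetical interior cut points, not output of chi2()

# Interior breaks only: values at or below 5.5, or above 6.2, become NA
bins_a <- cut(x, inner, labels = FALSE)

# Adding open-ended outer breaks assigns every value to bin 1, 2 or 3
bins_b <- cut(x, c(-Inf, inner, Inf), labels = FALSE)

# findInterval() uses left-closed intervals, so a value equal to a cut point
# lands in the upper bin, unlike cut()'s default right-closed intervals
bins_c <- findInterval(x, inner) + 1

print(table(bins_a, useNA = "ifany"))
print(table(bins_b))
```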


Re: [R] browser() pre 3.1 behaviour

2016-11-23 Thread Duncan Murdoch

On 23/11/2016 6:51 AM, Falk Hildebrand via R-help wrote:

> Dear R help list, I have been using the "browser()" function for a long time now
> to debug my function code. However, since 3.1 (I think) the behaviour has changed that
> browser() (or the debug interface) is being called also when a loop is being executed
> (copy pasted to the R interface).


This doesn't really parse for me.  Could you please post detailed 
instructions for what you're doing and describe what you're seeing?


Duncan Murdoch



Re: [R] Chi2 algorithm - R

2016-11-23 Thread peter dalgaard
Notice that this relates to an R _package_, which has a maintainer. You cannot 
expect general R users or developers to know about the details of the package. 
It doesn't look like there is documentation beyond the help pages, so you may 
need to contact the maintainer or study the actual code.

-pd 
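
For reference, a short sketch of how both suggestions can be followed from within R, using base and utils functions only. The package `stats` and the function `stats::sd` are used purely as stand-ins, since they are always installed; substituting the package and function of interest is left to the reader.

```r
pkg <- "stats"   # stand-in; any installed package name can be used here

# Maintainer field of the package's DESCRIPTION file
who <- utils::maintainer(pkg)
print(who)

# Other DESCRIPTION fields, e.g. the installed version
desc <- utils::packageDescription(pkg)
print(desc$Version)

# The R source of a function, as text (typing the name at the prompt also prints it)
src <- deparse(stats::sd)
cat(src, sep = "\n")
```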


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] Chi2 algorithm - R

2016-11-23 Thread Luke Skywalker
What does it mean to "have a maintainer"? Is the maintainer a third party? Is
he an individual developer whose package you install at your own risk? Are
the packages created by maintainers not tested?

Anyway, I wrote him. I'm waiting for response.

Regards


Re: [R] Chi2 algorithm - R

2016-11-23 Thread Olivier Crouzet
Hi,

(1) If the package has been installed from CRAN then it has been tested (which 
does not imply that it is exempt from bugs); any other source (e.g. GitHub) 
has not been tested, according to my understanding;

(2) The GNU GPL explicitly states that you install and use the software "at 
your own risk". This is the case for any GPL-licensed software, even for 
base R, so this is no surprise!

Olivier.



--
Olivier Crouzet
LLING - Laboratoire de Linguistique de Nantes
UMR 6310 CNRS / Université de Nantes

-Original Message-
From: Luke Skywalker 
Sender: "R-help" Date: Wed, 23 Nov 2016 22:26:45 
To: peter dalgaard
Cc: 
Subject: Re: [R] Chi2 algorithm - R

What does it mean to "have a mantainer"? Is he a third party? Is he an
individual developer and you can install whose package on your risk? Are
the package created by maintainers not tested?

Anyway, I wrote him. I'm waiting for response.

Regards

Il 23/Nov/2016 22:21, "peter dalgaard"  ha scritto:

> Notice that this relates to an R _package_, which has a maintainer. You
> cannot expect general R users or developers to know about the details of
> the package. It doesn't look like there is dcoumentation beyond the help
> pages, so you may need to contact the maintainer or study the actual code.
>
> -pd
>
> > On 23 Nov 2016, at 17:08 , Luke Skywalker  wrote:
> >
> > Good evening,
> >
> > I'm encountering a different kind of discretization with respect to the
> > 1997 Liu and Setiono's one descripted in their papers, using Chi2
> algorithm
> > for feature selection with discretization.
> >
> > As stated in R documentation (discretization - R (from CRAN)
> >  discretization.pdf>),
> > R package discretizion offers the function Chi2, which comes to life in
> the
> > following papers:
> >
> > Liu, H. and Setiono, R. (1995). Chi2: Feature selection and
> discretization
> > of numeric attributes, Tools with Artificial Intelligence, 388–391.
> >
> > Liu, H. and Setiono, R. (1997). Feature selection and discretization,
> IEEE
> > transactions on knowledge and data engineering, Vol.9, no.4, 642–645.
> >
> > I wrote the following R programming language code, in which I have set
> > alpha and delta equal to the ones set in the papers above. Finally, the
> > following code prints out the discretized dataframe. I used Iris
> dataframe,
> > as in one of the examples in the two papers. The first paper above states
> > that alfa = 0.5 and delta = 5%, and that "the originally odd numbered
> data
> > are selected for training (75 patterns) and rest for testing (75
> > patterns)". With this asset, Sepal attributes should be removed.
> >
> > library(discretization)
> > data(iris)
> > df1 <- iris[FALSE,]          # empty data frame with the iris columns
> > for(i in 1:nrow(iris)){
> >   if(i %% 2 != 0){           # keep the odd-numbered rows only
> >     df1 <- rbind(df1, iris[i,])
> >   }
> > }
> > chi2(df1, alp=0.5, del=0.05)$Disc.data
> >
> > The point is that, looking at the data frame printed by the last
> > instruction, you can see that no attribute is removed. The discretized
> > data frame still has 4 discretized attributes: if I understood the papers
> > above correctly, Sepal Length and Sepal Width should both have been
> > discretized into just one interval by the Chi2 algorithm.
> >
> > I have posted a question here:
> > http://stats.stackexchange.com/questions/247499/why-does-not-r-chi2-algorithm-discretize-in-the-same-manner-as-in-the-paper-by-l?noredirect=1#comment470974_247499
> >
> >
> > Moreover, it's really hard to understand the cut points that the Chi2
> > algorithm implemented in R produces. For example:
> >
> > res <- chi2(iris, 0.5, 0.05)
> >
> > cut(iris$Sepal.Length, res$cutp, labels=FALSE) is different from
> > res$Disc.data$Sepal.Length
> >
> > Help me understand, please
> >
> > Best regards
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
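
For anyone trying to reproduce the question above, here is a minimal sketch of the same odd-row selection without the loop, followed by a count of the intervals chi2() leaves per attribute. It assumes the discretization package is installed (the chi2() call is skipped otherwise) and that chi2() returns the cutp and Disc.data components used in the original post; the names odd_rows and n_intervals are only illustrative.

```r
# Odd-numbered rows of iris (rows 1, 3, ..., 149), selected without a loop.
odd_rows <- seq(1, nrow(iris), by = 2)
df1 <- iris[odd_rows, ]

# Assumption: the 'discretization' package is installed; skipped otherwise.
if (requireNamespace("discretization", quietly = TRUE)) {
  res <- discretization::chi2(df1, alp = 0.5, del = 0.05)
  # Number of distinct intervals left in each of the four numeric attributes.
  # An attribute merged into a single interval would show a count of 1 here.
  n_intervals <- sapply(res$Disc.data[1:4], function(col) length(unique(col)))
  print(n_intervals)
  # Inspect the structure of the cut points before passing them to cut().
  str(res$cutp)
}
```

Looking at str(res$cutp) first may help with the cut() comparison in the post, since the cut points need to be picked out per attribute before they can serve as breaks.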

Re: [R] GAM with the negative binomial distribution: why do predictions no match with original values?

2016-11-23 Thread Simon Wood
?predict.gam (mgcv) says

"Note that, in common with other prediction functions, any offset
  supplied to ‘gam’ as an argument is always ignored when
  predicting, unlike offsets specified in the gam model formula."

This was originally implemented to prevent surprises to people familiar
with predict.lm (which used to behave that way, and was documented to
behave that way). The problem is that predict.lm doesn't behave like that
any more, so I guess at some point I should remove this feature...

best,
Simon
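
The behaviour described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are simulated (not the skunk data from the thread), and the names dat, log_eff, m_arg and m_frm are hypothetical. It fits the same negative binomial GAM twice, once with the offset= argument and once with offset() in the formula, and compares the predictions on the link scale.

```r
library(mgcv)

set.seed(1)
n <- 200
x <- runif(n)
log_eff <- runif(n, 0, 2)                  # hypothetical log trapping effort
mu <- exp(0.5 + sin(2 * pi * x) + log_eff) # true mean includes the offset
y <- rnbinom(n, size = 5, mu = mu)
dat <- data.frame(y = y, x = x, log_eff = log_eff)

# Offset supplied as an argument: used in fitting, ignored by predict.gam().
m_arg <- gam(y ~ s(x), offset = log_eff, family = nb(),
             data = dat, method = "REML")

# Offset supplied in the formula: used in fitting and in prediction.
m_frm <- gam(y ~ s(x) + offset(log_eff), family = nb(),
             data = dat, method = "REML")

p_arg <- as.numeric(predict(m_arg, type = "link"))
p_frm <- as.numeric(predict(m_frm, type = "link"))

# The two sets of link-scale predictions should differ by the offset.
summary((p_frm - p_arg) - dat$log_eff)

# Adding the offset back by hand gives response-scale predictions
# for the model fitted with the offset= argument.
mu_arg <- exp(p_arg + dat$log_eff)
```

If the offset= form has to be kept, adding the offset back by hand as in the last line is one way to get predictions on the scale of the observed counts.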




On 23/11/16 00:24, Marine Regis wrote:
> Thanks a lot for your answers.
> Peter: sorry, here is the missing information:
>
>*   I use the function gam() of the package "mgcv"
>*   Yes, the output changes when I use offset(log_trap_eff) instead of 
> offset=log_trap_eff. By using offset(log_trap_eff), the output is more 
> coherent with the observed values. Here are the new predictions:
>> summary(mod$fit)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>   10.01   68.14   85.71   83.16  101.00  130.20
>
>
>*   I have tried to create a reproducible example to show the difference 
> between offset(log_trap_eff) and offset=log_trap_eff.
> library(MASS)  # provides rnegbin()
> library(mgcv)  # provides gam() and nb()
> nb_unique <- rnegbin(58, mu=82, theta=13.446)
> x <- runif(58,min=-465300,max=435200)
> prop_forest <- runif(58,min=0,max=1)
> log_trap_eff <- runif(58,min=4,max=6)
>
> With offset=log_trap_eff:
>
> mod1 <- gam(nb_unique ~ s(x,prop_forest), offset=log_trap_eff, 
> family=nb(theta=NULL, link="log"), method = "REML", select = TRUE)
>
>> mod1Pred <- predict.gam(mod1, se.fit=TRUE, type="response")
>> summary(mod1Pred$fit)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>  0.5852  0.5852  0.5852  0.5852  0.5852  0.5852
>
>
> With offset(log_trap_eff):
> mod2 <- gam(nb_unique ~ s(x,prop_forest) + offset(log_trap_eff), 
> family=nb(theta=NULL, link="log"), method = "REML", select = TRUE)
>
>> mod2Pred <- predict.gam(mod2, se.fit=TRUE, type="response")
>> summary(mod2Pred$fit)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>   32.03   61.18   97.20  112.20  165.00  226.00
>
>
>
> Value range of observed data:
>
>> summary(nb_unique)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>   43.00   67.00   81.00   84.16   92.75  153.00
>
>
>*   By using fitted(mod), I obtain NULL.
> I am a novice in GAMs, so I don't know why the results differ between 
> models with the offset= argument and offset().
> Thanks a lot for your help.
> Have a nice day
> Marine
>
>
>
> 
> From: peter dalgaard 
> Sent: Tuesday, 22 November 2016 23:52
> To: Bert Gunter
> Cc: Marine Regis; r-help@r-project.org
> Subject: Re: [R] GAM with the negative binomial distribution: why do 
> predictions no match with original values?
>
>
>> On 22 Nov 2016, at 23:07 , Bert Gunter  wrote:
>>
>> Define "very different."  Sounds like a subjective opinion to me, for
>> which I have no response. Apparently others are similarly flummoxed.
>> Of course they would not in general be identical.
> Er? I don't see much reason to disagree that a range 0.10-0.18 is different 
> from 17-147.
>
> However, other bits of information are missing: We don't know which gam() 
> function is being used (to my knowledge there is one in package gam but also 
> one in mgcv). We don't have the data, so we cannot reproduce and try to find 
> the root of the problem.
>
> Offhand, it looks like the predict.gam() function is misbehaving, which could 
> have something to do with the offset term and/or the nb dispersion parameter. 
> On a hunch, does anything change if you use
>
> nb_unique ~ s(x,prop_forest) + offset(log_trap_eff)
>
> instead of the offset= argument? And, by the way, does fitted(mod,...) change 
> anything?
>
> -pd
>
>> Cheers,
>> Bert
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Tue, Nov 22, 2016 at 1:29 PM, Marine Regis  
>> wrote:
>>> Hello,
>>>
>>> From capture data, I would like to assess the effect of longitudinal 
>>> changes in proportion of forests on abundance of skunks. To test this, I 
>>> built this GAM where the dependent variable is the number of unique skunks 
>>> and the independent variables are the X coordinates of the centroids of 
>>> trapping sites (called "X" in the GAM) and the proportion of forests 
>>> within the trapping sites (called "prop_forest" in the GAM):
>>> mod <- gam(nb_unique ~ s(x,prop_forest), offset=log_trap_eff, 
>>> family=nb(theta=NULL, link="log"), data=succ_capt_skunk, method = "REML", 
>>> select = TRUE)
>>> summary(mod)
>>>
>>> Family: Negative Binomial(13.446)
>>> Link function: log
>>>
>>> Formula:
>>> nb_unique ~ s(x, prop_forest)
>>>
>>> Parametric coefficients:
>>>             Estimate Std. Error z value Pr(>|z|)
>>> (Intercept) -2.02095    0.03896  -51.87   <2e-16 ***
>>> ---

[R] Fixing non-UTF8 locale and other warning in R for Mac OS X

2016-11-23 Thread Umar Ahmad via R-help
Hello R Experts,
Please can anyone help or guide me on how to fix the problem below, which is 
shown whenever I open R on Mac OS but which I have never seen on a Windows 
computer? It always refers me to the R for Mac OS X FAQ (see Help), which I 
have read many times without finding out how to fix this issue. 

"During startup - Warning messages:
1: Setting LC_CTYPE failed, using "C" 
2: Setting LC_COLLATE failed, using "C" 
3: Setting LC_TIME failed, using "C" 
4: Setting LC_MESSAGES failed, using "C" 
5: Setting LC_MONETARY failed, using "C" 
[R.app GUI 1.68 (7288) x86_64-apple-darwin13.4.0]
WARNING: You're using a non-UTF8 locale, therefore only ASCII characters will work.
Please read R for Mac OS X FAQ (see Help) section 9 and adjust your system 
preferences accordingly."


Thanks in anticipation for your guidance.
Umar. 
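
A possible starting point for diagnosing the warning above is to check from within R which locale the session ended up with. The sketch below only inspects the locale; the two commented-out commands are suggestions that assume the en_US.UTF-8 locale is installed on the machine (substitute your own), and the defaults setting is the one described in the R for Mac OS X FAQ for R.app. The names current and utf8_ok are only illustrative.

```r
# Inspect the locale R ended up with after startup.
current <- Sys.getlocale("LC_CTYPE")
utf8_ok <- isTRUE(l10n_info()[["UTF-8"]])
cat("LC_CTYPE:", current, "\n")
cat("UTF-8 locale in use:", utf8_ok, "\n")

# Assumption: en_US.UTF-8 exists on this system; adjust as needed.
# For the current session only:
#   Sys.setlocale("LC_ALL", "en_US.UTF-8")
# To make R.app use it at every start, run once and then restart R:
#   system("defaults write org.R-project.R force.LANG en_US.UTF-8")
```

If utf8_ok is FALSE after a restart, the locale name is probably not one the system recognises.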
[[alternative HTML version deleted]]
