Re: [R] About calculating average values from several matrices

2017-05-09 Thread Charles Determan
If you want the mean of each element across your list of matrices, the
following should provide what you are looking for: Reduce sums the
corresponding elements across all matrices, and the result is then divided
by the number of matrices to give the element-wise mean.

Reduce(`+`, mylist)/length(mylist)
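A minimal, self-contained sketch of that idea, using a hypothetical `mylist` of three small matrices:

```r
# Build a list of three matrices with matching dimensions
set.seed(1)
mylist <- lapply(1:3, function(i) matrix(rnorm(6), nrow = 2, ncol = 3))

# Reduce(`+`, mylist) adds the matrices element-wise;
# dividing by the number of matrices gives the element-wise mean
avg <- Reduce(`+`, mylist) / length(mylist)

# Cross-check one cell against averaging it by hand
all.equal(avg[1, 1], mean(sapply(mylist, function(m) m[1, 1])))
```

This requires all matrices in the list to have the same dimensions; Reduce will error (or recycle unexpectedly) otherwise.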

Regards,
Charles

On Tue, May 9, 2017 at 9:52 AM, lily li  wrote:

> I meant for each cell, it takes the average from other dataframes at the
> same cell. I don't know how to deal with row names and col names though, so
> it has the error message.
>
> On Tue, May 9, 2017 at 8:50 AM, Doran, Harold  wrote:
>
> > It’s not clear to me what your actual structure is. Can you provide
> > str(object)? Assuming it is a list, and you want the mean over all cells
> or
> > columns, you might want something like this:
> >
> >
> >
> > myData <- vector("list", 3)
> >
> >
> >
> > for(i in 1:3){
> >
> > myData[[i]] <- matrix(rnorm(100), 10, 10)
> >
> > }
> >
> >
> >
> > ### mean over all cells
> >
> > sapply(myData, function(x) mean(x))
> >
> >
> >
> > ### mean over all columns
> >
> > sapply(myData, function(x) colMeans(x))
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > *From:* lily li [mailto:chocol...@gmail.com]
> > *Sent:* Tuesday, May 09, 2017 10:44 AM
> > *To:* Doran, Harold 
> > *Cc:* R mailing list 
> > *Subject:* Re: [R] About calculating average values from several matrices
> >
> >
> >
> > I'm trying to get a new dataframe or whatever to call, which has the same
> > structure with each file as listed above. For each cell in the new
> > dataframe or the new file, it is the average value from former dataframes
> > at the same location. Thanks.
> >
> >
> >
> > On Tue, May 9, 2017 at 8:41 AM, Doran, Harold  wrote:
> >
> > Are you trying to take the mean over all cells, or over rows/columns
> > within each dataframe. Also, are these different dataframes stored
> within a
> > list or are they standalone?
> >
> >
> >
> >
> > -Original Message-
> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of lily li
> > Sent: Tuesday, May 09, 2017 10:39 AM
> > To: R mailing list 
> > Subject: [R] About calculating average values from several matrices
> >
> > Hi R users,
> >
> > I have a question about manipulating the data.
> > For example, there are several such data frames or matrices, and I want
> to
> > calculate the average value from all the data frames or matrices. How to
> do
> > it? Also, should I convert them to data frame or matrix first? Right now,
> > when I use typeof() function, each one is a list.
> >
> > file1
> > jan   feb   mar   apr   may   jun   jul   aug   sep   oct
>  nov
> >
> > app1   1.1   1.20.80.9   1.31.5   2.2   3.2   3.01.2
>  1.1
> > app2   3.1   3.22.82.5   2.32.5   3.2   3.0   2.91.8
>  1.8
> > app3   5.1   5.23.84.9   5.35.5   5.2   4.2   5.04.2
>  4.1
> >
> > file2
> > jan   feb   mar   apr   may   jun   jul   aug   sep   oct
>  nov
> >
> > app1   1.9   1.50.50.9   1.21.8   2.5   3.7   3.21.5
>  1.6
> > app2   3.5   3.72.32.2   2.52.0   3.6   3.2   2.81.2
>  1.4
> > app3   5.5   5.03.54.4   5.45.6   5.3   4.4   5.24.3
>  4.2
> >
> > file3 has the similar structure and values...
> >
> > There are eight such files, and when I use the function mean(file1,
> file2,
> > file3, ..., file8), it returns the error below. Thanks for your help.
> >
> > Warning message:
> > In mean.default(file1, file2, file3, file4, file5, file6, file7,  :
> >   argument is not numeric or logical: returning NA
> >
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> > posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> >
> >
>

Re: [R] About calculating average values from several matrices

2017-05-09 Thread Charles Determan
Just call 'round' on your results with your desired number of digits.
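For instance, with a hypothetical `mylist` of two matrices:

```r
# Element-wise mean across the matrices, rounded to one decimal place
mylist <- list(matrix(c(1.11, 2.22, 3.33, 4.44), 2, 2),
               matrix(c(5.55, 6.66, 7.77, 8.88), 2, 2))
round(Reduce(`+`, mylist) / length(mylist), 1)
```

The values remain of type double after rounding; round only changes the stored values, not the type.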

On Tue, May 9, 2017 at 10:09 AM, lily li  wrote:

> Thanks very much, it works. But how to round the values to have only 1
> decimal digit or 2 decimal digits? I think by dividing, the values are
> double type now. Thanks again.
>
>
> On Tue, May 9, 2017 at 9:04 AM, Charles Determan 
> wrote:
>
>> If you want the mean of each element across your list of matrices, the
>> following should provide what you are looking for: Reduce sums the
>> corresponding elements across all matrices, and the result is then
>> divided by the number of matrices to give the element-wise mean.
>>
>> Reduce(`+`, mylist)/length(mylist)
>>
>> Regards,
>> Charles
>>
>> On Tue, May 9, 2017 at 9:52 AM, lily li  wrote:
>>
>>> I meant for each cell, it takes the average from other dataframes at the
>>> same cell. I don't know how to deal with row names and col names though,
>>> so
>>> it has the error message.
>>>
>>> On Tue, May 9, 2017 at 8:50 AM, Doran, Harold  wrote:
>>>
>>> > It’s not clear to me what your actual structure is. Can you provide
>>> > str(object)? Assuming it is a list, and you want the mean over all
>>> cells or
>>> > columns, you might want something like this:
>>> >
>>> >
>>> >
>>> > myData <- vector("list", 3)
>>> >
>>> >
>>> >
>>> > for(i in 1:3){
>>> >
>>> > myData[[i]] <- matrix(rnorm(100), 10, 10)
>>> >
>>> > }
>>> >
>>> >
>>> >
>>> > ### mean over all cells
>>> >
>>> > sapply(myData, function(x) mean(x))
>>> >
>>> >
>>> >
>>> > ### mean over all columns
>>> >
>>> > sapply(myData, function(x) colMeans(x))
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > *From:* lily li [mailto:chocol...@gmail.com]
>>> > *Sent:* Tuesday, May 09, 2017 10:44 AM
>>> > *To:* Doran, Harold 
>>> > *Cc:* R mailing list 
>>> > *Subject:* Re: [R] About calculating average values from several
>>> matrices
>>>
>>> >
>>> >
>>> >
>>> > I'm trying to get a new dataframe or whatever to call, which has the
>>> same
>>> > structure with each file as listed above. For each cell in the new
>>> > dataframe or the new file, it is the average value from former
>>> dataframes
>>> > at the same location. Thanks.
>>> >
>>> >
>>> >
>>> > On Tue, May 9, 2017 at 8:41 AM, Doran, Harold  wrote:
>>> >
>>> > Are you trying to take the mean over all cells, or over rows/columns
>>> > within each dataframe. Also, are these different dataframes stored
>>> within a
>>> > list or are they standalone?
>>> >
>>> >
>>> >
>>> >
>>> > -Original Message-
>>> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of lily
>>> li
>>> > Sent: Tuesday, May 09, 2017 10:39 AM
>>> > To: R mailing list 
>>> > Subject: [R] About calculating average values from several matrices
>>> >
>>> > Hi R users,
>>> >
>>> > I have a question about manipulating the data.
>>> > For example, there are several such data frames or matrices, and I
>>> want to
>>> > calculate the average value from all the data frames or matrices. How
>>> to do
>>> > it? Also, should I convert them to data frame or matrix first? Right
>>> now,
>>> > when I use typeof() function, each one is a list.
>>> >
>>> > file1
>>> > jan   feb   mar   apr   may   jun   jul   aug   sep   oct
>>>  nov
>>> >
>>> > app1   1.1   1.20.80.9   1.31.5   2.2   3.2   3.01.2
>>>  1.1
>>> > app2   3.1   3.22.82.5   2.32.5   3.2   3.0   2.91.8
>>>  1.8
>>> > app3   5.1   5.23.84.9   5.35.5   5.2   4.2   5.04.2
>>>  4.1
>>> >
>>> > file2
>>> > jan   feb   mar   apr   may   jun   jul   aug   sep   oct
>>>  nov
>>> &

[R] [R-pkgs] gpuR 2.0.0 released

2017-10-20 Thread Charles Determan
Dear R users,

I am happy to announce that the most recent version of gpuR has been
released. There are several new enhancements to the package, including
the ability to use user-written OpenCL kernels. A full list of
changes from the NEWS file is shown below.

API Changes:

1. deviceType, gpuInfo, and cpuInfo no longer accept the 'platform_idx'
parameter, as OpenCL contexts cannot contain more than one platform.

New Features:

1. Added functionality to create custom OpenCL functions from user
provided kernels

2. Added 'synchronize' function to assure completion of device calls
(necessary for benchmarking)

3. Added determinant function (det)

4. Allow for gpuR object - base object interaction (e.g. vclMatrix * matrix)

5. Added 'inplace' function for 'inplace' operations. These operations
include '+', '-', '*', '/', 'sin', 'asin', 'sinh', 'cos', 'acos',
'cosh', 'tan', 'atan', 'tanh'.

6. Added 'sqrt', 'sum', 'sign','pmin', and 'pmax' functions

7. Methods to pass two gpuR matrix objects to 'cov'

8. Added 'norm' method

9. Added gpuRmatrix/gpuRvector Arith '+','-' methods

Bug Fixes:

1. Fixed incorrect device info when using different contexts

2. Fixed Integer Matrix Multiplication

3. All OpenCL devices will be initialized on startup (previous version
occasionally would omit some devices)


There are many more features in the works. Suggestions and
contributions continue to be welcome. Please submit them through the
GitHub issue tracker: https://github.com/cdeterman/gpuR/issues



Thanks as well to everyone who has tested this package on various
GPU devices and operating systems. Much of the stability of this
package is made possible by your efforts.


Kind regards,


___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

Re: [R] Where are the PCA outputs?

2016-09-12 Thread Charles Determan
Hi Nick,

"prcomp" returns an object of class "prcomp", so when you simply print the
object it is dispatched to the "print.prcomp" method, which shows only the
standard deviations and the rotation matrix. To see all the components,
assign the result to an object and inspect it (e.g. with str() or names()).
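For example (using the built-in USArrests data; the object name `pca` is just illustrative):

```r
pca <- prcomp(USArrests, scale. = TRUE)

names(pca)    # stored components include sdev, rotation, center, scale, x
head(pca$x)   # principal component scores for each observation
summary(pca)  # standard deviations and proportion of variance explained
```
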

Regards,
Charles

On Mon, Sep 12, 2016 at 7:56 AM, WRAY NICHOLAS 
wrote:

> Hi R Folk  I have been kicking some data around and one thing has been to
> try a
> PC analysis  on it, but whereas in the online examples I've looked at the
> prcomp
> function gives a set of five outputs when I use the prcomp function it only
> gives me a set of standard deviations and the rotation matrix
>
> My data (pcl) is this:
>
> resmat.3...2. resmat.3...3. resmat.3...4.
> 1 0.08749276   0.015706470 0.259
> 2 0.08749276   0.039266176 0.198
> 3 0.10630841   0.047119411 0.235
> 4 0.25307047   0.062825881 0.374
> 5 0.14393971   0.117798527 0.534
> 6 0.23049169   0.023559705 0.355
> 7 0.15052518   0.007853235 0.179
> 8 0.09784137   0.031412940 0.219
> 9 0.09878215   0.039266176 0.301
> 100.14111736   0.157064702 0.285
> 110.03951286   0.015706470 0.036
> 120.16181457   0.125651762 0.324
> 130.13359110   0.031412940 0.304
> 140.08278885   0.031412940 0.221
> 150.08561120   0.023559705 0.207
> 160.12042015   0.039266176 0.194
> 170.13359110   0.047119411 0.164
> 180.08937433   0.047119411 0.216
> 190.12700562   0.023559705 0.230
>
> the output is then
> > prcomp(pcl,scale.=T)
> Standard deviations:
> [1] 1.4049397 0.8447366 0.5590747
>
> Rotation:
> PC1 PC2PC3
> resmat.3...2. 0.5599782 -0.64434772 -0.5208075
> resmat.3...3. 0.5229417  0.76245515 -0.3810434
> resmat.3...4. 0.6426168 -0.05897597  0.7639146
>
> Does anyone know why the other things are not appearing?
>
> Thanks, Nick


Re: [R] regexpr - ignore all special characters and punctuation in a string

2015-04-20 Thread Charles Determan
You can use the [:alnum:] regex class with gsub.

str1 <- "What a nice day today! - Story of happiness: Part 2."
str2 <- "What a nice day today: Story of happiness (Part 2)"

gsub("[^[:alnum:]]", "", str1) == gsub("[^[:alnum:]]", "", str2)
[1] TRUE

The same can be done with the stringr package if you really are partial to
it.

library(stringr)

str_replace_all(str1, "[^A-Za-z0-9]", "") == str_replace_all(str2, "[^A-Za-z0-9]", "")
[1] TRUE

On Mon, Apr 20, 2015 at 9:10 AM, Sven E. Templer 
wrote:

> Hi Dimitri,
>
> str_replace_all is not in the base libraries, you could use 'gsub' as well,
> for example:
>
> a = "What a nice day today! - Story of happiness: Part 2."
> b = "What a nice day today: Story of happiness (Part 2)"
> sa = gsub("[^A-Za-z0-9]", "", a)
> sb = gsub("[^A-Za-z0-9]", "", b)
> a==b
> # [1] FALSE
> sa==sb
> # [1] TRUE
>
> Take care of the extra space in a after the '-', so also replace spaces...
>
> Best,
> Sven.
>
> On 20 April 2015 at 16:05, Dimitri Liakhovitski <
> dimitri.liakhovit...@gmail.com> wrote:
>
> > I think I found a partial answer:
> >
> > str_replace_all(x, "[[:punct:]]", " ")
> >
> > On Mon, Apr 20, 2015 at 9:59 AM, Dimitri Liakhovitski
> >  wrote:
> > > Hello!
> > >
> > > Please point me in the right direction.
> > > I need to match 2 strings, but focusing ONLY on characters, ignoring
> > > all special characters and punctuation signs, including (), "", etc..
> > >
> > > For example:
> > > I want the following to return: TRUE
> > >
> > > "What a nice day today! - Story of happiness: Part 2." ==
> > >"What a nice day today: Story of happiness (Part 2)"
> > >
> > >
> > > --
> > > Thank you!
> > > Dimitri Liakhovitski
> >
> >
> >
> > --
> > Dimitri Liakhovitski
> >
> >
>


Re: [R] feature selection

2015-04-20 Thread Charles Determan
Although I am sure many here would be happy to help, your question is
far too vague. There are many methods for feature selection. You should
review the literature and see what would work best for you, or consult a
statistician. Once you have selected a method and begun an initial attempt
at the R code, this list will be far more helpful to you. This help
list is meant to help people with their R programming, not design their
analyses for them.

Some places to start with R include the very popular 'caret' package.  Max
Kuhn (the author) has a wonderful website with many tutorials.  Here is the
front page for feature selection,
http://topepo.github.io/caret/featureselection.html

I also have developed a package on Bioconductor called 'OmicsMarkeR' which
you can find at
http://bioconductor.org/packages/release/bioc/html/OmicsMarkeR.html that
you may find useful depending upon the data you possess.

Regards,
Charles

On Mon, Apr 20, 2015 at 12:19 PM, ismail hakkı sonalcan <
ismaelhakk...@hotmail.com> wrote:

> Hi,
>
> I want to make feature selection.
> Could you help me.
>
> Thanks.
>
>


Re: [R] Package Build Recommendations

2015-05-06 Thread Charles Determan
Hi Glenn,

Generally data files are stored in the 'data' directory.  If you visit
Hadley's R packages site on the data page (http://r-pkgs.had.co.nz/data.html)
this is described quite clearly.  You can use the devtools package
functions like `use_data` to have data files properly stored in your
package.
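Under the hood, `use_data` just saves each object as an .rda file in the package's data/ directory. A rough sketch of the equivalent base-R steps (using a temporary directory as a stand-in for your package root):

```r
pkg_dir <- tempdir()  # stand-in for your package root directory
mydata  <- data.frame(x = 1:3, y = c("a", "b", "c"))

# devtools' use_data(mydata) effectively does this:
dir.create(file.path(pkg_dir, "data"), showWarnings = FALSE)
save(mydata, file = file.path(pkg_dir, "data", "mydata.rda"))

file.exists(file.path(pkg_dir, "data", "mydata.rda"))
```

In an installed package, objects stored this way become available via data(mydata) or lazy loading.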

Cheers,
Charles

On Tue, May 5, 2015 at 7:16 PM, Glenn Schultz  wrote:

> Hi All,
>
> I have my R package built and it passes the CRAN tests.  Now, I have a
> question.  The file structure is not standard.  There are addition data
> files as follows outlined below.  Each, if you will, represents separation
> of concerns with respect to structured securities like MBS and REMICs
> (CMOs).  They referenced via connection in the software via a connection
> string ~/users/BondLab.  I am looking for recommendations to create the
> directory and copy the appropriate folders with their data to a user
> directory on install.  Any help will be appreciated.
>
> Glenn
>
> BondData
> Groups
> PrepaymentModel
> REMIC
> RDME
> RAID
> RatesData
> Scenario
> Schedules
> Tranches
> Waterfall
>
>



Re: [R] compiling R with tuned BLAS

2015-05-22 Thread Charles Determan
Which OS are you using (Windows, Linux (distro), Mac)?  When you mention
.so files I tend to assume you are using a Linux system.  If you are using
ubuntu, changing the BLAS used by R is relatively trivial by using
'update-alternatives'.  More detail is provided at the following link:
http://www.stat.cmu.edu/~nmv/2013/07/09/for-faster-r-use-openblas-instead-better-than-atlas-trivial-to-switch-to-on-ubuntu/

Charles

On Thu, May 21, 2015 at 2:19 PM, Michael Gooch  wrote:

> I am looking at the instructions on
> http://cran.r-project.org/doc/manuals/r-patched/R-admin.html#ATLAS
>
> I have noticed that ATLAS produces two shared libs in addition to the *.a
> files:
> http://math-atlas.sourceforge.net/atlas_install/node22.html
>
> contents of the ATLAS lib directory:
> libatlas.a  libcblas.a  libf77blas.a  liblapack.a  libptcblas.a
> libptf77blas.a  libsatlas.so  libtatlas.so
>
> The instructions do not appear to match up with the *.a files & *.so files
> as described. (it appears to want me to use shared libs, but the names
> defined are static libs, not shared libs).
>
> should I simply be having it link against libtatlas.so (and pthreads) for
> shared threaded atlas and libsatlas.so for shared sequential atlas?
> do I need shared versions of the other static libraries?
>
> I think the help is a bit out of date, or at least unclear as to what it
> intends of me.
>
> M. Gooch
>
>



Re: [R] Path analysis

2015-05-26 Thread Charles Determan
Given that your problem primarily focuses on a biological context, you
probably would have better luck with Bioconductor (www.bioconductor.org).

Regards,
Charles

On Tue, May 26, 2015 at 12:43 AM, Alberto Canarini <
alberto.canar...@sydney.edu.au> wrote:

> Hi there,
>
> As I'm approaching path analysis I was wondering which packages may suite
> a path analysis for my data. My data are on interaction of soil biotic and
> abiotic factor, like microbial biomass carbon, soil carbon, water content,
> temperature etc.
>
> Thanks in advance,
>
> Best regards.
>
> Alberto
>
> Alberto Canarini
> PhD Student l Faculty of Agriculture and Environment
> THE UNIVERSITY OF SYDNEY
> Shared room l CCWF l Camden Campus l NSW 2570
> P 02 935 11892
>
>


Re: [R] An Odd Request

2015-05-29 Thread Charles Determan
If you are primarily interested in turning your R analyses into a website,
you should look into the 'Shiny' package. It makes generating web pages
very easy. Here is a link to the Shiny Gallery providing some examples (
http://shiny.rstudio.com/gallery/).
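As a small illustration of the idea (a hypothetical one-input app, not from the original thread):

```r
library(shiny)

# UI: one numeric input and one plot output
ui <- fluidPage(
  numericInput("n", "Sample size", value = 50, min = 1),
  plotOutput("hist")
)

# Server: redraw the histogram whenever the input changes
server <- function(input, output) {
  output$hist <- renderPlot(hist(rnorm(input$n)))
}

# shinyApp(ui, server)  # uncomment to launch the app locally
```
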

Regards,
Charles

On Fri, May 29, 2015 at 7:48 AM, Josh Grant  wrote:

> Hello R-Users
>
> I apologize in advance if my post is inappropriate. I read the entire
> posting guide and found nothing to say so, but you never know. I am seeking
> a knowledgable R-user that might be interested (for whatever reason) in
> helping out on what I hope would be considered a worthy project.
>
> I am a research scientist, albeit one with little programming ability. I
> recently started a website which allows patients of different sorts to
> suggest research studies. Everything is completely free and anonymous. When
> several members express interest in a particular idea I attempt to build it
> so they can actually run through the study. Clearly there are limits but we
> currently we have 4 communities, chronic fatigue syndrome, fibromyalgia,
> multiple sclerosis and pernicious anaemia and there are several active
> studies in which people are submitting data every day. It's quite exciting
> and I think it has great potential to help people, particularly with
> disorders that have defied explanation.
>
> I'm currently using google spreadsheets/forms to create symptom trackers
> and interactive dashboards of the results which (most of the time) show
> group results by default but which can show individual results if an ID is
> entered. Unfortunately google spreadsheets is a little limited and I now
> require the use of more complicated stats such as linear mixed models.
>
> I know that I need to move to R, I understand the basics of running
> statistical tests with packages such as LMER, but I have no clue how to go
> about integrating such analyses into a website. I could certainly learn
> how, would love to, and ultimately will, but if someone was interested in
> joining me in this endeavour much more could be accomplished.
>
> If you're interested in knowing more let me know.
>
> Josh
>


Re: [R] Help on Neural Network package

2015-06-02 Thread Charles Determan
Your model is reaching the error threshold; otherwise you would be receiving
an actual error message. Your model is just converging very quickly. If
you want to see the error decreasing, change lifesign="minimal" to
lifesign="full" and set lifesign.step=1 to make the output very verbose for
this model.

Alternatively, you could just look at winequal$result.matrix, which shows
how many steps were taken to reach the threshold as well as the
reached.threshold value your model finished on.

Cheers,
Charles

On Tue, Jun 2, 2015 at 2:59 PM, ravishankar narayanan 
wrote:

> Hi,
>
> Developing a neural network in R to predict the quality of wine. Attached
> is the wine data set. Tried twice, once by selecting specific features and
> once by using all features to predict quality of wine.
>
> Each time I run the neural network I am getting a huge error. Tried using
> feature selection and also used all features.
> With feature selection:
>
> winequal <- neuralnet(quality ~ volatile.acidity + citric.acid + sulphates
> + alcohol, winetrain, hidden = 2, lifesign = "minimal",
> linear.output = FALSE, threshold = 0.1)
>
> hidden: 2   thresh: 0.1   rep: 1/1   steps: 52   error: 8486.0   time: 2.41 secs
>
> No feature selection:
>
> winequal1 <- neuralnet(quality ~ fixed.acidity + volatile.acidity +
> citric.acid + residual.sugar + chlorides + free.sulfur.dioxide +
> total.sulfur.dioxide + density + pH + sulphates + alcohol, winetrain1,
> hidden = 4, lifesign = "minimal", linear.output = FALSE, threshold = 0.1)
>
> Model produced:
>
> hidden: 4   thresh: 0.1   rep: 1/1   steps: 26   error: 8486.08992   time: 0.1 secs
>
> Is there any method I can use to reduce the error? Changing the threshold
> or the number of hidden layers does not help. Any tips will be
> helpful. Thanks.
>



Re: [R] Rcpp cpp11 and R CMD build

2015-06-24 Thread Charles Determan
Hi Edwin,

If you look at the build output you will notice that the C++11 compiler
flag is not being used.  I just created a small package using Rcpp11 and
your function and it worked without a problem.  I can't give you a specific
reason without seeing your package but there are some possibilities I would
guess right away.

1. Make sure you are 'LinkingTo' Rcpp11 in your DESCRIPTION
2. Unless you are using some custom Makevars file, you should set
'SystemRequirements: C++11' in your DESCRIPTION
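For reference, a minimal sketch of the relevant DESCRIPTION fields (the package name is a placeholder; whether you link to Rcpp or Rcpp11 depends on which you are actually using):

```
Package: mypackage
Version: 0.1.0
Imports: Rcpp
LinkingTo: Rcpp
SystemRequirements: C++11
```
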

Charles

On Wed, Jun 24, 2015 at 10:07 AM, Edwin van Leeuwen 
wrote:

> Hi all,
>
> I've just started using Rcpp and am trying to get cpp11 support working. As
> suggested I added [[Rcpp:plugins(cpp11)]] to my source file and a test
> function:
> // [[Rcpp::export]]
> int useCpp11() {
>   auto x = 10;
>   return x;
> }
>
> This works fine when using:
> sourceCpp(filename)
> from R, but I would like to be able to compile the package from the command
> line.
> R CMD build mypackage
> fails with the following error:
> R CMD build ../fluEvidenceSynthesis
> * checking for file ‘../fluEvidenceSynthesis/DESCRIPTION’ ... OK
> * preparing ‘fluEvidenceSynthesis’:
> * checking DESCRIPTION meta-information ... OK
> * cleaning src
> * installing the package to process help pages
>   ---
> * installing *source* package ‘fluEvidenceSynthesis’ ...
> ** libs
> g++ -I/usr/share/R/include -DNDEBUG
> -I"/home/edwin/R/x86_64-pc-linux-gnu-library/3.2/Rcpp/include"
> -I"/home/edwin/R/x86_64-pc-linux-gnu-library/3.2/BH/include"   -fpic  -g
> -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat
> -Werror=format-security -D_FORTIFY_SOURCE=2 -g  -c RcppExports.cpp -o
> RcppExports.o
> g++ -I/usr/share/R/include -DNDEBUG
> -I"/home/edwin/R/x86_64-pc-linux-gnu-library/3.2/Rcpp/include"
> -I"/home/edwin/R/x86_64-pc-linux-gnu-library/3.2/BH/include"   -fpic  -g
> -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat
> -Werror=format-security -D_FORTIFY_SOURCE=2 -g  -c rcpp_hello_world.cpp -o
> rcpp_hello_world.o
> rcpp_hello_world.cpp: In function ‘int useCpp11()’:
> rcpp_hello_world.cpp:33:10: error: ‘x’ does not name a type
>  auto x = 10;
>   ^
> rcpp_hello_world.cpp:34:12: error: ‘x’ was not declared in this scope
>  return x;
> ^
> make: *** [rcpp_hello_world.o] Error 1
> ERROR: compilation failed for package ‘fluEvidenceSynthesis’
> * removing ‘/tmp/RtmpWdUduu/Rinst2b601aa285e9/fluEvidenceSynthesis’
>   ---
> ERROR: package installation failed
>
>
> Any help appreciated.
>
> Cheers, Edwin
>

Re: [R] Rcpp cpp11 and R CMD build

2015-06-24 Thread Charles Determan
Glad to help,

The SystemRequirements field is for a package. I believe the example in
the gallery is intended to demonstrate a standalone function: if you set
the C++ flags with

Sys.setenv("PKG_CXXFLAGS"="-std=c++11")

and then compile a single *.cpp file with Rcpp::sourceCpp("test.cpp"),
it should work fine. But for package purposes you want the
user to not have to care about setting flags manually.
It ultimately just comes down to context.

Regards,

Charles


On Wed, Jun 24, 2015 at 11:57 AM, Edwin van Leeuwen 
wrote:

> Thank you! I was missing the SystemRequirements. I guess it could be
> useful to add this to the example given here:
> http://gallery.rcpp.org/articles/simple-lambda-func-c++11/
>
> Cheers, Edwin
>
> On Wed, 24 Jun 2015 at 17:50 Charles Determan 
> wrote:
>
>> Hi Edwin,
>>
>> If you look at the build output you will notice that the C++11 compiler
>> flag is not being used.  I just created a small package using Rcpp11 and
>> your function and it worked without a problem.  I can't give you a specific
>> reason without seeing your package but there are some possibilities I would
>> guess right away.
>>
>> 1. Make sure you are 'LinkingTo' Rcpp11 in your DESCRIPTION
>> 2. Unless you are using some custom Makevars file, you should set
>> 'SystemRequirements: C++11' in your DESCRIPTION
>>
>> Charles
>>
>> On Wed, Jun 24, 2015 at 10:07 AM, Edwin van Leeuwen 
>> wrote:
>>
>>> Hi all,
>>>
>>> I've just started using Rcpp and am trying to get cpp11 support working.
>>> As
>>> suggested I added // [[Rcpp::plugins(cpp11)]] to my source file and a test
>>> function:
>>> // [[Rcpp::export]]
>>> int useCpp11() {
>>>   auto x = 10;
>>>   return x;
>>> }
>>>
>>> This works fine when using:
>>> sourceCpp(filename)
>>> from R, but I would like to be able to compile the package from the
>>> command
>>> line.
>>> R CMD build mypackage
>>> fails with the following error:
>>> R CMD build ../fluEvidenceSynthesis
>>> * checking for file ‘../fluEvidenceSynthesis/DESCRIPTION’ ... OK
>>> * preparing ‘fluEvidenceSynthesis’:
>>> * checking DESCRIPTION meta-information ... OK
>>> * cleaning src
>>> * installing the package to process help pages
>>>   ---
>>> * installing *source* package ‘fluEvidenceSynthesis’ ...
>>> ** libs
>>> g++ -I/usr/share/R/include -DNDEBUG
>>> -I"/home/edwin/R/x86_64-pc-linux-gnu-library/3.2/Rcpp/include"
>>> -I"/home/edwin/R/x86_64-pc-linux-gnu-library/3.2/BH/include"   -fpic  -g
>>> -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat
>>> -Werror=format-security -D_FORTIFY_SOURCE=2 -g  -c RcppExports.cpp -o
>>> RcppExports.o
>>> g++ -I/usr/share/R/include -DNDEBUG
>>> -I"/home/edwin/R/x86_64-pc-linux-gnu-library/3.2/Rcpp/include"
>>> -I"/home/edwin/R/x86_64-pc-linux-gnu-library/3.2/BH/include"   -fpic  -g
>>> -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat
>>> -Werror=format-security -D_FORTIFY_SOURCE=2 -g  -c rcpp_hello_world.cpp
>>> -o
>>> rcpp_hello_world.o
>>> rcpp_hello_world.cpp: In function ‘int useCpp11()’:
>>> rcpp_hello_world.cpp:33:10: error: ‘x’ does not name a type
>>>  auto x = 10;
>>>   ^
>>> rcpp_hello_world.cpp:34:12: error: ‘x’ was not declared in this scope
>>>  return x;
>>> ^
>>> make: *** [rcpp_hello_world.o] Error 1
>>> ERROR: compilation failed for package ‘fluEvidenceSynthesis’
>>> * removing ‘/tmp/RtmpWdUduu/Rinst2b601aa285e9/fluEvidenceSynthesis’
>>>   ---
>>> ERROR: package installation failed
>>>
>>>
>>> Any help appreciated.
>>>
>>> Cheers, Edwin
>>>
>>
>>
>>


Re: [R] modifying a package installed via GitHub

2015-07-20 Thread Charles Determan
Steve,

You are able to work with a GitHub package the same as any other GitHub
repo.  First clone the repo:

git clone https://github.com/user/repo.git

If you are using RStudio, it is simple enough to create a new project in
that directory (if the .Rproj file does not exist; otherwise open that).  Once
you have the project open for that directory you can modify source files
and rebuild and install as you like.  At the command line, do as Bob
instructed with R CMD INSTALL .

I recommend, however, either creating a new branch for your changes (if you
are familiar with git branch management) or at least changing the minor
version of the package so it doesn't conflict with the 'original'.  That
way you know which version of the package is installed at a given time.

Naturally, if you feel your modifications are valuable, you may want to
fork the package on GitHub and open a pull request so the maintainer can
incorporate your changes into the next release.
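In devtools terms, the round trip might look like the following sketch
('user/repo' and the local clone path are placeholders, not the actual
package in question):

```r
# Sketch: switching between the published package and a local, modified copy.
devtools::install_github("user/repo")   # install the original from GitHub

# ...clone the repo, create a branch, edit, bump Version in DESCRIPTION...

devtools::install("~/src/repo")         # install your modified local clone
devtools::install_github("user/repo")   # reinstall to revert to the original
```

Because an install of the same-named package overwrites the previous one,
reverting is just a reinstall from the original source.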

Hope this helps clarify things,

Charles



On Sat, Jul 18, 2015 at 8:49 AM, boB Rudis  wrote:

> You can go to the package directory:
>
> cd /some/path/to/package
>
> and do
>
> R CMD INSTALL .
>
> from a command-line there.
>
> Many github-based packages are also made using RStudio and you can
> just open the .Rproj file (i.e. load it into R studio) and build the
> package there which will install it.
>
> The same-named package will overwrite what you have previously installed.
>
> Just:
>
>devtools::install_github("owner/package")
>
> to go back to the original.
>
> On Fri, Jul 17, 2015 at 8:12 PM, Steve E.  wrote:
> > Hi Folks,
> >
> > I am working with a package installed via GitHub that I would like to
> > modify. However, I am not sure how I would go about loading a 'local'
> > version of the package after I have modified it, and whether that process
> > would include uninstalling the original unmodified package (and,
> > conversely, how to uninstall my local, modified version if I wanted to go
> > back to the unmodified version available on GitHub).
> >
> > Any advice would be appreciated.
> >
> >
> > Thanks,
> > Steve
> >
> >
> >
> > --
> > View this message in context:
> http://r.789695.n4.nabble.com/modifying-a-package-installed-via-GitHub-tp4710016.html
> > Sent from the R help mailing list archive at Nabble.com.
> >
>



Re: [R] modifying a package installed via GitHub

2015-07-20 Thread Charles Determan
You essentially have it, but you can also just click the build-and-install
button in RStudio to rebuild with the changes you made.  Technically it
would still work to push to your repo and reinstall with devtools.


On Monday, July 20, 2015, Stevan Earl  wrote:

> Bob and Charles,
>
> Thanks very much for taking the time to write, I greatly appreciate your
> help. I have been so spoiled by RStudio for so long that I cannot recall
> the last time I had to use R CMD install. Although I installed this package
> from GitHub using devtools, I do not see that an .Rproj exists, and the R
> code is in the .rdb and .rdx formats.
>
> However, if I understand Charles correctly, one approach would be to (1)
> fork the repo, (2) clone it, (3) make my edits, (4) push the edits to my
> fork of the repo, then (5) (re)install the package from my forked repo
> (e.g., install_github("myreponame/packagename"))...then I should be able
> to call all the functions with my edits. If I wanted to go back to the
> original, published version of the package, then I can just reinstall from
> the source (e.g., install_github("author/packagename")), and that will
> overwrite what I have done locally. Do I have that right?
>
> Thanks again for your thoughtful advice!
>
>
> Steve
>
> On Mon, Jul 20, 2015 at 5:52 AM, Charles Determan  > wrote:
>
>> Steve,
>>
>> You are able to work with a GitHub package the same as any other GitHub
>> repo.  First clone the repo:
>>
>> git clone https://github.com/user/repo.git
>>
>> If you are using RStudio, it is simple enough to create a new project in
>> that directory (if the .Rproj file does not exist; otherwise open that).
>> Once you have the project open for that directory you can modify source
>> files and rebuild and install as you like.  At the command line, do as Bob
>> instructed with R CMD INSTALL .
>>
>> I recommend, however, either creating a new branch for your changes (if
>> you are familiar with git branch management) or at least changing the minor
>> version of the package so it doesn't conflict with the 'original'.  That
>> way you know which version of the package is installed at a given time.
>>
>> Naturally, if you feel your modifications are valuable, you may want to
>> fork the package on GitHub and open a pull request so the maintainer can
>> incorporate your changes into the next release.
>>
>> Hope this helps clarify things,
>>
>> Charles
>>
>>
>>
>> On Sat, Jul 18, 2015 at 8:49 AM, boB Rudis > > wrote:
>>
>>> You can go to the package directory:
>>>
>>> cd /some/path/to/package
>>>
>>> and do
>>>
>>> R CMD INSTALL .
>>>
>>> from a command-line there.
>>>
>>> Many github-based packages are also made using RStudio and you can
>>> just open the .Rproj file (i.e. load it into R studio) and build the
>>> package there which will install it.
>>>
>>> The same-named package will overwrite what you have previously installed.
>>>
>>> Just:
>>>
>>>devtools::install_github("owner/package")
>>>
>>> to go back to the original.
>>>
>>> On Fri, Jul 17, 2015 at 8:12 PM, Steve E. >> > wrote:
>>> > Hi Folks,
>>> >
>>> > I am working with a package installed via GitHub that I would like to
>>> > modify. However, I am not sure how I would go about loading a 'local'
>>> > version of the package after I have modified it, and whether that
>>> process
>>> > would include uninstalling the original unmodified package (and,
>>> > conversely, how to uninstall my local, modified version if I wanted to
>>> go
>>> > back to the unmodified version available on GitHub).
>>> >
>>> > Any advice would be appreciated.
>>> >
>>> >
>>> > Thanks,
>>> > Steve
>>> >
>>> >
>>> >
>>> > --
>>> > View this message in context:
>>> http://r.789695.n4.nabble.com/modifying-a-package-installed-via-GitHub-tp4710016.html
>>> > Sent from the R help mailing list archive at Nabble.com.
>>> >
>>>
>>
>>
>



[R] FPMC?

2018-08-14 Thread Charles Determan
Greetings R users,

I recently came across an interesting paper regarding recommender systems.
The particular method defined in the manuscript was Factorizing
Personalized Markov Chains.  You can find the article in question here (
http://www.ra.ethz.ch/cdstore/www2010/www/p811.pdf).  I am curious if
anyone here has ever come across anything like this before in the R
community.  I have found multiple packages on Markov Chains but nothing
with respect to combining them with matrix factorization.  I will continue
to search around but thought I would pose the question here as well.

Regards,
Charles



Re: [R] Calculate the area under a curve

2015-08-24 Thread Charles Determan
Hi Carsten,

This list is meant to help you solve specific coding problems.  What have
you tried?  A quick Google search will turn up several packages, including
caTools, ROCR, AUC, and pROC.  Look into some of them, try them out, and
report back if you have problems using a function, instead of just asking
'how can I do this?'  As with most things in R, there are many different
ways to accomplish the same task.
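For the integration step itself, one hedged sketch (assuming, as in the post
below, a data frame P with numeric columns Depth and SOC) is to fit the
polynomial with a data argument and wrap predict() in a function of depth,
which integrate() can then handle:

```r
# Sketch, not a tested solution for this exact data set.
# P is assumed to be a data frame with numeric columns Depth and SOC.
fitP <- lm(SOC ~ poly(Depth, 3, raw = TRUE), data = P)

# integrate() needs a function of x; predict() on new data supplies one
# and is vectorized over the rows of newdata.
integrand <- function(x) predict(fitP, newdata = data.frame(Depth = x))
integrate(integrand, lower = 25, upper = 80)
```

The key difference from the attempt below is that the integrand is a
function of the depth values, not a call to predict(y).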

Regards,
Charles

On Mon, Aug 24, 2015 at 4:10 AM, CarstenH  wrote:

> Hi all
>
> I need to calculate the area under a curve (integral) for the following
> data
> pairs:
>
> Depth  SOC
> 22.5   0.143
> 28.5   0.165
> 34.5   0.131
> 37.5   0.134
> 40.5   0.138
> 43.5   0.107
> 46.5   0.132
> 49.5   0.175
> 52.5   0.087
> 55.5   0.117
> 58.5   0.126
> 61.5   0.130
> 64.5   0.122
> 67.5   0.161
> 71.5   0.144
> 76.5   0.146
> 82.5   0.156
> 94.5   0.132
>
> (Table name is P)
>
> After reading the data set I assigned the columns by:
>
> x <- (P$Depth)
> y <- (P$SOC)
>
> and decided to fit a polynomial function (3rd order):
>
> fitP <- lm( y~poly(x,3,raw=TRUE) )
>
> At the next step I failed. I can plot point and function but am not able to
> integrate the curve between e.g. depths 20 and 80.
>
> If I try:
>
> integrand <- function(fitP1)
>   predict(y)
> integrate(integrand, lower = 25, upper = 80)
>
> the "Console" opens with the message: "Source unavailable or out of sync"
> and
>
> function(fitP1)
> predict(y)
>
>
> Would be great if somebody could help!
>
> Thanks
>
> Carsten
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Calculate-the-area-under-a-curve-tp4711418.html
> Sent from the R help mailing list archive at Nabble.com.
>
>



Re: [R] how to find the mean and sd :(

2015-09-11 Thread Charles Determan
massmatics,

You are trying to take the mean/sd of an entire data.frame and therefore
you receive an error.  You must do some form of subset and take the mean of
the 'breaks' column.  This can be done a few ways (as with almost anything
in R).

AM.warpbreaks2 <- subset(AM.warpbreaks, breaks <= 30)
mean(AM.warpbreaks2$breaks)

or

mean(AM.warpbreaks$breaks[AM.warpbreaks$breaks <= 30])

or more concisely

with(AM.warpbreaks, mean(breaks[breaks <= 30]))

Again, the main point here is that you need to specify the column when
working with a data.frame object.
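Using R's built-in warpbreaks data set as a stand-in for AM.warpbreaks, the
same idea gives both requested statistics at once:

```r
# Sketch using the built-in warpbreaks data set in place of AM.warpbreaks.
# Subset the breaks column first, then compute both statistics on it.
with(warpbreaks, c(mean = mean(breaks[breaks <= 30]),
                   sd   = sd(breaks[breaks <= 30])))
```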

Regards,
Charles


On Fri, Sep 11, 2015 at 11:02 AM, Tom Wright  wrote:

> On Fri, 2015-09-11 at 07:48 -0700, massmatics wrote:
> > AM.warpbreaks<=30
>
> The above command is not returning what you expected, what part of the
> AM.warpbreaks dataframe is expected to be <= 30?
>
> Effectively you are using a two stage process.
> 1) Create a logical vector identifying rows in the dataframe with a
> breaks value <= 30
> 2) use the vector in 1. to extract just the rows you are interested in
> and use that to calculate the mean of the breaks column.
>
>



[R] [R-pkgs] New package: gpuR

2015-11-30 Thread Charles Determan
R Users,

I am happy to inform you that my 'gpuR' package has just been accepted to
CRAN.

https://cran.r-project.org/web/packages/gpuR/index.html

The gpuR package is designed to provide simple-to-use functions for
leveraging GPUs for computing.  Although there are a couple of existing
packages for GPUs in R, most are specific to NVIDIA GPUs and have very
limited, specific functions.  The package is based on an OpenCL backend
in conjunction with the ViennaCL library (which is bundled within the
RViennaCL package).  This allows the user to use almost any GPU (Intel,
AMD, or NVIDIA).  It is my hope that these functions can be used to more
rapidly develop algorithms within R that can leverage GPUs.

The package is structured around a few new S4 classes that retain the
object either on the host (CPU) or in GPU memory (thereby avoiding transfer
time).  I have included a minimal introductory vignette describing the
package further, providing a simple use case, and listing currently
available functions.

https://cran.r-project.org/web/packages/gpuR/vignettes/gpuR.pdf
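The two storage modes described above can be sketched as follows (a hedged
example; it assumes a working OpenCL installation and a visible device):

```r
# Sketch of gpuR's two S4 storage classes (requires an OpenCL setup).
library(gpuR)

A <- matrix(rnorm(16), 4, 4)
gpuA <- gpuMatrix(A)   # data stays on the host; computation runs on the GPU
vclA <- vclMatrix(A)   # data moved into GPU memory, avoiding repeat transfers
gpuA %*% gpuA          # GPU matrix multiplication
```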

You can view the github page here:

https://github.com/cdeterman/gpuR

which also contains a wiki to help with installation.  Although it must be
compiled from source, it can be installed on Linux, Mac OS X, and Windows
platforms.

I welcome any comments, issues (please submit them on GitHub), and of
course additional contributions.

Regards,
Charles Determan


___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



[R] GPU package crowd-source testing

2016-02-09 Thread Charles Determan
Greetings R users,

I would like to request any users who would be willing to test one of my
packages.  Normally I would be content using testthat and continuous
integration services but this particular package is used for GPU computing
(hence the cross-posting).  It is intended to be as general as possible for
available devices but I only have access to so much hardware.  I can't
possibly test it against every GPU available.

As such, I would sincerely appreciate any user that has at least one GPU
device (Intel, AMD, or NVIDIA) and is willing to experiment with the
package to try it out.  Note, this will require installing an OpenCL SDK of
some form.  Installation instructions for the package are found here (
https://github.com/cdeterman/gpuR/wiki).

At the very least, if you have a valid device, you would only need to
download the 'development' version of the package and experiment with the
functions such as a matrix multiplication.

devtools::install_github("cdeterman/gpuR", ref = "develop")

library(gpuR)
A <- gpuMatrix(rnorm(10000), 100, 100)
A %*% A

You could also clone my github repo and run all the unit tests I have
included

git clone -b develop https://github.com/cdeterman/gpuR.git

If using RStudio, just open the package in a new project and press
'Ctrl-Shift-T' or more directly run  `devtools::test()`

If using command-line R, switch to the gpuR directory, start R and run
`devtools::test()`.

If you find any errors or bugs, please report them in my github issues (
https://github.com/cdeterman/gpuR/issues).  Naturally any recommendations
on additional features are welcome.

Thank you in advance for any support you can provide.  I want to continue
improving this package but I am beginning to reach the end of what I can
accomplish from a hardware perspective.

Best Regards,
Charles



Re: [R] GPU package crowd-source testing

2016-02-11 Thread Charles Determan
R Users,

My sincere thanks to all those who have come forward to test my GPU
package and provide bug reports.  I want to follow up on my initial request
with a few qualifiers.

1. I neglected to tell users to also use my GitHub version of 'RViennaCL'
instead of the CRAN version.  I have made some updates whose release I was
postponing until I can solve the multiple-device issues in 'gpuR'.

devtools::install_github("cdeterman/RViennaCL")

2. When reporting bugs, either directly to me or ideally in my github
issues (https://github.com/cdeterman/gpuR/issues), please provide your
Operating System, OpenCL version (e.g. 1.0, 1.2, 2.0), OpenCL SDK (e.g.
AMD, CUDA toolkit, etc.) and GPU device.  If you don't know these things
you can get them from Sys.info() for the OS, platformInfo() for the OpenCL
SDK, gpuInfo() for the GPU information, and check your OpenCL header (cl.h)
for the /* OpenCL Version */ section for the highest version number.

3. If you have installed 'gpuR' and it is running without problems, I would
still like to know that too.  It would be good to begin generating a list of
'tested' devices and associated platforms.  I have just created a gitter
account.  I am relatively new to it, but I'm hoping it can be used to
consolidate responses.  In this case, you can simply reply on the
Tested_GPUs thread (https://gitter.im/cdeterman/gpuR/Tested_GPUs) with your
device and platform backend.
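The diagnostic calls requested in point 2 can be gathered in one snippet (a
sketch; it assumes 'gpuR' is installed and an OpenCL device is visible):

```r
# Sketch: collect the bug-report details requested above in one place.
library(gpuR)
Sys.info()[c("sysname", "release")]  # operating system
platformInfo()                       # OpenCL SDK / platform details
gpuInfo()                            # GPU device details
```

The OpenCL header version still has to be read from cl.h by hand, as noted
above.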

Again, thanks to all for taking the time to try out this package.

Regards,
Charles



On Tue, Feb 9, 2016 at 12:20 PM, Charles Determan 
wrote:

> Greetings R users,
>
> I would like to request any users who would be willing to test one of my
> packages.  Normally I would be content using testthat and continuous
> integration services but this particular package is used for GPU computing
> (hence the cross-posting).  It is intended to be as general as possible for
> available devices but I only have access to so much hardware.  I can't
> possibly test it against every GPU available.
>
> As such, I would sincerely appreciate any user that has at least one GPU
> device (Intel, AMD, or NVIDIA) and is willing to experiment with the
> package to try it out.  Note, this will require installing an OpenCL SDK of
> some form.  Installation instructions for the package are found here (
> https://github.com/cdeterman/gpuR/wiki).
>
> At the very least, if you have a valid device, you would only need to
> download the 'development' version of the package and experiment with the
> functions such as a matrix multiplication.
>
> devtools::install_github("cdeterman/gpuR", ref = "develop")
>
> library(gpuR)
> A <- gpuMatrix(rnorm(1), 100, 100)
> A %*% A
>
> You could also clone my github repo and run all the unit tests I have
> included
>
> git clone -b develop https://github.com/cdeterman/gpuR.git
>
> If using RStudio, just open the package in a new project and press
> 'Ctrl-Shift-T' or more directly run  `devtools::test()`
>
> If using command-line R, switch to the gpuR directory, start R and run
> `devtools::test()`.
>
> If you find any errors or bugs, please report them in my github issues (
> https://github.com/cdeterman/gpuR/issues).  Naturally any recommendations
> on additional features are welcome.
>
> Thank you in advance for any support you can provide.  I want to continue
> improving this package but I am beginning to reach the end of what I can
> accomplish from a hardware perspective.
>
> Best Regards,
> Charles
>
>
>



Re: [R] RSNNS neural network

2016-03-03 Thread Charles Determan
Unfortunately we can only provide so much help without a reproducible
example.  Can you reproduce the problem with a dataset that everyone has
access to?  Otherwise it is difficult for anyone to help you.

Regards,
Charles

On Tue, Mar 1, 2016 at 12:35 AM, jake88  wrote:

> I am new to R and neural networks. So I trained and predicted an elman
> network like so:
>
> require ( RSNNS )
> mydata = read.csv("mydata.csv",header = TRUE)
> mydata.train = mydata[1000:2000,]
> mydata.test = mydata[800:999,]
>
> fit <- elman( mydata.train[, 2:10], mydata.train[, 1], size = 100,
>   learnFuncParams = c(0.1), maxit = 1000 )
> pred <- predict(fit, mydata.test[, 2:10])
>
> So pred contains the predictions.
> The problem I am having is that when I run pred <- predict(fit,
> mydata.test[1, 2:10]) repeatedly, it gives me different results each time.
> Should not the weights and bias be set permanently in the network and give
> the same result every time?
>
>



[R] [R-pkgs] gpuR 1.1.0 Release

2016-03-18 Thread Charles Determan
Dear R users,

The next release of gpuR (1.1.0) has been accepted to CRAN (
http://cran.r-project.org/package=gpuR).

There have been multiple additions including:

1. Scalar operations for gpuMatrix/vclMatrix objects (e.g. 2 * X)
2. Unary '-' operator added (e.g. -X)
3. 'slice' and 'block' methods for vector & matrix objects respectively
4. 'deepcopy' methods
5. 'abs', 'max', 'min' methods added
6. 'cbind' & 'rbind' methods added for matrices
7. 't' method
8. 'distance' method for pairwise distances (euclidean and sqEuclidean)

Introductory vignette can be found at
https://cran.r-project.org/web/packages/gpuR/vignettes/gpuR.pdf

Help with installation can be found at
https://github.com/cdeterman/gpuR/wiki

Bug reports, suggestions, and feature requests are appreciated at
https://github.com/cdeterman/gpuR/issues

Happy GPU computing,
Charles



Re: [R] Fit a smooth closed shape through 4 points

2016-03-21 Thread Charles Determan
Hi Allie,

What is your goal here?  Do you just want to plot a curve through the data?
Do you want a function that approximates the data?

You may find the functions spline() and splinefun() useful.

A quick point, though: with so few points you are only going to get a very
rough approximation no matter which method you use.
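One possible sketch (an assumption on my part, not a tested solution) is to
treat the outline parametrically: interpolate x and y separately against a
parameter t with periodic splines so the curve closes, using the
shapepoints data from the post below:

```r
# Sketch: closed curve through 4 points via parametric periodic splines.
shapepoints <- structure(c(8.9, 0, -7.7, 0, 0, 2, 0, 3.8),
                         .Dim = c(4L, 2L),
                         .Dimnames = list(NULL, c("x", "y")))

pts <- rbind(shapepoints, shapepoints[1, ])  # repeat first point to close
t   <- seq_len(nrow(pts))                    # parameter along the outline

# method = "periodic" requires matching endpoint values, which the
# repeated first point provides.
sx <- spline(t, pts[, "x"], n = 200, method = "periodic")
sy <- spline(t, pts[, "y"], n = 200, method = "periodic")

plot(sx$y, sy$y, type = "l", asp = 1)
points(shapepoints, pch = 19)
```

The curve passes through the 4 points exactly; its shape between them
depends on the (arbitrary) parameterization.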

Regards,
Charles


On Mon, Mar 21, 2016 at 7:59 AM, Alexander Shenkin  wrote:

> Hello all,
>
> I have sets of 4 x/y points through which I would like to fit closed,
> smoothed shapes that go through those 4 points exactly.  smooth.spline
> doesn't like my data, since there are only 3 unique x points, and even
> then, I'm not sure smooth.spline likes making closed shapes.
>
> Might anyone else have suggestions for fitting algorithms I could employ?
>
> Thanks,
> Allie
>
>
> shapepoints = structure(c(8.9, 0, -7.7, 0, 0, 2, 0, 3.8), .Dim = c(4L,
> 2L), .Dimnames = list(NULL, c("x", "y")))
>
> smooth.spline(shapepoints)
>
> # repeat the first point to close the shape
> shapepoints = rbind(shapepoints, shapepoints[1,])
>
> smooth.spline(shapepoints)
>
>



Re: [R] TensorFlow in R

2016-04-01 Thread Charles Determan
Hi Axel,

Looks like the only thing right now is rflow (
https://github.com/terrytangyuan/rflow).  It appears to simply wrap the
Python bindings.  I am not aware of any others.  It will be interesting to
keep an eye on.

Regards,
Charles


On Fri, Apr 1, 2016 at 11:32 AM, Axel Urbiz  wrote:

> Hi All,
>
> I didn't have much success through my Google search in finding any active
> R-related projects to create a wrapper around TensorFlow in R. Anyone know
> if this is on the go?
>
> Thanks,
> Axel.
>
>



Re: [R] A Neural Network question

2016-04-19 Thread Charles Determan
Hi Phil,

I don't think this is the correct list for this.  Your question has nothing
to do with R specifically, which is the purpose here.  I suggest you pursue
other help lists related to neural networks to try to find someone to
assist you.

Regards,
Charles

On Sat, Apr 16, 2016 at 2:08 AM, Philip Rhoades  wrote:

> People,
>
> I thought I needed to have some familiarity with NNs for some of my
> current (non-profit, brain-related) projects so I started looking at
> various programming environments including R and I got this working:
>
>   http://gekkoquant.com/2012/05/26/neural-networks-with-r-simple-example
>
> however I needed pictures to help understand what was going on and then I
> found this:
>
>
> https://jamesmccaffrey.files.wordpress.com/2012/11/backpropagationcalculations.jpg
>
> which I thought was almost intelligible so I had an idea which I thought
> would help the learning process:
>
> - Create a very simple NN implemented as a spreadsheet where each sheet
> would correspond to an iteration
>
> I started doing this on LibreOffice:
>
> - I think I am already starting to get a better idea of how NNs work just
> from the stuff I have done on the spreadsheet already
>
> - I have now transferred my LibreOffice SpreadSheet (SS) to a shared
> Google Docs Calc file and can share it for editing with others
>
>
> https://docs.google.com/spreadsheets/d/1eSCgGU5qeI3_PmQhwZn4RH0NznUekVP5BP7w4MpKSUc/pub?output=pdf
>
> - I think I have the SS calculations correct so far except for the stuff
> in the dashed purple box in the diagram
>
> - I am not sure how to implement the purple box . . so I thought I would
> ask for help on this mailing list
>
> If someone can help me with the last bit of the SS, from there I think I
> can then repeat the FR and BP sheets and see how the Diffs evolve . .
>
> Is anyone interested in helping to get this last bit of the spreadsheet
> working so I can move on to doing actual work with the R packages with
> better understanding?
>
> Thanks,
>
> Phil.
> --
> Philip Rhoades
>
> PO Box 896
> Cowra  NSW  2794
> Australia
> E-mail:  p...@pricom.com.au
>
>
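For anyone reproducing the spreadsheet in code: the calculations in the dashed purple box of that diagram are the backpropagated deltas. Below is a minimal R sketch of one forward/backward iteration, assuming a single hidden layer, sigmoid units, and squared error — a common textbook formulation, not necessarily identical in every detail to the diagram.

```r
sigmoid <- function(z) 1 / (1 + exp(-z))

set.seed(1)
x <- c(1, 0); target <- 1            # one training pattern and its target
W1 <- matrix(rnorm(4), 2, 2)         # input -> hidden weights
W2 <- rnorm(2)                       # hidden -> output weights

# forward pass
h <- sigmoid(W1 %*% x)               # hidden activations (2 x 1)
o <- sigmoid(sum(W2 * h))            # output activation (scalar)

# backward pass: the deltas are what the "purple box" computes
delta_o <- (o - target) * o * (1 - o)                 # output-layer delta
delta_h <- as.vector(W2 * delta_o) * h * (1 - h)      # hidden-layer deltas

# gradient-descent weight updates
lr <- 0.5
W2 <- W2 - lr * delta_o * as.vector(h)
W1 <- W1 - lr * delta_h %*% t(x)
```

Repeating the forward/backward steps corresponds to adding further sheets in the spreadsheet, one per iteration.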

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] openssl package install error

2016-05-06 Thread Charles Determan
I am trying to install 'openssl' on Ubuntu 14.04.  I already have libssl-dev
and libcurl4-openssl-dev installed.  But when I try to install I get a
bunch of errors complaining about unknown type 'u_char'.

Thoughts?

Excerpt of output:

Found pkg-config cflags and libs!
Using PKG_CFLAGS=
Using PKG_LIBS=-lssl -lcrypto
** libs
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
aes.c -o aes.o
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
base64.c -o base64.o
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
bignum.c -o bignum.o
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
cert.c -o cert.o
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
diffie.c -o diffie.o
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
envelope.c -o envelope.o
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
error.c -o error.o
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
hash.c -o hash.o
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
info.c -o info.o
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
keygen.c -o keygen.o
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
onload.c -o onload.o
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
openssh.c -o openssh.o
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
rand.c -o rand.o
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
read.c -o read.o
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
rsa.c -o rsa.o
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
signing.c -o signing.o
ccache gcc-4.8 -I/usr/share/R/include -DNDEBUG  -fpic  -std=c99 -c
ssl.c -o ssl.o
In file included from /usr/include/resolv.h:65:0,
 from ssl.c:15:
/usr/include/arpa/nameser.h:115:2: error: unknown type name ‘u_char’
  const u_char *_msg, *_eom;
  ^
/usr/include/arpa/nameser.h:117:2: error: unknown type name ‘u_char’
  const u_char *_sections[ns_s_max];
  ^
/usr/include/arpa/nameser.h:120:2: error: unknown type name ‘u_char’
  const u_char *_msg_ptr;
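The "unknown type name 'u_char'" errors come from compiling with -std=c99, which hides BSD typedefs such as u_char in glibc's headers. One workaround (a sketch based on this setup, not the only possible fix) is to switch R's C flags to the GNU dialect via ~/.R/Makevars before reinstalling:

```shell
# gnu99 keeps C99 semantics but re-enables BSD typedefs like u_char
mkdir -p ~/.R
echo 'CFLAGS = -std=gnu99' >> ~/.R/Makevars
```

After that, install.packages("openssl") in a fresh R session should pick up the new flag.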


Re: [R] Converting a list to a data frame

2016-11-04 Thread Charles Determan
Hi Kevin,

There may be a more elegant way but the following do.call and lapply should
solve your problem.

do.call(rbind, lapply(seq_along(x),
                      function(i) data.frame(set = i, x[[i]])))

Regards,
Charles
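A variant of the same idea that reuses the list's own element names for the 'set' column (shown on Kevin's toy example):

```r
x <- list('1' = data.frame(id = 1:4, expand.grid(x1 = 0:1, x2 = 0:1)),
          '2' = data.frame(id = 5:8, expand.grid(x1 = 2:3, x2 = 2:3)))

# Map pairs each element name with its data frame before rbind-ing
res <- do.call(rbind, Map(function(nm, df) data.frame(set = nm, df),
                          names(x), x))
rownames(res) <- NULL
res
```

Contributed packages offer one-liners for this, too, e.g. dplyr::bind_rows(x, .id = "set") or data.table::rbindlist(x, idcol = "set").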

On Fri, Nov 4, 2016 at 7:37 AM, Kevin E. Thorpe 
wrote:

> There is probably a very simple elegant way to do this, but I have been
> unable to find it. Here is a toy example. Suppose I have a list of data
> frames like this.
>
>  print(x <- list('1'=data.frame(id=1:4,expand.grid(x1=0:1,x2=0:1)),'2'=
> data.frame(id=5:8,expand.grid(x1=2:3,x2=2:3
> $`1`
>   id x1 x2
> 1  1  0  0
> 2  2  1  0
> 3  3  0  1
> 4  4  1  1
>
> $`2`
>   id x1 x2
> 1  5  2  2
> 2  6  3  2
> 3  7  2  3
> 4  8  3  3
>
> The real application will have more than 2 elements so I'm looking for a
> general approach. I basically want to rbind the data frames in each list
> element and add a variable that adds the element name. In this example the
> result would look something like this.
>
> rbind(data.frame(set='1',x[[1]]),data.frame(set='2',x[[2]]))
>   set id x1 x2
> 1   1  1  0  0
> 2   1  2  1  0
> 3   1  3  0  1
> 4   1  4  1  1
> 5   2  5  2  2
> 6   2  6  3  2
> 7   2  7  2  3
> 8   2  8  3  3
>
> Obviously, for 2 elements the simple rbind works but I would like a
> general solution for arbitrary length lists. Hopefully that is clear.
>
> Kevin
>
> --
> Kevin E. Thorpe
> Head of Biostatistics,  Applied Health Research Centre (AHRC)
> Li Ka Shing Knowledge Institute of St. Michael's Hospital
> Assistant Professor, Dalla Lana School of Public Health
> University of Toronto
> email: kevin.tho...@utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posti
> ng-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



[R] [R-pkgs] gpuR 1.2.0 released

2016-12-22 Thread Charles Determan
Dear R users,



I am happy to announce the most recent version of gpuR has been released.
There are several new enhancements to the package including:

1. Automatically detect an available SDK on install

2. Simplified installation: build an OpenCL ICD when an OpenCL driver is
present but no SDK is installed (thanks Yixuan Qiu)

3. Control over individual OpenCL contexts so the user can choose which
device to use

4. Added as.* methods for vclMatrix/Vector and gpuMatrix/Vector objects

5. Added str method for matrix objects

6. Added length method for matrix objects

7. Added solve method for square vclMatrix objects

8. Added QR decomposition, SVD, and Cholesky decomposition for square
gpuMatrix/vclMatrix objects

9. Added diag and diag<- methods for matrix objects


There are many more features in the works.  Suggestions and contributions
continue to be welcomed.  Please submit all through my github issues
https://github.com/cdeterman/gpuR.git


Thanks also to everyone who tested this package on various GPU devices and
operating systems.  Much of the package's stability is made possible by
your efforts.
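For anyone new to the package, a minimal session (assuming a working OpenCL driver and device; results depend on your hardware) looks roughly like this:

```r
library(gpuR)

# allocate matrices directly in GPU memory
A <- vclMatrix(rnorm(16), nrow = 4, ncol = 4)
B <- vclMatrix(rnorm(16), nrow = 4, ncol = 4)

C <- A %*% B   # the multiplication executes on the GPU
C[]            # bracket indexing copies the result back to the host
```

gpuMatrix objects behave similarly but keep a host-side copy, trading memory for convenience.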


Kind regards,

Charles


___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] Multi-GPU "Yinyang" K-means and K-nn for R

2017-02-23 Thread Charles Determan
Hi Vadim,

I would be happy to explore helping you out with this.  I am quite active
in GPU development for R; you can see my work on my GitHub (
https://github.com/cdeterman) and the group I created for additional
packages in development (https://github.com/gpuRcore).  I believe it would
be best, though, to take this conversation off list.  If you would like to
discuss this further, please email me separately.

Kind regards,
Charles


On Thu, Feb 23, 2017 at 4:37 AM, Vadim Markovtsev 
wrote:

> ¡Hola!
>
> This is to announce that [kmcuda](https://github.com/src-d/kmcuda) has
> obtained native R bindings and ask for the help with CRAN packaging.
> kmcuda is my child: an efficient GPGPU (CUDA) library to do K-means
> and K-nn on as much data as fits into memory. It supports running on
> multiple GPUs simultaneously, angular distance metric, Yinyang
> refinement, float16 (well, not in R for sure), K-means++ and AFK-MC2
> initialization. I am thinking about Minibatch in the near future.
>
> Usage example:
>
> dyn.load("libKMCUDA.so")
> samples <- replicate(4, runif(16000))
> result = .External("kmeans_cuda", samples, 50, tolerance=0.01,
>  seed=777, verbosity=1)
> print(result$centroids)
> print(result$assignments[1:10,])
>
> This library only supports Linux and macOS at the moment. Windows
> port is welcome.
>
> I knew pretty much nothing about R a week ago so would be glad to your
> suggestions. Besides, I've never published anything to CRAN and it
> will take some time for me to design a full package following the
> guidelines and rules. It will be awesome If somebody is willing to
> help! It seems to be the special fun to package the CUDA+OpenMP
> code for R and this fun doubles on macOS where you need a specific
> combination of two different clang compilers to make it work.
>
> Besides, I have a question which prevents me from sleeping at night:
> how is R able to support matrices with dimensions larger than
> INT32_MAX if the only integer type in C API is int (32-bit signed on
> Linux)? Even getting the dimensions with INTEGER() automatically leads
> to the overflow.
> --
> Best regards,
>
> Vadim Markovtsev
> Lead Machine Learning Engineer || source{d} / sourced.tech / Madrid
> StackOverflow: 69708/markhor | GitHub: vmarkovtsev | data.world:
> vmarkovtsev
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
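On Vadim's INT32_MAX question: since R 3.0.0, atomic vectors may be "long" (up to 2^52 elements), and package C code should query lengths with R_xlen_t / XLENGTH() rather than int / LENGTH(), which cannot represent counts past 2^31 - 1. A sketch (assumes R's C API headers; it would be compiled with R CMD SHLIB, not a plain compiler):

```c
#include <R.h>
#include <Rinternals.h>

/* Return the element count of a possibly-long vector.
 * LENGTH() returns int and is unreliable past INT_MAX;
 * XLENGTH() returns R_xlen_t (effectively 64-bit).     */
SEXP vector_length64(SEXP x)
{
    R_xlen_t n = XLENGTH(x);
    return ScalarReal((double) n);  /* doubles hold integers exactly up to 2^53 */
}
```

Matrix dimensions, by contrast, are still limited to 2^31 - 1 per dimension; only the total vector length may exceed INT32_MAX.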


Re: [R] DeepNet package - how to add hidden layers?

2015-01-19 Thread Charles Determan Jr
Hi Davide,

You really shouldn't post on multiple forums.  Please see my response on
SO,
http://stackoverflow.com/questions/27990932/r-deepnet-package-how-to-add-more-hidden-layers-to-my-neural-network,
where I explain that you can add additional layers by simply adding more
elements to the 'hidden' vector.

Regards,
Charles
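Concretely, the depth of the network is controlled by the length of the 'hidden' argument. A small sketch with illustrative data (deepnet must be installed):

```r
library(deepnet)

set.seed(42)
x <- matrix(rnorm(200), nrow = 100, ncol = 2)  # toy inputs
y <- as.integer(x[, 1] + x[, 2] > 0)           # toy binary target

# three hidden layers of 10, 8 and 5 units: just lengthen `hidden`
nn <- sae.dnn.train(x, y, hidden = c(10, 8, 5), numepochs = 3)
pred <- nn.predict(nn, x)
```

Each extra element of `hidden` adds one more hidden layer of that many units.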

On Fri, Jan 16, 2015 at 12:29 PM, davide.chi...@gmail.com <
davide.chi...@gmail.com> wrote:

> Hi
> I just started to study the "deepnet" package:
> http://cran.r-project.org/web/packages/deepnet/index.html
>
> It is about "deep leaning", so about the usage of multi-layer neural
> networks.
> I've started to use the train() functions available in the package,
> but I really cannot understand how to add more hidden layers in the
> neural networks.
> Does some of you have an idea?
>
> I am using the sae.dnn.train() function but I cannot understand which
> parameter controls the number of hidden layers.
>
> Thanks a lot,
>
>  -- Davide Chicco
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Dr. Charles Determan, PhD
Integrated Biosciences



Re: [R] Neural Network

2015-01-21 Thread Charles Determan Jr
Javad,

Your question is a little too broad to be answered definitively.  Also,
this is not a code-writing service.  You should make a meaningful attempt,
and we are here to help when you get stuck.

1. If you want to know whether you can fit neural nets in R, the answer is
yes.  The three packages most commonly used (that I know of) are
'neuralnet', 'nnet' and 'RSNNS'.  Look into each package's documentation
for how to use them.  There are also many examples online if you simply
google them.

2. Your question is unclear: are you wanting to predict all the variables
(e.g. phosphorus, Total N, etc.), or do you have some metric for
eutrophication?  What exactly is the model supposed to predict?

3. If you want to know whether a neural net is appropriate, that is more
of a statistical question; it depends on the question you want to answer.
Given your temporal data, you may also want to look into mixed-effects
models (e.g. nlme, lme4) as another potential approach.

Regards,

On Tue, Jan 20, 2015 at 11:35 PM, javad bayat via R-help <
r-help@r-project.org> wrote:

> Dear all;
> I am the new user of R. I want to simulation or prediction the
> Eutrophication of a lake. I have weekly data(almost for two years) for
> Total phosphorus, Total N, pH, Chlorophyll a, Alkalinity, Silica.
> Can I predict the Eutrophication by Neural Network in R?
> How can I simulation the Eutrophication by these parameter?
> please help me to write the codes.
> many thanks.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Dr. Charles Determan, PhD
Integrated Biosciences



Re: [R] Neural Network

2015-01-22 Thread Charles Determan Jr
Javad,

First, please make sure to hit 'reply all' so that these messages go to the
R help list so others (many far more skilled than I) may possibly chime in.

The problem here is that you appear to have no dependent variable (i.e. no
eutrophication variable).  Without it, there is no way to do a typical
'supervised' analysis.  Given that this is likely a regression-type
problem (I assume eutrophication would be continuous), I'm not quite sure
'supervised' is the correct description, but it furthers my point that you
need a dependent variable for any neural net algorithm I am aware of.  As
such, if you don't have a dependent variable, you will need to look at
unsupervised methods such as PCA.  Other users may have other suggestions.
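To illustrate the unsupervised route, here is a PCA sketch on made-up water-chemistry variables (the column names are stand-ins for the real measurements, and the values are random):

```r
set.seed(1)
# illustrative data: 50 sampling dates, 4 measured variables
chem <- data.frame(TP   = rnorm(50),   # total phosphorus
                   TN   = rnorm(50),   # total nitrogen
                   pH   = rnorm(50),
                   chla = rnorm(50))   # chlorophyll a

# center and scale so each variable contributes comparably
pc <- prcomp(chem, center = TRUE, scale. = TRUE)
summary(pc)   # proportion of variance captured by each component
```

On real data, the loadings in pc$rotation would show which nutrient variables move together, without requiring a eutrophication response variable.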

Regards,
Charles

On Wed, Jan 21, 2015 at 11:36 PM, javad bayat  wrote:

> Dear Charles;
> Many thanks for your attention. what I want to know is: How can I predict
> the Eutrophication by these parameters in the future?
> These variables are the most important variables that control the Eutro.
> in lakes.
> Let me break it to two parts.
> 1) How can I predict these variables by NN?
> 2) Is it possible to predict the Eutro. by these variables?
>
>
> Many thanks for your help.
>  Regards,
>
>
>
>
>
>
>
> 
> On Wed, 1/21/15, Charles Determan Jr  wrote:
>
>  Subject: Re: [R] Neural Network
>  To: "javad bayat" 
>  Cc: "r-help@r-project.org" 
>  Date: Wednesday, January 21, 2015, 9:10 PM
>
>  Javad,
>  You
>  question is a little too broad to be answered
>  definitively.  Also, this is not a code writing service.
>  You should make a meaningful attempt and we are here to help
>  when you get stuck.
>  1.
>  If you want to know if you can do neural nets, the answer is
>  yes.  The three packages most commonly used (that I know
>  of) are 'neuralnet', 'nnet' and
>  'RSNNS'.  You should look in to these package
>  documentation for how to use them.  There are also many
>  examples online if you simply google them.
>  2. You question is unclear, are you
>  wanting to predict all the variables (e.g. phosphorus, Total
>  N, etc.) or do you have some metric for eutrophication?
>  What exactly is the model supposed to predict?
>  3. If you want to know if a
>  neuralnet is appropriate, that is more of a statistical
>  question.  It depends more on the question you want to
>  answer.  Given your temporal data, you may want to look in
>  to mixed effects models (e.g nlme, lme4) as another
>  potential approach.
>  Regards,
>  On Tue, Jan 20, 2015 at
>  11:35 PM, javad bayat via R-help 
>  wrote:
>  Dear
>  all;
>
>  I am the new user of R. I want to simulation or prediction
>  the Eutrophication of a lake. I have weekly data(almost for
>  two years) for Total phosphorus, Total N, pH, Chlorophyll a,
>  Alkalinity, Silica.
>
>  Can I predict the Eutrophication by Neural Network in R?
>
>  How can I simulation the Eutrophication by these
>  parameter?
>
>  please help me to write the codes.
>
>  many thanks.
>
>
>
>  ______
>
>  R-help@r-project.org
>  mailing list -- To UNSUBSCRIBE and more, see
>
>  https://stat.ethz.ch/mailman/listinfo/r-help
>
>  PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
>
>  and provide commented, minimal, self-contained, reproducible
>  code.
>
>
>
>
>  --
>  Dr. Charles Determan, PhD
>  Integrated Biosciences
>
>
>


-- 
Dr. Charles Determan, PhD
Integrated Biosciences



Re: [R] ggplot courses?

2015-01-22 Thread Charles Determan Jr
I don't know about any courses but I recommend the cookbook for R website:

http://www.cookbook-r.com/Graphs/

There are many examples implementing ggplot2 for different types of plots.
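For a first taste coming from base graphics, a basic ggplot2 scatterplot looks like this (using the built-in mtcars data):

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")
print(p)   # ggplot objects render when printed
```

The grammar is additive: swapping geom_point() for another geom, or adding facet_wrap(), changes the plot without rewriting it.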

Hope this helps,

On Thu, Jan 22, 2015 at 12:14 PM, Erin Hodgess 
wrote:

> Hello!
>
> Are there any ggplot courses, please?  This would be for a beginner in
> ggplot; I know the base plot, but nothing from there.
>
> Thanks,
> Erin
>
>
> --
> Erin Hodgess
> Associate Professor
> Department of Mathematical and Statistics
> University of Houston - Downtown
> mailto: erinm.hodg...@gmail.com
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Dr. Charles Determan, PhD
Integrated Biosciences



Re: [R] Neural Network

2015-01-26 Thread Charles Determan Jr
Javad,

You misunderstand what is meant by 'dependent' and 'independent'
variables.  What you are describing is statistical independence.  Please
review these basic statistical concepts:
http://en.wikipedia.org/wiki/Dependent_and_independent_variables.  Perhaps
the terms 'explanatory' (e.g. your phosphorus, nitrogen, etc.) and
'response' (e.g. eutrophication) variables are more approachable.

Now, as I was saying in my first response, you don't appear to have a
dependent/response variable (i.e. eutrophication).  Nowhere in your data
do you say that eutrophication was measured or is represented in any way.
I assume you have 'a priori' knowledge that those variables are involved
in eutrophication.  You are now asking if you can predict eutrophication
from these variables.  Well, without something for a statistical model to
evaluate against, there is no means to do so, hence the exploratory,
unsupervised analysis I recommended.

With respect to your other question, "How can I predict these variables by
NN?", you need something to test against.  For example, let's say I want
to predict how much ice cream will be sold today and I have a bunch of
data with amounts of ice cream sold but no other data.  No matter how you
approach this problem, you cannot get much out of a list of numbers with
nothing to test against.

Now, if my ice cream data has the amount of ice cream and the temperature
of each day associated with the respective amount sold, I can do
something: a basic linear regression to help predict how much ice cream
will be sold given today's temperature.
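The ice-cream analogy as a toy R model (fabricated numbers, purely illustrative):

```r
set.seed(2)
temp  <- runif(30, 15, 35)                   # daily temperature (predictor)
sales <- 50 + 4 * temp + rnorm(30, sd = 10)  # ice cream sold (response)

fit <- lm(sales ~ temp)
# expected sales on a 28-degree day
predict(fit, newdata = data.frame(temp = 28))
```

With temperature recorded alongside sales there is something to regress against; with sales alone there is not.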

The same appears to be true of your data.  You have your measured
variables (assuming you are trying to predict nitrogen, chlorophyll, etc.)
but nothing to test against.  The best you may have is your time data,
which I can only assume are actual dates?  If so, you could do some form
of prediction based on the date.  If your data are just every two weeks
(no date, just repeated measures), you could analyze them temporally to
see if the various nutrients are changing over time and potentially
extrapolate (with caution) where the levels may ultimately reach.  This
may be of interest to you.

As a last point, seeing as this is an environmental analysis, you could
also try the R-sig-ecology mailing list.  I am admittedly not an
ecologist, and there may be other approaches or methods that could be
used.  Feel free to sign up for that list here:
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

I hope this explanation helps you get a better grasp of what you are trying
to accomplish.
Regards,

On Sat, Jan 24, 2015 at 12:41 AM, javad bayat  wrote:

> Dear Charles;
> I think my variables are dependent. For e.g. the concentration of
> Phosphorus, Nitrogen, Silica and etc. have effect on the present of
> Chlorophyll a and the concentration of Chlorophyll a can make the
> Eutrophication in lake along with other algeas.
> So I think they are dependent variables.
> Regards.
>
>
>
> 
> On Thu, 1/22/15, Charles Determan Jr  wrote:
>
>  Subject: Re: [R] Neural Network
>  To: "javad bayat" , "r-help@r-project.org" <
> r-help@r-project.org>
>  Date: Thursday, January 22, 2015, 4:41 PM
>
>  Javad,
>  First,
>  please make sure to hit 'reply all' so that these
>  messages go to the R help list so others (many far more
>  skilled than I) may possibly chime in.
>  The problem here is that you appear
>  to have no dependent variable (i.e. no eutrophication
>  variable).  Without it, there is no way to a typical
>  'supervised' analysis.  Given that this is likely a
>  regression type problem (I assume eutrophication would be
>  continous) I'm not quite sure 'supervised' is
>  the correct description but it furthers my point that you
>  need a dependent variable for any neuralnet algorithm I am
>  aware of.  As such, if you don't have a dependent
>  variable then you will need to look at unsupervised methods
>  such as PCA.  Other users may have other
>  suggestions.
>  Regards,Charles
>  On Wed, Jan 21, 2015 at
>  11:36 PM, javad bayat 
>  wrote:
>  Dear
>  Charles;
>
>  Many thanks for your attention. what I want to know is: How
>  can I predict the Eutrophication by these parameters in the
>  future?
>
>  These variables are the most important variables that
>  control the Eutro. in lakes.
>
>  Let me break it to two parts.
>
>  1) How can I predict these variables by NN?
>
>  2) Is it possible to predict the Eutro. by these
>  variables?
>
>
>
>
>
>  Many thanks for your help.
>
>   Regards,
>
>
>
>
>
>
&g

Re: [R] missing in neural network

2015-03-24 Thread Charles Determan Jr
Hi Soheila,

You are using the formula argument incorrectly.  The neuralnet function
has a separate argument for data, aptly named 'data'.  You can review the
arguments by looking at the documentation with ?neuralnet.

As I cannot reproduce your data the following is not tested but I think
should work for you.

# Join your response variable to your data set.
mydata <- cbind(data, resp)

# Run neuralnet (note: neuralnet() has no 'na.rm' argument)
out <- neuralnet(resp ~ ., data = mydata, hidden = 4, lifesign = "minimal",
                 linear.output = FALSE, threshold = 0.1)


Best,
Charles

On Tue, Mar 24, 2015 at 4:47 AM, Soheila Khodakarim 
wrote:

> Dear All,
>
> I want to run "neural network" on my dataset.
> ##
> resp<-c(1,1,1,0,1,0,1,0,1,0,1,0,1,0,1,1,0,1,0,1)
> dim(data)
> #20*3110
>
> out <- neuralnet(y ~ data, hidden = 4, lifesign = "minimal", linear.output
> = FALSE, threshold = 0.1,na.rm = TRUE)
> 
> but I see this Error
> Error in varify.variables(data, formula, startweights, learningrate.limit,
>  :
>   argument "data" is missing, with no default
>
> What should I do now??
>
> Best Regards,
> Soheila
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] missing in neural network

2015-03-24 Thread Charles Determan Jr
I should have actually created some test code for you.  Here is an example:

library(neuralnet)
data(infert)

# create your formula
fm <- as.formula(paste("case ~ ", paste(colnames(infert)[c(3,4,6)],
collapse="+")))

# call neuralnet
net.infert <- neuralnet(fm, infert,
err.fct="ce", linear.output=FALSE, likelihood=TRUE)

You don't want to index your dataset like that; use the formula interface
instead.  Interestingly, the '.' notation doesn't seem to work here.  Your
formula call will probably look like this:

fm <- as.formula(paste("resp ~", paste(colnames(data), collapse="+")))

And your call would be

out1 <- neuralnet(fm,data=mydata, hidden = 4, lifesign = "minimal",
linear.output = FALSE, threshold = 0.1)

Regards,

Charles

On Tue, Mar 24, 2015 at 9:56 AM, Soheila Khodakarim 
wrote:

> Dear Charles,
>
> Thanks for your guide.
> I run this code:
>
> library("neuralnet")
> resp<-c(1,1,1,0,1,0,1,0,1,0,1,0,1,0,1,1,0,1,0,1)
> mydata <- cbind(data, resp)
> out1 <- neuralnet(resp~mydata[,1:3110],data=mydata, hidden = 4, lifesign =
> "minimal", linear.output = FALSE, threshold = 0.1)
>
> I saw this error
>
> Error in neurons[[i]] %*% weights[[i]] : non-conformable arguments
>
> :(:(:(
>
> What should I do now??
>
> Regards,
> Soheila
>
>
> On Tue, Mar 24, 2015 at 3:48 PM, Charles Determan Jr 
> wrote:
>
>> Hi Soheila,
>>
>> You are using the formula argument incorrectly.  The neuralnet function
>> has a separate argument for data aptly names 'data'.  You can review the
>> arguments by looking at the documentation  with ?neuralnet.
>>
>> As I cannot reproduce your data the following is not tested but I think
>> should work for you.
>>
>> # Join your response variable to your data set.
>> mydata <- cbind(data, resp)
>>
>> # Run neuralnet
>> out <- neuralnet(resp ~ ., data=mydata, hidden = 4, lifesign = "minimal",
>>linear.output = FALSE, threshold = 0.1,na.rm =
>> TRUE)
>>
>>
>> Best,
>> Charles
>>
>> On Tue, Mar 24, 2015 at 4:47 AM, Soheila Khodakarim <
>> lkhodaka...@gmail.com> wrote:
>>
>>> Dear All,
>>>
>>> I want to run "neural network" on my dataset.
>>> ##
>>> resp<-c(1,1,1,0,1,0,1,0,1,0,1,0,1,0,1,1,0,1,0,1)
>>> dim(data)
>>> #20*3110
>>>
>>> out <- neuralnet(y ~ data, hidden = 4, lifesign = "minimal",
>>> linear.output
>>> = FALSE, threshold = 0.1,na.rm = TRUE)
>>> 
>>> but I see this Error
>>> Error in varify.variables(data, formula, startweights,
>>> learningrate.limit,
>>>  :
>>>   argument "data" is missing, with no default
>>>
>>> What should I do now??
>>>
>>> Best Regards,
>>> Soheila
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>>
>>
>


-- 
Dr. Charles Determan, PhD
Integrated Biosciences



Re: [R] Fwd: missing in neural network

2015-03-25 Thread Charles Determan Jr
Soheila,

Set the name of the last column, and remove the comma when indexing the
column names: colnames() returns a vector, so it takes no comma.  Also,
after loading your dataset I realized you have invalid column names: the
hyphens (mixed in with underscores) make the as.formula call unhappy.  A
simple fix is to change all hyphens to underscores with gsub.

# name 'resp'
colnames(mydata)[ncol(mydata)] <- "resp"

# change hyphens to underscores
colnames(mydata) <- gsub("-","_",colnames(mydata))

# create formula (without comma index on column names)
fm <- as.formula(paste("resp ~", paste(colnames(mydata)[1:3110],
collapse="+")))

# call neuralnet
out <- neuralnet(fm,data=mydata, hidden = 4, lifesign = "minimal",
linear.output = FALSE, threshold = 0.1)

Best regards,
Charles


On Wed, Mar 25, 2015 at 3:29 AM, Soheila Khodakarim 
wrote:

> Dear Charles,
>
> I rewrote code :
> library("neuralnet")
> resp<-c(1,1,1,0,1,0,1,0,1,0,1,0,1,0,1,1,0,1,0,1))
> mydata <- cbind(data24_2, resp)
> dim(mydata)
>  >  20 3111
> fm <- as.formula(paste("resp ~ ", paste(colnames(mydata)[,1:3110],
> collapse="+")))
> > Error in colnames(mydata)[, 1:3110] : incorrect number of dimensions
> :(((
>
> AND
>
> fm <- as.formula(paste(colnames(mydata)[,3111],
> paste(colnames(mydata)[,1:3110], collapse="+")))
> > Error in colnames(mydata)[, 3111] : incorrect number of dimensions
>
> Best,
> Soheila
>
> On Wed, Mar 25, 2015 at 11:12 AM, Soheila Khodakarim <
> lkhodaka...@gmail.com> wrote:
>
>> Hi Charles,
>> Many thanks for your help. I will check and let you know.
>>
>> Best Wishes,
>> Soheila
>> On Mar 25, 2015 12:17 AM, "Charles Determan Jr"  wrote:
>>
>>> Soheila,
>>>
>>> Did my second response help you?  It is polite to close say if so, that
>>> way others who come across the problem no that it was solved.  If not, feel
>>> free to update your question.
>>>
>>> Regards,
>>> Charles
>>>
>>> On Tue, Mar 24, 2015 at 9:58 AM, Soheila Khodakarim <
>>> lkhodaka...@gmail.com> wrote:
>>>
>>>> Dear Charles,
>>>>
>>>> Thanks for your guide.
>>>> I run this code:
>>>>
>>>> library("neuralnet")
>>>> resp<-c(1,1,1,0,1,0,1,0,1,0,1,0,1,0,1,1,0,1,0,1)
>>>> mydata <- cbind(data, resp)
>>>> out1 <- neuralnet(resp~mydata[,1:3110],data=mydata, hidden = 4,
>>>> lifesign =
>>>> "minimal", linear.output = FALSE, threshold = 0.1)
>>>>
>>>> I saw this error
>>>>
>>>> Error in neurons[[i]] %*% weights[[i]] : non-conformable arguments
>>>>
>>>> :(:(:(
>>>>
>>>> What should I do now??
>>>>
>>>> Regards,
>>>> Soheila
>>>>
>>>>
>>>> On Tue, Mar 24, 2015 at 3:48 PM, Charles Determan Jr 
>>>> wrote:
>>>>
>>>> > Hi Soheila,
>>>> >
>>>> > You are using the formula argument incorrectly.  The neuralnet
>>>> function
>>>> > has a separate argument for data aptly names 'data'.  You can review
>>>> the
>>>> > arguments by looking at the documentation  with ?neuralnet.
>>>> >
>>>> > As I cannot reproduce your data the following is not tested but I
>>>> think
>>>> > should work for you.
>>>> >
>>>> > # Join your response variable to your data set.
>>>> > mydata <- cbind(data, resp)
>>>> >
>>>> > # Run neuralnet
>>>> > out <- neuralnet(resp ~ ., data=mydata, hidden = 4, lifesign =
>>>> "minimal",
>>>> >linear.output = FALSE, threshold = 0.1,na.rm =
>>>> > TRUE)
>>>> >
>>>> >
>>>> > Best,
>>>> > Charles
>>>> >
>>>> > On Tue, Mar 24, 2015 at 4:47 AM, Soheila Khodakarim <
>>>> lkhodaka...@gmail.com
>>>> > > wrote:
>>>> >
>>>> >> Dear All,
>>>> >>
>>>> >> I want to run "neural network" on my dataset.
>>>> >> ##
>>>> >> resp<-c(1,1,1,0,1,0,1,0,1,0,1,0,1,0,1,1,0,1,0,1)
>>>> >&g

Re: [R] simulation dichotomous data

2014-07-31 Thread Charles Determan Jr
Thanoon,

You should still send the question to the R help list even when I helped
you with the code you are currently using.  I will not always know the best
way, or even how to proceed, with some questions.  As to your question
about the code below:

Firstly, there is no 'phi' method for cor in base R; if you are using one,
you must have neglected to mention the package it comes from.  However,
given that the phi coefficient is equal to the pearson coefficient for
dichotomous data, you can use the 'pearson' method.
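To make that equivalence concrete, here is a small sketch (the vectors are made up for illustration, not part of the thread): the Pearson correlation of two 0/1 vectors equals the phi coefficient computed from their 2x2 contingency table.

```r
# Two dichotomous (0/1) vectors
x <- c(1, 1, 0, 0, 1, 0, 1, 1)
y <- c(1, 0, 0, 0, 1, 0, 1, 1)

# Pearson correlation on the 0/1 data
r <- cor(x, y, method = "pearson")

# Phi coefficient from the 2x2 table: (ad - bc) / sqrt of margin products
tab <- table(x, y)
a  <- tab[2, 2]  # both 1
b  <- tab[2, 1]  # x = 1, y = 0
c_ <- tab[1, 2]  # x = 0, y = 1
d  <- tab[1, 1]  # both 0
phi <- (a * d - b * c_) /
  sqrt((a + b) * (c_ + d) * (a + c_) * (b + d))

all.equal(r, phi)  # TRUE
```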

Secondly, with respect to your primary concern: in this case, we have
randomly chosen variables to correlate within two INDEPENDENT DATASETS
(i.e. different groups of samples).  The idea with this code is that R1 and
R2 are datasets of 1000 samples and 10 variables.  It would be miraculous
if they correlated when each had variables randomly assigned as
correlated.  The code works correctly; the question now becomes whether you
want to see correlations across variables for all samples (which this does
for each DATASET) or whether you want the two DATASETS to be correlated.

ords <- seq(0,1)
p <- 10
N <- 1000
percent_change <- 0.9

R1 <- as.data.frame(replicate(p, sample(ords, N, replace = T)))
R2 <- as.data.frame(replicate(p, sample(ords, N, replace = T)))

# pearson is equivalent to phi for dichotomous data, and is what cor() supports
cor(R1, method = "pearson")
cor(R2, method = "pearson")

# subset one variable per dataset to give a stronger correlation
# (note the distinct names -- assigning both to 'v1' would overwrite it)
v1 <- R1[,1, drop = FALSE]
v2 <- R2[,1, drop = FALSE]

# randomly choose which rows to retain
keep <- sample(as.numeric(rownames(v1)), size = percent_change*nrow(v1))
change <- as.numeric(rownames(v1)[-keep])

# randomly choose new values for the rows being changed
new.change <- sample(ords, length(change), replace = T)

# replace values in a copy of the original column
v1.samp <- v1
v1.samp[change,] <- new.change

# closer correlation
cor(v1, v1.samp, method = "pearson")

# set the correlated column as one of your other columns
R1[,2] <- v1.samp
R1
# (repeat the same steps with v2 for the analogous replacement in R2)


On Thu, Jul 31, 2014 at 7:29 AM, thanoon younis 
wrote:

> dear Dr. Charles
> i have a problem with the following R - program in simulation data with 2
> different samples and with high correlation between variables in each
> sample so when i applied the program i got on a results but without
> correlation between each sample.
> i appreciate your help and your time
> i did not send this code to R- help because you helped me before to write
> it .
>
> many thanks to you
>
> Thanoon
>



-- 
Dr. Charles Determan, PhD
Integrated Biosciences

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] simulation dichotomous data

2014-08-01 Thread Charles Determan Jr
Please remember the 'reply all' for the r-help page.

First Question: How can i use Pearson correlation with dichotomous data? i
want to use a correlation between dichotomous variables like spearman
correlation in ordered categorical variables?

cor(variable1, variable2, method = "pearson")

Second Question: Would like two separate populations (1000 samples, 10
var).  Variables *within* datasets highly correlated, minimal correlation
*between* datasets.

As I have stated in a previous response, the code you have is sufficient.
You can go through as many variables as you like *for each dataset* and
induce correlations.  You should do this for as many variables as you
require to be correlated.  As the code induces these correlations randomly,
there should be *minimal* correlation between datasets but still some if
the datasets have the same structure (same variables correlated within).
If different variables are correlated within each, then the correlation
between datasets would likely be lower.  It is extremely unrealistic to
believe that there will be absolutely no correlation between datasets so
you must decide at which point you consider it sufficiently low.

One final point: in the code section "# subset variable to have a stronger
correlation", you can only do one at a time, or you must change the name of
the second object; otherwise you are just overwriting the previous 'v1'.

You have described what you want to me and you have the code to do it.  The
major hurdle here would be an implementation of some 'for loops', which is
not terribly complex if you are working on your programming.  However, they
are not necessary if you just want to write several lines with new object
names for each variable in each dataset.  Give it a try; you know how to
induce correlations now.  Just choose which variables to correlate, do it
for those in each dataset, and compare.
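One hedged sketch of what that loop might look like (the helper name and constants are mine, not from the thread): wrap the earlier column-perturbation idea in a function and apply it to several column pairs within one dataset.

```r
set.seed(1)
ords <- c(0, 1)
p <- 10; N <- 1000; percent_change <- 0.9

R1 <- as.data.frame(replicate(p, sample(ords, N, replace = TRUE)))

# Return a copy of column 'col' with (1 - percent_change) of its values
# resampled, i.e. a column that stays correlated with the original
make_correlated <- function(dat, col, percent_change, ords) {
  v <- dat[[col]]
  change <- sample(seq_along(v), size = round((1 - percent_change) * length(v)))
  v[change] <- sample(ords, length(change), replace = TRUE)
  v
}

# Induce correlation between columns 1&2, 3&4 and 5&6 within R1
for (i in c(1, 3, 5)) {
  R1[[i + 1]] <- make_correlated(R1, i, percent_change, ords)
}

cor(R1[[1]], R1[[2]])  # fairly high, since 90% of values are shared
```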

Regards,
Dr. Charles Determan


On Thu, Jul 31, 2014 at 9:10 AM, thanoon younis 
wrote:

> Many thanks to you
>
> firstly : how can i use Pearson correlation with dichotomous data? i want
> to use a correlation between dichotomous variables like spearman
> correlation in ordered categorical variables.
>
> secondly: i have two different population and each population has 1000
> samples and 10 var. so i want to put a high correlation coefficient between
> variables in the  first population and also put a high correlation
> coefficient between variables in the  second population and no correlation
> between two populations because i want to use multiple group structural
> equation models.
>
>
> many thanks again
>
> Thanoon
>
>
>
>
> On 31 July 2014 16:45, Charles Determan Jr  wrote:
>
>> Thanoon,
>>
>> You should still send the question to the R help list even when I helped
>> you with the code you are currently using.  I will not always know the best
>> way or even how to proceed with some questions.  As for to your question
>> with the code below.
>>
>> Firstly, there is no 'phi' method for cor in base R.  If you are using
>> it, you must have neglected to include a package you are using.  However,
>> given that the phi coefficient is equal to the pearson coefficient for
>> dichotomous data, you can use the 'pearson' method.
>>
>> Secondly, with respect to your primary concern.  In this case, we have
>> randomly chosen variables to correlate between two INDEPENDENT DATASETS
>> (i.e. different groups of samples).  The idea with this code is that R1 and
>> R2 are datasets of 1000 samples and 10 variables.  It would be miraculous
>> if they correlated when each had variables randomly assigned as
>> correlated.  The code work correctly, the question now becomes if you want
>> to see correlations across variables for all samples (which this does for
>> each DATASET) or if you want two DATASETS to be correlated.
>>
>> ords <- seq(0,1)
>> p <- 10
>> N <- 1000
>> percent_change <- 0.9
>>
>> R1 <- as.data.frame(replicate(p, sample(ords, N, replace = T)))
>> R2 <- as.data.frame(replicate(p, sample(ords, N, replace = T)))
>>
>> # phi is more appropriate for dichotomous data
>> cor(R1, method = "phi")
>> cor(R2, method = "phi")
>>
>> # subset variable to have a stronger correlation
>> v1 <- R1[,1, drop = FALSE]
>> v1 <- R2[,1, drop = FALSE]
>>
>> # randomly choose which rows to retain
>> keep <- sample(as.numeric(rownames(v1)), size = percent_change*nrow(v1))
>> change <- as.numeric(rownames(v1)[-keep])
>>
>> # randomly choose new values for changing
>> new.change <- sample(ords, ((1-percent_change)*N)+1, re

[R] shiny datatables column filtering plugin

2014-09-02 Thread Charles Determan Jr
Greetings,

I am currently exploring some capabilities of the 'Shiny' package.  I am
currently working with the most recent version of 'shiny' from the rstudio
github repository (version - 0.10.1.9006) in order to use the most up to
date datatables plugin.  Using the ggplot2 diamonds dataset, I can easily
set columns as unsearchable (commented out below) and I could also subset
out all the 'Ideal' diamonds for example, however I cannot filter out
multiple conditions such as 'Ideal' and 'Fair' diamonds together.  From my
searching, this multiple filtering can be done with checkboxes from the
column using the jquery column filtering plugin (
http://jquery-datatables-column-filter.googlecode.com/svn/trunk/checkbox.html).
Despite this, I cannot get this plugin to work with my shiny app.  Any
insight would be appreciated.

library(shiny)
library(ggplot2)
runApp(
  list(ui = basicPage(
h1('Diamonds DataTable with TableTools'),

# added column filter plugin

singleton(tags$head(tags$script(src='https://code.google.com/p/jquery-datatables-column-filter/source/browse/trunk/media/js/jquery.dataTables.columnFilter.js',
type='text/javascript'))),
dataTableOutput("mytable")
  )
  ,server = function(input, output) {
output$mytable = renderDataTable({
  diamonds[,1:6]
}, options = list(
  pageLength = 10,#   columnDefs = I('[{"targets": [0,1],
"searchable": false}]')
  columnFilter = I('[{
columnDefs: ["targets": [0,1], type: "checkbox"]
}]')

)
)
  }
  ))



Charles

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] shiny datatables column filtering plugin

2014-09-03 Thread Charles Determan Jr
Thank you for checking Yihui, on the off chance are you familiar with any
other methods to filter on multiple conditions?


On Tue, Sep 2, 2014 at 11:07 PM, Yihui Xie  wrote:

> I just tested it and this plugin does not seem to work with the new
> .DataTable() API in DataTables 1.10.x, so I guess it is unlikely to
> make it work in (the current development version of) shiny. It is not
> in the official list of plugins, either:
> http://www.datatables.net/extensions/index
>
> Regards,
> Yihui
> --
> Yihui Xie 
> Web: http://yihui.name
>
>
> On Tue, Sep 2, 2014 at 11:59 AM, Charles Determan Jr 
> wrote:
> > Greetings,
> >
> > I am currently exploring some capabilities of the 'Shiny' package.  I am
> > currently working with the most recent version of 'shiny' from the
> rstudio
> > github repository (version - 0.10.1.9006) in order to use the most up to
> > date datatables plugin.  Using the ggplot2 diamonds dataset, I can easily
> > set columns as unsearchable (commented out below) and I could also subset
> > out all the 'Ideal' diamonds for example, however I cannot filter out
> > multiple conditions such as 'Ideal' and 'Fair' diamonds together.  From
> my
> > searching, this multiple filtering can be done with checkboxes from the
> > column using the jquery column filtering plugin (
> >
> http://jquery-datatables-column-filter.googlecode.com/svn/trunk/checkbox.html
> ).
> > Despite this, I cannot get this plugin to work with my shiny app.  Any
> > insight would be appreciated.
> >
> > library(shiny)
> > library(ggplot2)
> > runApp(
> >   list(ui = basicPage(
> > h1('Diamonds DataTable with TableTools'),
> >
> > # added column filter plugin
> > singleton(tags$head(tags$script(src='
> https://code.google.com/p/jquery-datatables-column-filter/source/browse/trunk/media/js/jquery.dataTables.columnFilter.js
> ',
> > type='text/javascript'))),
> > dataTableOutput("mytable")
> >   )
> >   ,server = function(input, output) {
> > output$mytable = renderDataTable({
> >   diamonds[,1:6]
> > }, options = list(
> >   pageLength = 10,#   columnDefs = I('[{"targets": [0,1],
> > "searchable": false}]')
> >   columnFilter = I('[{
> > columnDefs: ["targets": [0,1], type: "checkbox"]
> > }]')
> >
> > )
> > )
> >   }
> >   ))
> >
> >
> >
> > Charles
>


Charles

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] shiny datatables column filtering plugin

2014-09-03 Thread Charles Determan Jr
Thank you Yihui, this would certainly work for me; however, I am having
trouble getting the regex to work appropriately.  I am using the
development version of shiny and have copied your code.  I launch the app
and the filtering of numbers works fine (e.g. 4,5), but the search for
setosa and versicolor gives me a blank datatable.  Is there some dependency
I am missing that would prevent this regex from working with shiny?


On Wed, Sep 3, 2014 at 11:27 AM, Yihui Xie  wrote:

> The built-in version of DataTables in shiny has already supported
> numeric ranges. For a numeric column x in data, if you type a,b in the
> search box, the data will be filtered using a <= x <= b. The check
> boxes are not supported, but you can use regular expressions (more
> flexible) to achieve the same thing, e.g. (this example requires the
> development version of shiny:
> https://groups.google.com/forum/#!topic/shiny-discuss/-0u-wTnq_lA)
>
> library(shiny)
> runApp(list(
>   ui = fluidPage(
> dataTableOutput("mytable")
>   ),
>   server = function(input, output) {
> output$mytable = renderDataTable(
>   iris[sample(nrow(iris)), ],
>   options = list(search = list(regex = TRUE))
> )
>   }
> ))
>
>
> Then you can search for ^setosa|versicolor$, which means both setosa
> and versicolor in the iris data. Or 4,5 in the search box of
> Sepal.Length to filter this column. Depending on what you want, this
> may or may not be enough.
>
> Regards,
> Yihui
> --
> Yihui Xie 
> Web: http://yihui.name
>
>
> On Wed, Sep 3, 2014 at 7:12 AM, Charles Determan Jr 
> wrote:
> > Thank you for checking Yihui, on the off chance are you familiar with any
> > other methods to filter on multiple conditions?
> >
> >
> > On Tue, Sep 2, 2014 at 11:07 PM, Yihui Xie  wrote:
> >>
> >> I just tested it and this plugin does not seem to work with the new
> >> .DataTable() API in DataTables 1.10.x, so I guess it is unlikely to
> >> make it work in (the current development version of) shiny. It is not
> >> in the official list of plugins, either:
> >> http://www.datatables.net/extensions/index
> >>
> >> Regards,
> >> Yihui
> >> --
> >> Yihui Xie 
> >> Web: http://yihui.name
> >>
> >>
> >> On Tue, Sep 2, 2014 at 11:59 AM, Charles Determan Jr 
> >> wrote:
> >> > Greetings,
> >> >
> >> > I am currently exploring some capabilities of the 'Shiny' package.  I
> am
> >> > currently working with the most recent version of 'shiny' from the
> >> > rstudio
> >> > github repository (version - 0.10.1.9006) in order to use the most up
> to
> >> > date datatables plugin.  Using the ggplot2 diamonds dataset, I can
> >> > easily
> >> > set columns as unsearchable (commented out below) and I could also
> >> > subset
> >> > out all the 'Ideal' diamonds for example, however I cannot filter out
> >> > multiple conditions such as 'Ideal' and 'Fair' diamonds together.
> From
> >> > my
> >> > searching, this multiple filtering can be done with checkboxes from
> the
> >> > column using the jquery column filtering plugin (
> >> >
> >> >
> http://jquery-datatables-column-filter.googlecode.com/svn/trunk/checkbox.html
> ).
> >> > Despite this, I cannot get this plugin to work with my shiny app.  Any
> >> > insight would be appreciated.
> >> >
> >> > library(shiny)
> >> > library(ggplot2)
> >> > runApp(
> >> >   list(ui = basicPage(
> >> > h1('Diamonds DataTable with TableTools'),
> >> >
> >> > # added column filter plugin
> >> >
> >> > singleton(tags$head(tags$script(src='
> https://code.google.com/p/jquery-datatables-column-filter/source/browse/trunk/media/js/jquery.dataTables.columnFilter.js
> ',
> >> > type='text/javascript'))),
> >> > dataTableOutput("mytable")
> >> >   )
> >> >   ,server = function(input, output) {
> >> > output$mytable = renderDataTable({
> >> >   diamonds[,1:6]
> >> > }, options = list(
> >> >   pageLength = 10,#   columnDefs = I('[{"targets": [0,1],
> >> > "searchable": false}]')
> >> >   columnFilter = I('[{
> >> > columnDefs: ["targets": [0,1], type:
> "checkbox"]
> >> > }]')
> >> >
> >> > )
> >> > )
> >> >   }
> >> >   ))
> >> >
> >> >
> >> >
> >> > Charles
> >
> >
> >
> > Charles
>



-- 
Dr. Charles Determan, PhD
Integrated Biosciences

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Margins to fill matrix

2014-09-11 Thread Charles Determan Jr
Do you have an example of what you would like your output to look like?  It
is a little difficult to fully understand what you are looking for.  You
only have 18 margin values but are looking to fill a 10x8 matrix (i.e. 80
values).  If you can clarify, we may be better able to help you.

Charles


On Thu, Sep 11, 2014 at 3:47 AM, Stefan Petersson  wrote:

> Hi,
>
> I have two vectors of margins. Now I want to create a "fill" matrix that
> reflects those margins.
>
>  seats <- c(17,24,28,30,34,36,40,44,46,50)
>  mandates <- c(107,23,24,19,112,19,25,20)
>
> Both vectors adds up to 349. So I want a 10x8 matrix with row sums
> corresponding to "seats" and column sums corresponding to "mandates".
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Dr. Charles Determan, PhD
Integrated Biosciences

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to sum some columns based on their names

2014-10-13 Thread Charles Determan Jr
You can use grep with some basic regex to index your data frame, then apply
colSums:

colSums(df[, grep("6574|7584|85", colnames(df))])
colSums(df[, grep("^f(6574|7584|85)", colnames(df))])
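As a quick sanity check of which columns each pattern selects (the names are taken from the dput in the question below; `value = TRUE` just returns the matched names rather than their positions):

```r
cols <- c("date", "f014card", "f1534card", "f3564card", "f6574card",
          "f7584card", "f85card", "m014card", "m1534card", "m3564card",
          "m6574card", "m7584card", "m85card")

# all 65-74, 75-84 and 85+ columns, both sexes
grep("6574|7584|85", cols, value = TRUE)
# "f6574card" "f7584card" "f85card" "m6574card" "m7584card" "m85card"

# the same age groups restricted to the "f" prefix
grep("^f(6574|7584|85)", cols, value = TRUE)
# "f6574card" "f7584card" "f85card"
```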


Regards,
Dr. Charles Determan

On Mon, Oct 13, 2014 at 7:57 AM, Kuma Raj  wrote:

> I want to sum columns based on their names. As an example, how could I
> sum columns which contain 6574, 7584 and 85 in their names?  In
> addition, how could I sum those which contain 6574, 7584 and 85 in
> their names and have the prefix "f"?  My data contains several variables,
> as in the dput below.
>
> dput(df1)
> structure(list(date = structure(c(1230768000, 1230854400, 1230940800,
> 1231027200, 1231113600, 123120, 1231286400, 1231372800, 1231459200,
> 1231545600, 1231632000), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
> f014card = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), f1534card = c(0,
> 1, 1, 0, 0, 1, 0, 0, 1, 0, 1), f3564card = c(1, 6, 1, 5,
> 5, 4, 4, 7, 6, 4, 6), f6574card = c(3, 6, 4, 5, 5, 2, 10,
> 3, 4, 2, 4), f7584card = c(13, 6, 1, 4, 10, 6, 8, 12, 10,
> 4, 3), f85card = c(5, 3, 1, 0, 2, 10, 7, 9, 1, 7, 3), m014card = c(0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), m1534card = c(0, 0, 1, 0,
> 0, 0, 0, 1, 1, 1, 0), m3564card = c(12, 7, 4, 7, 12, 13,
> 12, 7, 12, 2, 11), m6574card = c(3, 4, 8, 8, 8, 10, 7, 6,
> 7, 7, 5), m7584card = c(8, 10, 5, 4, 12, 7, 14, 11, 9, 1,
> 11), m85card = c(1, 4, 3, 0, 3, 4, 5, 5, 4, 5, 0)), .Names = c("date",
> "f014card", "f1534card", "f3564card", "f6574card", "f7584card",
> "f85card", "m014card", "m1534card", "m3564card", "m6574card",
> "m7584card", "m85card"), class = "data.frame", row.names = c("1",
> "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"))
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Dr. Charles Determan, PhD
Integrated Biosciences

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using sapply instead of for loop

2014-11-19 Thread Charles Determan Jr
Amit,

Your question isn't necessarily complete.  You haven't provided a
reproducible example of your data or an error message.  At first glance you
aren't passing anything to your 'far' function except for 'p' and yet it
uses i,j,k,l,m,n,testsize1, and act1.  You should generally try to avoid
global variables as they can lead to broken code.  You should redefine your
function with all the needed parameters and try again.

Regards,

On Wed, Nov 19, 2014 at 3:47 AM, Amit Thombre 
wrote:

> I am trying to replace a for loop by using sapply, The code is for
> forecasting using arima. The code is as follows:-
> ---
> far<-function(p)
> {
>
> cat("does it come here value of p", p)
> tryCatch({
> air.model <-Arima(tsa,order=c(i-1,j-1,k-1),
> seasonal=list(order=c(l-1,m-1,n-1),period=p-1), lambda=lbda)  # the arima
> model
>
> f<- forecast(air.model,h=testsize1) # for getting the error
>
> ervalue[i,j,k,l,m,n,p]<-errf(act1,f$mean,testsize1,flagarima)
>
> }, error=function(e)
> {
>
> return(NA)
> }
> )
> cat("Value of error", ervalue[i,j,k,l,m,n,p])
> cat("Value of i,j,k,l,m,n,p", i, j, k, l, m, n,p)
> print(ervalue)
> return(ervalue)
> }
> ---
> maxval=2  # set the array size as well as the maximum parameter value here.
> pmax=maxval  # set max p value of the ARIMA model
> dmax=maxval  # set max d value of the ARIMA model
> qmax=maxval  # set max q value of the ARIMA model
> Pmax=maxval  # set max P value of the ARIMA model
> Dmax=maxval  # set max D value of the ARIMA model
> Qmax=maxval  # set max Q value of the ARIMA model
> Permax=2 # maximum value of period.
>
> st=2013   # start year value for getting the time series
> month=4
> d<-c(10, 13, 14, 4, 5, 6, 7, 10, 12, 13, 14, 20, 3, 4, 5, 19, 23,
> 21, 18, 19, 21, 14, 15, 16, 17, 12, 20, 19, 17)
> tsa<-ts(d, frequency=freq, start=c(st,month))  # store the data in tsa as
> the time
>
> A<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # depdending
> on the max value set the , also it stores the AIC valuearray size
> ervalue<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) #
> depdending on the max value set the , stores the error value.array size
>
> for (i in 1:pmax)
> {
> for (j in 1:dmax)
> {
> for (k in 1:qmax)
> {
> for (l in 1:Pmax)
> {
> for (m in 1:Dmax)
> {
> for (n in 1:Qmax)
> {
> A<-sapply((1:Permax),function(p) far(p),simplify=FALSE)
>
> }
> }
> }
> }
> }  #for looping through period value
> }
> --
> The sapply replaces the for loop
> for (p in 1:Permax)
> {
> cat("does it come here value of p", p)
> tryCatch({
> air.model <-Arima(tsa,order=c(i-1,j-1,k-1),
> seasonal=list(order=c(l-1,m-1,n-1),period=p), lambda=lbda)  # the arima
> model
> A[i,j,k,l,m,n,p]<-AIC(air.model)
> f<- forecast(air.model,h=testsize1) # for getting the error
> er[i,j,k,l,m,n,p]<-errf(act1,f$mean,testsize1,flagarima)
> }, error=function(e)
> {
>
> return(NA)
> }
> )
>  cat("Value of error", er[i,j,k,l,m,n,p])
>  cat("Value of i,j,k,l,m,n,p", i, j, k, l, m, n,p)
> }
> --
> Now the er[I,j,k,l,m,n,p] I.e the error get populated but on every call to
> the function far() the array loses the previous value and gets replaced
> with NA and gets the newly calculated error value. Finally the array A gets
> populated with only the latest value and does not hold the old values.
> Please help
>
>
>
> 
> Disclaimer:  This message and the information contained herein is
> proprietary and confidential and subject to the Tech Mahindra policy
> statement, you may review the policy at
> http://www.techmahindra.com/Disclaimer.html externally
> http://tim.techmahindra.com/tim/disclaimer.html internally within
> TechMahindra.
>
> ========
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Dr. Charles Determan, PhD
Integrated Biosciences

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using sapply instead of for loop

2014-11-19 Thread Charles Determan Jr
Amit,

Even if you aren't getting an error with your original global variables, it
is far better practice to avoid global variables, which will make your code
much more stable.  Of course, you ultimately get to decide how your code is
written.

That said, your error from the modified far function to include the
variables is because you added too much to the sapply statement.  Here is
what it should look like:

A<-sapply((1:Permax),function(p) far(p, i, j, k, l, m,n,
ervalue),simplify=FALSE)

You can think of apply statements as nothing more than a for loop that has
been made 'pretty'.  You wanted to iterate from 1:Permax and use the other
variables, so the anonymous function (i.e. function(p)) should take only
the iterator, with the other values from your nested for loops supplied as
extra arguments to the function.  When I run this with your code, making
sure the function accepts the extra parameters, the A array appears to
fill appropriately, whereby most entries are 'NA' as specified by your
'far' function.  Is this what you expect?
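A minimal toy illustration of the same pattern (generic values, not the ARIMA code): extra arguments after the anonymous function's iterator can either be declared as named parameters or forwarded inside the body.

```r
i <- 10

# Extra variable captured as a named argument of the anonymous function
res1 <- sapply(1:3, function(p, offset) p + offset, offset = i)

# Extra variable forwarded inside the body, as in the thread's far() call
res2 <- sapply(1:3, function(p) {
  far_toy <- function(p, i) p * i   # stand-in for far()
  far_toy(p, i)
})

res1  # 11 12 13
res2  # 10 20 30
```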


On Wed, Nov 19, 2014 at 8:16 AM, Amit Thombre 
wrote:

>  Charles ,
>
> I am not getting an error . The final array A does not have the values in
> it. Here is the reproducible code.  I have even tried using paasing ervalue
> as a parameter to the function far.
>
>
> ---
>
> errf<-function(act, res, testsize, flag)
> {
> j=1
> if(flag==1)
> {
> j<-nrow(d)-testsize
> }
>
> print(act)
> print(res)
> print(flag)
> diff<-0
> s<-0
> # loop for iterating to each value of the actual value and finding the
> difference with thepredicted value
> for (mn in 1:length(act))
> {
> cat("Value of mn in err", mn)
> cat("Value of j in err", j)
> cat("Value of res[j] in err", res[j])
> diff<-(act[mn]-res[j])
> print(act[mn])
> print(res[j])
> print(diff)
> s<-s+(diff*diff)
>
> j<-j+1
> }
>
> er1<-sqrt(s/length(act)) #forecasting error
> print(er1)
> return(er1)
> }
>
>
>
> far<-function(p)
> {
>
> cat("does it come here value of p", p)
> tryCatch({
> air.model <-Arima(tsa,order=c(i-1,j-1,k-1),
> seasonal=list(order=c(l-1,m-1,n-1),period=p-1), lambda=lbda)  # the arima
> model
>
> f<- forecast(air.model,h=testsize1) # for getting the error
>
> ervalue[i,j,k,l,m,n,p]<-errf(act1,f$mean,testsize1,flagarima)
>
> }, error=function(e)
> {
>
> return(NA)
> }
> )
> cat("Value of error", ervalue[i,j,k,l,m,n,p])
> cat("Value of i,j,k,l,m,n,p", i, j, k, l, m, n,p)
> print(ervalue)
> return(ervalue)
> }
> ---
> library('TTR')
> library('forecast')
> library('timeSeries')
> library('xts')
> library('RODBC')
>
>
> maxval=2  # set the array size as well as the maximum parameter value here.
> pmax=maxval  # set max p value of the ARIMA model
> dmax=maxval  # set max d value of the ARIMA model
> qmax=maxval  # set max q value of the ARIMA model
> Pmax=maxval  # set max P value of the ARIMA model
> Dmax=maxval  # set max D value of the ARIMA model
> Qmax=maxval  # set max Q value of the ARIMA model
> Permax=2 # maximum value of period.
> freq=12
> d<-c(10, 13, 14, 4, 5, 6, 7, 10, 12, 13, 14, 20, 3, 4, 5, 19, 23, 21, 18,
> 19, 21, 14, 15, 16, 17, 12, 20, 19, 17)
> st=2013   # start year value for getting the time series
> month=4
> tsa<-ts(d, frequency=freq, start=c(st,month))  # store the data in tsa as
> the time
>
> A<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # depdending
> on the max value set the , also it stores the AIC valuearray size
> er<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval,2)) # depdending
> on the max value set the , stores the error value.array size
> ervalue<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) #
> depdending on the max value set the , stores the error value.array size
> erval1<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) #
> depdending on the max value set the , stores the error value.array size
> for (i in 1:pmax)
> {
> for (j in 1:dmax)
> {
> for (k in 1:qmax)
> {
> for (l in 1:Pmax)
> {
> for (m in 1:Dmax)
> {
> for (n in 1:Qmax)
> {
> A<-sapply((1:Permax),function(p) far(p),simplify=FALSE)
>
> }
> }
> }
> }
> }  #for looping through period value
> }
>
>
>
>
>
>  --
> *From:* Charles Determan Jr [deter...@umn.e

Re: [R] Using sapply instead of for loop

2014-11-19 Thread Charles Determan Jr
"does it come here value of p", p)
> tryCatch({
> air.model <-Arima(tsa,order=c(i-1,j-1,k-1),
> seasonal=list(order=c(l-1,m-1,n-1),period=p-1), lambda=-0.254)  # the arima
> model
>
> f<- forecast(air.model,h=5) # for getting the error
>
> ervalue[i,j,k,l,m,n,p]<-errf(act1,f$mean,testsize1,flagarima)
>
> }, error=function(e)
> {
>
> return(NA)
> }
> )
> cat("Value of error", ervalue[i,j,k,l,m,n,p])
> cat("Value of i,j,k,l,m,n,p", i, j, k, l, m, n,p)
> print(ervalue)
> return(ervalue)
> }
> ---
> library('TTR')
> library('forecast')
> library('timeSeries')
> library('xts')
> library('RODBC')
>
>
> maxval=2  # set the array size as well as the maximum parameter value here.
> pmax=maxval  # set max p value of the ARIMA model
> dmax=maxval  # set max d value of the ARIMA model
> qmax=maxval  # set max q value of the ARIMA model
> Pmax=maxval  # set max P value of the ARIMA model
> Dmax=maxval  # set max D value of the ARIMA model
> Qmax=maxval  # set max Q value of the ARIMA model
> Permax=2 # maximum value of period.
> freq=12
> d<-c(3, 2, 5,29, 6, 10, 8, 4, 4, 5, 4, 6, 6, 1, 2, 3,5, 6, 9, 10)
> st=2013   # start year value for getting the time series
> month=4
> tsa<-ts(d, frequency=freq, start=c(st,month))  # store the data in tsa as
> the time
> testsize1=5
> act1<-d[16:20] # the array of actual values, the forecasted values will be
> compared against these values
>
>
>
> A<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # depdending
> on the max value set the , also it stores the AIC valuearray size
> er<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval,2)) # depdending
> on the max value set the , stores the error value.array size
> ervalue<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) #
> depdending on the max value set the , stores the error value.array size
> erval1<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) #
> depdending on the max value set the , stores the error value.array size
> for (i in 1:pmax)
> {
> for (j in 1:dmax)
> {
> for (k in 1:qmax)
> {
> for (l in 1:Pmax)
> {
> for (m in 1:Dmax)
> {
> for (n in 1:Qmax)
> {
> A<-sapply((1:Permax),function(p) far(p),simplify=FALSE)
>
> }
> }
> }
> }
> }  #for looping through period value
> }
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>  --
> *From:* Charles Determan Jr [deter...@umn.edu]
> *Sent:* Wednesday, November 19, 2014 8:40 PM
>
> *To:* Amit Thombre
> *Cc:* r-help@r-project.org
> *Subject:* Re: [R] Using sapply instead of for loop
>
>   Amit,
>
>  Even if you aren't getting an error with your original global variables
> it is far better practice to avoid global variables to make you code much
> more stable.  Of course you ultimately get to decide how your code is
> written.
>
>  That said, your error from the modified far function to include the
> variables is because you added too much to the sapply statement.  Here is
> what it should look like:
>
>  A<-sapply((1:Permax),function(p) far(p, i, j, k, l, m,n,
> ervalue),simplify=FALSE)
>
>  You can think of apply statements as nothing more than a for loop that
> has been made 'pretty'.  You wanted to iterate from 1:Permax and use the
> other variables, so the anonymous function (i.e. function(p)) includes
> only the iterator, and you supply the other values from your nested for
> loops to the function.  When I run this with your code, making sure the
> function accepts the extra parameters, the A array appears to fill
> appropriately, whereby most entries are 'NA' as specified by your 'far'
> function.  Is this what you expect?
>
>
> On Wed, Nov 19, 2014 at 8:16 AM, Amit Thombre 
> wrote:
>
>>  Charles ,
>>
>> I am not getting an error.  The final array A does not have the values in
>> it.  Here is the reproducible code.  I have even tried passing ervalue
>> as a parameter to the function far.
>>
>>
>> ---
>>
>> errf<-function(act, res, testsize, flag)
>> {
>> j=1
>> if(flag==1)
>> {
>> j<-nrow(d)-testsize
>> }
>>
>> print(act)
>> print(res)
>> print(flag)
>> diff<-0
>> s<-0
>> # loo

Re: [R] Using sapply instead of for loop

2014-11-19 Thread Charles Determan Jr
Ah, this is because you are overwriting your 'A' with each loop.  As a
simple way to demonstrate this I changed:

A<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2))

to

A <- list()

and then I changed
A <- sapply((1:Permax),function(p) far(p, i, j, k, l, m,n,
ervalue),simplify=FALSE)

to

A<-append(A, sapply((1:Permax),function(p) far(p, i, j, k, l, m,n,
ervalue),simplify=FALSE))

Once the run is complete you can find the 6.281757 in A[[126]].  You can
create another index so you can locate values in the list, but the
ervalue is indeed , , 2,2,2,1,2 as you show above.
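
A minimal base-R illustration of why an array can appear to "lose" its values (an editor's sketch, not part of the original reply): R passes arguments by value, so assigning into an array inside a function changes a local copy only, and the change is lost unless the function's return value is captured.

```r
f <- function(a) {
  a[1] <- 99  # modifies the function's local copy of 'a'
  a
}

x <- c(1, 2, 3)
f(x)        # returns the modified copy
x[1]        # still 1: the caller's x is untouched
x <- f(x)   # keep the change by capturing the return value
x[1]        # now 99
```

This is why far() must return ervalue and the caller must store that return value, as in the list-append approach above.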


On Wed, Nov 19, 2014 at 11:46 AM, Amit Thombre 
wrote:

>  The following is printed for  i,j,k,l,m,n,p 2 2 2 2 2 1 2
>
> "Value of error 6.281757Value of i,j,k,l,m,n,p 2 2 2 2 2 1 2, , 1, 1, 1,
> 1, 1"
>  Thus ervalue[2,2,2,2,2,1,2] should be 6.281757, but after all the runs,
> if you try to get this array value it is NA.  Also, I think A is a list,
> so I am not sure how to extract the value, but the following is displayed
> for the same index of A (as for ervalue) when I type A after all the runs:
> , , 2, 2, 2, 1, 2
>  [,1] [,2]
> [1,]   NA   NA
> [2,]   NA   NA
>  The ervalue itself loses the values, I think, and hence A does not have
> them.
>
>  --
> *From:* Charles Determan Jr [deter...@umn.edu]
> *Sent:* Wednesday, November 19, 2014 10:04 PM
>
> *To:* Amit Thombre
> *Cc:* r-help@r-project.org
> *Subject:* Re: [R] Using sapply instead of for loop
>
>   The following provides array A with 3.212016 as the last value.  The
> error values are indeed in the array here.  There is also another with
> 6.281757 that I noticed at first glance.
>
>  errf<-function(act, res, testsize, flag)
> {
>   j=1
>   if(flag==1)
>   {
> j<-nrow(d)-testsize
>   }
>
>   print(act)
>   print(res)
>   print(flag)
>   diff<-0
>   s<-0
>   # loop for iterating over each actual value and finding the difference
> with the predicted value
>   for (mn in 1:length(act))
>   {
> cat("Value of mn in err", mn)
> cat("Value of j in err", j)
> cat("Value of res[j] in err", res[j])
> diff<-(act[mn]-res[j])
> print(act[mn])
> print(res[j])
> print(diff)
> s<-s+(diff*diff)
>
> j<-j+1
>   }
>
>   er1<-sqrt(s/length(act)) #forecasting error
>   print(er1)
>   return(er1)
> }
>
>
>
>  far<-function(p, i, j, k, l, m, n, ervalue)
> {
>   flagarima=0
>   testsize1 = 5
>   cat("does it come here value of p", p)
>   tryCatch({
> air.model <-Arima(tsa,order=c(i-1,j-1,k-1),
> seasonal=list(order=c(l-1,m-1,n-1),period=p-1), lambda=-0.254)  # the arima
> model  # the arima model
>
> f<- forecast(air.model,h=testsize1) # for getting the error
>
> ervalue[i,j,k,l,m,n,p]<-errf(act1,f$mean,testsize1,flagarima)
>
>   }, error=function(e)
>   {
>
> return(NA)
>   }
>   )
>   cat("Value of error", ervalue[i,j,k,l,m,n,p])
>   cat("Value of i,j,k,l,m,n,p", i, j, k, l, m, n,p)
>   print(ervalue)
>   return(ervalue)
> }
> ---
>
>library('TTR')
> library('forecast')
> library('timeSeries')
> library('xts')
> library('RODBC')
>
>
>  maxval=2  # set the array size as well as the maximum parameter value
> here.
> pmax=maxval  # set max p value of the ARIMA model
> dmax=maxval  # set max d value of the ARIMA model
> qmax=maxval  # set max q value of the ARIMA model
> Pmax=maxval  # set max P value of the ARIMA model
> Dmax=maxval  # set max D value of the ARIMA model
> Qmax=maxval  # set max Q value of the ARIMA model
> Permax=2 # maximum value of period.
> freq=12
> d<-c(3, 2, 5,29, 6, 10, 8, 4, 4, 5, 4, 6, 6, 1, 2, 3,5, 6, 9, 10)
> st=2013   # start year value for getting the time series
> month=4
> tsa<-ts(d, frequency=freq, start=c(st,month))  # store the data in tsa as
>  a time series
> testsize1=5
> act1<-d[16:20] # the array of actual values, the forecasted values will be
> compared against these values
>
>  A<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) # depending
> on the max value set the array size; also it stores the AIC value
> er<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval,2)) # depending
> on the max value set the array size; stores the error value.
> ervalue<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) #
> depending on the max value set the array size; stores the error value.
> erval1<-array(, c(maxval,maxval,maxval,maxval,maxval,maxval, 2)) #
> depending on the 

[R] Power of Kruskal-Wallis Test?

2013-07-09 Thread Charles Determan Jr
Greetings,

To calculate power for an ANOVA test I know I can use the pwr.anova.test()
from the pwr package.  Is there a similar function for the nonparametric
equivalent, Kruskal-Wallis?  I have been searching but haven't come up with
anything.

Thanks,

-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Power of Kruskal-Wallis Test?

2013-07-12 Thread Charles Determan Jr
Thank you Greg,
However, would you be able to direct me to either an example or further
information regarding simulations to measure power?

Charles


On Thu, Jul 11, 2013 at 4:56 PM, Greg Snow <538...@gmail.com> wrote:

> If there were a canned function for power for a non-parametric test, I
> would not trust it.  This is because there are many assumptions that would
> need to be made and I would not know if those in a canned function were
> reasonable for my study.
>
> I would compute power by simulation.  Simulate data sets that match what
> you think the real data will/may look like, analyze the simulated datasets
> and see what proportion give significant results (that will be your power).
>  You can do this for different sets of assumptions to get a  feel for how
> the different assumptions affect your results.  This way you know exactly
> what assumptions you are making to get your power.
>
>
> On Tue, Jul 9, 2013 at 2:18 PM, Charles Determan Jr wrote:
>
>> Greetings,
>>
>> To calculate power for an ANOVA test I know I can use the pwr.anova.test()
>> from the pwr package.  Is there a similar function for the nonparametric
>> equivalent, Kruskal-Wallis?  I have been searching but haven't come up
>> with
>> anything.
>>
>> Thanks,
>>
>> --
>> Charles Determan
>> Integrated Biosciences PhD Candidate
>> University of Minnesota
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Gregory (Greg) L. Snow Ph.D.
> 538...@gmail.com
>



-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
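
A minimal sketch of the simulation approach Greg describes, for Kruskal-Wallis (editor's note: the group shifts, sample size, and alpha below are illustrative assumptions, not values from the thread):

```r
set.seed(42)
n.per.group <- 20
nsim <- 2000

power <- mean(replicate(nsim, {
  # simulate three groups with assumed mean shifts of 0, 0.5, and 1
  g <- gl(3, n.per.group)
  y <- rnorm(3 * n.per.group, mean = rep(c(0, 0.5, 1), each = n.per.group))
  # record whether Kruskal-Wallis rejects at alpha = 0.05
  kruskal.test(y, g)$p.value < 0.05
}))

power  # proportion of significant simulations = estimated power
```

Changing the assumed distributions, shifts, or n and re-running shows how each assumption affects the power estimate, exactly as described above.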


Re: [R] For loop output

2013-08-08 Thread Charles Determan Jr
Hi Jenny,

Firstly, to my knowledge you cannot assign the output of cat to an object
(i.e. it only prints it).
Second, you can just add the 'collapse' option of the paste function.

individual.proj.quote <- paste(individual.proj, collapse = ",")

if you really want the quotes
individual.proj.quote <- paste(individual.proj, collapse='","')

but note that the backslashes you see in the result are only print()
escaping of the embedded quotes; cat() will display the string without
them.

Hope this serves your purposes
Cheers,

Charles
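
To make the backslash point concrete (an editor's sketch with made-up file names):

```r
files <- c("a.img", "b.img")

paste(files, collapse = ",")
# "a.img,b.img"

quoted <- paste0('"', files, '"', collapse = ",")
quoted             # print() shows the quotes escaped: "\"a.img\",\"b.img\""
cat(quoted, "\n")  # cat() prints them plainly: "a.img","b.img"
```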


On Thu, Aug 8, 2013 at 10:05 AM, Jenny Williams wrote:

> I am having difficulty storing the output of a for loop I have generated.
> All I want to do is find all the files that I have, create a string with
> all of the names in quotes and separated by commas. This is proving more
> difficult than I initially anticipated.
> I am sure it is either very simple or the construction of the for loop is
> not quite right.
> The result gets automatically printed after the loop, but I can't seem to
> save it.
> I have tried to create the element in advance but the result is the same:
> NULL
>
> individual.proj =
> Sys.glob("Arabica/proj_current/individual_projections/*.img", dirmark =
> FALSE)
> individual.proj
> [1]
> "Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_GBM.img"
>  [2]
> "Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_GLM.img"
>  [3]
> "Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_MARS.img"
>  [4]
> "Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_RF.img"
>  [5]
> "Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_RUN10_GBM.img"
>
>
> ##generate loop to create string out of the table of projected files.
> L.ip = length(individual.proj)
>   for (i in 1:L.ip){
>individual.proj.i <- individual.proj[i]
>individual.proj.quote = cat(paste('"', individual.proj.i, '"',
> ',',sep=""))
>}
>
> "Arabica/proj_current/individual_projections/proj_current_arabica_pa.data.tmp$pa.tab_Full_GBM.img","Arabica/proj_current/individual_projections/proj_current
>
> ##print output string
> individual.proj.quote
> NULL
>
> #command to be applied to individual.proj.quote to removed the final comma
> from the string
> substr(individual.proj.quote, 1, nchar(individual.proj.quote)-1)
>
> Any help or pointers would be greatly appreciated, no amount of extensive
> google searches have been fruitful so far.
>
>
> **
> Jenny Williams
> Spatial Information Scientist, GIS Unit
> Herbarium, Library, Art & Archives Directorate
> Royal Botanic Gardens, Kew
> Richmond, TW9 3AB, UK
>
> Tel: +44 (0)208 332 5277
> email: jenny.willi...@kew.org<mailto:jenny.willi...@kew.org>
> **
>
> Film: The Forgotten Home of Coffee - Beyond the Gardens<
> http://www.youtube.com/watch?v=-uDtytKMKpA&sns=tw>
> Stories: Coffee Expedition - Ethiopia<
> http://storify.com/KewGIS/coffee-expedition-ethiopia>
>  Kew in Harapan Rainforest Sumatra<
> http://storify.com/KewGIS/kew-in-harapan-rainforest>
> Articles: Seeing the wood for the trees<
> http://www.kew.org/ucm/groups/public/documents/document/kppcont_060602.pdf
> >
> How Kew's GIS team and South East Asia botanists are working to help
> conserve and restore a rainforest in Sumatra. Download a pdf of this
> article here.<
> http://www.kew.org/ucm/groups/public/documents/document/kppcont_060602.pdf
> >
>
>
> 
> The Royal Botanic Gardens, Kew is a non-departmental public body with
> exempt charitable status, whose principal place of business is at Royal
> Botanic Gardens, Kew, Richmond, Surrey TW9 3AB, United Kingdom.
>
> The information contained in this email and any attachments is intended
> solely for the addressee(s) and may contain confidential or legally
> privileged information. If you have received this message in error, please
> return it immediately and permanently delete it. Do not use, copy or
> disclose the information contained in this email or in any attachment.
>
> Any views expressed in this email do not necessarily reflect the opinions
> of RBG Kew.
>
> Any files attached to this email have been inspected with virus detection
> software by RBG Kew before transmission, however you should carry out your
> own virus checks before opening any attachments. RBG Kew accepts no
> liability for any loss

Re: [R] Randomization

2013-08-23 Thread Charles Determan Jr
Hi Silvano,

How about this?

id <- seq(80)
weight <- runif(80)

# randomize 4 groups with 'sample' function
group <- sample(rep(seq(4),20))
dat <- cbind(id, weight, group)

# ordered dataset by group
res <- data.frame(dat[order(group),])

# get mean and variance for each group
aggregate(res$weight, list(group=res$group), mean)
aggregate(res$weight, list(group=res$group), var)

Cheers,
Charles



On Fri, Aug 23, 2013 at 9:03 AM, Silvano  wrote:

> Hi,
>
> I have a set of 80 animals and their respective weights. I would like to
> create 4 groups of 20 animals so that the groups have means and variances
> with values very close.
> How can I make this randomization in R?
>
> Thanks,
>
> --
> Silvano Cesar da Costa
> Departamento de Estatística
> Universidade Estadual de Londrina
> Fone: 3371-4346
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] caTools AUC different between mutli-class and subset binary

2013-08-25 Thread Charles Determan Jr
Greetings,

This is more of an explanation question but I was using the colAUC function
on the iris dataset and everything works smoothly.  This provides the AUC
for each pairwise comparison.  I decided to do the actual subset for one of
the comparisons and the numbers are different (.9326 v. .9152).  How/Why
could this be?

require(caTools)
data(iris)

colAUC(iris[,1], iris[,5], plotROC=FALSE, alg = "ROC")

> colAUC(iris[,1], iris[,5], plotROC=FALSE, alg = "ROC")
>[,1]
setosa vs. versicolor0.9326
setosa vs. virginica 0.9846
versicolor vs. virginica 0.7896

set_vers <- subset(iris, Species==c("setosa","versicolor"))
set_vers$Species <- factor(set_vers$Species)

colAUC(set_vers[,1], set_vers[,5], plotROC=FALSE, alg = "ROC")

> colAUC(set_vers[,1], set_vers[,5], plotROC=FALSE, alg = "ROC")        
> [,1]
setosa vs. versicolor 0.9152


Regards,

-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
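
A likely explanation (editor's note, not an answer from the original thread): `==` against a length-2 vector recycles element-wise down the 150 rows, silently keeping only every other matching row, so the two-class AUC above was computed on half the data. `%in%` tests membership row by row and keeps all rows of both species:

```r
data(iris)

# '==' recycles c("setosa","versicolor") along the rows,
# so only 50 of the 100 setosa/versicolor rows are kept
nrow(subset(iris, Species == c("setosa", "versicolor")))  # 50

# '%in%' keeps all 100 setosa/versicolor rows
set_vers <- subset(iris, Species %in% c("setosa", "versicolor"))
nrow(set_vers)  # 100
```

With the `%in%` subset, the binary AUC should agree with the pairwise value reported by the multi-class call.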


Re: [R] Randomization

2013-08-26 Thread Charles Determan Jr
Silvano,

I am a little confused as to what you are looking for.  Do you want each
group to have approximately the same mean and variance?  Randomly assigning
groups should be sufficient for the means and variances to be somewhat
similar.  I'm not sure what your goal would be to randomly split your data
into four groups with the same mean and variance though.

Regards,

Charles


On Mon, Aug 26, 2013 at 8:04 AM, Silvano  wrote:

> **
> Charles,
>
> I think if the data present high variability, the groups are likely to be
> heterogeneous.
>
> Would there be a possibility to select the groups while fixing the mean
> and variance?
>
> Thanks a lot,
>
> --
> Silvano Cesar da Costa
> Departamento de Estatística
> Universidade Estadual de Londrina
> Fone: 3371-4346
> ------
>
> - Original Message -
> *From:* Charles Determan Jr 
> *To:* Silvano 
> *Cc:* r-help@r-project.org
> *Sent:* Friday, August 23, 2013 11:25 AM
> *Subject:* Re: [R] Randomization
>
>  Hi Silvano,
>
> How about this?
>
> id <- seq(80)
> weight <- runif(80)
>
> # randomize 4 groups with 'sample' function
> group <- sample(rep(seq(4),20))
> dat <- cbind(id, weight, group)
>
> # ordered dataset by group
> res <- data.frame(dat[order(group),])
>
> # get mean and variance for each group
> aggregate(res$weight, list(group=res$group), mean)
> aggregate(res$weight, list(group=res$group), var)
>
> Cheers,
> Charles
>
>
>
> On Fri, Aug 23, 2013 at 9:03 AM, Silvano  wrote:
>
>> Hi,
>>
>> I have a set of 80 animals and their respective weights. I would like to
>> create 4 groups of 20 animals so that the groups have means and variances
>> with values very close.
>> How can I make this randomization in R?
>>
>> Thanks,
>>
>> --
>> Silvano Cesar da Costa
>> Departamento de Estatística
>> Universidade Estadual de Londrina
>> Fone: 3371-4346
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Charles Determan
> Integrated Biosciences PhD Candidate
> University of Minnesota
>
>


-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
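
One common heuristic for Silvano's goal, sketched by the editor (not part of the original replies): rank the animals by weight and deal them into the groups in serpentine order, so each group receives a similar mix of light and heavy animals and the group means and variances come out close by construction.

```r
set.seed(1)
weight <- runif(80)

# serpentine pattern 1,2,3,4,4,3,2,1 applied over the weight ranks
ord <- order(weight)
serp <- rep(c(1:4, 4:1), length.out = 80)
group <- integer(80)
group[ord] <- serp

table(group)                  # 20 animals per group
tapply(weight, group, mean)   # group means are close
tapply(weight, group, var)    # and so are the variances
```

Unlike pure random assignment, this is deterministic given the ranks; a compromise is to randomize only within blocks of four adjacent ranks.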


[R] Citing Package Contributing Authors

2013-08-26 Thread Charles Determan Jr
Greetings,

I am familiar with the function cite('packageName') which provides the
output generated from the DESCRIPTION file.  In most cases this is
sufficient but I was wondering if there are contributing authors (in
addition to the primary) also listed on the CRAN page.  Is there a proper
way to account for them or are they generally not listed?

Regards,

-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Citing Package Contributing Authors

2013-08-26 Thread Charles Determan Jr
Thank you for your reply Stephan,

I like to be very thorough and make sure all names are attributed. So, in
the case that I check the URL of a package and it lists contributing
authors that aren't provided by citation(), would it be appropriate to
cite it like this:

Smith, J. [pr] and Johnson, J. [cr] (2013). Awesome package name. R package
version 2.1. http://CRAN.R-project.org/package=awesome_package

I am just unsure whether there is a standard approach for situations such
as this, whether it is omitting the contributing author, using a different
acronym designation, etc.

Regards



On Mon, Aug 26, 2013 at 3:02 PM, Stephan Kolassa wrote:

> Hi,
>
> it usually is a good idea to look at the output of citation() (which,
> however, also often is auto-generated) or at the authors listed in package
> vignettes.
>
> And thanks for citing R package authors. When I review papers, I often
> have to remind authors of this...
>
> Best
> Stephan
>
>
> On 26.08.2013 21:56, Charles Determan Jr wrote:
>
>> Greetings,
>>
>> I am familiar with the function cite('packageName') which provides the
>> output generated from the DESCRIPTION file.  In most cases this is
>> sufficient but I was wondering if there are contributing authors (in
>> addition to the primary) also listed on the CRAN page.  Is there a proper
>> way to account for them or are they generally not listed?
>>
>> Regards,
>>
>>
>


-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] glmnet lambda and number of variables

2013-09-04 Thread Charles Determan Jr
Greetings,

I have recently been exploring the 'glmnet' package and subsequently
cv.glmnet.  The basic code as follows:

model <- cv.glmnet(variables, group, family="multinomial", alpha=.5,
standardize=F)

I understand that cv.glmnet does k-fold cross-validation to return a value
of lambda.  However, sometimes when I follow up cv.glmnet by extracting
the coefficients, either very few or all of them are zero.  If I understand this
correctly, it means that there aren't very many (if any) variables to
separate the groups.  Despite this, I would like to provide a list of
variables and rank them in terms of importance (even if not discriminatory
as this is for some simulation purposes and not working on a particular
question/experiment).  Is there a way for me to set up the analysis to
provide a user-determined number of variables?  Or, put another way, is
it possible to determine the order in which variables are dropped from
the model?

Best regards,

-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
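
One way to get at the drop-order the question asks about (an editor's sketch on simulated data, not from the original thread): glmnet returns the whole coefficient path, so the largest lambda at which each coefficient first becomes nonzero gives an entry ranking, with variables entering at larger lambda being "more important".

```r
library(glmnet)

set.seed(1)
x <- matrix(rnorm(100 * 10), 100, 10)
y <- factor(sample(c("a", "b", "c"), 100, replace = TRUE))

fit <- glmnet(x, y, family = "multinomial", alpha = 0.5)

# for each variable, the largest lambda at which any of its
# class-specific coefficients is nonzero (NA = never enters)
entry <- apply(sapply(fit$beta, function(b) {
  apply(as.matrix(b) != 0, 1, function(nz)
    if (any(nz)) max(fit$lambda[nz]) else NA)
}), 1, max, na.rm = TRUE)

# rank variables by the lambda at which they enter the path
sort(entry, decreasing = TRUE)
```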


Re: [R] how to read data from MSExcel into R

2013-09-11 Thread Charles Determan Jr
If there aren't multiple sheets you can use the 'gdata' package and
read.xls().

Otherwise you could re-save the file as a csv file and load it with
read.csv(), again assuming there are not multiple sheets, which a csv file
cannot contain.

Regards,
Charles


On Wed, Sep 11, 2013 at 8:01 AM, Charles Thuo  wrote:

> how can one read data from MS Excel into R, especially in a case where one
> does not have administrator rights to install additional packages? In
> short, how does one read data from MS Excel into R with base packages only?
> how to read data from MSExcel into R with base packages only.
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] renumber a list of numbers

2013-01-07 Thread Charles Determan Jr
Greetings R users,

I am trying to renumber my groups within the file shown below.  The groups
are currently set as 8, 9, 10, etc.  I would like to renumber them as
1, 2, 3, etc.  I have searched the help files and only come across using
the rownames to renumber the values, but I need to match values.  Any
assistance is always appreciated.

Regards,
Charles

structure(list(Group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("NO", "YES"
), class = "factor"), Event_name = c(8, 9, 10, 11, 12, 13, 14,
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
31, 33, 34, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30), Glucose.n = c(26, 3,
0, 26, 1, 25, 26, 25, 25, 26, 26, 25, 23, 23, 24, 24, 26, 26,
25, 25, 26, 26, 24, 21, 7, 12, 4, 0, 0, 4, 0, 4, 4, 4, 4, 4,
4, 4, 3, 4, 4, 4, 3, 3, 2, 2, 2, 1, 1), Glucose.m = c(92.5,
90.3,
NaN, 97.2307692307692, 116, 97.84, 107.653846153846, 105.32,
102.6, 94.6538461538462, 96.076923076923, 92.24, 87.5652173913043,
79.3913043478261, 81.29167, 77.5, 75.9230769230769,
74.4615384615385,
72.68, 76.32, 74.9615384615385, 72.2307692307692, 92.54167,
105.619047619048, 93.4285714285714, 96.5, 90, NaN, NaN, 86.5,
NaN, 87, 90.25, 92.5, 98.75, 95.75, 94, 88.25, 54.3,
52, 74.5, 77.75, 81, 97.3, 82.5, 85, 66.5, 51, 81
), Glucose.sd = c(18.9256439784753, 27.5922694487665, NA, 25.3050314242961,
NA, 17.3917605012642, 21.027491163127, 12.0094407308029, 28.0728219695373,
17.7334538264655, 10.7700439253443, 12.7778454104490, 11.0075432274935,
14.6992242542214, 12.2739709270814, 10.9266328000819, 10.4457573279225,
13.1338669682033, 8.2194890352138, 19.9556174213344, 17.6079090620795,
10.9299869800753, 19.3052801217623, 29.7883806046522, 17.2032665779607,
18.4563366797521, 18.0554700852678, NA, NA, 19.7399763593239,
NA, 22.3159136044214, 28.5116935075184, 21.7638844572072, 12.2848144742469,
12.0933866224478, 15.5777619273972, 11.842719282327, 39.1066916694999,
32.0936130717624, 66.8755062286136, 53.7796429887741, 17.6918060129541,
11.5902257671425, 34.6482322781408, 9.89949493661167, 26.1629509039023,
NA, NA), Glucose.se = c(3.71162415205983, 15.9304041937980, NA,
4.96272496248326, NA, 3.47835210025284, 4.12383029856479, 2.40188814616057,
5.61456439390745, 3.47781642709786, 2.11217938990701, 2.6908208981,
2.29523142628873, 3.06500013246251, 2.50541382409208, 2.23038958057991,
2.04858155574352, 2.57576322922309, 1.64389780704276, 3.99112348426689,
3.45319507311964, 2.14354680364304, 3.94067380331773, 6.50035756909396,
6.50222358617618, 5.32788547515463, 9.0277350426339, NA, NA,
9.86998817966195, NA, 11.1579568022107, 14.2558467537592, 10.8819422286036,
6.14240723712346, 6.04669331122391, 7.7096369861, 5.92135964116351,
22.5782589625015, 16.0468065358812, 33.4377531143068, 26.8898214943871,
10.2143689640297, 6.69161996662824, 24.5, 7, 18.5, NA, NA)), .Names =
c("Group",
"Event_name", "Glucose.n", "Glucose.m", "Glucose.sd", "Glucose.se"
), row.names = c(9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L,
18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L,
31L, 32L, 33L, 34L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L,
50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L,
63L), class = "data.frame")

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] renumber a list of numbers

2013-01-07 Thread Charles Determan Jr
Thank you for your response.  I didn't do the -7 because this is just a
small part of the dataset and the numbers are not all consistent (i.e.,
they occasionally skip values).  However, I did not think of the
as.numeric(as.factor()) sequence.  That did the trick; a simple lapse in
my thinking that cost me a lot of time.

Thanks again,
Charles
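
The trick in isolation, on toy data with the kind of gaps described (editor's sketch):

```r
Event_name <- c(8, 9, 11, 8, 9, 11)           # note the skipped 10

# factor() maps each distinct value to a sorted level;
# as.numeric() then returns the level index, giving
# consecutive integers regardless of gaps
renumbered <- as.numeric(as.factor(Event_name))
renumbered  # 1 2 3 1 2 3
```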

On Mon, Jan 7, 2013 at 11:23 AM, Sarah Goslee wrote:

> Hi,
>
> It isn't entirely clear what you want, because it seems too simple.
> And most of your sample data are irrelevant, aren't they?
>
> Why not just use:
>
> testdata$Event_name2 <- testdata$Event_name - 7
>
> Or you could try:
>
> testdata$Event_name3 <- as.numeric(as.factor(testdata$Event_name))
>
> which will make the values into consecutive integers.
>Event_name Event_name2 Event_name3
> 9   8   1   1
> 10  9   2   2
> 11 10   3   3
> 12 11   4   4
> 13 12   5   5
> 14 13   6   6
> 15 14   7   7
> 16 15   8   8
> 17 16   9   9
> 18 17  10  10
> 19 18  11  11
> 20 19  12  12
> 21 20  13  13
> 22 21  14  14
> 23 22  15  15
> 24 23  16  16
> 25 24  17  17
> 26 25  18  18
> 27 26  19  19
> 28 27  20  20
> 29 28  21  21
> 30 29  22  22
> 31 30  23  23
> 32 31  24  24
> 33 33  26  25
> 34 34  27  26
> 41  8   1   1
> 42  9   2   2
> 43 10   3   3
> 44 11   4   4
> 45 12   5   5
> 46 13   6   6
> 47 14   7   7
> 48 15   8   8
> 49 16   9   9
> 50 17  10  10
> 51 18  11  11
> 52 19  12  12
> 53 20  13  13
> 54 21  14  14
> 55 22  15  15
> 56 23  16  16
> 57 24  17  17
> 58 25  18  18
> 59 26  19  19
> 60     27  20  20
> 61 28  21  21
> 62 29  22  22
> 63 30  23  23
>
>
> On Mon, Jan 7, 2013 at 11:41 AM, Charles Determan Jr 
> wrote:
> > Greetings R users,
> >
> > I am trying to renumber my groups within the file shown below.  The
> groups
> > are currently set as 8,9,10,etc.  I would like to renumber this as
> > 1,2,3,etc.  I have searched the help files and only come across using the
> > rownames to renumber the values but I need to match values.  Any
> assistance
> > is always appreciated,
> >
> > Regards,
> > Charles
> >
> > structure(list(Group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
> > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> > 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("NO", "YES"
> > ), class = "factor"), Event_name = c(8, 9, 10, 11, 12, 13, 14,
> > 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
> > 31, 33, 34, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
> > 21, 22, 23, 24, 25, 26, 27, 28, 29, 30), Glucose.n = c(26, 3,
> > 0, 26, 1, 25, 26, 25, 25, 26, 26, 25, 23, 23, 24, 24, 26, 26,
> > 25, 25, 26, 26, 24, 21, 7, 12, 4, 0, 0, 4, 0, 4, 4, 4, 4, 4,
> > 4, 4, 3, 4, 4, 4, 3, 3, 2, 2, 2, 1, 1), Glucose.m = c(92.5,
> > 90.3,
> > NaN, 97.2307692307692, 116, 97.84, 107.653846153846, 105.32,
> > 102.6, 94.6538461538462, 96.076923076923, 92.24, 87.5652173913043,
> > 79.3913043478261, 81.29167, 77.5, 75.9230769230769,
> > 74.4615384615385,
> > 72.68, 76.32, 74.9615384615385, 72.2307692307692, 92.54167,
> > 105.619047619048, 93.4285714285714, 96.5, 90, NaN, NaN, 86.5,
> > NaN, 87, 90.25, 92.5, 98.75, 95.75, 94, 88.25, 54.3,
> > 52, 74.5, 77.75, 81, 97.3, 82.5, 85, 66.5, 51, 81
> > ), Glucose.sd = c(18.9256439784753, 27.5922694487665, N

[R] context of runif()

2013-02-13 Thread Charles Determan Jr
Greetings,

I am exploring some random forest analysis methods and have come upon one
aspect I don't fully understand from any manual.  The code of interest is
as follows from the randomForest package:

myiris=cbind(iris[1:4], matrix(runif(508*nrow(iris)),nrow(iris),508))

This would be followed by the rfcv() function for cross-validation, but I
am confused about the syntax above.

My question is: why 508?  Is this some arbitrary number that one just
chooses, or is there some logic to the choice?  I have looked through the
package documentation and the runif() help, which tells me that for
runif(n, min=0, max=1):
n = number of observations
min&max = lower and upper limits

I still don't follow exactly what is taking place here.

Regards,
Charles

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
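
For reference, what that cbind() line constructs (editor's note): the 4 real iris predictors plus 508 columns of uniform noise, giving a 150 x 512 data frame. The 508 appears to be arbitrary padding chosen to swamp the informative variables, so that rfcv() can show whether cross-validation recovers them.

```r
data(iris)
n <- nrow(iris)                       # 150 observations

# 508 columns of pure noise on [0, 1]
noise <- matrix(runif(508 * n), nrow = n, ncol = 508)

myiris <- cbind(iris[1:4], noise)     # 4 real predictors + 508 noise ones
dim(myiris)                           # 150 512
```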


[R] caret pls model statistics

2013-03-02 Thread Charles Determan Jr
Greetings,

I have been exploring the use of the caret package to conduct some plsda
modeling.  Previously, I have come across methods that result in a R2 and
Q2 for the model.  Using the 'iris' data set, I wanted to see if I could
accomplish this with the caret package.  I use the following code:

library(caret)
data(iris)

#needed to convert to numeric in order to do regression
#I don't fully understand this but if I left as a factor I would get an
error following the summary function
iris$Species=as.numeric(iris$Species)
inTrain1=createDataPartition(y=iris$Species,
p=.75,
list=FALSE)

training1=iris[inTrain1,]
testing1=iris[-inTrain1,]

ctrl1=trainControl(method="cv",
number=10)

plsFit2=train(Species~.,
data=training1,
method="pls",
trControl=ctrl1,
metric="Rsquared",
preProc=c("scale"))

data(iris)
training1=iris[inTrain1,]
datvars=training1[,1:4]
dat.sc=scale(datvars)

n=nrow(dat.sc)
dat.indices=seq(1,n)

timematrix=with(training1,
classvec2classmat(Species[dat.indices]))

pls.dat=plsr(timematrix ~ dat.sc,
ncomp=3, method="oscorespls", data=training1)

x=crossval(pls.dat, segments=10)

summary(x)
summary(plsFit2)

I see two different R2 values and I cannot figure out how to get the Q2
value.  Any insight as to what my errors may be would be appreciated.

Regards,

-- 
Charles

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] caret pls model statistics

2013-03-02 Thread Charles Determan Jr
I have discovered one of my errors.  The timematrix was unnecessary and an
unfortunate habit I brought over from another package.  The following
produces the same R2 values as it should; however, I still don't know how
to retrieve Q2 values.  Any insight would again be appreciated:

library(caret)
library(pls)

data(iris)

#needed to convert to numeric in order to do regression
#I don't fully understand this but if I left as a factor I would get an
error following the summary function
iris$Species=as.numeric(iris$Species)
inTrain1=createDataPartition(y=iris$Species,
p=.75,
list=FALSE)

training1=iris[inTrain1,]
testing1=iris[-inTrain1,]

ctrl1=trainControl(method="cv",
number=10)

plsFit2=train(Species~.,
data=training1,
method="pls",
trControl=ctrl1,
metric="Rsquared",
preProc=c("scale"))

data(iris)
training1=iris[inTrain1,]
datvars=training1[,1:4]
dat.sc=scale(datvars)

pls.dat=plsr(as.numeric(training1$Species)~dat.sc,
ncomp=3, method="oscorespls", data=training1)

x=crossval(pls.dat, segments=10)

summary(x)
summary(plsFit2)

Regards,
Charles

On Sat, Mar 2, 2013 at 3:55 PM, Charles Determan Jr wrote:

> Greetings,
>
> I have been exploring the use of the caret package to conduct some plsda
> modeling.  Previously, I have come across methods that result in a R2 and
> Q2 for the model.  Using the 'iris' data set, I wanted to see if I could
> accomplish this with the caret package.  I use the following code:
>
> library(caret)
> data(iris)
>
> #needed to convert to numeric in order to do regression
> #I don't fully understand this but if I left as a factor I would get an
> error following the summary function
> iris$Species=as.numeric(iris$Species)
> inTrain1=createDataPartition(y=iris$Species,
> p=.75,
> list=FALSE)
>
> training1=iris[inTrain1,]
> testing1=iris[-inTrain1,]
>
> ctrl1=trainControl(method="cv",
> number=10)
>
> plsFit2=train(Species~.,
> data=training1,
> method="pls",
> trControl=ctrl1,
> metric="Rsquared",
> preProc=c("scale"))
>
> data(iris)
> training1=iris[inTrain1,]
> datvars=training1[,1:4]
> dat.sc=scale(datvars)
>
> n=nrow(dat.sc)
> dat.indices=seq(1,n)
>
> timematrix=with(training1,
> classvec2classmat(Species[dat.indices]))
>
> pls.dat=plsr(timematrix ~ dat.sc,
> ncomp=3, method="oscorespls", data=training1)
>
> x=crossval(pls.dat, segments=10)
>
> summary(x)
> summary(plsFit2)
>
> I see two different R2 values and I cannot figure out how to get the Q2
> value.  Any insight as to what my errors may be would be appreciated.
>
> Regards,
>
> --
> Charles
>



-- 
Charles Determan
Integrated Biosciences PhD Student
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] caret pls model statistics

2013-03-03 Thread Charles Determan Jr
Thank you for your response, Max.  Is there some literature on which you
base that statement?  I am confused, as I have seen many publications that
report R^2 and Q^2 following PLSDA analysis.  The analysis usually is to
discriminate groups (i.e., classification).  Are these papers incorrect in
using these statistics?

Regards,
Charles

On Sat, Mar 2, 2013 at 10:39 PM, Max Kuhn  wrote:

> Charles,
>
> You should not be treating the classes as numeric (is virginica really
> three times setosa?). Q^2 and/or R^2 are not appropriate for classification.
>
> Max
>
>
> On Sat, Mar 2, 2013 at 5:21 PM, Charles Determan Jr wrote:
>
>> I have discovered on of my errors.  The timematrix was unnecessary and an
>> unfortunate habit I brought from another package.  The following provides
>> the same R2 values as it should, however, I still don't know how to
>> retrieve Q2 values.  Any insight would again be appreciated:
>>
>> library(caret)
>> library(pls)
>>
>> data(iris)
>>
>> #needed to convert to numeric in order to do regression
>> #I don't fully understand this but if I left as a factor I would get an
>> error following the summary function
>> iris$Species=as.numeric(iris$Species)
>> inTrain1=createDataPartition(y=iris$Species,
>> p=.75,
>> list=FALSE)
>>
>> training1=iris[inTrain1,]
>> testing1=iris[-inTrain1,]
>>
>> ctrl1=trainControl(method="cv",
>> number=10)
>>
>> plsFit2=train(Species~.,
>> data=training1,
>> method="pls",
>> trControl=ctrl1,
>> metric="Rsquared",
>> preProc=c("scale"))
>>
>> data(iris)
>> training1=iris[inTrain1,]
>> datvars=training1[,1:4]
>> dat.sc=scale(datvars)
>>
>> pls.dat=plsr(as.numeric(training1$Species)~dat.sc,
>> ncomp=3, method="oscorespls", data=training1)
>>
>> x=crossval(pls.dat, segments=10)
>>
>> summary(x)
>> summary(plsFit2)
>>
>> Regards,
>> Charles
>>
>> On Sat, Mar 2, 2013 at 3:55 PM, Charles Determan Jr > >wrote:
>>
>> > Greetings,
>> >
>> > I have been exploring the use of the caret package to conduct some plsda
>> > modeling.  Previously, I have come across methods that result in a R2
>> and
>> > Q2 for the model.  Using the 'iris' data set, I wanted to see if I could
>> > accomplish this with the caret package.  I use the following code:
>> >
>> > library(caret)
>> > data(iris)
>> >
>> > #needed to convert to numeric in order to do regression
>> > #I don't fully understand this but if I left as a factor I would get an
>> > error following the summary function
>> > iris$Species=as.numeric(iris$Species)
>> > inTrain1=createDataPartition(y=iris$Species,
>> > p=.75,
>> > list=FALSE)
>> >
>> > training1=iris[inTrain1,]
>> > testing1=iris[-inTrain1,]
>> >
>> > ctrl1=trainControl(method="cv",
>> > number=10)
>> >
>> > plsFit2=train(Species~.,
>> > data=training1,
>> > method="pls",
>> > trControl=ctrl1,
>> > metric="Rsquared",
>> > preProc=c("scale"))
>> >
>> > data(iris)
>> > training1=iris[inTrain1,]
>> > datvars=training1[,1:4]
>> > dat.sc=scale(datvars)
>> >
>> > n=nrow(dat.sc)
>> > dat.indices=seq(1,n)
>> >
>> > timematrix=with(training1,
>> > classvec2classmat(Species[dat.indices]))
>> >
>> > pls.dat=plsr(timematrix ~ dat.sc,
>> > ncomp=3, method="oscorespls", data=training1)
>> >
>> > x=crossval(pls.dat, segments=10)
>> >
>> > summary(x)
>> > summary(plsFit2)
>> >
>> > I see two different R2 values and I cannot figure out how to get the Q2
>> > value.  Any insight as to what my errors may be would be appreciated.
>> >
>> > Regards,
>> >
>> > --
>> > Charles
>> >
>>
>>
>>
>> --
>> Charles Determan
>> Integrated Biosciences PhD Student
>> University of Minnesota
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
>
> Max
>



-- 
Charles Determan
Integrated Biosciences PhD Student
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] caret pls model statistics

2013-03-03 Thread Charles Determan Jr
I was under the impression that in PLS analysis, R2 was calculated as
1 - (residual sum of squares) / (total sum of squares).  Is this still what
you are referring to?  I am aware of the linear R2, which measures how well
two variables are correlated, but the prior equation seems different to me.
Could you explain whether this is the same concept?
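For what it's worth, a quick check with lm() suggests the two definitions
coincide at least for ordinary least squares with an intercept (built-in
mtcars data, just as a sanity check):

```r
# Squared-correlation definition vs. 1 - RSS/TSS definition of R2
fit    <- lm(mpg ~ wt, data = mtcars)
r2_ss  <- 1 - sum(resid(fit)^2) / sum((mtcars$mpg - mean(mtcars$mpg))^2)
r2_cor <- cor(mtcars$mpg, fitted(fit))^2
all.equal(r2_ss, r2_cor)   # the two agree for an OLS fit
```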

Charles

On Sun, Mar 3, 2013 at 12:46 PM, Max Kuhn  wrote:

> > Is there some literature that you make that statement?
>
> No, but there isn't literature on changing a lightbulb with a duck either.
>
> > Are these papers incorrect in using these statistics?
>
> Definitely, if they convert 3+ categories to integers (but there are
> specialized R^2 metrics for binary classification models). Otherwise, they
> are just using an ill-suited "score".
>
> How would you explain such an R^2 value to someone? R^2 is a function of
> correlation between the two random variables. For two classes, one of them
> is binary. What does it mean?
>
> Historically, models rooted in computer science (eg neural networks) used
> RMSE or SSE to fit models with binary outcomes and that *can* work work
> well.
>
> However, I don't think that communicating R^2 is effective. Other metrics
> (e.g. accuracy, Kappa, area under the ROC curve, etc) are designed to
> measure the ability of a model to classify and work well. With 3+
> categories, I tend to use Kappa.
>
> Max
>
>
>
>
> On Sun, Mar 3, 2013 at 10:53 AM, Charles Determan Jr wrote:
>
>> Thank you for your response Max.  Is there some literature that you make
>> that statement?  I am confused as I have seen many publications that
>> contain R^2 and Q^2 following PLSDA analysis.  The analysis usually is to
>> discriminate groups (ie. classification).  Are these papers incorrect in
>> using these statistics?
>>
>> Regards,
>> Charles
>>
>>
>> On Sat, Mar 2, 2013 at 10:39 PM, Max Kuhn  wrote:
>>
>>> Charles,
>>>
>>> You should not be treating the classes as numeric (is virginica really
>>> three times setosa?). Q^2 and/or R^2 are not appropriate for classification.
>>>
>>> Max
>>>
>>>
>>> On Sat, Mar 2, 2013 at 5:21 PM, Charles Determan Jr wrote:
>>>
>>>> I have discovered on of my errors.  The timematrix was unnecessary and
>>>> an
>>>> unfortunate habit I brought from another package.  The following
>>>> provides
>>>> the same R2 values as it should, however, I still don't know how to
>>>> retrieve Q2 values.  Any insight would again be appreciated:
>>>>
>>>> library(caret)
>>>> library(pls)
>>>>
>>>> data(iris)
>>>>
>>>> #needed to convert to numeric in order to do regression
>>>> #I don't fully understand this but if I left as a factor I would get an
>>>> error following the summary function
>>>> iris$Species=as.numeric(iris$Species)
>>>> inTrain1=createDataPartition(y=iris$Species,
>>>> p=.75,
>>>> list=FALSE)
>>>>
>>>> training1=iris[inTrain1,]
>>>> testing1=iris[-inTrain1,]
>>>>
>>>> ctrl1=trainControl(method="cv",
>>>> number=10)
>>>>
>>>> plsFit2=train(Species~.,
>>>> data=training1,
>>>> method="pls",
>>>> trControl=ctrl1,
>>>> metric="Rsquared",
>>>> preProc=c("scale"))
>>>>
>>>> data(iris)
>>>> training1=iris[inTrain1,]
>>>> datvars=training1[,1:4]
>>>> dat.sc=scale(datvars)
>>>>
>>>> pls.dat=plsr(as.numeric(training1$Species)~dat.sc,
>>>> ncomp=3, method="oscorespls", data=training1)
>>>>
>>>> x=crossval(pls.dat, segments=10)
>>>>
>>>> summary(x)
>>>> summary(plsFit2)
>>>>
>>>> Regards,
>>>> Charles
>>>>
>>>> On Sat, Mar 2, 2013 at 3:55 PM, Charles Determan Jr >>> >wrote:
>>>>
>>>> > Greetings,
>>>> >
>>>> > I have been exploring the use of the caret package to conduct some
>>>> plsda
>>>> > modeling.  Previously, I have come across methods that result in a R2
>>>> and
>>>> > Q2 for the model.  Using the 'iris' data set, I wanted to see if I
>>>> could
>>>> > accomplish this with the caret package.  I use the following code:
>>

Re: [R] caret pls model statistics

2013-03-05 Thread Charles Determan Jr
Does anyone know of any literature on the kappa statistic with plsda?  I
have been trying to find papers that used plsda for classification and have
yet to come across this kappa value.  The papers I come across typically
report R2 as an indicator of model fit.  I want to make sure I conduct such
analysis appropriately; any guidance is appreciated.

Regards,
Charles

On Sun, Mar 3, 2013 at 4:38 PM, Max Kuhn  wrote:

> That the most common formula, but not the only one. See
>
>   Kvålseth, T. (1985). Cautionary note about $R^2$. *American Statistician
> *, *39*(4), 279–285.
>
> Traditionally, the symbol 'R' is used for the Pearson correlation
> coefficient and one way to calculate R^2 is... R^2.
>
> Max
>
>
> On Sun, Mar 3, 2013 at 3:16 PM, Charles Determan Jr wrote:
>
>> I was under the impression that in PLS analysis, R2 was calculated by 1-
>> (Residual sum of squares) / (Sum of squares).  Is this still what you are
>> referring to?  I am aware of the linear R2 which is how well two variables
>> are correlated but the prior equation seems different to me.  Could you
>> explain if this is the same concept?
>>
>> Charles
>>
>>
>> On Sun, Mar 3, 2013 at 12:46 PM, Max Kuhn  wrote:
>>
>>> > Is there some literature that you make that statement?
>>>
>>> No, but there isn't literature on changing a lightbulb with a duck
>>> either.
>>>
>>> > Are these papers incorrect in using these statistics?
>>>
>>> Definitely, if they convert 3+ categories to integers (but there are
>>> specialized R^2 metrics for binary classification models). Otherwise, they
>>> are just using an ill-suited "score".
>>>
>>>  How would you explain such an R^2 value to someone? R^2 is
>>> a function of correlation between the two random variables. For two
>>> classes, one of them is binary. What does it mean?
>>>
>>> Historically, models rooted in computer science (eg neural networks)
>>> used RMSE or SSE to fit models with binary outcomes and that *can* work
>>> work well.
>>>
>>> However, I don't think that communicating R^2 is effective. Other
>>> metrics (e.g. accuracy, Kappa, area under the ROC curve, etc) are designed
>>> to measure the ability of a model to classify and work well. With 3+
>>> categories, I tend to use Kappa.
>>>
>>> Max
>>>
>>>
>>>
>>>
>>> On Sun, Mar 3, 2013 at 10:53 AM, Charles Determan Jr 
>>> wrote:
>>>
>>>> Thank you for your response Max.  Is there some literature that you
>>>> make that statement?  I am confused as I have seen many publications that
>>>> contain R^2 and Q^2 following PLSDA analysis.  The analysis usually is to
>>>> discriminate groups (ie. classification).  Are these papers incorrect in
>>>> using these statistics?
>>>>
>>>> Regards,
>>>> Charles
>>>>
>>>>
>>>> On Sat, Mar 2, 2013 at 10:39 PM, Max Kuhn  wrote:
>>>>
>>>>> Charles,
>>>>>
>>>>> You should not be treating the classes as numeric (is virginica
>>>>> really three times setosa?). Q^2 and/or R^2 are not appropriate for
>>>>> classification.
>>>>>
>>>>> Max
>>>>>
>>>>>
>>>>> On Sat, Mar 2, 2013 at 5:21 PM, Charles Determan Jr 
>>>>> wrote:
>>>>>
>>>>>> I have discovered on of my errors.  The timematrix was unnecessary
>>>>>> and an
>>>>>> unfortunate habit I brought from another package.  The following
>>>>>> provides
>>>>>> the same R2 values as it should, however, I still don't know how to
>>>>>> retrieve Q2 values.  Any insight would again be appreciated:
>>>>>>
>>>>>> library(caret)
>>>>>> library(pls)
>>>>>>
>>>>>> data(iris)
>>>>>>
>>>>>> #needed to convert to numeric in order to do regression
>>>>>> #I don't fully understand this but if I left as a factor I would get
>>>>>> an
>>>>>> error following the summary function
>>>>>> iris$Species=as.numeric(iris$Species)
>>>>>> inTrain1=createDataPartition(y=iris$Species,
>>>>>> p=.75,
>>>>>> list=FALSE)
>>>>>>
>>>>>> training1=iri

[R] Multivariate Power Test?

2013-03-06 Thread Charles Determan Jr
A generic question: I am familiar with standard power calculations in R;
however, a lot of the data I work with is multivariate.  Is there any
package/function that you would recommend for conducting such a power
analysis?  Any recommendations would be appreciated.

Thank you for your time,

Charles

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multivariate Power Test?

2013-03-07 Thread Charles Determan Jr
I refer to a multivariate model.  For example, I have two groups (control
and test) and multiple variables measured for each (V1, V2, V3, ... Vn).  I
wasn't sure if there was any way to conduct the power analysis other than
doing it as you would for a single variable and then accounting for
multiple testing.  I will look into the Clinical Trials Task View.  If
there are any recommendations from others, or a general approach to
multivariate power calculations, I would love to hear them.

Thanks again Marc,

Charles

On Thu, Mar 7, 2013 at 6:28 AM, Marc Schwartz  wrote:

> On Mar 6, 2013, at 10:50 PM, Charles Determan Jr  wrote:
>
> > Generic question... I am familiar with generic power calculations in R,
> > however a lot of the data I primarily work with is multivariate.  Is
> there
> > any package/function that you would recommend to conduct such power
> > analysis?  Any recommendations would be appreciated.
> >
> > Thank you for your time,
> >
> > Charles
>
>
>
> Are you referring to a multivariate response or a multivariable model?
> Just trying to parse the terminology better.
>
> If the former, I don't believe that there is anything in R, but could be
> wrong. If correct, then you might want to look at simulation.
>
> If the latter, you might want to look at the Clinical Trials Task View:
>
>   http://cran.r-project.org/web/views/ClinicalTrials.html
>
> as there are various packages that might fit what you need, but again,
> simulation is always an option.
>
> Regards,
>
> Marc Schwartz
>
>


-- 
Charles Determan
Integrated Biosciences PhD Student
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Kruskal-Wallis

2013-04-15 Thread Charles Determan Jr
One statistical point beyond A.K.'s well-done response.  As you may know,
Kruskal-Wallis is a non-parametric equivalent of ANOVA.  However, you only
have two groups and so do not require an ANOVA-style approach.  You could
simply use a Mann-Whitney U test (aka the independent-samples Wilcoxon
test) via wilcox.test().
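For example, on two made-up groups:

```r
# Mann-Whitney U (independent Wilcoxon) test on two simulated groups
set.seed(42)
control <- rnorm(10)
treated <- rnorm(10, mean = 1)
wilcox.test(control, treated)   # reports the W statistic and p-value
```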

Charles

On Mon, Apr 15, 2013 at 8:23 AM, arun  wrote:

> Hi,
>
> set.seed(25)
>  myFile1<-as.data.frame(matrix(sample(1:40,50,replace=TRUE),nrow=10))
>  row.names(myFile1)<- LETTERS[1:10]
> groups <- rep (0:1, c(3,2))
> kruskal<-apply(myFile1,1,kruskal.test,groups)
>  p_kruskal <- sapply(kruskal, function(x) x$p.value)
>  p_kruskal
> # A  B  C  D  E
> F  G
> #0.08326452 0.08326452 0.56370286 0.56370286 0.24821308 1.
> 0.08326452
>  #H  I  J
> #1. 0.37425932 0.56370286
> #or
>  sapply(seq_len(nrow(myFile1)),function(i)
> kruskal.test(unlist(myFile1[i,]),groups)$p.value)
>  [1] 0.08326452 0.08326452 0.56370286 0.56370286 0.24821308 1.
>  [7] 0.08326452 1. 0.37425932 0.56370286
> A.K.
>
> - Original Message -
> From: Chintanu 
> To: R help 
> Cc:
> Sent: Monday, April 15, 2013 1:18 AM
> Subject: [R] Kruskal-Wallis
>
> Hi,
>
> I have got two groups of samples; and for every row, I wish to calculate
> Kruskal-Wallis' p-value.
> In the example below, and the stars () show where I am struggling to
> design and put things together. Any help would be appreciated.
>
>
> myFile <- data.frame(Sample_1a = 1:10, Sample_1b = 2:11, Sample_1c = 3:12,
> Sample_2a=4:13, Sample_2b=7:16, row.names=LETTERS[1:10])
>
> groups <- rep (0:1, c(3,2))
>
> kruskal <- apply(myFile [1:nrow(myFile),], 1,  kruskal.test, **)
>
> p_kruskal <- sapply(kruskal, function(x) x$p.value)
>
> Thanks,
> Chintanu
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Charles Determan
Integrated Biosciences PhD Student
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reading CSV file

2013-04-19 Thread Charles Determan Jr
Are you sure the file is in your current working directory?  Often people
simply supply the full path, such as "/Users/Name/RBS.csv".
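A quick way to check (the path in the comment is just a placeholder):

```r
# Diagnose the 'cannot open file' error before calling read.csv()
getwd()                   # where R is currently looking
file.exists("RBS.csv")    # FALSE means the file is not in that directory
# then either setwd() to the file's folder, or pass the full path, e.g.
# contol <- read.csv("/Users/Name/RBS.csv")   # hypothetical path
```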

Cheers,


On Fri, Apr 19, 2013 at 9:30 AM, Gafar Matanmi Oyeyemi
wrote:

> I am trying to read a csv file using the code;
> contol <- read.csv("RBS.csv")
> This is the error message I got;
> Error in file(file, "r") : unable to open connection
> In addition: Warning message:
> In file(file, "r") :
> cannot open file 'RBS.csv', reason 'No such file or directory'
>
>
> Where was the mistake?
>
> --
> OYEYEMI, Gafar Matanmi (Ph.D)
> Senior Lecturer
> Department of Statistics
> University of Ilorin.
> Area of Specialization: Multivariate Analysis, Statistical Quality Control
> & Total Quality Management.
> Tel: +2348052278655, +2348068241885
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Charles Determan
Integrated Biosciences PhD Student
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can a column of a list be called?

2013-04-26 Thread Charles Determan Jr
If you are using the list simply as a collection of data frames, a simple
example to accomplish what you are describing is this:

data(iris)
data(mtcars)
y=list(iris, mtcars)
#return Sepal.Length column from first data frame in list
#list[[number of list component]][number of column]
y[[1]][1]
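If you need the column as a plain vector (e.g. to pass as a variable to a
function), double-bracket or name-based extraction works too:

```r
# Pulling a column out of a list of data frames as a plain vector
y <- list(iris, mtcars)
y[[1]][["Sepal.Length"]]        # the column as a vector, by name
mean(y[[1]][["Sepal.Length"]])  # usable directly as a function argument
```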

Cheers,



On Thu, Apr 25, 2013 at 7:24 PM, Jana Makedonska wrote:

> Hello Everyone,
>
> I would like to know if I can call one of the columns of a list, to use it
> as a variable in a function.
>
> Thanks in advance for any advice!
>
> Jana
>
> --
>
> Jana Makedonska,
> B.Sc. Biology, Universite Paul Sabatier Toulouse III
> M.Sc. Paleontology, Paleobiology and Phylogeny, Universite de Montpellier
> II
> Ph.D. candidate in Physical Anthropology and Part-time lecturer
> Department of Anthropology
> College of Arts & Sciences
> State University of New York at Albany
> 1400 Washington Avenue
> 1 Albany, NY
> Office phone: 518-442-4699
> http://electricsongs.academia.edu/JanaMakedonska
> http://www.youtube.com/watch?v=OHbT9VvtonM<
> http://www.youtube.com/watch?v=jRoMoLjzpf4&list=PL5BF6ACDCC2E4AAA0&index=7
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Charles Determan
Integrated Biosciences PhD Student
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] topGO printGenes

2013-05-06 Thread Charles Determan Jr
Greetings R users,

I have a rather specific question I hope someone could assist me with.

I have been using the topGO package for some Gene Ontology analysis of some
RNA-seq data.  As such, I use an organism database from the biomaRt
library.  I can create a topGOdata object with the following command:

GOdata=new("topGOdata", ontology="BP", allGenes=geneList,
   nodeSize=10,
   annot=annFUN.org,
   mapping="org.Ss.eg.db",
   ID="Symbol")

Everything works well except when I wish to look at the genes within an
individual GO term, where I can't get the printGenes() function to work.

I initially thought to try this because I don't use a microarray library:
gt=printGenes(GOdata, whichTerms = goID, chip="org.Ss.eg.db", numChar = 40)

but I received the error:
Error in get(paste(chip, "ENTREZID", sep = "")):
   object 'org.Ss.egENTREZID' not found

Has anyone experienced this or have any thoughts?

Regards,

-- 
Charles Determan
Integrated Biosciences PhD Student
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] writing to the screen and extra "0"

2013-07-01 Thread Charles Determan Jr
Hi Thomas,

If you put the list.files() call inside the write() function, you won't
get the indices.
Try:

write(list.files(pattern="*"), file="my_files.txt")
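Incidentally, the stray 0 from the original wrt() most likely comes from
system(): it returns the command's exit status (0 on success), and that
status is what write() receives.  writeLines() is another option that
emits the bare names:

```r
# writeLines() prints bare file names, one per line, with no indices
writeLines(list.files())
```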

Cheers,
Charles



On Mon, Jul 1, 2013 at 2:03 PM, Thomas Grzybowski <
thomasgrzybow...@gmail.com> wrote:

> Hi.
>
> list.files(pattern = "*")
>
> gives me output with the R list indices at the left of each line on the
> screen.  I want only file names.
>
> Thanks!
> Tom Grzybowski
>
>
> On 07/01/2013 02:55 PM, Rui Barradas wrote:
>
>> Hello,
>>
>> Try instead
>>
>>
>> list.files(pattern = "*")
>>
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>> Em 01-07-2013 19:23, Thomas Grzybowski escreveu:
>>
>>>
>>> I am using the "write" function like so (R 3.0.1 on linux):
>>>
>>> "wrt" <-
>>> function()
>>> {
>>>  write(system("ls *"),file="")
>>> }
>>>
>>> When the files are listed to the screen with wrt(), there is a "0"
>>> character prepended to the output on the screen.  Worse, when I remove
>>> the 'file=""' argument to "write", a file named "data" is created in my
>>> default directory, with a zero in it.
>>>
>>> All I am trying to do is output the "ls" of files in my directory,
>>> without any extra characters or type-attribute information. Thanks for
>>> your help!
>>>
>>> Thomas Grzybowski
>>>
>>> __**
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/**listinfo/r-help<https://stat.ethz.ch/mailman/listinfo/r-help>
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/**posting-guide.html<http://www.R-project.org/posting-guide.html>
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
> __**
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/**listinfo/r-help<https://stat.ethz.ch/mailman/listinfo/r-help>
> PLEASE do read the posting guide http://www.R-project.org/**
> posting-guide.html <http://www.R-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Better way of Grouping?

2012-09-28 Thread Charles Determan Jr
Hello R users,

This is more of a convenience question that I hope others might find useful
if there is a better answer.  I work with large datasets that require
multiple parsing stages for different analyses.  For example, compare group
3 vs. group 4.  A more complicated comparison would be time B in group 3 of
group L with time B in group 4 of group L.  I normally subset each group
with the following type of code.

data=read(...)

#L v D
L=data[LvD %in% c("L"),]
D=data[LvD %in% c("D"),]

#Groups 3 and 4 within L and D
group3L=L[group %in% c("3"),]
group4L=L[group %in% c("3"),]

group3D=D[group %in% c("3"),]
group4D=D[group %in% c("3"),]

#Times B, S45, FR2, FR8
you get the idea


Is there a more efficient way to subset groups?  Thanks for any insight.

Regards,
Charles

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Better way of Grouping?

2012-10-01 Thread Charles Determan Jr
Thank you Jeff,

The main purpose I am looking to use these subsets for is to do comparisons
between groups and/or timepoints within groups.  For example, the
difference in means between 3 and 4, or the percent difference between
groups L and D within group 3.  The code I have provided is almost exactly
as used, although David accurately noted I mistakenly typed '3' where I
intended to put '4'.  I am just looking for some general guidance on
improving how I go about my code.  Because there are so many subgroups that
can be formed with three layers of groups, it simply becomes cluttered, and
I wanted to learn whether there was a 'better' way to organize groups for
comparisons.  Does that help clarify?
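In the meantime I have been experimenting with split(), which seems to
build every subgroup in one pass (toy data standing in for the real set):

```r
# split() on a list of the grouping columns builds all subgroups at once
data <- data.frame(LvD   = rep(c("L", "D"), each = 4),
                   group = rep(c(3, 4), times = 4),
                   value = 1:8)
subsets <- split(data, list(data$LvD, data$group))
names(subsets)     # subgroup names such as "L.3", "D.4", etc.
subsets[["L.3"]]   # rows with LvD == "L" and group == 3
```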

Thanks again,
Charles

On Fri, Sep 28, 2012 at 4:25 PM, Jeff Newmiller wrote:

> You have not specified the objective function you are trying to optimize
> with your term "efficient", or what you do with all of these subsets once
> you have them.
>
> For notational simplification and completeness of coverage (not
> necessarily computational speedup) you might want to look at "tapply" or
> ddply/dlply from the plyr package. If you build lists of subsets you can
> index into them according to grouping value. You can use expand.grid to
> build all permutations of grouping values to use as indexes into those
> lists of subsets.
>
> To reiterate, you have not indicated what you want to do with these
> subsets, so there could be special-purpose functions that do what you want.
>  As always, reproducible code leads to reproducible answers. :)
> ---
> Jeff NewmillerThe .   .  Go Live...
> DCN:Basics: ##.#.   ##.#.  Live
> Go...
>   Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
> /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
> -------
> Sent from my phone. Please excuse my brevity.
>
> Charles Determan Jr  wrote:
>
> >Hello R users,
> >
> >This is more of a convenience question that I hope others might find
> >useful
> >if there is a better answer.  I work with large datasets that requires
> >multiple parsing stages for different analysis.  For example, compare
> >group
> >3 vs. group 4.  A more complicated comparison would be time B in group
> >3 of
> >group L with B in group 4 of group L.  I normally subset each group
> >with
> >the following type of code.
> >
> >data=read(...)
> >
> >#L v D
> >L=data[LvD %in% c("L"),]
> >D=data[LvD %in% c("D"),]
> >
> >#Groups 3 and 4 within L and D
> >group3L=L[group %in% c("3"),]
> >group4L=L[group %in% c("3"),]
> >
> >group3D=D[group %in% c("3"),]
> >group4D=D[group %in% c("3"),]
> >
> >#Times B, S45, FR2, FR8
> >you get the idea
> >
> >
> >Is there a more efficient way to subset groups?  Thanks for any
> >insight.
> >
> >Regards,
> >Charles
> >
> >   [[alternative HTML version deleted]]
> >
> >__
> >R-help@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
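
A minimal sketch of the list-of-subsets approach suggested above (split by
grouping values, enumerate the combinations with expand.grid).  The column
names LvD and group come from the original code; the data here are
illustrative only:

```r
# Sketch of Jeff's list-of-subsets idea, assuming a data frame with
# grouping columns LvD and group as in the original post.
dat <- data.frame(LvD   = rep(c("L", "D"), each = 4),
                  group = rep(c("3", "4"), times = 4),
                  y     = rnorm(8))

# split() builds every subset in one call; interaction() combines the keys
subsets <- split(dat, interaction(dat$LvD, dat$group, drop = TRUE))

# expand.grid() enumerates all key combinations for programmatic access
keys <- expand.grid(LvD = c("L", "D"), group = c("3", "4"))

# index into the list by combined key, e.g. the group-3 rows within L
subsets[[paste("L", "3", sep = ".")]]
```

This avoids hand-writing one assignment per subgroup; each subset is reached
by name instead of by its own variable.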


[R] Adding Time when Blanks

2012-10-12 Thread Charles Determan Jr
Greetings,

My data set has dates and times that I am working with.  Some of the times
in Time_of_end are blank; a blank indicates that the particular
experiment lasted the full 48 hours.  I would like to create another column,
End_of_Experiment, by adding 48 hours to Start_of_Experiment where
Time_of_end is blank, while keeping the recorded end times where they
exist.  I was thinking of a conditional statement, but I can't figure out what
code to use.  Any insight would be appreciated.  Let me know if there is
anything else you need.  Thanks for your time.

I have Start_of_Experiment in POSIX format for time calculations from the
following:

data$Start_of_Experiment=as.POSIXct(strptime(data$start_time, "%m/%d/%Y
%H:%M:%S"))

Here is a subset of my data.

ID  group Start_date Time_of_experiment Time_of_end
1    20209 4   02/02/2009   12:38:00
26   30209 3   03/02/2009   12:40:00  13:32:00
27   31609 4   03/16/2009   11:28:00  12:26:00
28   40609 4   04/06/2009   11:17:00
53   42709 4   04/27/2009   11:15:00   9:30:00
76   51109 3   05/11/2009   11:51:00
101  51809 1  05/18/2009   12:28:00
126  62209 3  06/22/2009   11:31:00
150  71309 4  07/13/2009   12:12:00  13:37:00
173  81009 4  08/10/2009   11:32:00  20:52:00
Start_of_Experiment
1   2009-02-02 12:38:00
26  2009-03-02 12:40:00
27  2009-03-16 11:28:00
28  2009-04-06 11:17:00
53  2009-04-27 11:15:00
76  2009-05-11 11:51:00
101 2009-05-18 12:28:00
126 2009-06-22 11:31:00
150 2009-07-13 12:12:00
173 2009-08-10 11:32:00
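
One vectorized approach (a sketch, not from the thread, using illustrative
data with the column names above):

```r
# Hedged sketch: fill blank end times with start + 48 hours via a
# vectorized conditional; "" in Time_of_end marks a full 48-hour run.
dat <- data.frame(
  Start_date         = c("02/02/2009", "03/02/2009"),
  Time_of_experiment = c("12:38:00", "12:40:00"),
  Time_of_end        = c("", "13:32:00"),
  stringsAsFactors   = FALSE)

dat$Start_of_Experiment <- as.POSIXct(
  strptime(paste(dat$Start_date, dat$Time_of_experiment),
           "%m/%d/%Y %H:%M:%S"))

end_num <- ifelse(
  dat$Time_of_end == "",
  dat$Start_of_Experiment + 48 * 60 * 60,   # blank: add 48 hours
  as.POSIXct(strptime(paste(dat$Start_date, dat$Time_of_end),
                      "%m/%d/%Y %H:%M:%S")))

# ifelse() drops the POSIXct class, so restore it explicitly
dat$End_of_Experiment <- as.POSIXct(end_num, origin = "1970-01-01")
```

Note the last step: ifelse() returns a plain numeric vector, so the POSIXct
class must be reattached.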

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adding Time when Blanks

2012-10-12 Thread Charles Determan Jr
This works perfectly, thank you very much Rui!

On Fri, Oct 12, 2012 at 1:35 PM, Rui Barradas  wrote:

> Hello,
>
> Try the following.
>
>
> dat <- read.table(text="
>
>  ID group Start_date Time_of_experiment Time_of_end
> 1    20209 4   02/02/2009   12:38:00
> 26   30209 3   03/02/2009   12:40:00  13:32:00
> 27   31609 4   03/16/2009   11:28:00  12:26:00
> 28   40609 4   04/06/2009   11:17:00
> 53   42709 4   04/27/2009   11:15:00   9:30:00
> 76   51109 3   05/11/2009   11:51:00
> 101  51809 1  05/18/2009   12:28:00
> 126  62209 3  06/22/2009   11:31:00
> 150  71309 4  07/13/2009   12:12:00  13:37:00
> 173  81009 4  08/10/2009   11:32:00  20:52:00
> ", header=TRUE, fill=TRUE)
> str(dat)
>
> dat$start_time <- with(dat, paste(Start_date, Time_of_experiment))
> dat$Start_of_Experiment <-
> as.POSIXct(strptime(dat$start_time, "%m/%d/%Y %H:%M:%S"))
>
> #--- Create End_of_Experiment
> idx <- dat$Time_of_end != ''
> dat$End_of_Experiment <- dat$Start_of_Experiment + 48*60*60
> dat$End_of_Experiment[idx] <-
> as.POSIXct(strptime(paste(dat$Start_date, dat$Time_of_end)[idx],
> "%m/%d/%Y %H:%M:%S"))
> dat
>
>
> Hope this helps,
>
> Rui Barradas
> Em 12-10-2012 18:59, Charles Determan Jr escreveu:
>
>> Greetings,
>>
>> My data set has dates and times that I am working with.  Some of the times
>> in Time_of_end are blank.  This is supposed to dictate that the particular
>> experiment lasted 48 hours.  I would like to add 48 hours to the start
>> Start_of_Experiment for another column as End_of_Experiment including both
>> the ones with 48 added and those with early times.  I was thinking
>> something with a conditional statement but I can't seem to figure out what
>> code to use.  Any insight would be appreciated.  Let me know if there is
>> anything else you need.  Thanks for your time.
>>
>> I have Start_of_Experiment in POSIX format for time calculations from the
>> following:
>>
>> data$Start_of_Experiment=as.POSIXct(strptime(data$start_time,
>> "%m/%d/%Y %H:%M:%S"))
>>
>> Here is a subset of my data.
>>
>>  ID group Start_date Time_of_experiment Time_of_end
>> 1    20209 4   02/02/2009   12:38:00
>> 26   30209 3   03/02/2009   12:40:00  13:32:00
>> 27   31609 4   03/16/2009   11:28:00  12:26:00
>> 28   40609 4   04/06/2009   11:17:00
>> 53   42709 4   04/27/2009   11:15:00   9:30:00
>> 76   51109 3   05/11/2009   11:51:00
>> 101  51809 1  05/18/2009   12:28:00
>> 126  62209 3  06/22/2009   11:31:00
>> 150  71309 4  07/13/2009   12:12:00  13:37:00
>> 173  81009 4  08/10/2009   11:32:00  20:52:00
>>  Start_of_Experiment
>> 1   2009-02-02 12:38:00
>> 26  2009-03-02 12:40:00
>> 27  2009-03-16 11:28:00
>> 28  2009-04-06 11:17:00
>> 53  2009-04-27 11:15:00
>> 76  2009-05-11 11:51:00
>> 101 2009-05-18 12:28:00
>> 126 2009-06-22 11:31:00
>> 150 2009-07-13 12:12:00
>> 173 2009-08-10 11:32:00
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] loop of quartile groups

2012-10-17 Thread Charles Determan Jr
Greetings R users,

My goal is to generate quartile groups of each variable in my data set.  I
would like each experiment to have its designated group added as a
subsequent column.  I can accomplish this individually with the following
code:

brks <- with(data_variables,

 cut2(var2, g=4))

#I don't want the actual numbers, I need a numbered group

data$test1=factor(brks, labels=1:4)


However, I cannot get a loop to work nor can I get a loop to add the
columns with an appropriate name (ex. quartile_variable).  I have tried
multiple different ways but can't seem to get it to work.  I think it would
begin something like this:


for(i in 11:ncol(survival_data_variables)){
brks=as.data.frame(with(survival_data_variables,
cut2(survival_data_variables[,i], g=4)))


Any assistance would be sincerely appreciated.  I would like the final data
set to have the following layout:


IDvar1   var2var3 var4   quartile var1
quartile var2quartile var3  quartile var4


Here is a subset of my data to work with:

structure(list(ID = c(2L, 11811L, 12412L, 12510L, 13111L,

20209L, 20612L, 20711L, 21510L, 22012L), var1 = c(106, 107,

116, 67, 76, 146, 89, 62, 65, 116), var2 = c(0, 0, 201,

558, 526, 555, 576, 0, 531, 649), var3 = c(70.67, 81.33,

93.67, 84.33, 52, 74, 114, 101, 80.33, 91.33), var4 = c(136,

139, 142, 138, 140, 140, 136, 139, 140, 139)), .Names = c("ID",

"var1", "var2", "var3", "var4"), row.names = c(NA,

10L), class = "data.frame")
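
A working version of the loop this post is aiming at can be sketched with
base R alone (a sketch: quantile()/cut() stand in for Hmisc::cut2, and
duplicated breaks are collapsed with unique(), which yields fewer than four
groups when a variable has too few distinct values):

```r
# Hedged sketch: add a quartile-group column for each variable.
dat <- data.frame(ID   = 1:10,
                  var1 = c(106, 107, 116, 67, 76, 146, 89, 62, 65, 116),
                  var2 = c(0, 0, 201, 558, 526, 555, 576, 0, 531, 649))

for (v in c("var1", "var2")) {
  # quartile breaks; unique() guards against tied quantiles
  brks <- unique(quantile(dat[[v]], probs = seq(0, 1, 0.25), na.rm = TRUE))
  # labels = FALSE gives the numbered group rather than the interval
  dat[[paste0("quartile_", v)]] <-
    cut(dat[[v]], breaks = brks, include.lowest = TRUE, labels = FALSE)
}
head(dat)
```

Looping over column names (rather than positions) also makes it
straightforward to build the new column names, e.g. quartile_var1.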


Regards,
Charles

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] cut2 error

2012-10-17 Thread Charles Determan Jr
To R users,

I am trying to use the cut2 function from the 'Hmisc' package.  However, when I
try to run the function on the following variable, I get a warning message
(displayed below).  I suspect it is because of the NA, but I have no idea
how to address it.  Many thanks for any insights.

structure(list(var1 = c(97, 97, 98, 98, 97, 99, 97,
98, 99, 98, 99, 98, 98, 97, 97, 98, 99, 98, 96, 98, 98, 99, 98,
98, 99, 99, 98, 99, 98, 99, 99, 99, 99, 98, 99, 96, 99, 98, 98,
99, 97, 98, 99, 99, 97, 99, 99, 98, 98, 98, 99, NA, 99, 98, 98,
98, 98, 98, 98, 98, 99, 99, 98, 99, 99, 98, 98, 99, 99, 97, 98,
98, 98, 99, 98, 98, 98, 99, 98, 98)), .Names = "var1", row.names = c(NA,
80L), class = "data.frame")

cut2(dat[,1], g=4)

Warning message:
In min(xx[xx > upper]) : no non-missing arguments to min; returning Inf

Regards,
Charles

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cut2 error

2012-10-17 Thread Charles Determan Jr
Hi A.K.

I tried your code exactly as you presented it, but I only get the following
output.  Did I somehow miss something?  Was there something else loaded?
Thanks,

 [1] [96, 99) [96, 99) [96, 99) [96, 99) [96, 99) 99   [96, 99) [96, 99)
 [9] 99   [96, 99) 99   [96, 99) [96, 99) [96, 99) [96, 99) [96, 99)
[17] 99   [96, 99) [96, 99) [96, 99) [96, 99) 99   [96, 99) [96, 99)
[25] 99   99   [96, 99) 99   [96, 99) 99   99   99
[33] 99   [96, 99) 99   [96, 99) 99   [96, 99) [96, 99) 99
[41] [96, 99) [96, 99) 99   99   [96, 99) 99   99   [96, 99)
[49] [96, 99) [96, 99) 99   99   [96, 99) [96, 99) [96, 99) [96, 99)
[57] [96, 99) [96, 99) [96, 99) 99   99   [96, 99) 99   99
[65] [96, 99) [96, 99) 99   99   [96, 99) [96, 99) [96, 99) [96, 99)
[73] 99   [96, 99) [96, 99) [96, 99) 99   [96, 99) [96, 99)
Levels: [96, 99) 99
Warning message:
In min(xx[xx > upper]) : no non-missing arguments to min; returning Inf


On Wed, Oct 17, 2012 at 3:30 PM, arun  wrote:

> Hi,
> Try this:
> dat1<-structure(list(var1 = c(97, 97, 98, 98, 97, 99, 97,
> 98, 99, 98, 99, 98, 98, 97, 97, 98, 99, 98, 96, 98, 98, 99, 98,
> 98, 99, 99, 98, 99, 98, 99, 99, 99, 99, 98, 99, 96, 99, 98, 98,
> 99, 97, 98, 99, 99, 97, 99, 99, 98, 98, 98, 99, NA, 99, 98, 98,
> 98, 98, 98, 98, 98, 99, 99, 98, 99, 99, 98, 98, 99, 99, 97, 98,
> 98, 98, 99, 98, 98, 98, 99, 98, 98)), .Names = "var1", row.names = c(NA,
> 80L), class = "data.frame")
> unique(dat1[,1])
> #[1] 97 98 99 96 NA
> dat2<-dat1[!is.na(dat1)]
> cut2(dat2,g=4)
> # [1] (96,97]   (96,97]   (97,98]   (97,98]   (96,97]   (98,99]   (96,97]
>  #[8] (97,98]   (98,99]   (97,98]   (98,99]   (97,98]   (97,98]   (96,97]
> #[15] (96,97]   (97,98]   (98,99]   (97,98]   (-Inf,96] (97,98]   (97,98]
> #[22] (98,99]   (97,98]   (97,98]   (98,99]   (98,99]   (97,98]   (98,99]
> #[29] (97,98]   (98,99]   (98,99]   (98,99]   (98,99]   (97,98]   (98,99]
> #[36] (-Inf,96] (98,99]   (97,98]   (97,98]   (98,99]   (96,97]   (97,98]
> #[43] (98,99]   (98,99]   (96,97]   (98,99]   (98,99]   (97,98]   (97,98]
> #[50] (97,98]   (98,99]   (98,99]   (97,98]   (97,98]   (97,98]   (97,98]
> #[57] (97,98]   (97,98]   (97,98]   (98,99]   (98,99]   (97,98]   (98,99]
> #[64] (98,99]   (97,98]   (97,98]   (98,99]   (98,99]   (96,97]   (97,98]
> #[71] (97,98]   (97,98]   (98,99]   (97,98]   (97,98]   (97,98]   (98,99]
> #[78] (97,98]   (97,98]
> #Levels: (-Inf,96] (96,97] (97,98] (98,99]
> A.K.
>
>
>
>
> - Original Message -
> From: Charles Determan Jr 
> To: r-help@r-project.org
> Cc:
> Sent: Wednesday, October 17, 2012 3:42 PM
> Subject: [R] cut2 error
>
> To R users,
>
> I am trying to use cut2 function from the 'Hmisc' library.  However, when I
> try and run the function on the following variable, I get an error message
> (displayed below).  I suspect it is because of the NA but I have no idea
> how to address the error.  Many thanks to any insights.
>
> structure(list(var1 = c(97, 97, 98, 98, 97, 99, 97,
> 98, 99, 98, 99, 98, 98, 97, 97, 98, 99, 98, 96, 98, 98, 99, 98,
> 98, 99, 99, 98, 99, 98, 99, 99, 99, 99, 98, 99, 96, 99, 98, 98,
> 99, 97, 98, 99, 99, 97, 99, 99, 98, 98, 98, 99, NA, 99, 98, 98,
> 98, 98, 98, 98, 98, 99, 99, 98, 99, 99, 98, 98, 99, 99, 97, 98,
> 98, 98, 99, 98, 98, 98, 99, 98, 98)), .Names = "var1", row.names = c(NA,
> 80L), class = "data.frame")
>
> cut2(dat[,1], g=4)
>
> Warning message:
> In min(xx[xx > upper]) : no non-missing arguments to min; returning Inf
>
> Regards,
> Charles
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cut2 error

2012-10-17 Thread Charles Determan Jr
David,

I am pursuing this because it is just one variable analyzed within a loop,
where most variables have the 4 defined quartiles to assign 4 groups.  When I
saw that warning, I wanted to understand it, to see if something was wrong.
That is why I isolated this particular variable, tried it on its own, and still
got the warning.  Is this something to just ignore, adding a component to the
loop noting that there aren't four groups for that variable?
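
One defensive pattern for that situation (a sketch, not from the thread):
check how many distinct quartile breaks a variable actually has before
cutting, and fall back gracefully otherwise.

```r
# Hedged sketch: detect variables whose quartile breaks collapse to too
# few distinct values -- the situation behind the warning discussed here.
x <- c(97, 97, 98, 98, 99, 99, 98, 98, NA)

brks <- unique(quantile(x, probs = seq(0, 1, 0.25), na.rm = TRUE))
if (length(brks) >= 5) {
  # 5 distinct breaks -> 4 genuine quartile groups
  grp <- cut(x, breaks = brks, include.lowest = TRUE, labels = FALSE)
} else {
  grp <- rep(NA_integer_, length(x))  # fewer than 4 real quartile groups
  message("too few distinct values for quartile groups")
}
```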

Regards

On Wed, Oct 17, 2012 at 3:58 PM, David Winsemius wrote:

>
> On Oct 17, 2012, at 1:52 PM, Charles Determan Jr wrote:
>
>  Hi A.K.
>>
>> I tried your code exactly as you presented but I only get the following
>> output, did I somehow miss something?  Was there something else loaded?
>> Thanks,
>>
>> [1] [96, 99) [96, 99) [96, 99) [96, 99) [96, 99) 99   [96, 99) [96,
>> 99)
>> [9] 99   [96, 99) 99   [96, 99) [96, 99) [96, 99) [96, 99) [96,
>> 99)
>> [17] 99   [96, 99) [96, 99) [96, 99) [96, 99) 99   [96, 99) [96,
>> 99)
>> [25] 99   99   [96, 99) 99   [96, 99) 99   99   99
>> [33] 99   [96, 99) 99   [96, 99) 99   [96, 99) [96, 99) 99
>> [41] [96, 99) [96, 99) 99   99   [96, 99) 99   99   [96,
>> 99)
>> [49] [96, 99) [96, 99) 99   99   [96, 99) [96, 99) [96, 99) [96,
>> 99)
>> [57] [96, 99) [96, 99) [96, 99) 99   99   [96, 99) 99   99
>> [65] [96, 99) [96, 99) 99   99   [96, 99) [96, 99) [96, 99) [96,
>> 99)
>> [73] 99   [96, 99) [96, 99) [96, 99) 99   [96, 99) [96, 99)
>> Levels: [96, 99) 99
>> Warning message:
>> In min(xx[xx > upper]) : no non-missing arguments to min; returning Inf
>>
>>  For the quantiles requested with your data you get:
>
> > quantile(dat[1], prob=(1:4)/4, na.rm=TRUE)
>  25%  50%  75% 100%
>   98   98   99   99
>
> So you only have 2 groups in your answer. Why are you persisting in trying
> to do this operation when you only had 4 values to start with?
>
> --
> David.
>
>
>> On Wed, Oct 17, 2012 at 3:30 PM, arun  wrote:
>>
>>  Hi,
>>> Try this:
>>> dat1<-structure(list(var1 = c(97, 97, 98, 98, 97, 99, 97,
>>> 98, 99, 98, 99, 98, 98, 97, 97, 98, 99, 98, 96, 98, 98, 99, 98,
>>> 98, 99, 99, 98, 99, 98, 99, 99, 99, 99, 98, 99, 96, 99, 98, 98,
>>> 99, 97, 98, 99, 99, 97, 99, 99, 98, 98, 98, 99, NA, 99, 98, 98,
>>> 98, 98, 98, 98, 98, 99, 99, 98, 99, 99, 98, 98, 99, 99, 97, 98,
>>> 98, 98, 99, 98, 98, 98, 99, 98, 98)), .Names = "var1", row.names = c(NA,
>>> 80L), class = "data.frame")
>>> unique(dat1[,1])
>>> #[1] 97 98 99 96 NA
>>> dat2<-dat1[!is.na(dat1)]
>>> cut2(dat2,g=4)
>>> # [1] (96,97]   (96,97]   (97,98]   (97,98]   (96,97]   (98,99]   (96,97]
>>> #[8] (97,98]   (98,99]   (97,98]   (98,99]   (97,98]   (97,98]   (96,97]
>>> #[15] (96,97]   (97,98]   (98,99]   (97,98]   (-Inf,96] (97,98]   (97,98]
>>> #[22] (98,99]   (97,98]   (97,98]   (98,99]   (98,99]   (97,98]   (98,99]
>>> #[29] (97,98]   (98,99]   (98,99]   (98,99]   (98,99]   (97,98]   (98,99]
>>> #[36] (-Inf,96] (98,99]   (97,98]   (97,98]   (98,99]   (96,97]   (97,98]
>>> #[43] (98,99]   (98,99]   (96,97]   (98,99]   (98,99]   (97,98]   (97,98]
>>> #[50] (97,98]   (98,99]   (98,99]   (97,98]   (97,98]   (97,98]   (97,98]
>>> #[57] (97,98]   (97,98]   (97,98]   (98,99]   (98,99]   (97,98]   (98,99]
>>> #[64] (98,99]   (97,98]   (97,98]   (98,99]   (98,99]   (96,97]   (97,98]
>>> #[71] (97,98]   (97,98]   (98,99]   (97,98]   (97,98]   (97,98]   (98,99]
>>> #[78] (97,98]   (97,98]
>>> #Levels: (-Inf,96] (96,97] (97,98] (98,99]
>>> A.K.
>>>
>>>
>>>
>>>
>>> - Original Message -
>>> From: Charles Determan Jr 
>>> To: r-help@r-project.org
>>> Cc:
>>> Sent: Wednesday, October 17, 2012 3:42 PM
>>> Subject: [R] cut2 error
>>>
>>> To R users,
>>>
>>> I am trying to use cut2 function from the 'Hmisc' library.  However,
>>> when I
>>> try and run the function on the following variable, I get an error
>>> message
>>> (displayed below).  I suspect it is because of the NA but I have no idea
>>> how to address the error.  Many thanks to any insights.
>>>
>>> structure(list(var1 = c(97, 97, 98, 98, 97, 99, 97,
>>> 98, 99, 98, 99, 98, 98, 97, 97, 98, 99, 98, 96, 98, 98, 99, 98,
>>> 98, 99, 99, 98, 99, 98, 99, 99, 99, 99, 98, 99, 96, 99, 98, 98,
>>> 99, 97, 98, 99, 99, 97, 99, 99, 98, 98

[R] looping survdiff?

2012-10-18 Thread Charles Determan Jr
Hello,

I am trying to set up a loop that can run the survdiff function with the
ultimate goal to generate a csv file with the p-values reported.  However,
whenever I try a loop I get an error such as "invalid type (list) for
variable 'survival_data_variables[i]".

This is a subset of my data:

structure(list(time = c(1.516667, 72, 72, 25.78333,
72, 72, 72, 72, 72, 72, 1.18, 0.883,
1.15, 0.867, 72, 1.03, 72, 1.05, 72,
22.76667), group = c(2L, 1L, 3L, 3L, 3L, 4L, 4L,
1L, 3L, 3L, 3L, 3L, 4L, 3L, 3L, 4L, 3L, 4L, 3L, 4L), completion =
structure(c(2L,
1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L,
2L, 1L, 2L), .Label = c("1", "2"), class = "factor"), var1 =
structure(c(2L,
2L, 3L, 1L, 1L, 3L, 1L, 1L, 1L, 3L, 2L, 2L, 4L, 3L, 2L, 4L, 2L,
4L, 2L, 4L), .Label = c("1", "2", "3", "4"), class = "factor"),
var2 = structure(c(3L, 3L, 1L, 1L, 2L, 4L, 3L,
3L, 2L, 4L, 2L, 1L, 2L, 1L, 2L, 2L, 4L, 4L, 2L, 3L), .Label = c("1",
"2", "3", "4"), class = "factor"), var3 = structure(c(4L,
2L, 3L, 1L, 3L, 4L, 4L, 2L, 2L, 4L, 2L, 2L, 1L, 2L, 2L, 2L,
1L, 3L, 4L, 1L), .Label = c("1", "2", "3", "4"), class = "factor")),
.Names = c("time",
"group", "completion", "var1", "var2",
"var3"), row.names = c(NA, 20L), class = "data.frame")


The loop I have been trying for just group 3 is:

d=data.frame()
for(i in 4:6){
a=assign(paste("p-value",i,sep=""),
survdiff(Surv(time, completion=="2")~dat[i],
data=dat[group=="3",],
rho=0))
b=as.matrix(a$chisq)
d=rbind(d, b)
write.csv(d, file="C:/.../junk.csv", quote=FALSE)}

Perhaps I am making this more difficult than it needs to be.  Thanks for
any help,

Charles

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Looping survdiff

2012-10-19 Thread Charles Determan Jr
Thank you for all your responses; I assure you this is not homework.  I am
a graduate student and my classes are complete.  I am trying multiple
different ways to analyze data, and my lab requests different types of
scripts to accomplish various tasks.  I am the most computer-savvy person in
the lab, so it comes to me.  I am continually trying to learn more about using
R, and I truly value all the suggestions.  Again, thank you for your
assistance,
Regards,
Charles

On Fri, Oct 19, 2012 at 8:30 AM, Terry Therneau  wrote:

> The number of recent questions from umn.edu makes me wonder if there's
> homework involved
>
> Simpler for your example is to use get and subset.
> dat <- structure(...)   # the dput() structure as found below
> var.to.test <- names(dat)[4:6]   #variables of interest
> nvar <- length(var.to.test)
> chisq <- double(nvar)
> for (i in 1:nvar) {
> tfit <- survdiff(Surv(time, completion==2) ~ get(var.to.test[i]),
> data=dat, subset=(group==3))
> chisq[i] <- tfit$chisq
> }
> write.csv(data.frame(var.to.test, chisq))
>
> On 10/19/2012 05:00 AM, r-help-requ...@r-project.org wrote:
>
>> Hello,
>>
>> I am trying to set up a loop that can run the survdiff function with the
>> ultimate goal to generate a csv file with the p-values reported.  However,
>> whenever I try a loop I get an error such as "invalid type (list) for
>> variable 'survival_data_variables[i]".
>>
>> This is a subset of my data:
>>
>> structure(list(time = c(1.516667, 72, 72, 25.78333,
>> 72, 72, 72, 72, 72, 72, 1.18, 0.883,
>> 1.15, 0.867, 72, 1.03, 72, 1.05, 72,
>> 22.76667), group = c(2L, 1L, 3L, 3L, 3L, 4L, 4L,
>> 1L, 3L, 3L, 3L, 3L, 4L, 3L, 3L, 4L, 3L, 4L, 3L, 4L), completion =
>> structure(c(2L,
>> 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L,
>> 2L, 1L, 2L), .Label = c("1", "2"), class = "factor"), var1 =
>> structure(c(2L,
>> 2L, 3L, 1L, 1L, 3L, 1L, 1L, 1L, 3L, 2L, 2L, 4L, 3L, 2L, 4L, 2L,
>> 4L, 2L, 4L), .Label = c("1", "2", "3", "4"), class = "factor"),
>>  var2 = structure(c(3L, 3L, 1L, 1L, 2L, 4L, 3L,
>>  3L, 2L, 4L, 2L, 1L, 2L, 1L, 2L, 2L, 4L, 4L, 2L, 3L), .Label = c("1",
>>  "2", "3", "4"), class = "factor"), var3 = structure(c(4L,
>>  2L, 3L, 1L, 3L, 4L, 4L, 2L, 2L, 4L, 2L, 2L, 1L, 2L, 2L, 2L,
>>  1L, 3L, 4L, 1L), .Label = c("1", "2", "3", "4"), class = "factor")),
>> .Names = c("time",
>> "group", "completion", "var1", "var2",
>> "var3"), row.names = c(NA, 20L), class = "data.frame")
>>
>>
>> The loop I have been trying for just group 3 is:
>>
>> d=data.frame()
>> for(i in 4:6){
>>  a=assign(paste("p-value",i,sep=""),
>>  survdiff(Surv(time, completion=="2")~dat[i],
>>  data=dat[group=="3",],
>>  rho=0))
>>  b=as.matrix(a$chisq)
>>  d=rbind(d, b)
>> write.csv(d, file="C:/.../junk.csv", quote=FALSE)}
>>
>> Perhaps I am making this more difficult than it needs to be.  Thanks for
>> any help,
>>
>> Charles
>>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Kaplan Meier Post Hoc?

2012-10-24 Thread Charles Determan Jr
This is more of a general question, without data.  After running survdiff,
from the 'survival' package, on strata comprising four groups (so 4 curves
on a Kaplan-Meier plot), you get a chi-squared p-value telling you whether to
reject the null hypothesis.  Is there a method to follow up with pairwise
testing on the respective groups?  I have searched the package but have
come up with nothing.  Perhaps I am mistaken in something here.
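
One common workaround (a sketch, not a built-in method of the survival
package): run survdiff() on each pair of groups and adjust the p-values for
multiplicity.  The data frame and column names here are illustrative.

```r
# Hedged sketch: pairwise log-rank tests between every pair of groups.
library(survival)

set.seed(1)
dat <- data.frame(time   = rexp(40),
                  status = rbinom(40, 1, 0.7),
                  group  = rep(1:4, each = 10))

pairs <- combn(unique(dat$group), 2)            # all 6 group pairs
pvals <- apply(pairs, 2, function(g) {
  fit <- survdiff(Surv(time, status) ~ group, data = dat,
                  subset = group %in% g)
  pchisq(fit$chisq, df = 1, lower.tail = FALSE) # 2 groups -> 1 df
})
p.adjust(pvals, method = "holm")                # adjust for 6 comparisons
```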

Regards,
Charles

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Merge matrices with different column names

2012-10-25 Thread Charles Determan Jr
A general question that I have been pursuing for some time but have set
aside.  When finishing some analysis, I can have multiple matrices that
have specific column names.  Ideally, I would like to combine these
separate matrices for a final output as a csv file.

A generic example:

Matrix 1
var1A  var1B  var1C
x  x   x
x  x   x

Matrix 2
var2A  var2B  var2C
x  x   x
x  x   x

I would like a final exportable matrix or dataframe or whichever format is
most workable.

Matrix 3
var1A  var1B  var1C
x  x   x
x  x   x

var2A  var2B  var2C
x  x   x
x  x   x

However, no matter which function I try, it reports an error that the column
names are not the same.

Any insights would be appreciated.
Thanks as always,
Charles

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Merge matrices with different column names

2012-10-26 Thread Charles Determan Jr
Dennis,

This works well and is exactly what I wanted for these matrices.  Thank you
very much.  However, when I try to export the resulting list, it just gets
bound by columns.  That isn't so bad for 2 matrices, but I hope to
apply this where there are many matrices, so scrolling down a file such as
a 'csv' would be desirable.  Any thoughts on exporting a list of matrices?
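
One way to get that vertical layout (a sketch, not from the thread): write
each matrix in the list to the same csv with append = TRUE, separated by a
blank line.  The file name below is hypothetical.

```r
# Hedged sketch: stack a list of matrices vertically in one csv file,
# each block keeping its own header row.
m1 <- matrix(1:4, nrow = 2, dimnames = list(NULL, c("var1A", "var1B")))
m2 <- matrix(5:8, nrow = 2, dimnames = list(NULL, c("var2A", "var2B")))
L  <- list(m1, m2)

out <- "matrices.csv"   # hypothetical output path
file.create(out)        # start from an empty file
for (m in L) {
  # write.table() warns when appending column names; expected here
  suppressWarnings(
    write.table(m, out, sep = ",", row.names = FALSE, col.names = TRUE,
                append = TRUE))
  cat("\n", file = out, append = TRUE)  # blank line between matrices
}
```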

Thanks,
Charles

On Fri, Oct 26, 2012 at 12:00 AM, Dennis Murphy  wrote:

> Hi:
>
> You can't represent the same columns with different names in a matrix (or
> data frame, for that matter). If you want to preserve variable names in the
> various matrices and collect them into one R object, I'd suggest creating a
> list, each component of which is a matrix.
>
> Toy example:
>
> m1 <- matrix(1:9, nrow = 3, dimnames = list(NULL, paste0("var", 1:3)))
> m2 <- matrix(rpois(9, 5), nrow = 3, dimnames = list(NULL, paste0("var",
> 4:6)))
> L <- list(m1, m2)
> names(L) <- paste0("matrix", 1:2)
> L
>
> Dennis
>
> On Thu, Oct 25, 2012 at 8:51 PM, Charles Determan Jr wrote:
>
>> A general question that I have been pursuing for some time but have set
>> aside.  When finishing some analysis, I can have multiple matrices that
>> have specific column names.  Ideally, I would like to combine these
>> separate matrices for a final output as a csv file.
>>
>> A generic example:
>>
>> Matrix 1
>> var1A  var1B  var1C
>> x  x   x
>> x  x   x
>>
>> Matrix 2
>> var2A  var2B  var2C
>> x  x   x
>> x  x   x
>>
>> I would like a final exportable matrix or dataframe or whichever format is
>> most workable.
>>
>> Matrix 3
>> var1A  var1B  var1C
>> x  x   x
>> x  x   x
>>
>> var2A  var2B  var2C
>> x  x   x
>> x  x   x
>>
>> However, no matter which function I try reports an error that the column
>> names are not the same.
>>
>> Any insights would be appreciated.
>> Thanks as always,
>> Charles
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] simulation data

2014-04-04 Thread Charles Determan Jr
Hi Thanoon,

How about this?
# replicate p=10 times random sampling n=1000 from a vector containing your
ordinal categories (1,2,3,4)
R <- replicate(10, sample(as.vector(seq(4)), 1000, replace = T))

Cheers,
Charles



On Fri, Apr 4, 2014 at 7:10 AM, thanoon younis
wrote:

> dear sir
> i want to simulate multivariate ordinal data matrix with categories (1,4)
> and n=1000 and p=10.
> thanks alot
>
> thanoon
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] simulation data

2014-04-05 Thread Charles Determan Jr
Thanoon,

Firstly, please remember to reply to the R-help list as well, so that others
may benefit from your questions too.

Regarding your second request, I have written the following as a very naive
way of inducing correlations.  Hopefully this makes it perfectly clear what
you change for different sample sizes.

ords <- seq(4)
p <- 10
N <- 1000
percent_change <- 0.9

R <- as.data.frame(replicate(p, sample(ords, N, replace = T)))

# spearman is more appropriate for ordinal data
cor(R, method = "spearman")

# subset variable to have a stronger correlation
v1 <- R[,1, drop = FALSE]

# randomly choose which rows to retain
keep <- sample(seq_len(nrow(v1)), size = percent_change * nrow(v1))
change <- seq_len(nrow(v1))[-keep]

# randomly choose new values, one per changed row
new.change <- sample(ords, length(change), replace = TRUE)

# replace values in copy of original column
v1.samp <- v1
v1.samp[change,] <- new.change

# closer correlation
cor(v1, v1.samp, method = "spearman")

# set correlated column as one of your other columns
R[,2] <- v1.samp

This obviously only creates a correlation between two columns.  You need to
decide what you expect from this synthetic dataset.  Do you want perfect
correlations?  Does it matter which variables are correlated?  How many
variables will be correlated?  Are there correlations between multiple
variables?  Do you want negative correlations (hint: opposite values)?

All of these questions would be great exercises for you to improve your R.
You can also turn the above code into a function and have it randomly
select two columns to be correlated if that works for you.  Because of all
of these possibilities I cannot provide the 'right' code but rather guide
you towards something more useful.

Cheers,
Charles




On Fri, Apr 4, 2014 at 8:37 PM, thanoon younis
wrote:

> thanks alot for your help
> now i want two different sample size in R what should i  change in
> previous command? and how can i get correlated simulation data (there are
> an interrelationships between variables)
>
> regards
> thanoon
>
>
> On 4 April 2014 18:42, Charles Determan Jr  wrote:
>
>> Hi Thanoon,
>>
>> How about this?
>> # replicate p=10 times random sampling n=1000 from a vector containing
>> your ordinal categories (1,2,3,4)
>> R <- replicate(10, sample(as.vector(seq(4)), 1000, replace = T))
>>
>> Cheers,
>> Charles
>>
>>
>>
>> On Fri, Apr 4, 2014 at 7:10 AM, thanoon younis <
>> thanoon.youni...@gmail.com> wrote:
>>
>>> dear sir
>>> i want to simulate multivariate ordinal data matrix with categories (1,4)
>>> and n=1000 and p=10.
>>> thanks alot
>>>
>>> thanoon
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> ______
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>> --
>> Charles Determan
>> Integrated Biosciences PhD Candidate
>> University of Minnesota
>>
>
>


-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] simulation data

2014-04-10 Thread Charles Determan Jr
Thanoon,

My reply to your previous post should be more than enough for you to
accomplish your goal.  Please look over that script again:

ords <- seq(4)
p <- 10
N <- 1000
percent_change <- 0.9

R <- as.data.frame(replicate(p, sample(ords, N, replace = T)))

or alternatively, as Mr. Barradas suggests, with rbinom(); I leave the
options for you to figure out.  Look at the help page, and feel free to
experiment with different numbers and look at the output.  It is important
that you learn how to explore new functions you are unfamiliar with.
R <- as.data.frame(replicate(p, rbinom(n=#, size=#, p=#)))

These lists are meant to help people with their code, not to do the work
for them.  Given your prior questions to me as well, I strongly suggest you
explore some R tutorials.  There are dozens online that should help you
with the basics and understand the above code more clearly.  Also,
regarding your prior question about tetrachoric correlations: you got an
error previously because it is not a standard correlation within the corr()
function.  You can get further information about a function by checking the
help pages (?corr).  You will need to find a package that provides such a
function, or write the function yourself.  This may sound daunting, but if
you take some time to learn how to write functions you should not have too
much of a problem; the method for the tetrachoric correlation isn't that
complex.

Summing up:
1. Find some R tutorials to get the basics down.
2. Try to understand the above code for your problem.
3. Find a suitable R package for your specific correlation needs,
4. or learn to write functions and implement the tetrachoric correlation
yourself.
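For step 3, one package that provides such a function is psych (an assumption on my part; any package implementing tetrachoric correlations would do). A sketch on simulated dichotomous data:

```r
# assumes the psych package is installed; tetrachoric() expects 0/1 data
library(psych)

set.seed(42)
x <- as.data.frame(replicate(10, rbinom(1000, size = 1, prob = 0.5)))
tc <- tetrachoric(x)
tc$rho  # 10 x 10 matrix of tetrachoric correlations
```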


Regards,
Charles Determan


On Wed, Apr 9, 2014 at 8:28 PM, thanoon younis
wrote:

> hi
>
> i want to simulate multivariate dichotomous data matrix with categories
> (0,1) and n=1000 and p=10.
>
> thanks alot in advance
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help

2014-04-15 Thread Charles Determan Jr
Kafi,

I'm not sure why you contacted me directly, so I have also forwarded this
to the r-help list.  I am unsure as to what your problem is.  At first
glance, I noticed you are missing a parenthesis in the WL3 line near the
end, but that is just from a quick scan of your code.  Please be more
specific about what your problem actually is.  'Bad result' is very vague
and isn't conducive to being helped in any form.

Regards,
Charles


On Tue, Apr 15, 2014 at 3:18 AM, kafi dano  wrote:

> Dear Sir.
> I need your help to correct the attached R-code.
> when I apply this code give me the bad result
>
> Attached the program by using R
> Thank you
>
> Kafi Dano Pati
> Ph.D candidate ( mathematics/statistics)
> Department of mathematical Science/ faculty of Science
> University Technology Malaysia
> 81310 UTM, Johor Bahru, Johor, Malaysia
> IC. NO. 201202F10234
> Matric No. PS113113
> HP. No.  00601117517559
> E-mail: kafi_d...@yahoo.com
> supervisor- Assoc. Prof. Robiah Binti Adnan
>
>


-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] conditional probability removal

2014-04-17 Thread Charles Determan Jr
Greetings,

I would like to randomly remove elements from a numeric vector but with
different probabilities for higher numbers.

For example:

dat <- sample(seq(10), 100, replace=T)

# now I would like to randomly remove elements, but with a higher chance
of removing elements >= 5 and an even greater chance for elements >= 8.

I am not sure whether there is a way to define such conditional
probabilities.  Any insight would be appreciated.
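A sketch of the kind of thing I mean, with arbitrary placeholder removal probabilities (0.1 / 0.5 / 0.8) for the three value ranges:

```r
set.seed(1)
dat <- sample(seq(10), 100, replace = TRUE)

# assumed removal probabilities: 0.1 below 5, 0.5 for 5-7, 0.8 for 8+
drop_prob <- ifelse(dat >= 8, 0.8, ifelse(dat >= 5, 0.5, 0.1))
# draw one uniform per element; keep it if the draw exceeds its drop chance
keep <- runif(length(dat)) > drop_prob
result <- dat[keep]
```

Higher values are therefore removed more often on average, while low values are mostly retained.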

Regards,
Charles

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] error in R program

2014-06-05 Thread Charles Determan Jr
> # measurement equations
>
>
> Y1[i,1]=Xi1[i,1]+eps1[1]
> Y1[i,2]=0.8*Xi1[i,1]+eps1[2]
> Y1[i,3]=0.8*Xi1[i,1]+eps1[3]
> Y1[i,4]=0.8*Xi1[i,1]+eps1[4]
> Y1[i,5]=Xi1[i,2]+eps1[5]
> Y1[i,6]=0.8*Xi1[i,2]+eps1[6]
> Y1[i,7]=Xi1[i,3]+eps1[7]
> Y1[i,8]=Xi1[i,2]+eps1[8]
> Y1[i,9]=Eta1[i]+eps1[9]
> Y1[i,10]=0.8*Eta1[i]+eps1[10]
> }
>
> #transform theta to ordinal variables
> for (j in 1:10) { if (Y1[j]>0) yo1[i,j]<-1 else yo1[i,j]<-0 }
>
>
> #Input data set for WinBUGS
> data<-list(N1=200,N2=200,P=10,R=Ro,z=yo1)
>
> #Call WinBUGS
> model<-bugs(data,inits,parameters,model.file="D:/Run/model.txt",
> n.chains=2,n.iter=5000,n.burnin=1,n.thin=1,DIC=TRUE,
> bugs.directory="c:/Program Files/WinBUGS14/",
> working.directory="D:/Run/")
>
> #Save Bayesian Estimates
> Eu1[t,]<-model$mean$uby; Elam1[t,]<-model$mean$lam;
> Egam1[t,]<-model$mean$gam
> Ephx1[t,1]<-model$mean$phx1[1,1]; Ephx1[t,2]<-model$mean$phx1[1,2]
> Ephx1[t,3]<-model$mean$phx1[2,2];
> Esgd1[t]<-model$mean$sgd
>
> #Save Standard Errors
> SEu1[t,]<-model$sd$uby1; SElam1[t,]<-model$sd$lam;
> SEgam1[t,]<-model$sd$gam1
> SEphx1[t,1]<-model$sd$phx1[1,1]; SEphx1[t,2]<-model$sd$phx1[1,2]
> SEphx1[t,3]<-model$sd$phx1[2,2]; SEb1[t]<-model$sd$ubeta1
> SEsgd1[t]<-model$sd$sgd
>
> #Save HPD intervals
> for (i in 1:10) {
> temp=model$sims.array[,1,i];
> uby[t,i,]=boa.hpd(temp,0.05)
> }
> temp=model$sims.array[,1,10]; ubeta[t,]=boa.hpd(temp,0.05)
> for (i in 1:6) {
> temp=model$sims.array[,1,10+i];
> lam[t,i,]=boa.hpd(temp,0.05)
> }
> for (i in 1:3) {
> temp=model$sims.array[,1,16+i];
> gam[t,i,]=boa.hpd(temp,0.05)
> }
> temp=model$sims.array[,1,20]; sgd[t,]=boa.hpd(temp,0.05)
> temp=model$sims.array[,1,21]; phx[t,1,]=boa.hpd(temp,0.05)
> temp=model$sims.array[,1,22]; phx[t,2,]=boa.hpd(temp,0.05)
> temp=model$sims.array[,1,24]; phx[t,3,]=boa.hpd(temp,0.05)
>
> #Save DIC value
> DIC[t]=model$DIC
> }   #end
>
>
>


-- 
Dr. Charles Determan, PhD
Integrated Biosciences
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fwd: error in R program

2014-06-05 Thread Charles Determan Jr
-- Forwarded message --
From: thanoon younis 
Date: Thursday, June 5, 2014
Subject: error in R program
To: Charles Determan Jr 


many thanks to you Dr. Charles
Really i have a problem with simulation data in xi, and now i have this
error: "Error in mvrnorm(1, c(0, 0, 0), phi1, tol = 1e-06, empirical =
FALSE,  :   incompatible arguments"
Regards



On 5 June 2014 16:45, Charles Determan Jr  wrote:

Hello again Thanoon,

Once again, you should send these requests not to me but to the r-help
list.  You are far more likely to get help from the greater R community
than just me.  Furthermore, it is not entirely clear where your error is.
It is courteous to provide only the code that runs up to the error, with a
comment marking its location.  I see you have a loop, so it is all the more
important to isolate where the error is taking place.

My recommendations:

1. Set both t and i = 1 and try to run your loop code manually to locate
the specific function providing the error.
2. Check your data inputs for the function.  The error tells you that some
data is missing or infinite.  Perhaps a slight error in your data
generation?
3. See if you can address the specific instance before running the full
loops, it may be multiple instances or just one.
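To illustrate point 2 with the function from the error message (a toy example, not the poster's actual data): mvrnorm needs a fully specified, finite covariance matrix, and a matrix still holding its initial NA values reproduces exactly the "infinite or missing values" failure from eigen().

```r
library(MASS)

phi_ok <- diag(3)                                  # a valid 3 x 3 covariance matrix
xi <- mvrnorm(1, mu = rep(0, 3), Sigma = phi_ok)   # works: one draw of length 3

phi_na <- matrix(NA, 3, 3)                         # as phi1 is initialised in the posted code
res <- try(mvrnorm(1, rep(0, 3), phi_na), silent = TRUE)
inherits(res, "try-error")                         # TRUE: NAs trigger the eigen() error
```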

I hope this provides some guidance,

Charles


On Thu, Jun 5, 2014 at 3:10 AM, thanoon younis 
wrote:

Dear Dr. Charles
i need your help to correct the WinBUGS code below to estimate parameters
of SEM by using Bayesian inference for two groups. I have this error: "Error
in eigen(Sigma, symmetric = TRUE, EISPACK = EISPACK) : infinite or missing
values in 'x'".
many thanks in advance
R-CODE

library(MASS)  #Load the MASS package
library(R2WinBUGS) #Load the R2WinBUGS package
library(boa)   #Load the boa package
library(coda)  #Load the coda package
N1<-200;N2=100; P1<-10;P2<-10
phi1<-matrix(NA,nrow=4,ncol=4) #The covariance matrix of xi1
Ro1<-matrix(NA,nrow=4,ncol=4)
yo1<-matrix(data=NA,nrow=200,ncol=10)

phi2<-matrix(NA,nrow=4,ncol=4) #The covariance matrix of xi2
Ro2<-matrix(NA,nrow=4,ncol=4)
yo2<-matrix(data=NA,nrow=200,ncol=10)
#Matrices save the Bayesian Estimates and Standard Errors
Eu1<-matrix(data=NA,nrow=100,ncol=10)
SEu<-matrix(data=NA,nrow=100,ncol=10)
Elam1<-matrix(data=NA,nrow=100,ncol=6)
SElam1<-matrix(data=NA,nrow=100,ncol=6)
Egam1<-matrix(data=NA,nrow=100,ncol=4)
SEgam1<-matrix(data=NA,nrow=100,ncol=4)
Ephx1<-matrix(data=NA,nrow=100,ncol=4)
SEphx1<-matrix(data=NA,nrow=100,ncol=4)
Eb1<-numeric(100); SEb<-numeric(100)
Esgd1<-numeric(100); SEsgd<-numeric(100)
Eu2<-matrix(data=NA,nrow=100,ncol=10)
SEu2<-matrix(data=NA,nrow=100,ncol=10)
Elam2<-matrix(data=NA,nrow=100,ncol=6)
SElam2<-matrix(data=NA,nrow=100,ncol=6)
Egam2<-matrix(data=NA,nrow=100,ncol=3)
SEgam2<-matrix(data=NA,nrow=100,ncol=3)
Ephx2<-matrix(data=NA,nrow=100,ncol=3)
SEphx2<-matrix(data=NA,nrow=100,ncol=3)
Eb2<-numeric(100); SEb2<-numeric(100)
Esgd2<-numeric(100); SEsgd2<-numeric(100)

#Arrays save the HPD intervals
mu.y1=array(NA, c(100,10,2))
lam1=array(NA, c(100,6,2))
gam1=array(NA, c(100,4,2))
sgd1=array(NA, c(100,2))
phx1=array(NA, c(100,4,2))
mu.y2=array(NA, c(100,10,2))
lam2=array(NA, c(100,6,2))
gam2=array(NA, c(100,3,2))
sgd2=array(NA, c(100,2))
phx2=array(NA, c(100,3,2))
DIC=numeric(100)#DIC values
#Parameters to be estimated
parameters<-c
("mu.y1","lam1","gam1","phi1","psi1","psd1","sgd1","phx1","mu.y2","lam2","gam2","phi2","psi2","psd2","sgd2","phx2")
#Initial values for the MCMC in WinBUGS
init1<-list(uby1=rep(0.0,10),lam1=rep(0.0,10),
gam1=c(1.0,1.0,1.0


-- 
Dr. Charles Determan, PhD
Integrated Biosciences

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] error in R program

2014-06-07 Thread Charles Determan Jr
REPLY TO ALL FOR THE R-HELP LIST!!!

I apologize for the bluntness, but you must realize that if you want help
from the mailing list beyond a single person, your question must actually
get to the mailing list.  My expertise only goes so far, and there is an
infinitely larger community that could likely help you more quickly, and
often in a way that improves performance.  I am always happy to help when
time allows, but please make sure r-help@r-project.org is a recipient as
well.

Regarding your code, after glancing at it again it is clear that you aren't
actually submitting a covariance matrix to mvrnorm.  The only place you
specify phi1 is initially with NA's and then within the list init1.  If you
intend to use the list init1 you specify it within the function call:

xi1<-mvrnorm(1,c(0,0,0,0),init1$phi1, tol = 1e-6, empirical = FALSE,
EISPACK = FALSE)

Once again, please make sure you are not just sending these messages solely
to me.

Regards,
Charles


On Thu, Jun 5, 2014 at 9:04 AM, thanoon younis 
wrote:

> many thanks to you Dr. Charles
>
> Really i have a problem with simulation data in xi  and now i have this
> erro r   "Error in mvrnorm(1, c(0, 0, 0), phi1, tol = 1e-06, empirical =
> FALSE,  :   incompatible arguments"
>
> Regards
>
>
>
>
> On 5 June 2014 16:45, Charles Determan Jr  wrote:
>
>> Hello again Thanoon,
>>
>> Once again, you should send these request not to me but to the r-help
>> list.  You are far more likely to get help from the greater R community
>> than just me.  Furthermore, it is not entirely clear where your error is.
>> It is courteous to provide only the code that is run up to the error and
>> commenting the location.  I see you have a loop and all the more important
>> to isolate where the error is taking place and commenting it out.
>>
>> My recommendations:
>>
>> 1. Set both t and i = 1 and try to run your loop code manually to locate
>> the specific function providing the error.
>> 2. Check your data inputs for the function.  The error tells you that
>> some data is missing or infinite.  Perhaps a slight error in your data
>> generation?
>> 3. See if you can address the specific instance before running the full
>> loops, it may be multiple instances or just one.
>>
>> I hope this provides some guidance,
>>
>> Charles
>>
>>
>> On Thu, Jun 5, 2014 at 3:10 AM, thanoon younis <
>> thanoon.youni...@gmail.com> wrote:
>>
>>> Dear Dr. Charles
>>> i need your help to correct the winbugs code below to estimate
>>> parameters of SEM by using bayesian inference for two group. I have this
>>> error"Error in eigen(Sigma, symmetric = TRUE, EISPACK = EISPACK) : infinite
>>> or missing values in 'x'.
>>>
>>> many thanks in advance
>>>
>>> R-CODE
>>>
>>> library(MASS)  #Load the MASS package
>>> library(R2WinBUGS) #Load the R2WinBUGS package
>>> library(boa)   #Load the boa package
>>> library(coda)  #Load the coda package
>>>
>>> N1<-200;N2=100; P1<-10;P2<-10
>>>
>>> phi1<-matrix(NA,nrow=4,ncol=4) #The covariance matrix of xi1
>>> Ro1<-matrix(NA,nrow=4,ncol=4)
>>> yo1<-matrix(data=NA,nrow=200,ncol=10)
>>>
>>> phi2<-matrix(NA,nrow=4,ncol=4) #The covariance matrix of xi2
>>> Ro2<-matrix(NA,nrow=4,ncol=4)
>>> yo2<-matrix(data=NA,nrow=200,ncol=10)
>>>
>>> #Matrices save the Bayesian Estimates and Standard Errors
>>> Eu1<-matrix(data=NA,nrow=100,ncol=10)
>>> SEu<-matrix(data=NA,nrow=100,ncol=10)
>>> Elam1<-matrix(data=NA,nrow=100,ncol=6)
>>> SElam1<-matrix(data=NA,nrow=100,ncol=6)
>>> Egam1<-matrix(data=NA,nrow=100,ncol=4)
>>> SEgam1<-matrix(data=NA,nrow=100,ncol=4)
>>> Ephx1<-matrix(data=NA,nrow=100,ncol=4)
>>> SEphx1<-matrix(data=NA,nrow=100,ncol=4)
>>> Eb1<-numeric(100); SEb<-numeric(100)
>>> Esgd1<-numeric(100); SEsgd<-numeric(100)
>>> Eu2<-matrix(data=NA,nrow=100,ncol=10)
>>> SEu2<-matrix(data=NA,nrow=100,ncol=10)
>>> Elam2<-matrix(data=NA,nrow=100,ncol=6)
>>> SElam2<-matrix(data=NA,nrow=100,ncol=6)
>>> Egam2<-matrix(data=NA,nrow=100,ncol=3)
>>> SEgam2<-matrix(data=NA,nrow=100,ncol=3)
>>> Ephx2<-matrix(data=NA,nrow=100,ncol=3)
>>> SEphx2<-matrix(data=NA,nrow=100,ncol=3)
>>> Eb2<-numeric(100); SEb2<-numeric(100)
>>> Esgd2<-numeric(100); SEsgd2<-numeric(100)
>>>
>>>
>>> #Arrays sav

Re: [R] error in R program

2014-06-08 Thread Charles Determan Jr
Firstly, you both need to subscribe to the mailing list.  Please go to
https://stat.ethz.ch/mailman/listinfo/r-help and subscribe.  In this way
you will also get emails from people asking questions and may benefit or
even contribute help to another.  There are several other specialty help
lists that may or may not benefit you.

As for your package installation problem it is difficult to say at this
point.  It installs perfectly fine for me.

1. Do any of the dependencies install or none at all?
2. How are you installing (install.packages or Rstudio interface)?
3. What is your R version?

Charles


On Sat, Jun 7, 2014 at 9:23 AM, Yijia Wang  wrote:

> Hi Charles,
>
> I sent my message to the mail you said, but was rejected and I can't
> figure out why, and here is my problem:
>
> when I install fGarch package, I got the following error message:
>
> Warning in install.packages :
>   unable to access index for repository
> http://cran.rstudio.com/bin/windows/contrib/3.0
> Warning in install.packages :
>   unable to access index for repository
> http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/3.0
> Error in install.packages : cannot open the connection
>
> Could you please give me some advice on how to fix it?
>
> Many thanks~
>
>
> On Sat, Jun 7, 2014 at 6:41 AM, Charles Determan Jr 
> wrote:
>
>> REPLY TO ALL FOR THE R-HELP LIST!!!
>>
>> I apologize for the bluntness but you must realize that it is critical if
>> you desire to get help on the mailing list beyond a single person your
>> question must actually get to the mailing list.  My expertise only goes so
>> far and there is an infinitely larger community that could likely help you
>> more quickly and often in a way that can increase performance.  I am
>> always
>> happy to help when time allows but please make sure the
>> r-help@r-project.org
>> is a recipient as well.
>>
>> Regarding your code, after glancing at it again it is clear that you
>> aren't
>> actually submitting a covariance matrix to mvrnorm.  The only place you
>> specify phi1 is initially with NA's and then within the list init1.  If
>> you
>> intend to use the list init1 you specify it within the function call:
>>
>> xi1<-mvrnorm(1,c(0,0,0,0),init1$phi1, tol = 1e-6, empirical = FALSE,
>> EISPACK = FALSE)
>>
>> Once again, please make sure you are not just sending these messages
>> solely
>> to me.
>>
>> Regards,
>> Charles
>>
>>
>> On Thu, Jun 5, 2014 at 9:04 AM, thanoon younis <
>> thanoon.youni...@gmail.com>
>> wrote:
>>
>> > many thanks to you Dr. Charles
>> >
>> > Really i have a problem with simulation data in xi  and now i have this
>> > erro r   "Error in mvrnorm(1, c(0, 0, 0), phi1, tol = 1e-06, empirical =
>> > FALSE,  :   incompatible arguments"
>> >
>> > Regards
>> >
>> >
>> >
>> >
>> > On 5 June 2014 16:45, Charles Determan Jr  wrote:
>> >
>> >> Hello again Thanoon,
>> >>
>> >> Once again, you should send these request not to me but to the r-help
>> >> list.  You are far more likely to get help from the greater R community
>> >> than just me.  Furthermore, it is not entirely clear where your error
>> is.
>> >> It is courteous to provide only the code that is run up to the error
>> and
>> >> commenting the location.  I see you have a loop and all the more
>> important
>> >> to isolate where the error is taking place and commenting it out.
>> >>
>> >> My recommendations:
>> >>
>> >> 1. Set both t and i = 1 and try to run your loop code manually to
>> locate
>> >> the specific function providing the error.
>> >> 2. Check your data inputs for the function.  The error tells you that
>> >> some data is missing or infinite.  Perhaps a slight error in your data
>> >> generation?
>> >> 3. See if you can address the specific instance before running the full
>> >> loops, it may be multiple instances or just one.
>> >>
>> >> I hope this provides some guidance,
>> >>
>> >> Charles
>> >>
>> >>
>> >> On Thu, Jun 5, 2014 at 3:10 AM, thanoon younis <
>> >> thanoon.youni...@gmail.com> wrote:
>> >>
>> >>> Dear Dr. Charles
>> >>> i need your help to correct the winbugs code below to estimate
>> >>> parameters of SEM by using bayesian inference for two group. I have
>> 

[R] reformatting some data

2012-12-04 Thread Charles Determan Jr
Hello,

I am trying to reformat some data so that it is organized by group in the
columns.  The data currently looks like this:

   group X3.Hydroxybutyrate X3.Hydroxyisovalerate   ADP
347 4  4e-04 3e-04  5e-04
353 3  5e-04 3e-04  6e-04
359 4  4e-04 3e-04  6e-04
365 4  6e-04 3e-04  5e-04
371 4  5e-04 3e-04  7e-04
377 2  7e-04 4e-04  7e-04

I would like to reformat it so it is like this:

2  3   4
var1
var2
var3


I realize that there are unequal numbers in each group, but I would like
to do this nonetheless if possible.
Here is a subset of the data:

structure(list(group = c(4L, 3L, 4L, 4L, 4L, 2L), X3.Hydroxybutyrate =
c(4e-04,
5e-04, 4e-04, 6e-04, 5e-04, 7e-04), X3.Hydroxyisovalerate = c(3e-04,
3e-04, 3e-04, 3e-04, 3e-04, 4e-04), ADP = c(5e-04, 6e-04, 6e-04,
5e-04, 7e-04, 7e-04)), .Names = c("group", "X3.Hydroxybutyrate",
"X3.Hydroxyisovalerate", "ADP"), row.names = c(347L, 353L, 359L,
365L, 371L, 377L), class = "data.frame")
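Assuming each cell of the desired table should hold one summary value per variable per group (the mean is my assumption; the question does not say how the unequal group sizes should be collapsed), the target layout could be produced like this:

```r
dat <- data.frame(
  group = c(4L, 3L, 4L, 4L, 4L, 2L),
  X3.Hydroxybutyrate    = c(4e-04, 5e-04, 4e-04, 6e-04, 5e-04, 7e-04),
  X3.Hydroxyisovalerate = c(3e-04, 3e-04, 3e-04, 3e-04, 3e-04, 4e-04),
  ADP                   = c(5e-04, 6e-04, 6e-04, 5e-04, 7e-04, 7e-04))

# column means within each group; sapply lays the result out with
# variables as rows and groups ("2", "3", "4") as columns
res <- sapply(split(dat[-1], dat$group), colMeans)
# e.g. res["X3.Hydroxybutyrate", "4"] is 4.75e-04
```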

Any insight is truly appreciated,
Regards,
Charles

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fwd: Calculating group means

2013-12-23 Thread Charles Determan Jr
I would suggest using summaryBy()

library(doBy)
# sample data with you specifications
subject <- as.factor(rep(seq(13), each = 5))
state <- as.factor(sample(c(1:8), 65, replace = TRUE))
condition <- as.factor(sample(c(1:10), 65, replace = TRUE))
latency <- runif(65, min=750, max = 1100)

dat <- data.frame(subject, state, condition, latency)

summaryBy(latency ~ subject + state + condition, data = dat, FUN = mean)
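A base-R equivalent of the above, using aggregate() so no extra package is needed (same simulated data):

```r
set.seed(1)
# sample data with the poster's specifications
subject <- as.factor(rep(seq(13), each = 5))
state <- as.factor(sample(1:8, 65, replace = TRUE))
condition <- as.factor(sample(1:10, 65, replace = TRUE))
latency <- runif(65, min = 750, max = 1100)
dat <- data.frame(subject, state, condition, latency)

# mean latency per subject/state/condition combination
means <- aggregate(latency ~ subject + state + condition, data = dat, FUN = mean)
head(means)
```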

Regards,


On Mon, Dec 23, 2013 at 6:31 AM, Laura Bethan Thomas [lbt1]  wrote:

> > Hi All,
> >
> > Sorry for what I imagine is quite a basic question. What I have been
> > trying to do is create latency averages for each state (1-8) for each
> > participant (n=13) in each condition (1-10). I'm not sure what function
> > I would need, or what the most efficient way of calculating this would
> > be. If you have any help with that I would be very grateful.
> >
> > structure(list(subject = c(1L, 1L, 1L, 1L, 1L, 1L), conditionNo = c(1L,
> > 1L, 1L, 1L, 1L, 1L), state = c(5L, 8L, 7L, 8L, 1L, 7L), latency = c(869L,
> > 864L, 1004L, 801L, 611L, 679L)), .Names = c("subject", "conditionNo",
> > "state", "latency"), row.names = 3:8, class = "data.frame")
> >
> > Thanks again,
> >
> > Laura
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Recycling other internal package functions

2013-09-23 Thread Charles Determan Jr
Greetings,

I am not sure if this question should be posted on the development mailing
list, but perhaps it is general enough for this one.  I am currently
developing an R package, and other packages contain some internal functions
that I would also like to utilize (e.g. reformatting some output, running
summary statistics over different structures, etc.).  By internal I mean
that the user is unable to use these functions; they simply exist to make
other functions work or to make the output 'pretty'.

My question is: what is the appropriate way to reuse these functions in
another package?  Should I simply rewrite them inside the new package,
import them from the other packages, or use some other 'polite' method that
extends credit to the authors?  I certainly don't want to offend someone
who discovers I reused a function they wrote without giving them credit.
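The options above, sketched with placeholder names (otherpkg and helper_fun are hypothetical, not real packages or functions):

```r
# Importing an *exported* function:
#   DESCRIPTION:  Imports: otherpkg
#   NAMESPACE:    importFrom(otherpkg, helper_fun)
# then call helper_fun() directly in your package code.

# Reaching an *unexported* function:
#   otherpkg:::internal_helper()   # works, but ::: is discouraged for
#                                  # released packages since internals
#                                  # can change without notice.
# Copying the code, with attribution in the man page and a note in
# DESCRIPTION, is the other common route.
```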

Regards,

-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interpreting the result of a Wilcoxon (Mann-Whitney U) test

2013-10-02 Thread Charles Determan Jr
Filipe,

When you choose a different alternative argument you are testing a
different alternative hypothesis: two-tailed, less-than, or greater-than.
Which one you choose depends on your initial question.  Are you asking
generically whether your two populations (a and b) are different?  Are you
asking if a > b or a < b?  It is my understanding that you shouldn't just
run all of them to see which fits; it depends on what you initially
intended to test.  If you can answer that question then you can determine
whether your appropriate run is significant.
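Using the vectors from the message below, the three calls can be compared directly; the p-values in the comments are the ones reported in the original message:

```r
a <- c(1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 2, 1, 5, 1, 1, 1, 3, 1, 1,
       1, 1, 1, 1, 3, 1, 1)
b <- c(1, 2, 1, 1, 2, 3, 2, 2, 1, 2, 1, 1, 1, 2)

# two-sided: is there any shift at all?   p = 0.08969
p2 <- wilcox.test(a, b, alternative = "two.sided", exact = FALSE)$p.value
# one-sided: is a shifted below b?        p = 0.04485
p1 <- wilcox.test(a, b, alternative = "less", exact = FALSE)$p.value
```

The one-sided p-value is roughly half the two-sided one here, which is why only the pre-specified hypothesis should be judged at alpha = 0.05.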

Regards,

-- 
Charles Determan
Integrated Biosciences PhD Candidate
University of Minnesota


On Wed, Oct 2, 2013 at 10:33 AM, Filipe Correia  wrote:

> Hello everyone,
>
> I'm having some trouble interpreting the results of a Wilcoxon
> (Mann-Whitney U) test. Hope you can help.
>
> This is the R script that I am running:
>
> a <- c(1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 2, 1, 5, 1, 1, 1, 3, 1, 1,
> 1, 1, 1, 1, 3, 1, 1)
> b <- c(1, 2, 1, 1, 2, 3, 2, 2, 1, 2, 1, 1, 1, 2)
> wilcox.test(a, b, alternative="t", mu=0, exact=FALSE, paired=FALSE)  #1st
> wilcox.test(a, b, alternative="l", mu=0, exact=FALSE, paired=FALSE)  #2nd
> wilcox.test(a, b, alternative="g", mu=0, exact=FALSE, paired=FALSE)  #3rd
>
> ... and it's returning:
>
> Wilcoxon rank sum test with continuity correction data:  a and b
> W = 145, p-value = 0.08969
> alternative hypothesis: true location shift is not equal to 0
>
> Wilcoxon rank sum test with continuity correction data:  a and b
> W = 145, p-value = 0.04485
> alternative hypothesis: true location shift is less than 0
>
> Wilcoxon rank sum test with continuity correction data:  a and b
> W = 145, p-value = 0.9582
> alternative hypothesis: true location shift is greater than 0
>
> The null hypothesis is that the populations are equivalent (mu=0).  The
> alternative hypotheses are that they differ, with the 2nd and 3rd runs
> of the test above considering respectively that a < b and a > b.  Also,
> I'm using an alpha of 0.05.
>
> My issue is that from the first run I could not conclude that there
> was a difference between the two populations (0.08969 > 0.05), but the
> second run leads me to think that a < b (0.04485 < 0.05).
> Am I misinterpreting the results?
>
> Thanks!
>
> Filipe
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

