date:20140625

Re: [R] Generating Patient Data

2014-06-25 Thread Anthony Damico

# build off of david's suggestion
x <-
    data.frame(
        patient = 1:20 ,
        disease =
            sapply(
                pmin( 2 + rpois( 20 , 2 ) , 6 ) ,
                function( n ) paste0( sample( c('A','B','C','D','E','F') , n ) , collapse = "+" )
            )
    )

# break the diseases into a list, one entry per patient
y <- strsplit( as.character( x$disease ) , "\\+" )

# melt the list
library(reshape2)

z <- melt( y )

# re-name the columns in that result
names( z ) <- c( "disease" , "patient" )

# print the results to the screen
z

# compare the structure to `x` if you like
x
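
For the 1000-patient case in the original question, the long format can also be built directly in base R, skipping the paste/split round trip. This is only a sketch: the names `n_patients`, `n_dis` and `long` are made up here, and drawing the number of diseases uniformly from 2 to 6 is an assumption, not something the thread specifies.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only
n_patients <- 1000
diseases   <- LETTERS[1:6]         # "A" .. "F"

# number of distinct diseases per patient, at least 2 and at most 6
n_dis <- sample(2:6, n_patients, replace = TRUE)

# one row per (patient, disease) pair; sample() without replacement
# guarantees no disease is repeated within a patient
long <- data.frame(
    ID      = rep(seq_len(n_patients), times = n_dis),
    Disease = unlist(lapply(n_dis, function(k) sample(diseases, k)))
)

head(long)
```

Each ID then appears between two and six times, which is the shape a basket analysis on ID usually expects.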





On Wed, Jun 25, 2014 at 2:18 AM, Abhinaba Roy 
wrote:

> Hi David,
>
> I was thinking something like this:
>
> ID   Disease
> 1 A
> 2 B
> 3 A
> 1 C
> 2 D
> 5 A
> 4 B
> 3 D
> 2 A
> ....
>
> How can this be done?
>
>
> On Wed, Jun 25, 2014 at 11:34 AM, David Winsemius 
> wrote:
>
> >
> > On Jun 24, 2014, at 10:14 PM, Abhinaba Roy wrote:
> >
> > > Dear R helpers,
> > >
> > > I want to generate data for say 1000 patients (i.e., 1000 unique IDs)
> > > having suffered from various diseases in the past (say diseases
> > > A,B,C,D,E,F). The only condition imposed is that each patient should've
> > > suffered from *atleast* two diseases. So my data frame will have two
> > > columns 'ID' and 'Disease'.
> > >
> > > I want to do a basket analysis with this data, where ID will be the
> > > identifier and we will establish rules based on the 'Disease' column.
> > >
> > > How can I generate this type of data in R?
> > >
> >
> > Perhaps something along these lines for 20 cases:
> >
> > > data.frame(patient=1:20, disease = sapply(pmin(2+rpois(20, 2), 6),
> > +   function(n) paste0( sample( c('A','B','C','D','E','F'), n), collapse="+" ) )
> > + )
> >    patient     disease
> > 1        1         F+D
> > 2        2     F+A+D+E
> > 3        3     F+D+C+E
> > 4        4     B+D+C+A
> > 5        5     D+A+F+C
> > 6        6       E+A+D
> > 7        7 E+F+B+C+A+D
> > 8        8   A+B+C+D+E
> > 9        9     B+E+C+F
> > 10      10         C+A
> > 11      11 B+A+D+E+C+F
> > 12      12         B+C
> > 13      13     A+D+B+E
> > 14      14 D+C+E+F+B+A
> > 15      15   C+F+D+E+A
> > 16      16       A+C+B
> > 17      17     C+D+B+E
> > 18      18         A+B
> > 19      19   C+B+D+E+F
> > 20      20       D+C+F
> >
> > > --
> > > Regards
> > > Abhinaba Roy
> > >
> > >   [[alternative HTML version deleted]]
> >
> > You should read the Posting Guide and learn to post in plain text.
> > >
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> >
> > --
> > David Winsemius
> > Alameda, CA, USA
> >
> >
>
>
> --
> Regards
> Abhinaba Roy
> Statistician
> Radix Analytics Pvt. Ltd
> Ahmedabad
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] SD of Residuals by group

2014-06-25 Thread Katharina Mersmann

 

 

Dear Community,

I emphasize the use of graphical methods to examine residuals for a Panel
model and want to plot the  Standard Deviation of residuals by group.

I used the following for plotting residuals by group:

>library(lattice)

>xyplot((residuals(fixed.reg1.1))~countrynr,data=data.plm)

 

But I have no idea how to adjust this for the SD of Residuals by group ? 
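
One possible approach, sketched on synthetic data: compute the standard deviation of the residuals within each group with tapply() and plot that vector. Everything here is an assumption for illustration: `d`, `country` and `fit` are made-up stand-ins, and an ordinary lm() fit takes the place of the plm model, whose residuals may need as.numeric() first.

```r
set.seed(1)
# synthetic panel: 5 groups with 20 observations each, noise growing by group
d <- data.frame(country = factor(rep(1:5, each = 20)), x = rnorm(100))
d$y <- 1 + 2 * d$x + rnorm(100, sd = rep(1:5, each = 20))

fit <- lm(y ~ x + country, data = d)     # stand-in for the panel model

# standard deviation of the residuals within each group
sd_by_group <- tapply(residuals(fit), d$country, sd)
sd_by_group

# plot only when run interactively; lattice ships with R
if (interactive()) {
    library(lattice)
    print(dotplot(sd_by_group ~ factor(names(sd_by_group)),
                  xlab = "country", ylab = "SD of residuals"))
}
```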

 

Thanks for attention and help!

Katie

 



[R] matlab serial date to r

2014-06-25 Thread Christoph Schlächter

Hi,

I have a matlab variable as serial date (class double) in the form
'dd-mmm-yyyy HH:MM:SS'.

format long
disp( tx(40:60,1) )

1.0e+05 *

   7.356000000813091
   7.356000000956856
   7.356000001305921
   7.356000001654985
   7.356000002004049
   7.356000002353113
   7.356000002702178
   7.356000003397179
   7.356000004092182
   7.356000004787183
   7.356000005482185
   7.356000006177187
   7.356000006940080
   7.356000007702975
   7.356000008465869
   7.356000009228763
   7.356000009991657
   7.356000010754551
   7.356000011517445
   7.356000012280339
   7.356000013085329

It should be the same as

datestr( tx(40:60,1), 0)

01-Jan-2014 00:00:07
01-Jan-2014 00:00:08
01-Jan-2014 00:00:11
01-Jan-2014 00:00:14
01-Jan-2014 00:00:17
01-Jan-2014 00:00:20
01-Jan-2014 00:00:23
01-Jan-2014 00:00:29
01-Jan-2014 00:00:35
01-Jan-2014 00:00:41
01-Jan-2014 00:00:47
01-Jan-2014 00:00:53
01-Jan-2014 00:00:59
01-Jan-2014 00:01:06
01-Jan-2014 00:01:13
01-Jan-2014 00:01:19
01-Jan-2014 00:01:26
01-Jan-2014 00:01:32
01-Jan-2014 00:01:39
01-Jan-2014 00:01:46
01-Jan-2014 00:01:53

I can easily convert it with Matlab but then I will obtain a character
format which is useless. I can also make use of cellstr(datestr(tx(:,1),
0)) but then I can't save it in ASCII file.

The origin of the Matlab format is supposed to be "0000-00-00". This is the
only origin which results in "2014-01-01" which is my actual start date.

Can somebody please tell me how I can simply convert serial datetime to
datetime in R.

Thanks in advance.

All the best,

Christoph


Re: [R] saving a 'get' object in R

2014-06-25 Thread David Stevens

Greg - I appreciate your taking the time to explain. This is very 
helpful. My case was a bit unusual as I'm helping a colleague with code 
to use on a regular but individual basis. I want them to name a data set 
once at the top of the script and have that name propagate through 
several objects that end up being saved at the end with file names tied 
to the data set name. Then tomorrow, they'll do the same thing on a 
different data set so each session would be pretty simple, with only 2-3 
named objects at the end, though the data sets are big and cumbersome.


I'll dig into this more and apologize to those who thought this problem 
was too trivial for the r-help forum.


David

On 6/24/2014 4:00 PM, Greg Snow wrote:

The main reason to avoid assign/get is that there are better ways.
You can use a list or environment more directly without using
assign/get.  Also the main reason to use assign/get is to work with
global variables which are frowned on in general programming (and
prone to hard to find bugs).

Consider the following code snippet:

x <- 1:10
functionThatUsesAssign(x)

what is the value of x after running this code?  If you use global
variables then the answer is "I don't know", and if x was something
that took several hours to compute and I just accidentally overwrote
it, then I am probably not happy.

Using assign often leads to the temptation to try code like:

assign( "x[5]", 25 )

which does something (no warnings, no errors) but generally not what
was being attempted.
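
A small sketch of what that call actually does: it creates a new variable whose name is literally `x[5]` and leaves `x` untouched. The names `unchanged` and `odd` are only for illustration.

```r
x <- 1:10
assign("x[5]", 25)        # creates a variable literally named `x[5]`

unchanged <- x[5]         # the fifth element of x is still 5
odd <- get("x[5]")        # the oddly named variable holds 25

x[5] <- 25L               # ordinary indexed assignment is what was wanted
```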

Another common (mis)use of assign/get is to create a sequence of
variables like "data01", "data02", "data03", ...  and then do the same
thing for each of the data objects.  This is much better done by
putting all the objects into a single list, then you can easily
iterate over the list using lapply/sapply or still access the
individual pieces.  Then when it comes time to save these objects, you
only need to save the list, not all the individual objects, same for
deleting, copying, moving, etc.  And you don't have a bunch of
different objects cluttering your workspace, just a single list.
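
As a rough illustration of the list approach (the data and the names `datasets`, `col_means` and `out` are invented for this sketch):

```r
# one named list instead of data01, data02, data03 as separate variables
datasets <- list(
    data01 = data.frame(v = 1:3),
    data02 = data.frame(v = 4:6),
    data03 = data.frame(v = 7:9)
)

# do the same thing to every data set
col_means <- sapply(datasets, function(d) mean(d$v))

# individual pieces stay accessible by name
datasets$data02

# saving means saving one object
out <- file.path(tempdir(), "datasets.rds")
saveRDS(datasets, out)
```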

On Tue, Jun 24, 2014 at 2:57 PM, David Stevens  wrote:

Thanks to all for the replies. I tried all three and they work great. I was
misinterpreting the list = parameter in save(...) and I get your point about
overwriting existing objects.  I've heard about not using assign/get before.
Can anyone point me to why and what alternatives there are?

Regards

David


On 6/24/2014 2:50 PM, Henrik Bengtsson wrote:

I recommend to use saveRDS()/readRDS() instead.  More convenient and
avoids the risk that load() has of overwriting existing variables with
the same name.
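
A minimal sketch of that suggestion, using the data frame from the original question; the file location and the name `restored` are assumptions made for the example.

```r
my.x <- data.frame(v1 = c(1, 2, 3, 4), v2 = c("1", "2", "3", "4"))

rds_file <- file.path(tempdir(), "my.x.rds")
saveRDS(my.x, rds_file)          # stores the object only, not its name

# the reader picks the name, so nothing in the workspace is overwritten
restored <- readRDS(rds_file)
```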

/Henrik

On Tue, Jun 24, 2014 at 1:45 PM, Greg Snow <538...@gmail.com> wrote:

I think that you are looking for the `list` argument in `save`.

save( list=foo, file=paste0(foo, '.Rdata') )

In general it is best to avoid using the assign function (and get when
possible).  Usually there are better alternatives.
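
To see the `list` argument at work, here is a sketch that follows the example from the question and then loads the file into a separate environment so that nothing in the workspace is overwritten. The temporary file location and the names `e` and `loaded` are assumptions.

```r
x <- data.frame(v1 = c(1, 2, 3, 4), v2 = c("1", "2", "3", "4"))
foo <- "my.x"
assign(foo, x)                        # as in the original post

rdata_file <- file.path(tempdir(), paste0(foo, ".Rdata"))
save(list = foo, file = rdata_file)   # list= takes the *name* as a string

e <- new.env()
loaded <- load(rdata_file, envir = e) # returns the names it restored
loaded
```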

On Tue, Jun 24, 2014 at 2:35 PM, David Stevens 
wrote:

R community,

Apologies if this has been answered. The concept I'm looking for is to
save() an object retrieved using get() for an object
that resulted from using assign. Something like

save(get(foo),file=paste(foo,'rData',sep=''))

where assign(foo,obj) creates an object named foo with the contents of
obj
assigned. For example, if

x <- data.frame(v1=c(1,2,3,4),v2=c('1','2','3','4'))
foo = 'my.x'
assign(foo,x)
# (... then modify foo as needed)
save(get(foo),file=paste(foo,'.rData',sep=''))

# though this generates "Error in save(get(foo), file = paste(foo, ".rData", sep = "")) :
object ‘get(foo)’ not found", whereas

get(foo)

at the command prompt yields the contents of my.x

There's a concept I'm missing here. Can anyone help?

Regards

David Stevens

--
David K Stevens, P.E., Ph.D.
Professor and Head, Environmental Engineering
Civil and Environmental Engineering
Utah Water Research Laboratory
8200 Old Main Hill
Logan, UT  84322-8200
435 797 3229 - voice
435 797 1363 - fax
david.stev...@usu.edu




--
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com


Re: [R] saving a 'get' object in R

2014-06-25 Thread Jeff Newmiller

Seems to me you are creating your own troubles in your "requirements". If the 
analysis is the same from case to case, it makes more sense to use a single set 
of object names within the analysis and change only the names of the input and 
output files.
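
A sketch of what this could look like, assuming CSV input and a trivial "analysis"; the function `run_analysis`, the `_summary.rds` suffix and the demo file are all hypothetical.

```r
# Fixed object names inside; only the file names change from case to case.
run_analysis <- function(infile, outdir = dirname(infile)) {
    dat    <- read.csv(infile)        # always called 'dat', whatever the file
    result <- summary(dat)            # placeholder for the real analysis

    # derive the output name from the input name
    stem    <- tools::file_path_sans_ext(basename(infile))
    outfile <- file.path(outdir, paste0(stem, "_summary.rds"))
    saveRDS(result, outfile)
    invisible(outfile)
}

# demo on a throwaway file
demo_in <- file.path(tempdir(), "siteA.csv")
write.csv(data.frame(v1 = 1:4, v2 = c(2.5, 3.5, 4.5, 5.5)),
          demo_in, row.names = FALSE)
demo_out <- run_analysis(demo_in)
```

The colleague would then only ever edit the file name passed to the function.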
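---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:                                  Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------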
Sent from my phone. Please excuse my brevity.


Re: [R] saving a 'get' object in R

2014-06-25 Thread Bert Gunter

... and wrap it all up into a single function call that could even
have the user interactively supply the input data file names.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll





Re: [R] matlab serial date to r

2014-06-25 Thread Prof Brian Ripley


See ?as.Date.

I am guessing these are days and fractional days.  Try

x <- 7.356000000813091e5
as.POSIXct((x - 719529)*86400, origin = "1970-01-01", tz = "UTC")
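
The same arithmetic applied to a whole vector, as a sketch. It assumes full-precision serial dates such as 735600.0000813091 (seven seconds after midnight on 1 January 2014) and that the values are meant as UTC; the helper name `matlab_to_posixct` is made up.

```r
# 719529 is the MATLAB serial date of 1970-01-01; 86400 seconds per day
matlab_to_posixct <- function(datenum, tz = "UTC") {
    as.POSIXct((datenum - 719529) * 86400, origin = "1970-01-01", tz = tz)
}

tx <- c(735600.0000813091, 735600.0013085329)   # first and last value, assumed
stamps <- matlab_to_posixct(tx)
format(stamps, "%Y-%m-%d %H:%M:%S")
# "2014-01-01 00:00:07" "2014-01-01 00:01:53"
```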








--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: [R] matlab serial date to r

2014-06-25 Thread Jeff Newmiller

I think the character format for this data is the most versatile and clear 
option. You do have to prevent the R input function (read.csv? read.table?) 
from converting it to factor when you read it in, but then you can use 
as.POSIXct with a format argument (see ?strptime) to obtain useful timestamp 
values in R. You also need to be clear about time zones, but that is true 
regardless of the software you use.
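
A sketch of that route, assuming the datestr() output was written to a one-column text file with a header called `stamp` (a made-up name) and that the times are UTC. Note that %b matches month abbreviations in the current locale, so this assumes an English locale.

```r
# stand-in for the exported file
txt <- "stamp\n01-Jan-2014 00:00:07\n01-Jan-2014 00:00:08\n01-Jan-2014 00:00:11"

# keep the column as character rather than factor
dat <- read.csv(text = txt, stringsAsFactors = FALSE)

# see ?strptime for the format codes
dat$stamp <- as.POSIXct(dat$stamp, format = "%d-%b-%Y %H:%M:%S", tz = "UTC")

diff(dat$stamp)    # time differences between consecutive stamps
```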

Re: [R] Generating Patient Data

2014-06-25 Thread arun




Hi, 

Check if this works:
library(reshape2)
set.seed(495)
dat <- data.frame(ID=sample(1:10,20,replace=TRUE),
                  Disease=sample(LETTERS[1:6], 20, replace=TRUE) )
subset(melt(table(dat)[rowSums(!!table(dat))>1,]), !!value,select=1:2)
   ID Disease
1   2       A
3   4       A
4   6       A
6  10       A
8   3       B
15  4       C
16  6       C
20  3       D
22  6       D
24 10       D
26  3       E
27  4       E
29  7       E
31  2       F
33  4       F
35  7       F
A.K.
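
For readers unsure what `!!table(dat)` is doing in that one-liner, here is the same filtering idea unpacked in base R on a tiny hand-made data set. The names `tab`, `present`, `keep_ids` and `res` are only for illustration.

```r
dat <- data.frame(ID      = c(1, 1, 2, 3, 3, 3),
                  Disease = c("A", "B", "A", "C", "C", "D"))

tab     <- table(dat)       # ID x Disease matrix of counts
present <- tab > 0          # same logical matrix that !!tab produces

# IDs with more than one distinct disease
keep_ids <- rownames(tab)[rowSums(present) > 1]

# keep those IDs and drop repeated (ID, Disease) pairs
res <- unique(dat[as.character(dat$ID) %in% keep_ids, ])
res
```

Here ID 2 is dropped because it has a single disease, and the repeated pair (3, C) is kept once.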




Re: [R] matlab serial date to r

2014-06-25 Thread arun

Hi,
Maybe this helps:

dat <- read.table(text="7.356000000813091
  7.356000000956856
  7.356000001305921
  7.356000001654985
  7.356000002004049
  7.356000002353113
  7.356000002702178
  7.356000003397179
  7.356000004092182
  7.356000004787183
  7.356000005482185
  7.356000006177187
  7.356000006940080
  7.356000007702975
  7.356000008465869
  7.356000009228763
  7.356000009991657
  7.356000010754551
  7.356000011517445
  7.356000012280339
  7.356000013085329",sep="",header=FALSE, colClasses="character")
library(chron)
t1 <- chron(1.0e+05 *as.numeric(dat[,1])) -719529
format(as.POSIXct(paste(as.Date(dates(t1)), times(t1)%%1)),"%d-%b-%Y %H:%M:%S")


A.K.




[R] Duncan test: 2-way ANOVA without repetition, but with multiple subjects

2014-06-25 Thread Rosario Garcia Gil

Hello

I have a question on how to perform a Duncan test after I set a model like this 
(see below).

The data consist of a dependent variable (PHt) and two independent variables 
(REGION (3 levels) and MANAGEMENT (3 levels)).


             MANAGEMENT
              N    P    S
 REGION  A   196  196  196
         V   196  196  196
         H   196  196  196

196 is the number of trees on which PHt was estimated, but we cannot consider 
them as repetitions, they are subjects. Therefore, only one repetition per 
REGION*MANAGEMENT factor.

I set a two way ANOVA without repetition. Although within

model <- 
aov(PHt~REGION*MANAGEMENT+Error(subject_f/(REGION*MANAGEMENT)),data=obsHETf)

Then I try to run a Duncan test for REGION and also for MANAGEMENT. I get this 
error message.

Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = 
stringsAsFactors) :
  cannot coerce class "c("aovlist", "listof")" to a data.frame

Is there anyone who could give me a clue on what is wrong? Maybe it is not 
correct to call Duncan.test() for this type of model?

If I fit only the mean (mean of the 196 observations) within each 
REGION*MANAGEMENT factor, then Duncan.test() works as expected. The model then 
looks as simple as this.

model <- aov(PHt~REGION+MANAGEMENT,data=obsHET)
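
A possible explanation for the difference between the two fits, sketched on synthetic data: aov() with an Error() term returns an object of class "aovlist" (one fit per error stratum), while the plain call returns a single "aov"/"lm" fit, and a function written for the latter may not know how to handle the former. The data below are invented (6 subjects, fully crossed) and only the variable names follow the post; this does not settle which post-hoc test is appropriate.

```r
set.seed(1)
# synthetic, fully crossed data: every subject seen in every cell
d <- expand.grid(subject_f  = factor(1:6),
                 REGION     = factor(c("A", "V", "H")),
                 MANAGEMENT = factor(c("N", "P", "S")))
d$PHt <- rnorm(nrow(d))

m_err   <- aov(PHt ~ REGION * MANAGEMENT +
                   Error(subject_f / (REGION * MANAGEMENT)), data = d)
m_plain <- aov(PHt ~ REGION + MANAGEMENT, data = d)

class(m_err)     # "aovlist" "listof"
class(m_plain)   # "aov" "lm"
```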

Thanks in advance.
R.



Re: [R] Generating Patient Data

2014-06-25 Thread arun

Also, you can do:
library(dplyr)
dat%>%group_by(ID)%>%filter(length(unique(Disease))>1)%>%arrange(Disease,ID)
A.K.




On Wednesday, June 25, 2014 3:45 AM, arun  wrote:


Forgot about:
library(reshape2)





On , arun  wrote:



Hi, 

Check if this works:
 set.seed(495)
 dat <- data.frame(ID=sample(1:10,20,replace=TRUE), 
Disease=sample(LETTERS[1:6], 20, replace=TRUE) )
subset(melt(table(dat)[rowSums(!!table(dat))>1,]), !!value,select=1:2)
   ID Disease
1   2   A
3   4   A
4   6   A
6  10   A
8   3   B
15  4   C
16  6   C
20  3   D
22  6   D
24 10   D
26  3   E
27  4   E
29  7   E
31  2   F
33  4   F
35  7   F
A.K.






On Wednesday, June 25, 2014 1:17 AM, Abhinaba Roy  
wrote:
Dear R helpers,

I want to generate data for say 1000 patients (i.e., 1000 unique IDs)
having suffered from various diseases in the past (say diseases
A,B,C,D,E,F). The only condition imposed is that each patient should've
suffered from *atleast* two diseases. So my data frame will have two
columns 'ID' and 'Disease'.

I want to do a basket analysis with this data, where ID will be the
identifier and we will establish rules based on the 'Disease' column.

How can I generate this type of data in R?

-- 
Regards
Abhinaba Roy


Re: [R] partykit ctree: minbucket and case weights

2014-06-25 Thread Torsten Hothorn



Dear Amber,

your data contains missing values and you don't use surrogate splits to 
deal with them. So, the observations are passed down the tree randomly 
(there is no "majority" argument to "ctree_control"!) and thus it might 
happen that too small terminal nodes are created.


Simply use surrogate splits and the tree will be deterministic with 
correct-sized terminal nodes (maxsurrogate = 3, for example).
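
A minimal sketch of what that could look like, assuming the partykit package is installed. Amber's data set is not available here, so the built-in airquality data (which has missing values in Ozone and Solar.R) stands in for it; the object names airq, ctrl and fit are made up for illustration, and the other control settings are copied from the original call:

```r
library(partykit)

# airquality ships with R; drop rows where the response is missing,
# the predictor Solar.R still contains NAs
airq <- subset(airquality, !is.na(Ozone))

ctrl <- ctree_control(
  testtype     = "Bonferroni",
  mincriterion = 0.90,
  minsplit     = 12,
  minbucket    = 4,
  maxsurrogate = 3   # allow up to 3 surrogate splits for missing values
)

fit <- ctree(Ozone ~ ., data = airq, control = ctrl)

# number of observations that end up in each terminal node
table(predict(fit, type = "node"))
```

The table at the end is a quick way to check the terminal node sizes against minbucket.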


Best,

Torsten

On Mon, 9 Jun 2014, Amber Dawn Nolder wrote:

I have attached the data set (cavl) and R code used when I got the results I 
posted about. I included the code I used at the top of the document. Below 
that is the version of R used and some of the results I obtained.

Many thanks!
Amber 
On Wed, 4 Jun 2014 09:12:15 +0200 (CEST)
Torsten Hothorn  wrote:


On Tue, 3 Jun 2014, Amber Dawn Nolder wrote:

I apologize for my lack of knowledge with R. I usually load my data as a 
csv file. May I send that to you? I was not sure if I could do so on the 
list.


yes, and the R code you used. Thanks,

Torsten


Thank you?
On Fri, 30 May 2014 09:37:23 +0200 (CEST)
Torsten Hothorn  wrote:


Amber,

this looks like an error -- could you pls send me a reproducible example 
so that I can track the problem down?


Best,

Torsten




Prof. Dr. Torsten Hothorn
Universitaet Zuerich
Institut fuer Epidemiologie, Biostatistik und Praevention,
Abteilung Biostatistik
Hirschengraben 84
CH-8001 Zuerich
Schweiz
Telephon:  +41 44 634 48 17
Fax:       +41 44 634 43 86
Web:       http://tiny.uzh.ch/6p


On Wed, 28 May 2014, Achim Zeileis wrote:


In case you haven't seen it already...

Best,
Z

-- Forwarded message --
Date: Wed, 28 May 2014 17:16:12 -0400
From: Amber Dawn Nolder 
To: r-help@r-project.org
Subject: [R] partykit ctree: minbucket and case weights


   Hello,
   I am an R novice, and I am using the "partykit" package to create
   regression trees. I used the following to generate the trees:
   ctree(y~x1+x2+x3+x4,data=my_data,control=ctree_control(testtype =
   "Bonferroni", mincriterion = 0.90, minsplit = 12, minbucket = 4,
   majority = TRUE))
   I thought that "minbucket" set the minimum value for the sum of weights
   in each terminal node, and that each case weight is 1, unless otherwise
   specified. In which case, the sum of case weights in a node should equal
   the number of cases (n) in that node. However, I sometimes obtain a tree
   with a terminal node that contains fewer than 4 cases.
   My data set has a total of 36 cases. The dependent and all independent
   variables are continuous data. Variables x1 and x2 contain missing (NA)
   values.
   Could someone please explain why I am getting these results?
   Am I mistaken about the value of case weights or about the use of
   minbucket to restrict the size of a terminal node?
   This is an example of the output:
   Model formula:
   y ~ x1 + x2 + x3 + x4
   Fitted party:
   [1] root
   |   [2] x4 <= 30: 0.927 (n = 17, err = 1.1)
   |   [3] x4 > 30
   |   |   [4] x2 <= 43: 0.472 (n = 8, err = 0.4)
   |   |   [5] x2 > 43
   |   |   |   [6] x3 <= 0.4: 0.282 (n = 3, err = 0.0)
   |   |   |   [7] x3 > 0.4: 0.020 (n = 8, err = 0.0)
   Number of inner nodes:3
   Number of terminal nodes: 4
   Many thanks!
   Amber Nolder
   Graduate Student
   Indiana University of Pennsylvania

Re: [R] Duncan test: 2-way ANOVA without repetition, but with multiple subjects

2014-06-25 Thread Richard M. Heiberger

You didn't say which package Duncan.test is in.  glht has the ability.
glht, and probably any other package's test, cannot work with aovlist objects.
They require aov objects.  That means you must rewrite your Error() statement
into the main model.  Please see the entire maiz example in ?mmc to see how
to rewrite your model.

## install.packages("HH") ## if necessary
require(HH)
?mmc
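
To illustrate the class issue behind the error message, here is a small sketch on made-up data. It is not the maiz example from ?mmc; the names d, subj, trt and y are hypothetical, and TukeyHSD() from base R stands in for the post-hoc test. Whether a fixed subject term is the right model for the trees in the original question is a separate statistical decision:

```r
set.seed(1)

# hypothetical repeated-measures data: 6 subjects, each seen under 3 treatments
d <- expand.grid(subj = factor(1:6), trt = factor(c("a", "b", "c")))
d$y <- rnorm(nrow(d)) + as.integer(d$trt)

# with an Error() term, aov() returns a multi-stratum "aovlist" object
fit_err <- aov(y ~ trt + Error(subj/trt), data = d)
class(fit_err)

# with the subject term written into the main formula, the result is a
# single-stratum "aov" object
fit <- aov(y ~ subj + trt, data = d)
class(fit)

# functions that expect an aov fit can then be applied
TukeyHSD(fit, which = "trt")
```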

On Wed, Jun 25, 2014 at 6:04 AM, Rosario Garcia Gil
 wrote:
> Hello
>
> I have a question on how to perform a Duncan test after I set a model like 
> this (see below).
>
> The data consist of a dependent variable (PHt) and two independent variables 
> (REGION (3 levels) and MANAGEMENT (3 levels)).
>
>
>              MANAGEMENT
>              N    P    S
>  REGION A   196  196  196
>         V   196  196  196
>         H   196  196  196
>
> 196 is the number of trees on which PHt was estimated, but we cannot consider 
> them as repetitions, they are subjects. Therefore, only one repetition per 
> REGION*MANAGEMENT factor.
>
> I set a two way ANOVA without repetition. Although within
>
> model <- 
> aov(PHt~REGION*MANAGEMENT+Error(subject_f/(REGION*MANAGEMENT)),data=obsHETf)
>
> Then I try to run a Duncan test for REGION and also for MANAGEMENT. I get 
> this error message.
>
> Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = 
> stringsAsFactors) :
>   cannot coerce class "c("aovlist", "listof")" to a data.frame
>
> Is there anyone who could give me a clue about what is wrong? Maybe it is not 
> correct to call Duncan.test() for this type of model?
>
> If I fit only the mean (mean of the 196 observations) within each 
> REGION*MANAGEMENT factor, then Duncan.test() works as expected. The model 
> then looks as simple as this.
>
> model <- aov(PHt~REGION+MANAGEMENT,data=obsHET)
>
> Thanks in advance.
> R.

[R] Simple permutation question

2014-06-25 Thread Robert Latest

So my company has hired a few young McKinsey guys from overseas for a
couple of weeks to help us with a production line optimization. They
probably charge what I make in a year, but that's OK because I just
never have the time to really dive into one particular time, and I have
to hand it to the consultants that they came up with one or two really
clever ideas to model the production line. Of course it's up to me to
feed them the real data which they then churn through their Excel
models that they cook up during the nights in their hotel rooms, and
which I then implement back into my experimental system using live data.

Anyway, whenever they need something or come up with something I skip
out of the room, hack it into R, export the CSV and come back in about
half the time it takes Excel to even read in the data, let alone
process it. Of course that got them curious, and I showed off a couple
of scripts that condense their abysmal Excel convolutions in a few
lean and mean lines of R code.

Anyway, I'm in my office with this really attractive, clever young
McKinsey girl (I'm in my mid-forties, married with kids and all, but I
still enjoyed impressing a woman with computer stuff, of all things!),
and one of her models involves a simple permutation of five letters --
"A" through "E".

And that's when I find out that R doesn't have a permutation function.
How is that possible? R has EVERYTHING, but not that? I'm
flabbergasted. Stumped. And now it's up to me to spend the evening at
home coding that model, and the only thing I really need is that
permutation.

So this is my first attempt:

perm.broken <- function(x) {
    if (length(x) == 1) return(x)
    sapply(1:length(x), function(i) {
        cbind(x[i], perm.broken(x[-i]))
    })
}

But it doesn't work:
> perm.broken(c("A", "B", "C"))
     [,1] [,2] [,3]
[1,] "A"  "B"  "C" 
[2,] "A"  "B"  "C" 
[3,] "B"  "A"  "A" 
[4,] "C"  "C"  "B" 
[5,] "C"  "C"  "B" 
[6,] "B"  "A"  "A" 
> 

And I can't figure out for the life of me why. It should work because I
go through the elements of x in order, use that in the leftmost column,
and slap the permutation of the remaining elements to the right. What
strikes me as particularly odd is that there doesn't even seem to be a
systematic sequence of letters in any of the columns. OK, since I
really need that function I wrote this piece of crap:

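perm.stupid <- function(x) {
    # all length(x)-tuples drawn from x, one per row
    b <- as.matrix(expand.grid(rep(list(x), length(x))))
    # keep only the rows without a repeated element
    b[!sapply(1:nrow(b), function(r) any(duplicated(b[r,]))),]
}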

It works, but words cannot describe its ugliness. And it gets really
slow really fast with growing x.

So, anyway. My two questions are:
1. Does R really, really, seriously lack a permutation function?
2. OK, stop kidding me. So what's it called?
3. Why doesn't my recursive function work, and what would a
   working version look like?

Thanks,
robert
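
On question 3, one reading of the output above: sapply() tries to simplify its result, so the 2 x 3 matrix built for each leading letter is flattened column-wise into a single column of the 6 x 3 result (the first column reads A A B C C B, which is the matrix with rows "A B C" and "A C B" unrolled). Collecting the pieces with lapply() and stacking them with rbind() avoids the flattening. A minimal sketch; the name perm.fixed is made up for illustration:

```r
perm.fixed <- function(x) {
    # base case: a single element is a 1 x 1 matrix (assumes length(x) >= 1)
    if (length(x) == 1) return(matrix(x, 1, 1))
    # for each element, put it in front of every permutation of the rest,
    # then stack the resulting blocks on top of each other
    do.call(rbind, lapply(seq_along(x), function(i) {
        cbind(x[i], perm.fixed(x[-i]))
    }))
}

perm.fixed(c("A", "B", "C"))
#      [,1] [,2] [,3]
# [1,] "A"  "B"  "C"
# [2,] "A"  "C"  "B"
# [3,] "B"  "A"  "C"
# [4,] "B"  "C"  "A"
# [5,] "C"  "A"  "B"
# [6,] "C"  "B"  "A"
```

The result has one permutation per row, and the number of rows grows as factorial(length(x)), so this is only practical for short vectors.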


Re: [R] Simple permutation question

2014-06-25 Thread Cade, Brian

It is called sample(, replace = FALSE); sampling without replacement is
the default.
Try
x <- c("A","B","C","D","E")
sample(x)
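
A short sketch of what that call returns: one random reordering of x per call, not the full set of orderings. The seed value is an arbitrary choice, used only to make the shuffle repeatable:

```r
x <- c("A", "B", "C", "D", "E")

set.seed(42)      # arbitrary seed, only for repeatability
p <- sample(x)    # one random permutation of x
p

# same elements as x, each used exactly once
setequal(p, x)
anyDuplicated(p) == 0
```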

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  ca...@usgs.gov 
tel:  970 226-9326




Re: [R] Generating Patient Data

2014-06-25 Thread David Winsemius


On Jun 24, 2014, at 11:18 PM, Abhinaba Roy wrote:

> Hi David,
> 
> I was thinking something like this:
> 
> ID   Disease
> 1 A
> 2 B
> 3 A
> 1C
> 2D
> 5A
> 4B
> 3D
> 2A
> ....
> 
> How can this be done?

 do.call(rbind, lapply(1:20, function(pt) {
     data.frame(patient = pt,
                disease = sample(c('A','B','C','D','E','F'),
                                 pmin(2 + rpois(1, 2), 6)))
 }))

-- 
David.
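
The same idea scaled to the 1000 patients from the original question, with a check of the "at least two diseases" condition. This is a sketch only; the names diseases, n_patients, dat and counts, the column names ID and Disease, and the seed are choices made for illustration:

```r
set.seed(123)   # arbitrary seed, only for reproducibility
diseases   <- c("A", "B", "C", "D", "E", "F")
n_patients <- 1000

dat <- do.call(rbind, lapply(seq_len(n_patients), function(pt) {
    # number of diseases for this patient: at least 2, at most 6
    k <- pmin(2 + rpois(1, 2), length(diseases))
    data.frame(ID = pt, Disease = sample(diseases, k))
}))

# rows per patient; the minimum should be 2 or more
counts <- table(dat$ID)
range(counts)

# 0 means no patient has the same disease listed twice
anyDuplicated(dat)
```

The long format with one (ID, Disease) pair per row is the shape asked for in the follow-up message.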

David Winsemius
Alameda, CA, USA


Re: [R] Simple permutation question

2014-06-25 Thread Ted Harding

I think Robert wants deterministic permutations. In the e1071
package -- load with library(e1071) -- there is a function
permutations():

  Description:
  Returns a matrix containing all permutations of the integers
  '1:n' (one permutation per row).

  Usage:
  permutations(n)

  Arguments:
  n: Number of element to permute.

so, starting with
  x <- c("A","B","C","D","E")
  library(e1071)
  P <- permutations(length(x))

then, for say the 27th of these 120 permutations of x,

  x[P[27,]]

will return it.

Ted.

On 25-Jun-2014 20:38:45 Cade, Brian wrote:
> It is called sample(,replace=F), where the default argument is sampling
> without replacement.
> Try
> x <- c("A","B","C","D","E")
> sample(x)

-
E-Mail: (Ted Harding) 
Date: 25-Jun-2014  Time: 21:55:42
This message was sent by XFMail


[R] Join the TIBCO TERR team at useR 2014!

2014-06-25 Thread Louis Bajuk-Yorgan

The TERR team will be at useR 2014, June 30th to July 3rd, to share
the latest enhancements and news around TIBCO Enterprise Runtime for
R. In addition to providing demos at the TIBCO TERR table in the
exhibition area, some of the senior members of the TERR team will be
presenting at the conference:

Louis Bajuk-Yorgan, Sr. Dir., Product Management: “Deploying R into
Business Intelligence and Real-time Applications”. Business track,
Session 5, Wednesday 16:00

Stephen Kaluzny, Lead Statistician for TERR: “Software Testing and the
R Language”. Business track, Session 6, Thursday 10:00

Michael Sannella, Lead Architect for TERR: “The Compatibility
Challenge: Examining R and Developing TERR”. Posters 1, Tuesday, 17:30


-- 
Lou Bajuk-Yorgan
Sr. Director, Product Management
Spotfire, TIBCO Software
206-802-2328
lba...@tibco.com
Twitter: @LouBajuk
http://spotfire.tibco.com


Re: [R] Simple permutation question

2014-06-25 Thread David L Carlson

Assuming you want all of the permutations, not just a random permutation (such 
as sample() gives you):

> require(e1071)
> indx <- permutations(5)
> head(indx)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    2    1    3    4    5
[3,]    2    3    1    4    5
[4,]    1    3    2    4    5
[5,]    3    1    2    4    5
[6,]    3    2    1    4    5
> tail(indx)
       [,1] [,2] [,3] [,4] [,5]
[115,]    5    4    1    2    3
[116,]    5    4    2    1    3
[117,]    5    4    2    3    1
[118,]    5    4    1    3    2
[119,]    5    4    3    1    2
[120,]    5    4    3    2    1

If you want them converted to letters, try
> apply(ind, 1, function(x) paste(LETTERS[x], collapse=" "))
[1] "A B C D E" "B A C D E" "B C A D E" "A C B D E" "C A B D E" "C B A D E"
> tail(perm.ltrs)
[1] "E D A B C" "E D B A C" "E D B C A" "E D A C B" "E D C A B" "E D C B A"

This is not the only permutation function in R, but this one has the advantage 
of being symmetrical. The last permutation is the reverse of the first, the 
penultimate the reverse of the second, etc.

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352


Re: [R] Simple permutation question

2014-06-25 Thread Bert Gunter

and further...

See ?Recall for how to do recursion in R.

However, it is my understanding that recursion is not that efficient
in R. A chain of function environments must be created, and this does
not scale well. (Comments from real experts welcome here).
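
A small sketch of Recall(), using a factorial rather than the permutation problem; the names fact and g are made up for illustration. Recall() calls the function that is currently executing, so the body keeps working even if the function is later bound to a different name:

```r
# factorial written with Recall(): the body never uses the name "fact"
fact <- function(n) if (n <= 1) 1 else n * Recall(n - 1)
fact(5)   # 120

# rebind the function under another name and remove the original
g <- fact
rm(fact)
g(5)      # still 120
```

As far as I recall, ?Recall also notes that it does not behave as expected when used inside a function passed to the apply family, which matters for an lapply()-based permutation function.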

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll




On Wed, Jun 25, 2014 at 1:38 PM, Cade, Brian  wrote:
> It is called sample(,replace=F), where the default argument is sampling
> without replacement.
> Try
> x <- c("A","B","C","D","E")
> sample(x)

Re: [R] Simple permutation question

2014-06-25 Thread David L Carlson

A couple of typos in the last section. It should be 

> perm.ltrs <- apply(indx, 1, function(x) paste(LETTERS[x], collapse=" "))
> head(perm.ltrs)
[1] "A B C D E" "B A C D E" "B C A D E" "A C B D E" "C A B D E" "C B A D E"
> tail(perm.ltrs)
[1] "E D A B C" "E D B A C" "E D B C A" "E D A C B" "E D C A B" "E D C B A"
>

David C

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of David L Carlson
Sent: Wednesday, June 25, 2014 4:02 PM
To: Cade, Brian; Robert Latest
Cc: r-help@r-project.org
Subject: Re: [R] Simple permutation question

Assuming you want all of the permutations, not just a random permutation (such 
as sample() gives you):

> require(e1071)
> indx <- permutations(5)
> head(indx)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    2    1    3    4    5
[3,]    2    3    1    4    5
[4,]    1    3    2    4    5
[5,]    3    1    2    4    5
[6,]    3    2    1    4    5
> tail(indx)
       [,1] [,2] [,3] [,4] [,5]
[115,]    5    4    1    2    3
[116,]    5    4    2    1    3
[117,]    5    4    2    3    1
[118,]    5    4    1    3    2
[119,]    5    4    3    1    2
[120,]    5    4    3    2    1

If you want them converted to letters, try
> apply(ind, 1, function(x) paste(LETTERS[x], collapse=" "))
[1] "A B C D E" "B A C D E" "B C A D E" "A C B D E" "C A B D E" "C B A D E"
> tail(perm.ltrs)
[1] "E D A B C" "E D B A C" "E D B C A" "E D A C B" "E D C A B" "E D C B A"

This is not the only permutation function in R, but this one has the advantage 
of being symmetrical. The last permutation is the reverse of the first, the 
penultimate the reverse of the second, etc.
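
The conversion step and the symmetry property described above can be sketched in base R. This is only an illustration: the index matrix `idx` is hypothetical and written out by hand for three elements, as a stand-in for the 120-row matrix from permutations(5), so that the sketch runs without e1071 installed.

```r
# Hypothetical index matrix: the six permutations of 1:3, written by hand
# (a stand-in for the matrix that e1071's permutations() would return).
idx <- rbind(c(1, 2, 3),
             c(2, 1, 3),
             c(2, 3, 1),
             c(1, 3, 2),
             c(3, 1, 2),
             c(3, 2, 1))

# Map every row of indices to letters and paste them into one string per row.
ltrs <- apply(idx, 1, function(r) paste(LETTERS[r], collapse = " "))
print(ltrs)

# The symmetry mentioned above: reading row k backwards gives row (n + 1 - k),
# so reversing both the row order and the column order reproduces the matrix.
rev_rows <- idx[nrow(idx):1, ncol(idx):1]
stopifnot(all(idx == rev_rows))
```

The same apply() call should work unchanged on a larger index matrix, as long as the indices do not exceed 26.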

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Cade, Brian
Sent: Wednesday, June 25, 2014 3:39 PM
To: Robert Latest
Cc: r-help@r-project.org
Subject: Re: [R] Simple permutation question

It is called sample(,replace=F), where the default argument is sampling
without replacement.
Try
x <- c("A","B","C","D","E")
sample(x)
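
A small sketch of what sample() returns here, using base R only. The seed value is arbitrary and is there only so that the run is repeatable; note that this gives one random permutation, not the full set.

```r
# One random permutation of the five letters (base R only).
x <- c("A", "B", "C", "D", "E")
set.seed(42)     # arbitrary seed, only for reproducibility
p <- sample(x)   # replace = FALSE is the default, so each letter appears once
print(p)

# The result is a reordering of x: same elements, no repeats.
stopifnot(length(p) == length(x), setequal(p, x), !any(duplicated(p)))
```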

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  ca...@usgs.gov 
tel:  970 226-9326



On Wed, Jun 25, 2014 at 2:22 PM, Robert Latest  wrote:

> So my company has hired a few young McKinsey guys from overseas for a
> couple of weeks to help us with a production line optimization. They
> probably charge what I make in a year, but that's OK because I just
> never have the time to really dive into one particular time, and I have
> to hand it to the consultants that they came up with one or two really
> clever ideas to model the production line. Of course it's up to me to
> feed them the real data which they then churn through their Excel
> models that they cook up during the nights in their hotel rooms, and
> which I then implement back into my experimental system using live data.
>
> Anyway, whenever they need something or come up with something I skip
> out of the room, hack it into R, export the CSV and come back in about
> half the time it takes Excel to even read in the data, let alone
> process it. Of course that got them curious, and I showed off a couple
> of scripts that condense their abysmal Excel convolutions in a few
> lean and mean lines of R code.
>
> Anyway, I'm in my office with this really attractive, clever young
> McKinsey girl (I'm in my mid-forties, married with kids and all, but I
> still enjoyed impressing a woman with computer stuff, of all things!),
> and one of her models involves a simple permutation of five letters --
> "A" through "E".
>
> And that's when I find out that R doesn't have a permutation function.
> How is that possible? R has EVERYTHING, but not that? I'm
> flabbergasted. Stumped. And now it's up to me to spend the evening at
> home coding that model, and the only thing I really need is that
> permutation.
>
> So this is my first attempt:
>
> perm.broken <- function(x) {
> if (length(x) == 1) return(x)
> sapply(1:length(x), function(i) {
> cbind(x[i], perm(x[-i]))
> })
> }
>
> But it doesn't work:
> > perm.broken(c("A", "B", "C"))
>      [,1] [,2] [,3]
> [1,] "A"  "B"  "C"
> [2,] "A"  "B"  "C"
> [3,] "B"  "A"  "A"
> [4,] "C"  "C"  "B"
> [5,] "C"  "C"  "B"
> [6,] "B"  "A"  "A"
> >
>
> And I can't figure out for the life of me why. It should work because I
> go through the elements of x in order, use that in the leftmost column,
> and slap the permutation of the remaining elements to the right. What
> strikes me as particularly odd is that there doesn't even seem to be a
> systematic sequence of letters in any of the columns. OK, since I
> really need that function I wrote this piece of crap:
>
> perm.stupid <- function(x) {
> b <- as.matrix(expand.grid(rep(list(x), length(x))))
> b[!s

Re: [R] Simple permutation question

2014-06-25 Thread Jeff Newmiller

The brokenness of your perm.broken function arises from the attempted use 
of sapply to bind matrices together, which is not something sapply does.


perm.fixed <- function( x ) {
  if ( length( x ) == 1 ) return( matrix( x, nrow=1 ) )
  lst <- lapply( seq_along( x )
   , function( i ) {
   cbind( x[ i ], perm.jdn( x[ -i ] ) )
 }
   )
  do.call(rbind, lst)
}


On Wed, 25 Jun 2014, Robert Latest wrote:


So my company has hired a few young McKinsey guys from overseas for a
couple of weeks to help us with a production line optimization. They
probably charge what I make in a year, but that's OK because I just
never have the time to really dive into one particular time, and I have
to hand it to the consultants that they came up with one or two really
clever ideas to model the production line. Of course it's up to me to
feed them the real data which they then churn through their Excel
models that they cook up during the nights in their hotel rooms, and
which I then implement back into my experimental system using live data.

Anyway, whenever they need something or come up with something I skip
out of the room, hack it into R, export the CSV and come back in about
half the time it takes Excel to even read in the data, let alone
process it. Of course that got them curious, and I showed off a couple
of scripts that condense their abysmal Excel convolutions in a few
lean and mean lines of R code.

Anyway, I'm in my office with this really attractive, clever young
McKinsey girl (I'm in my mid-forties, married with kids and all, but I
still enjoyed impressing a woman with computer stuff, of all things!),
and one of her models involves a simple permutation of five letters --
"A" through "E".

And that's when I find out that R doesn't have a permutation function.
How is that possible? R has EVERYTHING, but not that? I'm
flabbergasted. Stumped. And now it's up to me to spend the evening at
home coding that model, and the only thing I really need is that
permutation.

So this is my first attempt:

perm.broken <- function(x) {
   if (length(x) == 1) return(x)
   sapply(1:length(x), function(i) {
   cbind(x[i], perm(x[-i]))
   })
}

But it doesn't work:

perm.broken(c("A", "B", "C"))

     [,1] [,2] [,3]
[1,] "A"  "B"  "C"
[2,] "A"  "B"  "C"
[3,] "B"  "A"  "A"
[4,] "C"  "C"  "B"
[5,] "C"  "C"  "B"
[6,] "B"  "A"  "A"




And I can't figure out for the life of me why. It should work because I
go through the elements of x in order, use that in the leftmost column,
and slap the permutation of the remaining elements to the right. What
strikes me as particularly odd is that there doesn't even seem to be a
systematic sequence of letters in any of the columns. OK, since I
really need that function I wrote this piece of crap:

perm.stupid <- function(x) {
   b <- as.matrix(expand.grid(rep(list(x), length(x))))
   b[!sapply(1:nrow(b), function(r) any(duplicated(b[r,]))),]
}

It works, but words cannot describe its ugliness. And it gets really
slow really fast with growing x.

So, anyway. My two questions are:
1. Does R really, really, seriously lack a permutation function?
2. OK, stop kidding me. So what's it called?
3. Why doesn't my recursive function work, and what would a
  working version look like?

Thanks,
robert




---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k


Re: [R] Simple permutation question

2014-06-25 Thread Jeff Newmiller


sorry... editing on the fly... try:

perm.fixed <- function( x ) {
  if ( length( x ) == 1 ) return( matrix( x, nrow=1 ) )
  lst <- lapply( seq_along( x )
   , function( i ) {
   cbind( x[ i ], perm.fixed( x[ -i ] ) )
 }
   )
  do.call( rbind, lst )
}
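
A quick check of the corrected function, plus a small illustration of the sapply behaviour that broke the original. This is a sketch in base R; the toy matrices (`pieces`, `flat`, `stacked`) are made up for the demonstration and are not from the original post.

```r
# Copy of the corrected function from the message above.
perm.fixed <- function(x) {
  if (length(x) == 1) return(matrix(x, nrow = 1))
  lst <- lapply(seq_along(x), function(i) cbind(x[i], perm.fixed(x[-i])))
  do.call(rbind, lst)
}

# What sapply does with matrix results: each 2 x 2 matrix is flattened
# column by column and becomes one column of the simplified result.
pieces  <- lapply(1:3, function(i) matrix(i * 10 + 1:4, nrow = 2))
flat    <- sapply(1:3, function(i) matrix(i * 10 + 1:4, nrow = 2))
stacked <- do.call(rbind, pieces)
dim(flat)     # 4 x 3: one flattened matrix per column
dim(stacked)  # 6 x 2: matrices stacked row-wise, shape preserved

# The fixed version returns one permutation per row, n! rows in total.
p <- perm.fixed(c("A", "B", "C"))
print(p)
stopifnot(nrow(p) == factorial(3), !any(duplicated(p)))
```

That flattening is why the columns of the perm.broken output look unsystematic: each column is a whole sub-matrix read off column by column.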


[R] plot a 3-D marked point process

2014-06-25 Thread Ferra Xu

Hello All

I intend to plot a 3-D marked point process. Could you please help me to find a 
code or a related package?

Thanks

[R] How to troubleshoot issue with curve() function?

2014-06-25 Thread Lavrenz, Steven M

I am trying to plot data that I imported with read.table using a Weibull 
distribution. Here's the code I used for importing the data:

data <- read.table(file =
"C:/Users/Steven/OneDrive/Unfiled/Temp/phase2.csv", header = FALSE, sep = ",", 
dec = ".", col.names = c("cycle_time", "cycle_length", "red_time", 
"prot_green", "perm_green", "veh_count", "vc", "max_delay", "prot_occ", 
"perm_occ", "red5_occ", "gor", "ror", "arrival_time"), na.strings = "NA")

And here's my code for plotting:

curve(dexp(data(arrival_time), rate=0.06), from = 0, to = 60, main = 
"Exponential distribution")

I keep getting this error message:

Error in curve(dexp(data(arrival_time), rate = 0.06), from = 0, to = 60,  : 
  'expr' must be a function, or a call or an expression containing 'x'

How do I begin to troubleshoot this? The only thing I can think of is that my 
variable of interest, "arrival_time", does contain a number of rows with 
blanks, but would that be the underlying cause? If so, can I get the function 
to ignore blank rows?


Re: [R] How to troubleshoot issue with curve() function?

2014-06-25 Thread Benno Pütz

I’d start trying to understand the error message - it is actually pretty clear 
(see ?curve).
I suspect curve is not the function you want to use, rather something like

plot(data$arrival_time, dexp(data$arrival_time, rate=0.06), xlim = c(0, 60), 
type='l', main = "Exponential distribution")

assuming your arrival times lie in the specified window.

Just a wild guess though, as you didn’t provide data (see footer)
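
To make the error message concrete: curve() evaluates an expression written in the dummy variable x, so a data column cannot be passed in its place. A minimal sketch, assuming made-up arrival times (the vector `arrival_time` below is hypothetical and stands in for data$arrival_time, with blanks read in as NA):

```r
pdf(NULL)  # draw to a null device so the sketch also runs non-interactively

# curve() wants an expression in the dummy variable x, not a data column.
res <- curve(dexp(x, rate = 0.06), from = 0, to = 60,
             main = "Exponential distribution")

# Hypothetical arrival times standing in for data$arrival_time (blanks as NA).
arrival_time <- c(3.2, NA, 12.5, 40.1, NA, 7.8)
obs <- sort(arrival_time[!is.na(arrival_time)])  # drop NAs, sort for a clean line

# Overlay the density evaluated at the observed times.
lines(obs, dexp(obs, rate = 0.06), type = "b", col = "red")

invisible(dev.off())
```

Dropping the NA rows before plotting is a choice made for this sketch; the blanks are not what triggers the error above.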

On 25 Jun 2014, at 23:02 , Lavrenz, Steven M  wrote:

> I am trying to plot data that I imported with read.table using a Weibull 
> distribution. Here's the code I used for importing the data:
> 
> data <- read.table(file =
> "C:/Users/Steven/OneDrive/Unfiled/Temp/phase2.csv", header = FALSE, sep = 
> ",", dec = ".", col.names = c("cycle_time", "cycle_length", "red_time", 
> "prot_green", "perm_green", "veh_count", "vc", "max_delay", "prot_occ", 
> "perm_occ", "red5_occ", "gor", "ror", "arrival_time"), na.strings = "NA")
> 
> And here's my code for plotting:
> 
> curve(dexp(data(arrival_time), rate=0.06), from = 0, to = 60, main = 
> "Exponential distribution")
> 
> I keep getting this error message:
> 
> Error in curve(dexp(data(arrival_time), rate = 0.06), from = 0, to = 60,  : 
>  'expr' must be a function, or a call or an expression containing 'x'
> 
> How do I begin to troubleshoot this? The only thing I can think of is that my 
> variable of interest, "arrival_time", does contain a number of rows with 
> blanks, but would that be the underlying cause? If so, can I get the function 
> to ignore blank rows?
> 

Re: [R] plot a 3-D marked point process

2014-06-25 Thread Ben Bolker

Ferra Xu writes:

> 
> Hello All
> 
> I intend to plot a 3-D marked point process. 
> Could you please help me to find a code or a related package?

How about scatterplot3d or rgl::plot3d ?
Googling "r 3d point plot" gets you a lot of good starting points.
(The easiest way to indicate the marks would be by modifying colour
according to mark class.)
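
The colour-by-mark idea can be sketched without installing anything, using base graphics (persp() plus trans3d()) as a stand-in for scatterplot3d or rgl. Everything about the data here is an assumption: the points, the three mark classes "a", "b", "c" and the colours are made up for illustration.

```r
set.seed(1)
n <- 60
# Hypothetical marked point pattern: coordinates in the unit cube plus a mark class.
pts <- data.frame(x = runif(n), y = runif(n), z = runif(n),
                  mark = factor(sample(c("a", "b", "c"), n, replace = TRUE)))
# One colour per point, looked up from its mark class.
cols <- c(a = "red", b = "blue", c = "darkgreen")[as.character(pts$mark)]

pdf(NULL)  # null device so the sketch runs non-interactively
# Empty perspective box; persp() returns the viewing transformation matrix.
pmat <- persp(x = c(0, 1), y = c(0, 1), z = matrix(0, 2, 2), zlim = c(0, 1),
              col = NA, border = NA, theta = 30, phi = 20,
              xlab = "x", ylab = "y", zlab = "z", ticktype = "detailed")
# Project the 3-D points onto the 2-D device and colour them by mark class.
xy <- trans3d(pts$x, pts$y, pts$z, pmat)
points(xy, pch = 19, col = cols)
legend("topright", legend = levels(pts$mark),
       col = c("red", "blue", "darkgreen"), pch = 19)
invisible(dev.off())
```

With one of the suggested packages, the same per-point colour vector would be handed to the plotting call in place of the projection step.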


[R] Space-time non-homogeneous Poisson process

2014-06-25 Thread Ferra Xu

I am wondering if I want to model a space-time non-homogeneous Poisson process 
that has more probability of occurrence in some areas in space, how should I do 
that? For example we want to model the intensity function of earthquake 
occurrence that is more probable to happen in some areas (around earthquake 
line) and therefore the intensity function is a function of space and time.


Thanks,
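
One way to write down such an intensity and simulate from it is thinning of a homogeneous Poisson process. The sketch below is only an illustration under stated assumptions: the intensity function `lambda`, its ridge along x = 0.5 (a stand-in for a fault line), the bound `lambda_max` and the unit square and unit time window are all hypothetical choices, not something taken from real earthquake data.

```r
set.seed(123)
# Hypothetical intensity on the unit square x unit time interval: a ridge along
# x = 0.5, modulated slowly in time.
lambda <- function(x, y, t) {
  200 * exp(-(x - 0.5)^2 / 0.02) * (1 + 0.5 * sin(2 * pi * t))
}
lambda_max <- 300  # upper bound on lambda: 200 * 1 * 1.5

# Step 1: homogeneous Poisson process with rate lambda_max on [0, 1]^3.
n <- rpois(1, lambda_max)
cand <- data.frame(x = runif(n), y = runif(n), t = runif(n))

# Step 2: thinning, keep each candidate with probability lambda / lambda_max.
keep <- runif(n) < lambda(cand$x, cand$y, cand$t) / lambda_max
events <- cand[keep, ]

nrow(events)  # number of simulated events
```

The retained events cluster around x = 0.5, which is the spatial inhomogeneity the question describes. Fitting such an intensity to observed data is a separate step that this sketch does not cover.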

