[R] Error in ginv(A) : 'X' must be a numeric or complex matrix

2015-06-14 Thread Matteo Villa
Dear all,

I encountered a problem that has been bugging me for some hours now,
and I still can't come up with a viable solution. I tried searching
the archives, but to no avail, so here I am sending my first e-mail to
the list!
I am estimating a binary spatial autoregressive model via a Gibbs
sampler. When I do this with a neighborhood matrix, everything goes
perfectly fine, but when I switch to a distance matrix, the program
stops almost immediately, and outputs the famous (but hardly ever
occurring):
"Error in ginv(A) : 'X' must be a numeric or complex matrix"

What strikes me as odd is that when I try to compute the generalized
inverse myself, everything goes smoothly, and all the matrixes seem to
be numeric. Below you will find the data and code for the replicable
example.

Data for the replicable example:
https://gist.github.com/lessermatter/66b6488cfe6f5d7893bf

And here is the code (the error occurs after executing the final line,
which is also a toy model to be estimated through the bsar function):
https://gist.github.com/lessermatter/0284be117a19620750aa

Any ideas?

Kind regards,

Matteo Villa
Università degli Studi di Milano
Italy

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Creating list with increasing string lengths

2015-06-14 Thread Jim Lemon
Hi Nia,
Many ways to do something like this, for example:

N<-6
sapply(1:N,function(x) return(sample(1:10,x,TRUE)))

I'll let you work out how to generalize this.

Jim


On Sun, Jun 14, 2015 at 3:06 PM, Bert Gunter  wrote:
> Is this homework? Homework is deprecated here.
>
> ?lapply
>
> is one of many possible approaches. If this is not homework, showing your
> unsuccessful code would likely lead to a better learning experience for you.
>
> Cheers,
> Bert
>
> Bert Gunter
>
> "Data is not information. Information is not knowledge. And knowledge is
> certainly not wisdom."
>-- Clifford Stoll
>
> On Sat, Jun 13, 2015 at 8:04 AM, Nia Gupta via R-help 
> wrote:
>
>> Hello,
>>
>> I am trying to create a list where each name would have an increasing
>> vector length. For example, I am trying to obtain something that looks like
>> this:
>>
>> [[1]][1] 2
>> [[2]]
>> [1] 2 4
>>
>> [[3]]
>> [1] 1 2 3
>> .
>> The numbers generated would just be any random numbers. My thought was to
>> use a for-loop and the sequence function but that doesn't seem to be
>> working. Any help on this would be greatly appreciated.
>>
>> Thank you,
>>
>> Nia
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R 3.2, Mac 10.10.3 : help.search showing error

2015-06-14 Thread David Winsemius

On Jun 13, 2015, at 11:04 PM, Berend Hasselman wrote:

> 
>> On 14-06-2015, at 06:25, Ramnik Bansal  wrote:
>> 
>> Thanks. But it seems to be an R 3.2.0 specific problem.
>> 
> 
> I replied with the following to a similar message on R-devel.

There was no error either with using help() or using the Mac GUI package 
manager. Those were the reported difficulties previously reported.

> 
> 
> See this thread on R-SIG-Mac
> 
> https://stat.ethz.ch/pipermail/r-sig-mac/2015-April/011420.html
> 
> This may help.
> Get R 3.2.0-patched or even the release candidate for R 3.2.1
> 

 The error I am getting was from the most recent R 3.2.0 downloaded yesterday. 
There is no 3.2.0-Patched for Mavericks.  The release candidate, R 3.2.1 RC, is 
marked on the ATT Research webpage as failing Make and it indeed fails to 
launch. I would NOT recommend that anyone accept that advice. 

But maybe it is a Mac-specific problem. When I remove the crippled R 3.2.1 RC 
and reinstall the R 3.2.0 and run from a Terminal window I do not get the 
help.search() error. So copying to R SIG Mac, and will not copy R-help on any 
further efforts.

-- 
David.



> 
> Berend
> 
> 
>> On Sun, Jun 14, 2015 at 8:42 AM, David Winsemius 
>> wrote:
>> 
>>> 
>>> On Jun 13, 2015, at 7:41 AM, Ramnik Bansal wrote:
>>> 
 Getting following error in using help.search
 
> utils::help.search("linear models")
 Error in help(db[i, "topic"], package = db[i, "Package"], lib.loc =
>>> lib,  :
 'topic' should be a name, length-one character vector or reserved word
>>> 
>>> I first tried this in a Mac running the SL version for R 3.1.2 and did not
>>> get this error. I updated my Mavericks laptop to R 3.2.0 and can now
>>> reproduce this error. It does not seem to depend on having a space in the
>>> argument. It seems to be thrown by this segment of code in the
>>> `help()`-function:
>>> 
>>>  ischar <- tryCatch(is.character(topic) && length(topic) ==
>>>  1L, error = identity)
>>>  if (inherits(ischar, "error"))
>>>  ischar <- FALSE
>>>  if (!ischar) {
>>>  reserved <- c("TRUE", "FALSE", "NULL", "Inf", "NaN",
>>>  "NA", "NA_integer_", "NA_real_", "NA_complex_", "NA_character_")
>>>  stopic <- deparse(substitute(topic))
>>>  if (!is.name(substitute(topic)) && !stopic %in% reserved)
>>>  stop("'topic' should be a name, length-one character vector or
>>> reserved word")
>>> 
>>> If gone through the `help.search` function code and cannot find where the
>>> `help` function is actually called. This seems unlikely to be a
>>> Mac-specific problem.
>>> 
 
 
> example(help.search)
 
 hlp.sr> help.search("linear models")# In case you forgot how to fit
 linear
 Error in help(db[i, "topic"], package = db[i, "Package"], lib.loc =
>>> lib,  :
 'topic' should be a name, length-one character vector or reserved word
 
 
 How to sort this?
 
 [[alternative HTML version deleted]]
>>> 
>>> David Winsemius
>>> Alameda, CA, USA
>>> 
>>> 
>> 
>>  [[alternative HTML version deleted]]
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with abs function

2015-06-14 Thread Andrés Aragón Martínez
Hi,

Just do the following:


> tran<-c(7.2)
> tgrid<-c(7.1,7.4,7.3,7.1,7.3)
> tgrid<-tgrid-tran
> tgrid
[1] -0.1  0.2  0.1 -0.1  0.1
> abs(tgrid[tgrid>0.1])
[1] 0.2

Andrés



> El 12/06/2015, a las 11:01, Jeff Newmiller  
> escribió:
> 
> FAQ 7.31
> ---
> Jeff NewmillerThe .   .  Go Live...
> DCN:Basics: ##.#.   ##.#.  Live Go...
>  Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
> /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
> --- 
> Sent from my phone. Please excuse my brevity.
> 
> On June 12, 2015 8:39:48 AM PDT, "Nelson, Gary (MISC)" 
>  wrote:
>> I have come across some odd behavior (to me) using the abs() function
>> that I wonder if anyone can explain.
>> 
>> Using  R version 3.2.0, I created a vector of absolute values using 
>> the following code:
>> 
>>> tran<-c(7.2)
>>> tgrid<-c(7.1,7.4,7.3,7.1,7.3)
>>> dgrid<-abs(tgrid-tran)
>>> dgrid
>> [1] 0.1 0.2 0.1 0.1 0.1
>> 
>> When I tried to extract just the rows with values>0.1, I get
>> 
>>> dgrid[dgrid>0.1]
>> [1] 0.1 0.2 0.1
>> 
>> There should be only 1 value extracted.
>> 
>> However, if I enter the values by hand
>> 
>>> bgrid<-c(0.1,0.2,0.1,0.1,0.1)
>>> bgrid
>> [1] 0.1 0.2 0.1 0.1 0.1
>>> bgrid[bgrid>0.1]
>> [1] 0.2
>> 
>> The result is correct.  So why is this happening?
>> 
>> I did explore a little bit and found
>> 
>>> as.character(dgrid)
>> [1] "0.101"  "0.2"
>> [3] "0.0996" "0.101"
>> [5] "0.0996"
>> 
>> which shows the absolute values of the negative differences  are
>> slightly greater than the difference of 0.1
>> 
>> Is this normal behavior for the abs() function?
>> 
>> Thanks,
>> 
>> Gary Nelson
>> 
>>  [[alternative HTML version deleted]]
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] most frequent value

2015-06-14 Thread Ragia Ibrahim
Dear group,
I have the following integer object

> a 

3 4 6 
3 3 6 

how to get the most frequent value (it should be 3)

and get nothing if no frequent one and all is equal

3 4 6 
3 4 6 
Thanks in advance
Ragia
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] most frequent value

2015-06-14 Thread Jim Lemon
Hi Ragia,
The basic method is to use "table" and examine the result:

> a<-matrix(c(3,3,4,3,6,6),nrow=2)
> b<-matrix(c(3,3,4,4,6,6),nrow=2)
> table(a)
a
3 4 6
3 1 2

but you can get the modal value directly like this:

library(prettyR)
Mode(a)
[1] "3"
Mode(b)
[1] ">1 mode"

Jim

On Mon, Jun 15, 2015 at 9:04 AM, Ragia Ibrahim  wrote:
> Dear group,
> I have the following integer object
>
>> a
>
> 3 4 6
> 3 3 6
>
> how to get the most frequent value (it should be 3)
>
> and get nothing if no frequent one and all is equal
>
> 3 4 6
> 3 4 6
> Thanks in advance
> Ragia
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] matrix/df help populate NA

2015-06-14 Thread jim holtman
Is this what you want:

> x1 = structure(list(Subject = c("x1", "x2"), A = c(1.5, -1.2), B = c(-1.3,
+ -0.3), C = c(0.4, 0.3), D = c(-0.2, -0.1)), .Names = c("Subject",
+ "A", "B", "C", "D"), class = "data.frame", row.names = c(NA,
+ -2L))
>
> x2 = structure(list(Subject = c("x1", "x2"), A = c(4.3, 2.4), D = c(-2.4,
+ 0.1), F = c(1.3, 0.5), H = c(-2.3, -1.4)), .Names = c("Subject",
+ "A", "D", "F", "H"), class = "data.frame", row.names = c(NA,
+ -2L))
>
> # determine what the missing columns are and then add them to x2
> missing <- setdiff(colnames(x1), colnames(x2))
>
> new_x2 <- x2
>
> for (i in missing) new_x2[[i]] <- NA
>
> new_x2
  Subject   AD   FH  B  C
1  x1 4.3 -2.4 1.3 -2.3 NA NA
2  x2 2.4  0.1 0.5 -1.4 NA NA





Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Sat, Jun 13, 2015 at 11:17 PM, Adrian Johnson 
wrote:

> Dear group:
>
> I have two data frames. The column names of the two data frame has
> some common variables but not identical.
>
> my aim is to make 2  DFs more uniform by taking union of both colnames
>
>
> For example: I have x1 and x2 matrices:
>
> > x1
>   SubjectAB   CD
> 1  x1  1.5 -1.3 0.4 -0.2
> 2  x2 -1.2 -0.3 0.3 -0.1
> > x2
>   Subject   AD   FH
> 1  x1 4.3 -2.4 1.3 -2.3
> 2  x2 2.4  0.1 0.5 -1.4
>
>  cases = c('A','B','C','D','F','H')
>
> for X2 I want to create newX2 DF.
>
> > x3
>   Subject   A  B  CD   FH
> 1  x1 4.3 NA NA -2.4 1.3 -2.3
> 2  x2 2.4 NA NA  0.1 0.5 -1.4
>
>
> Since B and C are no existing in x2, I put NAs.
>
> how can I create x3 matrix?
>
>
>
> dput code:
>
> x1 = structure(list(Subject = c("x1", "x2"), A = c(1.5, -1.2), B = c(-1.3,
> -0.3), C = c(0.4, 0.3), D = c(-0.2, -0.1)), .Names = c("Subject",
> "A", "B", "C", "D"), class = "data.frame", row.names = c(NA,
> -2L))
>
> x2 = structure(list(Subject = c("x1", "x2"), A = c(4.3, 2.4), D = c(-2.4,
> 0.1), F = c(1.3, 0.5), H = c(-2.3, -1.4)), .Names = c("Subject",
> "A", "D", "F", "H"), class = "data.frame", row.names = c(NA,
> -2L))
>
>
> Could you please help how to create x3 with NAs incorporated.
> adrian.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Scatterplot : smoothing colors according to density of points

2015-06-14 Thread jim holtman
check out the 'hexbin' package for making scatter plots that have a lot of
points overlapping in a small area.


Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Tue, Jun 2, 2015 at 9:51 AM, Adams, Jean  wrote:

> Try this.
>
> Jean
>
> D <- structure(list(
>   id = structure(1:6, .Label = c("O13297", "O13329", "O13525",
> "O13539", "O13541", "O13547"), class = "factor"),
>   X = c(44.44, 31.272085, 6.865672, 14.176245, 73.275862,
> 28.991597),
>   Y = c(21.6122, 4.0159, 2.43884, 7.81217, 3.59012, 258.999)),
>   .Names = c("id", "X", "Y"), class = "data.frame",
>   row.names = c("1", "2", "3", "4", "5", "6"))
>
> # define the number of colors
> ncol <- 100
> # define the radius of the neighborhood
> distcut <- 30
> pal <- colorRampPalette(c("blue", "yellow", "red"))(ncol)
>
> # calculate the euclidean distance between all pairs of points, based on X,
> Y coordinates
> Ddist <- with(D, as.matrix(dist(cbind(X, Y), diag=TRUE, upper=TRUE)))
> # count up the number of neighbors within distcut distance of each point
> D$C <- apply(Ddist # use this count to define the levels (which will be then used to color
> points in the plot
> D$Clevels <- with(D,
>   cut(C, breaks=seq(min(C), max(C), length.out=ncol+1),
> labels=FALSE, include.lowest=TRUE))
>
> # plot the data
> with(D, plot(X, Y, col=pal[Clevels], log="y", pch=16))
>
>
>
> On Tue, Jun 2, 2015 at 5:37 AM, Benjamin Dubreuil <
> benjamin.dubre...@weizmann.ac.il> wrote:
>
> > Hello everyone,
> >
> > I have a data frame D with 4 columns id,X,Y,C.
> > I want to plot a simple scatter plot of D$X vs. D$Y and using D$C values
> > as a color. (id is just a text string not used for the plot)
> >
> > But actually, I don't want to use the raw values of D$C, I would prefer
> to
> > calculate the average values of D$C according to the density of points
> in a
> > fixed neighborhood.
> > In other words, I would like to smooth the colors according to the
> density
> > of points.
> >
> > I am looking for any function,package that could solve this.
> > So far, I've been looking at library MASS and the function kde2d which
> can
> > calculate the density of points in 2 directions, but I don't see how I
> > could then use this information to recalculate my D$C values.
> >
> > Here is a piece of the matrix :
> >  > head(D)
> >   id X YC
> > 1 O13297 44.44  21.61220 -0.136651639
> > 2 O13329 31.272085   4.01590 -0.117016949
> > 3 O13525  6.865672   2.43884 -0.161173913
> > 4 O13539 14.176245   7.81217 -0.075756757
> > 5 O13541 73.275862   3.59012 -0.006988235
> > 6 O13547 28.991597 258.99900 -0.013985507
> >
> > > dim(D)
> > [1] 36164
> >
> > > apply(D[,-1],2,range)
> >X  Y  C
> > [1,]   0.3378378 0.0003 -0.738
> > [2,] 100.000 24556.4000  0.5582500
> > (Y is not linear, so I use log='y' in the plot function)
> >
> > I used a palette of 100 colors ranging from Blue to Yellow to red.
> > >pal =  colorRampPalette(c("blue","yellow","red"))(100)
> >
> > To make D$C values correspond to a color, I used a cut with the following
> > breaks (101 breaks from -1.2 to 1.2):
> > > BREAKS
> >   [1] -1.2000 -0.8000 -0.4000 -0.3600 -0.3200 -0.2800 -0.2400 -0.2000
> > -0.1925
> >  [10] -0.1850 -0.1775 -0.1700 -0.1625 -0.1550 -0.1475 -0.1400 -0.1368
> > -0.1336
> >  [19] -0.1304 -0.1272 -0.1240 -0.1208 -0.1176 -0.1144 -0.1112 -0.1080
> > -0.1048
> >  [28] -0.1016 -0.0984 -0.0952 -0.0920 -0.0888 -0.0856 -0.0824 -0.0792
> > -0.0760
> >  [37] -0.0728 -0.0696 -0.0664 -0.0632 -0.0600 -0.0568 -0.0536 -0.0504
> > -0.0472
> >  [46] -0.0440 -0.0408 -0.0376 -0.0344 -0.0312 -0.0280 -0.0248 -0.0216
> > -0.0184
> >  [55] -0.0152 -0.0120 -0.0088 -0.0056 -0.0024  0.0008  0.0040  0.0072
> > 0.0104
> >  [64]  0.0136  0.0168  0.0200  0.0232  0.0264  0.0296  0.0328  0.0360
> > 0.0392
> >  [73]  0.0424  0.0456  0.0488  0.0520  0.0552  0.0584  0.0616  0.0648
> > 0.0680
> >  [82]  0.0712  0.0744  0.0776  0.0808  0.0840  0.0872  0.0904  0.0936
> > 0.0968
> >  [91]  0.1000  0.1250  0.1500  0.1750  0.2000  0.2250  0.2500  0.4875
> > 0.7250
> > [100]  0.9625  1.2000
> > > C.levels = as.numeric(cut(D$C,breaks=BREAKS))
> > >length(C.levels)
> > [1] 3616
> >
> > C.levels ranges from 2 to 98 and then to plot the colors I used
> > pal[C.levels].
> > > plot( x=D$x, y=D$Y, col=pal[ C.levels ],log='y')
> >
> >
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE a

Re: [R] Milisecond problem in as.POSIXct ?

2015-06-14 Thread Joshua Ulrich
On Wed, Jun 10, 2015 at 8:05 PM, Joshua Ulrich  wrote:
>
> This is known behavior with how POSIXt objects are printed.  See the 
> discussion on StackOverflow: 
> http://stackoverflow.com/questions/7726034/how-r-formats-posixct-with-fractional-seconds
>
To summarize the relevant portion of the discussion on StackOverflow:

I believe the behavior is due to truncating the fractional seconds
instead of rounding, combined with the floating point representation
of the POSIXct time (though I would need to take a closer look at
do_formatPOSIXlt to verify).  If you look at the underlying double
value of the POSIXct object, you can see why the printed value in the
first example below is 0.2s and the printed value for the second
example is 0.4s.

R> sprintf("%20.10f", as.POSIXct('2011-10-11 07:49:36.3', tz="UTC"))
[1] "1318319376.299523"
R> sprintf("%20.10f", as.POSIXct('2011-10-11 07:49:36.4', tz="UTC"))
[1] "1318319376.400954"

> On Wed, Jun 10, 2015 at 7:41 PM, ce  wrote:
>>
>> Dear all,
>>
>> my main problem is with miliseconds. I have an array :
>>
>> library(xts)
>> options(digits.secs = 3)
>> > x
>> [1] "2015-06-10 10:22:06.389 EDT" "2015-06-10 10:22:07.473 EDT"
>> [3] "2015-06-10 10:22:08.717 EDT" "2015-06-10 10:22:09.475 EDT"
>>
>> > x[1]
>> [1] "2015-06-10 10:22:06.38 EDT"
>> > x[2]
>> [1] "2015-06-10 10:22:07.473 EDT"
>>
>> why it cuts last digit of miliseconds 389 to 38 ? ( it doesn't cut 473 !! )
>>
>> I try to dump it to post here:
>>
>> > dump("x",file=stdout())
>>
>> x <-
>> structure(c(1433946126.39, 1433946127.474, 1433946128.717, 1433946129.476
>> ), tzone = "", tclass = c("POSIXct", "POSIXt"), class = c("POSIXct",
>> "POSIXt"))
>>
>> new array becomes :
>>
>> > x
>> [1] "2015-06-10 10:22:06.390 EDT" "2015-06-10 10:22:07.473 EDT"
>> [3] "2015-06-10 10:22:08.717 EDT" "2015-06-10 10:22:09.476 EDT"
>>
>> this time first milisecond 389 became 390 ?  and last element 475 became 476 
>> ?
>>
>> I do some more tests :
>>
>> as.POSIXct("2015-06-10 10:22:07.473",format='%Y-%m-%d %H:%M:%OS')
>> [1] "2015-06-10 10:22:07.473 EDT"
>>
>> is correct, but :
>>
>> as.POSIXct("2015-06-10 10:22:06.389",format='%Y-%m-%d %H:%M:%OS')
>> [1] "2015-06-10 10:22:06.388 EDT"
>>
>> why miliseconds turn to 388 instead of 389 ?
>>
>> or
>>
>>  as.POSIXct("2015-06-10 10:22:07.478",format='%Y-%m-%d %H:%M:%OS')
>> [1] "2015-06-10 10:22:07.477 EDT"
>>
>>  why it shows 477 instead of 478
>>
>> > sessionInfo()
>> R version 3.2.0 (2015-04-16)
>> Platform: x86_64-suse-linux-gnu (64-bit)
>> Running under: openSUSE 13.2 (Harlequin) (x86_64)
>>
>> locale:
>>  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
>>  [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
>>  [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8
>>  [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
>>  [9] LC_ADDRESS=C   LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics  grDevices utils datasets  methods   base
>>
>> other attached packages:
>> [1] xts_0.9-7  zoo_1.7-12
>>
>> loaded via a namespace (and not attached):
>> [1] tools_3.2.0 grid_3.2.0  lattice_0.20-31
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

>
> --
> Joshua Ulrich  |  about.me/joshuaulrich
> FOSS Trading  |  www.fosstrading.com


-- 
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Different behavior of model.matrix between R 3.2 and R3.1.1

2015-06-14 Thread Frank Harrell
Terry - your example didn't demonstrate the problem because the variable 
that interacted with strata (zed) was not a factor variable.


But I had stated the problem incorrectly.  It's not that there are too 
many strata terms; there are too many non-strata terms when the variable 
interacting with the stratification factor is a factor variable.  Here 
is a simple example, where I have attached no packages other than the 
basic startup packages.


strat <- function(x) x
d <- expand.grid(a=c('a1','a2'), b=c('b1','b2'))
d$y <- c(1,3,2,4)
f <- y ~ a * strat(b)
m <- model.frame(f, data=d)
Terms <- terms(f, specials='strat', data=d)
specials <- attr(Terms, 'specials')
temp <- survival:::untangle.specials(Terms, 'strat', 1)
Terms <- Terms[- temp$terms]
model.matrix(Terms, m)

  (Intercept) aa2 aa1:strat(b)b2 aa2:strat(b)b2
1   1   0  0  0
2   1   1  0  0
3   1   0  1  0
4   1   1  0  1
. . .

The column corresponding to a='a1' b='b2' should not be there 
(aa1:strat(b)b2).


This does seem to be a change in R.  Any help appreciated.

Note that after subsetting out strat terms using Terms[ - temp$terms], 
Terms attributes factor and term.labels are:


attr(,"factors")
 a a:strat(b)
y0  0
a1  2
strat(b) 0  1
attr(,"term.labels")
[1] "a"  "a:strat(b)"


Frank


On 06/11/2015 08:44 AM, Therneau, Terry M., Ph.D. wrote:

Frank,
   I'm not sure what is going on.  The following test function works for
me in both 3.1.1 and 3.2, i.e, the second model matrix has fewer
columns.  As I indicated to you earlier, the coxph code removes the
strata() columns after creating X because I found it easier to correctly
create the assign attribute.

   Can you create a worked example?

require(survival)
testfun <- function(formula, data) {
 tform <- terms(formula, specials="strata")
 mf <- model.frame(tform, data)

 terms2 <- terms(mf)
 strat <- untangle.specials(terms2, "strata")
 if (length(strat$terms)) terms2 <- terms2[-strat$terms]
 X <- model.matrix(terms2, mf)
 X
}

tdata <- data.frame(y= 1:10, zed = 1:10, grp =
factor(c(1,1,1,2,2,2,1,1,3,3)))

testfun(y ~ zed*grp, tdata)

testfun(y ~ strata(grp)*zed, tdata)


Terry T.

- original message --

For building design matrices for Cox proportional hazards models in the
cph function in the rms package I have always used this construct:

Terms <- terms(formula, specials=c("strat", "cluster", "strata"),
data=data)
specials <- attr(Terms, 'specials')
stra<- specials$strat
Terms.ns <- Terms
  if(length(stra)) {
temp <- untangle.specials(Terms.ns, "strat", 1)
Terms.ns <- Terms.ns[- temp$terms]#uses [.terms function
  }
X <- model.matrix(Terms.ns, X)[, -1, drop=FALSE]

The Terms.ns logic removes stratification factor "main effects" so that
if a stratification factor interacts with a non-stratification factor,
only the interaction terms are included, not the strat. factor main
effects. [In a Cox PH model stratification goes into the nonparametric
survival curve part of the model].

Lately this logic quit working; model.matrix keeps the unneeded main
effects in the design matrix.  Does anyone know what changed in R that
could have caused this, and possibly a workaround?


---




--

Frank E Harrell Jr  Professor and Chairman  School of 
Medicine
Department of *Biostatistics*   *Vanderbilt University*

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] matrix/df help populate NA

2015-06-14 Thread Adrian Johnson
Thank you very much. It worked!


On Sun, Jun 14, 2015 at 8:00 PM, jim holtman  wrote:
> Is this what you want:
>
>> x1 = structure(list(Subject = c("x1", "x2"), A = c(1.5, -1.2), B = c(-1.3,
> + -0.3), C = c(0.4, 0.3), D = c(-0.2, -0.1)), .Names = c("Subject",
> + "A", "B", "C", "D"), class = "data.frame", row.names = c(NA,
> + -2L))
>>
>> x2 = structure(list(Subject = c("x1", "x2"), A = c(4.3, 2.4), D = c(-2.4,
> + 0.1), F = c(1.3, 0.5), H = c(-2.3, -1.4)), .Names = c("Subject",
> + "A", "D", "F", "H"), class = "data.frame", row.names = c(NA,
> + -2L))
>>
>> # determine what the missing columns are and then add them to x2
>> missing <- setdiff(colnames(x1), colnames(x2))
>>
>> new_x2 <- x2
>>
>> for (i in missing) new_x2[[i]] <- NA
>>
>> new_x2
>   Subject   AD   FH  B  C
> 1  x1 4.3 -2.4 1.3 -2.3 NA NA
> 2  x2 2.4  0.1 0.5 -1.4 NA NA
>
>
>
>
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>
> On Sat, Jun 13, 2015 at 11:17 PM, Adrian Johnson 
> wrote:
>>
>> Dear group:
>>
>> I have two data frames. The column names of the two data frame has
>> some common variables but not identical.
>>
>> my aim is to make 2  DFs more uniform by taking union of both colnames
>>
>>
>> For example: I have x1 and x2 matrices:
>>
>> > x1
>>   SubjectAB   CD
>> 1  x1  1.5 -1.3 0.4 -0.2
>> 2  x2 -1.2 -0.3 0.3 -0.1
>> > x2
>>   Subject   AD   FH
>> 1  x1 4.3 -2.4 1.3 -2.3
>> 2  x2 2.4  0.1 0.5 -1.4
>>
>>  cases = c('A','B','C','D','F','H')
>>
>> for X2 I want to create newX2 DF.
>>
>> > x3
>>   Subject   A  B  CD   FH
>> 1  x1 4.3 NA NA -2.4 1.3 -2.3
>> 2  x2 2.4 NA NA  0.1 0.5 -1.4
>>
>>
>> Since B and C are no existing in x2, I put NAs.
>>
>> how can I create x3 matrix?
>>
>>
>>
>> dput code:
>>
>> x1 = structure(list(Subject = c("x1", "x2"), A = c(1.5, -1.2), B = c(-1.3,
>> -0.3), C = c(0.4, 0.3), D = c(-0.2, -0.1)), .Names = c("Subject",
>> "A", "B", "C", "D"), class = "data.frame", row.names = c(NA,
>> -2L))
>>
>> x2 = structure(list(Subject = c("x1", "x2"), A = c(4.3, 2.4), D = c(-2.4,
>> 0.1), F = c(1.3, 0.5), H = c(-2.3, -1.4)), .Names = c("Subject",
>> "A", "D", "F", "H"), class = "data.frame", row.names = c(NA,
>> -2L))
>>
>>
>> Could you please help how to create x3 with NAs incorporated.
>> adrian.
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.