[R] hourly prediction time series

2016-02-05 Thread AURORA GONZALEZ VIDAL

Dear R users,

I am facing my first time series problem. I have hourly temperature data
for 3 years (from 01/01/2013 to 5/02/2016). I would like to use them to
PREDICT THE TEMPERATURE OF THE NEXT HOURS from the
observations.

A subset of the data looks like this:

# each = 24 repeats every date for its 24 hourly rows; a four-digit year parses as 2014
date <- rep(seq(as.Date("2014-01-01"), as.Date("2014-01-03"), by="days"), each=24)
hour <- rep(c(paste0("0",0:9,":00:00"), paste0(10:23,":00:00")),3)
temperature <- c(6.1, 6.8, 6.5, 7.2, 7.1, 7.9, 5.9, 6.8, 7.7, 9.5, 12.6,
 14.0, 15.9, 17.3, 17.5, 17.2, 15.0, 14.1,
13.1, 11.7, 10.9,
 11.0, 11.6, 11.0, 11.2, 11.0, 11.0, 11.4,
12.2, 13.7, 12.9,
 12.9, 12.8, 13.4, 13.9, 14.9, 16.6, 16.0,
15.2, 15.4, 14.7,
 14.6, 13.3, 13.0, 13.8, 13.1, 12.0, 11.9,
11.8, 11.6, 11.0,
 11.2, 11.6, 10.6, 9.5, 9.8, 9.9, 11.7,
15.3, 18.6, 20.7,
 22.2, 22.2, 20.8, 20.2, 18.3, 15.6, 13.6,
12.8, 13.1, 13.7, 14.7)

dfExample <- data.frame(date, hour, temperature) 

To plot 3 years (from 01/01/2013 to 31/12/2015) I used this code and
obtained the attached picture. Seasonality is visible.

tempdf4 <- ts(df4$temperature, frequency=365*24*3)
plot.ts(tempdf4)

Am I doing this correctly? Could you give me any pointers on this type of
problem (mainly the prediction)? For example, if I want to use Arima,
what should the arguments of the function be, given my data structure?

fit=Arima(df4$temperature, seasonal=list(order=c(xxx,xxx,xxx),period=xxx))
plot(forecast(fit))

I could also use some predictions from another source that I have been
collecting since January 2016. But I would prefer to understand the simplest
way to solve the problem first and then, progressively, more complex
approaches.

Thank you very much for any kind of help.


--
Aurora González Vidal
Phd student in Data Analytics for Energy Efficiency

Faculty of Computer Sciences
University of Murcia

@. aurora.gonzal...@um.es
T. 868 88 7866
www.um.es/ae
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] hourly prediction time series

2016-02-05 Thread Sean Porter
Try the auto.arima function in the forecast package.
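
A minimal sketch of this suggestion, assuming simulated hourly data in place of df4$temperature and a daily cycle (frequency = 24, the number of observations per seasonal period, rather than 365*24*3). The fallback model orders are illustrative assumptions and are used only if the forecast package is not installed:

```r
# Simulated two weeks of hourly temperatures (an assumption; use df4$temperature instead)
set.seed(1)
hours <- seq_len(24 * 14)
temp <- 12 + 5 * sin(2 * pi * (hours - 9) / 24) + rnorm(length(hours), sd = 0.8)

# frequency = observations per seasonal cycle: 24 for a daily pattern in hourly data
y <- ts(temp, frequency = 24)

if (requireNamespace("forecast", quietly = TRUE)) {
  # auto.arima() searches over (seasonal) ARIMA orders by information criterion
  fit <- forecast::auto.arima(y)
  pred <- as.numeric(forecast::forecast(fit, h = 24)$mean)
} else {
  # Fallback with hand-picked orders (assumption), using only base R
  fit <- arima(y, order = c(1, 0, 0),
               seasonal = list(order = c(0, 1, 1), period = 24))
  pred <- as.numeric(predict(fit, n.ahead = 24)$pred)
}

# Point forecasts for the next 24 hours
print(round(pred, 1))
```

If forecast is installed, plot(forecast::forecast(fit, h = 24)) would also draw the prediction intervals.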

Regards,
 
DR SEAN PORTER
Scientist

South African Association for Marine Biological Research
Direct Tel: +27 (31) 328 8169   Fax: +27 (31) 328 8188
E-mail: spor...@ori.org.za Web: www.saambr.org.za
1 King Shaka Avenue, Point, Durban 4001 KwaZulu-Natal South Africa
PO Box 10712, Marine Parade 4056 KwaZulu-Natal South Africa

 



Re: [R] hourly prediction time series

2016-02-05 Thread Giorgio Garziano
Some good references:

https://www.otexts.org/fpp

http://link.springer.com/book/10.1007%2F978-0-387-88698-5

http://www.statoek.wiso.uni-goettingen.de/veranstaltungen/zeitreihen/sommer03/ts_r_intro.pdf


Best,

--

GG



Re: [R] R project and the TPP

2016-02-05 Thread John Logsdon
Folks

TPP, and in a European context, TTIP are very dangerous not only to open
source software but to any public service and no satisfactory response has
been forthcoming.

There are ways of circumventing it, I guess, or opposing it (maybe using
ISPs in China, Russia or North Korea???).  The issue really should be
reversed - how much open source coding has found its way into closed
source software?  We do not know because proprietary coding is secret, and
hence insecure.  Perhaps a court could rule that all software should be
available for inspection by independent experts.  This possibility may be
sufficient to shut TI etc up.

But this seems to have been put together in total secrecy and undermines
pretty nearly every 'freedom' people have fought for since at least King
John and probably others (not that English peasants enjoyed too much
freedom after 1215 as it was the barons who got it all!)

I really do not understand why legislators have done this unless
corruption has become so pervasive that there are no longer any good guys
and girls around (well, maybe Bernie Sanders and Jeremy Corbyn excepted
but their chance of power is pretty slim at the moment despite Iowa).

In the UK we have a referendum on EU membership which under ordinary
circumstances I would automatically support as very much a pro-Europe
person.  But if TTIP is implemented, I don't know which way to vote.  Of
course it is a total sham anyway, so maybe a bloody nose for the
legislators would not be a bad idea.  And looking at the way the EU has
treated Greece, Cyprus, Ireland, Portugal, I don't hold out much hope for
an epiphany.

Anyway this is a bit OT. :)


>
>
> On 2/4/2016 6:59 PM, David Winsemius wrote:
>>> On Feb 4, 2016, at 3:15 PM, Rolf Turner 
>>> wrote:
>>>
>>>
>>>
>>> Quite a while ago I went to talk (I think it may have been at an NZSA
>>> conference) given by the great Ross Ihaka.  I forget the details but my
>>> vague recollection was that it involved a technique for automatic
>>> choice of some sort of smoothing parameter involved in a graphical
>>> display.
>> Identifying discontinuities:
>>
>> https://www.stat.auckland.ac.nz/~ihaka/downloads/Curves.pdf
>>
>> http://www.google.com/patents/US6704013
>>
>> TI can now own analytic geometry if they file enough patents.
>
>And TI could therefore under TPP demand that any Internet Service
> Provider remove any R content (or R generated content) that they claimed
> (correctly or otherwise) infringed on their intellectual property,
> without a court order, and with common citizens having only slightly
> more ability to seek redress than the British peasants had when their
> nobility got King John of England to sign the Magna Carta on 15 June 1215?
>
>
>And, of course, this is only one concrete example.
>
>
>More relevant, TPP might prohibit any government from promoting
> the use of open-source software, because it could deprive a for-profit
> company of income, and they could therefore sue for lost profit under
> the Investor-State Dispute Settlement (ISDS) provisions of
> the TPP or other "free trade" agreements like NAFTA.  This is hardly far
> fetched:  Last Dec. 21, the U.S. Congress decided that consumers in the
> U.S. did not have the right to know the origins of the meat they buy
> under NAFTA (Scott Smith, "Congress repeals country of origin labeling
> for meat", United Press International, Dec. 21, 2015 at 10:12 AM,
> http://www.upi.com/Top_News/US/2015/12/21/Congress-repeals-country-of-origin-labeling-for-meat/3241450709277/).
>
>
>
>Spencer Graves
>
>


Best wishes

John

John Logsdon
Quantex Research Ltd
+44 161 445 4951/+44 7717758675



Re: [R] R project and the TPP

2016-02-05 Thread Boris Steipe
Does that mean I could poison the Archive by posting IP on this list, or poison 
someone else's code if they use some of mine that I post here?

B.



On Feb 4, 2016, at 7:48 PM, Spencer Graves 
 wrote:

>  It's not clear if the TPP would ever directly impact the R project.  
> However, it could impact many R users.
> 
> 
>* For example, if someone decides that something you have on the 
> web includes material for which they claim copyright, the TPP allows them to 
> order your Internet Service Provider to take down your web site.   No court 
> order is required.  No proof is required.  If you want to contest, the 
> dispute might go through the "Investor-State Dispute Settlement" process, 
> where the issue will be judged by people essentially selected by 
> multinational businesses. (Article 18, Section J.)  [Philip Morris Tobacco 
> Company has already sued Uruguay, Australia and Norway over packaging 
> requirements that have actually been effective in reducing tobacco consumption 
> in those countries.  Former New York City Mayor Bloomberg has been paying 
> legal fees for Uruguay, because they can't afford the legal fees.  Tobacco is 
> explicitly excluded from the TPP, but similar suits could be brought over 
> other types of products or services.]  This could include some algorithm 
> you've coded into R, if some company decides you're using their copyrighted 
> algorithm or whatever without paying for it.  Current US copyright law covers 
> "derivative works", which could be almost anything.  This sounds far fetched. 
>  However, the Recording Industry Association of America (RIAA) sued four 
> college students for close to $100 billion, because their improvements of 
> search engines made it easier for people in a university intranet to find 
> copyrighted music placed by others in their "public" folder.  The attorney 
> uncle of one of those four told his nephew that it would cost him a million 
> dollars to defend himself, and there would be no way he could recoup that 
> money even if he won.  In negotiations, they asked the student how much money 
> he had.  He said he had saved $12,000 for college.  They took it. Major media 
> organizations similarly sued Venture Capitalists who funded Napster and 
> Lawyers who advised MP3.com that they had reasonable grounds to believe that MP3's 
> business model was legal.  The Napster funders and MP3 lawyers similarly knew 
> they could not afford to defend themselves and settled.  [Wikipedia, "Free 
> Culture (book)";  https://en.wikipedia.org/wiki/Free_Culture_(book)]
> 
> 
>  Other parts of the TPP are highly undemocratic but may not relate as 
> closely to R as the provision I just mentioned.
> 
> 
>* The provision that worries me the most is Article 18.78 on 
> “Trade Secrets”.  This broadly criminalizes “unauthorized and willful 
> disclosure of a trade secret”.  This doesn't sound bad, except that a "trade 
> secret" could include documentation of criminal behavior.  This substantially 
> increases risks for journalists and whistleblowers.  For example The Guardian 
> could be sued for having published documents released by Ed Snowden -- even 
> though what Snowden did was expose perjury by James Clapper, US Director of 
> National Intelligence. 
> (http://tumblr.fightforthefuture.org/post/132605875893/final-tpp-text-confirms-worst-fears-shadowy)
> 
> 
>* Article 18.63 "forces the most draconian parts of the U.S.’s 
> broken copyright system on the rest of the world without expanding 
> protections for fair use and free speech. This section requires countries to 
> enforce copyright until 70 years after the creator’s death. This will keep an 
> enormous amount of information, art, and creativity out of the public domain 
> for decades longer than necessary, and allow for governments to abuse 
> copyright laws to censor online content at will, since so much of it will be 
> copyrighted for so long." 
> (http://tumblr.fightforthefuture.org/post/132605875893/final-tpp-text-confirms-worst-fears-shadowy)
>  
> 
> 
>* The TPP could also make the Internet less secure.  For example, 
> the Electronic Frontier Foundation says that, "With no good rationale, the 
> agreement would outlaw a country from adopting rules for the sale of software 
> that include mandatory code review or the release of source code. This could 
> inhibit countries from addressing pressing information security problems, 
> such as widespread and massive vulnerability in closed-source home routers." 
> (www.eff.org/issues/tpp)
> 
> 
>  I hope you find this interesting and useful even if some of it is off 
> topic.
> 
> 
>  Spencer Graves
> 
> 
> On 2/4/2016 5:15 PM, Rolf Turner wrote:
>> 
>> 
>> Quite a while ago I went to talk (I think it may have been at an NZSA
>> conference) given by the great Ross Ihaka.  I forget the details but
>> my vague recollection was that it involved a technique for automatic
>> choice of some so

Re: [R] Package for error analysis

2016-02-05 Thread S Ellison
> I'm doing error analysis of predictive models and I need to calculate global
> error, this is, I need to calculate the resultant error from propagation of
> indirect measurements errors.

If the problem is propagation of variance on inputs through a known function, 
you could look at uncert() in the metRology package.
That basically does first order error propagation using analytic (if available) 
or numerical differentiation of the function, or (via uncertMC()) Monte Carlo 
simulation to get error distributions (which I would recommend for larger 
variance over first order calculations).
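
The two approaches described above can be sketched in base R without metRology. The measurement model, values, and uncertainties below are made up for illustration, and this is not the uncert() interface itself:

```r
# Hypothetical measurement model: density = mass / volume
f <- expression(m / v)
x <- c(m = 25.0, v = 10.0)   # measured values (made up)
u <- c(m = 0.1, v = 0.05)    # standard uncertainties (made up)

# First-order propagation: u_y^2 = sum over inputs of (df/dx_i)^2 * u_i^2,
# with the partial derivatives obtained analytically via D()
grad <- sapply(names(x), function(nm) eval(D(f, nm), as.list(x)))
u_first <- sqrt(sum((grad * u)^2))

# Monte Carlo propagation, assuming independent normal inputs
set.seed(42)
n <- 1e5
sim <- rnorm(n, x["m"], u["m"]) / rnorm(n, x["v"], u["v"])
u_mc <- sd(sim)

print(c(first_order = u_first, monte_carlo = u_mc))
```

For small relative uncertainties like these the two estimates should be close; they can diverge when the input variances are large or the function is strongly nonlinear.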

I wouldn't exactly call it statistics - it's based on recommendations for 
measurement science and they assume a pretty simple deterministic model and 
that has little or nothing to do with model fitting and inference.

S Ellison

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Catarina
> Silva
> Sent: 04 February 2016 09:57
> To: R mailling list
> Subject: [R] Package for error analysis
> 
> Hi,
> 
> I'm doing error analysis of predictive models and I need to calculate global
> error, this is, I need to calculate the resultant error from propagation of
> indirect measurements errors.
> 
> I know a little about propagation error theory, using derivation formulas to
> calculate it, but I want to know if exists some package to do it.
> 
> 
> 
> Ty
> 
> Catarina
> 
> 
> 
> 

Re: [R] R project and the TPP

2016-02-05 Thread José Bustos
Thanks, everyone. I have found some good, but limited, information about it. As
Ross Ihaka mentions in his presentation: "Houston, We Have a Problem".

The R software is an important part of the Free Software Foundation, which
has been fighting the TPP for a long time, but the last few weeks in Chile have
not been so good. Some Chilean politicians (such as the Minister of the Interior
and Public Security, Jorge Burgos) are pushing for the TPP to be signed as soon
as possible. On the other side, some smart representatives have been trying to
stop the fast track.

It is a very big issue; there will be a ton more limitations like the ones Ross
Ihaka mentioned.

Please get informed about the impacts and send emails to your
representatives in congress asking them NOT to approve what none of us knows.

Here some articles to read:

http://www.ip-watch.org/2015/11/24/tpp-article-14-17-free-software-no-harm-no-foul/
https://www.fsf.org/blogs/licensing/latest-tpp-leak-shows-systemic-threat-to-software-freedom

Keep talking about the impacts and limitations of this agreement.

Jose

2016-02-04 21:59 GMT-03:00 David Winsemius :

>
> > On Feb 4, 2016, at 3:15 PM, Rolf Turner  wrote:
> >
> >
> >
> > Quite a while ago I went to talk (I think it may have been at an NZSA
> conference) given by the great Ross Ihaka.  I forget the details but my
> vague recollection was that it involved a technique for automatic choice of
> some sort of smoothing parameter involved in a graphical display.
>
> Identifying discontinuities:
>
> https://www.stat.auckland.ac.nz/~ihaka/downloads/Curves.pdf
>
> http://www.google.com/patents/US6704013
>
> TI can now own analytic geometry if they file enough patents.
>
> --
> David.
> > Apparently Ross's ideas related peripherally to some patented technique
> owned by Texas Instruments, and TI was causing problems for Ross.  He
> seemed to be of the opinion that the TPP would make matters worse.
> >
> > I suspect he's right.  It will make matters worse for everyone except
> the rich bastards in the multinationals, in all respects.
> >
> > cheers,
> >
> > Rolf
> >
> > --
> > Technical Editor ANZJS
> > Department of Statistics
> > University of Auckland
> > Phone: +64-9-373-7599 ext. 88276
> >
> > On 05/02/16 11:33, Marc Schwartz wrote:
> >> Ted and José,
> >>
> >> The FSF has a blog post here that might provide some insights:
> >>
> >>
> http://www.fsf.org/blogs/licensing/time-to-act-on-tpp-is-now-rallies-against-tpp-in-washington-d-c-november-14-18
> >>
> >> That is from last November, but the relevant passage, perhaps in a
> temporal vacuum, seems to be the second paragraph with the following
> sentences focused on the GPL:
> >>
> >> "The regulation would not affect freely licensed software, such as
> software under the GPL, that already comes with its own conditions ensuring
> users receive source code. Such licenses are grants of permission from the
> copyright holders on the work, who are not a "Party" to TPP."
> >>
> >>
> >> The Software Freedom Conservancy has a post on this as well, from the
> same time frame:
> >>
> >>   https://sfconservancy.org/blog/2015/nov/09/gpl-tpp/
> >>
> >> Regards,
> >>
> >> Marc
> >>
> >>
> >>> On Feb 4, 2016, at 4:01 PM, Ted Harding 
> wrote:
> >>>
> >>> Saludos José!
> >>> Could you please give a summary of the relevant parts of TPP
> >>> that might affect the use of R? I have looked up TPP on Wikipedia
> >>> without beginning to understand what it might imply for the use of R.
> >>> Best wishes,
> >>> Ted.
> >>>
> >>> On 04-Feb-2016 14:43:29 José Bustos wrote:
>  Hi everyone,
> 
>  I have a question regarding the use R software under the new TPP laws
>  adopted by some governments in the region. Who know how this new
> agreements
>  will affect researchers and the R community?
> 
>  Hope some of you knows better and can give ideas about it.
> 
>  saludos,
>  José
> >
>
> David Winsemius
> Alameda, CA, USA
>
>


-- 
José Bustos
Director AESPRO
Magíster en Estadística Aplicada
Movil +56 995939144
www.aespro.cl


[R] pearson correlation matrix

2016-02-05 Thread emmanuelle morin

Hello,

I have a set of 12 individuals with thousands of variables measured.
I understand that when I'm using the cor() function on my matrix I'm 
calculating the correlation between the different variables according to 
their values for the different individuals.


What I want to do is calculate the correlation between the 
individuals, and I have no clue how to do that.

Could you please help me?

Thanks,

Emmanuelle

--
Emmanuelle MORIN
UMR 1136 INRA/Université de Lorraine
F-54280 Champenoux
Tel : + 33 3 83 39 41 33
http://mycor.nancy.inra.fr


Re: [R] pearson correlation matrix

2016-02-05 Thread David L Carlson
Assuming your data is called BIG and has 12 rows and thousands of columns:

cor(t(BIG))
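
A small self-contained illustration of why the transpose is needed, using random numbers as a stand-in for the real data (the object name BIG follows the reply; the row names are made up). cor() correlates columns, so transposing makes each individual a column:

```r
set.seed(1)
# 12 individuals (rows) by 1000 variables (columns), random stand-in data
BIG <- matrix(rnorm(12 * 1000), nrow = 12,
              dimnames = list(paste0("ind", 1:12), NULL))

by_variable <- cor(BIG[, 1:5])   # 5 x 5: correlations between the first 5 variables
by_individual <- cor(t(BIG))     # 12 x 12: correlations between individuals

print(dim(by_variable))
print(dim(by_individual))
```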

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352




Re: [R] pearson correlation matrix

2016-02-05 Thread Michael Dewey
Assuming your dataset is in a matrix you want to transpose it. So you 
can go t(mesdonnees) and then call cor on that.





--
Michael
http://www.dewey.myzen.co.uk/home.html



Re: [R] Subset with missing argument within a function

2016-02-05 Thread William Dunlap via R-help
R's subscripting operators do not "guess" the value of a missing
argument: a missing k'th subscript means seq_len(dim(x)[k]).
I bet that you use syntax like x[,1] (the entire first column of x)
all the time and that you don't want this syntax to go away.

Some languages use a placeholder like '.' or '*' to do this.  Perhaps
S should have, but it is now too late to make such a change.
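
A short illustration of the point above, as a sketch. The helper strictSubset is hypothetical (it is not from the thread); it shows one way to turn a missing index into an error by checking missing() explicitly:

```r
m <- matrix(1:6, nrow = 2)

# A missing first subscript behaves like seq_len(dim(m)[1]), i.e. all rows
same <- identical(m[, 1], m[seq_len(dim(m)[1]), 1])

# Hypothetical wrapper that refuses a missing index instead of returning everything
strictSubset <- function(vec, ix) {
  if (missing(ix)) stop("argument 'ix' is missing, with no default")
  vec[ix]
}

picked <- strictSubset(letters, c(1, 2, 5))
msg <- tryCatch(strictSubset(letters), error = function(e) conditionMessage(e))

print(same)
print(picked)
print(msg)
```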



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Feb 4, 2016 at 11:23 PM, Stefano de Pretis 
wrote:

> Thanks Bill,
>
> This is more clear.
>
> In any case, I find it very inappropriate that a programming language tries
> to guess the value of a missing argument. It is unfair to code
> developers and it promotes the production of buggy software.
>
> I hope R will revise its policies sooner or later.
>
> Thanks for the discussion,
>
> Stefano
>
>
>
>
>
>
>
> 2016-02-04 18:19 GMT+01:00 William Dunlap :
>
>> The "missingness" of an argument gets passed down through nested function
>> calls.  E.g.,
>>   fOuter <- function(x) c(outerMissing=missing(x), innerMissing=fInner(x))
>>   fInner <- function(x) missing(x)
>>   fInner()
>>   #[1] TRUE
>>   fOuter()
>>   #outerMissing innerMissing
>>   #  TRUE TRUE
>> It is only when a function evaluates an argument that you get a message
>> like 'argument is missing, with no default'.  ('[' checks for missingness
>> before
>> evaluating a subscript argument so it will not give that error.)
>>
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>> On Thu, Feb 4, 2016 at 7:47 AM, Stefano de Pretis > > wrote:
>>
>>> Hi Petr,
>>>
>>> Thank you for your answer.
>>>
>>> I'm not sure how the empty index reflects what I'm showing in my example.
>>> If my function was
>>>
>>> emptySubset <- function(vec) vec[]
>>>
>>> I would then agree that this was the case. But I think it's different:
>>> I'm
>>> specifically telling my function that it should have two arguments ("vec"
>>> and "ix")
>>>
>>> subsettingFun <- function(vec, ix) vec[ix]
>>>
>>> and I wonder why, within the function, what happens on the command line
>>> does not happen:
>>>
>>> > ix
>>> Error: object 'ix' not found
>>> > letters[ix]
>>> Error: object 'ix' not found
>>>
>>> My "expectation" came from a matter of coherence, but probably I'm still
>>> missing something.
>>>
>>> Regards,
>>>
>>> Stefano
>>>
>>>
>>>
>>> 2016-02-04 15:39 GMT+01:00 PIKAL Petr :
>>>
>>> > Hi
>>> >
>>> > Help page for ?"[" says
>>> >
>>> > An empty index selects all values: this is most often used to replace
>>> all
>>> > the entries but keep the attributes.
>>> >
>>> > and actually your function construction works with an empty index
>>> >
>>> > > x<-c(1,2,5)
>>> > > letters[x]
>>> > [1] "a" "b" "e"
>>> > > letters[]
>>> >  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p"
>>> "q"
>>> > "r" "s"
>>> > [20] "t" "u" "v" "w" "x" "y" "z"
>>> >
>>> > It is sometimes useful not "expect" the program behavior but "inspect"
>>> why
>>> > it behaves differently.
>>> >
>>> > If you want your function to throw error when some arguments are
>>> missing
>>> > you need to do the check yourself and not rely on programming language.
>>> >
>>> > And BTW I did not know an answer before I inspected docs.
>>> >
>>> > Cheers
>>> > Petr
>>> >
>>> >
>>> > > -Original Message-
>>> > > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
>>> Stefano
>>> > > de Pretis
>>> > > Sent: Thursday, February 04, 2016 11:00 AM
>>> > > To: r-help@r-project.org
>>> > > Subject: [R] Subset with missing argument within a function
>>> > >
>>> > > Hi all,
>>> > >
>>> > > I'm wondering what the rationale behind this is:
>>> > >
>>> > > > subsettingFun <- function(vec, ix) vec[ix]
>>> > > > subsettingFun(letters, c(1,2,5))
>>> > > [1] "a" "b" "e"
>>> > > > subsettingFun(letters)
>>> > >  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p"
>>> > > "q"
>>> > > "r" "s"
>>> > > [20] "t" "u" "v" "w" "x" "y" "z"
>>> > >
>>> > > If the argument "ix" is missing, I'm expecting an error not to return
>>> > > the
>>> > > variable "vec" as it is.
>>> > >
>>> > > I think this is VERY dangerous and does not help the development of
>>> > > reliable code and the debugging.
>>> > >
>>> > > Cheers,
>>> > >
>>> > > Stefano
>>> > >
>>> > > *Center for Genomic Science of IIT@SEMM*
>>> > >
>>> > > Stefano de Pretis, PhD
>>> > >
>>> > > *Postdoctoral fellow *
>>> > >

[R] does save.image() also save the random state?

2016-02-05 Thread Jinsong Zhao

Dear there,

Here is a code snippet:

> rm(list = ls())
> x <- 123
> save.image("abc.RData")
> rm(list = ls())
> load("abc.RData")
> sample(10)
 [1]  3  7  4  6 10  2  5  9  8  1
> rm(list = ls())
> load("abc.RData")
> sample(10)
 [1]  3  7  4  6 10  2  5  9  8  1

You will see that, after loading an abc.RData file saved by 
save.image(), sample(10) gives the same result each time. I am wondering whether 
this is by design. And if it is, how can I get a different result 
from sample(10) every time after loading the saved image?
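
The behaviour described above can be reproduced in a self-contained way. This sketch assumes that the repeat comes from the hidden variable .Random.seed being stored in the image and restored by load(); set.seed(NULL) is one way to re-initialise the generator afterwards. The temporary file name is arbitrary:

```r
set.seed(123)                          # ensures .Random.seed exists in the global env
img <- tempfile(fileext = ".RData")
save.image(file = img)                 # saves global objects, hidden ones included

load(img, envir = globalenv())         # restores the saved .Random.seed
a <- sample(10)
load(img, envir = globalenv())         # restores the same state again
b <- sample(10)
print(identical(a, b))                 # the two draws repeat

load(img, envir = globalenv())
set.seed(NULL)                         # re-initialise the generator after loading
fresh <- sample(10)
print(fresh)

unlink(img)
```

Removing the restored state with rm(.Random.seed, envir = globalenv()) after load() should have a similar effect.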


Any help will be really appreciated. Thanks in advance.

Best,
Jinsong



[R] Alignment of a double plot where one has a switched axis

2016-02-05 Thread Gavin Rudge
Hi Rgonauts,

I am plotting 3 variables from one data set on one plot.  Two of them as a 
stacked bar and a ratio on a completely different scale, so I  need to put one 
of the axes on the top of the plot for clarity.  Whilst this is not good 
visualisation practice, there is a valid reason why, in this case, people 
viewing this plot would be interested in the magnitude of the values in the 
stacked bars and the distribution of the ratio of them (too complex and dull to 
go into here).

The plot consists of horizontal stacked bars showing two values, with a dot 
plot showing their ratio, all ordered by the magnitude of the ratio 
(this is important).  I want the result to look something like the output of 
the code below, but with correct alignment.

I wanted to avoid anything overly complex with grobs, as I don't find working 
with them very intuitive, although this may be the only way.  I've cobbled 
together this imperfect solution from code I found in various places and was 
hoping someone could either suggest a much better way or help with the final 
task of nudging the plots so that they line up.

Thanks in advance for any help received.

GavinR

#here is my code

require(ggplot2)
require(cowplot)
require(reshape2)
require(gridExtra)
require(grid)

#my original data set looks something like this, but with many more values

set.seed(42)
df1<-data.frame(idcode=LETTERS[1:10],v1=rnorm(10,mean=30,sd=10),v2=rnorm(10,mean=10,sd=5))
str(df1)
df1$rto<-(df1$v1/df1$v2)

#melt the frame
require(reshape2)
df2<-melt(df1,id.vars=c("idcode","rto"))
df2

#order the data by the ratio variable

df2$idcode<-reorder(df2$idcode,df2$rto)

#make the first plot
plot1<-ggplot(df2)+geom_bar(stat="identity",aes(x=idcode,y=value, 
fill=variable))+theme(legend.position=c(.92,.87))+coord_flip()
plot1

#make the second plot

plot2<-ggplot(df2)+geom_point(stat="identity",aes(x=idcode,y=rto))+coord_flip()+
theme(panel.background = element_rect(fill="transparent"))
plot2

#flip the axis with cowplot

plot2<-ggdraw(switch_axis_position(plot2, axis='x'))

#plot both on the same page

grid.newpage()

#create the layout

pushViewport(viewport(layout=grid.layout(1,1)))

#helper function to get the regions right - no idea what this does but I cribbed it from:
#http://www.sthda.com/english/wiki/ggplot2-easy-way-to-mix-multiple-graphs-on-the-same-page-r-software-and-data-visualization#create-a-complex-layout-using-the-function-viewport

define_region <- function(row, col){
  viewport(layout.pos.row = row, layout.pos.col = col)
}

#here is the plot

print(plot1,vp=define_region(1,1))
print(plot2, vp=define_region(1,1))

#has all the ingredients but how to nudge it?



Re: [R] does save.image() also save the random state?

2016-02-05 Thread Duncan Murdoch

On 05/02/2016 11:14 AM, Jinsong Zhao wrote:

> Dear there,
>
> Here is a snipped code,
>
>   > rm(list = ls())
>   > x <- 123
>   > save.image("abc.RData")
>   > rm(list = ls())
>   > load("abc.RData")
>   > sample(10)
>    [1]  3  7  4  6 10  2  5  9  8  1
>   > rm(list = ls())
>   > load("abc.RData")
>   > sample(10)
>    [1]  3  7  4  6 10  2  5  9  8  1
>
> you will see that, after loading a abc.RData file that is saved by
> save.image(), sample(10) gives the same results. I am wondering whether
> it's designed purposely. And if it is, how can I get a different results
> of sample(10) every time after loading the saved image?


This happens because you are reloading the random number seed.  You can 
tell R to ignore it by calling


set.seed(NULL)

just after you load the image.  See ?set.seed for more details.
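
A minimal sketch of this behaviour, assuming a throwaway file from tempfile() in place of abc.RData (the object names first, second and third are illustrative only). It restores the same saved RNG state twice, then re-initialises the seed after a third load:

```r
# Sketch: the RNG state lives in the hidden global object .Random.seed,
# which save.image() stores and load() restores.
rdata <- tempfile(fileext = ".RData")   # hypothetical stand-in for "abc.RData"
set.seed(123)                           # guarantees .Random.seed exists
save.image(rdata)

load(rdata, envir = globalenv())
first <- sample(10)
load(rdata, envir = globalenv())
second <- sample(10)
identical(first, second)                # TRUE: same restored state, same draw

load(rdata, envir = globalenv())
set.seed(NULL)                          # re-initialise the seed after loading
third <- sample(10)                     # now very unlikely to equal `first`
```

The explicit envir = globalenv() is only there so the sketch behaves the same when it is not run at top level.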

Duncan Murdoch



Re: [R] does save.image() also save the random state?

2016-02-05 Thread Dénes Tóth

On 02/05/2016 05:25 PM, Duncan Murdoch wrote:

> On 05/02/2016 11:14 AM, Jinsong Zhao wrote:
>> Dear there,
>>
>> Here is a snipped code,
>>
>>   > rm(list = ls())
>>   > x <- 123
>>   > save.image("abc.RData")
>>   > rm(list = ls())
>>   > load("abc.RData")
>>   > sample(10)
>>    [1]  3  7  4  6 10  2  5  9  8  1
>>   > rm(list = ls())
>>   > load("abc.RData")
>>   > sample(10)
>>    [1]  3  7  4  6 10  2  5  9  8  1
>>
>> you will see that, after loading a abc.RData file that is saved by
>> save.image(), sample(10) gives the same results. I am wondering whether
>> it's designed purposely. And if it is, how can I get a different results
>> of sample(10) every time after loading the saved image?
>
> This happens because you are reloading the random number seed.  You can
> tell R to ignore it by calling
>
> set.seed(NULL)
>
> just after you load the image.  See ?set.seed for more details.
>
> Duncan Murdoch



Based on your problem description, it seems that you actually do not 
want to restore the whole workspace but only the objects that you worked 
with. If this is indeed the case, it is much better to use 
save(list=ls(), file = "abc.RData") instead of save.image("abc.RData").
(Actually it is almost always better to use an explicitly parametrized 
save() call instead of save.image()).


save.image() can cause a lot of trouble besides the problem you just 
ran into (which is caused by saving and restoring the hidden 
.Random.seed object, as Duncan mentioned).
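
A small sketch of why the explicit save() call sidesteps the seed problem: ls() omits names that start with a dot, so hidden objects never reach the file. The environment work and the dummy .Random.seed value below are assumptions made up for the illustration, used so the example does not touch the real workspace:

```r
# Sketch: ls() skips names starting with a dot, so save(list = ls(), ...)
# leaves hidden objects such as .Random.seed out of the file.
work <- new.env()                          # hypothetical stand-in for the workspace
assign("x", 123, envir = work)
assign(".Random.seed", 1:3, envir = work)  # dummy hidden object, illustration only

rdata <- tempfile(fileext = ".RData")
save(list = ls(work), file = rdata, envir = work)

# load() returns the names of the objects it restored
restored <- load(rdata, envir = new.env())
restored                                   # "x"
```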


Cheers,
Denes







[R] Create macro_var in R

2016-02-05 Thread Amoy Yang via R-help


 One more question (see below). I cannot use the macro variable mvar to create 
a new name, as shown below. Any advice is highly appreciated!

> mvar<-"pop"
> new.pop<-tab[[mvar]]; new.pop
 [1]  698 1214 1003 1167 2549  824  944 1937  935  570    0
> new.tab[[mvar]]<-d$pop; 
Error in new.tab[[mvar]] <- d$pop : object 'new.tab' not found 

On Thursday, February 4, 2016 11:02 AM, Amoy Yang  wrote:
 

 This works although it looks rare by using min(",key,"). Don't know why but 
just have to remember it. This is a tough part in R.
Thanks for helps!
Amoy 
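
The pasted form presumably works because sqldf() only ever receives an ordinary character string, so the column name has to be spliced into the SQL text before the call. A minimal sketch that only builds the string (sqldf itself is not called here, and the shortened query is an assumption for illustration):

```r
# Sketch: build the SQL text first, then hand the finished string to sqldf().
key <- "pop"
query <- paste0(
  "select grade, count(*) as cnt, min(", key, ") as min ",
  "from tab group by grade"
)
query
# With sqldf installed and a data frame `tab` in scope, sqldf(query) would run it.
```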

On Wednesday, February 3, 2016 5:25 PM, Gabor Grothendieck 
 wrote:
 

 See

  Example 5.  Insert Variables

on the sqldf home page.

  https://github.com/ggrothendieck/sqldf


On Wed, Feb 3, 2016 at 2:16 PM, Amoy Yang via R-help
 wrote:
> First, MVAR<-c("population) should be the same as "population'". Correct?
> You use tab[[MVAR]] to refer to "population" where double [[...]] removes 
> double quotes "...", which seemingly work for r-code although it is tedious 
> in comparison direct application in SAS %let MVAR=population. But it does not 
> work for sqldef in R as shown below.
>
>> key<-"pop"
>> library(sqldf)
>> sqldf("select grade, count(*) as cnt, min(tab[[key]]) as min,
> + max(pop) as max, avg(pop) as mean, median(pop) as median,
> + stdev(pop) as stdev from tab group by grade")
> Error in sqliteSendQuery(con, statement, bind.data) :
>  error in statement: near "[[key]": syntax error
>
>
>
>
>    On Wednesday, February 3, 2016 12:40 PM, "ruipbarra...@sapo.pt" 
> wrote:
>
>
>  Hello,
>
> You can't use tab$MVAR but you can use tab[[MVAR]] if you do MVAR <- 
> "population" (no need for c()).
>
> Hope this helps,
>
> Rui Barradas
>  Citando Amoy Yang via R-help :
> population is the field-name in data-file (say, tab). MVAR<-population takes 
> data (in the column of population) rather than field-name as done in SAS:  
> %let MVAR=population;
> In the following r-program, for instance, I cannot use ... tab$MVAR...or 
> simply MVAR itself since MVAR is defined as "population" with double quotes 
> if using MVAR<-c("population")
>
>    On Wednesday, February 3, 2016 11:54 AM, Duncan Murdoch 
> wrote:
>
>
> On 03/02/2016 12:41 PM, Amoy Yang via R-help wrote:
>  There is a %LET statement in SAS: %let MVAR=population; Thus, MVAR can be 
>used through entire program.
> In R, I tried MAVR<-c("population"). The problem is that MAVR comes with 
> double quote "" that I don't need. But MVAR<-c(population) did NOT work 
> out. Any way that double quote can be removed as done in SAS when creating 
> macro_var?
> Thanks in advance for helps!
> R doesn't have a macro language, and you usually don't need one.
>
> If you are only reading the value of population, then
>
> MAVR <- population
>
> is fine.  This is sometimes the same as c(population), but in general
> it's different:  c() will remove some attributes, such as
> the dimensions on arrays.
>
> If you need to modify it in your program, it's likely more complicated.
> The normal way to go would be to put your code in a function, and have
> it return the modified version.  For example,
>
> population <- doModifications(population)
>
> where doModifications is a function with a definition like
>
> doModifications <- function(MAVR) {
>    # do all your calculations on MAVR
>    # then return it at the end using
>    MAVR
> }
>
> Duncan Murdoch



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

   

   

  

Re: [R] does save.image() also save the random state?

2016-02-05 Thread Duncan Murdoch

On 05/02/2016 11:49 AM, Dénes Tóth wrote:

> Based on your problem description, it seems that you actually do not
> want to restore the whole workspace but only the objects that you worked
> with. If this is indeed the case, it is much better to use
> save(list=ls(), file = "abc.RData") instead of save.image("abc.RData").
> (Actually it is almost always better to use an explicitly parametrized
> save() call instead of save.image()).
>
> save.image() can cause a lot of troubles besides the one you faced
> recently (which is caused due to the save and restore of the
> .Random.seed hidden object, as Duncan mentioned).


Yes, that's good advice.

One problem I've heard of is that some people save the workspace a few 
times before reading that it's good to tell R not to save on exit.  Then 
they keep reloading the same random seed in every session thereafter.


Duncan Murdoch



Re: [R] Create macro_var in R

2016-02-05 Thread MacQueen, Don
Yes, you can use a name stored in a variable to create a new column in a
data frame (guessing that's what you want). Here's an example:

> df <- data.frame(a=1:5)
> df[['b']] <- 2:6
> df
  a b
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
> mvar <- 'c'
> df[[mvar]] <- 0:4
> df
  a b c
1 1 2 0
2 2 3 1
3 3 4 2
4 4 5 3
5 5 6 4


In your case, the object named "new.tab" does not exist when you try to
create a new variable in it. That's what the error message says.

Try, perhaps,

new.tab <- tab
new.tab[[mvar]] <- d$pop


(and hope that the number of elements in d$pop is the same as the number
of rows in new.tab)

-Don


-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062






Re: [R] Create macro_var in R

2016-02-05 Thread William Dunlap via R-help
If 'tab' is a data.frame then new.tab <- tab[[mvar]] is a column from that
data.frame, not a data.frame with one column.  new.tab <- tab[ , mvar,
drop=FALSE ] will give you a data.frame that you can add to with either of
nvar <- "newName"
new.tab[ , nvar] <- newColumn
new.tab[[nvar]] <- newColumn

If you have a fixed name for the new column (not a variable containing
the name), you can also use
new.tab <- cbind(new.tab, newName=newColumn)
new.tab$newName <- newColumn
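
A short sketch of the difference described above, using a made-up three-row tab (the data and the column name pop_sq are assumptions for illustration):

```r
# Sketch: [[ ]] returns the column itself; [ , , drop = FALSE] keeps a data.frame.
tab <- data.frame(grade = c("a", "b", "c"),
                  pop   = c(698, 1214, 1003))   # hypothetical data
mvar <- "pop"

col_vec <- tab[[mvar]]                 # numeric vector
col_df  <- tab[, mvar, drop = FALSE]   # one-column data.frame

is.data.frame(col_vec)                 # FALSE
is.data.frame(col_df)                  # TRUE

# A column whose name is held in a variable can be added to the data.frame
nvar <- "pop_sq"
col_df[[nvar]] <- col_df[[mvar]]^2
names(col_df)                          # "pop" "pop_sq"
```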


Bill Dunlap
TIBCO Software
wdunlap tibco.com


Re: [R] Create macro_var in R

2016-02-05 Thread Jeff Newmiller
You REALLY NEED to read the "Introduction to R" document discussion of 
indexing. 

tab is a variable. It is apparently a data.frame. tab[[mvar]] is an expression 
that retrieves part of the data in the tab data.frame. The data it returns is a 
vector, not a data.frame. 

The "[[" operator extracts an element of a list.  A data.frame is a list of 
vectors (all of the same length). A vector of mode "numeric" is just numbers, 
not a list. 

You created a new variable new.pop that holds a numeric vector. You printed it 
and confirmed that that is what it is. 

You then tried to refer to a variable that you have NOT created, new.tab. 
However,  if you had tried the expression new.pop[[mvar]] you would have been 
trying to treat a numeric vector as a list, which it is not... and if it was it 
would have to have an element named pop inside it already to extract something, 
which it doesn't. 

A key step in getting out of your state of confusion is to learn how objects 
can contain other objects, and when to work with containing objects and when to 
work with contained objects. 
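
The containment described above can be checked directly. A minimal sketch with a made-up tab (the data values are assumptions for illustration):

```r
# Sketch: a data.frame is a list of columns; an extracted column is a plain vector.
tab <- data.frame(grade = c("a", "b", "c"),
                  pop   = c(698, 1214, 1003))   # hypothetical data
mvar <- "pop"

is.list(tab)                 # TRUE: the container
new.pop <- tab[[mvar]]
is.list(new.pop)             # FALSE: the contained numeric vector
is.numeric(new.pop)          # TRUE

# An unnamed numeric vector has no element called "pop",
# so extracting by that name is an error.
res <- tryCatch(new.pop[[mvar]], error = function(e) "error")
res                          # "error"
```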

Some possible solutions:

new.tab <- data.frame( pop=new.pop )

or

new.tab <- data.frame( pop = tab[[mvar]] )

or

new.tab <- tab[ , "pop", drop=FALSE ]

With which you can then add new columns

new.tab$pop2 <- new.pop ^2
new.tab[[ "pop3" ]] <- new.pop^3

-- 
Sent from my phone. Please excuse my brevity.


Re: [R] pearson correlation matrix

2016-02-05 Thread Michael Dewey



On 05/02/2016 17:10, emmanuelle morin wrote:

ok, it will still make sense ?



Whether it makes sense to correlate the people rather than the variables 
depends on the underlying science which (a) we do not know, and (b) is 
not really an R question.




On 05/02/2016 15:31, Michael Dewey wrote:

Assuming your dataset is in a matrix you want to transpose it. So you
can go t(mesdonnees) and then call cor on that.
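
A minimal sketch of the transpose step, assuming the data sit in a numeric matrix with one row per individual (the simulated values, the 50 variables and the row names are made up for illustration):

```r
# Sketch: cor() correlates columns, so transpose to correlate individuals (rows).
set.seed(1)
mesdonnees <- matrix(rnorm(12 * 50), nrow = 12,
                     dimnames = list(paste0("ind", 1:12), NULL))  # hypothetical data

cor_vars <- cor(mesdonnees)      # 50 x 50: between variables
cor_ind  <- cor(t(mesdonnees))   # 12 x 12: between individuals

dim(cor_ind)                     # 12 12
```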

On 05/02/2016 14:06, emmanuelle morin wrote:

Hello,

I have a set of 12 individuals with thousands of variables measured.
I understand that when I'm using the cor() function on my matrix I'm
calculating the correlation between the different variables according to
their values for the different individuals.

What I'm willing to do is to calculate a correlation between the
individuals and I have no clue how I can do that.
Could you please help me ?

Thanks,

Emmanuelle







--
Michael
http://www.dewey.myzen.co.uk/home.html


Re: [R] pearson correlation matrix

2016-02-05 Thread emmanuelle morin

OK, will it still make sense?

On 05/02/2016 15:31, Michael Dewey wrote:
Assuming your dataset is in a matrix you want to transpose it. So you 
can go t(mesdonnees) and then call cor on that.


On 05/02/2016 14:06, emmanuelle morin wrote:

Hello,

I have a set of 12 individuals with thousands of variables measured.
I understand that when I'm using the cor() function on my matrix I'm
calculating the correlation between the different variables according to
their values for the different individuals.

What I'm willing to do is to calculate a correlation between the
individuals and I have no clue how I can do that.
Could you please help me ?

Thanks,

Emmanuelle





--
Emmanuelle MORIN
UMR 1136 INRA/Université de Lorraine
F-54280 Champenoux
Tel : + 33 3 83 39 41 33
http://mycor.nancy.inra.fr


Re: [R] Accessing specific data.frame columns within function

2016-02-05 Thread Greg Snow
You are trying to use shortcuts where shortcuts are not appropriate
and having to go a lot longer around than if you did not use the
shortcut, see fortune(312).

You should really reread the help page: help("[[") and section 6.1 of
An Introduction to R.

Basically you should be able to do something like:

f <- function(data, oldnames) {
  data <- data[ data[[oldnames[2] ]] == 4, ]
  data[['d']] <- data[[ oldnames[1] ]]^2 + data[[ oldnames[2] ]]
  data
}

Or maybe a little more readable (but not as good a golf score):

f <- function(data, oldnames) {
  aa <- oldnames[1]
  cc <- oldnames[2]
  data <- data[ data[[ cc ]] == 4, ]
  data[['d']] <- data[[ aa ]]^2 + data[[ cc ]]
  data
}

I could have used a and c instead of aa and cc, but the doubled
letters mean less confusion with the `c` function in R.

Also you should read (and heed) the Warning section on the help page
for subset (?subset).
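
As a sketch of the same idea on the example data from the question below, here is a variant that filters with plain logical indexing rather than subset(), whose help page warns against programmatic use. The function name add_d and the argument names a_col and c_col are hypothetical:

```r
# Sketch: pass column names as strings and index with [[ ]] throughout.
add_d <- function(data, a_col, c_col) {
  keep <- data[[c_col]] == 4               # logical row filter, no subset()
  data <- data[keep, , drop = FALSE]
  data[["d"]] <- data[[a_col]]^2 + data[[c_col]]
  data
}

y <- data.frame(k = c(1, 1, 1), l = c(2, 2, 5), m = c(4, 2, 4))
res <- add_d(y, "k", "m")
res$d                                      # 5 5
```

The original column names are never touched, so no renaming back and forth is needed.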

On Thu, Feb 4, 2016 at 9:13 PM, Clark Kogan  wrote:
> Hello,
>
> I am trying to write a function that adds a few columns to a data.frame. The
> function uses the columns in a specific way. For instance, it might take a^2
> + c to produce a column d. Or it might do more complex manipulations that I
> don't think I need to discuss here. I want to keep x as a data.frame when I
> pass it into the function, as I want to use some data.frame functionality on
> x.
>
> Furthermore, I don't want the names in x to have to be specific. I want to
> be able to specify which columns the function should treat as "a" and "c".
>
> The way I am currently doing it, is that I pass the names of the columns
> that I want to treat as a and c.
>
> f <- function(data,oldnames) {
>   newnames <- c("a","c")
>   ix <- match(oldnames,names(y))
>   names(y)[ix] <- newnames
>   y <- subset(y,c==4)
>   y$d <- y$a^2 + y$c
>   ix <- match(newnames,names(y))
>   names(y)[ix] <- oldnames
>   y
> }
>
> y <- data.frame(k=c(1,1,1),l=c(2,2,5),m=c(4,2,4))
> f(y,c("k","m"))
>
> The way that I am doing it does not seem all that elegent or standard
> practice. My question is: are there potential problems programming with
> data.frames in this way, and are their standard practice methods of
> referencing data.frame names that deal with these problems?
>
> Thanks!
>
> Clark
>



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



[R] reading in multiple data sets in 2 loops

2016-02-05 Thread Reka Howard
Hello,
I have over 1000 csv data sets I need to read into R, so I want to read
them in using a loop. The data sets are named as
pheno_1000ind_4000m_add_h70_prog_1_2.csv,
pheno_1000ind_4000m_add_h70_prog_1_3.csv, ... so I need 2 loops (for the
last 2 numbers in the names). What I would like to do is the following:

setwd("C:/Research3/simulation1/second_gen")
d1<-read.csv("pheno_1000ind_4000m_add_h70_prog_1_2.csv")
d2<-read.csv("pheno_1000ind_4000m_add_h70_prog_1_3.csv")
d3<-read.csv("pheno_1000ind_4000m_add_h70_prog_2_3.csv")
.
.
.

I am wondering how I can accomplish this with a loop. Any suggestion is
appreciated!
I tried the following but it does not work:

data <- lapply(
 
paste(("C:/Research3/simulation1/second_gen/pheno_1000ind_4000m_add_h70_prog_",[1:2],"_",[2:3],".csv",sep=''),
read.csv, header=TRUE, sep=',' )
names(data) <- paste("d", LETTERS[1:3], sep='')

Thanks!
Reka
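
One possible approach, sketched under stated assumptions: the files follow the pattern prefix_i_j.csv, not every (i, j) pair exists, and a temporary directory with three tiny made-up files stands in for C:/Research3/simulation1/second_gen. The list names d_i_j are hypothetical:

```r
# Sketch: two nested loops build each file name; missing combinations are skipped.
dir <- tempfile("second_gen_")     # stand-in for the real data directory
dir.create(dir)
prefix <- "pheno_1000ind_4000m_add_h70_prog_"

# Create three small example files so the sketch is self-contained
for (nm in c("1_2", "1_3", "2_3")) {
  write.csv(data.frame(id = 1:2, value = c(0.1, 0.2)),
            file.path(dir, paste0(prefix, nm, ".csv")), row.names = FALSE)
}

data <- list()
for (i in 1:2) {
  for (j in 2:3) {
    f <- file.path(dir, paste0(prefix, i, "_", j, ".csv"))
    if (file.exists(f)) {
      data[[paste0("d_", i, "_", j)]] <- read.csv(f)
    }
  }
}
names(data)                        # "d_1_2" "d_1_3" "d_2_3"
```

Keeping the data sets in one named list, rather than in separate variables d1, d2, ..., makes it straightforward to loop over them later.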
