Typo on the second line: select( -admin_period ) should have been select( -admin_period1 ), so that the admin_period column is still there to join on.

result <- ( result0
          %>% select( -admin_period1 )
          %>% inner_join( result0 %>% select( ID, admin_period1, end=start )
                        , by = c( ID="ID", admin_period ="admin_period1" )
                        )
          %>% mutate( ddays = end - start )
          )
-- 
Sent from my phone. Please excuse my brevity.
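For anyone following along, here is a minimal sketch of the corrected pipeline run on a small made-up data set. The toy rows, the conversion of date to class Date, the ungroup() call and the integer 1L are assumptions added for illustration; they are not part of the original data or code. It assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data: "Y" marks a drug administration date.
drug_study <- data.frame(
  ID = c("R1/3", "R1/3", "R1/3", "R1/3", "J1/3", "J1/3", "J1/3"),
  date = as.Date(c("2007-05-11", "2007-05-16", "2008-02-04", "2008-02-11",
                   "2009-01-05", "2009-06-01", "2010-01-04")),
  drug_admin = c("Y", "", "Y", "", "Y", "", "Y"),
  stringsAsFactors = FALSE
)

# Running count of administrations within each ID;
# rows before the first "Y" of an ID would get 0.
drug_study$admin_period <- ave("Y" == drug_study$drug_admin,
                               drug_study$ID, FUN = cumsum)

# One row per ID and administration period, with the period's first date.
# admin_period1 is the number of the preceding period.
result0 <- drug_study %>%
  filter(admin_period != 0) %>%
  group_by(ID, admin_period) %>%
  summarise(start = min(date)) %>%
  ungroup() %>%
  mutate(admin_period1 = admin_period - 1L)

# Self-join: period k on the left is matched with period k + 1 on the
# right, so "end" is the start date of the following period.
result <- result0 %>%
  select(-admin_period1) %>%
  inner_join(result0 %>% select(ID, admin_period1, end = start),
             by = c(ID = "ID", admin_period = "admin_period1")) %>%
  mutate(ddays = end - start)

print(result)
```

The last period of each ID has no following period to match, so the inner join drops it. If those open-ended periods should be kept with a missing end date, a left_join() in place of inner_join() is one option.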
On July 3, 2016 11:55:14 AM PDT, Kevin Wamae <kwa...@kemri-wellcome.org> wrote:
>Hi Jeff, “likes its Excel”, I don’t follow. Pardon me for any mix up.
>
>Thanks for the code. After running it, this is the error I get.
>
>Error: cannot join on columns 'admin_period' x 'admin_period1': index
>out of bounds
>
>Regards
>-------------------------------------------------------------------------------
>Kevin Wame | Ph.D. Student (IDeAL)
>KEMRI-Wellcome Trust Collaborative Research Programme
>Centre for Geographic Medicine Research
>P.O. Box 230-80108, Kilifi, Kenya
>
>
>On 7/3/16, 9:34 PM, "Jeff Newmiller" <jdnew...@dcn.davis.ca.us> wrote:
>
>I still get the impression from your mixing of information types that
>you are thinking like this is Excel.
>
>Perhaps something like
>
>drug_study$admin_period <- ave( "Y" == drug_study$drug_admin,
>drug_study$ID, FUN=cumsum )
>library(dplyr)
>result0 <- ( drug_study
>           %>% filter( 0 != admin_period )
>           %>% group_by( ID, admin_period )
>           %>% summarise( start = min( date ) )
>           %>% mutate( admin_period1 = admin_period -1 )
>           )
>result <- ( result0
>          %>% select( -admin_period )
>          %>% inner_join( result0 %>% select( ID, admin_period1, end=start )
>                        , by = c( ID="ID", admin_period ="admin_period1" )
>                        )
>          %>% mutate( ddays = end - start )
>          )
>--
>Sent from my phone. Please excuse my brevity.
>
>On July 3, 2016 10:24:51 AM PDT, Kevin Wamae
><kwa...@kemri-wellcome.org> wrote:
>>HI Jeff, it’s been an uphill task working with the dataset and I am not
>>the first to complain. Nonetheless, data-cleaning is ongoing and since
>>I cannot wait for that to get done, I decided to make the most of what
>>the dataset looks like at this time. It appears the process may take a
>>while.
>>
>>Thanks for the script. From the output, I noticed that “result”
>>contains the first and last date for each of the individuals and not
>>taking into account the variable “drug-admin”.
>>
>>ID     start   end
>>J1/3   1/5/09  12/25/10
>>R1/3   1/4/07  12/15/08
>>R10/1  1/4/07  3/5/12
>>
>>My aim is to pick the date, for example in 2007, where drug-admin ==
>>“Y” as my start and the date in the subsequent year (2008 in this case)
>>where drug-admin == “Y” as my end. Then, I should populate the variable
>>“study_id” with “start” up to the entry just above the one whose date
>>matches “end”, as the output below shows (I hope its structure is
>>maintained as I have copied it from R-Studio). The goal for now is to
>>then get difference in days between “date” and “study_id” and still get
>>to keep that column for “study_id” as I might use it later.
>>
>>From the output, it can be seen that for this individual, the dates run
>>from 2007 to 2008. However, for some individuals, the dates run from
>>2008-2009, 2009-2010 and so on. Therefore, I need to make the script
>>deal with all the years as the dates range from 2001-2016
>>
>>ID    date      drug_admin  year  month  study_id
>>R1/3  5/11/07   Y           2007  5      5/11/07
>>R1/3  5/16/07               2007  5      5/11/07
>>R1/3  5/22/07               2007  5      5/11/07
>>R1/3  5/28/07               2007  5      5/11/07
>>R1/3  6/5/07                2007  6      5/11/07
>>R1/3  6/11/07               2007  6      5/11/07
>>R1/3  6/18/07               2007  6      5/11/07
>>R1/3  6/25/07               2007  6      5/11/07
>>R1/3  7/2/07                2007  7      5/11/07
>>R1/3  7/16/07               2007  7      5/11/07
>>R1/3  7/29/07               2007  7      5/11/07
>>R1/3  8/2/07                2007  8      5/11/07
>>R1/3  8/7/07                2007  8      5/11/07
>>R1/3  8/13/07               2007  8      5/11/07
>>R1/3  9/18/07               2007  9      5/11/07
>>R1/3  9/24/07               2007  9      5/11/07
>>R1/3  10/6/07               2007  10     5/11/07
>>R1/3  10/8/07               2007  10     5/11/07
>>R1/3  10/15/07              2007  10     5/11/07
>>R1/3  10/22/07              2007  10     5/11/07
>>R1/3  10/29/07              2007  10     5/11/07
>>R1/3  11/8/07               2007  11     5/11/07
>>R1/3  11/12/07              2007  11     5/11/07
>>R1/3  11/19/07              2007  11     5/11/07
>>R1/3  11/29/07              2007  11     5/11/07
>>R1/3  12/6/07               2007  12     5/11/07
>>R1/3  12/10/07              2007  12     5/11/07
>>R1/3  12/21/07              2007  12     5/11/07
>>R1/3  1/7/08                2008  1      5/11/07
>>R1/3  1/14/08               2008  1      5/11/07
>>R1/3  1/21/08               2008  1      5/11/07
>>R1/3  1/28/08               2008  1      5/11/07
>>R1/3  2/4/08    Y           2008  2
>>
>>
>>Regards
>>-------------------------------------------------------------------------------
>>Kevin Wame
>>
>>###############################################################
>>
>>###############################################################
>>
>>
>>On 7/3/16, 7:05 PM, "Jeff Newmiller" <jdnew...@dcn.davis.ca.us> wrote:
>>
>>result <- setNames( data.frame( aggregate( date~ID, data=drug_study,
>>FUN=min ), aggregate( date~ID, data=drug_study, FUN=max )[2] ), c(
>>"ID", "start", "end" ) )
>>
>>
>>______________________________________________________________________
>>
>>This e-mail contains information which is confidential. It is intended
>>only for the use of the named recipient. If you have received this
>>e-mail in error, please let us know by replying to the sender, and
>>immediately delete it from your system. Please note, that in these
>>circumstances, the use, disclosure, distribution or copying of this
>>information is strictly prohibited. KEMRI-Wellcome Trust Programme
>>cannot accept any responsibility for the accuracy or completeness of
>>this message as it has been transmitted over a public network. Although
>>the Programme has taken reasonable precautions to ensure no viruses are
>>present in emails, it cannot accept responsibility for any loss or
>>damage arising from the use of the email or attachments. Any views
>>expressed in this message are those of the individual sender, except
>>where the sender specifically states them to be the views of
>>KEMRI-Wellcome Trust Programme.
>>______________________________________________________________________

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
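The per-row goal described earlier in the thread (carry each administration date forward as "study_id" and count the days since it) might be sketched in base R along the following lines. The toy rows and the helper column names key, start_num and ddays are made up for illustration. One assumption differs from the sample output quoted above: here the row carrying the next "Y" starts a new period and gets its own date as study_id, rather than being left blank.

```r
# Hypothetical toy data for one individual; the first row precedes any "Y".
drug_study <- data.frame(
  ID = c("R1/3", "R1/3", "R1/3", "R1/3", "R1/3"),
  date = as.Date(c("2007-05-01", "2007-05-11", "2007-05-16",
                   "2008-02-04", "2008-02-11")),
  drug_admin = c("", "Y", "", "Y", ""),
  stringsAsFactors = FALSE
)

# Running count of administrations within each ID (0 before the first "Y").
drug_study$admin_period <- ave("Y" == drug_study$drug_admin,
                               drug_study$ID, FUN = cumsum)

# A single grouping key avoids empty ID x period combinations in ave().
key <- paste(drug_study$ID, drug_study$admin_period)

# Earliest date within each ID and period, computed on the numeric
# representation of the dates and then converted back to class Date.
start_num <- ave(as.numeric(drug_study$date), key, FUN = min)
drug_study$study_id <- as.Date(start_num, origin = "1970-01-01")

# Rows before the first administration have no reference date.
drug_study$study_id[drug_study$admin_period == 0] <- NA

# Days elapsed since the start of the current administration period.
drug_study$ddays <- as.numeric(drug_study$date - drug_study$study_id)

print(drug_study)
```

Keeping study_id as a Date column means it stays available for later use, as requested, and the day difference is an ordinary subtraction.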