Very similar to what Oliver posted:

library(dplyr)
newdata <- olddata |>
  group_by(ID) |>
  mutate(firstdate = first(date))
newdata

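For anyone who wants to run this end to end, here is a minimal sketch that rebuilds olddata with the code from the original question. The arrange() and ungroup() calls are additions for illustration and are not part of the code above: first() takes the first row of each group in its current order, so the explicit sort (mirroring the proc sort step in the SAS version) only matters when the input is not already ordered. The |> pipe assumes R 4.1 or later.

```r
library(dplyr)

# Rebuild the example data from the original question
ID   <- c(rep(1, 10), rep(2, 6), rep(3, 2))
date <- c(rep(1, 2), rep(2, 2), rep(3, 2), rep(4, 2), rep(5, 2),
          rep(5, 3), rep(6, 3), rep(10, 2))
olddata <- data.frame(ID = ID, date = date)

newdata <- olddata |>
  arrange(ID, date) |>                 # assumption: mirrors "proc sort; by ID Day;"
  group_by(ID) |>
  mutate(firstdate = first(date)) |>   # first value of date within each ID
  ungroup()                            # assumption: drop the grouping for later steps

print(newdata, n = 18)
```
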
1) I attached dplyr to the entire program. Oliver used dplyr::group_by() and dplyr::mutate() to do the same thing.
2) I used the base R |> pipe while Oliver used the %>% pipe from the magrittr package to do the same thing.

If you want a version that is closer to how SAS would process the data, then you could use for loops after sorting the data.

Tim

-----Original Message-----
From: R-help <r-help-boun...@r-project.org> On Behalf Of Tom Woolman
Sent: Wednesday, November 27, 2024 12:05 PM
To: Sorkin, John <jsor...@som.umaryland.edu>
Cc: r-help@r-project.org (r-help@r-project.org) <r-help@r-project.org>
Subject: Re: [R] R Processing dataframe by group - equivalent to SAS by group processing with a first. and retain statments

Check out the dplyr package, specifically the mutate function.

# Create new column based on existing column value
# (FirstDay is 5 where ID is 2, NA elsewhere)
df <- df %>%
  mutate(FirstDay = ifelse(ID == 2, 5, NA))
df

Repeat as needed to capture all of the day/firstday combinations you want to account for.

Like everything else in R, there are probably at least a dozen other ways to do this, between base R and all of the library packages available.

On Wednesday, November 27th, 2024 at 11:30 AM, Sorkin, John <jsor...@som.umaryland.edu> wrote:

> I am an old, long-time SAS programmer.
> I need to produce R code that processes a dataframe in a manner that is
> equivalent to that produced by using a by statement in SAS and an
> if first.day statement and a retain statement.
>
> I want to take data (olddata) that looks like this:
>
> ID  Day
>  1    1
>  1    1
>  1    2
>  1    2
>  1    3
>  1    3
>  1    4
>  1    4
>  1    5
>  1    5
>  2    5
>  2    5
>  2    5
>  2    6
>  2    6
>  2    6
>  3   10
>  3   10
>
> and make it look like this (within each ID I am copying the first value of
> Day into a new variable, FirstDay, and propagating the FirstDay value
> through all rows that have the same ID):
>
> ID  Day  FirstDay
>  1    1         1
>  1    1         1
>  1    2         1
>  1    2         1
>  1    3         1
>  1    3         1
>  1    4         1
>  1    4         1
>  1    5         1
>  1    5         1
>  2    5         5
>  2    5         5
>  2    5         5
>  2    6         5
>  2    6         5
>  2    6         5
>  3   10        10
>  3   10        10
>
> SAS code that can do this is:
>
> proc sort data=olddata;
>   by ID Day;
> run;
>
> data newdata;
>   retain FirstDay;
>   set olddata;
>   by ID;
>   if first.ID then FirstDay=Day;
> run;
>
> I have NO idea how to do this in R (so I can't post test code), but below I
> have R code that creates olddata:
>
> ID <- c(rep(1,10),rep(2,6),rep(3,2))
> date <- c(rep(1,2),rep(2,2),rep(3,2),rep(4,2),rep(5,2),
>           rep(5,3),rep(6,3),rep(10,2))
> date
> olddata <- data.frame(ID=ID,date=date)
> olddata
>
> Any suggestions on how to do this would be appreciated. I have worked on
> this for more than 12 hours, and despite multiple web searches I have gotten
> nowhere.
>
> Thanks
> John
>
> John David Sorkin M.D., Ph.D.
> Professor of Medicine, University of Maryland School of Medicine;
> Associate Director for Biostatistics and Informatics, Baltimore VA
> Medical Center Geriatrics Research, Education, and Clinical Center; PI
> Biostatistics and Informatics Core, University of Maryland School of
> Medicine Claude D.
> Pepper Older Americans Independence Center; Senior
> Statistician University of Maryland Center for Vascular Research;
>
> Division of Gerontology and Palliative Care,
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> Cell phone 443-418-5382
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide https://www.r-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
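
The for-loop approach mentioned in the reply (sort, then walk the rows the way a SAS data step would) could look roughly like the base R sketch below. This is one possible reading of that suggestion, not code from the thread: the names retained and is_first are hypothetical, and the column is called date because that is what the posted olddata uses.

```r
# Rebuild the example data from the original question
ID   <- c(rep(1, 10), rep(2, 6), rep(3, 2))
date <- c(rep(1, 2), rep(2, 2), rep(3, 2), rep(4, 2), rep(5, 2),
          rep(5, 3), rep(6, 3), rep(10, 2))
olddata <- data.frame(ID = ID, date = date)

# Equivalent of: proc sort data=olddata; by ID Day;
newdata <- olddata[order(olddata$ID, olddata$date), ]
rownames(newdata) <- NULL

newdata$FirstDay <- NA_real_
retained <- NA_real_                  # plays the role of "retain FirstDay;"

for (i in seq_len(nrow(newdata))) {
  # Equivalent of first.ID: TRUE on the first row of each ID block
  is_first <- i == 1 || newdata$ID[i] != newdata$ID[i - 1]
  if (is_first) retained <- newdata$date[i]
  newdata$FirstDay[i] <- retained     # carry the retained value forward
}

newdata
```

On a large data frame the explicit loop will be slow; a vectorised base R form such as ave(newdata$date, newdata$ID, FUN = function(x) x[1]) should produce the same column.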