Val: I wanted to add a base R solution to your problem here, though I realize you can happily ignore it. In the course of puzzling over how to do it using R's native pipe syntax ("|>"), I learned some new things that I thought others might find useful, and it seemed sensible to keep the code with this thread for comparison.
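For readers new to the native pipe, here is a minimal sketch of the two behaviors involved. It uses toy data rather than Val's, the names `toy`, `extra`, `n_rows`, `out` and the argument name `m` are hypothetical, and it assumes R >= 4.2.0, where the `_` placeholder is available.

```r
# Toy inputs; `toy` and `extra` are hypothetical names, not from this thread.
toy   <- data.frame(id = 1:3)
extra <- matrix(letters[1:6], nrow = 3)

# Plain pipe: the left-hand side becomes the FIRST argument of the call,
# so this is the same as nrow(toy).
n_rows <- toy |> nrow()

# Placeholder: `_` routes the left-hand side to a NAMED argument instead,
# so `extra` lands after `toy` in the cbind() call rather than before it.
# This is the same as cbind(toy, m = extra).
out <- extra |> cbind(toy, m = _)
```

The `cbind(dat, ..2 = _)` step in the solution relies on the same mechanism: the placeholder must be passed by name, which is why an argument name appears there at all.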
I want to acknowledge that, in the course of my labors, I posted a query to R-Help, to which Iris Simmons posted a very clever answer that I would never have figured out myself. It is used at the end of the code below to change a subset of the names of the modified data frame via a pipe.

Here's the whole solution, starting from your (excellent!) example dat:

dat <-
   dat$string |>
   strsplit(" ") |>                                    ## list of token vectors, one per row
   sapply(FUN = \(x) c(x, rep(NA, 5 - length(x)))) |>  ## pad each to length 5 with NA
   t() |>                                              ## one row per original row
   cbind(dat, ..2 = _)                                 ## append as 5 new columns

## And Iris's trick for changing a subset of attributes, i.e. the "names", in a pipe
dat |> names() |> _[4:8] <- paste0("s", 1:5)

## and here's the result:
> dat
  Year Sex          string s1   s2   s3   s4   s5
1 2002   F        15 xc Ab 15   xc   Ab <NA> <NA>
2 2003   F              14 14 <NA> <NA> <NA> <NA>
3 2004   M  18 xb 25 35 21 18   xb   25   35   21
4 2005   M           13 25 13   25 <NA> <NA> <NA>
5 2006   M 14 ac 256 AV 35 14   ac  256   AV   35
6 2007   F              11 11 <NA> <NA> <NA> <NA>

As I noted previously, all columns beyond Sex are character.

Cheers,
Bert

On Fri, Jul 19, 2024 at 12:26 PM Val <valkr...@gmail.com> wrote:
>
> Thank you Jeff and Bert for your help!
> The components of the string could be mixed (i.e., numeric, character
> or date). Once that is split it would be easy for me to format it
> accordingly.
>
> On Fri, Jul 19, 2024 at 2:10 PM Bert Gunter <bgunter.4...@gmail.com> wrote:
> >
> > I did not look closely at the solutions that you were offered, but
> > note that you did not specify in your post whether the numbers in your
> > string were to be character or numeric variables after they are broken
> > out into their own columns. I believe that they are character in the
> > solutions, but you should check this. If you want them as numeric,
> > e.g., for further processing, you will need to convert them. Or
> > vice-versa.
> >
> > Bert
> >
> >
> > On Fri, Jul 19, 2024 at 9:52 AM Val <valkr...@gmail.com> wrote:
> > >
> > > Hi All,
> > >
> > > I want to extract new variables from a string and add it to the dataframe.
> > > Sample data is csv file.
> > >
> > > dat<-read.csv(text="Year, Sex,string
> > > 2002,F,15 xc Ab
> > > 2003,F,14
> > > 2004,M,18 xb 25 35 21
> > > 2005,M,13 25
> > > 2006,M,14 ac 256 AV 35
> > > 2007,F,11",header=TRUE)
> > >
> > > The string column has a maximum of five variables. Some rows have all
> > > and others may not have all the five variables. If missing then fill
> > > it with NA,
> > > Desired result is shown below,
> > >
> > >
> > > Year,Sex,string, S1, S2, S3 S4,S5
> > > 2002,F,15 xc Ab, 15,xc,Ab, NA, NA
> > > 2003,F,14, 14,NA,NA,NA,NA
> > > 2004,M,18 xb 25 35 21,18, xb, 25, 35, 21
> > > 2005,M,13 25,13, 25,NA,NA,NA
> > > 2006,M,14 ac 256 AV 35, 14, ac, 256, AV, 35
> > > 2007,F,11, 11,NA,NA,NA,NA
> > >
> > > Any help?
> > > Thank you in advance.
> > >
> > > ______________________________________________
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.