> On Nov 17, 2017, at 4:28 PM, Val <valkr...@gmail.com> wrote:
>
> Hi all,
> I am reading a huge data set (12M rows) that contains family information,
> Offspring, Parent1 and Parent2
>
> Parent1 and parent2 should be in the first column as an offspring
> before their offspring information. Their parent information (parent1
> and parent2) should be set to zero, if unknown. Also the first
> column should be unique.
>
>
> Here is my sample data set and desired output.
>
>
> fam <- read.table(textConnection(" offspring Parent1 Parent2
> Smith Alex1 Alexa
> Carla Alex1 0
> Jacky Smith Abbot
> Jack 0 Jacky
> Almo Jack Carla
> "),header = TRUE)
>
>
>
> desired output.
> Offspring Parent1 Parent2
> Alex1 0 0
> Alexa 0 0
> Abbot 0 0
> Smith Alex1 Alexa
> Carla Alex1 0
> Jacky Smith Abbot
> Jack 0 Jacky
> Almo Jack Carla

You might get useful ideas by looking at ?"%in%" and ?union (set operations):

> fam$Parent1[!fam$Parent1 %in% fam$offspring]
[1] "Alex1" "Alex1" "0"
> fam$Parent2[!fam$Parent2 %in% fam$offspring]
[1] "Alexa" "0"     "Abbot"

David.

>
> Thank you.
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law
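
The %in% and union hint above could be carried through to the desired output roughly as follows. This is only a sketch, not necessarily what the reply had in mind: the names parents, founders, founder_rows and fam_full are made up for illustration, and it assumes the columns are read as character (stringsAsFactors = FALSE) and that "0" always marks an unknown parent.

```r
# Sample data from the original post, read as character columns
fam <- read.table(text = "offspring Parent1 Parent2
Smith Alex1 Alexa
Carla Alex1 0
Jacky Smith Abbot
Jack 0 Jacky
Almo Jack Carla
", header = TRUE, stringsAsFactors = FALSE)

# Every name used as a parent, dropping the "0" placeholder for unknown
parents <- setdiff(union(fam$Parent1, fam$Parent2), "0")

# Parents that never appear in the offspring column
founders <- setdiff(parents, fam$offspring)

# One row per founder with both parents unknown, placed ahead of the data
founder_rows <- data.frame(offspring = founders,
                           Parent1 = rep("0", length(founders)),
                           Parent2 = rep("0", length(founders)),
                           stringsAsFactors = FALSE)
fam_full <- rbind(founder_rows, fam)

# The first column should be unique
stopifnot(anyDuplicated(fam_full$offspring) == 0)
```

This only prepends the missing parents. It does not reorder the existing rows, so if a parent's own row can come after its offspring in the full 12M-row file, a further ordering step would still be needed.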