Hi Drake,

Petr's suggestion to use the merge() function is good. Another (possibly overkill) approach is to use functions from the dplyr package, which is a fantastic package to get familiar with. For example, the last alternative that Petr suggests, merge(fr2, fr1, all.x=T), is an example of what is called a "left join": when joining structures x and y, keep all the x rows, even those with no corresponding row in y (their y columns are filled with NA). You can do this via dplyr as follows:
dplyr::left_join(fr2, fr1, by="Fruit")

HTH,
Eric

On Thu, Apr 18, 2019 at 11:40 AM PIKAL Petr <petr.pi...@precheza.cz> wrote:

> Hi
>
> I wonder why such combination is so complicated in your text book.
>
> Having data frames fr1 and fr2
>
> > dput(fr1)
> structure(list(Fruit = structure(c(1L, 3L, 2L), .Label = c("banana",
> "mango", "pear"), class = "factor"), Calories = c(100L, 100L,
> 200L)), class = "data.frame", row.names = c("1", "2", "3"))
> > dput(fr2)
> structure(list(Fruit = structure(c(1L, 2L, 5L, 4L, 3L), .Label = c("apple",
> "banana", "kiwi", "orange", "pear"), class = "factor"), Color = structure(c(3L,
> 4L, 1L, 2L, 1L), .Label = c("green", "orange", "red", "yellow"
> ), class = "factor"), Shape = structure(c(3L, 1L, 2L, 3L, 3L), .Label = c("oblong",
> "pear", "round"), class = "factor"), Juice = c(1, 0, 0.5, 1,
> 0)), class = "data.frame", row.names = c("1", "2", "3", "4",
> "5"))
>
> > fr1
>    Fruit Calories
> 1 banana      100
> 2   pear      100
> 3  mango      200
>
> you can use merge to combine those 2 data frames to get either all values
> from both
>
> > merge(fr2, fr1, all=T)
>    Fruit  Color  Shape Juice Calories
> 1  apple    red  round   1.0       NA
> 2 banana yellow oblong   0.0      100
> 3   kiwi  green  round   0.0       NA
> 4 orange orange  round   1.0       NA
> 5   pear  green   pear   0.5      100
> 6  mango   <NA>   <NA>    NA      200
>
> just values from data frame with calories
>
> > merge(fr2, fr1, all.y=T)
>    Fruit  Color  Shape Juice Calories
> 1 banana yellow oblong   0.0      100
> 2   pear  green   pear   0.5      100
> 3  mango   <NA>   <NA>    NA      200
>
> or just values from data frame with colours
>
> > merge(fr2, fr1, all.x=T)
>    Fruit  Color  Shape Juice Calories
> 1  apple    red  round   1.0       NA
> 2 banana yellow oblong   0.0      100
> 3   kiwi  green  round   0.0       NA
> 4 orange orange  round   1.0       NA
> 5   pear  green   pear   0.5      100
>
> Cheers
> Petr
>
> > -----Original Message-----
> > From: R-help <r-help-boun...@r-project.org> On Behalf Of Drake Gossi
> > Sent: Thursday, April 18, 2019 1:24 AM
> > To: r-help@r-project.org
> > Subject: [R]
> >   combining data.frames with is.na & match (), two questions
> >
> > Hello everyone,
> >
> > I'm working through this book, *Humanities Data in R* (Arnold & Tilton), and
> > I'm just having trouble understanding this maneuver.
> >
> > In sum, I'm trying to combine data in two different data.frames.
> >
> > This data.frame is called fruitNutr
> >
> >    Fruit Calories
> > 1 banana      100
> > 2   pear      100
> > 3  mango      200
> >
> > And this data.frame is called fruitData
> >
> >    Fruit  Color  Shape Juice
> > 1  apple    red  round     1
> > 2 banana yellow oblong     0
> > 3   pear  green   pear   0.5
> > 4 orange orange  round     1
> > 5   kiwi  green  round     0
> >
> > So, as you can see, these two data.frames overlap insofar as they both have
> > banana and pear. So, what happens next is the book suggests this:
> >
> > fruitData$Calories <- NA
> >
> > As a result, I've created a new column for the fruitData data.frame:
> >
> >    Fruit  Color  Shape Juice Calories
> > 1  apple    red  round     1       NA
> > 2 banana yellow oblong     0       NA
> > 3   pear  green   pear   0.5       NA
> > 4 orange orange  round     1       NA
> > 5   kiwi  green  round     0       NA
> >
> > Then:
> >
> > > index <- match(x=fruitData$Fruit, table=fruitNutr$Fruit)
> > > index
> > [1] NA  1  2 NA NA
> > > is.na(index)
> > [1]  TRUE FALSE FALSE  TRUE  TRUE
> > > fruitData$Calories[!is.na(index)] <- fruitNutr$Calories[index[!is.na(index)]]
> > > fruitData
> >
> >    Fruit  Color  Shape Juice Calories
> > 1  apple    red  round     1       NA
> > 2 banana yellow oblong     0      100
> > 3   pear  green   pear   0.5      100
> > 4 orange orange  round     1       NA
> > 5   kiwi  green  round     0       NA
> >
> > I get what the first part means, that first part being this:
> > fruitData$Calories[!is.na(index)]
> > go into the fruitData data.frame, specifically into the Calories column, and
> > only for what's true according to !is.na(index). But I just literally can't
> > understand this last part: fruitNutr$Calories[index[!is.na(index)]]
> >
> > Two questions.
> >
> >    1. I just literally don't understand how this code works.
> >       It does work, of course, but I don't know what it's doing, specifically
> >       this [index[!is.na(index)]] part. Could someone explain it to me like
> >       I'm five? I'm new at this...
> >    2. And then: is there any other way to combine these two data.frames so
> >       that we get this same result? Maybe an easier to understand method?
> >
> > That same result, again, is
> >
> >    Fruit  Color  Shape Juice Calories
> > 1  apple    red  round     1       NA
> > 2 banana yellow oblong     0      100
> > 3   pear  green   pear   0.5      100
> > 4 orange orange  round     1       NA
> > 5   kiwi  green  round     0       NA
> >
> > Drake
> >
> > ______________________________________________
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
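On the first question in the quoted message (what the [index[!is.na(index)]] part is doing), the one-liner can be unpacked into small steps. The following is only a sketch: the data frames are rebuilt by hand from the values shown in the thread, Fruit is stored as character rather than factor to keep things simple, and the names found and rows_in_nutr are made up here for illustration (they are not from the book).

```r
# Rebuild the two data frames from the values shown in the thread.
# Assumption: Fruit is kept as character, not factor.
fruitNutr <- data.frame(Fruit = c("banana", "pear", "mango"),
                        Calories = c(100L, 100L, 200L),
                        stringsAsFactors = FALSE)
fruitData <- data.frame(Fruit = c("apple", "banana", "pear", "orange", "kiwi"),
                        Color = c("red", "yellow", "green", "orange", "green"),
                        Shape = c("round", "oblong", "pear", "round", "round"),
                        Juice = c(1, 0, 0.5, 1, 0),
                        stringsAsFactors = FALSE)

# Step 1: for each fruit in fruitData, at which row of fruitNutr does it
# appear? NA means "not in fruitNutr at all".
index <- match(x = fruitData$Fruit, table = fruitNutr$Fruit)

# Step 2: which rows of fruitData found a partner in fruitNutr?
# (hypothetical helper name)
found <- !is.na(index)

# Step 3: keep only the usable row numbers, i.e. drop the NAs.
# This is the index[!is.na(index)] part. (hypothetical helper name)
rows_in_nutr <- index[found]

# Step 4: start with an all-NA column, then copy the calories found at
# those fruitNutr rows into the fruitData rows that had a partner.
fruitData$Calories <- NA
fruitData$Calories[found] <- fruitNutr$Calories[rows_in_nutr]

print(fruitData)
```

Both sides of the assignment in step 4 have the same length (one entry per matched fruit), which is why the NAs are filtered out on both sides. Since indexing a vector with NA returns NA, fruitData$Calories <- fruitNutr$Calories[index] should give the same column in a single step.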
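To check that the left join discussed at the top of this message lands on the same result as the book's match() approach, the two can be compared directly. This is a sketch under assumptions: fr1 and fr2 are rebuilt by hand with Fruit as character (the dput() output in the thread uses factors), the by_* names are made up for illustration, and the dplyr part runs only if dplyr happens to be installed.

```r
# Rebuild fr1 and fr2 from the thread.
# Assumption: character columns instead of the factors in the dput() output.
fr1 <- data.frame(Fruit = c("banana", "pear", "mango"),
                  Calories = c(100L, 100L, 200L),
                  stringsAsFactors = FALSE)
fr2 <- data.frame(Fruit = c("apple", "banana", "pear", "orange", "kiwi"),
                  Color = c("red", "yellow", "green", "orange", "green"),
                  Shape = c("round", "oblong", "pear", "round", "round"),
                  Juice = c(1, 0, 0.5, 1, 0),
                  stringsAsFactors = FALSE)

# The match() approach, condensed: look up each fruit's row in fr1.
by_match <- fr2
by_match$Calories <- fr1$Calories[match(fr2$Fruit, fr1$Fruit)]

# Base R left join: keep every row of fr2, add Calories where available.
by_merge <- merge(fr2, fr1, by = "Fruit", all.x = TRUE)

# merge() sorts its result by the key column, so sort the match() result
# the same way before comparing.
by_match_sorted <- by_match[order(by_match$Fruit), ]
rownames(by_match_sorted) <- NULL
print(all.equal(by_match_sorted, by_merge))

# Optional: the dplyr version, which keeps the row order of fr2.
if (requireNamespace("dplyr", quietly = TRUE)) {
  by_join <- dplyr::left_join(fr2, fr1, by = "Fruit")
  print(all.equal(by_join, by_match))
}
```

The main practical difference visible here is row order: merge() returns rows sorted by Fruit, while the match() approach and dplyr::left_join() leave the rows of fr2 where they were.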