Brilliant, David, thank you so much!

Cheers,
Rune

> On 16 May 2017 at 18:44, David L Carlson <dcarl...@tamu.edu> wrote:
>
> Fixing a typo in the original, adding a simplification, and using
> dissimilarity instead of similarity:
>
> set.seed(42)
> dta <- data.frame(ID=1:7, gender=sample(c("M", "F"), 7, replace=TRUE),
>     age=sample.int(75, 7))
> dsim <- dist(dta$age) # distance, already lower triangular
> dsim
>
> dta1 <- dta
> names(dta1) <- paste0(names(dta), "1") # generalizes to more than 3 columns
> dta2 <- dta
> names(dta2) <- paste0(names(dta), "2")
>
> dta12 <- merge(dta2, dta1) # order is important
> dta12 <- dta12[dta12$ID1 < dta12$ID2, ] # get rid of duplicates
>
> dta12 <- data.frame(dta12, dsim=as.vector(dsim)) # Typo was here
> dta12 <- dta12[, c("ID1", "ID2", "gender1", "gender2", "age1", "age2",
>     "dsim")]
> dta12
>
> David C
>
>
> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of David L Carlson
> Sent: Tuesday, May 16, 2017 11:21 AM
> To: Rune Grønseth <nielsenr...@me.com>; r-help@r-project.org
> Subject: Re: [R] Extracting metadata information to corresponding dissimilarity matrix
>
> I think this is what you are trying to do.
> I've created a data set with 7 rows and a similarity matrix based on age:
>
> set.seed(42)
> dta <- data.frame(ID=1:7, gender=sample(c("M", "F"), 7, replace=TRUE),
>     age=sample.int(75, 7))
> sim <- max(dist(dta$age)) - dist(dta$age) # already lower triangular
> sim
>
> #    1  2  3  4  5  6
> # 2 24
> # 3 21 59
> # 4 40 46 43
> # 5  0 38 41 22
> # 6  7 45 48 29 55
> # 7 55 31 28 47  7 14
>
> # Now duplicate dta:
> dta1 <- dta
> names(dta1) <- c("ID1", "gender1", "age1")
> dta2 <- dta
> names(dta2) <- c("ID2", "gender2", "age2")
>
> # Now merge and eliminate unneeded rows
> dta12 <- merge(dta2, dta1) # order is important
> dta12 <- dta12[dta12$ID1 < dta12$ID2, ]
>
> # Finally combine the similarities with the combined data and rearrange
> # the variable names
> dta12 <- data.frame(dta12, sim=as.vector(sim)) # originally posted as dta12mod, a typo
> dta12 <- dta12[, c("ID1", "ID2", "gender1", "gender2", "age1", "age2", "sim")]
> dta12
>
> #    ID1 ID2 gender1 gender2 age1 age2 sim
> # 2    1   2       F       F   11   49  24
> # 3    1   3       F       M   11   52  21
> # 4    1   4       F       F   11   33  40
> # 5    1   5       F       F   11   73   0
> # 6    1   6       F       F   11   66   7
> # 7    1   7       F       F   11   18  55
> # 10   2   3       F       M   49   52  59
> # 11   2   4       F       F   49   33  46
> # 12   2   5       F       F   49   73  38
> # 13   2   6       F       F   49   66  45
> # 14   2   7       F       F   49   18  31
> # 18   3   4       M       F   52   33  43
> # 19   3   5       M       F   52   73  41
> # 20   3   6       M       F   52   66  48
> # 21   3   7       M       F   52   18  28
> # 26   4   5       F       F   33   73  22
> # 27   4   6       F       F   33   66  29
> # 28   4   7       F       F   33   18  47
> # 34   5   6       F       F   73   66  55
> # 35   5   7       F       F   73   18   7
> # 42   6   7       F       F   66   18  14
>
> -------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Rune Grønseth
> Sent: Tuesday, May 16, 2017 4:31 AM
> To: r-help@r-project.org
> Subject: [R] Extracting metadata information to corresponding dissimilarity matrix
>
> Hi,
> I am an R beginner. I've tried googling and reading, but this might be too
> simple to be found in the documentation.
>
> I have a dissimilarity index (symmetric matrix) from which I have extracted
> the unique values using the ecodist package command "lower". There are 14
> observations, so there are 91 unique comparisons.
>
> After this I'd like to extract the corresponding metadata from a separate
> data frame (the 14 observations organized in rows identified by a
> samplenumber vector, plus other variables such as gender, age, et cetera).
> The aim is to have a new data frame with 91 rows, with columns giving the
> value of the dissimilarity index and the gender of each of the two
> observations that are compared by the dissimilarity metric. So if I'm
> looking for gender differences, I need 5 vectors in the data frame:
> samplenumber1, samplenumber2, gender1, gender2 and dissimilarity metric.
>
> Does anyone have suggestions or experience in reformatting data in this
> manner? This is just a test dataset. My full dataset has more than 100
> observations, so I need more general code, if that is possible.
>
> With great appreciation of any help.
>
> Rune Grønseth
>
> ---
>
> Rune Grønseth, MD, PhD, postdoctoral fellow
> Department of Thoracic Medicine
> Haukeland University Hospital
> N-5021 Bergen
> Norway
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
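For the "more general code" asked for in the original question, the merge-and-filter steps in the thread could perhaps be wrapped in a function. The following is only a minimal sketch, not something posted in the thread: the name pair_metadata and its arguments are hypothetical, and it swaps the merge() approach for direct indexing with lower.tri(), relying on the fact that a "dist" object stores the lower triangle column by column. It assumes the rows of the metadata are in the same order as the observations used to build the distances.

```r
# Hypothetical helper (not from the thread): attach metadata for both members
# of each pair to the values of a "dist" object.
# Assumes row i of `meta` describes observation i of `d`.
pair_metadata <- function(meta, d, value = "dsim") {
  stopifnot(inherits(d, "dist"))
  n <- attr(d, "Size")
  stopifnot(nrow(meta) == n)

  # Row/column positions of the lower triangle in column-major order,
  # i.e. (2,1), (3,1), ..., (n,1), (3,2), ..., the order dist() stores values.
  idx <- which(lower.tri(matrix(NA, n, n)), arr.ind = TRUE)

  first  <- meta[idx[, "col"], , drop = FALSE]  # lower-numbered observation
  second <- meta[idx[, "row"], , drop = FALSE]  # higher-numbered observation
  names(first)  <- paste0(names(meta), "1")
  names(second) <- paste0(names(meta), "2")

  out <- data.frame(first, second, as.vector(d))
  names(out)[ncol(out)] <- value
  rownames(out) <- NULL
  out
}

# First four observations of the thread's example, typed in by hand
meta <- data.frame(ID = 1:4,
                   gender = c("F", "F", "M", "F"),
                   age = c(11, 49, 52, 33))
pairs <- pair_metadata(meta, dist(meta$age))
pairs
```

Because the distance here is just the absolute age difference, comparing pairs$dsim with abs(pairs$age1 - pairs$age2) gives a quick check that the metadata lines up with the distances. Unlike merge(), this never builds the full n-by-n set of row combinations, which may matter with many observations.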