Hello Jeff, thanks a lot for your help. It seems to work well now.

Greetings

Birgit

On 28.09.2007 at 20:33, Jeffrey Robert Spies wrote:

> Hi Birgit,
>
> I've updated the recipe here, including a change to the
> dissimilarity function (making it more efficient):
>
> http://www.r-cookbook.com/node/40
>
> You'll notice the change is:
>
> dissimilar <- function(tRow){
>   (sum(tRow==FALSE, na.rm=TRUE) + sum(is.na(tRow)))/length(tRow)
> }
>
> It's actually about 40% faster to use sums instead of sub-setting
> the lists and using lengths (but the speed increase will only be
> noticeable on very, very, very long lists).
>
> --Jeff.
>
> On Sep 28, 2007, at 12:47 PM, Birgit Lemcke wrote:
>
>> Thanks a lot for both solutions to my problem.
>>
>> I tried them immediately and I understand how they work.
>>
>> The next problem for me is how to deal with the NAs. I thought
>> perhaps it is possible to exclude a variable from the row
>> comparison if one of the rows has an NA there?
>> Furthermore, it would then be useful to divide the resulting number
>> by the number of variables used for the comparison, to get back a
>> number between 0 and 1.
>>
>> Unfortunately I am able to understand what happens if somebody
>> gives me the code, but I am not able at the moment to write it
>> myself. I hope this will change by and by.
>>
>> So I would be very pleased if you could help me once again.
>>
>> Greetings
>>
>> Birgit
>>
>>
>> On 28.09.2007 at 18:25, Jeffrey Robert Spies wrote:
>>
>>> Not sure how you want to handle the NAs, but you could try the
>>> following:
>>>
>>> #start
>>> MalVar29_37 <- read.table(textConnection("V1 V2 V3 V4 V5 V6 V7 V8 V9
>>> 0 0 0 0 0 1 0 0 0
>>> 0 0 0 0 0 1 0 0 0
>>> 0 0 0 0 0 1 0 0 0
>>> NA NA NA NA NA NA NA NA NA
>>> 0 1 0 0 0 1 0 0 0"), header=TRUE)
>>>
>>> FemVar29_37 <- read.table(textConnection("V1 V2 V3 V4 V5 V6 V7 V8 V9
>>> 1 1 0 0 0 0 0 0 0
>>> 0 1 0 0 1 1 0 0 0
>>> 1 0 0 1 0 0 0 0 0
>>> 0 1 0 0 1 0 0 0 0
>>> 0 1 0 0 0 0 0 0 0"), header=TRUE)
>>>
>>> comparison <- MalVar29_37 == FemVar29_37
>>>
>>> dissimilar <- function(tRow){
>>>   length(tRow[tRow==FALSE])
>>> }
>>>
>>> dissimilarity <- apply(comparison, c(1), dissimilar)
>>> dissimilarity
>>> # finish
>>>
>>> Variable comparison is an entry-by-entry comparison, resulting in
>>> values of TRUE or FALSE. I've defined a function dissimilar as the
>>> number of FALSEs in a given object (tRow). Variable dissimilarity is
>>> then the application of this dissimilar function to each row of
>>> comparison. In this example, 0 means all of the entries in a row
>>> matched, 9 means none of them matched. You can see the solution here
>>> in recipe form: http://www.r-cookbook.com/node/40
>>>
>>> Hope this helps,
>>>
>>> Jeff.
>>>
>>> On Sep 28, 2007, at 11:13 AM, Birgit Lemcke wrote:
>>>
>>>> Hello!
>>>>
>>>> I am an R beginner and I have a question about simple matching.
>>>>
>>>> I have two datasets that I read in with:
>>>>
>>>> MalVar29_37<-read.table("MalVar29_37.csv", sep = ";")
>>>> FemVar29_37<-read.table("FemVar29_37.csv", sep = ";")
>>>>
>>>> They look like this and show binary variables:
>>>>
>>>>   V1 V2 V3 V4 V5 V6 V7 V8 V9
>>>> 1  0  0  0  0  0  1  0  0  0
>>>> 2  0  0  0  0  0  1  0  0  0
>>>> 3  0  0  0  0  0  1  0  0  0
>>>> 4 NA NA NA NA NA NA NA NA NA
>>>> 5  0  1  0  0  0  1  0  0  0
>>>>
>>>>   V1 V2 V3 V4 V5 V6 V7 V8 V9
>>>> 1  1  1  0  0  0  0  0  0  0
>>>> 2  0  1  0  0  1  1  0  0  0
>>>> 3  1  0  0  1  0  0  0  0  0
>>>> 4  0  1  0  0  1  0  0  0  0
>>>> 5  0  1  0  0  0  0  0  0  0
>>>>
>>>> each with 348 rows.
>>>>
>>>> I would like to perform a simple matching, but only row 1 compared to
>>>> row 1, row 2 compared to row 2 (paired), giving back a number as
>>>> dissimilarity for each comparison.
>>>>
>>>> How can I do that?
>>>>
>>>> Thanks in advance
>>>>
>>>> Birgit
>>>>
>>>> Birgit Lemcke
>>>> Institut für Systematische Botanik
>>>> Zollikerstrasse 107
>>>> CH-8008 Zürich
>>>> Switzerland
>>>> Ph: +41 (0)44 634 8351
>>>> [EMAIL PROTECTED]

Birgit Lemcke
Institut für Systematische Botanik
Zollikerstrasse 107
CH-8008 Zürich
Switzerland
Ph: +41 (0)44 634 8351
[EMAIL PROTECTED]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
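
A note on the NA question raised earlier in the thread: the updated dissimilar function counts every NA as a mismatch and divides by the full row length. Below is a minimal sketch of the other reading of the request, where a variable is left out of a row's comparison when either entry is NA and the mismatch count is divided by the number of variables actually compared. The name dissimilar_pairwise is hypothetical and not part of the recipe, and returning NA for a row with nothing left to compare is an assumption of this sketch, not something decided in the thread.

```r
# Example data copied from the thread (five rows of each data set).
MalVar29_37 <- read.table(text = "V1 V2 V3 V4 V5 V6 V7 V8 V9
0 0 0 0 0 1 0 0 0
0 0 0 0 0 1 0 0 0
0 0 0 0 0 1 0 0 0
NA NA NA NA NA NA NA NA NA
0 1 0 0 0 1 0 0 0", header = TRUE)

FemVar29_37 <- read.table(text = "V1 V2 V3 V4 V5 V6 V7 V8 V9
1 1 0 0 0 0 0 0 0
0 1 0 0 1 1 0 0 0
1 0 0 1 0 0 0 0 0
0 1 0 0 1 0 0 0 0
0 1 0 0 0 0 0 0 0", header = TRUE)

# Entry-by-entry comparison; an entry is NA wherever either input is NA.
comparison <- MalVar29_37 == FemVar29_37

# Hypothetical helper: share of mismatches among the comparable entries.
dissimilar_pairwise <- function(tRow) {
  used <- sum(!is.na(tRow))            # variables usable in this row
  if (used == 0) return(NA_real_)      # assumption: nothing to compare -> NA
  sum(!tRow, na.rm = TRUE) / used      # mismatches / variables compared
}

dissimilarity <- apply(comparison, 1, dissimilar_pairwise)
dissimilarity
# rows 1 to 5: 3/9, 2/9, 3/9, NA, 1/9
```

With this variant, row 4 (all NA in the first data set) yields NA instead of 1, and every other row gives a value between 0 and 1 based only on the variables present in both rows.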