You could give 1-nearest neighbor classification a try. For example:

# first measurement, with known identities
a <- data.frame(person=1:10, ht=rnorm(10, mean=5, sd=1),
                wt=rnorm(10, mean=180, sd=30), bp=rnorm(10, mean=120, sd=10))
# measurement error added on the second pass
meas.err <- data.frame(ht=rnorm(10, sd=0.1), wt=rnorm(10, sd=3),
                       bp=rnorm(10, sd=1))
# second measurement, rows shuffled so the identities are lost
b <- (a[, -1] + meas.err)[sample(10), ]

library(class)
# label each row of b with the person whose first measurement is nearest;
# note that knn1 uses Euclidean distance on the raw columns and can give
# the same person to more than one row of b
b$person <- knn1(a[, -1], b, a$person)

On Mon, Jun 24, 2013 at 8:47 PM, Leif Kirschenbaum
<kirschenbaum.l...@ssd.loral.com> wrote:

> Dear list,
> I've searched the archives and tried some code, however would appreciate
> some input - even a pointer in the direction of the correct function to use.
>
> Given N samples each of which is measured for characteristics x1, x2,
> x3,... (m 6) where each characteristic is a roughly normally distributed
> numeric, but with different center and scale.
> Then the N samples are measured again for characteristics again as x1, x2,
> x3,..., however the identity of the samples is unknown.
>
> Is there a function which will assign the unique identities from the first
> measurement to the second measurement?
>
> I've tried scaling by using the pooled variance of each x1 (i.e. 2N values
> to estimate the variance of the measure of characteristic x1, the
> characteristic x2, etc.) to construct the normalized distance from one
> sample's second measurement x1, x2, x3... to each of the first measurements
> and then pick the minimum distance to assign an identity to the second
> measurement. Then loop over all the second measurements to find the first
> measurement "closest" to it.
> However I result with one sample ID from the first measurement being
> assigned to multiple second measurements.
>
> How could I minimize the matching between the second measurements and the
> first with unique sample ID assignment?
>
>
> Example:
> measure height, weight, and blood pressure of 100 people with their names
> recorded (scale and ruler both have some random unknown error)
> measure the height, weight, and blood pressure of those 100 people again,
> but you forgot to write down their names. (assume that the scale and ruler
> errors have not changed since the first measurement)
>
> How to assign the second set of measurements to the first?
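
If every ID has to be used exactly once, nearest-neighbor matching will not enforce that by itself. One option is to treat it as an assignment problem: build a matrix of scaled distances between second and first measurements and pick the one-to-one matching with the smallest total distance. The following is only a rough sketch under some assumptions: it relies on the clue package (its solve_LSAP function solves the linear sum assignment problem with the Hungarian method), it scales each characteristic by its standard deviation in the first measurement rather than by a pooled variance, and the data are simulated purely for illustration.

```r
# Sketch only: unique one-to-one matching as an assignment problem.
# Assumes the 'clue' package is installed; all data below are simulated.
library(clue)

set.seed(1)
n <- 10
a <- data.frame(person = 1:n,
                ht = rnorm(n, mean = 5, sd = 1),
                wt = rnorm(n, mean = 180, sd = 30),
                bp = rnorm(n, mean = 120, sd = 10))
meas.err <- data.frame(ht = rnorm(n, sd = 0.1),
                       wt = rnorm(n, sd = 3),
                       bp = rnorm(n, sd = 1))
b <- (a[, -1] + meas.err)[sample(n), ]   # identities lost by shuffling

# Put the characteristics on a common scale
# (assumption: per-column SD of the first measurement).
s <- sapply(a[, -1], sd)
a.sc <- sweep(as.matrix(a[, -1]), 2, s, "/")
b.sc <- sweep(as.matrix(b), 2, s, "/")

# cost[i, j] = squared scaled distance from row i of b to row j of a
cost <- as.matrix(dist(rbind(b.sc, a.sc)))[1:n, n + 1:n]^2

# Each row of b is assigned a distinct row of a, minimizing the total cost.
assignment <- solve_LSAP(cost)
b$person <- a$person[as.integer(assignment)]
```

Unlike the row-by-row minimum, this cannot hand the same first-measurement ID to two second measurements, although with large measurement error individual matches can still be wrong.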
>
> Leif Kirschenbaum, Ph.D., PMP
> Principal Reliability Engineer
> SSL
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.