For a much simpler solution that always works for numbers, see the methods of unique() for matrices and data frames.
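As a rough illustration of that suggestion, the sketch below numbers the distinct rows returned by unique() and maps every original row back to its number. The data frame `df`, its column names, and the helper `key()` are made up for the example; pasting the columns into a string key is just one convenient way to do the lookup and assumes the values print distinctly.

```r
# Hypothetical example data: three numeric columns, rows 1 and 3 identical.
df <- data.frame(a = c(1, 2, 1, 3),
                 b = c(10, 20, 10, 30),
                 c = c(5, 5, 5, 7))

# unique() on a data frame keeps one copy of each distinct row,
# in order of first appearance.
u <- unique(df)
u$id <- seq_len(nrow(u))

# Helper (an assumption of this sketch): build one string key per row
# so that rows of df can be looked up in u with match().
key <- function(d) do.call(paste, c(d[c("a", "b", "c")], sep = "\r"))

df$id <- u$id[match(key(df), key(u))]
df$id
```

Identical rows receive the same id, and the ids are consecutive integers rather than values computed from the columns themselves.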

On Thu, 4 Sep 2008, Henrik Bengtsson wrote:

On Thu, Sep 4, 2008 at 8:44 PM, Ralph S. <[EMAIL PROTECTED]> wrote:

Hi all,

I am trying to create a unique identifier for each row, combining numbers from three columns.

Do you know if there is a general formula to do this (or some manual where I can read about this)?

I figure I can use the numeric entries of the columns as "coordinates" and multiply them with different coefficients (different magnitudes) to get the unique ID - but it would be nice to read about such algorithms in general.

What are your numbers?  Are they in a fixed range?  Integers or reals?
If they are fixed-range integers, it is easy.  Think of a regular
positional number representation, e.g. binary, octal, decimal or
hexadecimal.
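A minimal sketch of the positional idea, assuming all three columns hold non-negative integers strictly below some known bound `base`. The function name `make_id` and the choice of 100 as the base are illustrative only.

```r
# Treat (x, y, z) as the three "digits" of a number written in base `base`.
# Assumes 0 <= x, y, z < base, so every triple maps to a distinct id.
make_id <- function(x, y, z, base) {
  vals <- c(x, y, z)
  stopifnot(all(vals >= 0), all(vals < base), all(vals == round(vals)))
  (x * base + y) * base + z
}

ids <- make_id(x = c(1, 12, 0),
               y = c(2, 0, 99),
               z = c(3, 7, 99),
               base = 100)
ids
# 10203 120007 9999
```

The id can be decoded again with integer division and remainders (`%/%` and `%%`). Since R stores these values as doubles, the result is only exact while `base^3` stays below 2^53.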

For a more generic solution that works with any data type, see e.g.
MD5 [http://en.wikipedia.org/wiki/MD5].  It is not guaranteed to
generate unique codes, but it is extremely rare for two different
inputs to give the same MD5 code.  MD5 (and other hash functions) are
implemented in the 'digest' package, e.g.

library(digest)
digest(list(a=1, b=list(1:10, c=letters)))
[1] "73e0ae066a97bfff7f79d41c65b55fde"
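Applied to the original question, one possible way to get a per-row code is to hash the values of each row separately. This is only a sketch: the data frame `df` and the helper `row_md5()` are invented for the example, and row names are deliberately dropped so that identical rows hash identically.

```r
library(digest)

# Hypothetical example data: rows 1 and 3 are identical.
df <- data.frame(a = c(1, 2, 1),
                 b = c(10, 20, 10),
                 c = c(5, 5, 5))

# Hash the bare values of each row. unlist(..., use.names = FALSE) strips
# column and row names; with columns of mixed types it would coerce them
# to a common type, which this sketch does not try to handle.
row_md5 <- function(d) {
  vapply(seq_len(nrow(d)), function(i) {
    digest(unlist(d[i, ], use.names = FALSE), algo = "md5")
  }, character(1))
}

df$id <- row_md5(df)
df$id[1] == df$id[3]   # identical rows give the same code
```

Each id is a 32-character hexadecimal string, so it is longer than a numeric code but needs no knowledge of the range of the columns.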

My $.02

/Henrik



Any links/input would be great -

Ralph

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
