David Andrew Smith from XL Solutions emailed me privately to clear up the
confusion, too (thanks, David A).  To avoid any future confusion, I'll use
my middle initial (M) from now on.

To get this thread back to R and statistics, there might be an interesting
counting problem here somewhere. :)  What are the chances that two David
Smiths could both be working for companies in the US that support R?

I'm not a genealogist, but the US Census Bureau publishes some statistics
on the frequency of names here:

http://www.census.gov/genealogy/names/names_files.html

"David" is the sixth most common first name in the US (with a frequency of
2.363%) and "Smith" is the most common last name (with a frequency of
1.006%).  Now, presumably first and last names aren't chosen independently
(choosing a first name from the same ethnic background as the last name
comes immediately to mind), but I don't know of any first/last name
frequency data.  Anyway, let's be naive and treat them as independent: of
the approximately 138M males in the US population
(http://www.census.gov/prod/2005pubs/censr-20.pdf), 2.363% x 1.006% =
0.024% are called David Smith, for a total of about 33,000 individuals.
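
For anyone who wants to check the arithmetic, here is a minimal sketch in
Python.  It uses only the figures quoted above and the same naive
independence assumption; the variable names are mine.

```python
# Figures quoted in the post (US Census name frequencies, male population).
P_DAVID = 0.02363        # frequency of the first name "David"
P_SMITH = 0.01006        # frequency of the last name "Smith"
US_MALES = 138_000_000   # approximate number of males in the US

# Naive assumption: first and last names are chosen independently,
# so the joint frequency is just the product of the two frequencies.
p_david_smith = P_DAVID * P_SMITH
expected_david_smiths = p_david_smith * US_MALES

print(f"P(David Smith) = {p_david_smith:.4%}")
print(f"Expected count = {expected_david_smiths:,.0f}")
```

If first and last names are positively associated, as suspected above, the
true count would be higher than this estimate.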

This is the bit where I get stuck.  If we narrow the problem to just R
users, I need an estimate of the size of the population of R users to go
forward.  (Anyone have any info on that?)  Then we'd need to limit it to R
users working for R companies and...

And then, on reflection, I guess we should simply be calculating the
likelihood that two individuals in a population have the same name (given
that the shared name being David Smith was only established after the
fact), which makes it a complex form of the Birthday Problem.  Hmmm.  Any
genealogists want to chime in?
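
A rough sketch of the two calculations, in Python.  Everything here is an
assumption on my part: people are treated as independent draws, the
function names are made up for illustration, and the second function
pretends all names are equally likely (the textbook Birthday Problem).
Real name frequencies are far from uniform, which makes a shared name more
likely than the uniform case suggests, so treat it as a lower bound.

```python
import math


def p_at_least_two_with_name(p: float, n: int) -> float:
    """P(at least two of n independent people have one specific name
    whose population frequency is p)."""
    p_none = (1 - p) ** n
    p_exactly_one = n * p * (1 - p) ** (n - 1)
    return 1 - p_none - p_exactly_one


def p_any_shared_name_uniform(n: int, k: int) -> float:
    """Classic Birthday Problem: P(some pair among n people shares a name)
    when there are k equally likely names (a simplifying assumption)."""
    if n > k:
        return 1.0  # pigeonhole: more people than names
    # Sum logs of (1 - i/k) rather than multiplying, to avoid underflow.
    log_p_all_distinct = sum(math.log1p(-i / k) for i in range(n))
    return 1 - math.exp(log_p_all_distinct)


if __name__ == "__main__":
    p_david_smith = 0.02363 * 0.01006  # naive frequency from above
    n_people = 1000                    # hypothetical group size, not a real estimate
    print(p_at_least_two_with_name(p_david_smith, n_people))
```

The first function answers the "two David Smiths" question for a given
group size; the second is closer to the after-the-fact question of any two
people sharing any name.  Both still need the missing estimate of how many
people work for R companies.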

# David M Smith

-- 
David Smith <[EMAIL PROTECTED]>
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (Seattle, USA)

On Tue, Oct 28, 2008 at 11:57 AM, eugene dalt <[EMAIL PROTECTED]> wrote:

>  I attended a course at XLSolutions and glad they can publish a list of
> recommended R books.
>
> IMHO they shouldn't  post it here though.
>
> This name issue is hilarious given thousands of David Smith out there but
> to cut the story short, my class at XLSolutions was taught by a David Andrew
> Smith who prefers to be called Drew because there are 2 David smith in his
> department.
>
> My 2 cents !
>
>
>


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
