Hello Niels,

Thank you for writing this.
In my original post I wrapped the word "real" in quotes for just the reason you've mentioned (although I didn't know of Xi'an's post), so I agree with you that I should have wrapped the sentence in words of caution instead of merely in quotes.

Cheers,
Tal

----------------Contact Details:-------------------------------------------------------
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
----------------------------------------------------------------------------------------------

On Fri, Apr 22, 2011 at 5:43 PM, Niels Richard Hansen <niels.r.hansen+li...@math.ku.dk> wrote:

> Tal
>
> Let me express some concern about using words like "true" or "real"
> in relation to random number generation - for exactly the same reasons
> as mentioned here:
>
> http://xianblog.wordpress.com/2010/09/07/truly-random/
>
> Device random number generators (whether provided via web-services or not)
> should be regarded with as much skepticism as algorithmic generators, and
> they typically don't have a set.seed() function for reproducibility -- you
> would have to store the entire sequence.
>
> - Niels
>
>
> On 22/04/11 04.28, Tal Galili wrote:
>
>> BTW, Ken Kleinman recently wrote a post on how to get "real" random
>> numbers (into R) from a web-service:
>>
>> http://www.r-bloggers.com/example-8-35-grab-true-not-pseudo-random-numbers-passing-api-urls-to-functions-or-macros/
>>
>> Cheers,
>> Tal
>>
>> On Fri, Apr 22, 2011 at 6:47 AM, Joshua Wiley <jwiley.ps...@gmail.com> wrote:
>>
>>> On Thu, Apr 21, 2011 at 8:34 PM, Penny Bilton <pennybil...@xnet.co.nz> wrote:
>>>
>>>> Hi Josh,
>>>>
>>>> Thanks for your reply.
>>>>
>>>> The problem I have is in trying to retain the proportions of 2 groups in
>>>> my data while sampling into training and test sets. I find that different
>>>> arguments for set.seed give very different proportions of my 2 groups in
>>>> the training and test sets.
>>>
>>> Sure, just because numbers are random does not guarantee that equal
>>> numbers from both groups will be sampled. Perhaps you are looking for
>>> some sort of constrained random sampling like sampling x from group 1
>>> and x from group 2? If so, try calling sample() separately on each
>>> group (for help applying the same function to different groups, take a
>>> look at ?by or ?tapply for example).
>>>
>>> Josh
>>>
>>> PS cced back to list
>>>
>>>> Penny.
>>>>
>>>> On 22/04/2011 3:27 p.m., Joshua Wiley wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> On Thu, Apr 21, 2011 at 8:18 PM, Penny Bilton <pennybil...@xnet.co.nz> wrote:
>>>>>
>>>>>> I am using /set.seed()/ before the /sample/ function.
>>>>>>
>>>>>> How does the length of the argument of /set.seed()/ and order of the
>>>>>> digits affect how the sampling is carried out?
>>>>>
>>>>> You can use set.seed() to specify a particular seed so that while
>>>>> pseudo-random numbers are sampled, you can repeat it. For example:
>>>>>
>>>>> set.seed(10)
>>>>> rnorm(10)
>>>>> set.seed(10)
>>>>> rnorm(10)
>>>>>
>>>>>> Specifically, I have used set.seed(123456789). Will this configuration
>>>>>> give me a genuinely random sampling??
>>>>>
>>>>> You will never get truly random sampling from a computer algorithm,
>>>>> but it is darn close and more than adequate in the majority of cases.
>>>>> 123456789 is just a length 1 vector containing the number 123456789,
>>>>> not 9 separate numbers.
>>>>>
>>>>> Google will be able to give you a lot of information on pseudo-random
>>>>> number algorithms as well as the concept of "seeds". Also see
>>>>> ?set.seed
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Josh
>>>>>
>>>>>> Thank you in anticipation.
>>>>>>
>>>>>> Penny.
>>>>>>
>>>>>> [[alternative HTML version deleted]]
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help@r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>> --
>>> Joshua Wiley
>>> Ph.D. Student, Health Psychology
>>> University of California, Los Angeles
>>> http://www.joshuawiley.com/
>
> --
> Niels Richard Hansen                     Web:   www.math.ku.dk/~richard
> Associate Professor                      Email: niels.r.han...@math.ku.dk
> Department of Mathematical Sciences             nielsrichardhan...@gmail.com
> University of Copenhagen                 Skype: nielsrichardhansen.dk
> Universitetsparken 5                     Phone: +1 510 502 8161
> 2100 Copenhagen Ø
> Denmark
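Two points from this thread can be illustrated together: calling sample() separately within each group keeps the group proportions in the training and test sets, and calling set.seed() first makes the split repeatable. What follows is only a minimal sketch under stated assumptions, not code from any of the posters: the helper name stratified_split, the 70/30 split fraction, and the example data (80 observations in group "a", 20 in group "b") are all made up for illustration.

```r
# Hypothetical helper (not from the thread): draw a fixed fraction of the
# indices from each group, so the training set keeps the group proportions.
stratified_split <- function(groups, train_frac = 0.7) {
  idx_by_group <- split(seq_along(groups), groups)
  train_idx <- unlist(lapply(idx_by_group, function(idx) {
    n_train <- round(length(idx) * train_frac)
    # Index into idx instead of calling sample(idx, ...) directly, because
    # sample(x, n) treats a single number x as the range 1:x.
    idx[sample.int(length(idx), n_train)]
  }), use.names = FALSE)
  list(train = sort(train_idx),
       test  = setdiff(seq_along(groups), train_idx))
}

# Made-up data: 80 observations in group "a" and 20 in group "b".
groups <- rep(c("a", "b"), times = c(80, 20))

# Same seed before each call, so both calls produce the same split.
set.seed(123456789)
split1 <- stratified_split(groups)
set.seed(123456789)
split2 <- stratified_split(groups)

identical(split1, split2)      # TRUE: the split is reproducible
table(groups[split1$train])    # 56 from "a", 14 from "b"
table(groups[split1$test])     # 24 from "a", 6 from "b"
```

With this approach the choice of seed changes which observations land in each set, but not how many come from each group, since the per-group counts are fixed before any random numbers are drawn.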