I still like the number 4 option, so I think we need to come up with a formal definition for a "junk" of data. I read somewhere that Tukey coined the word "bit" as it applies to computers, so we can share the credit/blame for "junks" of data.
My proposal for a statistical/data definition of the word junk:

Junk (noun): A quantity of data just large enough to get the client excited about the "great" dataset they provided, but not large enough to make any useful conclusions.

Example sentence: We just received another junk of data from the boss. Who gets to give him the bad news that it still does not prove his pet theory?

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

> -----Original Message-----
> From: Patrick Burns [mailto:[EMAIL PROTECTED]
> Sent: Friday, June 06, 2008 12:58 PM
> To: Gabor Grothendieck
> Cc: Greg Snow; r-help@r-project.org
> Subject: Re: [R] Improving data processing efficiency
>
> My guess is that number 2 is closest to the mark.
> Typing too fast is unfortunately not one of my habitual attributes.
>
> Gabor Grothendieck wrote:
> > On Fri, Jun 6, 2008 at 2:28 PM, Greg Snow <[EMAIL PROTECTED]> wrote:
> >
> >>> -----Original Message-----
> >>> From: [EMAIL PROTECTED]
> >>> [mailto:[EMAIL PROTECTED] On Behalf Of Patrick Burns
> >>> Sent: Friday, June 06, 2008 12:04 PM
> >>> To: Daniel Folkinshteyn
> >>> Cc: r-help@r-project.org
> >>> Subject: Re: [R] Improving data processing efficiency
> >>>
> >>> That is going to be situation dependent, but if you have a
> >>> reasonable upper bound, then that will be much easier and not far
> >>> from optimal.
> >>>
> >>> If you pick the possibly too small route, then increasing the size
> >>> in largish junks is much better than adding a row at a time.
> >>>
> >> Pat,
> >>
> >> I am unfamiliar with the use of the word "junk" as a unit of
> >> measure for data objects. I figure there are a few different
> >> possibilities:
> >>
> >> 1. You are using the term intentionally meaning that you suggest
> >> he increases the size in terms of old cars and broken pianos
> >> rather than used up pens and broken pencils.
> >>
> >> 2. This was a Freudian slip based on your opinion of some datasets
> >> you have seen.
> >>
> >> 3. Somewhere between your mind and the final product
> >> "jumps/chunks" became "junks" (possibly a microsoft "correction",
> >> or just typing too fast combined with number 2).
> >>
> >> 4. "junks" is an official measure of data/object size that I need
> >> to learn more about (the history of the term possibly being
> >> related to 2 and 3 above).
> >>
> >
> > 5. Chinese sailing vessel.
> > http://en.wikipedia.org/wiki/Junk_(ship)
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.