Thank you again to all the R-help folks who responded. I appreciate all the help and insight and will investigate the options suggested.
I guess I am still doing a little head scratching at how the division occurred.

It looks like the default hist(...) behavior is doing the following:

HouseHist<-hist(as.numeric(HouseYear_array))
HouseHist$counts
[1] 2 1 4 4 8 8

That would equate to the following grouping of the years:
[90, 91] (91, 92] (92, 93] (93, 94] (94, 95] (95, 96]

However, the true division is something like the following:

table(as.numeric(HouseYear_array))

1990 1991 1992 1993 1994 1995 1996
   1    1    1    4    4    8    8

Seems like hist behavior could have been:
(89, 90] (90, 91] (91, 92] (92, 93] (93, 94] (94, 95] (95, 96]

Of course, I haven't had any coffee yet...

This goes with the following example:
http://n2.nabble.com/What-is-going-on-with-Histogram-Plots-td3022645.htm

--- On Thu, 6/4/09, ted.hard...@manchester.ac.uk <ted.hard...@manchester.ac.uk> wrote:

> From: ted.hard...@manchester.ac.uk <ted.hard...@manchester.ac.uk>
> Subject: RE: [R] Understanding R Hist() Results...
> To: R-help@r-project.org
> Cc: "Jason Rupert" <jasonkrup...@yahoo.com>
> Date: Thursday, June 4, 2009, 5:13 AM
>
> On 04-Jun-09 04:00:11, Jason Rupert wrote:
> > Think I'm missing something to understand what is going on with
> > hist(...)
> >
> > http://n2.nabble.com/What-is-going-on-with-Histogram-Plots-td3022645.html
> >
> > For my example I count 7 unique years, however, on the histogram there
> > are only 6. It looks like the bin to the left of the tic mark on the
> > x-axis represents the number of entries for that year, i.e. Frequency.
> >
> > I guess it looks like the bin for 1990 is missing. Is there a better
> > way or a different histogram R command to use in order to see all the
> > age bins and for them to be aligned directly over the year tic
> > mark on the x-axis?
> >
> > Thanks again for any insights that can be provided.
>
> It's doing what it's supposed to -- which admittedly could be confusing
> when all your data lie on the exact boundaries between bins.
>
> From ?hist, by default "include.lowest = TRUE, right = TRUE", and:
>
>   If 'right = TRUE' (default), the histogram cells are intervals of
>   the form '(a, b]', i.e., they include their right-hand endpoint,
>   but not their left one, with the exception of the first cell when
>   'include.lowest' is 'TRUE'.
>
> In your data:
>
>   sort(HouseYear_array)
>    [1] "1990" "1991" "1992" "1993" "1993" "1993" "1993" "1994" "1994"
>   [10] "1994" "1994" "1995" "1995" "1995" "1995" "1995" "1995" "1995"
>   [19] "1995" "1996" "1996" "1996" "1996" "1996" "1996" "1996" "1996"
>
> and, with
>
>   H<-hist(as.numeric(HouseYear_array))
>   H$breaks
>   # [1] 1990 1991 1992 1993 1994 1995 1996
>
> so you get 2 (1990,1991) in the [1990-1] bin, 1 in the [1991-2] bin,
> 4 in [1992-3], and so on, exactly as observed.
>
> You can get what you're expecting to see by setting the 'breaks'
> parameter explicitly, and making sure the breakpoints do not
> coincide with data (which ensures that there is no confusion about
> what goes in which bin):
>
>   hist(as.numeric(HouseYear_array),breaks=0.5+(1989:1996))
>
> Ted.
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <ted.hard...@manchester.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 04-Jun-09                                       Time: 11:13:22
> ------------------------------ XFMail ------------------------------

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
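
For anyone who wants to check the bin counts discussed in this thread without drawing a plot, here is a minimal sketch. It assumes the data are exactly the 27 values shown in the sort(HouseYear_array) output quoted above; the original unsorted order is not given in the thread, but order does not affect the counts. The names years, h_default, h_shifted and year_counts are only illustrative and are not from the original posts.

```r
# Rebuild the data from the sort() output quoted in the thread (27 values).
# Assumption: these are all the values; the original order is unknown.
HouseYear_array <- as.character(c(1990, 1991, 1992,
                                  rep(1993, 4), rep(1994, 4),
                                  rep(1995, 8), rep(1996, 8)))
years <- as.numeric(HouseYear_array)

# Default call (plot suppressed): the breaks land exactly on the years.
# Cells are (a, b], except the first, which is [a, b] because
# include.lowest = TRUE, so 1990 and 1991 share the first cell.
h_default <- hist(years, plot = FALSE)
print(h_default$breaks)
print(h_default$counts)

# Breaks shifted by half a year, as Ted suggests: no data value sits on a
# boundary, so each year gets its own cell.
h_shifted <- hist(years, breaks = 0.5 + (1989:1996), plot = FALSE)
print(h_shifted$counts)

# Cross-check against a plain tabulation of the years.
year_counts <- as.vector(table(years))
print(year_counts)
```

With the shifted breaks, the counts should agree with table(), one cell per year. Since the years are really discrete categories here, barplot(table(years)) may be another reasonable way to get one bar centred over each year.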