Dear R Help list,

I have data in a comma-delimited format with an unequal number of lines per group, ranging from 1 to 5. Each line contains one individual's ratings of a televised conference they observed. I'm interested in the influence of group size on ratings.

My questions: how can I create a data frame that will a) treat the group as the unit of analysis and b) permit me to calculate group results, e.g., average ratings per group among those with data (the number of raters varies by group)?

The data are formatted like the example below (but with commas), which contains 3 groups (my unit of analysis). The header is the first row. Groupname is character (which I'll convert later) and the rest are numeric. number is the number of the individual rater within the group, ranging from 1 to 5. rate1 and rate2 range from 0 to 7, but my example has a more limited range. males, females, white and nonwhite are coded 1 = yes, 0 = no.
So I have a maximum of 5 raters in each of 3 groups. Is it necessary to create a rectangular data frame with empty lines for groups with fewer than 5 raters? How should I do that? Or is there a better way? (I have a lot more cases than provided below.)

Thank you for any pointers to tools or a suggested strategy to deal with this.

Pat J.

Header:

Groupname number rate1 rate2 males females white nonwhite

Data:

Blue      1      3     1     1     0       0     1
Blue      2      2     3     1     0       1     0
Orange    1      4     4     0     1       1     0
Yellow    1      3     2     1     0       1     0
Yellow    2      5     2     0     1       0     1
Yellow    3      4     3     1     0       0     1
Yellow    4      4     2     1     0       1     0
Yellow    5      2     2     0     1       0     1
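
To make the goal concrete, here is a minimal sketch of the kind of group-level result described above. It assumes the file can be read with read.csv() and that one row per rater (no padding rows for smaller groups) is acceptable. The object names ratings, group_means, group_size and groups, and the column name n_raters, are illustrative only and not part of the original data.

```r
# Example data from the post, written out as comma-delimited text
txt <- "Groupname,number,rate1,rate2,males,females,white,nonwhite
Blue,1,3,1,1,0,0,1
Blue,2,2,3,1,0,1,0
Orange,1,4,4,0,1,1,0
Yellow,1,3,2,1,0,1,0
Yellow,2,5,2,0,1,0,1
Yellow,3,4,3,1,0,0,1
Yellow,4,4,2,1,0,1,0
Yellow,5,2,2,0,1,0,1"

# One row per rater; groups with fewer than 5 raters simply have fewer rows
ratings <- read.csv(text = txt, stringsAsFactors = FALSE)

# Mean of each rating per group, using whichever raters are present
group_means <- aggregate(cbind(rate1, rate2) ~ Groupname,
                         data = ratings, FUN = mean)

# Group size = number of rater rows per group
group_size <- aggregate(number ~ Groupname, data = ratings, FUN = length)
names(group_size)[2] <- "n_raters"  # illustrative column name

# One row per group, so the group becomes the unit of analysis
groups <- merge(group_means, group_size, by = "Groupname")
print(groups)
```

With a real file, the read.csv(text = txt) call would be replaced by read.csv() on the file path. Note that the formula interface of aggregate() drops rows with a missing value in any of the listed variables, so data with partially missing ratings may need separate handling.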