I (and probably everyone who has looked at your email) have literally no clue what you are trying to do. Take a look at this first. As for your "question", I assume you have tried to aggregate the data frame you have called dataset. What you have produced is a new data frame with one row for each combination of gender and title that occurs in the data. The Category and Salary columns have simply been filled with counts of how many female managers, female sales reps, and so on there are, because no aggregation function was given and length was used as the default. If you are trying to show the average category and salary for each combination, use:

dataset2 <- aggregate(dataset[, 3:4], by = dataset[, 1:2, drop = FALSE], FUN = mean)
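To make the difference between counts and means concrete, here is a minimal sketch in base R. It assumes a small stand-in data frame built from the first six rows of the posted dataset (the real one comes from datalist.csv, which is not reproduced here), and the names counts, means and means2 are only illustrative.

```r
# Stand-in for the poster's data: the first six rows of `dataset` as posted.
# (Assumption: the real data frame has the same four columns.)
dataset <- data.frame(
  Gender   = c("M", "F", "M", "M", "F", "M"),
  Title    = c("Manager", "Manager", "Sales Rep", "Sales Rep", "Manager", "Secretary"),
  Category = c(3, 2, 1, 3, 3, 4),
  Salary   = c(27000, 22500, 18000, 27000, 27000, 31500)
)

# Aggregating with length() gives the number of rows in each
# Gender/Title group, which is what the posted output shows.
counts <- aggregate(dataset[, 3:4], by = dataset[, 1:2, drop = FALSE], FUN = length)

# Aggregating with mean() gives the average Category and Salary per group.
means <- aggregate(dataset[, 3:4], by = dataset[, 1:2, drop = FALSE], FUN = mean)

# The formula interface expresses the same grouping by column name.
means2 <- aggregate(cbind(Category, Salary) ~ Gender + Title, data = dataset, FUN = mean)

print(counts)
print(means)
```

Selecting the columns by name, as in the formula version, is a little more robust than positional indexing if the column order ever changes.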
But, please actually enlighten us as to the nature of your problem.

Rob

-----Original Message-----
From: Carl Witthoft
Sent: Thursday, November 24, 2011 3:08 AM
To: r-help@r-project.org
Subject: Re: [R] what is wrong with this dataset?

As the Kroger Data Munger Guru would say, "What is the problem you are trying to solve?"

The datasets look just fine from a structural point of view. What do you want to do and what is wrong with the results you get?

<quote>
From: Kaiyin Zhong <kindlychung_at_gmail.com>
Date: Thu, 24 Nov 2011 09:39:20 +0800

> d = data.frame(gender=rep(c('f','m'), 5), pos=rep(c('worker', 'manager', 'speaker', 'sales', 'investor'), 2), lot1=rnorm(10), lot2=rnorm(10))
> d
   gender      pos       lot1       lot2
1       f   worker  1.1035316  0.8710510
2       m  manager -0.4824027 -0.2595865
3       f  speaker  0.8933589 -0.5966119
4       m    sales  0.4489920  0.4971199
5       f investor  0.9246900 -0.7531117
6       m   worker  0.2777642 -0.3338369
7       f  manager -1.0890828  0.7073686
8       m  speaker -1.3045821  0.4373199
9       f    sales  0.3092965 -2.6441382
10      m investor -0.5770073 -1.5200347
> cast(melt(d))
Using gender, pos as id variables
   gender      pos       lot1       lot2
1       f investor  0.9246900 -0.7531117
2       f  manager -1.0890828  0.7073686
3       f    sales  0.3092965 -2.6441382
4       f  speaker  0.8933589 -0.5966119
5       f   worker  1.1035316  0.8710510
6       m investor -0.5770073 -1.5200347
7       m  manager -0.4824027 -0.2595865
8       m    sales  0.4489920  0.4971199
9       m  speaker -1.3045821  0.4373199
10      m   worker  0.2777642 -0.3338369

> dataset = read.csv('datalist.csv')
> dataset
   Gender     Title Category Salary
1       M   Manager        3  27000
2       F   Manager        2  22500
3       M Sales Rep        1  18000
4       M Sales Rep        3  27000
5       F   Manager        3  27000
6       M Secretary        4  31500
7       M Sales Rep        2  22500
8       M Secretary        2  22500
9       M    Worker        4  40500
10      M   Manager        4  37100
11      F Secretary        2  22500
12      F   Manager        3  27000
13      M    Worker        2  20000
14      M   Manager        4  32000
15      F Sales Rep        2  22900
16      M Sales Rep        3  27000
17      F Sales Rep        2  22500
18      M   Manager        1  18000
19      M Secretary        3  27000
20      F Sales Rep        3  27000
21      M Secretary        4  31500
22      M    Worker        2  22500
23      M   Manager        2  22500
24      M    Worker        4  40500
25      M    Worker        4  37100
26      F Secretary        2  22500
27      F   Manager        3  27000
28      M    Worker        2  20000
29      M   Manager        4  32000
30      F Sales Rep        2  22900
> cast(melt(dataset))
Using Gender, Title as id variables
Aggregation requires fun.aggregate: length used as default
  Gender     Title Category Salary
1      F   Manager        4      4
2      F Sales Rep        4      4
3      F Secretary        2      2
4      M   Manager        6      6
5      M Sales Rep        4      4
6      M Secretary        4      4
7      M    Worker        6      6

The content of datalist.xls is here: http://paste.pound-python.org/show/15098/

--
Sent from my Cray XK6
"Pendeo-navem mei anguillae plena est."

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.