Hello,
I don't understand why you are splitting data1 and then unlisting the
result.
If you want to apply a modeling function to each of the sub-data frames (subdf's)
obtained by splitting on product name, you can follow more or less these steps:
0. Create a dataset
set.seed(9376) # Make the results reproducible
n <- 100
PN <- c("Target Brand", "3M", "Avery")
data1 <- data.frame(Product_name = sample(PN, n, TRUE),
                    Year_of_Record = sample(2011:2018, n, TRUE),
                    Sales = runif(n, 10, 1000),
                    Region = sample(letters[1:5], n, TRUE)
)
head(data1)
1. Split the dataset by product name. This gives a list of subdf's.
X <- split(data1, data1$Product_name)
2. Now lapply a modeling function to each subdf.
modelFun <- function(DF){
  lm(Sales ~ Region, data = DF)
}
model_list <- lapply(X, modelFun)
model_smry <- lapply(model_list, summary)
model_smry[[1]]
#
#Call:
# lm(formula = Sales ~ Region, data = DF)
#
#Residuals:
#    Min      1Q  Median      3Q     Max
#-487.41 -196.17    1.76  195.96  498.48
#
#Coefficients:
#            Estimate Std. Error t value Pr(>|t|)
#(Intercept)  437.300    108.147   4.044 0.000355 ***
#Regionb      437.019    167.540   2.608 0.014229 *
#Regionc      102.989    179.341   0.574 0.570217
#Regiond      105.520    152.942   0.690 0.495721
#Regione       -5.638    138.342  -0.041 0.967773
#---
#Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
#Residual standard error: 286.1 on 29 degrees of freedom
#Multiple R-squared: 0.2426, Adjusted R-squared: 0.1381
#F-statistic: 2.322 on 4 and 29 DF, p-value: 0.08039
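If it helps, the per-product coefficient tables can also be stacked into one data frame, so the region effects of the three brands can be compared side by side. This is only a sketch: it rebuilds the example objects from the steps above so that it runs on its own, and coefFun and coef_table are names made up for illustration.

```r
# Rebuild the example objects from steps 0 to 2 so this block runs on its own
set.seed(9376)
n <- 100
PN <- c("Target Brand", "3M", "Avery")
data1 <- data.frame(Product_name = sample(PN, n, TRUE),
                    Year_of_Record = sample(2011:2018, n, TRUE),
                    Sales = runif(n, 10, 1000),
                    Region = sample(letters[1:5], n, TRUE))
X <- split(data1, data1$Product_name)
model_list <- lapply(X, function(DF) lm(Sales ~ Region, data = DF))

# coefFun is a hypothetical helper: it returns one row per coefficient,
# tagged with the product the model was fitted to
coefFun <- function(name) {
  cf <- coef(summary(model_list[[name]]))   # matrix of estimates, SEs, t and p
  out <- data.frame(Product_name = name,
                    Term = rownames(cf),
                    cf,
                    check.names = FALSE)    # keep names such as "Std. Error"
  rownames(out) <- NULL
  out
}

# Stack the three coefficient tables row-wise
coef_table <- do.call(rbind, lapply(names(model_list), coefFun))
head(coef_table)
```

Filtering coef_table on Term then shows, for a given region, the estimate for each brand.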
Hope this helps,
Rui Barradas
At 16:54 on 01-06-2018, nguy2952 University of Minnesota wrote:
Hello folks,
I have a big project to work on and the dataset is classified so I am just
going to use my own example so everyone can understand what I am targeting.
Let's take Target as an example: We consider three brands of tape: Target
brand, 3M and Avery. The original data frame has 4 columns: Year of Record,
Product_Name(which contains three brands of tape), Sales, and Region. I
want to create a new data frame that looks like this:
               Year of Record   Sales   Region
Target Brand
3M
Avery
Here is what I did.
1. I split the original data frame, which I called data1:
X = split(data1, Product_name)
2. Unlist X:
X1 = unlist(X)
3. Create a new data frame:
new_df = as.data.frame(X1)
But when I used the command View(new_df), I had only two columns: the left
one shows names similar to TargetBrand.Sales, etc., and the right one is just "X1".
I did not achieve what I wanted.
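For what it is worth, this result can be reproduced on a small made-up data frame. A minimal sketch, assuming a toy data frame named toy (not the real data): unlist() flattens every column of every piece into one long named vector, so as.data.frame() can only produce a single column, and the vector's names end up as row names. Stacking the pieces with do.call(rbind, ...) keeps the original columns instead.

```r
# toy is a made-up stand-in for data1
toy <- data.frame(Product_name = c("3M", "Avery", "3M", "Avery"),
                  Sales = c(10, 20, 30, 40),
                  Region = c("a", "b", "a", "b"))
X <- split(toy, toy$Product_name)

X1 <- unlist(X)              # one long named vector; the column structure is lost
new_df <- as.data.frame(X1)
ncol(new_df)                 # a single column
head(rownames(new_df))       # names built from product, column and position

# Stacking the pieces row-wise keeps the original columns
stacked <- do.call(rbind, X)
names(stacked)
```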
*A potentially big question from readers:*
Why am I doing this?
*Answer:*
I want to run a multiple regression model later to see what the sales of
these three brands of tape look like across different regions:
*Does Mid-west buy more house brand than East Coast?*
or
*Does region really affect the sales? Are Mid-West's purchases similar to
those of East Coast and West Coast?*
I need help. Please give me guidance.
Sincerely,
Hugh N
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.