The data table (and, obviously, the split) contains only character and numeric data.
I found that four regressions in a row work if I use two of the original columns as variables instead of the calculated columns. RAM usage stays below 3 GB!

--> Why does R have such problems with the calculated columns? They are already computed before the regression starts. The steps are:

Create the calculated columns:

    Dataset$ExtraColumn1 <- Dataset$ColumnA / Dataset$ColumnB
    Dataset$ExtraColumn2 <- Dataset$ColumnC / Dataset$ColumnD

Perform the split of the dataset incl. the calculated columns (the criteria for the split have a hierarchy):

    Datasplit <- split(Dataset, paste(Dataset$ColumnE, Dataset$ColumnE))

Perform the regression on the split data:

    Regression1 <- lapply(Datasplit, function(d) lm(ExtraColumn1 ~ ExtraColumn2, d, na.action = na.omit, singular.ok = TRUE))

BTW: There are no NA values in the data source.

What is my mistake? When I calculate the columns I might divide by zero (= Inf). Could that cause the problem in the regression?

Thanks,
Jonas

--
View this message in context: http://r.789695.n4.nabble.com/lm-Regression-takes-24-GB-RAM-Error-message-tp4660434p4660496.html
Sent from the R help mailing list archive at Nabble.com.
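
On the divide-by-zero question, here is a minimal sketch of how one might check the calculated columns for non-finite values before fitting. The column names follow the post, but the toy values are invented for illustration, and the split uses ColumnE alone to keep the example small. Whether Inf values are actually behind the memory usage is not something this shows; it only demonstrates that na.omit does not drop Inf and that lm() stops with an error when Inf reaches the fit.

```r
# Hypothetical toy data: names as in the post, values made up.
Dataset <- data.frame(
  ColumnA = 1:8,
  ColumnB = c(2, 0, 4, 5, 1, 3, 2, 4),  # the zero gives Inf in ExtraColumn1
  ColumnC = c(3, 1, 4, 1, 5, 9, 2, 6),
  ColumnD = c(1, 2, 2, 4, 5, 0, 1, 3),  # the zero gives Inf in ExtraColumn2
  ColumnE = rep(c("x", "y"), each = 4)
)

Dataset$ExtraColumn1 <- Dataset$ColumnA / Dataset$ColumnB
Dataset$ExtraColumn2 <- Dataset$ColumnC / Dataset$ColumnD

# Count values that are Inf, -Inf, NaN or NA in each calculated column.
calc_cols <- c("ExtraColumn1", "ExtraColumn2")
bad_counts <- sapply(Dataset[calc_cols], function(x) sum(!is.finite(x)))
print(bad_counts)

# na.omit removes NA/NaN rows but keeps Inf, so lm() fails on the raw data.
raw_fit <- try(lm(ExtraColumn1 ~ ExtraColumn2, data = Dataset,
                  na.action = na.omit), silent = TRUE)

# Turn non-finite values into NA so that na.omit can drop those rows.
Cleaned <- Dataset
for (col in calc_cols) {
  Cleaned[[col]][!is.finite(Cleaned[[col]])] <- NA
}

Datasplit <- split(Cleaned, Cleaned$ColumnE)
Regression1 <- lapply(Datasplit, function(d)
  lm(ExtraColumn1 ~ ExtraColumn2, data = d, na.action = na.omit))
```

If bad_counts is zero for both columns on the real data, division by zero can be ruled out and the cause has to be looked for elsewhere.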