[R] R how to find outliers and zero mean columns?
Hi team,

I am new to R, so please help me with this task.

Please find the attached data sample. In the original data frame I have 350 features and 40 observations.

I need to carry out these tasks:

1. How to identify features (names) that have all zeros?
2. How to remove features that have all zeros from the dataset?
3. How to identify features (names) that have outliers such as 9, -1 in the data frame?
4. How to remove outliers?

Many thanks

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] R how to find outliers and zero mean columns?
Hi David,

> Please find the attached data sample.

No. Nothing attached. Please read the Rhelp Info page and the Posting Guide.

*I attached it. Anyway, I have attached it again (sample train.xlsx).*

Who is assigning you this task? Homework? (Read the Posting Guide.)

*This is my new job role, so I have to do it. I know some basic R.*

> 1. How to Identify features (names) that have all zeros?

That's generally pretty simple if "names" refers to columns in a data frame.

*You mean something like names(data.nrow(means==0))?*

> 2. How to remove features that have all zeros from the dataset?

But maybe you mean to process by rows?

*In a column (feature).*

> 3. How to identify features (names) that have outliers such as 9,-1 in
> the data frame.

*Please refer to the attached Excel file.*

> 4. How to remove outliers?

You could start by defining "outliers" in something other than vague examples. If this is data from a real-life data gathering effort, then defining outliers would start with an explanation of the context.

*By looking at the data I need to find the outliers.*

*Thanks*

On Thu, Mar 31, 2016 at 12:20 PM, David Winsemius wrote:

> > On Mar 30, 2016, at 3:56 PM, Norman Pat wrote:
> >
> > Hi team
> >
> > I am new to R so please help me to do this task.
> >
> > Please find the attached data sample.
>
> No. Nothing attached. Please read the Rhelp Info page and the Posting
> Guide.
>
> > But in the original data frame I
> > have 350 features and 40 observations.
> >
> > I need to carry out these tasks.
>
> Who is assigning you this task? Homework? (Read the Posting Guide.)
>
> > 1. How to Identify features (names) that have all zeros?
>
> That's generally pretty simple if "names" refers to columns in a data frame.
>
> > 2. How to remove features that have all zeros from the dataset?
>
> But maybe you mean to process by rows?
>
> > 3. How to identify features (names) that have outliers such as 9,-1 in
> > the data frame.
> >
> > 4. How to remove outliers?
>
> You could start by defining "outliers" in something other than vague
> examples. If this is data from a real-life data gathering effort, then
> defining outliers would start with an explanation of the context.
>
> > Many thanks
>
> Please at least do the following "homework".
>
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
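For readers following the thread: the guess `names(data.nrow(means==0))` is not valid R, because `data.nrow` is not an existing function. A minimal sketch of tasks 1 and 2 might look like the following, assuming every column is numeric. The toy data frame `dat` and its column names are made up for illustration, and ignoring missing values in the check is an assumption, not something the thread specifies.

```r
# Hypothetical toy data; the real data has 350 features and 40 observations.
dat <- data.frame(f1 = c(0, 0, 0, 0),
                  f2 = c(1, 0, 9, 2),
                  f3 = c(0, 0, 0, 0),
                  f4 = c(-1, 3, 4, 5))

# TRUE for each column whose values are all zero.
# na.rm = TRUE is an assumption: NAs are ignored, so a column that is
# entirely NA would also be reported as "all zero".
is_all_zero <- vapply(dat, function(col) all(col == 0, na.rm = TRUE), logical(1))

# Task 1: names of the all-zero features.
zero_names <- names(dat)[is_all_zero]

# Task 2: the data frame without those features.
# drop = FALSE keeps a data frame even if only one column remains.
dat_clean <- dat[, !is_all_zero, drop = FALSE]

print(zero_names)
print(names(dat_clean))
```

Using `vapply` rather than `sapply` fixes the return type to one logical value per column, which avoids surprises when the data frame has zero columns.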
Re: [R] R how to find outliers and zero mean columns?
Hi Jim,

Thanks for your reply. I know these basics in R. But let's say you have a data frame X with 300 features. From those 300 features I need to pull out the names of each feature that has zero values for all the observations in the sample. I am looking for a package or a function to do that.

And how do I know whether there are abnormal values for each feature? Let's say I have 300 features and 10 observations. It is hard to check everything in the Excel file, so I am looking for a package that does the work.

I hope you understand.

Thanks a lot

Cheers

On Thu, Mar 31, 2016 at 1:13 PM, Jim Lemon wrote:

> Hi Norman,
> To check whether all values of an object (say "x") fulfill a certain
> condition (==0):
>
> all(x==0)
>
> If your object (X) is indeed a data frame, you can only do this by
> column, so if you want to get the results:
>
> X<-data.frame(A=c(0,1:10),B=c(0,2:10,9),
>  C=c(0,-1,3:11),D=rep(0,11))
> all_zeros<-function(x) return(all(x==0))
> which_cols<-unlist(lapply(X,all_zeros))
>
> If your data frame (or a subset) contains all numeric values, you can
> finesse the problem like this:
>
> which_rows<-apply(as.matrix(X),1,all_zeros)
>
> What you get is a list of logical (TRUE/FALSE) values from lapply, so
> it has to be unlisted to get a vector of logical values like you get
> with "apply".
>
> You can then use that vector to index (subset) the original data frame
> by logically inverting it with ! (NOT):
>
> X[,!which_cols]
> X[!which_rows,]
>
> Your "outliers" look suspiciously like missing values from certain
> statistical packages. If you know the values you are looking for, you
> can do something like:
>
> NA9<-X==9
>
> and then "remove" them by replacing those values with NA:
>
> X[NA9]<-NA
>
> Be aware that all these hackles (diminutive of hacks) are pretty
> specific to this example. Also remember that if this is homework, your
> karma has just gone down the cosmic sinkhole.
>
> Jim
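The thread never settles on what counts as an "outlier", so the sketch below covers two possible readings, both of which are assumptions. Reading (a) treats 9 and -1 as missing-value codes, generalizing Jim's `X==9` step to several codes. Reading (b) uses the boxplot rule from base R's `boxplot.stats`, which flags values more than 1.5 times the hinge spread beyond the hinges. The names `codes`, `flag_outliers`, and the extra column `E` are hypothetical and only there for illustration.

```r
# Jim's example data frame, plus a made-up column E with one extreme value.
X <- data.frame(A = c(0, 1:10), B = c(0, 2:10, 9),
                C = c(0, -1, 3:11), D = rep(0, 11),
                E = c(1:8, 10, 11, 100))

# (a) Assumption: 9 and -1 are codes for missing data, not real measurements.
codes <- c(9, -1)
has_code <- vapply(X, function(col) any(col %in% codes), logical(1))
code_names <- names(X)[has_code]      # features containing a code

# Replace the codes with NA, column by column; X_na[] keeps the data frame.
X_na <- X
X_na[] <- lapply(X_na, function(col) replace(col, col %in% codes, NA))

# (b) Assumption: an outlier is a value flagged by the boxplot rule.
flag_outliers <- function(col) col %in% boxplot.stats(col)$out
outlier_mask <- vapply(X_na, flag_outliers, logical(nrow(X_na)))
outlier_names <- names(X_na)[colSums(outlier_mask) > 0]

# "Remove" flagged values the same way Jim does: set them to NA.
X_out <- X_na
X_out[outlier_mask] <- NA

print(code_names)
print(outlier_names)
```

On Jim's original four columns the boxplot rule flags nothing, which fits his remark that 9 and -1 look like missing-value codes rather than statistical outliers. Whether either rule suits the real data depends on how the data were collected.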
[R] R cases on predictive maintenance
Hi Team,

Can you please suggest some good cases where we can use R programming to tackle predictive maintenance problems?

Many thanks