This is trivial, so perhaps there is a miscommunication. How do you want to handle values outside your desired range? I would simply change them to NA (see below), but perhaps you have something else in mind that you need to describe more explicitly. Anyway, below is a simple example of what I *think* you asked for. Apologies if I have misunderstood.
> set.seed(567)
> ## create a data frame with 3 columns and 5 rows from norm(0,3)
> d <- as.data.frame(lapply(rep(5,3), function(x)round(rnorm(x,0,3),2)))
> names(d) <- LETTERS[1:3]
> d
      A     B     C
1  1.97 -1.23 -3.41
2  1.02 -1.12 -2.27
3 -1.92 -6.37 -6.44
4 -4.32  0.18  4.08
5  0.66 -5.82 -0.81
> d[abs(d) > 3] <- NA
> d
      A     B     C
1  1.97 -1.23    NA
2  1.02 -1.12 -2.27
3 -1.92    NA    NA
4    NA  0.18    NA
5  0.66    NA -0.81

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Mon, May 9, 2022 at 9:44 AM Paul Bernal <paulberna...@gmail.com> wrote:

> Dear Rui,
>
> I was trying to dput() the datasets I am working on, but since the data is
> rather large (42,000 rows by 60 columns) I couldn't retrieve its whole
> structure to include it here, so I am attaching a couple of files. One is
> the raw data (called trainFeatures42k), which is the data I need to
> normalize, and the other is normalized_Data, which is the data normalized
> (or at least I think I managed to normalize it).
>
> Normalized_Data.csv
> <https://drive.google.com/file/d/143I1O710gAqWjzx48Gt1bwUbrG0mbpfa/view?usp=drive_web>
> trainFeatures42k.xls
> <https://drive.google.com/file/d/1deMzGMkJyeVsnRzTKirmm4VqIBRzbvzV/view?usp=drive_web>
>
> I have tried some of the code you and other friends from the community have
> kindly shared, but have not been able to filter values > -3 and < 3.
>
> Thank you all for your valuable help always.
> Best,
> Paul
>
> On Mon, May 9, 2022 at 4:22, Rui Barradas (<ruipbarra...@sapo.pt>) wrote:
>
> > Hello,
> >
> > Something like this?
> > First normalize the data.
> > Then an apply loop creates a logical matrix giving which numbers are in
> > the range -3 to 3.
> > If they are all TRUE then their sum by rows is equal to the number of
> > columns. This creates a logical index i.
> > Use that index i to subset the scaled data set.
> >
> > # test data set, remove the Species column (not numeric)
> > df1 <- iris[-5]
> >
> > df1_norm <- scale(df1)
> > i <- rowSums(apply(df1_norm, 2, \(x) x > -3 & x < 3)) == ncol(df1_norm)
> >
> > # returns a matrix
> > df1_norm[i, ]
> >
> > # returns a data.frame
> > as.data.frame(df1_norm[i,])
> >
> >
> > Hope this helps,
> >
> > Rui Barradas
> >
> > At 09:23 on 09/05/2022, Paul Bernal wrote:
> > > Dear friends,
> > >
> > > I have a dataframe in which every single (i,j) entry (i standing for
> > > the ith row, j for the jth column) has been normalized (converted to
> > > z-scores).
> > >
> > > Now I want to filter or subset the dataframe so that I end up with a
> > > dataframe containing only entries greater than -3 and less than 3.
> > >
> > > How could I accomplish this?
> > >
> > > Best,
> > > Paul
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
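
For comparison, the two readings of the question discussed in this thread (blank out the out-of-range entries, or drop every row that contains one) can be put side by side on one small data frame. This is only a minimal sketch: the data frame `z` and its values are made up for illustration and are assumed to be already normalized, and the comparison `>= 3` is an assumption that takes "greater than -3 and less than 3" as strict bounds.

```r
# Hypothetical toy data (not from the thread): already-normalized values
z <- data.frame(A = c(0.5, -3.2, 1.1, 2.9),
                B = c(-1.0, 0.3, 3.4, -2.5))

# Reading 1 (masking): keep every row, replace entries with |value| >= 3 by NA
masked <- z
masked[abs(masked) >= 3] <- NA

# Reading 2 (row filtering): keep only rows whose entries all lie in (-3, 3)
keep <- rowSums(abs(z) >= 3) == 0
filtered <- z[keep, , drop = FALSE]

masked
filtered
```

Masking preserves the shape of the data, while row filtering returns a smaller data frame. If the data already contain NA values, the rowSums() step needs an explicit decision about them, for example through its na.rm argument.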