Hello dear R-helpers,
I have a small problem in my algorithm. I have sequences of "0" and "1"
values in a column of a huge data frame, and I would just like to keep the
first value of each sequence of "1" values, as in this example:
data <-
data.frame(mydata=c(0,0,0,1,1,1,1,1,0,0,0,0,1,1,
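A minimal sketch of the idea asked about above, not the solution given in the thread. It assumes the goal is to keep a 1 only at the start of each run of 1s and to set every other position to 0; the vector x is a hypothetical stand-in for the column.

```r
# Hypothetical 0/1 vector standing in for the data frame column
x <- c(0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1)

# Value just before each position (0 is assumed before the first element)
previous <- c(0, head(x, -1))

# A run of 1s starts where the current value is 1 and the previous one is not
first_of_run <- as.numeric(x == 1 & previous != 1)
first_of_run
```

Because the comparison is vectorised, this avoids an explicit loop, which matters on a huge data frame.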
Great! Both ways work well on my whole data set!
Thanks guys!
--
View this message in context:
http://r.789695.n4.nabble.com/keep-only-the-first-value-of-a-numeric-sequence-tp4700774p4700783.html
Sent from the R help mailing list archive at Nabble.com.
Dear R-users,
I would like to transpose a large data.frame according to a specific column.
Here's a reproducible example, which should make it easier to understand.
At the moment, my data.frame looks like this example:
DF <- data.frame(id=c("A","A","A","B","B","B","C","C","C"),
Year=c(2001,2002,2003,2002,
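A minimal sketch of one way to go from a long layout to a wide one with base R's reshape(). The example above is cut off, so the data frame "long", its "value" column and the choice of Year as the spreading variable are all assumptions made for illustration.

```r
# Hypothetical long-format data: one row per id and Year
long <- data.frame(id    = c("A", "A", "B", "B"),
                   Year  = c(2001, 2002, 2001, 2002),
                   value = c(1.5, 2.5, 3.5, 4.5))

# One row per id, one column per Year (assumed to be the spreading variable)
wide <- reshape(long, idvar = "id", timevar = "Year", direction = "wide")
wide
```

reshape() names the new columns by pasting the value column name and the Year together, here value.2001 and value.2002.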
Both ways do the job well. Nice!
Thanks again!
--
View this message in context:
http://r.789695.n4.nabble.com/transpose-a-data-frame-according-to-a-specific-variable-tp4702971p4703007.html
Dear R-users,
I would like to speed up a double loop I developed for detecting and
removing outliers in my whole data.frame. The idea is to remove values that
differ too much from the previous value. If an outlier is detected, this test
must be applied to at most the next 10 values following the last corre
Hi,
Maybe this is the beginning of a solution?
test <-
data.frame(x=c(1,1,1,1,1,1,2,2,2,2,2,2),y=c("a","a","a","b","b","b","a","a","b","b","b","a"))
test[order(test$x),]
out <- split(test,test$x)
for (i in 1:length(out)) {
foo <- unique(out[[i]][,2])
out[[i]][,2] <- rep(foo,(nrow(out[[
Hi Petr,
Thanks for your reply,
Actually it's not what I'm looking for. The aim is not simply to remove each
value > 15.
In my loop, I consider the first numeric value of my column as "correct".
Then, I want to test the second value. If the absolute difference with the
previous correct one is <
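A minimal sketch of the rule described above: each value is compared with the last value accepted as correct, not simply with its neighbour. The function name filter_jumps and the threshold of 15 are assumptions (the original condition is cut off), and the limit of 10 following values mentioned earlier is not implemented here.

```r
# Replace by NA every value that differs too much from the last accepted value.
# max_diff = 15 is a hypothetical threshold chosen for illustration.
filter_jumps <- function(x, max_diff = 15) {
  out <- x
  ref <- NA                      # last value accepted as correct
  for (i in seq_along(x)) {
    if (is.na(x[i])) next        # missing values are left untouched
    if (is.na(ref) || abs(x[i] - ref) < max_diff) {
      ref <- x[i]                # accept the value and use it as the new reference
    } else {
      out[i] <- NA               # reject the value, keep the old reference
    }
  }
  out
}

filtered <- filter_jumps(c(10, 2, 50, 40, NA, 0, 50, 1))
filtered
```

The first non-missing value is always accepted, which matches the statement that the first numeric value of the column is considered correct.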
I tried another faster way which seems to do the trick right now:
myts <- data.frame(x=c(10,2,50,40,NA,NA,0,50,1,2,0,0,NA,50,0,15,3,5,4,20,0,0,25,22,0,1,100),z=NA)
test <- function(x){
st1 <- numeric(length(x))
temp <- st1[1]
for (i in 2:(length(x))){
if((!is.na(
Hi everyone,
I have a small problem in my R-code.
Imagine this DF for example:
DF <- data.frame(number=c(1,4,7,3,11,16,14,17,20,19),data=c(1:10))
I would like to add a new column "Station" in this DF. This new column must
be automatically filled with: "V1" or "V2" or "V3".
The choice must be do
Yes this is it!
Thank you for your help Berend!
--
View this message in context:
http://r.789695.n4.nabble.com/create-new-column-in-a-DF-according-to-values-from-another-column-tp4644217p4644225.html
Hi everybody,
I have a small problem with filling NA gaps in my data.
These gaps lie between nearly constant data (temperature under snow). Here's
a fake example to illustrate roughly what it looks like:
DF <-
data.frame(data=c(-0.51,-0.51,-0.48,-0.6,-0.54,-0.38,-0.6,-0.42,NA,NA,
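A minimal sketch of one reading of the question, not the answer given in the thread: each NA gap is filled with the mean of the few values just before and just after it. The function name fill_gaps and the window size n are assumptions; gaps touching either end of the vector are left as they are.

```r
# Fill each run of NAs with the mean of the n values before and n values after it.
fill_gaps <- function(x, n = 3) {
  out <- x
  runs <- rle(is.na(x))
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  for (k in which(runs$values)) {
    # Skip gaps at the very start or end: one side has no neighbours
    if (starts[k] == 1 || ends[k] == length(x)) next
    before <- tail(x[seq_len(starts[k] - 1)], n)
    after  <- head(x[(ends[k] + 1):length(x)], n)
    out[starts[k]:ends[k]] <- mean(c(before, after), na.rm = TRUE)
  }
  out
}

# Hypothetical data resembling the nearly constant temperatures above
filled <- fill_gaps(c(-0.5, -0.5, -0.5, NA, NA, -0.3, -0.3, -0.3))
filled
```

Neighbours are always read from the original vector, so a value filled into one gap never influences the next gap.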
Hello,
Try this; it may help you:
a <- "1,2"
b <- strsplit(a,",") # split the string at each ","
b <- unlist(b) # strsplit() returns a list, so unlist it to get c("1","2")
b <- as.numeric(b) # convert the character vector to the numeric vector c(1,2)
--
View this message in context:
http://r.789695.n4.nabble.com/converting-a-string-to-an-in
Perfect as always, Rui! It's a bit more complicated than I thought,
but it's what I want!
Thank you Rui!
--
View this message in context:
http://r.789695.n4.nabble.com/filling-NA-gaps-according-to-previous-data-mean-and-following-data-mean-tp4646613p4646620.html
Hi everybody,
I have a small problem in my R code which seems easy to solve, but I
haven't been able to find the solution by myself so far.
Here's an example of the form of my data:
data <-
data.frame(col1=c("a","a","b","b"),col2=c(1,1,2,2),col3=c(NA,"ST001","ST002",NA))
I would like to r
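A minimal sketch of one possible reading, not the solution chosen in the thread: among rows that share the same col1 and col2, keep the one whose col3 is not NA. The request itself is cut off above, so this rule is an assumption; the data frame is renamed dat here.

```r
dat <- data.frame(col1 = c("a", "a", "b", "b"),
                  col2 = c(1, 1, 2, 2),
                  col3 = c(NA, "ST001", "ST002", NA))

# Sort so that, within each (col1, col2) group, non-NA col3 rows come first
ordered <- dat[order(dat$col1, dat$col2, is.na(dat$col3)), ]

# duplicated() flags every row after the first of each group, so dropping
# the flagged rows keeps the non-NA row whenever one exists
result <- ordered[!duplicated(ordered[, c("col1", "col2")]), ]
result
```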
Yes, this is the right one, Arun! Thank you very much.
I tried each solution but yours was the best. It works well.
Thanks anyway for all your replies!
--
View this message in context:
http://r.789695.n4.nabble.com/remove-duplicated-row-according-to-NA-condition-tp4691362p4691422.html
Hi everybody,
I have a small problem in a function, about removing short sequences of
identical numeric values.
For the example, we can consider this data, containing only some "0" and
"1":
test <- data.frame(x=c(0,0,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1))
The aim here is simply to remove
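A minimal sketch using rle(), assuming the goal is to turn runs of "1" that are shorter than some minimum length into "0". The sentence above is cut off, so both this rule and the value of min_len are assumptions.

```r
x <- c(0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1)
min_len <- 5                      # hypothetical minimum run length

runs <- rle(x)                    # run lengths and the value of each run
runs$values[runs$values == 1 & runs$lengths < min_len] <- 0
cleaned <- inverse.rle(runs)      # rebuild the full vector from the runs
cleaned
```

Here the run of three 1s is dropped while the run of eight 1s is kept.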
Hello everybody,
I'm trying to create my own color palette in R, in order to interpolate some
different temperature data on different maps (daily means, seasonal
means,...).
I would like to create a color palette which works for each map, so I need a
color palette between -40 and +40°C. Sometimes
Thank you Nicole!
I did it with the "color.palette" function in the link you gave me.
I added then in my levelplot function a sequence with "at":
at=seq(-40,40,1)
And it works quite well.
Thanks again Nicole.
Thanks to you too, Pascal, and long live the CRC as well as the great C. C.!
;)
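A minimal sketch of a fixed colour scale between -40 and +40, using base R's colorRampPalette() in place of the "color.palette" function from the link, which is not shown here. The three colours are an arbitrary choice for illustration.

```r
# Same break points as the "at" sequence used above
breaks <- seq(-40, 40, by = 1)

# colorRampPalette() returns a function that interpolates between the given colours
make_palette <- colorRampPalette(c("blue", "white", "red"))
cols <- make_palette(length(breaks) - 1)   # one colour per interval

# Colour assigned to a temperature of 12.3 (the interval from 12 to 13)
temp_colour <- cols[findInterval(12.3, breaks, rightmost.closed = TRUE)]
```

Because the breaks are fixed, a given temperature gets the same colour on every map, whatever the range of that map's data.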
Dear users,
I have for the moment a function which looks for the best correlation for
each file I have in my correlation matrix. I'm working on a list.files.
Here's the function:
get.max.cor <- function(station, mat){
mat[row(mat) == col(mat)] <- -Inf
which( mat[station, ]
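A minimal sketch of the same idea on a small correlation matrix; it does not complete the truncated function above. The station names and correlation values are invented for illustration.

```r
stations <- c("ST1", "ST2", "ST3")   # hypothetical station names
mat <- matrix(c(1.0, 0.2, 0.9,
                0.2, 1.0, 0.5,
                0.9, 0.5, 1.0),
              nrow = 3, dimnames = list(stations, stations))

# Exclude each station's correlation with itself
diag(mat) <- -Inf

# For each row, the name of the column with the highest correlation
best <- apply(mat, 1, function(r) names(which.max(r)))
best
```

which.max() skips NA entries, so a row with some missing correlations still returns its best available match.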
Hello dear R-users,
I have a problem in my code about ignoring NA values without removing them.
I'm working on a list of files. The aim is to fill one file from another
according to the highest correlation (correlation coeff between all my
files, so the file which most resembles the one I
Thanks for answering Jeff.
Yes, sorry, it's not easy to explain my problem. I'll try to give you a
reproducible example (even if it won't be exactly like my files), and I'll
try to explain my function and what I want to do more precisely.
Imagine for the example: df1, df2 and df3 are my files:
df1
Hello Rui,
Sorry, I read your post after replying to Jeff.
It does indeed seem to be better than ifelse, thanks. But I still have some
errors:
Error in x[1:8700, 1] : incorrect number of dimensions AND
In is.na(xx) : is.na() applied to non-(list or vector) of type 'NULL
It seems to have m
Thanks again, but my errors are still here. Could they be coming from the next
function (I combine these 2 functions, but I thought they were coming from the
first one):
process.all <- function(df.list, mat){
f <- function(station)
na.fill(df.list[[ station ]], df.list[[ m
Hello,
I added your flags in my code but there are still errors.
Actually I tried some things:
- in function "na.fill", I changed:
if(all(!is.na(y[1:8700,1]))) return(NA) to
if(all(!is.finite(y[1:8700,1]))) return(y)
In order to have this file unchanged.
It has removed my dimension problem.
Ok Jeff, but then it'll be a big one. I'm working on a list of files and my
problem depends on different functions used previously. So it's very hard
for me to condense it and still reproduce my error. But here is the
reproducible example with the error at the last line of the code (just copy
and paste it)
Thanks again for your help jeff.
Sorry if I'm not very clear. It's hard to explain in programming terms,
and even harder to explain in English as I'm French.
But I'll try again.
Well, your suggestion removes the error, but it's not the result I'm
expecting. You've removed NULL data.frames, but I need t
Hi everybody,
I have a small question about the function "na.locf" from the package "zoo".
I saw in the help that this function is able to fill NA gaps with the last
value before the NA gap (or with the next value).
But is it possible to fill my NA gaps according to the last AND the next
value at
Seems to work very well!
Thank you very much Gabor!
--
View this message in context:
http://r.789695.n4.nabble.com/using-na-locf-from-package-zoo-to-fill-NA-gaps-tp4635150p4635160.html
Hi everybody.
I'll first explain my problem and what I'm trying to do.
Consider this example:
I'm working on 5 different weather stations.
I have first in one file 3 of these 5 weather stations, containing their
data. Here's an example of this file:
DF1 <- data.frame(station=c("ST001","ST004","ST00
"merge" is enough for me, thanks!
I was thinking about a loop, or a function like "grep", or maybe another
function.
I'll have to think more simply next time!
Thanks again!
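A minimal sketch of the merge() approach mentioned above. The two data frames and the "data" column are invented for illustration, since the original example is cut off; only the "station" column name comes from the post.

```r
# Hypothetical: data known for 3 of the 5 stations
known <- data.frame(station = c("ST001", "ST004", "ST005"),
                    data    = c(1.2, 3.4, 5.6))

# Hypothetical: the full list of 5 stations
all_stations <- data.frame(station = c("ST001", "ST002", "ST003",
                                       "ST004", "ST005"))

# all.x = TRUE keeps every station; those without data get NA
merged <- merge(all_stations, known, by = "station", all.x = TRUE)
merged
```

merge() matches rows on the key column, so no loop or grep() over the station names is needed.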
--
View this message in context:
http://r.789695.n4.nabble.com/duplicate-data-between-two-data-frames-according-to-row-names-tp4
Hello everybody,
I need to calculate seasonal means with temperature data for my work.
I have 70 files coming from weather stations, which look like this for
example:
startdate <- as.POSIXct("01/01/2006", format = "%d/%m/%Y")
enddate <- as.POSIXct("05/01/2006", format = "%d/%m/%Y")
date <- seq(
Thank you both for your answers.
I found a better way to delete the first 2 months (Jan + Feb) and the last
month (Dec), which should work every time:
DF$year <- as.numeric(format(DF$Day, format = "%Y"))
DF$month <- as.numeric(format(DF$Day, format = "%m"))
# delete first 2 months
for(i in DF[1,3]
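A minimal sketch of computing seasonal means from a month column, separate from the truncated loop above. It uses Date rather than POSIXct values, invented temperatures, and a month-to-season mapping that is an assumption; it also pools all winter months of the same calendar year, which is not how a December to February winter is usually grouped.

```r
# Hypothetical daily series for one year
DF <- data.frame(Day = seq(as.Date("2006-01-01"), as.Date("2006-12-31"),
                           by = "day"))
DF$temp <- seq_len(nrow(DF))               # invented values
DF$month <- as.numeric(format(DF$Day, format = "%m"))

# Assumed mapping from month number (1 to 12) to season
season_of_month <- c("winter", "winter", "spring", "spring", "spring",
                     "summer", "summer", "summer", "autumn", "autumn",
                     "autumn", "winter")
DF$season <- season_of_month[DF$month]

# Mean temperature per season
seasonal_means <- tapply(DF$temp, DF$season, mean, na.rm = TRUE)
seasonal_means
```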
It's working now!
The problem was not with winter, but with the "with" you had in your object
"DF$season". I got an error: invalid 'envir' argument.
I removed it and now it seems to be OK.
Thank you very much for your help, Ricardo.
--
View this message in context:
http://r.789695.n4.nabble.com/
Hello,
I have created a spatial map of temperature over an area thanks to
interpolation. So I have temperature data everywhere on my map.
My question is: how can I find temperature data on my map for a specific
location according to coordinates?
For this, I have a data frame containing 4 columns:
Yes this is it!
It also works this way with your code, without prompting the user directly
in the console.
Thank you very much Rui. You're as helpful as ever!
--
View this message in context:
http://r.789695.n4.nabble.com/How-to-find-data-in-a-map-according-to-coordinates-tp4639724p4639876.ht
Hi everybody,
I have a question about applying a specific function (with the calculations
I want to do), on a list of elements.
Each element is like a data.frame (with rows and columns), and they all have
the same structure.
At first, I had a big data.frame that I split into the elements of my
lis
Hello,
And this:
get(MyList[[1]]) with [[ ]] instead of [ ] ?
If you do for example:
MyList <- list()
MyList [length(MyList )+1]<- "MyVar"
MyVar <- c(1:10)
get(MyList[[1]])
It seems to do what you want
--
View this message in context:
http://r.789695.n4.nabble.com/Get-variable-data-Reading
Yes, this is it (as Michael would say)! Thank you guys!
Last question about another function on this list: imagine this list is my
data after your function for the regression model:
mydf <- data.frame(x=c(1:5), y=c(21:25),z=rnorm(1:5))
mylist <- rep(list(mydf),5)
Don't care about this fake data
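A minimal sketch of applying one function to every element of such a list with lapply(). The function applied here, colMeans(), is only a placeholder, since the actual calculation is not shown above; fixed numbers replace rnorm() so the result is reproducible.

```r
mydf <- data.frame(x = 1:5, y = 21:25, z = c(0.3, -1.2, 0.8, 0.1, -0.5))
mylist <- rep(list(mydf), 5)

# Apply the same function to each data frame in the list
col_means <- lapply(mylist, function(d) colMeans(d))

# Bind the per-element results into one matrix, one row per list element
result <- do.call(rbind, col_means)
result
```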
Hello,
You can try this:
x=seq(-3,0,length=30)
y=1/sqrt(2*pi)*exp(-x^2/2)
plot(x,y,type="l",lwd=2,col="red")
with:
x: your vector between -3 and 0 (you can choose the length of your vector)
y: the probability density function for the standard normal distribution
formula
--
View this message
Hi everybody,
I'm a new French R user. Sorry if my English is not perfect. Hope you'll
understand my problem ;)
I have to work on temperature data (35000 lines in one file) containing some
missing data (N/A). Sometimes I have only 2 or 3 N/A following each other,
but I have also sometimes 100 or
Michael,
First of all, thank you very much for your answer.
I've read your 2 answers, but I'm not really sure that they correspond to
my problem of NAs.
I'll try to give you a bit more detail.
This problem concerns the second part of my program. In the first part, I've
already created a timeseries ob
Wow, thank you for all your answers.
You were completely right, Michael. Well, it's my fault. I didn't understand
your 2nd reply, when you were talking about arguments for larger gaps. I
thought it was for deleting big gaps too. I apologize.
It was too easy in fact. I also didn't notice the argume
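A minimal base R sketch of interpolating only short NA gaps while leaving longer ones missing. It is not the function discussed in the thread; the name interpolate_small_gaps and its maxgap parameter are assumptions, loosely modelled on the maxgap argument of na.approx() in the zoo package.

```r
# Linearly interpolate runs of NAs of length <= maxgap; leave longer runs as NA.
interpolate_small_gaps <- function(x, maxgap = 3) {
  idx <- seq_along(x)
  ok <- !is.na(x)
  # Interpolate everything first (positions outside the observed range stay NA)
  filled <- approx(idx[ok], x[ok], xout = idx)$y
  # Then put NA back wherever the original gap was too long
  runs <- rle(is.na(x))
  too_long <- rep(runs$values & runs$lengths > maxgap, runs$lengths)
  filled[too_long] <- NA
  filled
}

res <- interpolate_small_gaps(c(1, NA, 3, NA, NA, NA, NA, 8), maxgap = 2)
res
```

In this example the single missing value is interpolated, while the gap of four stays missing.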
Dear users,
I'm a fairly new French R user, and I have a problem with computing a
correlation matrix.
I have temperature data for each weather station of my study area and for
each year (for example, a data file for the weather station N°1 for the year
2009, a data file for the N°2 for the year 2010,
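A minimal sketch of a correlation matrix between stations when the series contain NAs. It assumes the station series have already been read and aligned as columns of one data frame; the station names and values are invented.

```r
# Hypothetical aligned temperature series, one column per station
stations <- data.frame(ST1 = c(1, 2, 3, 4, NA),
                       ST2 = c(2, 4, 6, NA, 10),
                       ST3 = c(5, 3, NA, 1, 0))

# For each pair of stations, use only the rows where both have data
cor_mat <- cor(stations, use = "pairwise.complete.obs")
cor_mat
```

With use = "pairwise.complete.obs", a missing value in one station only affects the pairs involving that station, rather than discarding the whole row for every pair.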
Hello Rui,
Thanks a lot for your answer.
You hoped that your script would help me?
My answer: it is WON-DER-FUL!
It works very well! At first I had some difficulty adapting it to my data,
but I succeeded afterwards when I ran a test between 2 stations.
It's not perfect yet (I still have to mo
Yesterday I improved your script a bit (mostly regarding station numbers,
for the automation). Here's the final version. Thanks again!
filenames <- list.files(pattern="\\_2008_reconstruit.csv$")
Sensors <- paste("capteur_", 1:4, sep="")
Stations <-substr(filenames,1,5)
nsensors <- length
Hi everyone.
I have a question about some work in R I have to do for my job.
I have temperature data coming from 70 weather stations. One data file
corresponds to one station for one year (so 70 files for one year). Each
file looks like this (important: each file contains NAs):
time
Hi Sarah,
Thank you for your answer.
Yes, I know that my suggestion is not necessarily the best way to do it. But
my problem concerns only big gaps of course (more than half a day of missing
data, up to several months of missing data).
I've already filled small gaps with the interpolation that you
Hi Rui,
Yes you're right. It's me again ^^
This post is the last part (I hope) of my job. You helped me a lot last time
for the correlation matrices.
I have to leave my work now, so I'll check and test your proposition
tomorrow. But there is no doubt that it'll help me a lot again.
I'll tell you
Hi again Rui,
I tested your script as you wrote it with my examples, it works perfectly!
It seems to be exactly what I'm trying to do.
I just have a question about your function na.fill.
When I'm trying to apply your script to my data, it doesn't work. I think
it's because in your example, you alr
Seems to work great!
I have one last question (or 2) for you about it, and I will leave you alone
afterwards, I promise :)
I tested your function process.all for the automatization. It seems to be
OK.
It's just when I'd like to save the filled data files.
If I name process.all, for example: test
Hello Rui,
For the write.table, it's OK!
And for the second one (for the 2nd best correlation) seems to work great!
You're too good ^^
I have to check a bit more to be sure, but it seems to do it!
If you come to the Alps, it will be more liqueurs such as Chartreuse or
Génépi (from mountain plan
Hi Rui it's me again.
I have another question about the function "process.all" you explained
to me. But as you have already helped me a lot, and as I promised not to
disturb you again, I want to ask first whether you would agree to help me one
more time before describing my problem more precisely (about add
Dear R users,
For the moment, I have a script and a function which calculate correlation
matrices between all my data files. Then, it chooses the best correlation
for each data file and uses it to fill missing data in the analysed file
(so the data from the best correlation file is put automa
Hi dear R-users,
I have a question about a function I'm trying to improve.
How can I stop the function calculation at the last numeric value of my
data?
The problem is that the end of my data contains missing values (NAs). And
the aim of my function is to compare the first numeric value with the n
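A minimal sketch of finding the last non-missing position, so that a loop can stop there. The vector is invented for illustration.

```r
# Hypothetical data ending with a block of NAs
x <- c(4, 7, NA, 5, 9, NA, NA, NA)

# Index of the last value that is not NA
last_valid <- max(which(!is.na(x)))

# Work only on the part of the data up to that index
trimmed <- x[seq_len(last_valid)]
trimmed
```

A loop written as for (i in seq_len(last_valid)) then never reaches the trailing NAs, while NAs in the middle of the data are still visited.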
Thank you for your reply sarah.
Well, actually I'm not trying to access x[i+1]. The line where you saw it
starts with #. It was just an attempt I wanted to keep (sorry, I should have
removed it before posting).
But I ask it to access the next value if the conditions in the loop are not
met (restart the c
Thanks for your answer too Berend.
Yes, you're right about x[i+1]. You answered just before me.
Well, your idea of declaring everything as numeric is great. It avoids my
problem.
But actually I also have small missing data gaps in the rest of my data (in
the middle of numeric values).
And one of the aims of
I tried your suggestion, Sarah (I was replying to Berend when you posted
your answer).
Well, it seems to work!
I just had to add a line afterwards to get my NAs back.
I converted values equal to 0 to NA (numeric() in the function did the
opposite for the calculation):
mydata[mydata==0] <- NA
At firs
Hi everybody,
I have a small question about R.
I'm doing some correlation matrices between my files. Each of these files
contains 4 columns of data.
These data files contain missing data too. It could happen sometimes that
in one file, one of the 4 columns contains only missing data NA. As I'm
doing
Hi Jim,
Thanks for your answer.
I tried your proposition. The idea seems to be good but I still have my
error.
Actually, the error is in the next function, which uses the function
get.max.cor I told you before.
I also tried these 2 functions with data containing no missing data, and it
works well.
Hello Rui,
Thanks for your answer too.
I tried your proposition too, but by giving the value 0 for this file, it
still wants to make a calculation with it. As it is looking for the best
correlation, and then the 2nd best correlation, giving only 0 seems to be a
problem for the 2nd best correlation
I tried your function. It works great, thanks. I then used diag() in order to
have the value "1" for the whole diagonal of my matrix. But it still doesn't
work, it's crazy.
After deleting columns and rows (and so some files) containing only NAs in the
correlation matrix, it doesn't work when I apply
Hi everyone.
I'm working on a list of files (about 50 files). I've listed them thanks to
the function: list.files.
Each of my files contains 35000 lines of data. These files may also contain
some missing values NA (sometimes up to 10 000 consecutive NAs).
The aim is to do some correlation
Hi Joshua,
Thank you for your answer. I have to leave work now, but I'll try your
suggestion tomorrow and I'll tell you if it works for me.
Good evening
--
View this message in context:
http://r.789695.n4.nabble.com/select-part-of-files-from-a-list-files-tp4630769p4630777.html
Hi again Joshua.
I tried your function. I think it's what I need. It works well in the small
example of my first post. But I have difficulty adapting it to my data.
I'll try to give you another fake example with my real script and kind of
data (you can just copy and paste it to try):
ST1 <-
dat
Hi everybody.
I'm trying to compute a correlation matrix over a list of files. Each file
contains 2 columns: "capt1" and "capt2". For the example, I merged everything
into one data.frame. My data also contains many missing values. The aim is to
compute a correlation matrix for the same data of course (one correlation m
65 matches