[R] create a new dataframe with intervals and computing a weighted average for each of its rows

Luis Miguel Cerchiaro Barros Sun, 24 Nov 2013 05:56:58 -0800


I need your help with this problem. I have a data frame like this:

    BHID=c(43,43,43,43,44,44,44,44,44)
    FROM=c(50.9,46.7,44.2,43.1,52.3,51.9,49.3,46.2,42.38)
    TO=c(46.7,44.2,43.1,40.9,51.9,49.3,46.2,42.38,36.3)
    AR=c(45,46,0.0,38.45,50.05,22.9,0,25,9)
    DF<-data.frame(BHID,FROM,TO,AR)
    # add the length
    DF$LENGTH=DF$FROM-DF$TO

where:

+ BHID: is the borehole identification
+ FROM: is the start for every interval
+ TO: is the end for every interval
+ AR: is the value of our variable
+ LENGTH: is the distance between FROM and TO

What I want is to create a data frame that is "normalized", meaning that every
interval has the same length and the column **AR** is calculated as a weighted
arithmetic mean of the old **AR** values, with **LENGTH** as the weight.
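
Written out as a formula, the weighted mean for one new interval would look roughly like this. The notation is only illustrative and not part of the original data: $AR_i$ is the old AR value of the i-th old interval touching the new interval, and $w_i$ is assumed to be the length of that old interval that falls inside the new one.

```latex
AR_{\text{new}} = \frac{\sum_{i} w_i \, AR_i}{\sum_{i} w_i}
```
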
For more clarity, I am going to show you how the desired data frame should look:

    BHID    FROM    TO      AR        LENGTH
    43      50.9    47.9    45.0      3.0
    43      47.9    44.9    45.6      3.0
    43      44.9    41.9    26.113    3.0
    43      41.9    40.9    38.45     1.0
    44      ....

where:

1. AR is the weighted arithmetic mean

I have to make a clarification about the result.
Here is an example from my Excel table with the calculations:

    ROW_ID  BHID  NEW_FROM  NEW_TO  NEW_AR  OLD_FROM  OLD_TO  WEIGHTS  OLD_AR
    1       43    50.9      47.9    45      50.9      46.7    3.0      45
    2       43    47.9      44.9    45.6    50.9      46.7    1.2      45
    2       43    47.9      44.9            46.7      44.2    1.8      46
    3       43    44.9      41.9    26.113  46.7      44.2    0.7      46
    3       43    44.9      41.9            44.2      43.1    1.1      0
    3       43    44.9      41.9            43.1      40.9    1.2      38.45
    4       43    41.9      40.9    38.45   43.1      40.9    1.0      38.45

As you can see, guys, NEW_AR is the weighted mean of OLD_AR, and its weights are
in the column WEIGHTS.
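
As a quick check of the arithmetic in the table above, the two multi-row cases can be reproduced with base R's `weighted.mean()`. The variable names `ar_row2` and `ar_row3` are made up for this illustration only; the values and weights are copied from the Excel rows with ROW_ID 2 and 3.

```r
# ROW_ID 2: new interval 47.9-44.9 overlaps two old intervals
ar_row2 <- weighted.mean(c(45, 46), w = c(1.2, 1.8))

# ROW_ID 3: new interval 44.9-41.9 overlaps three old intervals
ar_row3 <- weighted.mean(c(46, 0, 38.45), w = c(0.7, 1.1, 1.2))

print(ar_row2)  # about 45.6
print(ar_row3)  # about 26.113
```
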
If you look at the column LENGTH in the original data frame, you can see that the
values are different. With the "normalization" we try to make LENGTH
uniform; in this case we chose the value 3.0. Of course, the last interval of each
borehole can have a different LENGTH, in this case 1.0.
What I have done to achieve the result

OK guys, first of all I have to say that I am not a professional and I am still
learning how to use R.
My approach is not elegant: I take the start and end of each borehole and use
the function skeleton, which I wrote, to create a uniform skeleton for the
whole data frame.

    skeleton<-function(DF,LEN){
      # define function to create a new skeleton
      divide.int<-function(FROM,TO,div){
        n=as.integer((FROM-TO)/div)+1
        from=seq(FROM,(FROM-(n-1)*div),-div)
        to=seq(FROM-(n-(n-1))*div,FROM-(n-1)*div,-div)
        to[n]=TO
        range<-data.frame(BHID=borehole_names[i,1],FROM=from,TO=to) # create a data.frame class object
        range<-range[!(range$FROM==range$TO),] # erase the last value
      }
      # subset the data set for every borehole
      borehole_names<-unique(DF["BHID"]) # collars id with cores
      borehole_number<-nrow(borehole_names) # collar number
      # define an empty data.frame
      borehole_Out<-data.frame(BHID=integer(),FROM=numeric(),TO=numeric())
      # initialize the counter
      i=1
      # from this point starts the loop---------------
      while(i<=borehole_number){
        DFi <- subset(DF, BHID %in% borehole_names[i,1]) # individual data frame for each borehole
        # take the beginning and end of every BOREHOLE
        startBH<-head(DFi$FROM,1)
        endBH<-tail(DFi$TO,1)
        # create the normalized intervals
        borehole_i<-divide.int(FROM=startBH,TO=endBH,div=LEN)
        borehole_Out<-rbind(borehole_Out,borehole_i)
        i=i+1
      }
      borehole_Out
    }
    # TEST------------------------------------------
    TEST<-skeleton(DF=DF,LEN=3.0)
    TEST$LENGTH=TEST$FROM-TEST$TO

Next, I am trying to use the packages plyr or data.table to calculate the
weighted means in AR, but as I said, I have just started to use R and don't yet
understand how these packages work.
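
For illustration, the weighted means can also be sketched in base R alone, without plyr or data.table. This is only one possible approach, under these assumptions: intervals run downward (FROM > TO), each borehole is contiguous with no gaps, and the weight of an old interval is the length of its overlap with the new interval (as in the WEIGHTS column of the Excel example). The helper name `normalize_bh` and the result name `normalized` are hypothetical.

```r
# Example data from the post (AR is the value column)
DF <- data.frame(
  BHID = c(43, 43, 43, 43, 44, 44, 44, 44, 44),
  FROM = c(50.9, 46.7, 44.2, 43.1, 52.3, 51.9, 49.3, 46.2, 42.38),
  TO   = c(46.7, 44.2, 43.1, 40.9, 51.9, 49.3, 46.2, 42.38, 36.3),
  AR   = c(45, 46, 0.0, 38.45, 50.05, 22.9, 0, 25, 9)
)

# Hypothetical helper: regularize one borehole to intervals of length `len`.
# Assumes FROM > TO and contiguous intervals without gaps.
normalize_bh <- function(d, len) {
  top    <- max(d$FROM)
  bottom <- min(d$TO)
  # starts of the new intervals, stepping downward from the top
  new_from <- seq(top, bottom, by = -len)
  # drop a start that coincides with the bottom of the borehole
  new_from <- new_from[new_from - bottom > 1e-9]
  # each interval ends where the next one starts; the last ends at the bottom
  new_to <- c(new_from[-1], bottom)
  new_ar <- vapply(seq_along(new_from), function(k) {
    # overlap length of every old interval with new interval k (0 if none)
    w <- pmax(0, pmin(d$FROM, new_from[k]) - pmax(d$TO, new_to[k]))
    weighted.mean(d$AR, w)
  }, numeric(1))
  data.frame(BHID = d$BHID[1], FROM = new_from, TO = new_to,
             AR = new_ar, LENGTH = new_from - new_to)
}

# apply the helper to each borehole and stack the results
normalized <- do.call(rbind, lapply(split(DF, DF$BHID), normalize_bh, len = 3))
rownames(normalized) <- NULL
print(normalized)
```

For borehole 43 this should reproduce the four rows of the desired table (AR of 45, 45.6, about 26.113, and 38.45). If a borehole had gaps, a new interval with no overlap at all would get `NaN`, so that case would need separate handling.
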
Again, thanks in advance, and sorry for my bumpy English.



                                          

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.