Re: [R] Applying function to multiple data

Ivan Calandra Thu, 03 Mar 2011 07:24:15 -0800

Hi,

It might not be the best approach, but here is what I would do.


##########

1) If you have your data in 3 different data.frames:

#create a named list where each element is one of your data.frame
list_df <- vector(mode="list", length=3)
names(list_df) <- c("Bank", "Corporate", "Sovereign")

list_df[[1]] <- data.frame(k = c(1:8),
  ratings = c("A", "B", "C", "D", "E", "F", "G", "H"),
  default_frequency = c(0.00229, 0.01296, 0.01794, 0.04303, 0.04641, 0.06630, 0.06862, 0.06936))
list_df[[2]] <- data.frame(k = c(1:8),
  ratings = c("A", "B", "C", "D", "E", "F", "G", "H"),
  default_frequency = c(0.00101, 0.01433, 0.02711, 0.03701, 0.04313, 0.05600, 0.06041, 0.07112))
list_df[[3]] <- data.frame(k = c(1:8),
  ratings = c("A", "B", "C", "D", "E", "F", "G", "H"),
  default_frequency = c(0.00210, 0.01014, 0.02001, 0.04312, 0.05114, 0.06801, 0.06997, 0.07404))

#apply your function DP to each element of the list, i.e. to each data.frame:
out1 <- lapply(list_df, FUN=function(x) DP(k=x$k, ODF=x$default_frequency, ratings=x$ratings))


##########

2) If you have your data in a single data.frame, as it looks from your example, I would first fill all the cells, so that it looks like this:

df2 <- structure(list(
  Class = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
      2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
      3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L),
    .Label = c("Bank", "Corporate", "Sovereign"), class = "factor"),
  k = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
    1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
    1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L),
  rating = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
      1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
      1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L),
    .Label = c("A", "B", "C", "D", "E", "F", "G", "H"), class = "factor"),
  default_frequency = c(0.00229, 0.01296, 0.01794, 0.04303, 0.04641, 0.0663, 0.06862, 0.06936,
    0.00101, 0.01433, 0.02711, 0.03701, 0.04313, 0.056, 0.06041, 0.07112,
    0.0021, 0.01014, 0.02001, 0.04312, 0.05114, 0.06801, 0.06997, 0.07404)),
  .Names = c("Class", "k", "ratings", "default_frequency"),
  class = "data.frame", row.names = c(NA, -24L))
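
If the Class column is only filled on the first row of each group, as in the table from the question, the blanks need to be filled down first. Here is a minimal base R sketch of one way to do it. The data.frame `raw` is a small made-up stand-in, and the approach assumes that blanks are empty strings or NA and that the very first row carries a label.

```r
# Hypothetical stand-in for data where Class is only given once per group
raw <- data.frame(
  Class = c("Bank", "", "", "Corporate", "", ""),
  k = c(1:3, 1:3),
  stringsAsFactors = FALSE
)

# TRUE where a row carries its own label (neither NA nor an empty string)
is_label <- !is.na(raw$Class) & raw$Class != ""

# cumsum() counts the labels seen so far, so every row gets the index
# of the most recent label at or above it (assumes row 1 has a label)
raw$Class <- raw$Class[is_label][cumsum(is_label)]

raw
```

After this step every row has its Class, and the data can be split by Class.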


#then split by Class:
list_df2 <- split(df2, df2$Class)
#and apply as before:

out2 <- lapply(list_df2, FUN=function(x) DP(k=x$k,ODF=x$default_frequency, ratings=x$ratings))


#or in one step using plyr:
library(plyr)

out3 <- dlply(.data=df2, .variables="Class", .fun=function(x) DP(k=x$k,ODF=x$default_frequency, ratings=x$ratings))



##########

3) all solutions give the same results:

all.equal(out1, out2, check.attributes=FALSE)
[1] TRUE
all.equal(out1, out3, check.attributes=FALSE)
[1] TRUE
all.equal(out2, out3, check.attributes=FALSE)
[1] TRUE
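
The question also asks to save each result as its own csv file (bank.csv, corporate.csv etc.). Here is a rough sketch of how that could be done by looping over the names of the result list. The list `out` is a hypothetical stand-in for out1, out2 or out3, with placeholder numbers rather than real results, and writing to tempdir() is only an assumption to keep the example harmless; use your own folder instead.

```r
# Hypothetical stand-in for the list of results (out1, out2 or out3);
# the numbers are placeholders, not real output of DP
out <- list(
  Bank      = data.frame(ratings = c("A", "B"), default_probability = c(0.01, 0.02)),
  Corporate = data.frame(ratings = c("A", "B"), default_probability = c(0.03, 0.04))
)

# Assumption: write into a temporary directory; replace with your own path
out_dir <- tempdir()

# One csv file per list element, named after the class in lower case
for (cls in names(out)) {
  file_name <- file.path(out_dir, paste0(tolower(cls), ".csv"))
  write.csv(out[[cls]], file = file_name, row.names = FALSE)
}

list.files(out_dir, pattern = "\\.csv$")
```

Since the loop runs over names(out), it works for any number of classes.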


HTH,
Ivan




On 3/3/2011 11:06, Akshata Rao wrote:

Dear R helpers,

I know the R language at a basic level. This is my first post to this R
forum. I have recently learned how to use functions and have been successful
in writing a few of my own. However, I am not able to figure out how to apply
a function to multiple sets of data.

# MY QUERY

Suppose I have the following data.frame

df = data.frame(k = c(1:8),
  ratings = c("A", "B", "C", "D", "E", "F", "G", "H"),
  default_frequency = c(0.00229, 0.01296, 0.01794, 0.04303, 0.04641, 0.06630, 0.06862, 0.06936))

# -------------------------------

DP = function(k, ODF, ratings)
{
  n          <- length(ODF)
  tot_klnODF <- sum(k * log(ODF))
  tot_k      <- sum(k)
  tot_lnODF  <- sum(log(ODF))
  tot_k2     <- sum(k^2)
  slope      <- exp((n * tot_klnODF - tot_k * tot_lnODF) / (n * tot_k2 - tot_k^2))
  intercept  <- exp((tot_lnODF - log(slope) * tot_k) / n)
  IPD        <- intercept * slope^k

  return(data.frame(ratings = ratings, default_probability = round(IPD, digits = 4)))
}

result = DP(k = df$k, ODF = df$default_frequency, ratings = df$ratings)

#
________________________________________________________________________________________

The above code gives me the following result. However, I am dealing with only
one set of data here, as defined in 'df'.

result

   ratings default_probability
1       A              0.0061
2       B              0.0094
3       C              0.0145
4       D              0.0222
5       E              0.0342
6       F              0.0527
7       G              0.0810
8       H              0.1247
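
As a side note, the sums in DP are the closed-form least-squares estimates for a straight-line fit of log(ODF) against k, so the same probabilities should come out of lm() on the log scale. The sketch below is only a cross-check under that reading of the function; the names b, a, ipd_closed and ipd_lm are made up for illustration.

```r
k   <- 1:8
ODF <- c(0.00229, 0.01296, 0.01794, 0.04303, 0.04641, 0.06630, 0.06862, 0.06936)

# Closed-form least-squares estimates on the log scale, as computed in DP:
# b is the log of 'slope' and a is the log of 'intercept'
n <- length(ODF)
b <- (n * sum(k * log(ODF)) - sum(k) * sum(log(ODF))) / (n * sum(k^2) - sum(k)^2)
a <- (sum(log(ODF)) - b * sum(k)) / n
ipd_closed <- exp(a) * exp(b)^k

# The same fit with lm(), back-transformed from the log scale
fit    <- lm(log(ODF) ~ k)
ipd_lm <- unname(exp(fitted(fit)))

# Compare the two sets of fitted probabilities
all.equal(ipd_closed, ipd_lm)
```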


# MY PROBLEM

Suppose I have data as given below

Class        k   rating   default_frequency
Bank         1   A        0.00229
             2   B        0.01296
             3   C        0.01794
             4   D        0.04303
             5   E        0.04641
             6   F        0.06630
             7   G        0.06862
             8   H        0.06936
Corporate    1   A        0.00101
             2   B        0.01433
             3   C        0.02711
             4   D        0.03701
             5   E        0.04313
             6   F        0.05600
             7   G        0.06041
             8   H        0.07112
Sovereign    1   A        0.00210
             2   B        0.01014
             3   C        0.02001
             4   D        0.04312
             5   E        0.05114
             6   F        0.06801
             7   G        0.06997
             8   H        0.07404

So I need to use the function "DP" defined above to generate three sets of
results, viz. for Bank, Corporate and Sovereign, and save each of these results
as a different csv file, say bank.csv, corporate.csv etc. Again, please note
that there could be, say, 'm' classes. I was trying to use the apply
function but things are not working for me. I would really appreciate your
guidance. I hope I have put my query clearly.

Regards and thanking you all in advance.

Akshata Rao


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

