Hi,
It might not be the best approach, but here is what I would do.
##########
1) If you have your data in 3 different data.frames:
#create a named list where each element is one of your data.frames
list_df <- vector(mode="list", length=3)
names(list_df) <- c("Bank", "Corporate", "Sovereign")
list_df[[1]] <- data.frame(k = c(1:8), ratings = c("A", "B", "C", "D",
"E", "F", "G","H"), default_frequency =
c(0.00229,0.01296,0.01794,0.04303,0.04641,0.06630,0.06862,0.06936))
list_df[[2]] <- data.frame(k = c(1:8), ratings = c("A", "B", "C", "D",
"E", "F", "G","H"), default_frequency =
c(0.00101,0.01433,0.02711,0.03701,0.04313,0.05600,0.06041,0.07112))
list_df[[3]] <- data.frame(k = c(1:8), ratings = c("A", "B", "C", "D",
"E", "F", "G","H"), default_frequency =
c(0.00210,0.01014,0.02001,0.04312,0.05114,0.06801,0.06997,0.07404))
#apply your function DP to each element of the list, i.e. to each data.frame:
out1 <- lapply(list_df, FUN=function(x) DP(k=x$k,
ODF=x$default_frequency, ratings=x$ratings))
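#out1 is then a named list with one element per class, so a single result can be pulled out by name.
#A toy sketch of the same pattern (toy and colSums are only stand-ins for list_df and DP, made up for illustration):

```r
# toy stand-in for list_df: a named list of small data.frames
toy <- list(Bank = data.frame(x = 1:2), Corporate = data.frame(x = 3:4))

# lapply keeps the names of the input list on the output list
res <- lapply(toy, function(d) colSums(d))

names(res)           # the class names, in the same order as in toy
res$Bank             # result for one class, accessed with $
res[["Corporate"]]   # same idea, accessed with [[ ]]
```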
##########
2) If you have your data in a single data.frame, as it looks from your
example, I would first fill all the cells, so that it looks like this:
df2 <- structure(list(Class = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L),
.Label = c("Bank", "Corporate", "Sovereign"), class = "factor"), k =
c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L), rating = structure(c(1L, 2L, 3L, 4L, 5L,
6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L), .Label = c("A", "B", "C", "D", "E", "F", "G", "H"), class =
"factor"), default_frequency = c(0.00229, 0.01296, 0.01794, 0.04303,
0.04641, 0.0663, 0.06862, 0.06936, 0.00101, 0.01433, 0.02711, 0.03701,
0.04313, 0.056, 0.06041, 0.07112, 0.0021, 0.01014, 0.02001, 0.04312,
0.05114, 0.06801, 0.06997, 0.07404)), .Names = c("Class", "k",
"ratings", "default_frequency"), class = "data.frame", row.names = c(NA,
-24L))
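#If the blank Class cells come in as empty strings or NA when you read your data, one way to fill them is to carry the last non-empty value down.
#This is only a base-R sketch: df_raw is a made-up miniature of your table, and it assumes that blanks are "" or NA and that the first row is filled.

```r
# made-up miniature of the table, with blank Class cells as "" or NA
df_raw <- data.frame(
  Class = c("Bank", "", "", "Corporate", NA, ""),
  k = c(1:3, 1:3),
  stringsAsFactors = FALSE
)

# TRUE where Class actually holds a value
is_filled <- !is.na(df_raw$Class) & df_raw$Class != ""

# cumsum(is_filled) gives, for each row, the position of the latest filled
# cell among the filled ones; indexing with it repeats that value downwards
# (assumes the first row is filled, otherwise the lengths will not match)
df_raw$Class <- df_raw$Class[is_filled][cumsum(is_filled)]
```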
#then split by Class:
list_df2 <- split(df2, df2$Class)
#and apply as before:
out2 <- lapply(list_df2, FUN=function(x) DP(k=x$k,
ODF=x$default_frequency, ratings=x$ratings))
#or in one step using plyr:
library(plyr)
out3 <- dlply(.data=df2, .variables="Class", .fun=function(x) DP(k=x$k,
ODF=x$default_frequency, ratings=x$ratings))
##########
3) all solutions give the same results:
all.equal(out1, out2, check.attributes=FALSE)
[1] TRUE
all.equal(out1, out3, check.attributes=FALSE)
[1] TRUE
all.equal(out2, out3, check.attributes=FALSE)
[1] TRUE
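##########
4) To save each result as its own csv file (bank.csv, corporate.csv, ...), you could loop over the list and build the file names from its names. This is only a sketch: DP is copied from your post so that the block runs on its own, df_demo is a cut-down stand-in for your data, and out_dir is a placeholder (a temporary folder here) that you would replace with your own path.

```r
# DP as defined in the original post (copied so the sketch is self-contained)
DP <- function(k, ODF, ratings) {
  n <- length(ODF)
  tot_klnODF <- sum(k * log(ODF))
  tot_k <- sum(k)
  tot_lnODF <- sum(log(ODF))
  tot_k2 <- sum(k^2)
  slope <- exp((n * tot_klnODF - tot_k * tot_lnODF) / (n * tot_k2 - tot_k^2))
  intercept <- exp((tot_lnODF - log(slope) * tot_k) / n)
  IPD <- intercept * slope^k
  data.frame(ratings = ratings, default_probability = round(IPD, digits = 4))
}

# cut-down stand-in data: two classes, first three ratings only
df_demo <- data.frame(
  Class = rep(c("Bank", "Corporate"), each = 3),
  k = rep(1:3, times = 2),
  ratings = rep(c("A", "B", "C"), times = 2),
  default_frequency = c(0.00229, 0.01296, 0.01794, 0.00101, 0.01433, 0.02711)
)

# same split + lapply pattern as above
out_demo <- lapply(split(df_demo, df_demo$Class), function(x) {
  DP(k = x$k, ODF = x$default_frequency, ratings = x$ratings)
})

# one csv per class, named after the list element in lower case
out_dir <- tempdir()   # assumption: replace with your own folder
csv_files <- file.path(out_dir, paste0(tolower(names(out_demo)), ".csv"))
for (i in seq_along(out_demo)) {
  write.csv(out_demo[[i]], file = csv_files[i], row.names = FALSE)
}
```

Because the file names come from names(out_demo), this works for any number of classes without changing the loop.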
HTH,
Ivan
On 3/3/2011 11:06, Akshata Rao wrote:
Dear R helpers,
I know the R language at a basic level. This is my first post to this R
forum. I have recently learned how to use functions and have been successful
in writing a few on my own. However, I am not able to figure out how to apply
a function to multiple sets of data.
# MY QUERY
Suppose I have the following data.frame
df = data.frame(k = c(1:8), ratings = c("A", "B", "C", "D", "E", "F", "G",
"H"),
default_frequency =
c(0.00229,0.01296,0.01794,0.04303,0.04641,0.06630,0.06862,0.06936))
# -------------------------------
DP = function(k, ODF, ratings)
{
  n <- length(ODF)
  tot_klnODF <- sum(k * log(ODF))
  tot_k <- sum(k)
  tot_lnODF <- sum(log(ODF))
  tot_k2 <- sum(k^2)
  slope <- exp((n * tot_klnODF - tot_k * tot_lnODF) / (n * tot_k2 - tot_k^2))
  intercept <- exp((tot_lnODF - log(slope) * tot_k) / n)
  IPD <- intercept * slope^k
  return(data.frame(ratings = ratings, default_probability = round(IPD, digits = 4)))
}
result = DP(k = df$k, ODF = df$default_frequency, ratings = df$ratings)
# -------------------------------
The above code gives me the following result. However, I am dealing with only
one set of data here, as defined in 'df'.
result
  ratings default_probability
1       A              0.0061
2       B              0.0094
3       C              0.0145
4       D              0.0222
5       E              0.0342
6       F              0.0527
7       G              0.0810
8       H              0.1247
# MY PROBLEM
Suppose I have data as given below
Class      k  rating  default_frequency
Bank       1  A       0.00229
           2  B       0.01296
           3  C       0.01794
           4  D       0.04303
           5  E       0.04641
           6  F       0.06630
           7  G       0.06862
           8  H       0.06936
Corporate  1  A       0.00101
           2  B       0.01433
           3  C       0.02711
           4  D       0.03701
           5  E       0.04313
           6  F       0.05600
           7  G       0.06041
           8  H       0.07112
Sovereign  1  A       0.00210
           2  B       0.01014
           3  C       0.02001
           4  D       0.04312
           5  E       0.05114
           6  F       0.06801
           7  G       0.06997
           8  H       0.07404
So I need to use the function "DP" defined above to generate three sets of
results, viz. for Bank, Corporate and Sovereign, and save each of these results
as a different csv file, say bank.csv, corporate.csv etc. Please also note
that there could be, say, 'm' classes. I was trying to use the apply
function, but things are not working for me. I would really appreciate your
guidance. I hope I have been able to put my query clearly.
Regards and thanking you all in advance.
Akshata Rao
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de
**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php