Re: [R] lmom package - Resending the email
Dear Dalgaard sir,

Thanks a lot for the detailed clarification. It is indeed very enlightening and will be very useful for me in future. And your suggestion is well taken. Thanks again.

Regards
Katherine

On Thu, 4/12/14, peter dalgaard wrote:
Subject: Re: [R] lmom package - Resending the email
To: "Simon Zehnder"
Date: Thursday, 4 December, 2014, 2:04 PM

lmom is based on L-moments, which are different from ordinary moments, except for the 1st one. It would be truly miraculous if it gave the same result as the ordinary method of moments or maximum likelihood.

Estimating any distributional parameter requires that the model actually fits the data, and in your case a qqnorm(amounts) shows that they are certainly not normal. In such cases, the L-moment estimator of the std.dev. is not necessarily an estimate of the std.dev. of the actual distribution.

A lognormal distribution seems to fit the data better. However, the L-moments suggest a value for zeta (the lower bound) of 3226, which is well inside the range of the actual data. In fact there are 16 observations that are less than 3226. Maximum likelihood would never do that, but the same sort of effect is well-known for the ordinary method of moments.

In short, you need to study the theory before you apply its results.

- Peter D.

On 03 Dec 2014, at 10:57, Simon Zehnder wrote:
> Katherine,
>
> for a deeper understanding of differing values it makes sense to provide the list at least with an online description of the corresponding functions used in Minitab and SPSS…
>
> Best
> Simon
> On 03 Dec 2014, at 10:45, Katherine Gobin via R-help wrote:
>
>> Dear R forum
>> I sincerely apologize for my earlier mail with the captioned subject, since all the values got mixed up and the email was not readable. I am trying to write it again.
>> My problem is I have a set of data and I am trying to fit some distributions to it. As a part of this exercise, I need to find out the parameter values of various distributions e.g.
Normal distribution, Log normal distribution etc. I am using lmom package to do the same, however the parameter values obtained using lmom pacakge differ to a large extent from the parameter values obtained using say MINITAB and SPSS as given below - >> _ >> >> amounts = c(38572.5599129508,11426.6705314315,21974.1571641187,118530.32782443,3735.43055996748,66309.5211176106,72039.2934132668,21934.8841708626,78564.9136114375,1703.65825161293,2116.89180930203,11003.495671332,19486.3296339113,1871.35861218795,6887.53851253407,148900.978055447,7078.56497101651,79348.1239806592,20157.6241066905,1259.99802108593,3934.45912233674,3297.69946631591,56221.1154121067,13322.0705174134,45110.2498756567,31910.3686613912,3196.71168501252,32843.0140437202,14615.1499458453,13013.9915051561,116104.176753387,7229.03056392023,9833.37962177814,2882.63239493673,165457.372543821,41114.066453219,47188.1677766245,25708.5883755617,82703.7378298092,8845.04197017415,844.28834047836,35410.8486123933,19446.3808445684,17662.2398792892,11882.8497070776,4277181.17817307,30239.0371267968,45165.7512343364,22102.8513746687,5988.69296597127,51345.0146170238,1275658.35495898,15260.4892854214,8861.76578480635,37647.1638704867,4979.53544046949,7012.48134772332 ,3385.20612391205,1911.03114395959,66886.5036605189,2223.47536156462,814.947809578378,234.028589468841,5397.4347625133,13346.3226579065,28809.3901352898,6387.69226236731,5639.42730553242,2011100.92675507,4150.63707173462,34098.7514446498,3437.10672573502,289710.315303182,8664.66947305203,13813.3867161134,208817.521491857,169317.624400274,9966.78447705792,37811.1721605562,2263.19211279927,80434.5581206454,19057.8093104899,24664.5067589624,25136.5042354789,3582.85741610706,6683.13898432794,65423.9991390846,134848.302304064,3018.55371579808,546249.641168158,172926.689143006,3074.15064180208,1521.70624812788,59012.4248281661,21226.928522236,17572.5682970983,226.646947337851,56232.2982652019,14641.0043361533,6997.94414914865) >> >> library(lmom) >> lmom = 
samlmu(amounts)
>> # __
>> # Normal Distribution parameters
>> parameters_of_NOR <- pelnor(lmom); parameters_of_NOR
>>
>>       mu    sigma
>> 115148.4 175945.8
>>
>>          Location   Scale
>> Minitab  115148.4  485173
>> SPSS     115148.4  485173
>> # __
>> # Log Normal (3 Parameter) Distribution parameters
>>        zeta       mu    sigma
>> 3225.798890 9.114879 2.240841
>>
>> Location Scale
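Peter Dalgaard's point, that the L-moment estimate of sigma and the ordinary standard deviation only agree when the normal model actually fits, can be illustrated with a small base-R sketch. It assumes the usual unbiased sample estimator of the second L-moment and the normal-distribution relation l2 = sigma / sqrt(pi); the helper names are hypothetical and the data are simulated, not the amounts posted above.

```r
# Unbiased sample L-location (l1) and L-scale (l2) from the order statistics.
sample_l1_l2 <- function(x) {
  x  <- sort(x)
  n  <- length(x)
  b0 <- mean(x)
  b1 <- sum((seq_len(n) - 1) / (n - 1) * x) / n
  c(l1 = b0, l2 = 2 * b1 - b0)
}

# For a normal distribution l2 = sigma / sqrt(pi), so the L-moment
# estimate of sigma is sqrt(pi) * l2.
sigma_from_lmom <- function(x) unname(sqrt(pi) * sample_l1_l2(x)["l2"])

set.seed(1)
x_normal <- rnorm(10000)                          # the normal model fits
x_skewed <- rlnorm(1000, meanlog = 0, sdlog = 2)  # heavily skewed, simulated

# Close agreement when the data really are normal ...
c(lmom = sigma_from_lmom(x_normal), sd = sd(x_normal))
# ... and a large gap when they are not.
c(lmom = sigma_from_lmom(x_skewed), sd = sd(x_skewed))
```

On skewed data the two numbers answer different questions, which is consistent with the gap between the pelnor() value and the Minitab/SPSS scale reported in the thread.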
[R] VGAM package : Frechet distribution - 2 parameter estimation
Dear R forum,

I am trying to execute the following code (page no. 259 - VGAM.pdf)

library(VGAM)

set.seed(123)
fdata <- data.frame(y1 = rfrechet(nn <- 1000, shape = 2 + exp(1)))
with(fdata, hist(y1))
fit2 <- vglm(y1 ~ 1, frechet, data = fdata, trace = TRUE)

However, I receive the following error

Error in vglm(y1 ~ 1, frechet, data = fdata, trace = TRUE) :
  object 'frechet' not found

Earlier there used to be a function called "frechet3" which I guess has been withdrawn by VGAM.

Kindly guide

Katherine

[[alternative HTML version deleted]]
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] VGAM package : Frechet distribution - 2 parameter estimation
Dear Mr Michael,

Thanks a lot for your guidance. The pdf file describing the VGAM package mentions 'frechet' in the example, which is why I got the error.

Regards
Katherine

On Thursday, 6 November 2014 2:54 PM, Michael Dewey wrote:

On 06/11/2014 06:04, Katherine Gobin wrote:
> Dear R forum,
>
> I am trying to execute following code (Page no 259 - VGAM.pdf)
>
> library(VGAM)
>
> set.seed(123)
> fdata <- data.frame(y1 = rfrechet(nn <- 1000, shape = 2 + exp(1)))
> with(fdata, hist(y1))
> fit2 <- vglm(y1 ~ 1, frechet, data = fdata, trace = TRUE)

Is it not called frechet2?

> However, I receive following error
>
> Error in vglm(y1 ~ 1, frechet, data = fdata, trace = TRUE) :
>   object 'frechet' not found
>
> Earlier there used to be a function called "frechet3" which I guess has been withdrawn by VGAM.
>
> Kindly guide
>
> Katherine

--
Michael
http://www.dewey.myzen.co.uk
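When a PDF manual and the installed package disagree about a function name, one way to check is to list what the installed version actually exports. The following is only a sketch: it assumes the family function is called either "frechet" or "frechet2" (the two names mentioned in this thread) and does nothing if VGAM is not installed.

```r
# Sketch: find out which Frechet family function the installed VGAM exports.
if (requireNamespace("VGAM", quietly = TRUE)) {
  library(VGAM)
  print(packageVersion("VGAM"))
  print(ls("package:VGAM", pattern = "^frechet"))

  # Prefer "frechet", fall back to "frechet2" (names taken from the thread).
  candidates <- c("frechet", "frechet2")
  available  <- candidates[candidates %in% ls("package:VGAM")]

  if (length(available) > 0) {
    set.seed(123)
    fdata <- data.frame(y1 = rfrechet(1000, shape = 2 + exp(1)))
    fit2  <- vglm(y1 ~ 1, family = get(available[1]), data = fdata)
    print(coef(fit2, matrix = TRUE))
  }
}
```

Comparing packageVersion("VGAM") with the version printed on the manual's title page usually explains this kind of mismatch.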
[R] Double return statement
Dear R forum,

I have a function which generates two outputs, say output_1 and output_2. output_1 is a single-row output, whereas output_2 is a data frame with multiple records.

Is it possible to use two return statements in a function? output_2 uses some records from output_1, hence I need both outputs to be generated by the same function.

Thanking in advance

With warm regards

Katherine
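A function exits at the first return() it reaches, so a second return statement would never run. The usual approach is to return both objects in one named list. A minimal sketch follows; the function name and the columns are hypothetical stand-ins for the real outputs.

```r
# Return two related results from one function by wrapping them in a list.
make_outputs <- function(values) {
  # Single-row summary
  output_1 <- data.frame(n = length(values), total = sum(values))

  # Multi-row result that reuses a value from output_1
  output_2 <- data.frame(value = values, share = values / output_1$total)

  list(output_1 = output_1, output_2 = output_2)
}

res <- make_outputs(c(74, 108, 65))
res$output_1   # the single-row output
res$output_2   # the data frame with multiple records
```

The caller then picks out each piece with `$` or `[[ ]]`.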
[R] Display of data points in the Scatterplot
Respected R forum

I am learning R and am quite new to it. I am generating a scatter-plot as given below. (My actual table is much larger.)

# Sample data frame
y = c(20, 23, 17, 31, 68)
x = c(200, 300, 400, 500, 600)
plot(x, y, type = 'l')

If I plot this scatter-plot in Excel, the data values are displayed when I place the cursor at some desired place on the graph. E.g. if I place the cursor at the point (400, 17), then the value (400, 17) is displayed.

My questions are

(A) Once I plot a graph in R, is it possible to display a particular (x, y) co-ordinate by placing the cursor there?

(B) Suppose I have 100 pairs of (x, y). Is it possible to display in the graph (irrespective of the cursor position) the values of (x, y) corresponding to, say, the 10th, 20th, 30th, 40th etc. observations?

Regards
Katherine
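Both questions can be approached with base graphics. This is a sketch on simulated data, not the poster's table: text() adds fixed labels for every 10th observation (question B), and identify() labels the points you click on in an interactive session (question A).

```r
set.seed(42)
x <- seq(200, by = 10, length.out = 100)
y <- round(runif(100, 10, 70))

plot(x, y, type = "l")

# (B) Label every 10th observation with its coordinates.
idx <- seq(10, length(x), by = 10)
points(x[idx], y[idx], pch = 19)
text(x[idx], y[idx],
     labels = paste0("(", x[idx], ", ", y[idx], ")"),
     pos = 3, cex = 0.7)

# (A) Click near points to label them; only meaningful on an
# interactive graphics device, so it is skipped in batch runs.
if (interactive()) {
  identify(x, y, labels = paste0("(", x, ", ", y, ")"))
}
```

identify() is click-based rather than hover-based, so it is close to, but not the same as, the Excel behaviour described above.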
[R] How to count the nos. in a range?
Dear R forum

I have the following vector of random nos.

x = runif(100, 0.01, 0.99)

[1] 0.47212037 0.77867992 0.33947474 0.93369035
[5] 0.03720073 0.79307831 0.81801835 0.92710688
...

I need to count the random nos. falling in the ranges (0 - 0.10), (0.10 - 0.20), (0.20 - 0.30) ... up to (0.90 - 1).

Thus, I need to have a data frame as

range          frequency
0 - 0.10       ...
0.10 - 0.20    ...
...
0.90 - 1       ...

I understand I should write my own code and ask for help only if need be, but I am simply clueless at the moment.

Kindly guide.

Katherine
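One base-R way to get such a frequency table is to bin the values with cut() and count them with table(). A minimal sketch, assuming right-closed intervals (a, b], which is the default for cut():

```r
set.seed(1)
x <- runif(100, 0.01, 0.99)

# Bin edges 0, 0.1, ..., 1; each value falls into exactly one interval.
breaks <- seq(0, 1, by = 0.1)
bins   <- cut(x, breaks = breaks, include.lowest = TRUE)

# Count per bin and turn the table into a two-column data frame.
freq <- as.data.frame(table(range = bins), responseName = "frequency")
freq
```

Empty bins are kept with a frequency of 0 because cut() returns a factor with all ten levels.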
[R] Can data.frame be saved as image?
Dear R forum

I have one stupid question, but I have no other solution to it in sight.

Suppose some R process creates graphs etc. along with the main output as a data.frame, e.g.

output1 = data.frame(bands = c("A", "B", "C"), results = c(74, 108, 65))

I normally save this output as a csv file.

But I need to save this output as an image (I understand this is weird, but I need to find some way to do so). E.g. for a graph, I use 'png' as

png("histogram.png", width=480,height=480)
.
..
dev.off()

Please advise.

Regards
Katherine
Re: [R] Can data.frame be saved as image?
Dear Sir,

Thanks a lot for your suggestion. In the meantime I came across

http://stackoverflow.com/questions/10587621/how-to-print-to-paper-a-nicely-formatted-data-frame

and got to know about the package "gridExtra". So I used the following code

library(gridExtra)
png(filename = "output1.png", width=480,height=480)
grid.table(output1)
dev.off()

And that solved it. Thanks again Sir for suggesting the other two packages, 'lattice' and 'ggplot2', as I will definitely like to decipher these two.

I understand that before posting to the forum I should have searched the old mails etc., but I was a bit desperate to know the solution and somehow felt the question was a stupid one. I will remember it next time.

Regards
Katherine

--- On Fri, 21/12/12, jim holtman wrote:
From: jim holtman
Subject: Re: [R] Can data.frame be saved as image?
To: "Katherine Gobin"
Cc: r-help@r-project.org
Date: Friday, 21 December, 2012, 2:39 PM

Do you want to save the dataframe used in the plot and then the plot itself? If so, consider using 'lattice' or 'ggplot2', which create an object for "print"; this would allow you to use 'save' to save both objects in a file. If you want to generate the 'png' file, then you would have to 'save' the dataframe and then 'zip' the .RData and png file into a new file.

So what is it that you intend to do with the data that is saved in the common file?

On Fri, Dec 21, 2012 at 8:59 AM, Katherine Gobin wrote:
> Dear R forum
>
> I have one stupid question, but I have no other solution to it in sight?
>
> Suppose some R process creates graphs etc alongwith main output as data.frame e.g
>
> output1 = data.frame(bands = c("A", "B", "C"), results = c(74, 108, 65))
>
> I normally save this output as some csv file.
>
> But I need to save this output as some image (I understand this is weird, but I need to find out some way to do so) e.g. for graph, I use 'png' as
>
> png("histogram.png", width=480,height=480)
> .
> ..
> dev.off()
>
> Please advise.
>
> Regards
>
> Katherine

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
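If installing gridExtra is not an option, a rough base-R alternative is to draw the printed data frame as monospaced text on an empty plot. This is only a sketch and not the method used in the thread; the output path is a temporary file chosen for illustration, and the result is plainer than grid.table().

```r
output1  <- data.frame(bands = c("A", "B", "C"), results = c(74, 108, 65))
out_file <- file.path(tempdir(), "output1.png")

png(filename = out_file, width = 480, height = 480)
par(mar = c(0, 0, 0, 0))   # no margins, use the whole canvas
plot.new()                 # empty plot with coordinates from 0 to 1

# Capture what print() would show and draw it as one block of text.
table_text <- paste(capture.output(print(output1, row.names = FALSE)),
                    collapse = "\n")
text(x = 0.05, y = 0.95, labels = table_text, adj = c(0, 1), family = "mono")
dev.off()
```

A monospaced font keeps the columns aligned, since print() pads them with spaces.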
[R] lmomco package - Random number generation using Wakeby distribution
Dear R forum

From the given data, I have estimated the parameters of the Wakeby distribution using the lmomco package as

library(lmomco)

(amounts <- read.csv("input_S.csv")$amount)

# ___
# Wakeby distribution - Parameter estimation

N = length(amounts)
lmr = lmom.ub(amounts)
parameters_of_Wakeby = parwak(lmr)

> parameters_of_Wakeby
$type
[1] "wak"

$para
                  xi                alpha
1.18813927666405e+04             0.00e+00
                beta                gamma
            0.00e+00 8.11391042554567e+04
               delta
9.57554297149062e-01

This means the scale parameters are 0.

However, assume all five parameters of the Wakeby distribution (viz. the location parameter m (xi), the scale parameters a, b, and the shape parameters g and d) are available.

Then, how do I generate say 100 random nos. from the Wakeby distribution using these 5 available parameters?

I couldn't find any information about this in lmomco. Kindly guide whether random nos. can be generated or not and, if yes, how it can be done in R.

Thanking in advance

Katherine
Re: [R] lmomco package - Random number generation using Wakeby distribution
Dear Sir,

Thanks a lot for your eye-opener reply. I was just thinking of our usual commands like rnorm, runif etc., so I was wondering if there exists something like rwakeby.

And lastly, I have calculated the parameters using

> lmr = lmom.ub(amounts)
> parameters_of_Wakeby = parwak(lmr)

whereas you have mentioned lmom2par(). Will it create a different set of parameters? Actually I am travelling and don't have R installed on the laptop I am carrying with me to verify the results.

Regards
Katherine

--- On Mon, 21/1/13, David Winsemius wrote:
From: David Winsemius
Subject: Re: [R] lmomco package - Random number generation using Wakeby distribution
To: "Katherine Gobin"
Cc: r-help@r-project.org
Date: Monday, 21 January, 2013, 7:46 PM

On Jan 21, 2013, at 10:30 AM, Katherine Gobin wrote:
> Dear R forum
>
> From the given data, I have estimated the parameters of Wakeby distribution using lmomco package as
>
> library(lmomco)
>
> (amounts <- read.csv("input_S.csv")$amount)
>
> # ___
> # Wakeby distribution - Parameter estimation
>
> N = length(amounts)
> lmr = lmom.ub(amounts)
> parameters_of_Wakeby = parwak(lmr)

It appears you have a) not included the code that produced that output and b) failed to read the Index page for that package

help(package="lmomco")
?rlmomco # Random Deviates of a Distribution

So on the assumption that you have an object in your workspace named "parameters_of_Wakeby" and it is an lmomco produced object like that returned by lmom2par() I would try:

rlmomco(100, parameters_of_Wakeby)

>> parameters_of_Wakeby
>
> $type
> [1] "wak"
>
> $para
>                   xi                alpha
> 1.18813927666405e+04             0.00e+00
>                 beta                gamma
>             0.00e+00 8.11391042554567e+04
>                delta
> 9.57554297149062e-01
>
> This means the scale parameters are 0.
>
> However, assuming, all the five parameters of Wakeby distribution (viz. location parameter m (xi), the scale parameters a, b, and shape parameters g and d) are available.
>
> Then, how do I generate say 100 random no.s using Wakeby distribution w.r.t. these 5 available parameters.
>
> I couldn't find any information about this in lmomco. Kindly guide if random no.s can be generated or not and if yes, how it can be done in r.

You should have been able to find this with:

help.search("random", package="lmomco")

--
David Winsemius
Alameda, CA, USA
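To see what a call such as rlmomco(100, parameters_of_Wakeby) amounts to, here is a base-R sketch of inverse-transform sampling from a Wakeby distribution: draw uniform numbers and push them through the quantile function. It assumes Hosking's parametrization of the Wakeby quantile function, and the function name qwakeby_sketch is hypothetical, not part of lmomco.

```r
# Wakeby quantile function (Hosking's parametrization, assumed here):
#   x(F) = xi + alpha/beta  * (1 - (1 - F)^beta)
#             - gamma/delta * (1 - (1 - F)^(-delta))
# The beta == 0 and delta == 0 branches are the limiting forms.
qwakeby_sketch <- function(f, xi, alpha, beta, gamma, delta) {
  u <- 1 - f
  term_ab <- if (beta == 0) -alpha * log(u) else alpha / beta * (1 - u^beta)
  term_gd <- if (delta == 0) -gamma * log(u) else -gamma / delta * (1 - u^(-delta))
  xi + term_ab + term_gd
}

# Parameter values as printed in the thread.
set.seed(2013)
draws <- qwakeby_sketch(runif(100),
                        xi    = 1.18813927666405e+04,
                        alpha = 0,
                        beta  = 0,
                        gamma = 8.11391042554567e+04,
                        delta = 9.57554297149062e-01)
summary(draws)
```

With alpha = 0 the first term vanishes, so every draw lies at or above xi and the upper tail is very heavy (delta close to 1).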
[R] How to extract values of results in gamlss.tr
Dear R helpers,

I have the following loss data and I need to fit a LEFT truncated Log Normal distribution to this data, which is truncated at 100.

dat = c(1333834,5710254,9987567,7809469,6940935,3473671,1270209,1102523,1124002,
5830159,4302300,3925242,2638409,2324421,7238436,9088709,7439250,4976551,4864319,
8741334,1863770,7098310,4942288,4971829,4986372)

library(gamlss.tr)
gen.trun(5, LOGNO)
result <- gamlss(dat~1, family=LOGNOtr)

# THIS GIVES

> result

Family:  c("LOGNOtr", "left truncated Log Normal")
Fitting method: RS()

Call:  gamlss(formula = dat ~ 1, family = LOGNOtr)

Mu Coefficients:
(Intercept)
      15.23
Sigma Coefficients:
(Intercept)
    -0.3977

Degrees of Freedom for the fit: 2 Residual Deg. of Freedom 23
Global Deviance: 812.568
            AIC: 816.568
            SBC: 819.006

My problem is how do I extract these values of the Mu Coefficients and Sigma Coefficients, if I want to use them for further analyses?

Kindly guide

Katherine Gobin
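A sketch of one way to pull the numbers out, assuming the coef() method for gamlss objects with its what argument, and assuming sigma is fitted on the log scale (the default link for LOGNO), so the printed -0.3977 has to be exponentiated. The first part only uses the values printed above; the second part refits the model and is skipped if gamlss.tr is not installed.

```r
# Coefficients as printed in the post (on the link scale).
mu_coef    <- 15.23
sigma_coef <- -0.3977
sigma_hat  <- exp(sigma_coef)   # back-transform, assuming a log link for sigma
c(mu = mu_coef, sigma = sigma_hat)

# Extracting the same quantities from the fitted object.
if (requireNamespace("gamlss.tr", quietly = TRUE)) {
  library(gamlss.tr)
  dat <- c(1333834, 5710254, 9987567, 7809469, 6940935, 3473671, 1270209,
           1102523, 1124002, 5830159, 4302300, 3925242, 2638409, 2324421,
           7238436, 9088709, 7439250, 4976551, 4864319, 8741334, 1863770,
           7098310, 4942288, 4971829, 4986372)
  gen.trun(5, LOGNO)                         # call copied from the post
  result <- gamlss(dat ~ 1, family = LOGNOtr)

  print(coef(result, what = "mu"))           # intercept of mu
  print(coef(result, what = "sigma"))        # intercept of log(sigma)
  print(exp(coef(result, what = "sigma")))   # sigma on its natural scale
}
```

The extracted values are ordinary named numeric vectors and can be stored or passed on to later calculations.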
[R] Counting various elemnts in a vactor
Dear R forum

I have a vector, say as given below

df = c("F", "C", "F", "B", "D", "A", "D", "D", "A", "F", "D", "F", "B", "C")

I need to find

(1) How many times does each element occur? E.g. in the above vector F occurs 4 times, C occurs 2 times etc.

(2) Depending on the number of occurrences, I need to repeat the element 100 times the number of occurrences, e.g. I need to repeat F 4 * 100 = 400 times, C 2 * 100 = 200 times.

I can manage the second part, i.e. repeating, but I am not able to count the number of times an element appears in a given vector.

Kindly guide

Katherine
Re: [R] Counting various elemnts in a vactor
Dear Sir,

Thanks a lot for your great help. I couldn't have figured it out. Thanks again.

Regards
Katherine

--- On Tue, 26/3/13, D. Rizopoulos wrote:
From: D. Rizopoulos
Subject: Re: [R] Counting various elemnts in a vactor
To: "Katherine Gobin"
Cc: "r-help@r-project.org"
Date: Tuesday, 26 March, 2013, 8:23 AM

try this:

df <- c("F", "C", "F", "B", "D", "A", "D", "D", "A", "F", "D", "F", "B", "C")
tab <- table(df)
tab
rep(names(tab), 100 * tab)

I hope it helps.

Best,
Dimitris

On 3/26/2013 9:12 AM, Katherine Gobin wrote:
> Dear R forum
>
> I have a vector say as given below
>
> df = c("F", "C", "F", "B", "D", "A", "D", "D", "A", "F", "D", "F", "B", "C")
>
> I need to find
>
> (1) how many times each element occurs? e.g. in above vector F occurs 4 times, C occurs 2 times etc.
>
> (2) Depending on the number of occurrences, I need to repeat the element 100 times of the occurrences e.g. I need to repeat F 4 * 100 = 400 times, C 2*100 = 200 times.
>
> I can manage the second part i.e. repeating but I am not able to count the number of times the element is appearing in a given vector.
>
> Kindly guide
>
> Katherine

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/
[R] Archieve of mails from R forum
Dear R helpers,

Every day I receive a great many mails from the R forum, and after some time my INBOX fills up with them. If I haven't accessed my mail for a while, it becomes difficult to keep track, and many times, simply due to the volume (and the lack of time due to office constraints), I have to delete the mails without opening them. I understand this is a huge loss.

If I wish to refer to the old emails that have appeared in the R forum, where do I get these? Is there any list where I can get a subject-wise or thread-wise archive of old emails? I understand that will be an ocean of quality information, one can learn a lot from these old mails, and I wouldn't need to keep track of my emails all the time.

Kindly guide.

Regards
Katherine
Re: [R] Archieve of mails from R forum
Dear Sir,

Thanks a lot for the input. I am sure it will go a long way in helping me understand R. Thanks again.

Regards
Katherine

--- On Wed, 27/3/13, Mason wrote:
From: Mason
Subject: Re: [R] Archieve of mails from R forum
To: "Marc Schwartz"
Cc: "Katherine Gobin", "r-help@r-project.org help"
Date: Wednesday, 27 March, 2013, 7:30 PM

http://r-help.markmail.org/ has a nice interface for searching the archives, too.

On Wed, Mar 27, 2013 at 12:18 PM, Marc Schwartz wrote:

On Mar 27, 2013, at 1:58 PM, Katherine Gobin wrote:
> Dear R helpers,
>
> Everyday I do receive many many mails from R forum and after some period of times, INBOX is filled with numerous mails. At times if for some period of time, I haven't accessed mails, it becomes difficult to keep track of mails and many times simply due to the volume (and owing to the lack of time due to office constraints), I have to simply delete the mails without opening them and I understand this is a huge loss.
>
> If in case I wish to refer to all the old emails that have been appeared in the R forum, where do I get these? Is there any list where I will get subject-wise or thread-wise archive of old emails? I understand that will be an ocean of quality information and one can learn a lot from these old mails and I don't need to keep track of my emails all the time.
>
> Kindly guide.
>
> Regards
>
> Katherine

The official archives for R-Help are here:

https://stat.ethz.ch/pipermail/r-help/

and these are mirrored in various locations, such as:

http://www.mail-archive.com/r-help@stat.math.ethz.ch/
http://dir.gmane.org/gmane.comp.lang.r.general

You can also search the archives for all R lists at:

http://rseek.org/
http://finzi.psych.upenn.edu/search.html
http://tolstoy.newcastle.edu.au/R/

My recommendation would be to set up a mail filter or rule (using r-help@r-project.org in the sender and cc: address fields) so that the list e-mails are automatically moved from your main inbox to a folder just for these e-mails. You can then browse them as your schedule permits, rather than having them interspersed with other e-mails in the same location. I do this with a number of the R related lists and have a folder for each one to keep them separated. Most e-mail clients and/or online services have some type of filtering or rule configuration available to do this.

Regards,

Marc Schwartz
[R] How to delete Identical columns
Dear R forum

Suppose I have a data.frame

df = data.frame(id = c(1:6), x = c(15, 21, 14, 21, 14, 38), y = c(36, 38, 55, 11, 5, 18), x.1 = c(15, 21, 14, 21, 14, 38), z = c("D", "B", "A", "F", "H", "P"))

> df
  id  x  y x.1 z
1  1 15 36  15 D
2  2 21 38  21 B
3  3 14 55  14 A
4  4 21 11  21 F
5  5 14  5  14 H
6  6 38 18  38 P

Clearly columns x and x.1 are identical. In reality, I have a large data.frame and can't make out which columns are identical, but I am sure that a column with name say x is repeated as x.1, x.2 etc.

How do I automatically identify and retain only one column (in this example column x) among the identical columns, besides the other non-identical columns (viz. id, y and z)?

Regards
Katherine
Re: [R] How to delete Identical columns
Dear Sir,

Thanks a lot for your wonderful solution. When I applied it to my data.frame, however, it was deleting many other columns that also have repeated-type column names, i.e. suppose I wanted only to delete say ABC.1, ABC.2 etc. and retain XYZ, XYZ.1, XYZ2 etc. But this was not happening and along with the ABC series, it was deleting the XYZ series too. So I changed the command you had given from

df[ -grep( "\\.", names( df))]

to

df[ -grep( "XYZ\\.", names( df))]

And it led me to the desired result. Thanks again sir.

Regards
Katherine

--- On Thu, 28/3/13, Gerrit Eichner wrote:
From: Gerrit Eichner
Subject: Re: [R] How to delete Identical columns
To: "Katherine Gobin"
Cc: r-help@r-project.org
Date: Thursday, 28 March, 2013, 8:58 AM

Hi, Katherine,

IF the naming scheme of the columns of your data frame is consistent and if duplicated columns appear THEN (something like)

df[ -grep( "\\.", names( df))]

could help. (But it's maybe more efficient to avoid - a priori - producing duplicated columns, if the data frame is large, as you say.)

Regards -- Gerrit

On Thu, 28 Mar 2013, Katherine Gobin wrote:
> Dear R forum
>
> Suppose I have a data.frame
>
> df = data.frame(id = c(1:6), x = c(15, 21, 14, 21, 14, 38), y = c(36, 38, 55, 11, 5, 18), x.1 = c(15, 21, 14, 21, 14, 38), z = c("D", "B", "A", "F", "H", "P"))
>
>> df
>   id  x  y x.1 z
> 1  1 15 36  15 D
> 2  2 21 38  21 B
> 3  3 14 55  14 A
> 4  4 21 11  21 F
> 5  5 14  5  14 H
> 6  6 38 18  38 P
>
> Clearly columns x and x.1 are identical. In reality, I have a large data.frame and can't make out which columns are identical, but I am sure that column with name say x is repeated as x.1, x.2 etc.
>
> How to automatically identify and retain only one column (in this example column x) among the identical columns besides other non-identical columns (viz. id, y and z).
>
> Regards
>
> Katherine
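The grep() approach in this thread depends on column names. As an alternative sketch, columns can be compared by content instead: duplicated() applied to the list of columns flags every column whose values repeat an earlier column, whatever it is called. This is not the solution given above, only a base-R variation on it.

```r
df <- data.frame(id  = 1:6,
                 x   = c(15, 21, 14, 21, 14, 38),
                 y   = c(36, 38, 55, 11, 5, 18),
                 x.1 = c(15, 21, 14, 21, 14, 38),
                 z   = c("D", "B", "A", "F", "H", "P"))

# TRUE for each column whose contents equal those of an earlier column.
is_dup <- duplicated(as.list(df))
names(df)[is_dup]

# Keep the first copy of each distinct column.
dedup <- df[!is_dup]
dedup
```

Columns such as XYZ.1 that merely have a dotted name but different values are left alone, which avoids the problem described in the reply above.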
[R] Better way of writing R code
Dear R forum, (Pl note this is not a finance problem) I have two data.frames as currency_df = data.frame(current_date = c("3/4/2013", "3/4/2013", "3/4/2013", "3/4/2013"), issue_date = c("27/11/2012", "9/12/2012", "14/01/2013", "28/02/2013"), maturity_date = c("27/04/2013", "3/5/2013", "14/6/2013", "28/06/2013"), currency = c("USD", "USD", "GBP", "SEK"), other_currency = c("EURO", "CAD", "CHF", "USD"), transaction = c("Buy", "Buy", "Sell", "Buy"), units_currency = c(10, 25000, 15, 4), units_other_currency = c(78000, 25350, 99200, 6150)) rate_df = data.frame(date = c("28/3/2013","27/3/2013","26/3/2013","25/3/2013","28/3/2013","27/3/2013","26/3/2013", "25/3/2013","28/3/2013","27/3/2013","26/3/2013","25/3/2013","28/3/2013","27/3/2013","26/3/2013", "25/3/2013","28/3/2013","27/3/2013","26/3/2013","25/3/2013","28/3/2013","27/3/2013","26/3/2013", "25/3/2013","28/3/2013","27/3/2013","26/3/2013","25/3/2013","28/3/2013","27/3/2013","26/3/2013", "25/3/2013","28/3/2013","27/3/2013","26/3/2013","25/3/2013"), currency = c("USD","USD","USD","USD", "USD", "USD", "USD","USD","USD","USD", "USD","USD", "GBP","GBP","GBP","GBP","GBP","GBP","GBP","GBP", "GBP","GBP", "GBP","GBP", "EURO","EURO","EURO","EURO","EURO","EURO","EURO", "EURO", "EURO","EURO", "EURO","EURO"), tenor = c("1 day","1 day","1 day","1 day","1 week","1 week","1 week","1 week","2 weeks","2 weeks","2 weeks","2 weeks","1 day","1 day","1 day","1 day","1 week","1 week","1 week","1 week","2 weeks","2 weeks","2 weeks","2 weeks","1 day","1 day","1 day","1 day","1 week","1 week","1 week","1 week","2 weeks","2 weeks","2 weeks","2 weeks"), rate = c(0.156,0.157,0.157,0.155,0.1752,0.1752,0.1752,0.1752,0.1752,0.1752,0.1752, 0.1752,0.48625, 0.485,0.48625,0.4825,0.49,0.49125,0.4925,0.49,0.49375,0.49125,0.4925, 0.49125,0.02643,0.02214, 0.02214,0.01929,0.034,0.034,0.034125,0.034,0.044,0.044, 0.041,0.045)) # ___ # 1st data.frame > currency_df current_date issue_date maturity_date currency 1 3/4/2013 27/11/2012 27/04/2013 USD 2 3/4/2013 
9/12/2012 3/5/2013 USD 3 3/4/2013 14/01/2013 14/6/2013 GBP 4 3/4/2013 28/02/2013 28/06/2013 SEK other_currency transaction units_currency 1 EURO Buy 10 2 CAD Buy 25000 3 CHF Sell 15 4 USD Buy 4 units_other_currency 1 78000 2 25350 3 99200 4 6150 # ... # 2nd data.frame > rate_df date currency tenor rate 1 28/3/2013 USD 1 day 0.156000 2 27/3/2013 USD 1 day 0.157000 3 26/3/2013 USD 1 day 0.157000 4 25/3/2013 USD 1 day 0.155000 5 28/3/2013 USD 1 week 0.175200 6 27/3/2013 USD 1 week 0.175200 7 26/3/2013 USD 1 week 0.175200 8 25/3/2013 USD 1 week 0.175200 9 28/3/2013 USD 2 weeks 0.175200 10 27/3/2013 USD 2 weeks 0.175200 11 26/3/2013 USD 2 weeks 0.175200 12 25/3/2013 USD 2 weeks 0.175200 13 28/3/2013 GBP 1 day 0.486250 14 27/3/2013 GBP 1 day 0.485000 15 26/3/2013 GBP 1 day 0.486250 16 25/3/2013 GBP 1 day 0.482500 17 28/3/2013 GBP 1 week 0.49 18 27/3/2013 GBP 1 week 0.491250 19 26/3/2013 GBP 1 week 0.492500 20 25/3/2013 GBP 1 week 0.49 21 28/3/2013 GBP 2 weeks 0.493750 22 27/3/2013 GBP 2 weeks 0.491250 23 26/3/2013 GBP 2 weeks 0.492500 24 25/3/2013 GBP 2 weeks 0.491250 25 28/3/2013 EURO 1 day 0.026430 26 27/3/2013 EURO 1 day 0.022140 27 26/3/2013 EURO 1 day 0.022140 28 25/3/2013 EURO 1 day 0.019290 29 28/3/2013 EURO 1 week 0.034000 30 27/3/2013 EURO 1 week 0.034000 31 26/3/2013 EURO 1 week 0.034125 32 25/3/2013 EURO 1 week 0.034000 33 28/3/2013 EURO 2 weeks 0.044000 34 27/3/2013 EURO 2 weeks 0.044000 35 26/3/2013 EURO 2 weeks 0.041000 36 25/3/2013 EURO 2 weeks 0.045000 # ___ Using plyr and reshape libraries, I have converted the rate_df into tabular form as date USD_1 day USD_1 week USD_2 weeks GBP_1 day 1 25/3/2013 0.155 0.1752 0.1752 0.48250 2 26/3/2013 0.157 0.1752 0.1752 0.48625 3 27/3/2013 0.157 0.1752 0.1752 0.48500 4 28/3/2013 0.156 0.1752 0.1752 0.48625 GBP_1 week GBP_2 weeks EURO_1 day EURO_1 week 1 0.49000 0.49125 0.01929 0.034000 2 0.49250 0.49250 0.02214 0.034125 3 0.49125 0.49125 0.02214 0.034000 4 0.49000 0.49375 0.02643 0.034000 EURO_2 weeks 1 0.045 2 0.041 
3 0.044 4 0.044 # __ Depending on the maturity period, I hav
Re: [R] Better way of writing R code
Dear Sirs,

I sincerely apologize for the blunder at my end. I had been told that one cannot, or should not, send any attachments. In the past, when I tried to attach some files, the message was displayed without the attachment. Also, at times it becomes very difficult to attach a csv file. As my input files are csv files, I was under the impression that we cannot attach files to messages on this forum.

I once again apologize to all of you for the inconvenience caused.

Regards
Katherine

--- On Thu, 4/4/13, Gabor Grothendieck wrote:
From: Gabor Grothendieck
Subject: Re: [R] Better way of writing R code
To: "Adams, Jean"
Cc: "Katherine Gobin", "R help"
Date: Thursday, 4 April, 2013, 2:48 PM

On Thu, Apr 4, 2013 at 9:32 AM, Adams, Jean wrote:
> Katherine,
>
> You should cc the R-help on all correspondence.
> The more eyes that see your query, the quicker and probably the better the
> response will be.
> Send your message as plain text with no attachments ... so, include your
> code, and use dput() to share some example data.
>

Although many types of attachments are not allowed it seems that .txt, .R, .png, .pdf and possibly certain other types are accepted.

[[alternative HTML version deleted]]
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Better way of writing R code
Dear Sir,

Thanks a lot for your great help. I do appreciate it a lot. In my earlier mail, where I had attached some files, I realized yesterday that instead of sending the R code customized by me based on your guidance, I had by mistake attached the contents of the email. I do apologize to you for the same.

Thanks once again and sorry for the inconvenience caused by me.

Regards
Katherine

--- On Fri, 5/4/13, Adams, Jean wrote:
From: Adams, Jean
Subject: Re: [R] Better way of writing R code
To: "Katherine Gobin"
Cc: "R help"
Date: Friday, 5 April, 2013, 2:40 PM

Katherine,

To preserve the original order, you could create a new variable for the currency data frame (BEFORE the merges), then use this variable to reorder at the end.

    currency_df$orig.order <- 1:dim(currency_df)[1]

You can do another merge for the other currency, you just need to specify the columns that you want to merge by. The rate information will be called rate.x for the first currency (from the first merge) and rate.y for the other currency (from the second merge).

    both2 <- merge(both, rate_df, by.x=c("other_currency", "tenor"), by.y=c("currency", "tenor"), all.x=TRUE)

Then reorder.

    both2 <- both2[order(both2$orig.order), ]

Jean

On Thu, Apr 4, 2013 at 3:19 AM, Katherine Gobin wrote:

Dear Mr Adams,

I sincerely apologize for taking the liberty of writing to you. I wholeheartedly thank you for the wonderful solution you provided me yesterday. I have customized the R code you provided and it's yielding the results. I can't imagine me repeating the 1 lines code after receiving such a powerful solution from you. In future it will save lots of effort on my side, as I always deal with such situations.

There is one small problem though. I am dealing with pairs of currencies, e.g.

    currency  other_currency  transaction
    USD       EURO            Buy
    USD       CAD             Buy
    GBP       CHF             Sell
    SEK       USD             Buy

The R code gives me the currency rates (w.r.t. the appropriate "tenor"); however, I need the corresponding rates pertaining to the other currency too. For example, in the first case the applicable maturity period is one month, so the R code gives me the one month LIBOR w.r.t. USD, but I also need the corresponding one month LIBOR w.r.t. the other currency, i.e. EURO in this case. I tried to improve upon the merge statement and consulted "?merge", but couldn't.

Another problem is that the order of the original portfolio is not maintained, but I think I can manage the order.

With warm regards
Katherine

--- On Wed, 3/4/13, Adams, Jean wrote:
From: Adams, Jean
Subject: Re: [R] Better way of writing R code
To: "Katherine Gobin"
Cc: "R help"
Date: Wednesday, 3 April, 2013, 2:08 PM

Katherine,

You don't need to convert rate_df into tabular form. You just need to categorize each row in currency_df into a "tenor". Then you can merge the two data frames (by currency and tenor). For example ...

    # convert dates to R dates, to calculate the number of days to maturity
    # I am assuming this is the number of days from the current date to the maturity date
    currency_df$maturity <- as.Date(currency_df$maturity_date, "%d/%m/%Y")
    currency_df$current <- as.Date(currency_df$current_date, "%d/%m/%Y")
    currency_df$days2mature <- as.numeric(currency_df$maturity - currency_df$current)

    # categorize the number of days to maturity as you wish
    # you may need to change the breaks= option to suit your needs
    # read about the cut function to make sure you get the cut points included in the proper category, ?cut
    currency_df$tenor <- cut(currency_df$days2mature,
        breaks=c(0, 1, 7, 14, seq(from=30.5, length=12, by=30.5)),
        labels=c("1 day", "1 week", "2 weeks", "1 month", paste(2:12, "months")))

    # merge the currency_df and rate_df
    # this will work better with real data, since the example data you provided didn't have matching tenors
    both <- merge(currency_df, rate_df, all.x=TRUE)

Jean

On Wed, Apr 3, 2013 at 5:21 AM, Katherine Gobin wrote:

Dear R forum, (Pl note this is not a finance
problem) I have two data.frames as currency_df = data.frame(current_date = c("3/4/2013", "3/4/2013", "3/4/2013", "3/4/2013"), issue_date = c("27/11/2012", "9/12/2012", "14/01/2013", "28/02/2013"), maturity_date = c("27/04/2013", "3/5/2013", "14/6/2013", "28/06/2013"), currency = c("USD", "USD", "GBP", "SEK"), other_currency = c("EURO", "CAD", "CHF", "USD"), transactio
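The two-step merge Jean describes in this thread can be sketched end to end as follows. This is only an illustration under stated assumptions: the two small data frames are toy inputs (the tenor column is filled in by hand rather than derived with cut()), and the rates are a handful of values borrowed from rate_df in the original post.

```r
# Toy inputs (assumed for illustration; tenor is set by hand here,
# in the thread it is derived from the days to maturity with cut())
currency_df <- data.frame(
  currency       = c("USD", "GBP"),
  other_currency = c("EURO", "USD"),
  tenor          = c("1 week", "1 day"),
  stringsAsFactors = FALSE
)
rate_df <- data.frame(
  currency = c("USD", "USD", "GBP", "GBP", "EURO", "EURO"),
  tenor    = c("1 day", "1 week", "1 day", "1 week", "1 day", "1 week"),
  rate     = c(0.156, 0.1752, 0.48625, 0.49, 0.02643, 0.034),
  stringsAsFactors = FALSE
)

# Remember the original row order BEFORE merging, because merge() sorts by key
currency_df$orig.order <- seq_len(nrow(currency_df))

# First merge: rate of the main currency
both <- merge(currency_df, rate_df, by = c("currency", "tenor"), all.x = TRUE)

# Second merge: rate of the other currency. The clashing rate columns are
# renamed rate.x (main currency) and rate.y (other currency).
both2 <- merge(both, rate_df,
               by.x = c("other_currency", "tenor"),
               by.y = c("currency", "tenor"),
               all.x = TRUE)

# Restore the original portfolio order
both2 <- both2[order(both2$orig.order), ]
print(both2)
```

Keeping all.x = TRUE in both merges means a trade whose tenor has no matching rate is kept with NA instead of being silently dropped.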
[R] lmomco - Three-Parameter Pearson 5 Distribution
Dear R forum,

I am a bit confused, so please guide me:

(1) Is the "Pearson Type III Distribution" as given in the lmomco package the same as the Three-Parameter Pearson 5 Distribution? If not, how do I estimate the parameters of the Three-Parameter Pearson 5 Distribution?

(2) Is there any other R forum dealing only with statistical queries?

Kindly guide.

Regards
Katherine
[R] Package ‘FAdist’ - Log-Pearson Type III Distribution
Dear Sir,

I am referring to your package "FAdist". I wish to know how to estimate the parameters of the "Log-Pearson Type III Distribution". Would it be possible for you to guide me, or to tell me which R package I can use to estimate the parameters?

Regards
Katherine
[R] Sorting data.frame and again sorting within data.frame
Dear R forum,

I have a data.frame as defined below:

    df = data.frame(names = c("C", "A", "A", "B", "C", "B", "A", "B", "C"),
      dates = c("4/15/2013", "4/13/2013", "4/15/2013", "4/13/2013", "4/13/2013", "4/15/2013", "4/14/2013", "4/14/2013", "4/14/2013"),
      values = c(10, 31, 31, 17, 11, 34, 102, 47, 29))

    > df
      names     dates values
    1     C 4/15/2013     10
    2     A 4/13/2013     31
    3     A 4/15/2013     31
    4     B 4/13/2013     17
    5     C 4/13/2013     11
    6     B 4/15/2013     34
    7     A 4/14/2013    102
    8     B 4/14/2013     47
    9     C 4/14/2013     29

I need to sort df first on "names" in increasing order and then further on "dates" in decreasing order, i.e. I need

    names     dates values
        A 4/15/2013     31
        A 4/14/2013    102
        A 4/13/2013     31
        B 4/15/2013     34
        B 4/14/2013     47
        B 4/13/2013     17
        C 4/15/2013     10
        C 4/14/2013     29
        C 4/13/2013     11

I tried

    df_sorted = df[order(df$names, (as.Date(df$dates, "%m/%d/%Y")), decreasing = TRUE),]

    > df_sorted
      names     dates values
    1     C 4/15/2013     10
    9     C 4/14/2013     29
    5     C 4/13/2013     11
    6     B 4/15/2013     34
    8     B 4/14/2013     47
    4     B 4/13/2013     17
    3     A 4/15/2013     31
    7     A 4/14/2013    102
    2     A 4/13/2013     31

Here decreasing = TRUE applies to both keys, so the names come out in decreasing order as well. I need A to appear first with all three corresponding dates in decreasing order, then B, and so on. Please guide.

With regards
Katherine
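One possible way to get the mixed ordering asked for above, offered as a sketch rather than as the answer given on the list: convert the dates to the Date class and negate their numeric value, so that a single ascending order() call sorts names upward and dates downward.

```r
df <- data.frame(
  names  = c("C", "A", "A", "B", "C", "B", "A", "B", "C"),
  dates  = c("4/15/2013", "4/13/2013", "4/15/2013", "4/13/2013", "4/13/2013",
             "4/15/2013", "4/14/2013", "4/14/2013", "4/14/2013"),
  values = c(10, 31, 31, 17, 11, 34, 102, 47, 29),
  stringsAsFactors = FALSE
)

# Real Date objects sort chronologically; plain "m/d/Y" strings would not
d <- as.Date(df$dates, "%m/%d/%Y")

# names ascending, dates descending (negating the day count flips the order)
df_sorted <- df[order(df$names, -as.numeric(d)), ]
print(df_sorted)
```

The negation trick works for any numeric key; for a character key that has to be sorted in decreasing order, wrapping it as -xtfrm(key) plays the same role.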
Re: [R] Sorting data.frame and again sorting within data.frame
Dear Sir,

Thanks a lot for your valuable input and guidance.

Regards
Katherine

--- On Mon, 15/4/13, Jeff Newmiller wrote:
From: Jeff Newmiller
Subject: Re: [R] Sorting data.frame and again sorting within data.frame
To: "David Winsemius", "Katherine Gobin"
Cc: r-help@r-project.org
Date: Monday, 15 April, 2013, 5:33 PM

Yes, that would be because she converted to Date on the fly in her example, and so apparently did not need this reminder.

---
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

David Winsemius wrote:
>
>On Apr 14, 2013, at 11:01 PM, Katherine Gobin wrote:
>
>> I need to sort df first on "names" in increasing order and then
>> further on "dates" in a decreasing order i.e. I need
>>
>
>So far no one has pointed out that these are not really "Dates" in the
>R sense and will not sort correctly if any of the proposed methods are
>applied to sequences that extend beyond 6 months, i.e., from October
>onward. You would be advised to convert to real Date-classed
>variables.
>
>?strptime
>?as.Date
[R] Splitting the Elements of character vector
Dear R forum

I have a data.frame

    df = data.frame(currency_type = c("EURO_o_n", "EURO_o_n", "EURO_1w", "EURO_1w", "USD_o_n", "USD_o_n", "USD_1w", "USD_1w"),
      rates = c(0.47, 0.475, 0.461, 0.464, 1.21, 1.19, 1.41, 1.43))

      currency_type rates
    1      EURO_o_n 0.470
    2      EURO_o_n 0.475
    3       EURO_1w 0.461
    4       EURO_1w 0.464
    5       USD_o_n 1.210
    6       USD_o_n 1.190
    7        USD_1w 1.410
    8        USD_1w 1.430

I need to split the values appearing under currency_type to obtain the following data.frame in the "original order":

    currency tenor rates
        EURO   o_n 0.470
        EURO   o_n 0.475
        EURO    1w 0.461
        EURO    1w 0.464
         USD   o_n 1.210
         USD   o_n 1.190
         USD    1w 1.410
         USD    1w 1.430

Basically I need to split the currency names and tenors. I tried

    strsplit(df$currency_type, "_")
    Error in strsplit(df$currency_type, "_") : non-character argument

Kindly guide

Katherine
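The "non-character argument" error arises because data.frame() stored currency_type as a factor (the default before R 4.0), and strsplit() only accepts character vectors. A minimal sketch of one way around it, assuming the currency code never contains an underscore: convert to character and cut at the first underscore only, so that a tenor such as "o_n" stays in one piece.

```r
df <- data.frame(
  currency_type = c("EURO_o_n", "EURO_o_n", "EURO_1w", "EURO_1w",
                    "USD_o_n", "USD_o_n", "USD_1w", "USD_1w"),
  rates = c(0.47, 0.475, 0.461, 0.464, 1.21, 1.19, 1.41, 1.43),
  stringsAsFactors = TRUE   # reproduces the factor column from the post
)

# strsplit() and sub() need character input, not a factor
ct <- as.character(df$currency_type)

# Everything before the first "_" is the currency, everything after is the tenor
df$currency <- sub("_.*$", "", ct)
df$tenor    <- sub("^[^_]*_", "", ct)

# Rows are never reordered, so the original order is preserved
df_new <- df[, c("currency", "tenor", "rates")]
print(df_new)
```

A plain strsplit(ct, "_") would break "EURO_o_n" into three pieces, which is why the split is anchored on the first underscore here.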
[R] Creating a vector with repeating dates
Dear R forum

I have a data.frame

    df = data.frame(dates = c("4/15/2013", "4/14/2013", "4/13/2013", "4/12/2013"), values = c(47, 38, 56, 92))

I need to create a vector by repeating the dates as

    "Current_date", 4/15/2013, 4/14/2013, 4/13/2013, 4/12/2013, "Current_date", 4/15/2013, 4/14/2013, 4/13/2013, 4/12/2013, "Current_date", 4/15/2013, 4/14/2013, 4/13/2013, 4/12/2013

i.e. I need to create a new vector as given below, which I need to use for some other purpose.

    Current_date
    4/15/2013
    4/14/2013
    4/13/2013
    4/12/2013
    Current_date
    4/15/2013
    4/14/2013
    4/13/2013
    4/12/2013
    Current_date
    4/15/2013
    4/14/2013
    4/13/2013
    4/12/2013

Is it possible to construct such a column?

Regards
Katherine
Re: [R] Creating a vector with repeating dates
Dear Andrija Djurovic,

Thanks for the suggestion. I am aware of "rep". However, here I need to repeat not only the dates but also the string "Current_date". Thus, I need to create a vector (to be included in some other data.frame), named say "dt", which will contain

    dt
    Current_date
    4/15/2013
    4/14/2013
    4/13/2013
    4/12/2013
    Current_date
    4/15/2013
    4/14/2013
    4/13/2013
    4/12/2013
    Current_date
    4/15/2013
    4/14/2013
    4/13/2013
    4/12/2013

So this is a combination of dates and a string. Hence, I am just wondering whether it is possible to create such a vector or not.

Regards
Katherine

--- On Wed, 17/4/13, andrija djurovic wrote:
From: andrija djurovic
Subject: Re: [R] Creating a vector with repeating dates
To: "Katherine Gobin"
Cc: "r-help@r-project.org"
Date: Wednesday, 17 April, 2013, 10:14 AM

?rep
Re: [R] Creating a vector with repeating dates
Dear Sir,

Thanks a lot for your valuable suggestions and help.

Regards
Katherine

--- On Wed, 17/4/13, Jim Lemon wrote:
From: Jim Lemon
Subject: Re: [R] Creating a vector with repeating dates
To: "Katherine Gobin"
Cc: r-help@r-project.org
Date: Wednesday, 17 April, 2013, 10:35 AM

On 04/17/2013 07:11 PM, Katherine Gobin wrote:
> I need to create a new vector as given below which I need to use for
> some other purpose.
>
> Is it possible to construct such a column?
>

Hi Katherine,

How about:

    rep(c("Current date",paste(4,15:12,2013,sep="/")),3)

Jim
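Jim's one-liner rebuilds the dates from their parts. A variant that takes them from the data frame instead, so that nothing is hard coded, might look like the sketch below; n_rep is a hypothetical name for the number of repetitions, and the label uses the underscore spelling from the original question.

```r
df <- data.frame(
  dates  = c("4/15/2013", "4/14/2013", "4/13/2013", "4/12/2013"),
  values = c(47, 38, 56, 92),
  stringsAsFactors = FALSE
)

n_rep <- 3  # hypothetical: how many times the block should repeat

# as.character() also covers the case where dates was read in as a factor.
# Mixing a label with the date strings is fine because the result is a
# character vector, not a Date vector.
dt <- rep(c("Current_date", as.character(df$dates)), times = n_rep)
print(dt)
```

Because dt is plain character, it can be added as a column to any data frame with a matching number of rows.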
[R] Error with function
Dear R forum,

I have a data.frame as given below:

    df = data.frame(tran = c("tran1", "tran2", "tran3", "tran4"), tenor = c("2w", "1m", "7m", "3m"))

Also, I define

    libor_tenor_labels = as.character(c("o_n", "1w", "2w", "1m", "2m", "3m", "4m", "5m", "6m", "7m", "8m", "9m", "10m", "11m", "12m"))

    > df
       tran tenor
    1 tran1    2w
    2 tran2    1m
    3 tran3    7m
    4 tran4    3m

libor_tenor_labels can be anything and need not have 15 entries. Also, df need not consist of only 4 records. Basically, I can't HARD CODE anything.

In df, the first tenor is 2w. So I need to define the previous tenor as "1w" and the next tenor as "1m", i.e. I need the output

    > df_new
       tran tenor prev_tenor nxt_tenor
    1 tran1    2w         1w        1m
    2 tran2    1m         2w        2m
    3 tran3    7m         6m        8m
    4 tran4    3m         2m        4m

I have two special cases also. If the tenor is "o_n" or "12m", i.e. one of the extremes, I need to adjust the tenors as given in the code.

My code:

    tenor_function = function(tran, tenor) {
      if (tenor == libor_tenor_labels[1]) {
        prev_tenor = libor_tenor_labels[1]
        nxt_tenor = libor_tenor_labels[2]
      }
      for (i in 2:(length(libor_tenor_labels)-1)) {
        if (tenor == libor_tenor_labels[i]) {
          prev_tenor = libor_tenor_labels[i-1]
          nxt_tenor = libor_tenor_labels[i+1]
        }
      }
      if (tenor == libor_tenor_labels[length(libor_tenor_labels)]) {
        prev_tenor = libor_tenor_labels[(length(libor_tenor_labels)-1)]
        nxt_tenor = libor_tenor_labels[length(libor_tenor_labels)]
      }
      return(data.frame(tran = tran, prev_tenor = prev_tenor, tenor = tenor, nxt_tenor = nxt_tenor)
    }

    (tenor_libors = ddply(.data = df, .variables = "tran", .fun = function(x) tenor_function(tran = x$tran, tenor = x$tenor)))

ERROR - I get the following error:

    Error: unexpected '}' in: " }"
    >
    > (tenor_libors = ddply(.data = df, .variables = "tran", .fun = function(x)
    > tenor_function(tran = x$tran, tenor = x$tenor)))
    Error in .fun(piece, ...) : could not find function "tenor_function"

Kindly guide

With warm regards
Katherine
[R] Fw: Error with function - USING library(plyr)
Dear R forum,

Please refer to my query regarding "Error with function". I forgot to mention that I am using the "plyr" library. Sorry for the inconvenience.

Regards
Katherine

--- On Tue, 23/4/13, Katherine Gobin wrote:
From: Katherine Gobin
Subject: [R] Error with function
To: r-help@r-project.org
Date: Tuesday, 23 April, 2013, 7:06 AM
[R] Fw: " PROBLEM SOLVED" - Error with function
Dear R forum

Please refer to my query captioned "Error with function". I had missed a closing bracket ")" in the return statement and hence I was getting the error. I had struggled for more than 2 hours to find the problem and only then posted to the forum. I sincerely apologize to all for consuming your valuable time. Thanks for the efforts at your end.

Regards
Katherine

--- On Tue, 23/4/13, Katherine Gobin wrote:
From: Katherine Gobin
Subject: [R] Error with function
To: r-help@r-project.org
Date: Tuesday, 23 April, 2013, 7:06 AM
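With the missing ")" restored the original function runs, but the same lookup can also be written without a loop or plyr. The following is a sketch of that alternative, not the code from the thread: match() finds the position of each tenor in libor_tenor_labels, and pmax()/pmin() clamp the neighbouring positions at the two extremes, which reproduces the special cases for "o_n" and "12m".

```r
libor_tenor_labels <- c("o_n", "1w", "2w", "1m", "2m", "3m", "4m", "5m",
                        "6m", "7m", "8m", "9m", "10m", "11m", "12m")

df <- data.frame(
  tran  = c("tran1", "tran2", "tran3", "tran4"),
  tenor = c("2w", "1m", "7m", "3m"),
  stringsAsFactors = FALSE
)

# Position of each tenor in the label vector (NA if the tenor is unknown)
idx <- match(as.character(df$tenor), libor_tenor_labels)
n   <- length(libor_tenor_labels)

df_new <- data.frame(
  tran       = df$tran,
  tenor      = df$tenor,
  # clamp at 1 and n: the first label is its own "previous",
  # the last label is its own "next"
  prev_tenor = libor_tenor_labels[pmax(idx - 1, 1)],
  nxt_tenor  = libor_tenor_labels[pmin(idx + 1, n)],
  stringsAsFactors = FALSE
)
print(df_new)
```

Nothing here depends on the label vector having 15 entries or on df having 4 rows, which addresses the "can't hard code anything" requirement.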
[R] Linear Interpolation : Missing rates
Dear R forum

I have a data.frame as

    df = data.frame(rate_name = c("USD_1w", "USD_1w", "USD_1w", "USD_1w", "USD_1m", "USD_1m", "USD_1m", "USD_1m", "USD_2m", "USD_2m", "USD_2m", "USD_2m", "GBP_1w", "GBP_1w", "GBP_1w", "GBP_1w", "GBP_1m", "GBP_1m", "GBP_1m", "GBP_1m", "GBP_2m", "GBP_2m", "GBP_2m", "GBP_2m", "EURO_1w", "EURO_1w", "EURO_1w", "EURO_1w", "EURO_2w", "EURO_2w", "EURO_2w", "EURO_2w", "EURO_2m", "EURO_2m", "EURO_2m", "EURO_2m"),
      rates = c(2.05, 2.07, 2.06, 2.06, 2.22, 2.24, 2.23, 2.23, 2.31, 2.33, 2.33, 2.31, 1.06, 1.08, 1.08, 1.08, 1.21, 1.21, 1.23, 1.21, 1.41, 1.39, 1.39, 1.37, 1.82, 1.82, 1.81, 1.80, 1.98, 1.98, 1.97, 1.97, 2.1, 2.09, 2.09, 2.11))

    currency = c("EURO", "GBP", "USD")
    tenor = c("1w", "2w", "1m", "2m", "3m")

    > df
       rate_name rates
    1     USD_1w  2.05
    2     USD_1w  2.07
    3     USD_1w  2.06
    4     USD_1w  2.06
    5     USD_1m  2.22
    6     USD_1m  2.24
    7     USD_1m  2.23
    8     USD_1m  2.23
    9     USD_2m  2.31
    10    USD_2m  2.33
    11    USD_2m  2.33
    12    USD_2m  2.31
    13    GBP_1w  1.06
    14    GBP_1w  1.08
    15    GBP_1w  1.08
    16    GBP_1w  1.08
    17    GBP_1m  1.21
    18    GBP_1m  1.21
    19    GBP_1m  1.23
    20    GBP_1m  1.21
    21    GBP_2m  1.41
    22    GBP_2m  1.39
    23    GBP_2m  1.39
    24    GBP_2m  1.37
    25   EURO_1w  1.82
    26   EURO_1w  1.82
    27   EURO_1w  1.81
    28   EURO_1w  1.80
    29   EURO_2w  1.98
    30   EURO_2w  1.98
    31   EURO_2w  1.97
    32   EURO_2w  1.97
    33   EURO_2m  2.10
    34   EURO_2m  2.09
    35   EURO_2m  2.09
    36   EURO_2m  2.11

As can be seen, USD_2w, GBP_2w and EURO_1m are missing, and I need to INTERPOLATE these rates, which can be done using approx or approxfun. In reality I can have many currencies with many tenors. The problem is that when the data.frame "df" is read or accessed in R, I am not aware which tenor is missing. For a given currency, it is possible that more than one consecutive tenor is missing, e.g. in the case of EURO I may have EURO_1w, EURO_2w and then EURO_4m, so EURO_1m, EURO_2m and EURO_3m are missing.

I understand it's a somewhat vague question and I do apologize for the same. Any suggestions, please.

Regards
Katherine
Re: [R] Linear Interpolation : Missing rates
Dear Mr Adams,

Thanks a lot for your solution. I understand it was very tricky and needed a lot of application. Thanks again, I do appreciate your efforts.

Regards
Katherine

--- On Thu, 25/4/13, Adams, Jean wrote:
From: Adams, Jean
Subject: Re: [R] Linear Interpolation : Missing rates
To: "Katherine Gobin"
Cc: "R help"
Date: Thursday, 25 April, 2013, 2:23 PM

Katherine,

Split the rate names into their currency and tenor parts and assign a numeric value to each tenor. Choose a model to do your approximations (I used linear regression in the example below). Use this model to generate estimates for all combinations of currency and tenor. For example:

    # split the rate names into currency and tenor
    # (as.character() guards against rate_name having been stored as a factor)
    splitnames <- do.call(rbind, strsplit(as.character(df$rate_name), "_"))
    df$currency <- as.factor(splitnames[, 1])
    df$tenor <- splitnames[, 2]

    # assign numeric value to each tenor
    uniquetenors <- c("1w", "2w", "1m", "2m")
    uniquedays <- c(7, 14, 30.5, 61)
    df$tenordays <- uniquedays[match(df$tenor, uniquetenors)]

    # fit a linear model of rate on tenordays for each currency
    fit <- lm(rates ~ currency*tenordays, data=df)

    # estimate rates for all combinations of currency and tenor
    fulldf <- expand.grid(tenordays=unique(df$tenordays), currency=unique(df$currency))
    fulldf$est.rates = predict(fit, newdata=fulldf)

    # merge observed rates with estimated rates
    dfwithest <- merge(df, fulldf, all=TRUE)

Jean

On Thu, Apr 25, 2013 at 12:33 AM, Katherine Gobin wrote:
> As can be seen that USD_2w, GBP_2w and EURO_1m are missing and I need to
> INTERPOLATE these rates, which can be done using approx or approxfun.
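Jean's reply fits a regression line per currency. If strict piecewise-linear interpolation between the neighbouring observed tenors is wanted instead, as the question suggests with approx, a sketch could look like this. It rests on assumptions that are not in the thread: the day counts per tenor are the ones Jean used, and the k-th row of every rate_name is taken to belong to the same observation date, so interpolation is done separately for each currency and observation.

```r
df <- data.frame(
  rate_name = rep(c("USD_1w", "USD_1m", "USD_2m", "GBP_1w", "GBP_1m", "GBP_2m",
                    "EURO_1w", "EURO_2w", "EURO_2m"), each = 4),
  rates = c(2.05, 2.07, 2.06, 2.06, 2.22, 2.24, 2.23, 2.23, 2.31, 2.33, 2.33, 2.31,
            1.06, 1.08, 1.08, 1.08, 1.21, 1.21, 1.23, 1.21, 1.41, 1.39, 1.39, 1.37,
            1.82, 1.82, 1.81, 1.80, 1.98, 1.98, 1.97, 1.97, 2.1, 2.09, 2.09, 2.11),
  stringsAsFactors = FALSE
)

# Assumed day count for every tenor that should appear in the result
tenor_days <- c("1w" = 7, "2w" = 14, "1m" = 30.5, "2m" = 61)

nm          <- as.character(df$rate_name)
df$currency <- sub("_.*$", "", nm)
df$tenor    <- sub("^[^_]*_", "", nm)
df$days     <- unname(tenor_days[df$tenor])
# running observation number within each rate_name (assumed to align across tenors)
df$obs      <- ave(seq_along(nm), nm, FUN = seq_along)

# Interpolate one currency/observation slice onto the full tenor grid.
# rule = 2 carries the nearest observed rate outward if an end tenor is missing.
interp_one <- function(d) {
  est <- approx(x = d$days, y = d$rates, xout = tenor_days, rule = 2)
  data.frame(currency = d$currency[1], obs = d$obs[1],
             tenor = names(tenor_days), rates = est$y,
             stringsAsFactors = FALSE)
}

pieces <- split(df, list(df$currency, df$obs), drop = TRUE)
full   <- do.call(rbind, lapply(pieces, interp_one))
rownames(full) <- NULL
print(head(full, 8))
```

Observed tenors keep their original rates, and any run of consecutive missing tenors is filled from the nearest observed tenors on either side, so the code does not need to know in advance which tenors are absent.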
[R] Splitting data.frame and saving to csv files
Dear R Forum, I have a data.frame as df = data.frame(date = c("2013-04-15", "2013-04-14", "2013-04-13", "2013-04-12", "2013-04-11"), ABC_f = c(62.80739769,81.04525895,84.65712455,12.78237251,57.61345256), LMN_d = c(21.16794336,54.6580401,63.8923307,87.59880367,87.07693716), XYZ_p = c(55.8885464,94.1358684,84.0089114,98.99746696,64.71083712), LMN_a = c(56.6768395,25.81530198,40.12268441,35.74175237,47.95892209), ABC_e = c(11.36783959,62.29651784,47.63481552,32.27820673,52.12561419), LMN_c = c(45.4484695,17.72362438,36.7690054,68.58912931,35.80767235), XYZ_zz = c(85.74755089,63.48582415,81.61107212,58.1572924,27.44132817), PQR = c(71.22867519,95.09994812,83.62437819,30.18524735,25.81804865), ABC_d = c(38.71089816,93.48216193,93.14432203,78.2738731,31.87170019), ABC_m = c(40.28473769,43.97076327,47.38761559,97.33573412,22.06884976)) > df date ABC_f LMN_d XYZ_p LMN_a ABC_e 1 2013-04-15 62.80740 21.16794 55.88855 56.67684 11.36784 2 2013-04-14 81.04526 54.65804 94.13587 25.81530 62.29652 3 2013-04-13 84.65712 63.89233 84.00891 40.12268 47.63482 4 2013-04-12 12.78237 87.59880 98.99747 35.74175 32.27821 5 2013-04-11 57.61345 87.07694 64.71084 47.95892 52.12561 LMN_c XYZ_zz PQR ABC_d ABC_m 1 45.44847 85.74755 71.22868 38.71090 40.28474 2 17.72362 63.48582 95.09995 93.48216 43.97076 3 36.76901 81.61107 83.62438 93.14432 47.38762 4 68.58913 58.15729 30.18525 78.27387 97.33573 5 35.80767 27.44133 25.81805 31.87170 22.06885 I need to identify columns with same labels and along-with the dates in the first column, save the columns in different csv files. E.g. in the above data frame, I have 4 columns beginning with ABC so I need to save these four columns with the date in the first column as ABC.csv, then LMN_d, LMN_a, LMN_c in the LMN.csv file as date, LMN_a, LMN_c, LMN_d and so on. In my actual data.frame, I won't be aware how many such rates combinations are available. If there is no matching column as "PQR", the PQR.csv file should have only date and PQR column. 
Kindly guide how do I split the data.frame and save the respective csv files. Regards Katherine [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
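A minimal sketch of one way to do the split described above, using base R only. It assumes the group label is everything before the first underscore in a column name (so a column such as PQR forms its own group), and it writes to a temporary folder; `out_dir`, `value_cols` and `groups` are names made up for this illustration, and the data frame is a shortened stand-in for the one in the post.

```r
# Shortened example with the same column-naming pattern as the post
df <- data.frame(
  date  = c("2013-04-15", "2013-04-14"),
  ABC_f = c(62.8, 81.0),
  LMN_d = c(21.2, 54.7),
  PQR   = c(71.2, 95.1),
  ABC_e = c(11.4, 62.3),
  LMN_a = c(56.7, 25.8)
)

# Group label = text before the first underscore (whole name if there is none)
value_cols <- setdiff(names(df), "date")
prefix     <- sub("_.*$", "", value_cols)
groups     <- split(value_cols, prefix)

out_dir <- tempdir()  # hypothetical output folder; change as needed

for (p in names(groups)) {
  cols <- sort(groups[[p]])  # alphabetical order, e.g. LMN_a before LMN_d
  write.csv(df[, c("date", cols), drop = FALSE],
            file.path(out_dir, paste0(p, ".csv")),
            row.names = FALSE)
}
```

Because the groups are derived from the column names at run time, the number of label combinations does not need to be known in advance.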
[R] Adding elements in data.frame subsets and also subtracting an element from the rest elements in data.frame
Dear R forum I have a data.frame as cashflow_df = data.frame(instrument = c("ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC", "ABC", "PQR", "PQR", "PQR","PQR","PQR","PQR","PQR","PQR","PQR","PQR", "PQR", "PQR", "PQR","PQR", "PQR","PQR","PQR","PQR", "PQR","PQR","UVWXYZ","UVWXYZ", "UVWXYZ", "UVWXYZ", "UVWXYZ","UVWXYZ","UVWXYZ","UVWXYZ", "UVWXYZ", "UVWXYZ"), id = c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5, 1,1,2,2,3,3,4,4, 5,5), cashflow = c(5000,5000,505000,5000,5000,505000,5000,5000,505000, 5000,5000, 505000, 5000,5000,505000,500,500,500,102000,500,500,500,102000,500,500,500,102000,500,500,500,102000,500,500,500,102000,8000,808000,8000,808000,8000,808000,8000,808000,8000,808000), cashflows_pv = c(4931.054, 4479.1116, 431160.8529,4931.9604, 4485.6393, 432064.0228, 4932.5438,4489.8451,432646.2398,4932.1548,4487.0404,432257.9551,4932.6087,4490.3129,432711.0084,493.6326,474.0524,455.2489,82252.0304,493.8083,474.7543,456.4356,82744.9157,493.6003,473.9235,455.031,82161.7368,493.8175,474.7913,456.4982,82770.9849,493.8592,474.9581,456.7804,82888.4556,7451.3118,681810.5522,7462.0148,684153.4992,7441.1294,679585.9186,7426.6407,676427.7274,7427.1225,676532.6262)) # __ > cashflow_df instrument id cashflow cashflows_pv 1 ABC 1 5000 4931.0540 2 ABC 1 5000 4479.1116 3 ABC 1 505000 431160.8529 4 ABC 2 5000 4931.9604 5 ABC 2 5000 4485.6393 6 ABC 2 505000 432064.0228 7 ABC 3 5000 4932.5438 8 ABC 3 5000 4489.8451 9 ABC 3 505000 432646.2398 10 ABC 4 5000 4932.1548 11 ABC 4 5000 4487.0404 12 ABC 4 505000 432257.9551 13 ABC 5 5000 4932.6087 14 ABC 5 5000 4490.3129 15 ABC 5 505000 432711.0084 16 PQR 1 500 493.6326 17 PQR 1 500 474.0524 18 PQR 1 500 455.2489 19 PQR 1 102000 82252.0304 20 PQR 2 500 493.8083 21 PQR 2 500 474.7543 22 PQR 2 500 456.4356 23 PQR 2 102000 82744.9157 24 PQR 3 500 493.6003 25 PQR 3 500 473.9235 26 PQR 3 500 455.0310 27 PQR 3 102000 82161.7368 28 PQR 4 500 493.8175 29 PQR 4 500 474.7913 30 PQR 4 500 
456.4982 31 PQR 4 102000 82770.9849 32 PQR 5 500 493.8592 33 PQR 5 500 474.9581 34 PQR 5 500 456.7804 35 PQR 5 102000 82888.4556 36 UVWXYZ 1 8000 7451.3118 37 UVWXYZ 1 808000 681810.5522 38 UVWXYZ 2 8000 7462.0148 39 UVWXYZ 2 808000 684153.4992 40 UVWXYZ 3 8000 7441.1294 41 UVWXYZ 3 808000 679585.9186 42 UVWXYZ 4 8000 7426.6407 43 UVWXYZ 4 808000 676427.7274 44 UVWXYZ 5 8000 7427.1225 45 UVWXYZ 5 808000 676532.6262 # === # My PROBLEM For a given instrument and id, I need the totals of cashflow and cashflows_pv and also the difference of (total_cashflow_pv pertaining to the first ID for the given instrument from total_cashflow_pv for the same instrument) as shown in the fourth column of following output. output instrument id total_cashflow total_cashflow_pv 1 ABC 1 515000 440571.02 2 ABC 2 515000 441481.62 3 ABC 3 515000 442068.63 4 ABC 4 515000 441677.15 5 ABC 5 515000 442133.93 6 PQR 1 103500 83674.96 7 PQR 2 103500 84169.91 8 PQR 3 103500 83584.29 9 PQR 4 103500 84196.09 10 PQR 5 103500 84314.05 11 UVWXYZ 1 816000 689261.86 12 UVWXYZ 2 816000 691615.51 13 UVWXYZ 3 816000 687027.05 14 UVWXYZ 4 816000 683854.37 15 UVWXYZ 5 816000 683959.75 cashflow_change 1 0. # This is (440571.02 - 440571.02) 1st ID value - 1st ID value for ABC 2 910.6040 # This is (441481.62 - 440571.02) 2nd ID value - 1st ID value for ABC 3 1497.6102 # This is (442068.63 - 440571.02) 3rd ID value - 1st ID value for ABC 4 1106.1318 5 1562.9115 6 0. # This is (83674.96 - 83674.96) 1st ID value - 1st ID value for PQR 7 494.9496 8 -90.6727 9 521.1276 10 639.0890 11 0. 12 2353.6500 13 -2234.8160 14 -5407.4959 15 -5302.1153 # This is (683959.75 -689261.86 ) 5th ID value - 1st ID value for UVWXYZ Kindly guide Regards Ka
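One possible way to get the totals and the change relative to the first id, sketched with base R's `aggregate()` and `ave()`. The data below are a shortened, rounded stand-in for the post's `cashflow_df`, and the sketch assumes that "first id" means the smallest id within each instrument.

```r
cashflow_df <- data.frame(
  instrument   = rep(c("ABC", "PQR"), each = 4),
  id           = rep(c(1, 1, 2, 2), times = 2),
  cashflow     = c(5000, 505000, 5000, 505000, 500, 102000, 500, 102000),
  cashflows_pv = c(4931.05, 431160.85, 4931.96, 432064.02,
                   493.63, 82252.03, 493.81, 82744.92)
)

# Totals per instrument and id
totals <- aggregate(cbind(cashflow, cashflows_pv) ~ instrument + id,
                    data = cashflow_df, FUN = sum)
names(totals)[3:4] <- c("total_cashflow", "total_cashflow_pv")

# Sort so that the first row of every instrument is its first id
totals <- totals[order(totals$instrument, totals$id), ]

# Within each instrument, subtract the first id's total from every total
totals$cashflow_change <- ave(totals$total_cashflow_pv, totals$instrument,
                              FUN = function(x) x - x[1])
totals
```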
[R] Clean Price of Bond : Can't install "RQuantLib" in R version 3.0.0
Dear Forum, I have R version 3.0.0 installed and need to install the RQuantLib package. I tried to install it from a CRAN mirror but could not load it. I then saved the package in zip format and tried to install it locally, but I get the following error.

> utils:::menuInstallLocal()
Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) : cannot open the connection
In addition: Warning message:
In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) : cannot open compressed file 'RQuantLib_0.3.10(1)/DESCRIPTION', probable reason 'No such file or directory'

I need to install this package because I want to find out how the clean price of a bond is arrived at. Kindly guide. Regards Katherine
Re: [R] Clean Price of Bond : Can't install "RQuantLib" in R version 3.0.0
Dear Sir, Thanks a lot. The "1 in 'RQuantLib_0.3.10(1)", I understand was appearing because I had saved RQuantLib number of times in my local directory and each time it renamed the installation file with no added to it. Thanks again. Now I am able to install the package along-with Rcpp. Regards Katherine --- On Thu, 2/5/13, Prof Brian Ripley wrote: From: Prof Brian Ripley Subject: Re: [R] Clean Price of Bond : Can't install "RQuantLib" in R version 3.0.0 To: "Katherine Gobin" Cc: r-help@r-project.org Date: Thursday, 2 May, 2013, 1:23 PM On 02/05/2013 13:09, Katherine Gobin wrote: > Dear Forum, > > I have R version 3.0.0 installed and need to install RQuantLib pacakge. I > tried to install it from CRAN Mirror and I couldn't load it. I had saved the > package i zip format and tried to install it locally but I am getting > following error. What is the name of the file you downloaded? I am guessing it did not arrive with the name on the archive. You seem to be using Windows without saying so. The file for Windows is http://cran.r-project.org/bin/windows/contrib/r-release/RQuantLib_0.3.10.zip without (1) in the name. > >> utils:::menuInstallLocal() > Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) : > cannot open the connection > In addition: Warning message: > In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) : > cannot open compressed file 'RQuantLib_0.3.10(1)/DESCRIPTION', probable >reason 'No such file or directory' > > I need to install this package as I need to find out how the clean price of > bond is arrived at? > > Kindly > guide > > Regards > > Katherine > > [[alternative HTML version deleted]] > > > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Brian D. 
Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Finding Beta
Dear R forum

I have a dataframe (of prices) as given below -

dat = data.frame(company = rep(c("A", "B", "C", "D", "index"), each = 5), prices = c(runif(5, 10, 12), runif(5, 108, 112), runif(5, 500, 510), runif(5, 40, 50), runif(5, 1000, 1020)))

   company     prices
1        A   10.61727
2        A   10.51892
3        A   11.80495
4        A   11.15243
5        A   10.77543
6        B  111.23817
7        B  109.19825
8        B  108.80053
9        B  110.79876
10       B  108.84385
11       C  504.71801
12       C  504.11778
13       C  502.89416
14       C  500.65996
15       C  502.26748
16       D   42.35901
17       D   43.71947
18       D   46.46092
19       D   43.62220
20       D   48.47480
21   index 1017.24476
22   index 1002.88139
23   index 1005.16148
24   index 1014.54480
25   index 1014.12103

I need to find the beta of A, B, C and D w.r.t. index. Beta between two variables X and Y (where Y is dependent) is given by

beta = coef(lm(Y ~ X))[2]

Any guidance is appreciated.

With regards
Katherine
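A minimal sketch of how the betas could be computed with the definition given in the post, beta = coef(lm(Y ~ X))[2]. It assumes the rows of every company are in the same time order as the rows of "index"; the prices are fixed made-up numbers (the post uses runif, which is not reproducible), and `by_company`, `stocks` and `betas` are illustrative names. In practice beta is often estimated on returns rather than price levels; the sketch follows the post's definition.

```r
dat <- data.frame(
  company = rep(c("A", "B", "index"), each = 5),
  prices  = c(10.6, 10.5, 11.8, 11.2, 10.8,
              111.2, 109.2, 108.8, 110.8, 108.8,
              1017.2, 1002.9, 1005.2, 1014.5, 1014.1)
)

# One price vector per company
by_company <- split(dat$prices, dat$company)
index      <- by_company[["index"]]
stocks     <- by_company[setdiff(names(by_company), "index")]

# Slope of the regression of each stock on the index
betas <- sapply(stocks, function(y) unname(coef(lm(y ~ index))[2]))
betas
```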
[R] Removing "NA" from matrix
Dear R forum,

I have a data frame

dat = data.frame( ABC = c(25.28000732,48.33857234,19.8013245,10.68361461), DEF = c(14.02722251,10.57985168,11.81890316,21.40171514), GHI = c(1,1,1,1), JKL = c(45.96423231,44.52986236,16.56514176,32.14545122), MNO = c(45.38438063,15.54338206,18.78444777,24.29486984))

> dat
       ABC      DEF GHI      JKL      MNO
1 25.28001 14.02722   1 45.96423 45.38438
2 48.33857 10.57985   1 44.52986 15.54338
3 19.80132 11.81890   1 16.56514 18.78445
4 10.68361 21.40172   1 32.14545 24.29487

When I try to find the correlation I get the following (which is obvious, as one of my columns shows no variation)

dat_cor = cor(dat)
Warning message:
In cor(dat) : the standard deviation is zero

> dat_cor
           ABC         DEF GHI         JKL        MNO
ABC  1.0000000 -0.75600764  NA  0.55245223 -0.2735585
DEF -0.7560076  1.00000000  NA -0.06479082  0.2020781
GHI         NA          NA   1          NA         NA
JKL  0.5524522 -0.06479082  NA  1.00000000  0.4564568
MNO -0.2735585  0.20207810  NA  0.45645683  1.0000000

In reality I am dealing with about 300 variables and don't know which variables don't vary. My query is how do I remove the columns and rows with NAs. So for example, I need the correlation matrix for ABC, DEF, JKL and MNO only.

Kindly guide. Thanking in advance.

Regards
Katherine
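Two possible ways to end up with a correlation matrix free of the constant columns, sketched in base R. The data are a shortened, rounded stand-in for the post's `dat`; the sketch assumes there are no missing values in the data themselves, so that NA in the correlation matrix only arises from zero-variance columns.

```r
dat <- data.frame(
  ABC = c(25.28, 48.34, 19.80, 10.68),
  DEF = c(14.03, 10.58, 11.82, 21.40),
  GHI = c(1, 1, 1, 1),
  JKL = c(45.96, 44.53, 16.57, 32.15)
)

# Option 1: drop the constant columns before computing the correlations
varies  <- sapply(dat, function(x) sd(x) > 0)
dat_cor <- cor(dat[, varies, drop = FALSE])

# Option 2: compute the full matrix, then drop every row/column that has
# no non-NA entry apart from (possibly) its own diagonal element
full     <- suppressWarnings(cor(dat))
keep     <- rowSums(!is.na(full)) > 1
dat_cor2 <- full[keep, keep]
```

Option 1 also tells you which variables do not vary (`names(dat)[!varies]`), which may be useful with around 300 columns.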
[R] Choosing subset of data.frame
Dear R Forum

I have a data frame as

beta_results = data.frame(instrument = c("ABC", "DEF", "JKL", "LMN", "PQR", "STU", "UVW", "XYZ"), beta_values = c(1.27, -0.22, 0.529, 0.011, 2.31, -1.08, -2.7, 0.42))

> beta_results
  instrument beta_values
1        ABC       1.270
2        DEF      -0.220
3        JKL       0.529
4        LMN       0.011
5        PQR       2.310
6        STU      -1.080
7        UVW      -2.700
8        XYZ       0.420

Through some other process, I am getting instrument names as, say (these may change each time I run the process and hence I can't hard code them),

instru = c("JKL", "STU", "XYZ")

Now I want the subset of beta_results (say beta_results_A) pertaining to only instru, i.e.

beta_results_A =
  instrument beta_values
3        JKL       0.529
6        STU      -1.080
8        XYZ       0.420

I did try

beta_results_A = beta_results[instru]

or

beta_results_A = subset(beta_results, beta_results$instrument = instru]

but I guess it's failing.

Kindly guide

Regards
Katherine
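A short sketch of the row selection asked for above. `beta_results[instru]` looks for columns named "JKL", "STU" and "XYZ", and the `subset()` attempt mixes `=` with a stray `]`; matching the instrument column against the vector with `%in%` is one common way to pick the rows instead.

```r
beta_results <- data.frame(
  instrument  = c("ABC", "DEF", "JKL", "LMN", "PQR", "STU", "UVW", "XYZ"),
  beta_values = c(1.27, -0.22, 0.529, 0.011, 2.31, -1.08, -2.7, 0.42)
)
instru <- c("JKL", "STU", "XYZ")

# Keep the rows whose instrument appears in instru (note the trailing comma:
# rows are selected, all columns are kept)
beta_results_A <- beta_results[beta_results$instrument %in% instru, ]

# Equivalent with subset()
beta_results_B <- subset(beta_results, instrument %in% instru)
```

The rows come back in the order they have in `beta_results`, not in the order of `instru`.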
[R] How to store interim print results
Dear R forum, Following is an customized extract of a code I am working on. settlement = as.Date("2013-11-25") maturity = as.Date("2015-10-01") coupon = 0.066 yield = 0.1040 basis = 1 frequency = 2 redemption = 100 # __ add.months = function(date, n) { nC <- seq(date, by=paste (n, "months"), length = 2)[2] fD <- as.Date(strftime(as.Date(date), format='%Y-%m-01')) C <- (seq(fD, by=paste (n+1, "months"), length = 2)[2])-1 if(nC>C) return(C) return(nC) } date.diff = function(end, start, basis=1) { if (basis != 0 && basis != 4) return(as.numeric(end - start)) e <- as.POSIXlt(end) s <- as.POSIXlt(start) d <- (360 * (e$year - s$year)) + (30 * (e$mon - s$mon )) + (min(30, e$mday) - min(30, s$mday)) return (d) } cashflows <- 0 last.coupon <- maturity while (last.coupon > settlement) { print(last.coupon) # I need to store these dates last.coupon <- add.months(last.coupon, -12/frequency) cashflows <- cashflows + 1 print(cashflows) # I need to store these cashflow numbers } The print command causes the following output [1] "2015-10-01" [1] 1 [1] "2015-04-01" [1] 2 [1] "2014-10-01" [1] 3 [1] "2014-04-01" [1] 4 My problem is how do I store these print outputs or while the loop is getting executed, how do I save these to some data.frame say output_dat cashflow_tenure cashflow_nos 1 2015-10-01 1 2 2015-04-01 2 3 2014-10-01 3 4 2014-04-01 4 Kindly advise With regards Katherine [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to store interim print results
Dear Jean Thanks a lot for this solution. Its very useful. I did only one small change and defined cashflow.tenure <- numeric(0) instead of character(0). This helps me in further numerical calculations using these dates like finding the difference between two dates etc. Thanks again, Regards Katherine On Thursday, 3 April 2014 6:58 PM, "Adams, Jean" wrote: Katherine, One easy way to do this for small data is by using the append() function (see code below). But, if you have a lot of data, it may be too slow for you. In that case, you can gain some efficiency if you determine in advance how long the vectors will be, then use indexing to fill in the vectors without using the append() function. Or, rewrite the code to be vectorized instead of using a while() loop. cashflows <- 0 last.coupon <- maturity # create "empty" vectors cashflow.tenure <- character(0) cashflow.nos <- numeric(0) while (last.coupon > settlement) { print(last.coupon) # store the dates cashflow.tenure <- append(cashflow.tenure, last.coupon) last.coupon <- add.months(last.coupon, -12/frequency) cashflows <- cashflows + 1 print(cashflows) # store the cashflow numbers cashflow.nos <- append(cashflow.nos, cashflows) } output.dat <- data.frame(cashflow.tenure, cashflow.nos) output.dat Jean On Thu, Apr 3, 2014 at 5:22 AM, Katherine Gobin wrote: Dear R forum, > >Following is an customized extract of a code I am working on. 
> >settlement = as.Date("2013-11-25") >maturity = as.Date("2015-10-01") >coupon = 0.066 >yield = 0.1040 >basis = 1 >frequency = 2 >redemption = 100 > ># __ > >add.months = function(date, n) >{ > nC <- seq(date, by=paste (n, "months"), length = 2)[2] > fD <- as.Date(strftime(as.Date(date), format='%Y-%m-01')) > C <- (seq(fD, by=paste (n+1, "months"), length = 2)[2])-1 > if(nC>C) return(C) > return(nC) >} > >date.diff = function(end, start, basis=1) { > if (basis != 0 && basis != 4) > return(as.numeric(end - start)) > e <- as.POSIXlt(end) > s <- as.POSIXlt(start) > d <- (360 * (e$year - s$year)) + (30 * (e$mon - s$mon )) + (min(30, >e$mday) - min(30, s$mday)) > > return (d) >} > > cashflows <- 0 > last.coupon <- maturity > while (last.coupon > settlement) { > print(last.coupon) # I need to store these dates > last.coupon <- add.months(last.coupon, -12/frequency) > cashflows <- cashflows + 1 >print(cashflows) # I need to store these cashflow numbers > } > >The print command causes the following output > >[1] "2015-10-01" >[1] 1 >[1] "2015-04-01" >[1] 2 >[1] "2014-10-01" >[1] 3 >[1] "2014-04-01" >[1] 4 > >My problem is how do I store these print outputs or while the loop is getting >executed, how do I save these to some data.frame say > >output_dat > >cashflow_tenure cashflow_nos > >1 2015-10-01 1 >2 2015-04-01 2 >3 2014-10-01 3 >4 2014-04-01 4 > >Kindly advise > >With regards > >Katherine > [[alternative HTML version deleted]] > > >__ >R-help@r-project.org mailing list >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
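The reply above mentions rewriting the loop in vectorized form. A minimal sketch of that idea follows, using `seq()` on dates so that the tenure column stays of class Date (which also allows date arithmetic later). It assumes the maturity's day of month exists in every coupon month; the `add.months()` helper in the post clips month-end dates, which plain `seq()` does not do.

```r
settlement <- as.Date("2013-11-25")
maturity   <- as.Date("2015-10-01")
frequency  <- 2

# Step backwards from maturity in coupon periods, e.g. "-6 months"
step         <- paste(-12 / frequency, "months")
coupon_dates <- seq(maturity, settlement, by = step)

# The loop in the post keeps only dates strictly after settlement
coupon_dates <- coupon_dates[coupon_dates > settlement]

output_dat <- data.frame(
  cashflow_tenure = coupon_dates,
  cashflow_nos    = seq_along(coupon_dates)
)
output_dat
```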
Re: [R] How to store interim print results
Dear Sir, Thanks a lot for your guidance and efforts. Appreciate it. Thanks again. Katherine

On Thursday, 3 April 2014 6:55 PM, jim holtman wrote:

This will get you close:

> settlement = as.Date("2013-11-25")
> maturity = as.Date("2015-10-01")
> coupon = 0.066
> yield = 0.1040
> basis = 1
> frequency = 2
> redemption = 100
>
> add.months = function(date, n)
+ {
+ nC <- seq(date, by=paste (n, "months"), length = 2)[2]
+ fD <- as.Date(strftime(as.Date(date), format='%Y-%m-01'))
+ C <- (seq(fD, by=paste (n+1, "months"), length = 2)[2])-1
+ if(nC>C) return(C)
+ return(nC)
+ }
>
> date.diff = function(end, start, basis=1) {
+ if (basis != 0 && basis != 4)
+ return(as.numeric(end - start))
+ e <- as.POSIXlt(end)
+ s <- as.POSIXlt(start)
+ d <- (360 * (e$year - s$year)) + (30 * (e$mon - s$mon )) + (min(30, e$mday) - min(30, s$mday))
+
+ return (d)
+ }
>
> output <- capture.output({ # collect the print output
+ cashflows <- 0
+ last.coupon <- maturity
+ while (last.coupon > settlement) {
+ print(last.coupon) # I need to store these dates
+ last.coupon <- add.months(last.coupon, -12/frequency)
+ cashflows <- cashflows + 1
+ print(cashflows) # I need to store these cashflow numbers
+ }
+
+ })
>
> # remove the "[1] " line prefixes
> output <- sub("^\\[1\\] ", "", output)
>
> # remove extra quotes
> output <- gsub('"', '', output)
>
> # now read in the data
> report <- matrix(output, ncol = 2, byrow = TRUE)
>
> report
     [,1]         [,2]
[1,] "2015-10-01" "1"
[2,] "2015-04-01" "2"
[3,] "2014-10-01" "3"
[4,] "2014-04-01" "4"
>

Jim Holtman Data Munger Guru
What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.

On Thu, Apr 3, 2014 at 6:22 AM, Katherine Gobin wrote: Dear R forum, > >Following is an customized extract of a code I am working on. 
> >settlement = as.Date("2013-11-25") >maturity = as.Date("2015-10-01") >coupon = 0.066 >yield = 0.1040 >basis = 1 >frequency = 2 >redemption = 100 > ># __ > >add.months = function(date, n) >{ > nC <- seq(date, by=paste (n, "months"), length = 2)[2] > fD <- as.Date(strftime(as.Date(date), format='%Y-%m-01')) > C <- (seq(fD, by=paste (n+1, "months"), length = 2)[2])-1 > if(nC>C) return(C) > return(nC) >} > >date.diff = function(end, start, basis=1) { > if (basis != 0 && basis != 4) > return(as.numeric(end - start)) > e <- as.POSIXlt(end) > s <- as.POSIXlt(start) > d <- (360 * (e$year - s$year)) + (30 * (e$mon - s$mon )) + (min(30, >e$mday) - min(30, s$mday)) > > return (d) >} > > cashflows <- 0 > last.coupon <- maturity > while (last.coupon > settlement) { > print(last.coupon) # I need to store these dates > last.coupon <- add.months(last.coupon, -12/frequency) > cashflows <- cashflows + 1 >print(cashflows) # I need to store these cashflow numbers > } > >The print command causes the following output > >[1] "2015-10-01" >[1] 1 >[1] "2015-04-01" >[1] 2 >[1] "2014-10-01" >[1] 3 >[1] "2014-04-01" >[1] 4 > >My problem is how do I store these print outputs or while the loop is getting >executed, how do I save these to some data.frame say > >output_dat > >cashflow_tenure cashflow_nos > >1 2015-10-01 1 >2 2015-04-01 2 >3 2014-10-01 3 >4 2014-04-01 4 > >Kindly advise > >With regards > >Katherine > [[alternative HTML version deleted]] > > >__ >R-help@r-project.org mailing list >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Conditional subtraction
Dear R forum I have following data.frame dat = data.frame(key = c("A", "B", "C", "D", "E", "E"), id = c("instru_A", "instru_B", "instru_B", "instru_B", "instru_C", "instru_C"), price = c(101.38, 3.9306, 3.7488, 92.9624, 5.15, 96.1908), adj_factor = c(2.08, 2.5217, 2.5217, 2.5217, 3.08, 3.08)) > dat key id price adj_factor 1 A instru_A 101.3800 2.0800 2 B instru_B 3.9306 2.5217 3 C instru_B 3.7488 2.5217 4 D instru_B 92.9624 2.5217 5 E instru_C 5.1500 3.0800 6 E instru_C 96.1908 3.0800 This is just a part of big database and ids can appear any no of times. # MY PROBLEM I need to subtract adj_factor from the price, however only from the first id only. In case of instru_A, there is only 1 id, so 2.08 should be subtracted from 101.38. The id "instru_B" is appearing 3 times. So in this case, adj_factor = 2.5217 should be subtracted from 3.9306 and rest should remain same. Similarly, id "instru_C" is appearing 2 times, hence the adj_factor = 3.08 should be subtracted from 5.15. Effectively I am looking for > dat_new key id price adj_factor adjusted_price 1 A instru_A 101.3800 2.0800 99.3000 # price adjusted 2 B instru_B 3.9306 2.5217 1.4089 # price adjusted 3 C instru_B 3.7488 2.5217 3.7488 4 D instru_B 92.9624 2.5217 92.9624 5 E instru_C 5.1500 3.0800 2.0700 # price adjusted 6 E instru_C 96.1908 3.0800 96.1908 I tried something like adj_price = function(id, price, adj_factor) { id_length = length(id) if(id_length == 1) { (adjusted_price = price-adj_factor) } if(id_length == 2) { (adjusted_price = c(price[1]-adj_factor[1], price[2])) } if(id_length > 2) { (adjusted_price = c(price[1]-adj_factor[1],price[2:id_length])) } return(adjusted_price) } (final_price = adj_price(dat$id, dat$price, dat$adj_factor)) > (final_price = adj_price(dat$id, dat$price, dat$adj_factor)) [1] 99.3000 3.9306 3.7488 92.9624 5.1500 96.1908 Kindly advise Regards Katherine [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE 
do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
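A minimal sketch of the conditional subtraction using `duplicated()`, which flags every occurrence of an id after the first. The attempt in the post works on the whole column at once, so only the very first row is adjusted; here the adjustment is applied to the first row of each id. It assumes "first" means first in the current row order of `dat`.

```r
dat <- data.frame(
  key        = c("A", "B", "C", "D", "E", "E"),
  id         = c("instru_A", "instru_B", "instru_B", "instru_B",
                 "instru_C", "instru_C"),
  price      = c(101.38, 3.9306, 3.7488, 92.9624, 5.15, 96.1908),
  adj_factor = c(2.08, 2.5217, 2.5217, 2.5217, 3.08, 3.08)
)

# TRUE only at the first occurrence of each id
first_row <- !duplicated(dat$id)

# Subtract adj_factor on those rows, leave the other prices unchanged
dat$adjusted_price <- dat$price - ifelse(first_row, dat$adj_factor, 0)
dat
```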
[R] R equivalent functions of some EXCEL functions
Dear R forum, EXCEL has some standard functions, e.g.
(1) PRICE : Returns the price per $100 face value of a security that pays periodic interest.
(2) COUPDAYBS : Returns the number of days from the beginning of the coupon period to the settlement date.
(3) COUPDAYS : Returns the number of days in the coupon period that contains the settlement date.
(4) COUPDAYSNC : Returns the number of days from the settlement date to the next coupon date.
Kindly guide if R has some inbuilt functions giving the same results as the Excel functions mentioned above. With regards Katherine
[R] Splitting vector elements
Dear R forum

I have a vector as

dat = c("ABC 1", "ABC 2", "ABC 3", "DEF 10", "DEF 20")

> dat
[1] "ABC 1"  "ABC 2"  "ABC 3"  "DEF 10" "DEF 20"

I need to split the names into two parts, say

p1  p2
ABC  1
ABC  2
ABC  3
DEF 10
DEF 20

Kindly guide
Katherine
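One way this could be done with base R's `strsplit()`, sketched under the assumption that every element contains exactly one space separating the two parts. `parts` and `result` are illustrative names.

```r
dat <- c("ABC 1", "ABC 2", "ABC 3", "DEF 10", "DEF 20")

# Split each element at the space and stack the pieces row by row
parts <- do.call(rbind, strsplit(dat, " ", fixed = TRUE))

result <- data.frame(p1 = parts[, 1],
                     p2 = as.numeric(parts[, 2]),
                     stringsAsFactors = FALSE)
result
```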
[R] Paper on Analytics using R
Dear R Forum, I am looking for a write-up or paper on the use of R for analytics, or on why R should be preferred over other tools for analytics. I tried Google but only found information about commercial vendors using R for analytics. I am looking for a paper with no commercial flavor, i.e. one that deals strictly with R and does not talk about some product using R for analytics. Kindly share if you are aware of such write-ups or papers. Regards Katherine
[R] Reversing the Equation to find value of variable
Dear R forum

I have the following variables -

EAD = 10000
LGD = 0.45
PD = 0.47
M = 3

# Equation 1

R = 0.12*(1-exp(-50*PD))/(1-exp(-50)) + 0.24*(1-(1-exp(-50*PD))/(1-exp(-50)))

b = (0.11852 - 0.05478 * log(PD))^2

K = (LGD * pnorm((1 - R)^(-0.5) * qnorm(PD) + (R / (1 - R))^0.5 * qnorm(0.999)) - PD * LGD) * (1 - 1.5 * b)^(-1) * (1 + (M - 2.5) * b)

RWA = K * 12.5 * EAD

> RWA
[1] 22845.07

# MY Problem

In the above part, knowing the values of LGD, EAD, M and PD, the value of RWA was calculated. However, I need to go the reverse way, in the sense that knowing the values of LGD, EAD, M and RWA, I need to find the value of PD.

So I have tried to use uniroot on RWA - K * 12.5 * EAD, with the above equations substituted in place of K, R and b:

RWA = 22845.07
LGD = 0.45
EAD = 10000
M = 3

f = function(x) RWA - (LGD*pnorm((1-(0.12*(1-exp(-50*x))/(1-exp(-50))+0.24*(1-(1-exp(-50*x))/(1-exp(-50)))))^(-0.5)*qnorm(x)+((0.12*(1-exp(-50*x))/(1-exp(-50))+0.24*(1-(1-exp(-50*x))/(1-exp(-50))))/(1-(0.12*(1-exp(-50*x))/(1-exp(-50))+0.24*(1-(1-exp(-50*x))/(1-exp(-50))))))^0.5*qnorm(0.999))-x*LGD) * (1-1.5*((0.11852-0.05478 * log(x))^2))^(-1)*(1+(M-2.5)*((0.11852-0.05478 * log(x))^2))*12.5*EAD

uniroot(f, c(0,1), tol = 0.01)

I get the following error -

> uniroot(f, c(0,1), tol = 0.01)
Error in uniroot(f, c(0, 1), tol = 1e-10) : f.lower = f(lower) is NA

Kindly guide as I am not sure if uniroot is the correct way of doing it or not. Ideally, I should be getting the PD value of 0.47.

With regards
Katherine
Re: [R] Reversing the Equation to find value of variable
Dear Sir, Thanks a lot for your wonderful guidance. It gave me a new way to look at the equations. I really appreciate it. Thanks a lot once again. Katherine

On Monday, 6 January 2014 5:31 PM, Frede Aakmann Tøgersen wrote:

Hi

Reading the error message carefully you can see that f() is not defined at 0:

> uniroot(f, c(0, 1))
Error in uniroot(f, c(0, 1)) : f.lower = f(lower) is NA
> f(0)
[1] NaN

If you plot f() in the interval (0,1) then you'll see there are two solutions:

> uniroot(f, c(0.0001, 1))
Error in uniroot(f, c(1e-04, 1)) : f() values at end points not of opposite sign
> uniroot(f, c(0.0001, 0.2))
$root
[1] 0.1533901
$f.root
[1] 0.3414232
$iter
[1] 6
$estim.prec
[1] 6.103516e-05
> uniroot(f, c(0.3, 1))
$root
[1] 0.4699984
$f.root
[1] -0.04112121
$iter
[1] 8
$estim.prec
[1] 6.103516e-05
>

Yours sincerely / Med venlig hilsen Frede Aakmann Tøgersen Specialist, M.Sc., Ph.D. Plant Performance & Modeling Technology & Service Solutions T +45 9730 5135 M +45 2547 6050 fr...@vestas.com http://www.vestas.com Company reg. name: Vestas Wind Systems A/S This e-mail is subject to our e-mail disclaimer statement. Please refer to www.vestas.com/legal/notice If you have received this e-mail in error please contact the sender. > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] > On Behalf Of Katherine Gobin > Sent: 6. 
januar 2014 12:42 > To: r-help@r-project.org > Subject: [R] Reversing the Equation to find value of variable > > Dear R forum > > I have following variables - > > EAD = 1 > LGD = 0.45 > PD = 0.47 > M = 3 > > # Equation 1 > > R = 0.12*(1-exp(-50*PD))/(1-exp(-50)) + 0.24*(1-(1-exp(-50*PD))/(1-exp(- > 50))) > > b = (0.11852 - 0.05478 * log(PD))^2 > > K = (LGD * pnorm((1 - R)^(-0.5) * qnorm(PD) + (R / (1 - R))^0.5 * > qnorm(0.999)) - PD * LGD) * (1 - 1.5 * b)^(-1) * (1 + (M - 2.5) * b) > > RWA = K * 12.5 * EAD > > > > RWA > [1] 22845.07 > > # > __ > ___ > > # MY Problem > > In the above part, knowing values of LGD, EAD, M and PD, the value of RWA > was calculated. However, I need to go reverse way in the sense knowing the > values of LGD, EAD, M and RWA, I need to find value of PD. > > So I have tried to use uniroot as (RWA - K * 12.5 * EAD and used the above > equations i place of K and R) > > RWA = 22845.07 > LGD = 0.45 > EAD = 1 > M = 3 > > f = function(x) RWA - (LGD*pnorm((1-(0.12*(1-exp(-50*x))/(1-exp(- > 50))+0.24*(1-(1-exp(-50*x))/(1-exp(-50)^(-0.5)*qnorm(x)+((0.12*(1- > exp(-50*x))/(1-exp(-50))+0.24*(1-(1-exp(-50*x))/(1-exp(-50/(1-(0.12*(1- > exp(-50*x))/(1-exp(-50))+0.24*(1-(1-exp(-50*x))/(1-exp(- > 50))^0.5*qnorm(0.999))-x*LGD) * (1-1.5*((0.11852-0.05478 * > log(x))^2))^(-1)*(1+(M-2.5)*((0.11852-0.05478 * log(x))^2))*12.5*EAD > > uniroot(f, c(0,1), tol = 0.01) > > I get following error - > > > uniroot(f, c(0,1), tol = 0.01) > Error in uniroot(f, c(0, 1), tol = 1e-10) : f.lower = f(lower) is NA > > Kindly guide as I am not sure if uniroot is the correct way of doing it or > not. > Ideally, I should be getting the PD value of 0.47. > > With regards > > Katherine > [[alternative HTML version deleted]] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
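The inversion discussed in this thread can be sketched with the formulas wrapped in a small function instead of one long expression, which makes the parentheses easier to check. EAD is taken as 10000 here, the value for which the post's formulas reproduce the quoted RWA of 22845.07 at PD = 0.47; treat that as an assumption. `rwa()`, `target_rwa` and `sol` are illustrative names. The search interval (0.3, 1) is the one used in the reply to isolate the root near 0.47; the other root near 0.15 would need a separate interval such as (0.0001, 0.2).

```r
LGD <- 0.45
EAD <- 10000          # assumed; see the note above
M   <- 3
target_rwa <- 22845.07

# RWA as a function of PD, following the equations in the post
rwa <- function(PD) {
  w <- (1 - exp(-50 * PD)) / (1 - exp(-50))
  R <- 0.12 * w + 0.24 * (1 - w)
  b <- (0.11852 - 0.05478 * log(PD))^2
  K <- (LGD * pnorm((1 - R)^(-0.5) * qnorm(PD) +
                      (R / (1 - R))^0.5 * qnorm(0.999)) - PD * LGD) *
    (1 - 1.5 * b)^(-1) * (1 + (M - 2.5) * b)
  K * 12.5 * EAD
}

# Root of rwa(PD) - target; PD = 0 is excluded because log(0) is not finite
f   <- function(PD) rwa(PD) - target_rwa
sol <- uniroot(f, c(0.3, 1), tol = 1e-8)
sol$root
```

Plotting `f` over (0, 1) first, for instance with `curve(f, 0.001, 0.999)`, helps to choose intervals whose end points have opposite signs.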
[R] Running the Loop
Dear R forum,

I have the following data.frames

dat = data.frame(id = c(1:3), root = c(0.10, 0.20, 0.74), maturity_period = c(20, 155, 428), mtm = c(1000, 10000, 100000), curve = c("USD", "USD", "USD"))

> dat
  id root maturity_period   mtm curve
1  1 0.10              20 1e+03   USD
2  2 0.20             155 1e+04   USD
3  3 0.74             428 1e+05   USD

standard_tenors = data.frame(T = c("1m", "3m", "6m", "12m", "5yr"), D = c(30, 91, 182, 365, 1825))

> standard_tenors
    T    D
1  1m   30
2  3m   91
3  6m  182
4 12m  365
5 5yr 1825

library(plyr)

T = standard_tenors$T
D = standard_tenors$D
n = length(standard_tenors$T)

mtm_split_function = function(maturity_period, curve, root, mtm)
{
for(i in 1:(n-1))
{
if (maturity_period < D[i])
{
N1 = paste(curve, T[i], sep ="_")
N2 = paste(curve, T[i], sep ="_")
PV1 = mtm
PV2 = 0
}else if (maturity_period > D[i] & maturity_period < D[i+1])
{
N1 = paste(curve, T[i], sep ="_")
N2 = paste(curve, T[1+1], sep ="_")
PV1 = (mtm)*root
PV2 = (mtm)*(1-root)
}else if (maturity_period > D[i+1])
{
N1 = paste(curve, T[i], sep ="_")
N2 = paste(curve, T[i], sep ="_")
PV1 = 0
PV2 = mtm
}
}
return(data.frame(Risk_factor1 = N1, Risk_factor2 = N2, Risk_factor1_mtm = PV1, Risk_factor2_mtm = PV2))
}

splitted_mtm <- ddply(.data = dat, .variables = "id", .fun=function(x) mtm_split_function(maturity_period = x$maturity_period, curve = x$curve, root = x$root, mtm = x$mtm))

# OUTPUT I am getting

  id Risk_factor1 Risk_factor2 Risk_factor1_mtm Risk_factor2_mtm
1  1      USD_12m      USD_12m             1000                0
2  2      USD_12m      USD_12m            10000                0
3  3      USD_12m       USD_3m            74000            26000

# My PROBLEM

However, my OUTPUT should be

  id Risk_factor1 Risk_factor2 Risk_factor1_mtm Risk_factor2_mtm
1  1       USD_1m       USD_1m             1000                0
2  2       USD_3m       USD_6m             2000             8000
3  3      USD_12m      USD_5yr            74000            26000

Kindly guide

With warm regards
Katherine
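A sketch of one way to get the expected output without looping over the tenors. In the posted function the loop never stops at the matching tenor, so the values from the last iteration are what gets returned, and `T[1+1]` always picks the second tenor. Here `findInterval()` locates the bracketing tenors directly. The mtm values 1000, 10000 and 100000 are taken from the printed data frame (1e+03, 1e+04, 1e+05) and the expected output; a maturity exactly equal to a tenor is treated as the lower end of the next bracket, which is an assumption since the post does not cover that case. `split_mtm()` is an illustrative name, and base R's `Map()` is used in place of plyr.

```r
standard_tenors <- data.frame(
  T = c("1m", "3m", "6m", "12m", "5yr"),
  D = c(30, 91, 182, 365, 1825),
  stringsAsFactors = FALSE
)
dat <- data.frame(
  id = 1:3,
  root = c(0.10, 0.20, 0.74),
  maturity_period = c(20, 155, 428),
  mtm = c(1000, 10000, 100000),
  curve = "USD",
  stringsAsFactors = FALSE
)

split_mtm <- function(maturity_period, curve, root, mtm,
                      tenors = standard_tenors) {
  n <- nrow(tenors)
  i <- findInterval(maturity_period, tenors$D)  # tenors with D <= maturity
  if (i == 0) {            # shorter than the first tenor
    lo <- 1; hi <- 1; pv <- c(mtm, 0)
  } else if (i == n) {     # at or beyond the last tenor
    lo <- n; hi <- n; pv <- c(0, mtm)
  } else {                 # between tenor i and tenor i + 1: split by root
    lo <- i; hi <- i + 1; pv <- c(mtm * root, mtm * (1 - root))
  }
  data.frame(Risk_factor1     = paste(curve, tenors$T[lo], sep = "_"),
             Risk_factor2     = paste(curve, tenors$T[hi], sep = "_"),
             Risk_factor1_mtm = pv[1],
             Risk_factor2_mtm = pv[2],
             stringsAsFactors = FALSE)
}

# Apply row by row and bind the one-row results together
rows <- Map(split_mtm, dat$maturity_period, dat$curve, dat$root, dat$mtm)
splitted_mtm <- cbind(id = dat$id, do.call(rbind, rows))
splitted_mtm
```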
[R] How to write an error to output
Dear R forum,

The example below is only indicative; I have constructed it myself. My real-life data and conditions are different.

I have a data.frame as given below

mydat = data.frame(A = c(19, 20, 19, 19, 19, 18, 16, 18, 19, 20), B = c(19, 20, 20, 19, 20, 18, 19, 18, 17, 16))

if (length(mydat$A) > 10)
{
stop("A has length more than 10")
}else
if (max(mydat$B) > 18)
{
stop("max B exceeds limit")
}else
{result = mydat$A + mydat$B
if (length(result) > 0)
{
write.csv(data.frame(result = result), 'result.csv', row.names = FALSE)
}
}

When I execute the above code, I get the message

Error: max B exceeds limit

If all conditions are met, I obviously get the output as result.csv. If result.csv is generated, I am able to capture it and show the output in the front end. However, if the process cannot be run owing to the violation of a condition, an error is produced. How do I capture this error (and express it as a csv file) so that I can show it as a comment in the front end?

Kindly guide.

Katherine
Re: [R] How to write an error to output
Dear sir,

Thanks a lot for your wonderful suggestion.

Regards

Katherine

On Wednesday, 16 October 2013 5:28 PM, jim holtman wrote:

Will this work for you:

mydat = data.frame(A = c(19, 20, 19, 19, 19, 18, 16, 18, 19, 20),
                   B = c(19, 20, 20, 19, 20, 18, 19, 18, 17, 16))

if (length(mydat$A) > 10) {
  write.csv(data.frame(error = "A has length more than 10"), 'result.csv', row.names = FALSE)
  stop("A has length more than 10")
} else if (max(mydat$B) > 18) {
  write.csv(data.frame(error = "max B exceeds limit"), 'result.csv', row.names = FALSE)
  stop("max B exceeds limit")
} else {
  result = mydat$A + mydat$B
  if (length(result) > 0) {
    write.csv(data.frame(result = result), 'result.csv', row.names = FALSE)
  }
}

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
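A possible alternative to writing the CSV before every stop() call is to wrap the whole computation in tryCatch() and write out whatever message the error carries. The following is only a minimal sketch under stated assumptions: the helper name run_checks and the temporary output path are illustrative and do not come from the thread.

```r
# Hypothetical helper (name is an assumption): runs the checks from the
# thread and returns the result as a data.frame, or signals an error.
run_checks <- function(mydat) {
  if (length(mydat$A) > 10) stop("A has length more than 10")
  if (max(mydat$B) > 18) stop("max B exceeds limit")
  data.frame(result = mydat$A + mydat$B)
}

mydat <- data.frame(A = c(19, 20, 19, 19, 19, 18, 16, 18, 19, 20),
                    B = c(19, 20, 20, 19, 20, 18, 19, 18, 17, 16))

# Illustrative output location; replace with the file the front end reads.
out_file <- file.path(tempdir(), "result.csv")

# On success keep the result; on error turn the message into a one-row table.
output <- tryCatch(
  run_checks(mydat),
  error = function(e) data.frame(error = conditionMessage(e))
)

# Either way, a single CSV is written.
write.csv(output, out_file, row.names = FALSE)
```

With this layout the front end can distinguish the two cases by the column name (result or error), and new checks only need another stop() inside the helper.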
[R] Subseting a data.frame
Dear Forum,

I have a data frame as

mydat = data.frame(basel_asset_class = c(2, 8, 8, 8),
                   defa_frequency = c(0.15, 0.07, 0.03, 0.001))

> mydat
  basel_asset_class defa_frequency
1                 2          0.150
2                 8          0.070
3                 8          0.030
4                 8          0.001

I need to get the subset of this data.frame where the number of records for the given basel_asset_class is > 2, i.e. I need to obtain the following subset (since there is only 1 record against basel_asset_class = 2, I want to filter it out):

> mydat_a
  basel_asset_class defa_frequency
1                 8          0.070
2                 8          0.030
3                 8          0.001

Kindly guide.

Katherine
Re: [R] Subseting a data.frame
I am sorry, perhaps I was not able to put the question properly.

I am not looking for the subset of the data.frame where the basel_asset_class is > 2. I do agree that would have been a basic requirement. Let me try to put the question again.

I have a data frame as

mydat = data.frame(basel_asset_class = c(4, 8, 8, 8),
                   defa_frequency = c(0.15, 0.07, 0.03, 0.001))

# Please note I have changed the basel_asset_class to 4 from 2, to avoid confusion.

> mydat
  basel_asset_class defa_frequency
1                 4          0.150
2                 8          0.070
3                 8          0.030
4                 8          0.001

This is just a representative example. In reality, I may have any number of Basel asset classes. 4, 8 etc. are IDs and can be anything, so I can't hard-code it as subset(mydat, mydat$basel_asset_class > 2).

What I need is to select only those records for which there are more than two default frequencies (defa_frequency). Thus, there is only one default frequency = 0.150 w.r.t. basel_asset_class = 4 whereas there are default frequencies w.r.t. basel asset class 4; similarly there could be another Basel asset class having, say, 5 default frequencies. Thus, I need to take the subset of the data.frame s.t. the number of corresponding defa_frequencies is greater than 2.

The idea is that we try to fit an exponential curve Y = A exp(BX) for each of the Basel asset classes, and to estimate the values of A and B one mathematically needs at least two values of X.

I hope I have been able to express my requirement. It is not that I need the subset of mydat s.t. the Basel asset class is > 2 (now 4 in the revised example), but the subset s.t. the number of default frequencies is greater than or equal to 2. This 2 is not the same as Basel asset class 2.

Kindly guide.

With warm regards

Katherine Gobin

On Thursday, 17 October 2013 9:33 PM, Bert Gunter wrote:

"Kindly guide" ... This is a very basic question, so the kindest guide I can give is to read an Introduction to R (ships with R) or an R web tutorial of your choice so that you can learn how R works instead of posting to this list.

Cheers,
Bert
Re: [R] Subseting a data.frame
Correction (2nd paragraph, first three lines).

Please read the following line

"What I need is to select only those records for which there are more than two default frequencies (defa_frequency). Thus, there is only one default frequency = 0.150 w.r.t. basel_asset_class = 4 whereas there are default frequencies w.r.t. basel asset class 4,"

as

"What I need is to select only those records for which there are more than two default frequencies (defa_frequency). Thus, there is only one default frequency = 0.150 w.r.t. basel_asset_class = 4 whereas there are THREE default frequencies w.r.t. basel asset class 8,"

I apologize for the inconvenience.

Regards

Katherine
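The requirement described above (keep only the rows whose class occurs often enough) can be sketched with table(). This is only one of several possible ways; the threshold name min_records and its value of 2 are assumptions, chosen to match the "at least two values" wording of the question.

```r
mydat <- data.frame(basel_asset_class = c(4, 8, 8, 8),
                    defa_frequency = c(0.15, 0.07, 0.03, 0.001))

# Assumed threshold: keep classes that have at least this many records.
min_records <- 2

# Count the rows per class, then collect the class IDs that meet the threshold.
class_counts <- table(mydat$basel_asset_class)
keep_classes <- names(class_counts)[class_counts >= min_records]

# table() names are character, so compare against the IDs as character.
mydat_a <- mydat[as.character(mydat$basel_asset_class) %in% keep_classes, ]
mydat_a
```

Because the class IDs are never hard-coded, the same lines apply when the data contain other classes or a different threshold is needed.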
Re: [R] Subseting a data.frame
Dear sir,

Thanks a lot for your guidance. I have benefited immensely from this discussion. Thanks again.

Regards

Katherine

On Friday, 18 October 2013 2:50 AM, Bert Gunter wrote:

Thanks, Bill. But ?ave specifically says:

ave(x, ..., FUN = mean)

Arguments:
x    A numeric.

So that it should not be expected to work properly if the argument is not (coercible to) numeric. Nevertheless, defensive programming is always wise.

Cheers,
Bert

On Thu, Oct 17, 2013 at 1:34 PM, William Dunlap wrote:

> May I ask why:
>   count_by_class <- with(dat, ave(numeric(length(basel_asset_class)), basel_asset_class, FUN=length))
> should not be more simply done as:
>   count_by_class <- with(dat, ave(basel_asset_class, basel_asset_class, FUN=length))

The way I did it would work if basel_asset_class were non-numeric.

In ave(x, group, FUN=FUN), FUN's return value should be the same type as x (or you can get some odd type conversions). E.g.,

> num <- c(2,3,2,2) ; char <- c("Two","Three","Two","Two")
> ave(num, num, FUN=length) # good
[1] 3 1 3 3
> ave(char, char, FUN=length) # bad
[1] "3" "1" "3" "3"
> fac <- factor(char, levels=c("One","Two","Three"))
> ave(fac, fac, FUN=length)
[1] <NA> <NA> <NA> <NA>
Levels: One Two Three
Warning messages:
1: In `[<-.factor`(`*tmp*`, i, value = 0L) :
  invalid factor level, NA generated
2: In `[<-.factor`(`*tmp*`, i, value = 3L) :
  invalid factor level, NA generated
3: In `[<-.factor`(`*tmp*`, i, value = 1L) :
  invalid factor level, NA generated

but x=integer(length(group)) works in all cases:

> ave(integer(length(fac)), fac, FUN=length)
[1] 3 1 3 3
> ave(integer(length(char)), char, FUN=length)
[1] 3 1 3 3

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

From: Bert Gunter [mailto:gunter.ber...@gene.com]
Sent: Thursday, October 17, 2013 1:06 PM
To: William Dunlap
Cc: Katherine Gobin; r-help@r-project.org
Subject: Re: [R] Subseting a data.frame

May I ask why:

count_by_class <- with(dat, ave(numeric(length(basel_asset_class)), basel_asset_class, FUN=length))

should not be more simply done as:

count_by_class <- with(dat, ave(basel_asset_class, basel_asset_class, FUN=length))

?

-- Bert

On Thu, Oct 17, 2013 at 12:36 PM, William Dunlap wrote:

> What I need is to select only those records for which there are more than two default
> frequencies (defa_frequency),

Here is one way. There are many others:

> dat <- data.frame( # slightly less trivial example
    basel_asset_class=c(4,8,8,8,74,3,74),
    defa_frequency=(1:7)/8)
> count_by_class <- with(dat, ave(numeric(length(basel_asset_class)), basel_asset_class, FUN=length))
> cbind(dat, count_by_class) # see what we just computed
  basel_asset_class defa_frequency count_by_class
1                 4          0.125              1
2                 8          0.250              3
3                 8          0.375              3
4                 8          0.500              3
5                74          0.625              2
6                 3          0.750              1
7                74          0.875              2
> dat[count_by_class>1, ] # I think this is what you are asking for
  basel_asset_class defa_frequency
2                 8          0.250
3                 8          0.375
4                 8          0.500
5                74          0.625
7                74          0.875

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Katherine Gobin
> Sent: Thursday, October 17, 2013 11:05 AM
> To: Bert Gunter
> Cc: r-help@r-project.org
> Subject: Re: [R] Subseting a data.frame
>
> Correction. (2nd para first three lines)
>
> Pl read following line
>
> What I need is to select only those records for which there are more than two default
> frequencies (defa_frequency), Thus, there is only one default frequency = 0.150 w.r.t
> basel_asse
[R] Yield to maturity in R
Dear R forum,

I just want to know whether there is any function or package in R that will calculate the Yield to Maturity for a given bond.

Regards

Katherine
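Independently of any package, yield to maturity can be obtained in base R by solving for the discount rate at which the present value of the cash flows equals the market price. The sketch below is a simplified illustration, not a reference implementation: it assumes a plain fixed-coupon bond, a whole number of remaining coupon periods, and valuation on a coupon date (no accrued interest). The function names bond_price and ytm are made up for this example.

```r
# Price of a fixed-coupon bond at annual yield y (assumed conventions:
# `freq` coupons per year, `years * freq` whole periods remaining).
bond_price <- function(y, face, coupon_rate, years, freq = 1) {
  n <- years * freq
  cash_flows <- rep(face * coupon_rate / freq, n)
  cash_flows[n] <- cash_flows[n] + face          # redemption at maturity
  sum(cash_flows / (1 + y / freq)^seq_len(n))    # discount each period
}

# Yield to maturity: the y at which the model price equals the market price.
ytm <- function(price, face, coupon_rate, years, freq = 1) {
  uniroot(function(y) bond_price(y, face, coupon_rate, years, freq) - price,
          interval = c(-0.99, 1), tol = 1e-10)$root
}

# A 5-year 5% annual-coupon bond trading below par has a yield above 5%.
ytm(price = 95, face = 100, coupon_rate = 0.05, years = 5)
```

uniroot() only searches the given interval, so yields outside (-99%, 100%) would need a wider bracket.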
[R] Readjusting frequencies
Dear Forum,

I have the following data.frame:

fraud_data = data.frame(no_of_frauds = c(1, 2, 4, 6, 7, 9, 10),
                        frequency = c(3, 1, 7, 11, 13, 1, 4))

> fraud_data
  no_of_frauds frequency
1            1         3
2            2         1
3            4         7
4            6        11
5            7        13
6            9         1
7           10         4

I need to regroup the data in such a way that if the frequency is less than 5, the corresponding class gets merged into the next class, i.e. the frequencies are added until the accumulated frequency exceeds 5.

Thus, in the above data.frame, since the frequencies for no_of_frauds 1 and 2 are 3 and 1 respectively, these get added to class 4 and the frequency of this class becomes 3 + 1 + 7 = 11. Likewise, the frequencies of classes 9 and 10 are 1 and 4, and when these are added the total is still 5, i.e. it does not exceed 5. Thus, these should get added to the previous class, i.e. 7.

Thus I need to have

no_of_frauds frequency
           4        11   # (3 + 1 + 7)
           6        11
           7        18   # (13 + 1 + 4)

Kindly guide.

Regards

Katherine
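The merging rule described above can be sketched as a single forward pass that accumulates frequencies until the running total exceeds the threshold, followed by folding any leftover tail into the last completed class. This is only one reading of the rule: the function name regroup_frequencies and the strict "exceeds 5" comparison are assumptions taken from the worked example in the post.

```r
# Hypothetical helper: merge low-frequency classes into the next class.
regroup_frequencies <- function(classes, freq, threshold = 5) {
  out_class <- numeric(0)
  out_freq  <- numeric(0)
  acc <- 0
  for (i in seq_along(freq)) {
    acc <- acc + freq[i]
    if (acc > threshold) {            # running total is large enough:
      out_class <- c(out_class, classes[i])   # close the group at this class
      out_freq  <- c(out_freq, acc)
      acc <- 0
    }
  }
  if (acc > 0) {                      # leftover tail that never exceeded it
    if (length(out_freq) > 0) {
      # fold the tail into the previous (last completed) class
      out_freq[length(out_freq)] <- out_freq[length(out_freq)] + acc
    } else {
      # nothing was completed: everything forms one class
      out_class <- classes[length(classes)]
      out_freq  <- acc
    }
  }
  data.frame(no_of_frauds = out_class, frequency = out_freq)
}

fraud_data <- data.frame(no_of_frauds = c(1, 2, 4, 6, 7, 9, 10),
                         frequency = c(3, 1, 7, 11, 13, 1, 4))

regrouped <- regroup_frequencies(fraud_data$no_of_frauds, fraud_data$frequency)
regrouped
```

For the posted data this yields classes 4, 6 and 7 with frequencies 11, 11 and 18. A class whose own frequency is exactly 5 would be merged forward under the strict comparison; change `>` to `>=` if that is not intended.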
[R] lmom package
Dear R Forum I have a set of data say as given below and as an exercise of trying to fit statistical distribution to this data, I am estimating parameters. amounts = c(38572.5599129508,11426.6705314315,21974.1571641187,118530.32782443,3735.43055996748,66309.5211176106,72039.2934132668,21934.8841708626,78564.9136114375,1703.65825161293,2116.89180930203,11003.495671332,19486.3296339113,1871.35861218795,6887.53851253407,148900.978055447,7078.56497101651,79348.1239806592,20157.6241066905,1259.99802108593,3934.45912233674,3297.69946631591,56221.1154121067,13322.0705174134,45110.2498756567,31910.3686613912,3196.71168501252,32843.0140437202,14615.1499458453,13013.9915051561,116104.176753387,7229.03056392023,9833.37962177814,2882.63239493673,165457.372543821,41114.066453219,47188.1677766245,25708.5883755617,82703.7378298092,8845.04197017415,844.28834047836,35410.8486123933,19446.3808445684,17662.2398792892,11882.8497070776,4277181.17817307,30239.0371267968,45165.7512343364,22102.8513746687,5988.69296597127,51345.0146170238,1275658.35495898,15260.4892854214,8861.76578480635,37647.1638704867,4979.53544046949,7012.48134772332,3385.20612391205,1911.03114395959,66886.5036605189,2223.47536156462,814.947809578378,234.028589468841,5397.4347625133,13346.3226579065,28809.3901352898,6387.69226236731,5639.42730553242,2011100.92675507,4150.63707173462,34098.7514446498,3437.10672573502,289710.315303182,8664.66947305203,13813.3867161134,208817.521491857,169317.624400274,9966.78447705792,37811.1721605562,2263.19211279927,80434.5581206454,19057.8093104899,24664.5067589624,25136.5042354789,3582.85741610706,6683.13898432794,65423.9991390846,134848.302304064,3018.55371579808,546249.641168158,172926.689143006,3074.15064180208,1521.70624812788,59012.4248281661,21226.928522236,17572.5682970983,226.646947337851,56232.2982652019,14641.0043361533,6997.94414914865) library(lmom)lmom <- samlmu(amounts) # # Normal distribution parameters_of_NOR <- pelnor(lmom); parameters_of_NOR > parameters_of_NOR <- 
pelnor(lmom); parameters_of_NOR
      mu    sigma
115148.4 175945.8

# Minitab and SPSS parameter values
         Location   Scale
Minitab  115148.4  485173
SPSS     115148.4  485173

# __

# Log normal 3 parameter distribution

parameters_of_LN3 <- pelln3(lmom); parameters_of_LN3

> parameters_of_LN3 <- pelln3(lmom); parameters_of_LN3
       zeta          mu       sigma
3225.798890    9.114879    2.240841

         Location    Scale     Shape
Minitab   9.73361  1.76298  75.51864
SPSS      9.7336   1.763    75.519

Similarly, except for the Generalized Extreme Value distribution, all the parameter values differ significantly from the values obtained using Minitab and SPSS. In the case of the Normal distribution, the dispersion parameter is simply the sample standard deviation; Excel also gives the value 485172.8, which differs significantly from what we get from R. The parameter values also differ for many other distributions, viz. the Gamma distribution etc.

Is there a different algorithm or logic used in R? Can someone please guide?

Regards

Katherine
[R] lmom package - Resending the email
Dear R forum I sincerely apologize as my earlier mail with the captioned subject, since all the values got mixed up and the email is not readable. I am trying to write it again. My problem is I have a set of data and I am trying to fit some distributions to it. As a part of this exercise, I need to find out the parameter values of various distributions e.g. Normal distribution, Log normal distribution etc. I am using lmom package to do the same, however the parameter values obtained using lmom pacakge differ to a large extent from the parameter values obtained using say MINITAB and SPSS as given below - _ amounts = c(38572.5599129508,11426.6705314315,21974.1571641187,118530.32782443,3735.43055996748,66309.5211176106,72039.2934132668,21934.8841708626,78564.9136114375,1703.65825161293,2116.89180930203,11003.495671332,19486.3296339113,1871.35861218795,6887.53851253407,148900.978055447,7078.56497101651,79348.1239806592,20157.6241066905,1259.99802108593,3934.45912233674,3297.69946631591,56221.1154121067,13322.0705174134,45110.2498756567,31910.3686613912,3196.71168501252,32843.0140437202,14615.1499458453,13013.9915051561,116104.176753387,7229.03056392023,9833.37962177814,2882.63239493673,165457.372543821,41114.066453219,47188.1677766245,25708.5883755617,82703.7378298092,8845.04197017415,844.28834047836,35410.8486123933,19446.3808445684,17662.2398792892,11882.8497070776,4277181.17817307,30239.0371267968,45165.7512343364,22102.8513746687,5988.69296597127,51345.0146170238,1275658.35495898,15260.4892854214,8861.76578480635,37647.1638704867,4979.53544046949,7012.48134772332,3385.20612391205,1911.03114395959,66886.5036605189,2223.47536156462,814.947809578378,234.028589468841,5397.4347625133,13346.3226579065,28809.3901352898,6387.69226236731,5639.42730553242,2011100.92675507,4150.63707173462,34098.7514446498,3437.10672573502,289710.315303182,8664.66947305203,13813.3867161134,208817.521491857,169317.624400274,9966.78447705792,37811.1721605562,2263.19211279927,80434.5581206454,190
57.8093104899,24664.5067589624,25136.5042354789,3582.85741610706,6683.13898432794,65423.9991390846,134848.302304064,3018.55371579808,546249.641168158,172926.689143006,3074.15064180208,1521.70624812788,59012.4248281661,21226.928522236,17572.5682970983,226.646947337851,56232.2982652019,14641.0043361533,6997.94414914865)

library(lmom)

lmom = samlmu(amounts)

# __

# Normal Distribution parameters

parameters_of_NOR <- pelnor(lmom); parameters_of_NOR
      mu    sigma
115148.4 175945.8

         Location   Scale
Minitab  115148.4  485173
SPSS     115148.4  485173

# __

# Log Normal (3 Parameter) Distribution parameters

       zeta          mu       sigma
3225.798890    9.114879    2.240841

         Location    Scale     Shape
MINITAB   9.73361  1.76298  75.51864
SPSS      9.7336   1.763    75.519

# __

Except for the Generalized Extreme Value distribution, all the other distributions, e.g. the Gamma and Exponential (2 parameter) distributions, give results different from MINITAB and SPSS.

Can someone guide me?

Regards

Katherine
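The gap between the two "sigma" values for the Normal distribution comes from the estimator, not from a different algorithm for the same quantity: for a normal distribution the second L-moment is lambda_2 = sigma / sqrt(pi), so an L-moment fit reports sqrt(pi) times the sample L-scale, whereas Minitab, SPSS and Excel report the ordinary sample standard deviation. The sketch below illustrates this in base R on toy data. The helper names sample_l2 and lmom_sigma are made up for this example, and it is assumed (not verified here) that samlmu() uses the same unbiased pairwise estimator.

```r
# Sample L-scale l2: half the mean absolute difference over all pairs.
sample_l2 <- function(x) {
  n <- length(x)
  sum(dist(x)) / choose(n, 2) / 2   # dist() on a vector gives |x_i - x_j|
}

# For a normal distribution lambda_2 = sigma / sqrt(pi), hence the
# L-moment estimate of sigma is sqrt(pi) * l2.
lmom_sigma <- function(x) sqrt(pi) * sample_l2(x)

# Toy data with one large outlier (illustrative, not the posted amounts).
x <- c(1, 2, 3, 4, 100)

sample_l2(x)    # L-scale
lmom_sigma(x)   # L-moment based "sigma"
sd(x)           # ordinary sample standard deviation
```

The two estimates agree closely only when the data are roughly normal; with strongly skewed data such as the posted amounts, the standard deviation is dominated by the largest values while the L-moment estimate is much less affected.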