Re: [R] lmom package - Resending the email

2014-12-04 Thread Katherine Gobin
Dear Dalgaard sir,


Thanks a lot for the detailed clarification. It is indeed very enlightening and 
will be very useful to me in future.

And your suggestion is well taken.

Thanks again.

Regards

Katherine


On Thu, 4/12/14, peter dalgaard  wrote:

 Subject: Re: [R] lmom package - Resending the email
 To: "Simon Zehnder" 


 Date: Thursday, 4 December, 2014, 2:04 PM

 lmom is based on L-moments, which are different from ordinary moments,
 except for the 1st one. It would be truly miraculous if it gave the same
 result as the ordinary method of moments or maximum likelihood.

 Estimation of any distributional parameter requires that the model
 actually fits the data, and in your case a qqnorm(amounts) shows that
 they are certainly not normal. In such cases, the L-moment estimator of
 the std.dev. is not necessarily an estimate of the std.dev. of the
 actual distribution.

 A lognormal distribution seems to fit the data better. However, the
 L-moments suggest a value for zeta (the lower bound) of 3226, which is
 well inside the range of the actual data. In fact there are 16
 observations that are less than 3226. Maximum likelihood would never do
 that, but the same sort of effect is well-known for the ordinary method
 of moments.

 In short, you need to study the theory before you apply its results.

 - Peter D.
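
To make the point about L-moments concrete, here is a minimal base-R sketch of what an L-moment fit of the normal distribution does. It is not the lmom implementation: the helper name lmom_normal_fit and the simulated data are assumptions for illustration only. It uses the standard relation for the normal distribution, lambda2 = sigma / sqrt(pi), and compares the result with the ordinary mean and standard deviation on skewed data.

```r
# Hypothetical helper: normal parameters from the first two sample L-moments.
lmom_normal_fit <- function(x) {
  xs <- sort(x)
  n  <- length(xs)
  b0 <- mean(xs)                                  # PWM b0 = sample mean
  b1 <- sum((seq_len(n) - 1) / (n - 1) * xs) / n  # unbiased PWM b1
  l1 <- b0                                        # first L-moment (location)
  l2 <- 2 * b1 - b0                               # second L-moment (L-scale)
  c(mu = l1, sigma = sqrt(pi) * l2)               # normal: lambda2 = sigma / sqrt(pi)
}

set.seed(1)
skewed <- rlnorm(100, meanlog = 10, sdlog = 2)    # simulated data, heavy right tail

fit_lmom <- lmom_normal_fit(skewed)
fit_mom  <- c(mu = mean(skewed), sigma = sd(skewed))

# The two rows share mu but will generally show different sigma values.
print(rbind(L_moments = fit_lmom, ordinary = fit_mom))
```

The location estimates coincide because the first L-moment is the sample mean; only the scale estimates are built differently, which is why the two methods need not agree when the data are far from normal.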


 On 03 Dec 2014, at 10:57, Simon Zehnder wrote:

 > Katherine,
 >
 > for a deeper understanding of the differing values it makes sense to
 > provide the list at least with an online description of the
 > corresponding functions used in Minitab and SPSS…
 >
 > Best
 > Simon
 >
 > On 03 Dec 2014, at 10:45, Katherine Gobin via R-help wrote:
 > 
 >> Dear R forum
 >>
 >> I sincerely apologize for my earlier mail with the captioned subject,
 >> since all the values got mixed up and the email was not readable. I am
 >> trying to write it again.
 >>
 >> My problem is that I have a set of data and I am trying to fit some
 >> distributions to it. As part of this exercise, I need to find the
 >> parameter values of various distributions, e.g. Normal distribution,
 >> Log normal distribution etc. I am using the lmom package to do so;
 >> however, the parameter values obtained using the lmom package differ
 >> to a large extent from the parameter values obtained using, say,
 >> MINITAB and SPSS, as given below -
 >>
 >> 
 >> amounts = 
 
c(38572.5599129508,11426.6705314315,21974.1571641187,118530.32782443,3735.43055996748,66309.5211176106,72039.2934132668,21934.8841708626,78564.9136114375,1703.65825161293,2116.89180930203,11003.495671332,19486.3296339113,1871.35861218795,6887.53851253407,148900.978055447,7078.56497101651,79348.1239806592,20157.6241066905,1259.99802108593,3934.45912233674,3297.69946631591,56221.1154121067,13322.0705174134,45110.2498756567,31910.3686613912,3196.71168501252,32843.0140437202,14615.1499458453,13013.9915051561,116104.176753387,7229.03056392023,9833.37962177814,2882.63239493673,165457.372543821,41114.066453219,47188.1677766245,25708.5883755617,82703.7378298092,8845.04197017415,844.28834047836,35410.8486123933,19446.3808445684,17662.2398792892,11882.8497070776,4277181.17817307,30239.0371267968,45165.7512343364,22102.8513746687,5988.69296597127,51345.0146170238,1275658.35495898,15260.4892854214,8861.76578480635,37647.1638704867,4979.53544046949,7012.48134772332
,3385.20612391205,1911.03114395959,66886.5036605189,2223.47536156462,814.947809578378,234.028589468841,5397.4347625133,13346.3226579065,28809.3901352898,6387.69226236731,5639.42730553242,2011100.92675507,4150.63707173462,34098.7514446498,3437.10672573502,289710.315303182,8664.66947305203,13813.3867161134,208817.521491857,169317.624400274,9966.78447705792,37811.1721605562,2263.19211279927,80434.5581206454,19057.8093104899,24664.5067589624,25136.5042354789,3582.85741610706,6683.13898432794,65423.9991390846,134848.302304064,3018.55371579808,546249.641168158,172926.689143006,3074.15064180208,1521.70624812788,59012.4248281661,21226.928522236,17572.5682970983,226.646947337851,56232.2982652019,14641.0043361533,6997.94414914865)
 >> 
 >> library(lmom)
 >> lmom = samlmu(amounts)
 >>
 >> # Normal Distribution parameters
 >> parameters_of_NOR <- pelnor(lmom); parameters_of_NOR
 >>
 >>       mu    sigma
 >> 115148.4 175945.8
 >>
 >>            Location     Scale
 >> Minitab    115148.4    485173
 >> SPSS       115148.4    485173
 >>
 >> # Log Normal (3 Parameter) Distribution parameters
 >>
 >>        zeta          mu       sigma
 >> 3225.798890    9.114879    2.240841
 >>
 >>            Location     Scale

[R] VGAM package : Frechet distribution - 2 parameter estimation

2014-11-06 Thread Katherine Gobin
Dear R forum,

I am trying to execute the following code (page 259 of VGAM.pdf):

library(VGAM)

set.seed(123)
fdata <- data.frame(y1 = rfrechet(nn <- 1000, shape = 2 + exp(1)))
with(fdata, hist(y1))
fit2 <- vglm(y1 ~ 1, frechet, data = fdata, trace = TRUE)

However, I receive the following error:

Error in vglm(y1 ~ 1, frechet, data = fdata, trace = TRUE) : 
  object 'frechet' not found


Earlier there used to be a function called "frechet3" which I guess has been 
withdrawn by VGAM. 

Kindly guide 

Katherine
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] VGAM package : Frechet distribution - 2 parameter estimation

2014-11-06 Thread Katherine Gobin
Dear Mr Michael,

Thanks a lot for your guidance. The PDF manual for the VGAM package uses 
'frechet' in its example, which is why I got the error.

Regards
Katherine


On Thursday, 6 November 2014 2:54 PM, Michael Dewey wrote:

On 06/11/2014 06:04, Katherine Gobin wrote:
> Dear R forum,
>
> I am trying to execute following code (Page no 259 - VGAM.pdf)
>
> # 
> .
>
> library(VGAM)
>
> set.seed(123)
> fdata <- data.frame(y1 = rfrechet(nn <- 1000, shape = 2 + exp(1)))
> with(fdata, hist(y1))
> fit2 <- vglm(y1 ~ 1, frechet, data = fdata, trace = TRUE)
>
> # 
> .
>

Is it not called frechet2?

>
> However, I receive following error
>
> Error in vglm(y1 ~ 1, frechet, data = fdata, trace = TRUE) :
>   object 'frechet' not found
>
>
> Earlier there used to be a function called "frechet3" which I guess has been 
> withdrawn by VGAM.
>
> Kindly guide
>
> Katherine

-- 
Michael
http://www.dewey.myzen.co.uk


[R] Double return statement

2013-08-14 Thread Katherine Gobin
Dear R forum,

I have a function which generates two outputs, say output_1 and output_2. 
output_1 is a single-row output, whereas output_2 is a data frame having 
multiple records. Is it possible to use two return statements in a function? 
output_2 uses some records from output_1, hence I need to have both outputs 
generated by the same function.

Thanking in advance

With warm regards

Katherine
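
An R function returns a single object, so the usual approach to the question above is to bundle both results into one named list. The following is a minimal sketch; the function name make_outputs and the columns it computes are hypothetical and only illustrate the pattern.

```r
# Hypothetical function returning two results at once.
make_outputs <- function(values) {
  # output_1: a single-row summary
  output_1 <- data.frame(n = length(values), total = sum(values))
  # output_2: a multi-row data frame that reuses a value from output_1
  output_2 <- data.frame(value = values, share = values / output_1$total)
  # Only one object can be returned, so bundle both into a named list
  list(output_1 = output_1, output_2 = output_2)
}

res <- make_outputs(c(10, 20, 30, 40))
res$output_1   # the single-row data frame
res$output_2   # the data frame with one row per value
```

The caller then picks out each piece by name with `$`, so nothing is lost by having a single return value.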


[R] Display of data points in the Scatterplot

2012-12-19 Thread Katherine Gobin
Respected R forum

I am learning R and am relatively new to it. I am generating a scatter plot 
as given below. (My actual table is much larger.)



# Sample data frame


y = c(20, 23, 17, 31, 68)
x = c(200, 300, 400, 500, 600)

plot(x, y, type = 'l')

If I plot this scatter plot in Excel, the data values are displayed if I place 
the cursor at some desired place of the graph. E.g. if I place the cursor 
at the point (500, 31), then the value (500, 31) is displayed.

My question is 


(A) Once I plot a graph in R, is it possible to display a particular (x, y) 
co-ordinate by placing the cursor there?

(B) Suppose I have 100 pairs of (x, y). Is it then possible to display in the 
graph (irrespective of the cursor position) the values of (x, y) corresponding 
to, say, the 10th, 20th, 30th, 40th etc. observations?


Regards

Katherine
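
Both parts of the question can be approached with base graphics. The sketch below uses made-up data of 100 pairs (an assumption, since the real table was not posted). Part (B) is handled with text(), which labels chosen observations regardless of the cursor; part (A) uses identify(), which only works on an interactive graphics device and labels the point nearest to each mouse click.

```r
set.seed(1)
x <- seq(100, by = 100, length.out = 100)   # illustrative data, 100 pairs
y <- round(cumsum(rnorm(100)), 1)

idx    <- seq(10, length(x), by = 10)       # 10th, 20th, ..., 100th observation
labels <- paste0("(", x[idx], ", ", y[idx], ")")

plot(x, y, type = "l")
points(x[idx], y[idx], pch = 19)            # mark the selected observations
text(x[idx], y[idx], labels = labels, pos = 3, cex = 0.7)

# (A) needs an interactive device: click near points to label them,
# then finish with a right click or Esc (depending on the device).
if (interactive()) {
  identify(x, y, labels = paste0("(", x, ", ", y, ")"))
}
```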


[R] How to count the nos. in a range?

2012-12-20 Thread Katherine Gobin
Dear R forum


I have the following vector of random numbers

x = runif(100, 0.01, 0.99)


 [1] 0.47212037 0.77867992 0.33947474 0.93369035
  [5] 0.03720073 0.79307831 0.81801835 0.92710688
.


I need to count the random numbers falling in the ranges (0 - 0.10), 
(0.10 - 0.20), (0.20 - 0.30) ... up to (0.90 - 1).

Thus, I need to have a data frame as

range          frequency
0 - 0.10       ...
0.10 - 0.20    ...
...
0.90 - 1       ...

I understand I need to write my own code and ask for help only if need be. But 
I am simply clueless at the moment.

Kindly guide.

Katherine
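
One common base-R way to get such a frequency table is to bin the values with cut() and count the bins with table(). This is a minimal sketch; the seed and the column names range and frequency are choices made here for illustration.

```r
set.seed(42)
x <- runif(100, 0.01, 0.99)

breaks <- seq(0, 1, by = 0.10)                  # 0, 0.1, ..., 1
bins   <- cut(x, breaks = breaks, include.lowest = TRUE)

freq <- as.data.frame(table(bins))              # one row per bin, empty bins kept
names(freq) <- c("range", "frequency")
freq
```

Note that cut() uses intervals closed on the right by default, so a value of exactly 0.10 would be counted in the first range; pass right = FALSE to cut() if the opposite convention is wanted.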
  


[R] Can data.frame be saved as image?

2012-12-21 Thread Katherine Gobin
Dear R forum

I have one stupid question, but I have no other solution in sight.

Suppose some R process creates graphs etc. along with the main output as a 
data.frame, e.g.

output1 = data.frame(bands = c("A", "B", "C"), results = c(74, 108,  65))

I normally save this output as a csv file.

But I need to save this output as an image (I understand this is weird, but I 
need to find some way to do so), e.g. for a graph, I use 'png' as

png("histogram.png", width=480,height=480)

.

..

dev.off()

Please advise.

Regards

Katherine




Re: [R] Can data.frame be saved as image?

2012-12-21 Thread Katherine Gobin
Dear Sir,

Thanks a lot for your suggestion. In the meantime I came across 

http://stackoverflow.com/questions/10587621/how-to-print-to-paper-a-nicely-formatted-data-frame

and got to know about the package "gridExtra"

So I used following code 

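library(gridExtra)   # provides grid.table()
png(filename = "output1.png", width = 480, height = 480)
grid.table(output1)  # draws the data.frame as a table on the open device
dev.off()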

And that solved it. 

Thanks again, Sir, for suggesting the two other packages, 'lattice' and 
'ggplot2'; I would definitely like to explore these two.

I understand that before posting to the forum I should have searched the old 
mails etc., but I was a bit desperate to know the solution and somehow felt it 
was a stupid thing to ask. I will remember it next time.

Regards

Katherine



--- On Fri, 21/12/12, jim holtman  wrote:

From: jim holtman 
Subject: Re: [R] Can data.frame be saved as image?
To: "Katherine Gobin" 
Cc: r-help@r-project.org
Date: Friday, 21 December, 2012, 2:39 PM

do you want to save the dataframe used in the plot and then the plot
itself?  If so consider using 'lattice' or 'ggplot2' which create an
object for "print" and this would allow you to use 'save' to save both
objects in a file.

If you want to generate the 'png' file, then you would have to 'save'
the dataframe and then 'zip' the .RData and png file into a new file.

So what is it that you intend to do with the data that is saved in the
common file?

On Fri, Dec 21, 2012 at 8:59 AM, Katherine Gobin
 wrote:
> Dear R forum
>
> I have one stupid question, but I have no other solution to it in sight?
>
> Suppose some R process creates graphs etc alongwith main output as data.frame 
> e.g
>
> output1 = data.frame(bands = c("A", "B", "C"), results = c(74, 108,  65))
>
> I normally save this output as some csv file.
>
> But I need to save this output as some image (I understand this is weird, but 
> I need to find out some way to do so) e.g. for graph, I use 'png' as
>
> png("histogram.png", width=480,height=480)
>
> .
>
> ..
>
> dev.off()
>
> Please advise.
>
> Regards
>
> Katherine
>
>
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



[R] lmomco package - Random number generation using Wakeby distribution

2013-01-21 Thread Katherine Gobin
Dear R forum

From the given data, I have estimated the parameters of the Wakeby distribution 
using the lmomco package as

library(lmomco)

(amounts <- read.csv("input_S.csv")$amount)

# ___

# Wakeby distribution - Parameter estimation

N                      = length(amounts)
lmr                    = lmom.ub(amounts)
parameters_of_Wakeby   = parwak(lmr)


> parameters_of_Wakeby

$type
[1] "wak"

$para
                  xi                alpha
1.18813927666405e+04             0.00e+00
                beta                gamma
            0.00e+00 8.11391042554567e+04
               delta
9.57554297149062e-01

This means the scale parameters are 0.

However, assume all five parameters of the Wakeby distribution (viz. the 
location parameter xi, the scale parameters alpha and beta, and the shape 
parameters gamma and delta) are available.

Then how do I generate, say, 100 random numbers from the Wakeby distribution 
with these five parameters?

I couldn't find any information about this in lmomco. Kindly guide whether 
random numbers can be generated and, if yes, how it can be done in R.

Thanking in advance

Katherine

 


Re: [R] lmomco package - Random number generation using Wakeby distribution

2013-01-21 Thread Katherine Gobin
Dear Sir,
Thanks a lot for your eye-opening reply. I was just thinking of our usual 
commands like rnorm, runif etc., so I was wondering if there exists something 
like rwakeby.
And lastly, I have calculated the parameters using 
> lmr                    = lmom.ub(amounts)
> parameters_of_Wakeby   = parwak(lmr)
whereas you have mentioned lmom2par(). Will it create a different set of 
parameters? Actually I am travelling and don't have R installed on the laptop I 
am carrying with me to verify the results.
Regards
Katherine


--- On Mon, 21/1/13, David Winsemius  wrote:

From: David Winsemius 
Subject: Re: [R] lmomco package - Random number generation using Wakeby distribution
To: "Katherine Gobin" 
Cc: r-help@r-project.org
Date: Monday, 21 January, 2013, 7:46 PM


On Jan 21, 2013, at 10:30 AM, Katherine Gobin wrote:

> Dear R forum
> 
>> From the given data, I have estimated the parameters of Wakeby distribution 
>> using lmomco package as
> 
> library(lmomco)
> 
> (amounts <- read.csv("input_S.csv")$amount)
> 
> # ___
> 
> # Wakeby distribution - Parameter estimation
> 
> N                      = length(amounts)
> lmr                    = lmom.ub(amounts)
> parameters_of_Wakeby   = parwak(lmr)

It appears you have a) not included the code that produced that output and b) 
failed to read the Index page for that package

help(package="lmomco")

?rlmomco    #  Random Deviates of a Distribution

So on the assumption that you have an object in your workspace named 
"parameters_of_Wakeby" and it is an lmomco produced object like that returned 
by lmom2par() I would try:

rlmomco(100, parameters_of_Wakeby) 


> 
>> parameters_of_Wakeby
> 
> $type
> [1] "wak"
> 
> $para
>                   xi                alpha 
> 1.18813927666405e+04 0.00e+00 
>                 beta                gamma 
> 0.00e+00 8.11391042554567e+04 
>                delta 
> 9.57554297149062e-01 
> 
> This means the scale parameters are 0.
> 
> However, assuming, all the five parameters of Wakeby distribution (viz. 
> location parameter m (xi), the scale parameters a, b, and shape parameters  g 
> and d are available. 
> 
> Then, how do I generate say 100 random no.s using Wakeby distribution w.r.t. 
> these
> 5 available parameters.
> 
> I couldn't find any information about this in lmomco. Kindly guide if random 
> no.s can be generated or not and if yes, how it can be done in r.

You should have been able to find this with:

help.search("random", package="lmomco")

-- 

David Winsemius
Alameda, CA, USA
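
For readers wondering what a generator such as rlmomco() has to do in principle, random deviates can be drawn by inverse-transform sampling: push uniform random numbers through the quantile function. The sketch below is a base-R illustration, not lmomco's code. The helper names qwakeby and rwakeby are hypothetical, and the quantile function assumed is the usual five-parameter Wakeby form, x(F) = xi + (alpha/beta)(1 - (1-F)^beta) - (gamma/delta)(1 - (1-F)^(-delta)), with the logarithmic limits used when beta or delta is zero.

```r
# Hypothetical helper: Wakeby quantile function x(F).
qwakeby <- function(p, xi, alpha, beta, gamma, delta) {
  u <- 1 - p
  # (alpha/beta) * (1 - u^beta); the beta -> 0 limit is -alpha * log(u)
  term1 <- if (beta == 0) -alpha * log(u) else alpha / beta * (1 - u^beta)
  # -(gamma/delta) * (1 - u^(-delta)); the delta -> 0 limit is -gamma * log(u)
  term2 <- if (delta == 0) -gamma * log(u) else -gamma / delta * (1 - u^(-delta))
  xi + term1 + term2
}

# Hypothetical helper: inverse-transform sampling from the Wakeby distribution.
rwakeby <- function(n, ...) qwakeby(runif(n), ...)

set.seed(123)
# Parameter values copied from the $para output quoted in this thread.
draws <- rwakeby(100,
                 xi = 1.18813927666405e+04, alpha = 0, beta = 0,
                 gamma = 8.11391042554567e+04, delta = 9.57554297149062e-01)
summary(draws)
```

With alpha = 0 the first term vanishes, so every draw is at least xi; in practice the packaged rlmomco(100, parameters_of_Wakeby) call suggested above is the more convenient route.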




[R] How to extract values of results in gamlss.tr

2013-01-23 Thread Katherine Gobin
Dear R helpers,

I have the following loss data and I need to fit a LEFT truncated Log Normal 
distribution, truncated at 100, to this data.

dat = 
c(1333834,5710254,9987567,7809469,6940935,3473671,1270209,1102523,1124002, 
5830159,4302300,3925242,2638409,2324421,7238436,9088709,7439250,4976551,4864319,
 8741334,1863770,7098310,4942288,4971829,4986372)

library(gamlss.tr)

gen.trun(5, LOGNO)

result <- gamlss(dat~1, family=LOGNOtr)


# THIS GIVES

> result

Family:  c("LOGNOtr", "left truncated Log Normal")
 
Fitting method: RS() 

Call:  gamlss(formula = dat ~ 1, family = LOGNOtr) 

Mu Coefficients:
(Intercept)  
  15.23  
Sigma Coefficients:
(Intercept)  
    -0.3977  

 Degrees of Freedom for the fit: 2 Residual Deg. of Freedom   23 
Global Deviance: 812.568 
    AIC: 816.568 
    SBC: 819.006 

My problem is: how do I extract the values of the Mu and Sigma coefficients 
so that I can use them in further analyses?

Kindly guide

Katherine Gobin



[R] Counting various elemnts in a vactor

2013-03-26 Thread Katherine Gobin
Dear R forum

I have a vector say as given below

df = c("F", "C", "F", "B", "D", "A", "D", "D", "A", "F", "D", "F", "B",    "C")

I need to find 

(1) how many times each element occurs, e.g. in the above vector F occurs 4 
times, C occurs 2 times etc.

(2) Depending on the number of occurrences, I need to repeat the element 100 
times the number of occurrences, e.g. I need to repeat F 4 * 100 = 400 times, 
C 2 * 100 = 200 times.

I can manage the second part, i.e. the repeating, but I am not able to count 
the number of times each element appears in a given vector.

Kindly guide 
 
Katherine


Re: [R] Counting various elemnts in a vactor

2013-03-26 Thread Katherine Gobin
Dear Sir,

Thanks a lot for your great help. I couldn't have figured it out. 

Thanks again.

Regards

Katherine

--- On Tue, 26/3/13, D. Rizopoulos  wrote:

From: D. Rizopoulos 
Subject: Re: [R] Counting various elemnts in a vactor
To: "Katherine Gobin" 
Cc: "r-help@r-project.org" 
Date: Tuesday, 26 March, 2013, 8:23 AM

try this:

df <- c("F", "C", "F", "B", "D", "A", "D", "D", "A", "F", "D", "F", "B", 
    "C")

tab <- table(df)
tab
rep(names(tab), 100 * tab)


I hope it helps.

Best,
Dimitris
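
For reference, a slightly expanded version of the same table() idea, showing what the counts look like for the posted vector. The object names v, counts and expanded are chosen here for illustration.

```r
v <- c("F", "C", "F", "B", "D", "A", "D", "D", "A", "F", "D", "F", "B", "C")

tab <- table(v)                       # counts per distinct element, sorted by name
counts <- setNames(as.vector(tab), names(tab))
counts                                # A = 2, B = 2, C = 2, D = 4, F = 4

# Repeat each element 100 times per occurrence
expanded <- rep(names(tab), times = 100 * as.vector(tab))
table(expanded)
```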


On 3/26/2013 9:12 AM, Katherine Gobin wrote:
> Dear R forum
>
> I have a vector say as given below
>
> df = c("F", "C", "F", "B", "D", "A", "D", "D", "A", "F", "D", "F", "B",    
> "C")
>
> I need to find
>
> (1) how many times each element occurs, e.g. in the above vector F occurs 4 
> times, C occurs 2 times etc.
>
> (2) Depending on the number of occurrences, I need to repeat the element 100 
> times the number of occurrences, e.g. I need to repeat F 4 * 100 = 400 times, 
> C 2 * 100 = 200 times.
>
> I can manage the second part i.e. repeating but I am not able to count the 
> number of times the element is appearing in a given vector.
>
> Kindly guide
>
> Katherine

-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/


[R] Archieve of mails from R forum

2013-03-27 Thread Katherine Gobin
Dear R helpers,

Every day I receive a great many mails from the R forum, and after some time 
my INBOX is filled with numerous mails. If I haven't accessed my mail for a 
while, it becomes difficult to keep track, and many times, simply due to the 
volume (and owing to lack of time due to office constraints), I have to delete 
the mails without opening them. I understand this is a huge loss.

If I wish to refer to all the old emails that have appeared in the R forum, 
where do I get these? Is there any list where I will get a subject-wise or 
thread-wise archive of old emails? I understand that it will be an ocean of 
quality information, that one can learn a lot from these old mails, and that I 
then won't need to keep track of my emails all the time.
 

Kindly guide.

Regards

Katherine



Re: [R] Archieve of mails from R forum

2013-03-27 Thread Katherine Gobin
Dear Sir,

Thanks a lot for the input. I am sure it will go a long way for me to 
understand R.

Thanks again.

Regards

Katherine

--- On Wed, 27/3/13, Mason  wrote:

From: Mason 
Subject: Re: [R] Archieve of mails from R forum
To: "Marc Schwartz" 
Cc: "Katherine Gobin" , "r-help@r-project.org help" 

Date: Wednesday, 27 March, 2013, 7:30 PM

http://r-help.markmail.org/ has a nice interface for searching the archives, 
too.

On Wed, Mar 27, 2013 at 12:18 PM, Marc Schwartz wrote:

On Mar 27, 2013, at 1:58 PM, Katherine Gobin wrote:

> Dear R helpers,
>
> Everyday I do receive many many mails from R forum and after some period of 
> times, INBOX is filled with numerous mails. At times if for some period of 
> time, I haven't accessed mails, it becomes difficult to keep track of mails 
> and many times simply due to the volume (and owing to the lack of time due to 
> office constraints), I have to simply delete the mails without opening them 
> and I understand this is a huge loss.
>
> If in case I wish to refer to all the old emails that have been appeared in 
> the R forum, where do I get these? Is there any list where I will get 
> subject-wise of thread-wise archive of old emails? I understand that will be 
> an ocean of quality information and one can learn a lot from these old mails 
> and I don't need to keep track of my emails all the time.
>
> Kindly guide.
>
> Regards
>
> Katherine

The official archives for R-Help are here:

  https://stat.ethz.ch/pipermail/r-help/

and these are mirrored in various locations, such as:

  http://www.mail-archive.com/r-help@stat.math.ethz.ch/
  http://dir.gmane.org/gmane.comp.lang.r.general

You can also search the archives for all R lists at:

  http://rseek.org/
  http://finzi.psych.upenn.edu/search.html
  http://tolstoy.newcastle.edu.au/R/

My recommendation would be to set up a mail filter or rule (using 
r-help@r-project.org in the sender and cc: address fields) so that the list 
e-mails are automatically moved from your main inbox to a folder just for these 
e-mails and you can then browse them as your schedule permits, rather than 
having them interspersed with other e-mails in the same location. I do this 
with a number of the R related lists and have a folder for each one to keep 
them separated. Most e-mail clients and/or online services have some type of 
filtering or rule configuration available to do this.

Regards,

Marc Schwartz



[R] How to delete Identical columns

2013-03-28 Thread Katherine Gobin
Dear R forum

Suppose I have a data.frame 

df = data.frame(id = c(1:6), x = c(15, 21, 14, 21, 14, 38), y = c(36, 38, 55, 
11, 5, 18), x.1 = c(15, 21, 14, 21, 14, 38), z = c("D", "B", "A", "F", "H", 
"P"))


> df
  id  x  y x.1 z
1  1 15 36  15 D
2  2 21 38  21 B
3  3 14 55  14 A
4  4 21 11  21 F
5  5 14  5  14 H
6  6 38 18  38 P


Clearly columns x and x.1 are identical. In reality, I have a large data.frame 
and can't make out which columns are identical, but I am sure that a column 
named, say, x is repeated as x.1, x.2 etc.

How can I automatically identify and retain only one column (in this example 
column x) among the identical columns, along with the other non-identical 
columns (viz. id, y and z)?


Regards

Katherine



Re: [R] How to delete Identical columns

2013-03-28 Thread Katherine Gobin
Dear Sir,

Thanks a lot for your wonderful solution. When I applied it to my data.frame, 
however, it also deleted many other columns with the same kind of repeated 
column names, i.e. suppose I wanted to delete only ABC.1, ABC.2 etc. and retain 
XYZ, XYZ.1, XYZ.2 etc. This was not happening: along with the ABC series, it 
was deleting the XYZ series too. So I changed the command you had given from

df[ -grep( "\\.", names( df))] to

df[ -grep( "ABC\\.", names( df))]

And it led me to the desired result.

Thanks again sir.

Regards

Katherine




--- On Thu, 28/3/13, Gerrit Eichner  wrote:

From: Gerrit Eichner 
Subject: Re: [R] How to delete Identical columns
To: "Katherine Gobin" 
Cc: r-help@r-project.org
Date: Thursday, 28 March, 2013, 8:58 AM

Hi, Katherine,

IF the naming scheme of the columns of your data frame is consistently 
 and  if duplicated columns 
appear THEN (something like)

df[ -grep( "\\.", names( df))]

could help. (But it's maybe more efficient to avoid - a priori - producing 
duplicated columns, if the data frame is large, as you say.)

  Regards -- Gerrit
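
A complementary approach, offered here as a sketch rather than as part of the reply above, compares the column contents instead of the column names, so it does not rely on any naming scheme. It uses base R's duplicated() on the data frame viewed as a list of columns; the object names dup and df_unique are chosen for illustration.

```r
df <- data.frame(id  = 1:6,
                 x   = c(15, 21, 14, 21, 14, 38),
                 y   = c(36, 38, 55, 11, 5, 18),
                 x.1 = c(15, 21, 14, 21, 14, 38),
                 z   = c("D", "B", "A", "F", "H", "P"))

# TRUE for every column whose contents repeat an earlier column
dup <- duplicated(as.list(df))
names(df)[dup]          # columns that would be dropped

df_unique <- df[!dup]   # keeps the first copy of each distinct column
df_unique
```

Two columns count as duplicates here only if their values and types match exactly, so an integer column and a numeric column holding the same numbers would both be kept.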


On Thu, 28 Mar 2013, Katherine Gobin wrote:

> Dear R forum
>
> Suppose I have a data.frame
>
> df = data.frame(id = c(1:6), x = c(15, 21, 14, 21, 14, 38), y = c(36, 38, 55, 
> 11, 5, 18), x.1 = c(15, 21, 14, 21, 14, 38), z = c("D", "B", "A", "F", "H", 
> "P"))
>
>
>> df
>   id  x  y x.1 z
> 1  1 15 36  15 D
> 2  2 21 38  21 B
> 3  3 14 55  14 A
> 4  4 21 11  21 F
> 5  5 14  5  14 H
> 6  6 38 18  38 P
>
>
> Clearly columns x and x.1 are identical. In reality, I have a large 
> data.frame and can't make out which columns are identical, but I am sure that 
> column with name say x is repeated as x.1, x.2 etc.
>
> How to automatically identify and retain only one column (in this example 
> column x) among the identical columns besides other non-identical columns 
> (viz. id, y and z).
>
>
> Regards
>
> Katherine


[R] Better way of writing R code

2013-04-03 Thread Katherine Gobin
Dear R forum,

(Please note this is not a finance problem.)

I have two data.frames as 

currency_df = data.frame(current_date = c("3/4/2013", "3/4/2013", "3/4/2013", 
"3/4/2013"), issue_date = c("27/11/2012", "9/12/2012", "14/01/2013", 
"28/02/2013"), maturity_date = c("27/04/2013", "3/5/2013", "14/6/2013", 
"28/06/2013"), currency = c("USD", "USD", "GBP", "SEK"), other_currency = 
c("EURO", "CAD", "CHF", "USD"), transaction = c("Buy", "Buy", "Sell", "Buy"), 
units_currency = c(10, 25000, 15, 4), units_other_currency = 
c(78000, 25350, 99200, 6150)) 

rate_df = 
data.frame(date = 
c("28/3/2013","27/3/2013","26/3/2013","25/3/2013","28/3/2013","27/3/2013","26/3/2013",
 
"25/3/2013","28/3/2013","27/3/2013","26/3/2013","25/3/2013","28/3/2013","27/3/2013","26/3/2013",
 
"25/3/2013","28/3/2013","27/3/2013","26/3/2013","25/3/2013","28/3/2013","27/3/2013","26/3/2013",
 
"25/3/2013","28/3/2013","27/3/2013","26/3/2013","25/3/2013","28/3/2013","27/3/2013","26/3/2013",
 "25/3/2013","28/3/2013","27/3/2013","26/3/2013","25/3/2013"), 

currency =  c("USD","USD","USD","USD", "USD", "USD", "USD","USD","USD","USD", 
"USD","USD", "GBP","GBP","GBP","GBP","GBP","GBP","GBP","GBP", "GBP","GBP", 
"GBP","GBP", "EURO","EURO","EURO","EURO","EURO","EURO","EURO", "EURO", 
"EURO","EURO", "EURO","EURO"), 

tenor = c("1 day","1 day","1 day","1 day","1 week","1 week","1 week","1 week",
"2 weeks","2 weeks","2 weeks","2 weeks","1 day","1 day","1 day","1 day",
"1 week","1 week","1 week","1 week","2 weeks","2 weeks","2 weeks","2 weeks",
"1 day","1 day","1 day","1 day","1 week","1 week","1 week","1 week",
"2 weeks","2 weeks","2 weeks","2 weeks"), 

rate = 
c(0.156,0.157,0.157,0.155,0.1752,0.1752,0.1752,0.1752,0.1752,0.1752,0.1752,
 0.1752,0.48625, 
0.485,0.48625,0.4825,0.49,0.49125,0.4925,0.49,0.49375,0.49125,0.4925, 
0.49125,0.02643,0.02214, 
0.02214,0.01929,0.034,0.034,0.034125,0.034,0.044,0.044, 0.041,0.045))

# ___

# 1st data.frame
 
> currency_df
  current_date issue_date maturity_date currency
1     3/4/2013 27/11/2012    27/04/2013      USD
2     3/4/2013  9/12/2012      3/5/2013      USD
3     3/4/2013 14/01/2013     14/6/2013      GBP
4     3/4/2013 28/02/2013    28/06/2013      SEK
  other_currency transaction units_currency
1           EURO         Buy             10
2            CAD         Buy          25000
3            CHF        Sell             15
4            USD         Buy              4
  units_other_currency
1                78000
2                25350
3                99200
4                 6150

# 
...

# 2nd data.frame

> rate_df
        date currency   tenor     rate
1  28/3/2013      USD   1 day 0.156000
2  27/3/2013      USD   1 day 0.157000
3  26/3/2013      USD   1 day 0.157000
4  25/3/2013      USD   1 day 0.155000
5  28/3/2013      USD  1 week 0.175200
6  27/3/2013      USD  1 week 0.175200
7  26/3/2013      USD  1 week 0.175200
8  25/3/2013      USD  1 week 0.175200
9  28/3/2013      USD 2 weeks 0.175200
10 27/3/2013      USD 2 weeks 0.175200
11 26/3/2013      USD 2 weeks 0.175200
12 25/3/2013      USD 2 weeks 0.175200
13 28/3/2013      GBP   1 day 0.486250
14 27/3/2013      GBP   1 day 0.485000
15 26/3/2013      GBP   1 day 0.486250
16 25/3/2013      GBP   1 day 0.482500
17 28/3/2013      GBP  1 week 0.490000
18 27/3/2013      GBP  1 week 0.491250
19 26/3/2013      GBP  1 week 0.492500
20 25/3/2013      GBP  1 week 0.490000
21 28/3/2013      GBP 2 weeks 0.493750
22 27/3/2013      GBP 2 weeks 0.491250
23 26/3/2013      GBP 2 weeks 0.492500
24 25/3/2013      GBP 2 weeks 0.491250
25 28/3/2013     EURO   1 day 0.026430
26 27/3/2013     EURO   1 day 0.022140
27 26/3/2013     EURO   1 day 0.022140
28 25/3/2013     EURO   1 day 0.019290
29 28/3/2013     EURO  1 week 0.034000
30 27/3/2013     EURO  1 week 0.034000
31 26/3/2013     EURO  1 week 0.034125
32 25/3/2013     EURO  1 week 0.034000
33 28/3/2013     EURO 2 weeks 0.044000
34 27/3/2013     EURO 2 weeks 0.044000
35 26/3/2013     EURO 2 weeks 0.041000
36 25/3/2013     EURO 2 weeks 0.045000

# ___

Using plyr and reshape libraries, I have converted the rate_df into tabular 
form as

       date USD_1 day USD_1 week USD_2 weeks GBP_1 day
1 25/3/2013     0.155     0.1752      0.1752   0.48250
2 26/3/2013     0.157     0.1752      0.1752   0.48625
3 27/3/2013     0.157     0.1752      0.1752   0.48500
4 28/3/2013     0.156     0.1752      0.1752   0.48625
  GBP_1 week GBP_2 weeks EURO_1 day EURO_1 week
1    0.49000     0.49125    0.01929    0.034000
2    0.49250     0.49250    0.02214    0.034125
3    0.49125     0.49125    0.02214    0.034000
4    0.49000     0.49375    0.02643    0.034000
  EURO_2 weeks
1        0.045
2        0.041
3        0.044
4        0.044
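
The exact plyr/reshape calls behind the table above are not shown in the message. As a rough sketch only, assuming base R is acceptable and using a four-row subset of rate_df, a long-to-wide table of this shape could be built with tapply(). The column `key` and the object `wide` are illustrative names, not part of the original code.

```r
# Hypothetical subset of rate_df (two dates, two currencies, one tenor)
rate_df <- data.frame(
  date     = rep(c("28/3/2013", "27/3/2013"), times = 2),
  currency = rep(c("USD", "GBP"), each = 2),
  tenor    = "1 day",
  rate     = c(0.156, 0.157, 0.48625, 0.485),
  stringsAsFactors = FALSE
)

# Build the "currency_tenor" label used as the column name in the wide table
rate_df$key <- paste(rate_df$currency, rate_df$tenor, sep = "_")

# One row per date, one column per key; each cell holds a single rate,
# so mean() simply returns that rate
wide <- tapply(rate_df$rate, list(rate_df$date, rate_df$key), mean)
print(wide)
```

The result is a numeric matrix indexed by date and key, so a single rate can be looked up as wide["27/3/2013", "GBP_1 day"].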

# __

Depending on the maturity period, I hav

Re: [R] Better way of writing R code

2013-04-04 Thread Katherine Gobin
Dear Sirs,

I sincerely apologize for the blunder at my end. The problem is that I was told 
that one cannot or should not send any ATTACHMENTS. In the past, when I tried 
to attach some files, the message was displayed without the attachment. Also, 
at times it becomes very difficult to attach the csv file. 

As my input files are csv files, I was under the impression that we cannot 
attach files to this forum.

I once again apologize to all of you for the inconvenience caused.

Regards

Katherine



--- On Thu, 4/4/13, Gabor Grothendieck  wrote:

From: Gabor Grothendieck 
Subject: Re: [R] Better way of writing R code
To: "Adams, Jean" 
Cc: "Katherine Gobin" , "R help" 

Date: Thursday, 4 April, 2013, 2:48 PM

On Thu, Apr 4, 2013 at 9:32 AM, Adams, Jean  wrote:
> Katherine,
>
> You should cc the R-help on all correspondence.
> The more eyes that see your query, the quicker and probably the better the
> response will be.
> Send your message as plain text with no attachments ... so, include your
> code, and use dput() to share some example data.
>

Although many types of attachments are not allowed it seems that .txt,
.R, .png, .pdf and possibly certain other types are accepted.
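
As a small illustration of the dput() advice quoted above: this is a sketch with a made-up three-row data frame (the names `df`, `txt` and `df_copy` are illustrative). dput() prints R code that recreates the object, so the text can be pasted into a plain-text email and evaluated by the reader.

```r
# A tiny example data frame standing in for the real data
df <- data.frame(names = c("C", "A", "B"), values = c(10, 31, 17),
                 stringsAsFactors = FALSE)

# Print a text representation that can be pasted into a message
dput(df)

# A reader can rebuild the object from that text
txt <- paste(capture.output(dput(df)), collapse = "\n")
df_copy <- eval(parse(text = txt))
```

For large data sets, dput(head(df, 20)) keeps the pasted text short while still giving readers something to run.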

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Better way of writing R code

2013-04-06 Thread Katherine Gobin
Dear Sir,

Thanks a lot for your great help. I do appreciate it a lot. 

Regarding my earlier mail, where I had attached some files, I realized 
yesterday that instead of sending the R code customized by me based on your 
guidance, I had by mistake attached the contents of the email. I do apologize 
to you for the same. Thanks once again and sorry for the inconvenience caused 
by me.

Regards

Katherine

--- On Fri, 5/4/13, Adams, Jean  wrote:

From: Adams, Jean 
Subject: Re: [R] Better way of writing R code
To: "Katherine Gobin" 
Cc: "R help" 
Date: Friday, 5 April, 2013, 2:40 PM

Katherine,

To preserve the original order, you could create a new variable for the 
currency data frame (BEFORE the merges), then use this variable to reorder at 
the end.

     currency_df$orig.order <- 1:dim(currency_df)[1]

You can do another merge for the other currency, you just need to specify the 
columns that you want to merge by.  The rate information will be called rate.x 
for the first currency (from the first merge) and rate.y for the other currency 
(from the second merge).

     both2 <- merge(both, rate_df, by.x=c("other_currency", "tenor"), 
                    by.y=c("currency", "tenor"), all.x=TRUE)

Then reorder.

     both2 <- both2[order(both2$orig.order), ]

Jean
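
The two merges and the reordering step can be tried end to end on a toy example. This is only a sketch: the three-row currency_df and six-row rate_df below are made up for illustration (with the tenor column already present), not the poster's real data.

```r
# Toy portfolio: each row has a currency, a counter currency and a tenor
currency_df <- data.frame(
  currency       = c("USD", "USD", "GBP"),
  other_currency = c("EURO", "GBP", "USD"),
  tenor          = c("1 week", "2 weeks", "1 week"),
  stringsAsFactors = FALSE
)

# Toy rate table: one rate per currency and tenor
rate_df <- data.frame(
  currency = rep(c("USD", "GBP", "EURO"), each = 2),
  tenor    = rep(c("1 week", "2 weeks"), times = 3),
  rate     = c(0.1752, 0.1752, 0.49, 0.49375, 0.034, 0.044),
  stringsAsFactors = FALSE
)

# Remember the original row order BEFORE any merge
currency_df$orig.order <- seq_len(nrow(currency_df))

# First merge: rate of the main currency (matched on currency and tenor)
both <- merge(currency_df, rate_df, all.x = TRUE)

# Second merge: rate of the other currency; the clashing "rate" columns
# become rate.x (first merge) and rate.y (second merge)
both2 <- merge(both, rate_df,
               by.x = c("other_currency", "tenor"),
               by.y = c("currency", "tenor"), all.x = TRUE)

# Restore the original portfolio order
both2 <- both2[order(both2$orig.order), ]
print(both2)
```

With all.x = TRUE, a row whose currency or tenor has no match in rate_df is kept and gets NA for the corresponding rate, which makes missing rates easy to spot.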



On Thu, Apr 4, 2013 at 3:19 AM, Katherine Gobin  
wrote:


Dear Mr Adams,



I sincerely apologize for taking the liberty of writing to you. I 
wholeheartedly thank you for the wonderful solution you had provided me 
yesterday. I have customized the R code you had provided and it's yielding the 
results. I can't imagine me repeating the 1 lines code after receiving such 
a powerful solution from you. In future it will save a lot of effort on my 
side as I always deal with such situations. 



There is one small problem though - 

I am dealing with pair of currencies 

e.g. currency    other_currency    transaction
     USD         EURO              Buy
     USD         CAD               Buy
     GBP         CHF               Sell
     SEK         USD               Buy


The R code gives me the currency rates (w.r.t. the appropriate "tenor"). 
However, I need the corresponding rates pertaining to the other currency too, 
i.e. in the first case, the maturity period applicable is one month, so the R 
code gives me the one month LIBOR w.r.t. USD, but I also need the corresponding 
one month LIBOR w.r.t. the other currency, i.e. EURO in this case.

I tried to improve upon the merge statement and used "?merge", but couldn't. 
Another problem is that the order of the original portfolio is not maintained, 
but I think I can manage the order.

With warm regards



Katherine








--- On Wed, 3/4/13, Adams, Jean  wrote:



From: Adams, Jean 
Subject: Re: [R] Better way of writing R code
To: "Katherine Gobin" 


Cc: "R help" 
Date: Wednesday, 3 April, 2013, 2:08 PM

Katherine,


You don't need to convert rate_df into tabular form.  You just need to 
categorize each row in currency_df into a "tenor".  Then you can merge the two 
data frames (by currency and tenor).  For example ...




# convert dates to R dates, to calculate the number of days to maturity
# I am assuming this is the number of days from the current date to the 
# maturity date

currency_df$maturity <- as.Date(currency_df$maturity_date, "%d/%m/%Y")
currency_df$current <- as.Date(currency_df$current_date, "%d/%m/%Y")
currency_df$days2mature <- as.numeric(currency_df$maturity - 
    currency_df$current)

# categorize the number of days to maturity as you wish
# you may need to change the breaks= option to suit your needs
# read about the cut function to make sure you get the cut points included 
# in the proper category, ?cut

currency_df$tenor <- cut(currency_df$days2mature,
    breaks=c(0, 1, 7, 14, seq(from=30.5, length=12, by=30.5)),
    labels=c("1 day", "1 week", "2 weeks", "1 month", paste(2:12, "months")))

# merge the currency_df and rate_df
# this will work better with real data, since the example data you provided 
# didn't have matching tenors

both <- merge(currency_df, rate_df, all.x=TRUE)




Jean
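
To see which category the cut points themselves fall into, the breaks from the code above can be applied to a few hand-picked day counts. This is a sketch: the vector `days` is made up for illustration, and it relies on the default right = TRUE of cut(), which makes each interval closed on the right.

```r
# Same breaks and labels as in the reply above
breaks <- c(0, 1, 7, 14, seq(from = 30.5, length.out = 12, by = 30.5))
labels <- c("1 day", "1 week", "2 weeks", "1 month", paste(2:12, "months"))

# Hand-picked day counts, including values that sit exactly on a break
days <- c(1, 7, 8, 14, 24, 30.5, 31)

# Intervals are (0,1], (1,7], (7,14], (14,30.5], (30.5,61], ...
tenor <- cut(days, breaks = breaks, labels = labels)
print(data.frame(days = days, tenor = tenor))
```

With these defaults a maturity of exactly 7 days is labelled "1 week" and 24 days is labelled "1 month". Passing right = FALSE would shift each boundary value into the next category, and values of 0 or above 366 days come back as NA.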


On Wed, Apr 3, 2013 at 5:21 AM, Katherine Gobin  
wrote:




Dear R forum,



(Pl note this is not a finance problem)



I have two data.frames as



currency_df = data.frame(current_date = c("3/4/2013", "3/4/2013", "3/4/2013", 
"3/4/2013"), issue_date = c("27/11/2012", "9/12/2012", "14/01/2013", 
"28/02/2013"), maturity_date = c("27/04/2013", "3/5/2013", "14/6/2013", 
"28/06/2013"), currency = c("USD", "USD", "GBP", "SEK"), other_currency = 
c("EURO", "CAD", "CHF", "USD"), transactio

[R] lmomco - Three-Parameter Pearson 5 Distribution

2013-04-06 Thread Katherine Gobin
Dear R forum,

I am a bit confused; please guide me -

(1) Is the "Pearson Type III Distribution" as given in the lmomco package the 
same as the Three Parameter Pearson 5 Distribution?

If not, how do I estimate the parameters of the Three Parameter Pearson 5 
Distribution?

(2) Is there any other R forum dealing with only Statistical queries?

Kindly guide

Regards

Katherine



[R] Package ‘FAdist’ - Log-Pearson Type III Distribution

2013-04-06 Thread Katherine Gobin
Dear Sir,

I am referring to your package "FAdist". I wish to know how to estimate the 
parameters of the distribution - "Log-Pearson Type III Distribution"?

Will it be possible for you to guide me or inform the package in R, I can use 
to estimate the parameters.

Regards

Katherine



[R] Sorting data.frame and again sorting within data.frame

2013-04-14 Thread Katherine Gobin
Dear R forum,

I have a data.frame as defined below - 

df = data.frame(names = c("C", "A", "A", "B", "C", "B", "A", "B", "C"), dates = 
c("4/15/2013", "4/13/2013", "4/15/2013", "4/13/2013", "4/13/2013", "4/15/2013", 
"4/14/2013", "4/14/2013","4/14/2013" ),values = c(10, 31, 31, 17, 11, 34, 102, 
47, 29))

> df
  names     dates values
1     C 4/15/2013     10
2     A 4/13/2013     31
3     A 4/15/2013     31
4     B 4/13/2013     17
5     C 4/13/2013     11
6     B 4/15/2013     34
7     A 4/14/2013    102
8     B 4/14/2013     47
9     C 4/14/2013     29

I need to sort df first on "names" in increasing order and then further on 
"dates" in a decreasing order i.e. I need

names    dates      values
A        4/15/2013      31
A        4/14/2013     102
A        4/13/2013      31
B        4/15/2013      34
B        4/14/2013      47
B        4/13/2013      17
C        4/15/2013      10
C        4/14/2013      29
C        4/13/2013      11

I tried

df_sorted = df[order(df$names, (as.Date(df$dates, "%m/%d/%Y")), decreasing = 
TRUE),]

> df_sorted
  names     dates values
1     C 4/15/2013     10
9     C 4/14/2013     29
5     C 4/13/2013     11
6     B 4/15/2013     34
8     B 4/14/2013     47
4     B 4/13/2013     17
3     A 4/15/2013     31
7     A 4/14/2013    102
2     A 4/13/2013     31


I need A to appear first with all three corresponding dates in decreasing 
order, then B and so on.

Please guide.

With regards

Katherine




Re: [R] Sorting data.frame and again sorting within data.frame

2013-04-16 Thread Katherine Gobin
Dear Sir,

Thanks a lot for your valuable input and guidance.

Regards

Katherine

--- On Mon, 15/4/13, Jeff Newmiller  wrote:

From: Jeff Newmiller 
Subject: Re: [R] Sorting data.frame and again sorting within data.frame
To: "David Winsemius" , "Katherine Gobin" 

Cc: r-help@r-project.org
Date: Monday, 15 April, 2013, 5:33 PM

Yes, that would be because she converted to Date on the fly in her example, and 
so apparently did not need this reminder.
---
Jeff Newmiller                        The     .       .  Go Live...
DCN:        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---
Sent from my phone. Please excuse my brevity.

David Winsemius  wrote:

>
>On Apr 14, 2013, at 11:01 PM, Katherine Gobin wrote:
>
>> Dear R forum,
>> 
>> I have a data.frame as defied below - 
>> 
>> df = data.frame(names = c("C", "A", "A", "B", "C", "B", "A", "B",
>"C"), dates = c("4/15/2013", "4/13/2013", "4/15/2013", "4/13/2013",
>"4/13/2013", "4/15/2013", "4/14/2013", "4/14/2013","4/14/2013" ),values
>= c(10, 31, 31, 17, 11, 34, 102, 47, 29))
>> 
>>> df
>>   names     dates values
>> 1     C 4/15/2013     10
>> 2     A 4/13/2013     31
>> 3     A 4/15/2013     31
>> 4     B 4/13/2013     17
>> 5     C 4/13/2013     11
>> 6     B
>> 4/15/2013     34
>> 7     A 4/14/2013    102
>> 8     B 4/14/2013     47
>> 9     C 4/14/2013     29
>> 
>> I need to sort df first on "names" in increasing order and then
>further on "dates" in a decreasing order i.e. I need
>> 
>
>So far no one has pointed out that these are not really "Dates" in the
>R sense and will not sort correctly if any of the proposed methods are
>applied to sequences that extend beyond 6 months, i.e., until October
>forward. You would be advised to convert to real Date-classed
>variables.
>
>?strptime
>?as.Date
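
A minimal sketch of the advice above, using the data from the original question: the dates are converted to the Date class first, and the mixed ordering (names ascending, dates descending) is obtained by negating the numeric value of the dates inside order(). This is one possible approach, not necessarily the one proposed elsewhere in the thread.

```r
df <- data.frame(
  names  = c("C", "A", "A", "B", "C", "B", "A", "B", "C"),
  dates  = c("4/15/2013", "4/13/2013", "4/15/2013", "4/13/2013", "4/13/2013",
             "4/15/2013", "4/14/2013", "4/14/2013", "4/14/2013"),
  values = c(10, 31, 31, 17, 11, 34, 102, 47, 29),
  stringsAsFactors = FALSE
)

# Convert the text dates to real Date objects (month/day/year)
df$dates <- as.Date(df$dates, format = "%m/%d/%Y")

# names ascending, then dates descending (negated day counts sort newest first)
df_sorted <- df[order(df$names, -as.numeric(df$dates)), ]
print(df_sorted)
```

The decreasing = TRUE argument of order() applies to all sort keys at once, which is why the original attempt reversed the names as well; negating only the date key avoids that.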




[R] Splitting the Elements of character vector

2013-04-16 Thread Katherine Gobin
Dear R forum

I have a data.frame 

df = data.frame(currency_type = c("EURO_o_n", "EURO_o_n", "EURO_1w", "EURO_1w", 
"USD_o_n", "USD_o_n", "USD_1w", "USD_1w"), rates = c(0.47, 0.475, 0.461, 0.464, 
1.21, 1.19, 1.41, 1.43))

  currency_type rates
1      EURO_o_n 0.470
2      EURO_o_n 0.475
3       EURO_1w 0.461
4       EURO_1w 0.464
5       USD_o_n 1.210
6       USD_o_n 1.190
7        USD_1w 1.410
8        USD_1w 1.430


I need to split the values appearing under currency_type to obtain following 
data.frame in the "original order"

currency tenor rates
EURO     o_n   0.470
EURO     o_n   0.475
EURO     1w    0.461
EURO     1w    0.464
USD      o_n   1.210
USD      o_n   1.190
USD      1w    1.410
USD      1w    1.430

Basically I need to split the currency name and tenors.

I tried

strsplit(df$currency_type, "_")
Error in strsplit(df$currency_type, "_") : non-character argument
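
The error above typically arises because data.frame() stored currency_type as a factor (the default before R 4.0.0), and strsplit() accepts only character input. The following is a sketch of one way around it, assuming the currency name never contains an underscore: convert with as.character() and split at the first underscore only, since a tenor such as "o_n" contains an underscore itself. The object names `x` and `out` are illustrative.

```r
df <- data.frame(
  currency_type = c("EURO_o_n", "EURO_o_n", "EURO_1w", "EURO_1w",
                    "USD_o_n", "USD_o_n", "USD_1w", "USD_1w"),
  rates = c(0.47, 0.475, 0.461, 0.464, 1.21, 1.19, 1.41, 1.43)
)

# Works whether currency_type is a factor or already character
x <- as.character(df$currency_type)

out <- data.frame(
  currency = sub("_.*$", "", x),      # everything before the first "_"
  tenor    = sub("^[^_]*_", "", x),   # everything after the first "_"
  rates    = df$rates,
  stringsAsFactors = FALSE
)
print(out)
```

Because the columns are built row by row from the original vector, the original order of the data frame is preserved.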

Kindly guide

Katherine



[R] Creating a vector with repeating dates

2013-04-17 Thread Katherine Gobin
Dear R forum

I have a data.frame

df = data.frame(dates = c("4/15/2013", "4/14/2013", "4/13/2013", "4/12/2013"), 
values = c(47, 38, 56, 92))

I need to create a vector by repeating the dates as 

"Current_date", 4/15/2013, 4/14/2013, 4/13/2013, 4/12/2013,  "Current_date", 
4/15/2013, 4/14/2013, 4/13/2013, 4/12/2013, Current_date, 4/15/2013, 4/14/2013, 
4/13/2013, 4/12/2013

i.e. I need to create a new vector as given below which I need to use for some 
other purpose.

Current_date
4/15/2013
4/14/2013
4/13/2013
4/12/2013
Current_date
4/15/2013
4/14/2013
4/13/2013
4/12/2013
Current_date
4/15/2013
4/14/2013
4/13/2013
4/12/2013

Is it possible to construct such a column?

Regards

Katherine





Re: [R] Creating a vector with repeating dates

2013-04-17 Thread Katherine Gobin
Dear Andrija Djurovic,

Thanks for the suggestion. I am aware of "rep". However, here I need to repeat 
not only the dates, but also the string "Current_date". Thus, I need to create 
a vector (to be included in some other data.frame) with the name say "dt" which 
will contain

dt
Current_date
4/15/2013
4/14/2013
4/13/2013
4/12/2013
Current_date
4/15/2013
4/14/2013
4/13/2013
4/12/2013
Current_date
4/15/2013
4/14/2013
4/13/2013
4/12/2013

So this is a combination of dates and a string. Hence, I am just wondering if 
it is possible to create such a vector or not?

Regards

Katherine


--- On Wed, 17/4/13, andrija djurovic  wrote:

From: andrija djurovic 
Subject: Re: [R] Creating a vector with repeating dates
To: "Katherine Gobin" 
Cc: "r-help@r-project.org" 
Date: Wednesday, 17 April, 2013, 10:14 AM

?rep



Re: [R] Creating a vector with repeating dates

2013-04-17 Thread Katherine Gobin
Dear Sir,

Thanks a lot for your valuable suggestions and help. 

Regards

Katherine


--- On Wed, 17/4/13, Jim Lemon  wrote:

From: Jim Lemon 
Subject: Re: [R] Creating a vector with repeating dates
To: "Katherine Gobin" 
Cc: r-help@r-project.org
Date: Wednesday, 17 April, 2013, 10:35 AM

On 04/17/2013 07:11 PM, Katherine Gobin wrote:
> Dear R forum
>
> I have a data.frame
>
> df = data.frame(dates = c("4/15/2013", "4/14/2013", "4/13/2013", 
> "4/12/2013"), values = c(47, 38, 56, 92))
>
> I need to to create a vector by repeating the dates as
>
> "Current_date", 4/15/2013, 4/14/2013, 4/13/2013, 4/12/2013,  "Current_date", 
> 4/15/2013, 4/14/2013, 4/13/2013, 4/12/2013, Current_date, 4/15/2013, 
> 4/14/2013, 4/13/2013, 4/12/2013
>
> i.e. I need to create a new vector as given below which I need to use for 
> some other purpose.
>
> Current_date
> 4/15/2013
> 4/14/2013
> 4/13/2013
> 4/12/2013
> Current_date
> 4/15/2013
> 4/14/2013
> 4/13/2013
> 4/12/2013
> Current_date
> 4/15/2013
> 4/14/2013
> 4/13/2013
> 4/12/2013
>
> Is it possible to construct such a
>   column?
>
Hi Katherine,
How about:

rep(c("Current date",paste(4,15:12,2013,sep="/")),3)
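
A variation on the same idea that avoids typing the dates by hand: this sketch takes them from the df of the original question and uses the label "Current_date" as spelled there. The number of repetitions (3) is taken from the example, and `dt` is the vector name used in the thread.

```r
df <- data.frame(
  dates  = c("4/15/2013", "4/14/2013", "4/13/2013", "4/12/2013"),
  values = c(47, 38, 56, 92)
)

# as.character() guards against dates having been stored as a factor
dt <- rep(c("Current_date", as.character(df$dates)), times = 3)
print(dt)
```

The result is an ordinary character vector of length 15; mixing a label with dates only works because the dates are kept as text rather than converted to the Date class.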

Jim




[R] Error with function

2013-04-22 Thread Katherine Gobin
Dear R forum,

I have a data.frame as given below:

df = data.frame(tran = c("tran1", "tran2", "tran3", "tran4"), tenor = c("2w", 
"1m", "7m", "3m"))  

Also, I define

libor_tenor_labels = as.character(c("o_n", "1w", "2w", 
"1m", "2m", "3m", "4m", "5m", "6m", "7m", "8m", "9m", "10m", "11m", 
"12m"))

# 

> df
   tran tenor
1 tran1    2w
2 tran2    1m
3 tran3    7m
4 tran4    3m

# __

# libor_tenor_labels can be anything and need not be 15. Also, df need not be 
consisting of only 4 records. Basically, I can't HARD CODE anything.

In df, first tenor is 2w. So I need to define a previous tenor as "1w" and next 
tenor as "1m" i.e. I need the output

> df_new
   tran tenor prev_tenor nxt_tenor
1 tran1    2w 1w    1m
2 tran2    1m 2w    2m
3 tran3    7m 6m    8m
4 tran4    3m 2m    4m

# ___

# I have two special cases also. If the tenor is "o_n" or "12m" i.e. extremes, 
I needed to adjust the rates as given in code.

# My code
# ==


tenor_function = function(tran, tenor)

{

if (tenor == libor_tenor_labels[1])
   
   {
 prev_tenor = libor_tenor_labels[1]
 nxt_tenor = libor_tenor_labels[2]
   }

for (i in 2:(length(libor_tenor_labels)-1))

   {
if (tenor == libor_tenor_labels[i])
         {
 prev_tenor = libor_tenor_labels[i-1]
 nxt_tenor = libor_tenor_labels[i+1]
   }
  }

 if (tenor == libor_tenor_labels[length(libor_tenor_labels)])
 {
 prev_tenor = libor_tenor_labels[(length(libor_tenor_labels)-1)]
 nxt_tenor = libor_tenor_labels[length(libor_tenor_labels)]
 }

   return(data.frame(tran = tran, prev_tenor = prev_tenor, tenor = tenor, 
nxt_tenor = nxt_tenor)

}

(tenor_libors = ddply(.data = df, .variables = "tran", .fun = function(x) 
tenor_function(tran = x$tran, tenor = x$tenor)))

# __

# ERROR - I get following error


Error: unexpected '}' in:
"
}"
> 
> (tenor_libors = ddply(.data = df, .variables = "tran", .fun = function(x) 
> tenor_function(tran = x$tran, tenor = x$tenor)))
Error in .fun(piece, ...) : could not find function "tenor_function"
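
The first message above is a parse error: R could not read the function definition at all, so tenor_function was never created, and the second error follows from that. As a sketch of how such a definition might be checked in isolation, the helper below (a hypothetical name, `check_syntax`) runs parse() on a piece of code held as text and returns either "OK" or the parser's message. The two code strings are shortened stand-ins for the posted function.

```r
# Returns "OK" if the code parses, otherwise the parser's error message
check_syntax <- function(code) {
  tryCatch({
    parse(text = code)
    "OK"
  }, error = function(e) conditionMessage(e))
}

# Shortened stand-in: return() is missing its closing parenthesis
bad  <- "f = function(x)\n{\n  return(data.frame(a = x)\n}"
# Same code with the parenthesis closed
good <- "f = function(x)\n{\n  return(data.frame(a = x))\n}"

res_bad  <- check_syntax(bad)
res_good <- check_syntax(good)
cat(res_bad, "\n")
cat(res_good, "\n")
```

The parser reports the position where it gave up (the closing brace), not the position of the missing parenthesis, so the line just before the reported one is usually the place to look.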

# __

Kindly guide

With warm regards

Katherine


[R] Fw: Error with function - USING library(plyr)

2013-04-22 Thread Katherine Gobin
Dear R forum,

Please refer to my query regarding "Error with function". I forgot to mention 
that I am using "plyr" library. 

Sorry for inconvenience.

Regards

Katherine





[R] Fw: " PROBLEM SOLVED" - Error with function

2013-04-22 Thread Katherine Gobin
Dear R forum

Please refer to my query captioned Error with function.

I had missed a closing bracket ")" in the return statement and hence I was 
getting the error. I had struggled for more than 2 hours to find the problem 
and only then posted to the forum. I sincerely apologize to all for consuming 
your valuable time.
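
With the bracket closed, the posted function works row by row. As an alternative sketch (not the code from the thread), the same previous/next lookup can be done for all rows at once with match(), without plyr and without hard-coding the number of labels. The index clamping with pmax()/pmin() reproduces the two special cases described in the original post: the first label keeps itself as previous tenor and the last label keeps itself as next tenor.

```r
libor_tenor_labels <- c("o_n", "1w", "2w", "1m", "2m", "3m", "4m", "5m",
                        "6m", "7m", "8m", "9m", "10m", "11m", "12m")

df <- data.frame(tran  = c("tran1", "tran2", "tran3", "tran4"),
                 tenor = c("2w", "1m", "7m", "3m"),
                 stringsAsFactors = FALSE)

# Position of each tenor in the label vector (NA if a tenor is not listed)
idx <- match(df$tenor, libor_tenor_labels)
n   <- length(libor_tenor_labels)

# Step one position back and forward, clamped to the valid range 1..n
df$prev_tenor <- libor_tenor_labels[pmax(idx - 1, 1)]
df$nxt_tenor  <- libor_tenor_labels[pmin(idx + 1, n)]
print(df)
```

A tenor that does not appear in libor_tenor_labels yields NA in both new columns instead of an error, which may or may not be the desired behaviour.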

Thanks for the efforts at your end.

Regards

Katherine







[R] Linear Interpolation : Missing rates

2013-04-24 Thread Katherine Gobin
Dear R forum

I have data.frame as

df = data.frame(rate_name = c("USD_1w", "USD_1w", "USD_1w", "USD_1w", "USD_1m", 
"USD_1m", "USD_1m", "USD_1m", "USD_2m", "USD_2m", "USD_2m", "USD_2m",  
"GBP_1w", "GBP_1w", "GBP_1w", "GBP_1w", "GBP_1m", "GBP_1m", "GBP_1m", "GBP_1m", 
"GBP_2m", "GBP_2m", "GBP_2m", "GBP_2m", "EURO_1w", "EURO_1w", "EURO_1w", 
"EURO_1w", "EURO_2w", "EURO_2w", "EURO_2w", "EURO_2w", "EURO_2m", "EURO_2m", 
"EURO_2m", "EURO_2m"), rates = c(2.05, 2.07, 2.06, 2.06, 2.22, 2.24, 2.23, 
2.23, 2.31, 2.33, 2.33, 2.31, 1.06, 1.08, 1.08, 1.08, 1.21, 1.21, 1.23, 1.21, 
1.41, 1.39, 1.39, 1.37, 1.82, 1.82, 1.81, 1.80, 1.98, 1.98, 1.97, 1.97, 2.1, 
2.09, 2.09, 2.11)) 

currency = c("EURO", "GBP", "USD")
tenor = c("1w", "2w", "1m", "2m", "3m")

# _

> df
   rate_name rates
1     USD_1w  2.05
2     USD_1w  2.07
3     USD_1w  2.06
4     USD_1w  2.06
5     USD_1m  2.22
6     USD_1m  2.24
7     USD_1m  2.23
8     USD_1m  2.23
9     USD_2m  2.31
10    USD_2m  2.33
11    USD_2m  2.33
12    USD_2m  2.31
13    GBP_1w  1.06
14    GBP_1w  1.08
15    GBP_1w  1.08
16    GBP_1w  1.08
17    GBP_1m  1.21
18    GBP_1m  1.21
19    GBP_1m  1.23
20    GBP_1m  1.21
21    GBP_2m  1.41
22    GBP_2m  1.39
23    GBP_2m  1.39
24    GBP_2m  1.37
25   EURO_1w  1.82
26   EURO_1w  1.82
27   EURO_1w  1.81
28   EURO_1w  1.80
29   EURO_2w  1.98
30   EURO_2w  1.98
31   EURO_2w  1.97
32   EURO_2w  1.97
33   EURO_2m  2.10
34   EURO_2m  2.09
35   EURO_2m  2.09
36   EURO_2m  2.11

As can be seen, USD_2w, GBP_2w and EURO_1m are missing and I need to 
INTERPOLATE these rates, which can be done using approx or approxfun. In 
reality I can have many currencies with many tenors. The problem is that when 
the data.frame "df" is read or accessed in R, I am not aware which tenor is 
missing. For a given currency, it is possible that more than 1 consecutive 
tenors may be missing e.g. in case of EURO, I may have EURO_1w, EURO_2w and 
then EURO_4m. So EURO_1m, EURO_2m and EURO_3m are missing. 
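
A rough sketch of the approx() route mentioned above, under several assumptions that are not in the original post: one rate per currency and tenor (here the averages of the four posted observations), a hand-made mapping from tenor label to days (`tenor_days`), and a hypothetical helper `fill_tenors`. For each currency it interpolates linearly over the full tenor grid, so whichever tenors happen to be missing are filled in without naming them.

```r
# Assumed day counts for each tenor label (illustrative values)
tenor_days <- c("1w" = 7, "2w" = 14, "1m" = 30.5, "2m" = 61)

# One observed rate per currency/tenor; USD lacks 2w, EURO lacks 1m
obs <- data.frame(
  currency = rep(c("USD", "EURO"), each = 3),
  tenor    = c("1w", "1m", "2m", "1w", "2w", "2m"),
  rate     = c(2.06, 2.23, 2.32, 1.8125, 1.975, 2.0975),
  stringsAsFactors = FALSE
)

# Interpolate every currency over the complete tenor grid
fill_tenors <- function(obs, tenor_days) {
  pieces <- lapply(split(obs, obs$currency), function(d) {
    known_days <- unname(tenor_days[d$tenor])
    data.frame(
      currency = d$currency[1],
      tenor    = names(tenor_days),
      rate     = approx(known_days, d$rate, xout = unname(tenor_days))$y,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, pieces)
}

full <- fill_tenors(obs, tenor_days)
print(full)
```

Observed tenors keep their rates and missing ones are interpolated between their nearest neighbours in days. With the default rule = 1, a tenor outside the observed range (for example a 3m tenor when only rates up to 2m exist) comes back as NA rather than being extrapolated.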


I understand it's sort of vague question from me and do apologize for the same. 
Any suggestion please.

Regards

Katherine







Re: [R] Linear Interpolation : Missing rates

2013-04-25 Thread Katherine Gobin
Dear Mr Adams,

Thanks a lot for your solution. I understand it was very tricky and needed lot 
of application. Thanks again and do appreciate your efforts.

Regards

Katherine

--- On Thu, 25/4/13, Adams, Jean  wrote:

From: Adams, Jean 
Subject: Re: [R] Linear Interpolation : Missing rates
To: "Katherine Gobin" 
Cc: "R help" 
Date: Thursday, 25 April, 2013, 2:23 PM

Katherine,
Split the rate names into their currency and tenor parts and assign a numeric 
value to each tenor.  Choose a model to do your approximations (I used linear 
regression in the example below).  Use this model to generate estimates for all 
combinations of currency and tenor.


For example:

# split the rate names into currency and tenor
# (as.character() guards against rate_name being stored as a factor)
splitnames <- do.call(rbind, strsplit(as.character(df$rate_name), "_"))
df$currency <- as.factor(splitnames[, 1])
df$tenor <- splitnames[, 2]

# assign numeric value to each tenor
uniquetenors <- c("1w", "2w", "1m", "2m")
uniquedays <- c(7, 14, 30.5, 61)
df$tenordays <- uniquedays[match(df$tenor, uniquetenors)]

# fit a linear model of rate on tenordays for each currency
fit <- lm(rates ~ currency*tenordays, data=df)

# estimate rates for all combinations of currency and tenor
fulldf <- expand.grid(tenordays=unique(df$tenordays), 
                      currency=unique(df$currency))
fulldf$est.rates <- predict(fit, newdata=fulldf)

# merge observed rates with estimated rates
dfwithest <- merge(df, fulldf, all=TRUE)

Jean
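
The reply above fits a regression; the original question asked about approx. A minimal sketch of per-currency linear interpolation follows, assuming the same tenor-to-days mapping as above (7, 14, 30.5, 61) and a reduced copy of the posted data. The helper name interpolate_currency and the column names days, rate and observed are hypothetical. Repeated observations for a tenor are averaged first, and tenors outside the observed range come back as NA because approx does not extrapolate by default.

```r
# Reduced copy of the posted data: USD lacks 2w, EURO lacks 1m
df <- data.frame(
  rate_name = rep(c("USD_1w", "USD_1m", "USD_2m",
                    "EURO_1w", "EURO_2w", "EURO_2m"), each = 2),
  rates = c(2.05, 2.07, 2.22, 2.24, 2.31, 2.33,
            1.82, 1.82, 1.98, 1.98, 2.10, 2.09),
  stringsAsFactors = FALSE
)

# Assumed mapping from tenor label to days (same values as in the reply)
tenor_days <- c("1w" = 7, "2w" = 14, "1m" = 30.5, "2m" = 61)

# Split "USD_1w" into currency and tenor, then attach the day count
parts <- do.call(rbind, strsplit(df$rate_name, "_", fixed = TRUE))
df$currency <- parts[, 1]
df$days <- unname(tenor_days[parts[, 2]])

# Average repeated observations so each currency/tenor has one rate
avg <- aggregate(rates ~ currency + days, data = df, FUN = mean)

# Interpolate one currency over the full tenor grid
interpolate_currency <- function(d) {
  est <- approx(x = d$days, y = d$rates, xout = tenor_days)$y
  data.frame(currency = d$currency[1],
             tenor    = names(tenor_days),
             days     = unname(tenor_days),
             rate     = est,
             observed = tenor_days %in% d$days)
}

result <- do.call(rbind, lapply(split(avg, avg$currency), interpolate_currency))
rownames(result) <- NULL
print(result)
```

Rows with observed equal to FALSE are the tenors that were missing from the input, so the missing tenors do not need to be known in advance.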



On Thu, Apr 25, 2013 at 12:33 AM, Katherine Gobin  
wrote:


Dear R forum



I have data.frame as



df = data.frame(rate_name = c("USD_1w", "USD_1w", "USD_1w", "USD_1w", "USD_1m", 
"USD_1m", "USD_1m", "USD_1m", "USD_2m", "USD_2m", "USD_2m", "USD_2m",  
"GBP_1w", "GBP_1w", "GBP_1w", "GBP_1w", "GBP_1m", "GBP_1m", "GBP_1m", "GBP_1m", 
"GBP_2m", "GBP_2m", "GBP_2m", "GBP_2m", "EURO_1w", "EURO_1w", "EURO_1w", 
"EURO_1w", "EURO_2w", "EURO_2w", "EURO_2w", "EURO_2w", "EURO_2m", "EURO_2m", 
"EURO_2m", "EURO_2m"), rates = c(2.05, 2.07, 2.06, 2.06, 2.22, 2.24, 2.23, 
2.23, 2.31, 2.33, 2.33, 2.31, 1.06, 1.08, 1.08, 1.08, 1.21, 1.21, 1.23, 1.21, 
1.41, 1.39, 1.39, 1.37, 1.82, 1.82, 1.81, 1.80, 1.98, 1.98, 1.97, 1.97, 2.1, 
2.09, 2.09, 2.11))





currency = c("EURO", "GBP", "USD")

tenor = c("1w", "2w", "1m", "2m", "3m")



# _



> df

   rate_name rates

1 USD_1w  2.05

2 USD_1w  2.07

3 USD_1w  2.06

4 USD_1w  2.06

5 USD_1m  2.22

6 USD_1m  2.24

7 USD_1m  2.23

8 USD_1m  2.23

9 USD_2m  2.31

10    USD_2m  2.33

11    USD_2m  2.33

12    USD_2m  2.31

13    GBP_1w  1.06

14    GBP_1w  1.08

15    GBP_1w  1.08

16    GBP_1w  1.08

17    GBP_1m  1.21

18    GBP_1m  1.21

19    GBP_1m  1.23

20    GBP_1m  1.21

21    GBP_2m  1.41

22    GBP_2m  1.39

23    GBP_2m  1.39

24    GBP_2m  1.37

25   EURO_1w  1.82

26   EURO_1w  1.82

27   EURO_1w  1.81

28   EURO_1w  1.80

29   EURO_2w  1.98

30   EURO_2w  1.98

31   EURO_2w  1.97

32   EURO_2w  1.97

33   EURO_2m  2.10

34   EURO_2m  2.09

35   EURO_2m  2.09

36   EURO_2m  2.11



As can be seen, USD_2w, GBP_2w and EURO_1m are missing, and I need to 
INTERPOLATE these rates, which can be done using approx or approxfun. In 
reality I can have many currencies with many tenors. The problem is that when the 
data.frame "df" is read or accessed in R, I do not know which tenors are 
missing. For a given currency, more than one consecutive 
tenor may be missing, e.g. in the case of EURO, I may have EURO_1w, EURO_2w and 
then EURO_4m, so EURO_1m, EURO_2m and EURO_3m are missing.







I understand it's sort of vague question from me and do apologize for the same. 
Any suggestion please.



Regards



Katherine


















[R] Splitting data.frame and saving to csv files

2013-04-26 Thread Katherine Gobin
Dear R Forum,

I have a data.frame as

df = data.frame(date = c("2013-04-15", "2013-04-14", "2013-04-13", 
"2013-04-12", "2013-04-11"),
ABC_f = c(62.80739769,81.04525895,84.65712455,12.78237251,57.61345256),
LMN_d = c(21.16794336,54.6580401,63.8923307,87.59880367,87.07693716),
XYZ_p = c(55.8885464,94.1358684,84.0089114,98.99746696,64.71083712),
LMN_a = c(56.6768395,25.81530198,40.12268441,35.74175237,47.95892209),
ABC_e = c(11.36783959,62.29651784,47.63481552,32.27820673,52.12561419),
LMN_c = c(45.4484695,17.72362438,36.7690054,68.58912931,35.80767235), 
XYZ_zz = c(85.74755089,63.48582415,81.61107212,58.1572924,27.44132817),
PQR = c(71.22867519,95.09994812,83.62437819,30.18524735,25.81804865),
ABC_d = c(38.71089816,93.48216193,93.14432203,78.2738731,31.87170019),
ABC_m = c(40.28473769,43.97076327,47.38761559,97.33573412,22.06884976))


> df
    date    ABC_f    LMN_d    XYZ_p    LMN_a    ABC_e
1 2013-04-15 62.80740 21.16794 55.88855 56.67684 11.36784
2 2013-04-14 81.04526 54.65804 94.13587 25.81530 62.29652
3 2013-04-13 84.65712 63.89233 84.00891 40.12268 47.63482
4 2013-04-12 12.78237 87.59880 98.99747 35.74175 32.27821
5 2013-04-11 57.61345 87.07694 64.71084 47.95892 52.12561
 LMN_c   XYZ_zz  PQR    ABC_d    ABC_m
1 45.44847 85.74755 71.22868 38.71090 40.28474
2 17.72362 63.48582 95.09995 93.48216 43.97076
3 36.76901 81.61107 83.62438 93.14432 47.38762
4 68.58913 58.15729 30.18525 78.27387 97.33573
5 35.80767 27.44133 25.81805 31.87170 22.06885

I need to identify columns with the same label prefix and, along with the dates 
in the first column, save each group of columns to a separate csv file.

E.g. in the above data frame, I have 4 columns beginning with ABC, so I need to 
save these four columns with the date in the first column as ABC.csv, then 
LMN_d, LMN_a, LMN_c in the LMN.csv file as date, LMN_a, LMN_c, LMN_d, and so on. 
In my actual data.frame, I won't know in advance how many such combinations are 
present. If a column such as "PQR" has no matching columns, the PQR.csv file 
should have only the date and PQR columns. 

Kindly guide me on how to split the data.frame and save the respective csv files.

Regards

Katherine
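
One way to approach the request above is sketched here, on a reduced copy of the posted data. It assumes the group label is everything before the first underscore in the column name, and it writes to tempdir() purely for illustration. The names value_cols, prefix, groups and out_dir are hypothetical.

```r
# Reduced copy of the posted data frame
df <- data.frame(
  date  = c("2013-04-15", "2013-04-14", "2013-04-13"),
  ABC_f = c(62.81, 81.05, 84.66),
  LMN_d = c(21.17, 54.66, 63.89),
  LMN_a = c(56.68, 25.82, 40.12),
  ABC_e = c(11.37, 62.30, 47.63),
  PQR   = c(71.23, 95.10, 83.62)
)

# Every column except the date belongs to some group
value_cols <- setdiff(names(df), "date")

# Group label: the part of the name before the first underscore
# ("PQR" has no underscore, so it forms a group of its own)
prefix <- sub("_.*$", "", value_cols)
groups <- split(value_cols, prefix)

# Assumed output location; replace with a real directory as needed
out_dir <- tempdir()

for (p in names(groups)) {
  cols <- sort(groups[[p]])                      # e.g. LMN_a before LMN_d
  write.csv(df[, c("date", cols), drop = FALSE],
            file = file.path(out_dir, paste0(p, ".csv")),
            row.names = FALSE)
}
```

Because the groups are derived from the column names, nothing needs to be known in advance about how many prefixes the data contains.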













[R] Adding elements in data.frame subsets and also subtracting an element from the rest elements in data.frame

2013-04-29 Thread Katherine Gobin
Dear R forum

I have a data.frame as

cashflow_df = data.frame(instrument = 
c("ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC","ABC",
 "ABC", "PQR", "PQR", "PQR","PQR","PQR","PQR","PQR","PQR","PQR","PQR", "PQR", 
"PQR", "PQR","PQR", "PQR","PQR","PQR","PQR", "PQR","PQR","UVWXYZ","UVWXYZ", 
"UVWXYZ", "UVWXYZ", "UVWXYZ","UVWXYZ","UVWXYZ","UVWXYZ", "UVWXYZ", "UVWXYZ"),

id = c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5, 
1,1,2,2,3,3,4,4, 5,5),

cashflow = c(5000,5000,505000,5000,5000,505000,5000,5000,505000, 5000,5000, 
505000, 
5000,5000,505000,500,500,500,102000,500,500,500,102000,500,500,500,102000,500,500,500,102000,500,500,500,102000,8000,808000,8000,808000,8000,808000,8000,808000,8000,808000),

cashflows_pv = c(4931.054, 4479.1116, 431160.8529,4931.9604, 4485.6393, 
432064.0228, 
4932.5438,4489.8451,432646.2398,4932.1548,4487.0404,432257.9551,4932.6087,4490.3129,432711.0084,493.6326,474.0524,455.2489,82252.0304,493.8083,474.7543,456.4356,82744.9157,493.6003,473.9235,455.031,82161.7368,493.8175,474.7913,456.4982,82770.9849,493.8592,474.9581,456.7804,82888.4556,7451.3118,681810.5522,7462.0148,684153.4992,7441.1294,679585.9186,7426.6407,676427.7274,7427.1225,676532.6262))

#  __

> cashflow_df
   instrument id cashflow cashflows_pv
1 ABC  1 5000    4931.0540
2 ABC  1 5000    4479.1116
3 ABC  1   505000  431160.8529
4 ABC  2 5000    4931.9604
5 ABC  2 5000    4485.6393
6 ABC  2   505000  432064.0228
7 ABC  3 5000    4932.5438
8 ABC  3 5000    4489.8451
9 ABC  3   505000  432646.2398
10    ABC  4 5000    4932.1548
11    ABC  4 5000    4487.0404
12    ABC  4   505000  432257.9551
13    ABC  5 5000    4932.6087
14    ABC  5 5000    4490.3129
15    ABC  5   505000  432711.0084
16    PQR  1  500 493.6326
17    PQR  1  500 474.0524
18    PQR  1  500 455.2489
19    PQR  1   102000   82252.0304
20    PQR  2  500 493.8083
21    PQR  2  500 474.7543
22    PQR  2  500 456.4356
23    PQR  2   102000   82744.9157
24    PQR  3  500 493.6003
25    PQR  3  500 473.9235
26    PQR  3  500 455.0310
27    PQR  3   102000   82161.7368
28    PQR  4  500 493.8175
29    PQR  4  500 474.7913
30    PQR  4  500 456.4982
31    PQR  4   102000   82770.9849
32    PQR  5  500 493.8592
33    PQR  5  500 474.9581
34    PQR  5  500 456.7804
35    PQR  5   102000   82888.4556
36 UVWXYZ  1 8000    7451.3118
37 UVWXYZ  1   808000  681810.5522
38 UVWXYZ  2 8000    7462.0148
39 UVWXYZ  2   808000  684153.4992
40 UVWXYZ  3 8000    7441.1294
41 UVWXYZ  3   808000  679585.9186
42 UVWXYZ  4 8000    7426.6407
43 UVWXYZ  4   808000  676427.7274
44 UVWXYZ  5 8000    7427.1225
45 UVWXYZ  5   808000  676532.6262

# ===

# My PROBLEM


For each instrument and id, I need the totals of cashflow and cashflows_pv, and 
also the change in total_cashflow_pv relative to the first ID of the same 
instrument (i.e. total_cashflow_pv for that ID minus total_cashflow_pv for the 
first ID), as shown in the cashflow_change column of the following output.

output

   instrument id   total_cashflow   total_cashflow_pv
1 ABC  1 515000 440571.02
2 ABC  2 515000 441481.62
3 ABC  3 515000 442068.63
4 ABC  4 515000 441677.15
5 ABC  5 515000 442133.93
6 PQR  1 103500  83674.96
7 PQR  2 103500  84169.91
8 PQR  3 103500  83584.29
9 PQR  4 103500  84196.09
10    PQR  5 103500  84314.05
11 UVWXYZ  1 816000 689261.86
12 UVWXYZ  2 816000 691615.51
13 UVWXYZ  3 816000 687027.05
14 UVWXYZ  4 816000 683854.37
15 UVWXYZ  5 816000 683959.75
 

   cashflow_change
1        0.0000   # (440571.02 - 440571.02): 1st ID value - 1st ID value for ABC
2      910.6040   # (441481.62 - 440571.02): 2nd ID value - 1st ID value for ABC
3     1497.6102   # (442068.63 - 440571.02): 3rd ID value - 1st ID value for ABC
4     1106.1318
5     1562.9115
6        0.0000   # (83674.96 - 83674.96): 1st ID value - 1st ID value for PQR
7      494.9496
8      -90.6727
9      521.1276
10     639.0890
11       0.0000
12    2353.6500
13   -2234.8160
14   -5407.4959
15   -5302.1153   # (683959.75 - 689261.86): 5th ID value - 1st ID value for UVWXYZ


Kindly guide

Regards

Ka
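
The totals and the change relative to the first ID can be sketched with aggregate() and ave(), shown here on the first two IDs of ABC and PQR from the posted data. This is one possible approach, not the only one; the name totals is hypothetical, and the first ID is taken to be the smallest id within each instrument.

```r
# First two IDs of ABC and PQR, copied from the posted data
cashflow_df <- data.frame(
  instrument   = c(rep("ABC", 6), rep("PQR", 8)),
  id           = c(1, 1, 1, 2, 2, 2, 1, 1, 1, 1, 2, 2, 2, 2),
  cashflow     = c(5000, 5000, 505000, 5000, 5000, 505000,
                   500, 500, 500, 102000, 500, 500, 500, 102000),
  cashflows_pv = c(4931.054, 4479.1116, 431160.8529,
                   4931.9604, 4485.6393, 432064.0228,
                   493.6326, 474.0524, 455.2489, 82252.0304,
                   493.8083, 474.7543, 456.4356, 82744.9157)
)

# Sum both columns for every instrument/id combination
totals <- aggregate(cbind(cashflow, cashflows_pv) ~ instrument + id,
                    data = cashflow_df, FUN = sum)
names(totals)[3:4] <- c("total_cashflow", "total_cashflow_pv")

# Order so that the first row of each instrument is its first ID
totals <- totals[order(totals$instrument, totals$id), ]

# Subtract each instrument's first-ID total from all of its totals
first_pv <- ave(totals$total_cashflow_pv, totals$instrument,
                FUN = function(x) x[1])
totals$cashflow_change <- totals$total_cashflow_pv - first_pv

print(totals)
```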

[R] Clean Price of Bond : Can't install "RQuantLib" in R version 3.0.0

2013-05-02 Thread Katherine Gobin
Dear Forum,

I have R version 3.0.0 installed and need to install the RQuantLib package. I tried 
to install it from a CRAN mirror and I couldn't load it. I had saved the package 
in zip format and tried to install it locally, but I am getting the following error.


> utils:::menuInstallLocal()
Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) : 
  cannot open the connection
In addition: Warning message:
In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
  cannot open compressed file 'RQuantLib_0.3.10(1)/DESCRIPTION', probable 
reason 'No such file or directory'

I need to install this package because I need to find out how the clean price of 
a bond is arrived at.

Kindly guide

Regards

Katherine



Re: [R] Clean Price of Bond : Can't install "RQuantLib" in R version 3.0.0

2013-05-02 Thread Katherine Gobin
Dear Sir,

Thanks a lot. The "(1)" in 'RQuantLib_0.3.10(1)', I understand, was appearing 
because I had saved RQuantLib a number of times in my local directory, and each 
time the installation file was renamed with a number added to it. 

Thanks again. Now I am able to install the package along with Rcpp.

Regards

Katherine

 

--- On Thu, 2/5/13, Prof Brian Ripley  wrote:

From: Prof Brian Ripley 
Subject: Re: [R] Clean Price of Bond : Can't install "RQuantLib" in R version 
3.0.0
To: "Katherine Gobin" 
Cc: r-help@r-project.org
Date: Thursday, 2 May, 2013, 1:23 PM

On 02/05/2013 13:09, Katherine Gobin wrote:
> Dear Forum,
>
> I have R version 3.0.0 installed and need to install RQuantLib pacakge. I 
> tried to install it from CRAN Mirror and I couldn't load it. I had saved the 
> package i zip format and tried to install it locally but I am getting 
> following error.

What is the name of the file you downloaded?   I am guessing it did not 
arrive with the name on the archive.

You seem to be using Windows without saying so.  The file for Windows is 
http://cran.r-project.org/bin/windows/contrib/r-release/RQuantLib_0.3.10.zip

without (1) in the name.

>
>> utils:::menuInstallLocal()
> Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
>    cannot open the connection
> In addition: Warning message:
> In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
>    cannot open compressed file 'RQuantLib_0.3.10(1)/DESCRIPTION', probable 
>reason 'No such file or directory'
>
> I need to install this package as I need to find out how the clean price of 
> bond is arrived at?
>
> Kindly
>   guide
>
> Regards
>
> Katherine
>
>


-- 
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



[R] Finding Beta

2013-06-04 Thread Katherine Gobin
Dear R forum

I have a dataframe (of prices) as given below -

dat = data.frame(company = rep(c("A", "B", "C", "D", "index"), each = 5), 
prices = c(runif(5, 10, 12), runif(5, 108, 112), runif(5, 500, 510), 
runif(5, 40, 50), runif(5, 1000, 1020)))

   company prices
1    A   10.61727
2    A   10.51892
3    A   11.80495
4    A   11.15243
5    A   10.77543
6    B  111.23817
7    B  109.19825
8    B  108.80053
9    B  110.79876
10   B  108.84385
11   C  504.71801
12   C  504.11778
13   C  502.89416
14   C  500.65996
15   C  502.26748
16   D   42.35901
17   D   43.71947
18   D   46.46092
19   D   43.62220
20   D   48.47480
21   index 1017.24476
22   index 1002.88139
23   index 1005.16148
24   index 1014.54480
25   index 1014.12103

I need to find the beta of A, B, C and D w.r.t. the index. 

Beta between two variables X and Y (where Y is dependent) is given by,

beta = coef(lm(Y ~ X))[2] 

Any guidance is appreciated.

With regards

Katherine
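
A minimal sketch of the calculation described above, using the posted formula coef(lm(Y ~ X))[2] with the index as X. It uses the printed prices of A, B and the index, and assumes the rows of every company are in the same time order. The names by_company, idx and betas are hypothetical.

```r
# Prices for two companies and the index, copied from the printed data
dat <- data.frame(
  company = rep(c("A", "B", "index"), each = 5),
  prices  = c(10.61727, 10.51892, 11.80495, 11.15243, 10.77543,
              111.23817, 109.19825, 108.80053, 110.79876, 108.84385,
              1017.24476, 1002.88139, 1005.16148, 1014.54480, 1014.12103),
  stringsAsFactors = FALSE
)

# One price vector per company, keeping the row order within each company
by_company <- split(dat$prices, dat$company)
idx <- by_company[["index"]]

# Slope of the regression of each company's prices on the index
betas <- sapply(by_company[names(by_company) != "index"],
                function(y) coef(lm(y ~ idx))[[2]])
print(betas)
```

The slope of a simple regression equals cov(Y, X) / var(X), which gives a quick check on the result.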


[R] Removing "NA" from matrix

2013-06-14 Thread Katherine Gobin
Dear R forum,

I have a data frame 


dat = data.frame(
ABC = c(25.28000732,48.33857234,19.8013245,10.68361461),
DEF = c(14.02722251,10.57985168,11.81890316,21.40171514),
GHI = c(1,1,1,1),
JKL = c(45.96423231,44.52986236,16.56514176,32.14545122),
MNO = c(45.38438063,15.54338206,18.78444777,24.29486984))

> dat
   ABC  DEF GHI  JKL  MNO
1 25.28001 14.02722   1 45.96423 45.38438
2 48.33857 10.57985   1 44.52986 15.54338
3 19.80132 11.81890   1 16.56514 18.78445
4 10.68361 21.40172   1 32.14545 24.29487


When I try to find the correlation I get (which is obvious as my one column 
shows no variation)

dat_cor = cor(dat)


Warning message:
In cor(dat) : the standard deviation is zero
> dat_cor
           ABC         DEF GHI         JKL        MNO
ABC  1.0000000 -0.75600764  NA  0.55245223 -0.2735585
DEF -0.7560076  1.00000000  NA -0.06479082  0.2020781
GHI         NA          NA   1          NA         NA
JKL  0.5524522 -0.06479082  NA  1.00000000  0.4564568
MNO -0.2735585  0.20207810  NA  0.45645683  1.0000000


In reality I am dealing with about 300 variables and don't know which variables 
don't vary.

My query is how do I remove the columns and rows with NA's.

So for example, I need the correlation matrix for ABC, DEF, JKL and MNO only.

Kindly guide.

Thanking in advance.

Regards

Katherine
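
One option is to drop the constant columns before calling cor(), rather than removing NA rows and columns afterwards. A minimal sketch on the posted data, assuming all columns are numeric; the name keep is hypothetical.

```r
dat <- data.frame(
  ABC = c(25.28000732, 48.33857234, 19.8013245, 10.68361461),
  DEF = c(14.02722251, 10.57985168, 11.81890316, 21.40171514),
  GHI = c(1, 1, 1, 1),
  JKL = c(45.96423231, 44.52986236, 16.56514176, 32.14545122),
  MNO = c(45.38438063, 15.54338206, 18.78444777, 24.29486984)
)

# TRUE for columns that actually vary (standard deviation above zero)
keep <- sapply(dat, function(x) sd(x) > 0)

# Correlation matrix of the varying columns only; no warning, no NA
dat_cor <- cor(dat[, keep, drop = FALSE])
print(dat_cor)

# Names of the columns that were dropped, for reference
names(dat)[!keep]
```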



[R] Choosing subset of data.frame

2013-06-20 Thread Katherine Gobin
Dear R Forum

I have a data frame as

beta_results = data.frame(instrument = c("ABC", "DEF", "JKL",  "LMN", "PQR", 
"STU", "UVW", "XYZ"), 

beta_values = c(1.27, -0.22, 0.529, 0.011, 2.31, -1.08, -2.7, 0.42))

> beta_results
  instrument beta_values
1    ABC   1.270
2    DEF  -0.220
3    JKL   0.529
4    LMN   0.011
5    PQR   2.310
6    STU  -1.080
7    UVW  -2.700
8    XYZ   0.420


Through some other process, I am getting instrument names as, say, the 
following (which may change each time I run this process, and hence I can't 
hard code it).


instru = c("JKL", "STU", "XYZ")

Now I want the subset of beta_results, (say beta_results_A)  pertaining to only 
instru i.e

beta_results_A = 


  instrument beta_values
3    JKL   0.529
6    STU  -1.080
8    XYZ   0.420


I did try

beta_results_A = beta_results[instru]
or
beta_results_A = subset(beta_results, beta_results$instrument = instru]

but I guess it's failing.

Kindly guide

Regards

Katherine
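
For reference, beta_results[instru] treats instru as column names, and a single = inside subset() is an argument assignment rather than a comparison. A minimal sketch of row selection with %in%, using the posted data:

```r
beta_results <- data.frame(
  instrument  = c("ABC", "DEF", "JKL", "LMN", "PQR", "STU", "UVW", "XYZ"),
  beta_values = c(1.27, -0.22, 0.529, 0.011, 2.31, -1.08, -2.7, 0.42),
  stringsAsFactors = FALSE
)

instru <- c("JKL", "STU", "XYZ")

# Keep the rows whose instrument appears anywhere in instru
beta_results_A <- beta_results[beta_results$instrument %in% instru, ]
print(beta_results_A)

# Equivalent form with subset()
beta_results_B <- subset(beta_results, instrument %in% instru)
```

%in% is used instead of == because == compares element by element and recycles the shorter vector, which does not test membership.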


[R] How to store interim print results

2014-04-03 Thread Katherine Gobin
Dear R forum,

Following is a customized extract of the code I am working on.

settlement = as.Date("2013-11-25")
maturity   = as.Date("2015-10-01")
coupon     = 0.066
yield      = 0.1040
basis      = 1  
frequency = 2
redemption = 100

# __

add.months = function(date, n) 
{
  nC <- seq(date, by=paste (n, "months"), length = 2)[2]
  fD <- as.Date(strftime(as.Date(date), format='%Y-%m-01'))
  C  <- (seq(fD, by=paste (n+1, "months"), length = 2)[2])-1
  if(nC>C) return(C)
  return(nC)
}

date.diff = function(end, start, basis=1) {
  if (basis != 0 && basis != 4)
  return(as.numeric(end - start))
  e <- as.POSIXlt(end)
  s <- as.POSIXlt(start)
  d <-   (360 * (e$year - s$year)) + (30 * (e$mon  - s$mon )) + (min(30, 
e$mday) - min(30, s$mday))
  
  return (d)
}

 cashflows   <- 0
 last.coupon <- maturity
 while (last.coupon > settlement) {
          print(last.coupon)             # I need to store these dates
 last.coupon <- add.months(last.coupon, -12/frequency)
 cashflows <- cashflows + 1
print(cashflows)                 # I need to store these cashflow numbers   
  }

The print command causes the following output

[1] "2015-10-01"
[1] 1
[1] "2015-04-01"
[1] 2
[1] "2014-10-01"
[1] 3
[1] "2014-04-01"
[1] 4

My problem is how do I store these print outputs or while the loop is getting 
executed, how do I save these to some data.frame say

output_dat 

cashflow_tenure    cashflow_nos

1      2015-10-01            1
2      2015-04-01            2
3      2014-10-01            3
4      2014-04-01            4

Kindly advise

With regards

Katherine


Re: [R] How to store interim print results

2014-04-03 Thread Katherine Gobin
Dear Jean

Thanks a lot for this solution. It's very useful. I made only one small change 
and defined 

cashflow.tenure <- numeric(0) instead of character(0). This helps me in further 
numerical calculations using these dates, like finding the difference between 
two dates etc.


Thanks again,

Regards

Katherine
On Thursday, 3 April 2014 6:58 PM, "Adams, Jean"  wrote:
 
Katherine,

One easy way to do this for small data is by using the append() function (see 
code below).  But, if you have a lot of data, it may be too slow for you.  In 
that case, you can gain some efficiency if you determine in advance how long 
the vectors will be, then use indexing to fill in the vectors without using the 
append() function.  Or, rewrite the code to be vectorized instead of using a 
while() loop.


cashflows   <- 0
last.coupon <- maturity
# create "empty" vectors
cashflow.tenure <- character(0)
cashflow.nos <- numeric(0)

while (last.coupon > settlement) {
print(last.coupon)
# store the dates
cashflow.tenure <- append(cashflow.tenure, last.coupon)
last.coupon <- add.months(last.coupon, -12/frequency)
cashflows <- cashflows + 1
print(cashflows)
# store the cashflow numbers   
cashflow.nos <- append(cashflow.nos, cashflows)
}

output.dat <- data.frame(cashflow.tenure, cashflow.nos)
output.dat

Jean
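
A variant of the same idea that keeps the stored values as dates is sketched below. Combining a Date with a character or numeric vector through c() drops the Date class, so the sketch starts from an empty Date vector instead. To stay self-contained it steps back with seq() rather than the poster's add.months() helper, which is an assumption that suits coupon dates falling on the first of the month. The underscore variable names are hypothetical.

```r
settlement <- as.Date("2013-11-25")
maturity   <- as.Date("2015-10-01")
frequency  <- 2

# Empty containers of the right types
cashflow_tenure <- as.Date(character(0))
cashflow_nos    <- integer(0)

last_coupon <- maturity
n <- 0L
while (last_coupon > settlement) {
  n <- n + 1L
  cashflow_tenure <- c(cashflow_tenure, last_coupon)   # stays a Date vector
  cashflow_nos    <- c(cashflow_nos, n)
  # step back one coupon period, e.g. "-6 months" for frequency = 2
  last_coupon <- seq(last_coupon, by = paste(-12 / frequency, "months"),
                     length.out = 2)[2]
}

output_dat <- data.frame(cashflow_tenure, cashflow_nos)
print(output_dat)

# Date arithmetic works directly on the stored column
diff(rev(output_dat$cashflow_tenure))
```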





Re: [R] How to store interim print results

2014-04-03 Thread Katherine Gobin
Dear Sir,

Thanks a lot for your guidance and efforts. Appreciate it.

Thanks again.

Katherine
On Thursday, 3 April 2014 6:55 PM, jim holtman  wrote:
 
This will get you close:

> settlement = as.Date("2013-11-25")
> maturity   = as.Date("2015-10-01")
> coupon     = 0.066
> yield      = 0.1040
> basis      = 1  
> frequency = 2
> redemption = 100
> 
> # __
> 
> add.months = function(date, n) 
+ {
+   nC <- seq(date, by=paste (n, "months"), length = 2)[2]
+   fD <- as.Date(strftime(as.Date(date), format='%Y-%m-01'))
+   C  <- (seq(fD, by=paste (n+1, "months"), length = 2)[2])-1
+   if(nC>C) return(C)
+   return(nC)
+ }
> 
> date.diff = function(end, start, basis=1) {
+   if (basis != 0 && basis != 4)
+   return(as.numeric(end - start))
+   e <- as.POSIXlt(end)
+   s <- as.POSIXlt(start)
+   d <-   (360 * (e$year - s$year)) + (30 * (e$mon  - s$mon )) + (min(30, 
e$mday) - min(30, s$mday))
+   
+   return (d)
+ }
> 
> output <- capture.output({  # collect the print output
+  cashflows   <- 0
+  last.coupon <- maturity
+  while (last.coupon > settlement) {
+           print(last.coupon)             # I need to store these dates
+  last.coupon <- add.months(last.coupon, -12/frequency)
+  cashflows <- cashflows + 1
+ print(cashflows)                 # I need to store these cashflow numbers   
+   }
+   
+ })  
> 
> # remove line numbers
> output <- sub("^\\[1\\] ", "", output)
> 
> # remove extra quotes
> output <- gsub('"', '', output)
> 
> 
> # now read in the data
> report <- matrix(output, ncol = 2, byrow = TRUE)
> 
> report
     [,1]         [,2]
[1,] "2015-10-01" "1" 
[2,] "2015-04-01" "2" 
[3,] "2014-10-01" "3" 
[4,] "2014-04-01" "4" 
> 
> 




Jim Holtman
Data Munger Guru
 
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.




[R] Conditional subtraction

2014-04-07 Thread Katherine Gobin
Dear R forum

I have following data.frame

dat = data.frame(key = c("A", "B", "C", "D", "E", "E"), id = c("instru_A", 
"instru_B", "instru_B", "instru_B", "instru_C", "instru_C"), price = c(101.38, 
3.9306, 3.7488, 92.9624, 5.15, 96.1908), adj_factor = c(2.08, 2.5217, 2.5217, 
2.5217, 3.08, 3.08))

> dat
  key       id     price         adj_factor
1   A instru_A 101.3800   2.0800
2   B instru_B   3.9306     2.5217
3   C instru_B   3.7488     2.5217
4   D instru_B  92.9624    2.5217
5   E instru_C   5.1500     3.0800
6   E instru_C  96.1908    3.0800

This is just a part of a big database, and ids can appear any number of times.

# MY PROBLEM


I need to subtract adj_factor from the price, but only for the first 
occurrence of each id.

In case of instru_A, there is only 1 id, so 2.08 should be subtracted from 
101.38.

The id "instru_B" is appearing 3 times. So in this case, adj_factor = 2.5217 
should be subtracted from 3.9306 and rest should remain same.

Similarly, id "instru_C" is appearing 2 times, hence the adj_factor = 3.08 
should be subtracted from 5.15.


Effectively I am looking for 

> dat_new

  key       id     price         adj_factor   adjusted_price
1   A instru_A 101.3800   2.0800        99.3000      # price adjusted
2   B instru_B   3.9306     2.5217         1.4089      # price adjusted
3   C instru_B   3.7488     2.5217         3.7488
4   D instru_B  92.9624    2.5217        92.9624
5   E instru_C   5.1500     3.0800         2.0700      # price adjusted
6   E instru_C  96.1908    3.0800        96.1908




I tried something like

adj_price = function(id, price, adj_factor)
{
id_length = length(id)

if(id_length == 1)

{
(adjusted_price = price-adj_factor)
}

if(id_length == 2)

{
(adjusted_price = c(price[1]-adj_factor[1], price[2]))
}

if(id_length > 2)

{
(adjusted_price = c(price[1]-adj_factor[1],price[2:id_length]))
}

return(adjusted_price)

}

(final_price = adj_price(dat$id, dat$price, dat$adj_factor))

> (final_price = adj_price(dat$id, dat$price, dat$adj_factor))
[1] 99.3000  3.9306  3.7488 92.9624  5.1500 96.1908


Kindly advise

Regards

Katherine
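
The attempt above receives the whole id column at once, so length(id) is 6 and only the very first price is adjusted. A minimal sketch that adjusts the first row of every id instead, using duplicated() on the posted data:

```r
dat <- data.frame(
  key        = c("A", "B", "C", "D", "E", "E"),
  id         = c("instru_A", "instru_B", "instru_B", "instru_B",
                 "instru_C", "instru_C"),
  price      = c(101.38, 3.9306, 3.7488, 92.9624, 5.15, 96.1908),
  adj_factor = c(2.08, 2.5217, 2.5217, 2.5217, 3.08, 3.08),
  stringsAsFactors = FALSE
)

# duplicated() is FALSE for the first occurrence of each id, TRUE afterwards
is_first <- !duplicated(dat$id)

# Subtract the factor on first occurrences only; other rows keep their price
dat$adjusted_price <- dat$price - ifelse(is_first, dat$adj_factor, 0)
print(dat)
```

This works for ids that appear any number of times and does not require the rows of an id to be adjacent.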


[R] R equivalent functions of some EXCEL functions

2014-05-14 Thread Katherine Gobin
Dear R forum,


EXCEL has some standard functions e.g. 


(1) PRICE function : Returns the price per $100 face value of a security that 
pays periodic interest.


(2) COUPDAYBS : Returns the number of days from the beginning of the coupon 
period to the settlement date.


(3) COUPDAYS : Returns the number of days in the coupon period that contains 
settlement date.


4) COUPDAYSNC : Returns the number of days from the settlement date to the next 
coupon date.


Kindly guide if R has some inbuilt functions giving the results same as 
obtained from Excel functions mentioned above.


With regards

Katherine
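
Base R does not ship functions with these names. The three day counts can be sketched by hand, assuming an actual-day count and coupon dates that fall on the same day of the month as maturity. Whether the numbers agree with Excel for a particular basis argument is not verified here. The settlement and maturity dates are illustrative values, and all variable names are hypothetical.

```r
settlement <- as.Date("2013-11-25")
maturity   <- as.Date("2015-10-01")
frequency  <- 2                       # coupons per year

# Walk back from maturity one coupon period at a time until the
# period containing the settlement date is found
step <- paste(-12 / frequency, "months")
next_coupon <- maturity
prev_coupon <- seq(next_coupon, by = step, length.out = 2)[2]
while (prev_coupon > settlement) {
  next_coupon <- prev_coupon
  prev_coupon <- seq(next_coupon, by = step, length.out = 2)[2]
}

days_bs     <- as.numeric(settlement - prev_coupon)   # cf. COUPDAYBS
days_nc     <- as.numeric(next_coupon - settlement)   # cf. COUPDAYSNC
days_period <- as.numeric(next_coupon - prev_coupon)  # cf. COUPDAYS

c(days_bs = days_bs, days_nc = days_nc, days_period = days_period)
```

A PRICE equivalent would discount each remaining coupon and the redemption value at the yield over these periods; that is left out here because the exact day-count conventions would need to be fixed first.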


[R] Splitting vector elements

2014-05-21 Thread Katherine Gobin
Dear R forum

I have a vector as 

dat = c("ABC 1", "ABC 2", "ABC 3", "DEF 10", "DEF 20")

> dat
[1] "ABC 1"  "ABC 2"  "ABC 3"  "DEF 10" "DEF 20"

I need to split the names into two parts say

   p1      p2
 ABC    1
 ABC    2
 ABC    3
 DEF     10
 DEF     20

Kindly guide

Katherine
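
A minimal sketch with strsplit(), assuming every element contains exactly one space separating the two parts. The names parts and out are hypothetical.

```r
dat <- c("ABC 1", "ABC 2", "ABC 3", "DEF 10", "DEF 20")

# strsplit() returns one character vector per element; rbind them into a matrix
parts <- do.call(rbind, strsplit(dat, " ", fixed = TRUE))

out <- data.frame(p1 = parts[, 1],
                  p2 = as.numeric(parts[, 2]),   # second part as a number
                  stringsAsFactors = FALSE)
print(out)
```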



[R] Paper on Analytics using R

2014-07-10 Thread Katherine Gobin
Dear R Forum,

I am looking for some write-up or paper on Use of R for Analytics or why R 
should be preferred over others for Analytics purpose. Tried google but got 
some info about some commercial vendors using R for analytics. I am looking for 
some paper where no commercial flavor is given, I mean it deals with R strictly 
and doesn't talk about some product using R for analytics.

Kindly share if you are aware of some writeups or paper.

Regards

Katherine


[R] Reversing the Equation to find value of variable

2014-01-06 Thread Katherine Gobin
Dear R forum

I have following variables -

EAD = 10000
LGD = 0.45
PD = 0.47
M = 3

# Equation 1

R = 0.12*(1-exp(-50*PD))/(1-exp(-50)) + 0.24*(1-(1-exp(-50*PD))/(1-exp(-50)))

b = (0.11852 - 0.05478 * log(PD))^2

K = (LGD * pnorm((1 - R)^(-0.5) * qnorm(PD) + (R / (1 - R))^0.5 * qnorm(0.999)) 
- PD * LGD) * (1 - 1.5 * b)^(-1) * (1 + (M - 2.5) * b)

RWA = K * 12.5 * EAD


> RWA
[1] 22845.07

# _

# MY Problem

In the part above, RWA was calculated from known values of LGD, EAD, M and PD. 
However, I need to go the reverse way: knowing the values of LGD, EAD, M and 
RWA, I need to find the value of PD.

So I have tried to use uniroot on RWA - K * 12.5 * EAD, with the above 
equations substituted in place of K and R.

RWA = 22845.07
LGD = 0.45
EAD = 10000
M = 3

# R, b and K are the expressions from Equation 1 with x in place of PD
f = function(x) {
  R = 0.12*(1-exp(-50*x))/(1-exp(-50)) + 0.24*(1-(1-exp(-50*x))/(1-exp(-50)))
  b = (0.11852 - 0.05478 * log(x))^2
  K = (LGD * pnorm((1 - R)^(-0.5) * qnorm(x) + (R / (1 - R))^0.5 * qnorm(0.999)) -
       x * LGD) * (1 - 1.5 * b)^(-1) * (1 + (M - 2.5) * b)
  RWA - K * 12.5 * EAD
}

uniroot(f, c(0,1), tol = 0.01)

I get following error -

> uniroot(f, c(0,1), tol = 0.01)
Error in uniroot(f, c(0, 1), tol = 1e-10) : f.lower = f(lower) is NA

Kindly guide as I am not sure if uniroot is the correct way of doing it or not. 
Ideally, I should be getting the PD value of 0.47.

With regards

Katherine


Re: [R] Reversing the Equation to find value of variable

2014-01-06 Thread Katherine Gobin
Dear Sir,

Thanks a lot for your wonderful guidance. It gave me a new way to look at 
the equations. I really appreciate it.

Thanks a lot once again.

Katherine



On Monday, 6 January 2014 5:31 PM, Frede Aakmann Tøgersen  
wrote:
 
Hi

Reading the error message carefully you can see that f() is not defined at 0:

> uniroot(f, c(0, 1))
Error in uniroot(f, c(0, 1)) : f.lower = f(lower) is NA
> f(0)
[1] NaN

If you plot f() in the interval (0,1) then you'll see there are two solutions:


> uniroot(f, c(0.0001, 1))
Error in uniroot(f, c(1e-04, 1)) : 
  f() values at end points not of opposite sign
> uniroot(f, c(0.0001, 0.2))
$root
[1] 0.1533901

$f.root
[1] 0.3414232

$iter
[1] 6

$estim.prec
[1] 6.103516e-05


> uniroot(f, c(0.3, 1))
$root
[1] 0.4699984

$f.root
[1] -0.04112121

$iter
[1] 8

$estim.prec
[1] 6.103516e-05

>
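
The bracketing steps above can also be automated. The following is only a sketch: it evaluates f() on a grid, looks for sign changes, and calls uniroot() on each bracket. EAD = 10000 is an assumption made here because that value reproduces RWA = 22845.07 from Equation 1; the grid spacing and the names grid, idx and roots are likewise illustrative.

```r
RWA <- 22845.07; LGD <- 0.45; EAD <- 10000; M <- 3

# Equation 1 with x in place of PD, minus the target RWA
f <- function(x) {
  R <- 0.12 * (1 - exp(-50 * x)) / (1 - exp(-50)) +
       0.24 * (1 - (1 - exp(-50 * x)) / (1 - exp(-50)))
  b <- (0.11852 - 0.05478 * log(x))^2
  K <- (LGD * pnorm((1 - R)^(-0.5) * qnorm(x) + (R / (1 - R))^0.5 * qnorm(0.999)) -
        x * LGD) * (1 - 1.5 * b)^(-1) * (1 + (M - 2.5) * b)
  RWA - K * 12.5 * EAD
}

# Stay strictly inside (0, 1): f(0) is NaN, which is what uniroot complained about
grid <- seq(0.001, 0.999, by = 0.001)
fx   <- sapply(grid, f)

# A root lies between neighbouring grid points where the sign of f changes
idx   <- which(diff(sign(fx)) != 0)
roots <- sapply(idx, function(i) uniroot(f, c(grid[i], grid[i + 1]), tol = 1e-10)$root)
print(roots)
```

Because the equation has two solutions, which root is the "right" PD has to be decided from the context, for example by restricting the search interval as in the calls above.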


Yours sincerely / Med venlig hilsen


Frede Aakmann Tøgersen
Specialist, M.Sc., Ph.D.
Plant Performance & Modeling

Technology & Service Solutions
T +45 9730 5135
M +45 2547 6050
fr...@vestas.com
http://www.vestas.com





[R] Running the Loop

2014-02-04 Thread Katherine Gobin
Dear R forum,

I have following data.frames

dat = data.frame(id = c(1:3), root = c(0.10, 0.20, 0.74), maturity_period = 
c(20, 155, 428), mtm = c(1000, 10000, 100000), curve = c("USD", "USD", "USD"))


> dat
  id root maturity_period   mtm curve
1  1 0.10              20 1e+03   USD
2  2 0.20             155 1e+04   USD
3  3 0.74             428 1e+05   USD

standard_tenors = data.frame(T = c("1m", "3m", "6m", "12m", "5yr"), D = c(30, 
91, 182, 365, 1825))


> standard_tenors
    T    D
1  1m   30
2  3m   91
3  6m  182
4 12m  365
5 5yr 1825

# ----------------------------------------

library(plyr)


T = standard_tenors$T

D = standard_tenors$D
n = length(standard_tenors$T)


mtm_split_function = function(maturity_period, curve, root, mtm)

{

for(i in 1:(n-1))
{
if (maturity_period < D[i])


{
N1 = paste(curve, T[i], sep ="_")
N2 = paste(curve, T[i], sep ="_")
PV1 = mtm
PV2 = 0
}else

if (maturity_period > D[i] & maturity_period < D[i+1])

{
N1 = paste(curve, T[i], sep ="_")
N2 = paste(curve, T[1+1], sep ="_")
PV1 = (mtm)*root
PV2 = (mtm)*(1-root)
}else

if (maturity_period > D[i+1])
{
N1 = paste(curve, T[i], sep ="_")
N2 = paste(curve, T[i], sep ="_")
PV1 = 0
PV2 = mtm
}

}


return(data.frame(Risk_factor1 = N1, Risk_factor2 = N2, Risk_factor1_mtm = PV1,
Risk_factor2_mtm = PV2))
}

# ----------------------------------------

splitted_mtm <- ddply(.data = dat, .variables = "id",
                .fun=function(x) mtm_split_function(maturity_period = 
x$maturity_period, curve = x$curve, root = x$root, mtm = x$mtm))

# OUTPUT I am getting

  id Risk_factor1 Risk_factor2 Risk_factor1_mtm Risk_factor2_mtm
1  1      USD_12m      USD_12m             1000                0
2  2      USD_12m      USD_12m            10000                0
3  3      USD_12m       USD_3m            74000            26000




# My PROBLEM

However, My OUTPUT should be  

  id Risk_factor1 Risk_factor2 Risk_factor1_mtm Risk_factor2_mtm
1  1      USD_1m       USD_1m             1000                0
2  2      USD_3m       USD_6m             2000             8000
3  3      USD_12m      USD_5yr           74000            26000

Kindly guide

With warm regards

Katherine
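
A possible reason for the output above is that the loop keeps overwriting N1, N2, PV1 and PV2, so only the last pass (i = n-1) survives, and that T[1+1] is used where T[i+1] was probably meant. A sketch that avoids the loop with findInterval() follows. The function name split_mtm is hypothetical, and the treatment of maturities at or beyond the last tenor (everything on the last point) is an assumption, since the post does not cover that case.

```r
dat <- data.frame(id = 1:3, root = c(0.10, 0.20, 0.74),
                  maturity_period = c(20, 155, 428),
                  mtm = c(1000, 10000, 100000),
                  curve = c("USD", "USD", "USD"), stringsAsFactors = FALSE)

standard_tenors <- data.frame(T = c("1m", "3m", "6m", "12m", "5yr"),
                              D = c(30, 91, 182, 365, 1825),
                              stringsAsFactors = FALSE)

split_mtm <- function(maturity_period, curve, root, mtm, tenors = standard_tenors) {
  n <- nrow(tenors)
  i <- findInterval(maturity_period, tenors$D)  # number of tenor points <= maturity
  if (i == 0) {            # shorter than the first tenor: all on the first point
    lo <- 1; hi <- 1; pv1 <- mtm; pv2 <- 0
  } else if (i == n) {     # assumption: at or beyond the last tenor, all on the last point
    lo <- n; hi <- n; pv1 <- 0; pv2 <- mtm
  } else {                 # between two tenors: split using root
    lo <- i; hi <- i + 1; pv1 <- mtm * root; pv2 <- mtm * (1 - root)
  }
  data.frame(Risk_factor1 = paste(curve, tenors$T[lo], sep = "_"),
             Risk_factor2 = paste(curve, tenors$T[hi], sep = "_"),
             Risk_factor1_mtm = pv1,
             Risk_factor2_mtm = pv2,
             stringsAsFactors = FALSE)
}

# Apply row by row and bind the pieces (base R stand-in for the ddply call)
rows <- lapply(seq_len(nrow(dat)), function(k) {
  r <- dat[k, ]
  data.frame(id = r$id, split_mtm(r$maturity_period, r$curve, r$root, r$mtm),
             stringsAsFactors = FALSE)
})
splitted_mtm <- do.call(rbind, rows)
print(splitted_mtm)
```

With the example data this gives the three rows listed under "My OUTPUT should be".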


[R] How to write an error to output

2013-10-16 Thread Katherine Gobin
Dear R forum,

The example below is just an indicative one and I have constructed it. My real 
life data and conditions are different.

I have a data.frame as given below

mydat = data.frame(A = c(19, 20, 19, 19, 19, 18, 16, 18, 19, 20), B = c(19, 20, 
20, 19, 20, 18, 19, 18, 17, 16))

if (length(mydat$A) > 10)

{
stop("A has length more than 10")
}else

if (max(mydat$B) > 18)
{
stop("max B exceeds limit")
}else

{result = mydat$A + mydat$B

    if (length(result) > 0)

{
         write.csv(data.frame(result = result), 'result.csv', row.names = FALSE)
         }
}

# -

When I execute the above code, I get the message

Error: max B exceeds limit

If all conditions are met, I get the output as result.csv.

If result.csv is generated, I am able to capture it and show the output in the 
front end. However, if the process cannot run because a condition is violated, 
an error is produced instead. How do I capture this error (and write it to a 
csv file) so that I can show it as a comment in the front end?

Kindly guide.


Katherine


Re: [R] How to write an error to output

2013-10-16 Thread Katherine Gobin
Dear sir,

Thanks a lot for your wonderful suggestion.

Regards

Katherine



On Wednesday, 16 October 2013 5:28 PM, jim holtman  wrote:
 
Will this work for you:


mydat = data.frame(A = c(19, 20, 19, 19, 19, 18, 16, 18, 19, 20), B =
c(19, 20, 20, 19, 20, 18, 19, 18, 17, 16))

if (length(mydat$A) > 10)

{
write.csv(data.frame(error = "A has length more than 10"),
'result.csv', row.names = FALSE)
stop("A has length more than 10")
}else

if (max(mydat$B) > 18)
{
write.csv(data.frame(error = "max B exceeds limit"), 'result.csv',
row.names = FALSE)
stop("max B exceeds limit")
}else

{result = mydat$A + mydat$B

    if (length(result) > 0)

{
         write.csv(data.frame(result = result), 'result.csv', row.names = FALSE)
         }
}
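
An alternative to writing the csv before each stop() is to catch the error in one place with tryCatch(). This is only a sketch: the function name run_checks and the temporary output path are illustrative and not part of the original code.

```r
mydat <- data.frame(A = c(19, 20, 19, 19, 19, 18, 16, 18, 19, 20),
                    B = c(19, 20, 20, 19, 20, 18, 19, 18, 17, 16))

# Illustrative output location; replace with the path the front end reads
out_file <- file.path(tempdir(), "result.csv")

# All validation and the calculation live in one function that may call stop()
run_checks <- function(d) {
  if (length(d$A) > 10) stop("A has length more than 10")
  if (max(d$B) > 18)    stop("max B exceeds limit")
  data.frame(result = d$A + d$B)
}

# On error, turn the message into a one-row data.frame instead of aborting
output <- tryCatch(run_checks(mydat),
                   error = function(e) data.frame(error = conditionMessage(e)))

write.csv(output, out_file, row.names = FALSE)
```

The front end can then tell success from failure by checking whether the file has a "result" or an "error" column.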

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.





[R] Subseting a data.frame

2013-10-17 Thread Katherine Gobin
Dear Forum,

I have a data frame as 

mydat = data.frame(basel_asset_class = c(2, 8, 8 ,8), defa_frequency = c(0.15, 
0.07, 0.03, 0.001))

> mydat
  basel_asset_class defa_frequency
1                 2          0.150
2                 8          0.070
3                 8          0.030
4                 8          0.001


I need the subset of this data.frame where the number of records for the given 
basel_asset_class is > 2, i.e. I need to obtain the subset shown below 
(since there is only 1 record against basel_asset_class = 2, I want to filter 
it out)

> mydat_a
  basel_asset_class defa_frequency
1                 8          0.070
2                 8          0.030
3                 8          0.001

Kindly guide

Katherine


Re: [R] Subseting a data.frame

2013-10-17 Thread Katherine Gobin
I am sorry, perhaps I was not able to put the question properly. I am not 
looking for the subset of the data.frame where basel_asset_class is > 2; I 
agree that would have been a basic requirement. Let me try to put the 
question again. 

I have a data frame as 

mydat = data.frame(basel_asset_class = c(4, 8, 8 ,8), defa_frequency = c(0.15, 
0.07, 0.03, 0.001))

# Please note I have changed the basel_asset_class to 4 from 2, to avoid 
confusion.

> mydat
  basel_asset_class defa_frequency
1                 4          0.150
2                 8          0.070
3                 8          0.030
4                 8          0.001



This is just a representative example. In reality, I may have any number of basel 
asset classes. 4, 8 etc. are IDs and can be anything, so I can't hard-code it as 
subset(mydat, mydat$basel_asset_class > 2).


What I need is to select only those records for which there are more than two 
default frequencies (defa_frequency). Thus, there is only one default frequency 
= 0.150 w.r.t. basel_asset_class = 4 whereas there are default frequencies 
w.r.t. basel asset class 4; similarly there could be another basel asset class 
having, say, 5 default frequencies. Thus, I need to take the subset of the 
data.frame s.t. the number of corresponding defa_frequencies is greater than 2.

The idea is that we try to fit an exponential curve Y = A exp( BX ) for each of 
the basel asset classes, and to estimate the values of A and B one mathematically 
needs at least two values of X.

I hope I have been able to express my requirement. It's not that I need the subset 
of mydat s.t. basel asset class is > 2 (now 4 in the revised example), but the 
subset s.t. the number of default frequencies is greater than or equal to 2. This 
2 is not the same as basel asset class 2.

Kindly guide

With warm regards

Katherine Gobin
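
One way to express this requirement in base R is sketched below: count the records per class with table() and keep only the classes that reach a minimum count. The threshold name min_n is hypothetical; it is set to 2 here because the curve fit needs at least two points, and it can be raised if "more than two" is meant.

```r
mydat <- data.frame(basel_asset_class = c(4, 8, 8, 8),
                    defa_frequency = c(0.15, 0.07, 0.03, 0.001))

min_n <- 2  # assumed minimum number of default frequencies per class

# Number of records for each class ID, whatever the IDs happen to be
counts <- table(mydat$basel_asset_class)

# Class IDs (as character) that have at least min_n records
keep <- names(counts)[counts >= min_n]

mydat_a <- mydat[as.character(mydat$basel_asset_class) %in% keep, ]
print(mydat_a)
```

Nothing here depends on the actual class IDs, so no value needs to be hard-coded.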




On Thursday, 17 October 2013 9:33 PM, Bert Gunter  
wrote:
 
"Kindly guide" ...

This is a very basic question, so the kindest guide I can give is to read an 
Introduction to R (ships with R) or a R web tutorial of your choice so that you 
can learn how R works instead of posting to this list.

Cheers,
Bert





-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374


Re: [R] Subseting a data.frame

2013-10-17 Thread Katherine Gobin
Correction. (2nd para first three lines)
 
Please read the following lines

What I need is to select only those records for which there are more than two 
default frequencies (defa_frequency). Thus, there is only one default frequency 
= 0.150 w.r.t. basel_asset_class = 4 whereas there are default frequencies 
w.r.t. basel asset class 4,


as

What I need is to select only those records for which there are more than two 
default frequencies (defa_frequency). Thus, there is only one default frequency 
= 0.150 w.r.t. basel_asset_class = 4 whereas there are THREE default frequencies 
w.r.t. basel asset class 8,



I apologize for the inconvenience.

Regards

Katherine










Re: [R] Subseting a data.frame

2013-10-18 Thread Katherine Gobin
Dear sir,

Thanks a lot for your guidance. I have been benefited immensely by this 
discussion. Thanks again.

Regards

Katherine



On Friday, 18 October 2013 2:50 AM, Bert Gunter  wrote:
 
Thanks, Bill.

But ?ave specifically says:

ave(x, ..., FUN = mean)

Arguments:
x

A numeric.

So that it should not be expected to work properly if the argument is
not (coercible to) numeric. Nevertheless, defensive programming is
always wise.

Cheers,
Bert


On Thu, Oct 17, 2013 at 1:34 PM, William Dunlap  wrote:
>   May I ask why:
>     count_by_class <- with(dat, ave(numeric(length(basel_asset_class)),
> basel_asset_class, FUN=length))
>
>   should not be more simply done as:
>     count_by_class <- with(dat, ave(basel_asset_class, basel_asset_class,
> FUN=length))
>
> The way I did it would work if basel_asset_class were non-numeric.
>
> In ave(x, group, FUN=FUN), FUN's return value should be the same type as x
> (or you can get some odd type conversions).  E.g.,
>
>    > num <- c(2,3,2,2) ;  char <- c("Two","Three","Two","Two")
>    > ave(num, num, FUN=length) # good
>    [1] 3 1 3 3
>    > ave(char, char, FUN=length) # bad
>    [1] "3" "1" "3" "3"
>    > fac <- factor(char, levels=c("One","Two","Three"))
>    > ave(fac, fac, FUN=length)
>    [1] <NA> <NA> <NA> <NA>
>    Levels: One Two Three
>    Warning messages:
>    1: In `[<-.factor`(`*tmp*`, i, value = 0L) :
>      invalid factor level, NA generated
>    2: In `[<-.factor`(`*tmp*`, i, value = 3L) :
>      invalid factor level, NA generated
>    3: In `[<-.factor`(`*tmp*`, i, value = 1L) :
>      invalid factor level, NA generated
>
> but x=integer(length(group)) works in all cases:
>
>    > ave(integer(length(fac)), fac, FUN=length)
>    [1] 3 1 3 3
>    > ave(integer(length(char)), char, FUN=length)
>    [1] 3 1 3 3
>
>
>
> Bill Dunlap
>
> Spotfire, TIBCO Software
>
> wdunlap tibco.com
>
>
>
> From: Bert Gunter [mailto:gunter.ber...@gene.com]
> Sent: Thursday, October 17, 2013 1:06 PM
> To: William Dunlap
> Cc: Katherine Gobin; r-help@r-project.org
> Subject: Re: [R] Subseting a data.frame
>
>
>
> May I ask why:
>
> count_by_class <- with(dat, ave(numeric(length(basel_asset_class)),
> basel_asset_class, FUN=length))
>
> should not be more simply done as:
>
> count_by_class <- with(dat, ave(basel_asset_class, basel_asset_class,
> FUN=length))
>
> ?
>
> -- Bert
>
>
>
> On Thu, Oct 17, 2013 at 12:36 PM, William Dunlap  wrote:
>
>> What I need is to select only those records for which there are more than
>> two default
>> frequencies (defa_frequency),
>
> Here is one way.  There are many others:
>    > dat <- data.frame( # slightly less trivial example
>         basel_asset_class=c(4,8,8,8,74,3,74),
>         defa_frequency=(1:7)/8)
>    > count_by_class <- with(dat, ave(numeric(length(basel_asset_class)),
> basel_asset_class, FUN=length))
>    > cbind(dat, count_by_class) # see what we just computed
>      basel_asset_class defa_frequency count_by_class
>    1                 4          0.125              1
>    2                 8          0.250              3
>    3                 8          0.375              3
>    4                 8          0.500              3
>    5                74          0.625              2
>    6                 3          0.750              1
>    7                74          0.875              2
>    > dat[count_by_class>1, ] # I think this is what you are asking for
>      basel_asset_class defa_frequency
>    2                 8          0.250
>    3                 8          0.375
>    4                 8          0.500
>    5                74          0.625
>    7                74          0.875
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>

[R] Yield to maturity in R

2013-10-30 Thread Katherine Gobin
Dear R forum,

Just want to know if there is any function / package in R which will calculate 
Yield to Maturity in R for a given bond?

Regards

Katherine
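
In case no package is at hand, yield to maturity can be approximated in base R by solving price(y) = market price with uniroot(). The sketch below assumes a plain fixed-coupon bond priced on a coupon date (no accrued interest); the function names bond_price and ytm and all bond parameters are hypothetical.

```r
# Price of a fixed-coupon bond on a coupon date, for annual yield y
bond_price <- function(y, face, coupon_rate, n_years, freq = 1) {
  n   <- n_years * freq               # number of remaining coupons
  cpn <- face * coupon_rate / freq    # coupon paid each period
  t   <- seq_len(n)
  sum(cpn / (1 + y / freq)^t) + face / (1 + y / freq)^n
}

# Yield to maturity: the y at which the model price equals the market price
ytm <- function(price, face, coupon_rate, n_years, freq = 1) {
  uniroot(function(y) bond_price(y, face, coupon_rate, n_years, freq) - price,
          interval = c(1e-8, 1), tol = 1e-10)$root
}

# A bond trading at par yields its coupon rate; below par, the yield is higher
y_par      <- ytm(100, face = 100, coupon_rate = 0.05, n_years = 10)
y_discount <- ytm(95,  face = 100, coupon_rate = 0.05, n_years = 10)
```

The search interval c(1e-8, 1) assumes a yield between roughly 0% and 100%; negative yields or settlement between coupon dates would need a wider interval and an accrued-interest adjustment.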


[R] Readjusting frequencies

2013-11-11 Thread Katherine Gobin
Dear Forum,

I have following data.frame as

fraud_data = data.frame(no_of_frauds = c(1, 2, 4, 6, 7, 9, 10), frequency = 
c(3, 1, 7, 11, 13, 1, 4))

> fraud_data
  no_of_frauds frequency
1            1         3
2            2         1
3            4         7
4            6        11
5            7        13
6            9         1
7           10         4


I need to regroup the data in such a way that if the frequency is less than 5, 
the corresponding class gets merged into the next class, i.e. the frequencies 
are added until the running total exceeds 5. Thus, in the above data.frame, 
since the frequencies for no_of_frauds 1 and 2 are 3 and 1 respectively, 
these get added to class 4, and the frequency of this class becomes 3+1+7 = 
11. Likewise, the frequencies of classes 9 and 10 are 1 and 4, and when these 
are added the total is still 5, i.e. it doesn't exceed 5. Thus, these should get 
added to the previous class, i.e. 7.

Thus I need to have

no_of_frauds       frequency
        4                   11            #  ( 3 + 1 + 7)
        6                   11           
        7                   18            #  (13 + 1 + 4)

Kindly guide

Regards

Katherine
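
The merging rule described above can be sketched as a simple running total. This is one reading of the rule, under the assumption that a group is closed as soon as its running total exceeds the threshold and that any leftover at the end is added to the last closed group; the function name regroup and the argument threshold are hypothetical.

```r
fraud_data <- data.frame(no_of_frauds = c(1, 2, 4, 6, 7, 9, 10),
                         frequency    = c(3, 1, 7, 11, 13, 1, 4))

regroup <- function(classes, freq, threshold = 5) {
  out_class <- numeric(0)
  out_freq  <- numeric(0)
  running   <- 0
  for (i in seq_along(freq)) {
    running <- running + freq[i]
    if (running > threshold) {         # group is large enough: close it here
      out_class <- c(out_class, classes[i])
      out_freq  <- c(out_freq, running)
      running   <- 0
    }
  }
  if (running > 0) {                   # leftover classes at the end
    if (length(out_freq) > 0) {
      # merge the leftover into the previous (last closed) group
      out_freq[length(out_freq)] <- out_freq[length(out_freq)] + running
    } else {
      # nothing was ever closed: keep everything as a single group
      out_class <- classes[length(classes)]
      out_freq  <- running
    }
  }
  data.frame(no_of_frauds = out_class, frequency = out_freq)
}

res <- regroup(fraud_data$no_of_frauds, fraud_data$frequency)
print(res)
```

For the example data this yields classes 4, 6 and 7 with frequencies 11, 11 and 18, matching the table above.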


[R] lmom package

2014-12-03 Thread Katherine Gobin via R-help
Dear R Forum
I have a set of data say as given below and as an exercise of trying to fit 
statistical distribution to this data, I am estimating parameters. 
amounts =  
c(38572.5599129508,11426.6705314315,21974.1571641187,118530.32782443,3735.43055996748,66309.5211176106,72039.2934132668,21934.8841708626,78564.9136114375,1703.65825161293,2116.89180930203,11003.495671332,19486.3296339113,1871.35861218795,6887.53851253407,148900.978055447,7078.56497101651,79348.1239806592,20157.6241066905,1259.99802108593,3934.45912233674,3297.69946631591,56221.1154121067,13322.0705174134,45110.2498756567,31910.3686613912,3196.71168501252,32843.0140437202,14615.1499458453,13013.9915051561,116104.176753387,7229.03056392023,9833.37962177814,2882.63239493673,165457.372543821,41114.066453219,47188.1677766245,25708.5883755617,82703.7378298092,8845.04197017415,844.28834047836,35410.8486123933,19446.3808445684,17662.2398792892,11882.8497070776,4277181.17817307,30239.0371267968,45165.7512343364,22102.8513746687,5988.69296597127,51345.0146170238,1275658.35495898,15260.4892854214,8861.76578480635,37647.1638704867,4979.53544046949,7012.48134772332,3385.20612391205,1911.03114395959,66886.5036605189,2223.47536156462,814.947809578378,234.028589468841,5397.4347625133,13346.3226579065,28809.3901352898,6387.69226236731,5639.42730553242,2011100.92675507,4150.63707173462,34098.7514446498,3437.10672573502,289710.315303182,8664.66947305203,13813.3867161134,208817.521491857,169317.624400274,9966.78447705792,37811.1721605562,2263.19211279927,80434.5581206454,19057.8093104899,24664.5067589624,25136.5042354789,3582.85741610706,6683.13898432794,65423.9991390846,134848.302304064,3018.55371579808,546249.641168158,172926.689143006,3074.15064180208,1521.70624812788,59012.4248281661,21226.928522236,17572.5682970983,226.646947337851,56232.2982652019,14641.0043361533,6997.94414914865)
library(lmom)
lmom <- samlmu(amounts)

# Normal distribution
parameters_of_NOR  <- pelnor(lmom); parameters_of_NOR

> parameters_of_NOR  <- pelnor(lmom); parameters_of_NOR
      mu    sigma 
115148.4 175945.8 

# Minitab and SPSS parameter values
           Location     Scale
Minitab    115148.4    485173
SPSS       115148.4    485173

# Log normal 3 parameter distribution
parameters_of_LN3  <- pelln3(lmom); parameters_of_LN3

> parameters_of_LN3  <- pelln3(lmom); parameters_of_LN3
       zeta          mu       sigma 
3225.798890    9.114879    2.240841 

# Minitab and SPSS parameter values
           Location     Scale      Shape
Minitab     9.73361   1.76298   75.51864
SPSS        9.7336    1.763     75.519

Similarly, except for the Generalized Extreme Value distribution, all the 
parameter values differ significantly from those obtained using Minitab and 
SPSS. In the case of the Normal distribution, the dispersion parameter is simply 
the sample standard deviation; Excel also gives 485172.8, which differs 
significantly from what we get from R.
The parameter values also differ for many other distributions, viz. the Gamma 
distribution etc.
Is a different algorithm or logic used in R? Can someone please guide?
Regards
Katherine




[R] lmom package - Resending the email

2014-12-03 Thread Katherine Gobin via R-help
Dear R forum
I sincerely apologize for my earlier mail with the captioned subject: all the 
values got mixed up and the email was not readable. I am trying to write it 
again. 
My problem is that I have a set of data and I am trying to fit some distributions 
to it. As a part of this exercise, I need to find the parameter values of 
various distributions, e.g. Normal distribution, Log normal distribution etc. I 
am using the lmom package to do this; however, the parameter values obtained 
using the lmom package differ to a large extent from the parameter values 
obtained using, say, MINITAB and SPSS, as given below -

amounts =  
c(38572.5599129508,11426.6705314315,21974.1571641187,118530.32782443,3735.43055996748,66309.5211176106,72039.2934132668,21934.8841708626,78564.9136114375,1703.65825161293,2116.89180930203,11003.495671332,19486.3296339113,1871.35861218795,6887.53851253407,148900.978055447,7078.56497101651,79348.1239806592,20157.6241066905,1259.99802108593,3934.45912233674,3297.69946631591,56221.1154121067,13322.0705174134,45110.2498756567,31910.3686613912,3196.71168501252,32843.0140437202,14615.1499458453,13013.9915051561,116104.176753387,7229.03056392023,9833.37962177814,2882.63239493673,165457.372543821,41114.066453219,47188.1677766245,25708.5883755617,82703.7378298092,8845.04197017415,844.28834047836,35410.8486123933,19446.3808445684,17662.2398792892,11882.8497070776,4277181.17817307,30239.0371267968,45165.7512343364,22102.8513746687,5988.69296597127,51345.0146170238,1275658.35495898,15260.4892854214,8861.76578480635,37647.1638704867,4979.53544046949,7012.48134772332,3385.20612391205,1911.03114395959,66886.5036605189,2223.47536156462,814.947809578378,234.028589468841,5397.4347625133,13346.3226579065,28809.3901352898,6387.69226236731,5639.42730553242,2011100.92675507,4150.63707173462,34098.7514446498,3437.10672573502,289710.315303182,8664.66947305203,13813.3867161134,208817.521491857,169317.624400274,9966.78447705792,37811.1721605562,2263.19211279927,80434.5581206454,19057.8093104899,24664.5067589624,25136.5042354789,3582.85741610706,6683.13898432794,65423.9991390846,134848.302304064,3018.55371579808,546249.641168158,172926.689143006,3074.15064180208,1521.70624812788,59012.4248281661,21226.928522236,17572.5682970983,226.646947337851,56232.2982652019,14641.0043361533,6997.94414914865)

library(lmom)
lmom  =  samlmu(amounts)
# Normal Distribution parameters
parameters_of_NOR  <- pelnor(lmom); parameters_of_NOR

      mu    sigma
115148.4 175945.8

           Location     Scale
Minitab    115148.4    485173
SPSS       115148.4    485173

# Log Normal (3 Parameter) Distribution parameters
       zeta          mu       sigma
3225.798890    9.114879    2.240841

           Location     Scale      Shape
MINITAB     9.73361   1.76298   75.51864
SPSS        9.7336    1.763     75.519

Except for the Generalized Extreme Value distribution, all the other 
distributions, e.g. Gamma, Exponential (2 parameter) etc., give different 
results than MINITAB and SPSS.
Can someone guide me?

Regards
Katherine
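
The gap between the two scale values for the Normal distribution can be illustrated without the lmom package. The sketch below rests on the assumption that the L-moment fit estimates sigma as sqrt(pi) times the second sample L-moment (for a normal distribution the second L-moment equals sigma / sqrt(pi)), whereas Minitab, SPSS and Excel report the ordinary sample standard deviation. The function name sample_l2 and the simulated data are hypothetical.

```r
# Second sample L-moment via probability-weighted moments: l2 = 2*b1 - b0
sample_l2 <- function(x) {
  x  <- sort(x)
  n  <- length(x)
  b0 <- mean(x)
  b1 <- sum((seq_len(n) - 1) / (n - 1) * x) / n
  2 * b1 - b0
}

set.seed(1)
x <- rlnorm(100, meanlog = 10, sdlog = 2)  # simulated, strongly skewed sample

sigma_lmom <- sqrt(pi) * sample_l2(x)  # L-moment based estimate of sigma
sigma_sd   <- sd(x)                    # ordinary sample standard deviation

print(c(L_moment = sigma_lmom, moment = sigma_sd))
```

For data that really are normal the two estimates should be close; for skewed data such as these they can differ widely, so a mismatch with moment-based software does not by itself indicate an error in R.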
