Re: [R] Deleting Rows based on Factor and Time Period

Anna Dunietz Wed, 14 Sep 2011 01:56:10 -0700

Alright - for the first stock in the month, I pretty much copied Mikkel's
code and got what I wanted.  Thanks! Anna


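# Base R only: calendar year and numeric month (the ISO year from isoWeekYear
# can differ from the calendar year in the first and last days of a year)
alldat$year <- format(alldat$mydate, "%Y")
alldat$months <- format(alldat$mydate, "%m")
# Sort by date within each month so the row kept is the earliest one
alldat <- alldat[order(alldat$year, alldat$months, alldat$mydate), ]
alldat[!duplicated(paste(alldat$year, alldat$months, alldat$myeq)), ]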
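
Note that keeping the first row per month does not by itself guarantee a gap of a week between rows: 31 July and 1 August would both be kept. Below is a minimal sketch of one way to enforce the original rule directly. It assumes the rule is read as "keep a row only if it is at least 7 days after the last kept row for the same equity". The helper name keep_spaced and its gap argument are made up for illustration; the data is the sample from the original post.

```r
# Sample data from the original post
mydate <- as.Date(c("2001-07-02", "2001-07-02", "2001-07-03", "2001-07-03",
                    "2001-07-05", "2001-07-05", "2001-07-10", "2001-07-13",
                    "2010-01-27"))
myeq <- factor(c("FCX.UN.Equity", "TIE.UN.Equity", "FCX.UN.Equity",
                 "TIE.UN.Equity", "FCX.UN.Equity", "TIE.UN.Equity",
                 "TIE.UN.Equity", "L.UN.Equity", "FCX.UN.Equity"))
alldat <- data.frame(mydate, myeq)

# Hypothetical helper: given dates sorted ascending, flag the ones to keep.
# A date is kept if it is the first one, or at least `gap` days after the
# most recently kept date.
keep_spaced <- function(dates, gap = 7) {
  keep <- logical(length(dates))
  last_kept <- as.Date(NA)
  for (i in seq_along(dates)) {
    if (is.na(last_kept) || as.numeric(dates[i] - last_kept) >= gap) {
      keep[i] <- TRUE
      last_kept <- dates[i]
    }
  }
  keep
}

# Sort by equity, then date, so each group is contiguous and ascending
alldat <- alldat[order(alldat$myeq, alldat$mydate), ]

# split() returns groups in factor-level order, which matches the sort above
keep <- unlist(lapply(split(alldat$mydate, alldat$myeq), keep_spaced),
               use.names = FALSE)
result <- alldat[keep, ]
result
```

On the sample data this keeps five rows: 2001-07-02 and 2010-01-27 for FCX.UN.Equity, 2001-07-13 for L.UN.Equity, and 2001-07-02 and 2001-07-10 for TIE.UN.Equity (the last one is 8 days after the previous kept row).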

On Wed, Sep 14, 2011 at 10:21 AM, Anna Dunietz <anna.duni...@gmail.com> wrote:

> Mikkel - thank you so much! That's a great start! I could change my demands
> a bit and pick out the first stock each month.  The first stock each week is
> helpful, but I would need a bigger gap between dates - hence, the first
> stock a month would be much better.  It seems to me, however, that
> isoWeekYear is great for week and year but not so great for month.  If you
> have any ideas, please let me know!
>
> Thanks!
> Anna
>
> On Tue, Sep 13, 2011 at 6:55 PM, Mikkel Grum <mi2kelg...@yahoo.com> wrote:
>
>> The following will get you the first stock in each week. Is that useful?
>>
>> install.packages("surveillance")
>> library(surveillance)
>> alldat$year <- isoWeekYear(alldat$mydate)$ISOYear
>> alldat$week <- isoWeekYear(alldat$mydate)$ISOWeek
>> alldat <- alldat[order(alldat$year, alldat$week, alldat$mydate), ]
>> alldat[!duplicated(paste(alldat$year, alldat$week, alldat$myeq)), ]
>>
>>
>>
>> ----- Original Message -----
>> From: Anna Dunietz <anna.duni...@gmail.com>
>> To: r-help@r-project.org
>> Cc:
>> Sent: Tuesday, September 13, 2011 9:04 AM
>> Subject: [R] Deleting Rows based on Factor and Time Period
>>
>> Hi All!
>>
>> I have been messing around with this problem for about a week but to no
>> avail! The following data has been cut down in order to make my question
>> reproducible.  The alldat data frame includes 2 columns: 1 date column and
>> 1 factor column (equity names).
>>
>>
>> mydate<-as.Date(c("2001-07-02","2001-07-02","2001-07-03","2001-07-03","2001-07-05","2001-07-05","2001-07-10","2001-07-13","2010-01-27"),origin="1970-01-01")
>>
>>
>> myeq<-factor(c("FCX.UN.Equity","TIE.UN.Equity","FCX.UN.Equity","TIE.UN.Equity","FCX.UN.Equity","TIE.UN.Equity","TIE.UN.Equity","L.UN.Equity","FCX.UN.Equity"))
>>
>> alldat<-data.frame(mydate,myeq)
>>
>>
>> > alldat
>>       mydate          myeq
>> 1 2001-07-02 FCX.UN.Equity
>> 2 2001-07-02 TIE.UN.Equity
>> 3 2001-07-03 FCX.UN.Equity
>> 4 2001-07-03 TIE.UN.Equity
>> 5 2001-07-05 FCX.UN.Equity
>> 6 2001-07-05 TIE.UN.Equity
>> 7 2001-07-10 TIE.UN.Equity
>> 8 2001-07-13   L.UN.Equity
>> 9 2010-01-27 FCX.UN.Equity
>>
>>
>> I group respective factors together by using the split function.  For each
>> factor, I am interested in deleting the rows whose dates are less than or
>> equal to the date of the *first* row for that stock + 6 days.  Repeat the
>> preceding sentence, but instead of *first* use second, third, etc.  In
>> short, I do not want an equity that has dates within a week of one another
>> at any point in the data frame/list (depending on if you're looking at
>> alldat or divall).  For example, for FCX.UN.Equity, I would only want the
>> row beginning with 2001-07-02 to remain, as well as the row starting with
>> 2010-01-27.  I cannot delete rows immediately because I need all rows in
>> order to determine which rows to delete.
>>
>> diveq<-alldat$myeq
>> divall<-split(alldat,diveq)
>>
>> I try to pick out those rows that I want to delete by using a double loop
>> (inefficient and awful, I know).  For better or for worse, the double loop
>> does not work.  I get integer(0) for all elements of workin.  I put the
>> second condition in the which function so that the first date is saved.  I
>> use the third condition so that the dates looked at are all greater than or
>> equal to the date being looked at.  I have spent many, many hours on this
>> and still cannot figure it out.
>>
>> workin<-list()
>>   for(j in 1:length(divall)){
>>   for(i in 1:nrow(divall[[j]])){
>>     workin[[j]]<-which(divall[[j]][,1]<=divall[[j]][i,1]+6 &
>> divall[[j]][,1]!=divall[[j]][i,1] & divall[[j]][,1]>=divall[[j]][i,1])
>>   }}
>>
>> If I could get the workin list to work, I would use unique and unlist in
>> order to find the index that would show me which rows in divall/alldat need
>> to be deleted.
>>
>> I hope this has been clear.  Please let me know if you need any more
>> information!
>>
>> Thank you very much!
>> Anna
>>
>>     [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
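
On the double loop in the original question: workin[[j]] is overwritten on every pass of the inner loop, so only the result for the last row of each group survives. In the sample data the last row of each group has no later date within 6 days, which would explain integer(0) everywhere. The sketch below is one possible repair that accumulates the hits instead of overwriting them. The variable names hits, later and kept are illustrative and not from the original post.

```r
# Sample data from the original post
mydate <- as.Date(c("2001-07-02", "2001-07-02", "2001-07-03", "2001-07-03",
                    "2001-07-05", "2001-07-05", "2001-07-10", "2001-07-13",
                    "2010-01-27"))
myeq <- factor(c("FCX.UN.Equity", "TIE.UN.Equity", "FCX.UN.Equity",
                 "TIE.UN.Equity", "FCX.UN.Equity", "TIE.UN.Equity",
                 "TIE.UN.Equity", "L.UN.Equity", "FCX.UN.Equity"))
alldat <- data.frame(mydate, myeq)
divall <- split(alldat, alldat$myeq)

# For each group, collect the positions of rows that fall 1 to 6 days after
# some other row in the same group
workin <- lapply(divall, function(d) {
  hits <- integer(0)
  for (i in seq_len(nrow(d))) {
    later <- which(d$mydate > d$mydate[i] & d$mydate <= d$mydate[i] + 6)
    hits <- c(hits, later)  # accumulate rather than overwrite
  }
  sort(unique(hits))
})

# Drop the flagged rows. d[-integer(0), ] would return zero rows, so groups
# with nothing flagged are passed through unchanged.
kept <- do.call(rbind, Map(function(d, idx) {
  if (length(idx) > 0) d[-idx, ] else d
}, divall, workin))
kept
```

This flags a row whenever it follows any earlier row by 6 days or less, even if that earlier row is itself flagged. On the sample data TIE.UN.Equity therefore keeps only 2001-07-02, because 2001-07-10 is within 6 days of 2001-07-05. If rows should be compared only against rows that are actually kept, the rule has to be applied sequentially instead.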

        [[alternative HTML version deleted]]

