Thank you all. I now have the right solution for this (perhaps of interest to some):
library(dplyr)

# Flag the k rows before each onset (idx == 1); the onset row itself gets 0.
check_pre <- function(idx, k) {
  pre_vec <- sapply(seq_along(idx), function(x)
    +any(idx[x:min(x + k, length(idx))] %in% 1))
  pre_vec[idx == 1] <- 0
  return(pre_vec)
}

df %>%
  group_by(country) %>%
  mutate(
    # idx marks the year in which X1 switches from 0 to 1
    # (or the first row of a country, if X1 is already 1 there)
    idx = +((lag(X1) == 0 & X1 == 1) | (row_number() == 1 & X1 == 1)),
    X1_pre4 = check_pre(idx, 4),
    X1_pre5 = check_pre(idx, 5),
    idx = NULL
  )

> On 27 Jul 2019, at 10:45, Faradj Koliev <farad...@gmail.com> wrote:
>
> Peter Dalgaard,
>
> Thanks for this.
>
> I’ll try to think of ways to apply this logic. At the moment, I’m trying to
> do this with “mutate” using the dplyr package. But it’s not easy..
>
>> On 27 Jul 2019, at 10:33, peter dalgaard <pda...@gmail.com> wrote:
>>
>> Some pointers (not tested, may contain blunders...)
>>
>> (a) you likely need some sort of split-operate-unsplit construct, by
>> country. E.g.,
>>
>> myfun <- function(d) {....operate on data frame with only one country....}
>> ll <- split(data, data$country)
>> ll.new <- lapply(ll, myfun)
>> data.new <- unsplit(ll.new, data$country)
>>
>> (There might be a tidyverse idiom for this too)
>>
>> (b) your X1_pre5count looks like it is the same as cumsum(1-X1)*X1 (within
>> country)
>>
>> (c) if you count in the opposite direction, tt <- rev(cumsum(rev(1-X1))) you
>> get number of years until agreement. Then X1_pre4 should be as.integer(tt
>> <=4 & tt > 0)
>>
>> -pd
>>
>>> On 27 Jul 2019, at 09:13 , Faradj Koliev <farad...@gmail.com> wrote:
>>>
>>> Re-post, now in *plain text*.
>>>
>>>
>>>
>>> Dear R-users,
>>>
>>> I’ve a rather complicated task to do and need all the help I can get.
>>>
>>> I have data indicating whether a country has signed an agreement or not
>>> (1=yes and 0=otherwise). I want to simply create a variable that would
>>> capture the years before the agreement is signed. The aim is to see whether
>>> pre or post agreement period has any impact on my dependent variables.
>>>
>>> More precisely, I want to create the following variables:
>>> (i) a variable that is =1 in the 4 years pre/before the agreement, 0
>>> otherwise;
>>> (ii) a variable that is =1 in the 5 years before the agreement; and
>>> (iii) a variable that would count the 4 and 5 years pre the agreement
>>> (1,2,3,4..).
>>>
>>> Please see the sample data below. I have manually added the variables I
>>> would like to generate in R, labelled as “X1_pre4” (4 years before the
>>> agreement X1), “X2_pre4”, “X1_pre5” (5 years before the agreement X1),
>>> and “X1_pre5_count” (which basically counts the years, 1,2,3, etc). X1
>>> and X2 are the agreements that countries have either signed (1) or not (0).
>>> Note though that I want the variable to capture all the years up to 4 and
>>> 5. If it’s only 2 years, it should still be ==1 (please see the example
>>> below).
>>>
>>> To illustrate the logic: country A signed the agreement X1 in 1972
>>> in the sample data, so the (i) and (ii) variables above should be =1
>>> for the years 1970, 1971, and =0 from 1972 until the end of the study
>>> period.
>>>
>>> Country A signed the agreement X2 in 1975, so the (i) variable
>>> should be =1 from 1971 to 1974 (the 4 years before) and (ii) should be =1
>>> for the 1970-1974 period (the 5 years before the agreement is signed).
>>>
>>> Later, I would also like to create post_4 and post_5 variables, but I think
>>> I’ll be able to figure it out once I know how to generate the pre/before
>>> variables.
>>>
>>> All suggestions are much appreciated!
>>>
>>>
>>>
>>> data<-structure(list(country = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
>>> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
>>> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
>>> 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
>>> 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A", "B", "C"), class = "factor"),
>>> year = c(1970L, 1971L, 1972L, 1973L, 1974L, 1975L, 1976L,
>>> 1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L, 1984L, 1985L,
>>> 1986L, 1987L, 1988L, 1970L, 1971L, 1972L, 1973L, 1974L, 1975L,
>>> 1976L, 1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L, 1984L,
>>> 1985L, 1986L, 1987L, 1988L, 1970L, 1971L, 1972L, 1973L, 1974L,
>>> 1975L, 1976L, 1977L, 1978L, 1979L, 1980L, 1981L, 1982L, 1983L,
>>> 1984L, 1985L, 1986L, 1987L, 1988L, 1989L, 1990L, 1991L),
>>> X1 = c(0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>>> 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L,
>>> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L,
>>> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L,
>>> 1L, 1L), X2 = c(0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L,
>>> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L,
>>> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L,
>>> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L,
>>> 1L, 1L, 1L, 1L), X1_pre4 = c(1L, 1L, 0L, 0L, 0L, 0L, 0L,
>>> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
>>> 0L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
>>> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L,
>>> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X2_pre4 = c(0L, 1L, 1L,
>>> 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
>>> 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
>>> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L,
>>> 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), X1_pre5 = c(1L,
>>> 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
>>> 0L, 0L, 0L, 0L, 0L, 0L,
>>> 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L,
>>> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
>>> 0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
>>> X1_pre5_count = c(1L, 2L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
>>> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 2L, 3L,
>>> 4L, 5L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
>>> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 2L, 3L, 4L, 5L, 0L, 0L, 0L,
>>> 0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA,
>>> -60L))
>>>
>>>> On 26 Jul 2019, at 21:58, Bert Gunter <bgunter.4...@gmail.com> wrote:
>>>>
>>>> Because you posted in HTML, your example got mangled and resulted in an
>>>> error. Re-post in *plain text* please (making sure that you cut and paste
>>>> correctly)
>>>>
>>>> Bert Gunter
>>>>
>>>> "The trouble with having an open mind is that people keep coming along and
>>>> sticking things into it."
>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>>
>>>>
>>>> On Fri, Jul 26, 2019 at 12:25 PM Faradj Koliev <farad...@gmail.com> wrote:
>>>> Dear R-users,
>>>>
>>>> [...]
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> --
>> Peter Dalgaard, Professor,
>> Center for Statistics, Copenhagen Business School
>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> Phone: (+45)38153501
>> Office: A 4.23
>> Email: pd....@cbs.dk Priv: pda...@gmail.com
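
For the archive: Peter's pointers (a) and (c) quoted above can also be written without dplyr. What follows is only a minimal base-R sketch, not something tested on the full data. It assumes X1 stays at 1 once a country has signed and that rows are ordered by year within country. The helper name pre_window and the toy data frame are made up for illustration, and the count column uses cumsum() over the 0/1 flag rather than the expression in (b).

```r
# Hypothetical helper: flag the k years before the first signing year.
# Assumes x is 0/1, ordered by year, and never returns to 0 after a 1.
pre_window <- function(x, k) {
  # number of unsigned years from each row onward = years until signing
  tt <- rev(cumsum(rev(1 - x)))
  # a country that never signs has no pre-agreement period
  if (!any(x == 1)) tt[] <- 0
  as.integer(tt > 0 & tt <= k)
}

# Toy panel, made up for illustration (not the data from the thread)
toy <- data.frame(
  country = rep(c("A", "B"), each = 8),
  year    = rep(1970:1977, times = 2),
  X1      = c(0, 0, 1, 1, 1, 1, 1, 1,
              0, 0, 0, 0, 0, 0, 1, 1)
)

# split-operate-unsplit by country, as in pointer (a)
pieces <- split(toy, toy$country)
pieces <- lapply(pieces, function(d) {
  d$X1_pre4 <- pre_window(d$X1, 4)
  d$X1_pre5 <- pre_window(d$X1, 5)
  # running count inside the 5-year window, 0 everywhere else
  d$X1_pre5_count <- cumsum(d$X1_pre5) * d$X1_pre5
  d
})
toy_new <- unsplit(pieces, toy$country)
print(toy_new)
```

If a country can sign, withdraw and sign again, the monotonicity assumption breaks and the onset-based check_pre() at the top of this message is the safer route.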