Hi Bill,
Modifying `f2` seems to solve the problem.
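f2 <- function (data)
{
    library(dplyr)
    data %>%
        group_by(id, value) %>%
        mutate(date = as.Date(date)) %>%
        arrange(date) %>%
        # keep groups with a gap of more than 31 days, at their earliest date
        filter(any(c(abs(diff(date)), NA) > 31) & date == min(date)) %>%
        # keep one row per group if several readings share that earliest date
        filter(row_number() == 1)
}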
> filter(any(c(abs(diff(as.Date(date))),NA)>31)& date == min(date))
Note that the 'date == min(date)' will cause superfluous output rows
when there are several readings on the initial date for a given id/value
pair. E.g.,
> dat1 <- data.frame(stringsAsFactors=FALSE, id=rep("A", 4), value=rep("x", 4)
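
The effect described above can be illustrated in base R. This is only a sketch: the data frame `dup_demo` and its dates are made up for illustration (it is not the truncated `dat1` from the message), and it simply shows that a `date == min(date)` test matches every row on the earliest date, so an extra step is needed to keep one row.

```r
# Made-up readings for one id/value pair (an assumption, not data from the thread):
# two rows share the earliest date.
dup_demo <- data.frame(
  id    = rep("A", 4),
  value = rep("x", 4),
  date  = as.Date(c("2000-01-01", "2000-01-01", "2000-02-15", "2000-03-20")),
  stringsAsFactors = FALSE
)

# Every row on the earliest date passes the test, so two rows come back.
at_min <- dup_demo[dup_demo$date == min(dup_demo$date), , drop = FALSE]

# Keeping only the first of them mirrors the row_number() == 1 step in f2.
first_only <- at_min[1, , drop = FALSE]

print(nrow(at_min))
print(nrow(first_only))
```
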
Using base R you can solve this by doing some sorting and comparing
the first and last dates in each id-value group. Computing the first
and last dates can be vectorized.
f1 <- function(data) {
# sort by id, break ties with value, break remaining ties with date
sortedData <- data[with(data
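
Since the `f1` excerpt above is cut off, here is a separate minimal sketch of the sort-and-compare idea in base R. It is not the original `f1`: the function name `confirm_first`, the `min_gap` argument, and the `>= 30` day threshold on the first-to-last span are assumptions made for illustration. The sample data are only the six rows visible in the original post.

```r
# Hypothetical helper (not the f1 from the thread): for each id/value group,
# return the earliest row if the first and last dates are at least min_gap days apart.
confirm_first <- function(data, min_gap = 30) {
  data$date <- as.Date(data$date)
  # sort by id, break ties with value, break remaining ties with date
  sorted <- data[order(data$id, data$value, data$date), , drop = FALSE]
  n <- nrow(sorted)
  if (n == 0L) return(sorted)
  key <- paste(sorted$id, sorted$value, sep = "\r")
  # vectorized flags for the first and last row of each id/value run
  changed  <- key[-1] != key[-n]
  is_first <- c(TRUE, changed)
  is_last  <- c(changed, TRUE)
  # days between the first and last date of each group
  span <- as.numeric(sorted$date[is_last] - sorted$date[is_first])
  out <- sorted[is_first, , drop = FALSE][span >= min_gap, , drop = FALSE]
  rownames(out) <- NULL
  out
}

# The six complete rows shown in the original post; later rows are truncated there.
dat <- data.frame(
  id    = c("a", "a", "b", "c", "c", "c"),
  date  = c("2000-01-01", "2000-03-01", "2000-11-11",
            "2000-11-11", "2000-10-01", "2000-09-10"),
  value = c("x", "x", "w", "y", "y", "y"),
  stringsAsFactors = FALSE
)
res <- confirm_first(dat)
print(res)
```

On these six rows the sketch keeps a-x at 2000-01-01 and c-y at 2000-09-10 and drops b-w, which has a single reading. Comparing only the first and last dates differs from the dplyr versions in this thread, which test gaps between consecutive readings against 31 days.
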
Hi,
If `dat` is the dataset
library(dplyr)
dat%>%
group_by(id,value)%>%
arrange(date=as.Date(date))%>%
filter(any(c(abs(diff(as.Date(date))),NA)>31)& date == min(date))
#Source: local data frame [3 x 3]
#Groups: id, value
#
# id date value
#1 a 2000-01-01 x
#2 c 2000-09-10 y
#3
On Wed, Jul 16, 2014 at 8:51 AM, jim holtman wrote:
> I can reproduce what you requested, but there was the question about
> what happens with the multiple 'c-y' values.
>
>
>
>> require(data.table)
>> x <- read.table(text = 'id date value
> + a 2000-01-01 x
> + a 2000
Thanks guys - amazingly prompt solutions from the R community as always.
Yes, the c-y value reverts to just the first date event - the spirit of
this is that I am trying to identify and confirm a list of diagnoses that
a patient has coded in government administrative data. Once a diagnosis is
made
Thanks. So you only want a single entry with a given "id" & "value",
even if there are multiple possible confirmations.
Too bad about not being in an SQL data base. I've already partially
solved the problem using PostgreSQL. Just in case you, or others,
might be interested, below is a transcript o
I can reproduce what you requested, but there was the question about
what happens with the multiple 'c-y' values.
> require(data.table)
> x <- read.table(text = 'id date value
+ a 2000-01-01 x
+ a 2000-03-01 x
+ b 2000-11-11 w
+ c 2000-11-11 y
+ c 2000-10-01
On Wed, Jul 16, 2014 at 8:07 AM, Williams Scott wrote:
> Hi R experts,
>
> I have a dataset as sampled below. Values are only regarded as 'confirmed'
> in an individual ('id') if they occur more than once at least 30 days apart.
>
>
> id date value
> a 2000-01-01 x
> a 2000-03-01 x
> b
Hi R experts,
I have a dataset as sampled below. Values are only regarded as 'confirmed'
in an individual ('id') if they occur more than once at least 30 days apart.
id date value
a 2000-01-01 x
a 2000-03-01 x
b 2000-11-11 w
c 2000-11-11 y
c 2000-10-01 y
c 2000-09-10 y
c