Hi Josh,

Thanks for pointing this out. It hadn't occurred to me that someone might post 
something like this to indicate they would like to receive fewer or no 
messages. 

Paul 

--- On Mon, 5/21/12, Joshua Wiley <jwiley.ps...@gmail.com> wrote:

> From: Joshua Wiley <jwiley.ps...@gmail.com>
> Subject: Re: [R] Complex text parsing task
> To: "Paul Miller" <pjmiller...@yahoo.com>
> Cc: "Nick Gayeski" <n...@wildfishconservancy.org>, r-help@r-project.org
> Received: Monday, May 21, 2012, 11:01 AM
> Hi Paul,
> 
> I do not think that Nick's comment was really meant to be directed at
> you.  He is probably just tired of getting so many emails from R-help.
> 
> Nick, to stop getting emails if you no longer want them, try following
> the link at the bottom of every single email you have received from
> R-help. You can unsubscribe yourself from there if you want.  If you
> like R-help but just do not like the quantity of emails, you could
> consider switching your subscription to a daily digest so you just get
> one email.  Alternatively, you could create a special folder in your
> email for R-help messages, and create a filter that automatically
> sends all messages from R-help to that special folder so you still have
> them all but they do not clutter up your inbox.
> 
> Cheers,
> 
> Josh
> 
> On Mon, May 21, 2012 at 8:53 AM, Paul Miller <pjmiller...@yahoo.com> wrote:
> > Hi Nick,
> >
> > Can you elaborate (hopefully in a constructive way) on what it is that
> > you find objectionable about my post?
> >
> > Thanks,
> >
> > Paul
> >
> > --- On Mon, 5/21/12, Nick Gayeski <n...@wildfishconservancy.org> wrote:
> >
> >> From: Nick Gayeski <n...@wildfishconservancy.org>
> >> Subject: RE: [R] Complex text parsing task
> >> To: "'Paul Miller'" <pjmiller...@yahoo.com>, r-help@r-project.org
> >> Received: Monday, May 21, 2012, 10:36 AM
> >> Please stop sending these emails!
> >>
> >>
> >> -----Original Message-----
> >> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> >> On Behalf Of Paul Miller
> >> Sent: Monday, May 21, 2012 8:32 AM
> >> To: r-help@r-project.org
> >> Subject: [R] Complex text parsing task
> >>
> >> Hello Everyone,
> >>
> >> I have what I think is a complex text parsing task. I've provided some
> >> sample data below. There's a relatively simple version of the coding
> >> that needs to be done and a more complex version. If someone could help
> >> me out with either version, I'd greatly appreciate it.
> >>
> >> Here are my sample data.
> >>
> >> haveData <-
> >> structure(list(profile_key = structure(c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L,
> >> 4L, 5L, 5L, 5L, 6L, 6L, 7L, 7L), .Label = c("001-001 ", "001-002 ",
> >> "001-003 ", "001-004 ", "001-005 ", "001-006 ", "001-007 "
> >> ), class = "factor"), encounter_date = structure(c(9L, 10L, 11L, 12L,
> >> 13L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 4L, 7L, 7L), .Label = c(
> >> " 2009-03-01 ", " 2009-03-22 ", " 2009-04-01 ", " 2010-03-01 ",
> >> " 2010-10-15 ", " 2010-11-15 ", " 2011-03-01 ", " 2011-03-14 ",
> >> " 2011-10-10 ", " 2011-10-24 ", " 2012-09-15 ", " 2012-10-05 ",
> >> " 2012-10-17 "
> >> ), class = "factor"), raw = structure(c(9L, 12L, 16L, 13L, 10L, 7L, 6L,
> >> 3L, 2L, 4L, 14L, 15L, 1L, 5L, 8L, 11L), .Label = c(
> >> " ... If patient KRAS result is wild type, they will start Erbitux. ... (Several lines of material) ... Ordered KRAS mutation test 11/11/2011. Results are still not available. ... ",
> >> " ... KRAS (mutated). Therefore did not prescribe Erbitux. ... ",
> >> " ... KRAS (mutated). Will not prescribe Erbitux due to mutation. ... ",
> >> " ... KRAS (Wild). ...",
> >> " ... KRAS results are in. Patient has the mutation. ... ",
> >> " ... KRAS results still pending. Note that patient was negative for Lynch mutation. ...",
> >> " ... KRAS test results pending. Note that patient was negative for Lynch mutation. ...",
> >> " ... Ordered KRAS mutation testing on 02/15/2011. Results came back negative. ... (Several lines of material) ... Patient KRAS mutation test is negative. Will start Erbitux. ...",
> >> " ... Ordered KRAS testing on 10/10/2010. Results not yet available. If patient has a mutaton, will start Erbitux. ...",
> >> " ... Ordered KRAS testing. Waiting for results. ...",
> >> " ... Patient is KRAS negative. Started Erbitux on 03/01/2011. ...",
> >> " ... Received KRAS results on 10/20/2010. Test results indicate tumor is wild type. Ua Protein positve. ER/PR positive. HER2/neu positve. ...",
> >> " ... Still need to order KRAS mutation testing. ... ",
> >> " ... Tumor is negative for KRAS mutation. ...",
> >> " ... Tumor is wild type. Patient is eligible to receive Eribtux. ...",
> >> " ... Will conduct KRAS mutation testing prior to initiation of therapy with Erbitux. ..."
> >> ), class = "factor")), .Names = c("profile_key", "encounter_date", "raw"),
> >> row.names = c(NA, -16L), class = "data.frame")
> >>
> >> The following code displays the results of so-called "simple" coding.
> >>
> >> #### Simple coding ####
> >>
> >> KRASpatient <- c("001-001", "001-002", "001-003", "001-004", "001-005",
> >>                  "001-006", "001-007")
> >> KRAStested <- c(2,3,2,2,2,3,3)
> >> KRASwild <- c(1,0,2,0,3,1,3)
> >> KRASmutant <- c(4,2,2,3,1,2,2)
> >> simpleData <- data.frame(KRASpatient, KRAStested, KRASwild, KRASmutant)
> >> simpleData
> >>
> >> Here, KRAStested is calculated by summing all references to "KRAS" for
> >> each patient. Wild is calculated by summing all references to "wild
> >> type", "wild", and "negative" that come within 20 words of the closest
> >> reference to KRAS. Mutant is calculated by summing all references to
> >> "mutant", "mutated", and "positive" that occur within 20 words of the
> >> closest reference to KRAS.
> >>
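A rough R sketch of the "simple" rule quoted above, offered as an
illustration rather than the code behind the posted table. Assumptions:
notes are split into lowercase word tokens, "wild type" is covered by
counting "wild", and the helper names (tokenize, count_kras,
count_near_kras) are made up for this example. The three notes are
copied from the sample data with the padding trimmed.

```r
# Three notes taken from the sample data (padding trimmed).
notes <- data.frame(
  profile_key = c("001-004", "001-004", "001-005"),
  raw = c("KRAS (mutated). Will not prescribe Erbitux due to mutation.",
          "KRAS (mutated). Therefore did not prescribe Erbitux.",
          "Tumor is negative for KRAS mutation."),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split a note into lowercase word tokens.
tokenize <- function(text) {
  words <- strsplit(tolower(text), "[^a-z0-9/]+")[[1]]
  words[nzchar(words)]
}

# Hypothetical helper: number of "kras" tokens in a note.
count_kras <- function(text) sum(tokenize(text) == "kras")

# Hypothetical helper: number of tokens in `terms` that lie within
# `window` words of the nearest "kras" token.
count_near_kras <- function(text, terms, window = 20) {
  words <- tokenize(text)
  kras_pos <- which(words == "kras")
  term_pos <- which(words %in% terms)
  if (length(kras_pos) == 0 || length(term_pos) == 0) return(0L)
  nearest <- vapply(term_pos, function(p) min(abs(p - kras_pos)), numeric(1))
  sum(nearest <= window)
}

# One row of counts per note, then summed within each patient.
per_note <- cbind(
  KRAStested = vapply(notes$raw, count_kras, integer(1), USE.NAMES = FALSE),
  KRASwild   = vapply(notes$raw, count_near_kras, integer(1),
                      terms = c("wild", "negative"), USE.NAMES = FALSE),
  KRASmutant = vapply(notes$raw, count_near_kras, integer(1),
                      terms = c("mutant", "mutated", "positive"),
                      USE.NAMES = FALSE)
)
simple <- rowsum(per_note, notes$profile_key)
simple
```

The posted simpleData values were typed in directly, so this sketch is
not guaranteed to reproduce them. For example, "mutation" is not in the
mutant term list, so it is not counted here.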
> >>
> >> The second kind of coding is what I'm referring to as "complex coding".
> >> The following code displays the results of this type of coding.
> >>
> >> #### Complex coding ####
> >>
> >> KRAStested <- c(2,1,0,2,2,2,3)
> >> KRASwild <- c(1,0,0,0,3,0,3)
> >> KRASmutant <- c(0,0,0,3,0,1,0)
> >> complexData <- data.frame(KRASpatient, KRAStested, KRASwild, KRASmutant)
> >> complexData
> >>
> >> The results of "complex coding" differ substantially from those
> >> obtained under "simple coding" and I think illustrate the potential
> >> problems with that approach. With "complex coding", the goal would be to
> >> identify and sum only true references to KRAS testing and true
> >> references to the result of that testing (either wild type/negative or
> >> mutant/positive).
> >>
> >> True references to KRAS testing would be identified using a set of
> >> qualifiers that eliminate the false references. So, for example, one of
> >> the patients in my (made up) sample data has the phrase "Will conduct
> >> KRAS mutation testing prior to initiation of therapy with Erbitux" in
> >> their medical record. In this case, "Will" is a qualifier that indicates
> >> this is not a true reference to KRAS testing. For this exercise, other
> >> qualifiers related to KRAS testing would include "need", "order" (but
> >> not the past tense "ordered"), "wait", "waiting", "await", and
> >> "awaiting". To be a qualifier, these terms would need to occur within 12
> >> words of the closest true reference to KRAS.
> >>
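One literal reading of the testing-qualifier rule above, sketched in R.
Assumptions: the 12-word window is symmetric (qualifiers before or after
KRAS both count), matching is on whole lowercase tokens so "order" does
not match "ordered", and the helper names (tokenize, count_true_kras)
are hypothetical. The example notes come from the sample data.

```r
# Hypothetical helper: split a note into lowercase word tokens.
tokenize <- function(text) {
  words <- strsplit(tolower(text), "[^a-z0-9/]+")[[1]]
  words[nzchar(words)]
}

# Qualifier terms listed in the post.
qualifiers <- c("will", "need", "order", "wait", "waiting",
                "await", "awaiting")

# Hypothetical helper: count "kras" tokens that have no qualifier
# within `window` words on either side.
count_true_kras <- function(text, window = 12) {
  words <- tokenize(text)
  kras_pos <- which(words == "kras")
  qual_pos <- which(words %in% qualifiers)
  if (length(kras_pos) == 0) return(0L)
  if (length(qual_pos) == 0) return(length(kras_pos))
  blocked <- vapply(kras_pos,
                    function(p) any(abs(p - qual_pos) <= window),
                    logical(1))
  sum(!blocked)
}

# "Will" precedes KRAS, so the reference is dropped: 0
count_true_kras("Will conduct KRAS mutation testing prior to initiation of therapy with Erbitux.")
# "Waiting" is within the window, so the reference is dropped: 0
count_true_kras("Ordered KRAS testing. Waiting for results.")
# No qualifier nearby, so the reference is kept: 1
count_true_kras("Patient is KRAS negative. Started Erbitux on 03/01/2011.")
```

Read this literally and "KRAS (mutated). Will not prescribe Erbitux due
to mutation." is also discarded, because "Will" falls inside the window.
Restricting the window to words that come before KRAS may be closer to
the intent, but that is a guess.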
> >> True references to the results of testing would also be identified
> >> using a set of qualifiers that eliminate false references. Here the list
> >> of qualifiers would include "if", "lynch", "kras mutation test", "kras
> >> mutation testing" and "for kras mutation". Qualifiers would need to come
> >> within 12 words of a true reference to KRAS testing.
> >>
> >> There's an additional wrinkle for identifying true references to the
> >> results of testing. One also needs to take into account the presence of
> >> what I'm calling "nullifiers". For purposes of this exercise, nullifiers
> >> include "Ua Protein", "ER/PR", and "HER2/neu". If "positive" or
> >> "negative" come closer to one of these words than to a true reference to
> >> KRAS, then they should not be used to identify the results of KRAS
> >> testing.
> >>
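The nullifier rule above could be sketched in R along these lines. This
is only an illustration under assumptions: distance is measured in word
tokens, a tie between KRAS and a nullifier keeps the result, every
"kras" token is treated as a true reference, and the helper names
(tokenize, count_kras_results) are hypothetical.

```r
# Hypothetical helper: split a note into lowercase word tokens,
# keeping "/" so that "er/pr" and "her2/neu" stay single tokens.
tokenize <- function(text) {
  words <- strsplit(tolower(text), "[^a-z0-9/]+")[[1]]
  words[nzchar(words)]
}

# Hypothetical helper: count "positive"/"negative" tokens that are at
# least as close to a "kras" token as to any nullifier.
count_kras_results <- function(text,
                               result_terms = c("positive", "negative")) {
  words <- tokenize(text)
  kras_pos <- which(words == "kras")
  res_pos <- which(words %in% result_terms)
  if (length(kras_pos) == 0 || length(res_pos) == 0) return(0L)

  # "Ua Protein" spans two tokens; use both positions as anchors.
  ua_pos <- which(head(words, -1) == "ua" & tail(words, -1) == "protein")
  null_pos <- c(which(words %in% c("er/pr", "her2/neu")),
                ua_pos, ua_pos + 1L)

  # Distance from position p to the nearest anchor (Inf if none exist).
  dist_to <- function(p, anchors) {
    if (length(anchors) == 0) Inf else min(abs(p - anchors))
  }
  keep <- vapply(res_pos,
                 function(p) dist_to(p, kras_pos) <= dist_to(p, null_pos),
                 logical(1))
  sum(keep)
}

# "positive" sits next to ER/PR and far from KRAS, so it is dropped: 0
count_kras_results("Received KRAS results on 10/20/2010. Test results indicate tumor is wild type. Ua Protein positve. ER/PR positive. HER2/neu positve.")
# No nullifier present, so "negative" is kept: 1
count_kras_results("Patient is KRAS negative. Started Erbitux on 03/01/2011.")
```

Only exact spellings are matched, so the "positve" tokens in the sample
note are ignored either way.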
> >> Help with either type of coding would be greatly appreciated.
> >>
> >> Thanks,
> >>
> >> Paul
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 
> -- 
> Joshua Wiley
> Ph.D. Student, Health Psychology
> Programmer Analyst II, Statistical Consulting Group
> University of California, Los Angeles
> https://joshuawiley.com/
>

