Re: Best practices for documenting NLP versions

2022-10-21 Thread Greg Silverman
e a cleverer > solution? > > Peter > > On Fri, Oct 21, 2022 at 10:18 PM Greg Silverman > wrote: > > > Why not use Docker and versioning by tags? See "C. Boettiger, An > > introduction to Docker for reproducible research, SIGOPS Oper. Syst. Rev. >

Re: Best practices for documenting NLP versions

2022-10-21 Thread Greg Silverman
Why not use Docker and versioning by tags? See "C. Boettiger, An introduction to Docker for reproducible research, SIGOPS Oper. Syst. Rev. 49 (2015) 71–79. doi:10.1145/2723872.2723882. " On Fri, Oct 21, 2022 at 3:15 PM Peter Abramowitsch wrote: > Wel

Re: Segment annotation type

2022-03-23 Thread Greg Silverman
ges and unwanted printing in the log which > I've recently modified. Also I did some optimization of the code which was > wasting compute cycles by re-initializing itself for every document. I > can check it in, but you can get a good flavor of it by trying what's in >

Segment annotation type

2022-03-22 Thread Greg Silverman
How do I modify org.apache.ctakes.typesystem.type.textspan.Segment to actually create annotations for document segments/sections? Also, how do I disable annotations for the SemanticRoleRelation annotation type? Thanks! Greg-- -- Greg M. Silverman Senior Systems Developer NLP/IE

Re: Issue with serializable XML

2022-03-06 Thread Greg Silverman
= [XMLcleaner(x).xmlstring for x in df_text] > > df['TEXT_FIELD'] = cleaned > > > > Best, > > John > > > > *From: *Greg Silverman > *Date: *Sunday, March 6, 2022 at 5:10 PM > *To: *jrcas...@medicine.wisc.edu.invalid > > *Cc: *dev@ctakes.apache.org

Re: Issue with serializable XML

2022-03-06 Thread Greg Silverman
r 6, 2022 at 2:46 PM JOHN R CASKEY wrote: > I’ve encountered that when the input text file has control characters, for > example ^M > > The fix I used was to remove all control characters from the input text > files ahead of time via python. > > Best, > John Caskey > U
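The pre-cleaning step John describes could look roughly like the sketch below. It is not John's `XMLcleaner` (which is not shown in the thread); `strip_non_xml_chars` is a hypothetical helper that replaces every character outside the XML 1.0 `Char` range, such as the 0x1c from the error message, with a single space so that character offsets in the note stay unchanged. Tab, line feed, and carriage return are legal in XML 1.0, so this sketch leaves them alone.

```python
import re

# Matches any character NOT permitted by the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML10_ILLEGAL = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_non_xml_chars(text: str, replacement: str = " ") -> str:
    """Replace characters that cannot be serialized as XML 1.0.

    Hypothetical helper for illustration. A one-character replacement
    keeps the text length, and therefore annotation offsets, unchanged.
    """
    return _XML10_ILLEGAL.sub(replacement, text)


if __name__ == "__main__":
    note = "BP 120/80\x1cfollow up in 2 weeks"
    print(repr(strip_non_xml_chars(note)))
```

Replacing rather than deleting is a design choice: if the output is later compared against the original files, the offsets still line up.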

Issue with serializable XML

2022-03-06 Thread Greg Silverman
Got the error during processing of a large set of documents about mid-way through: org.xml.sax.SAXParseException: Trying to serialize non-XML 1.0 character: , 0x1c I encountered this once before, but I don't remember what the fix was. Running apache-ctakes-4.0.1-SNAPSHOT. Thanks! Greg-- -- Gre

Re: rule-based lookup for custom lexicon [EXTERNAL] [SUSPICIOUS]

2021-05-19 Thread Greg Silverman

Re: rule-based lookup for custom lexicon [EXTERNAL]

2021-05-18 Thread Greg Silverman
at 10:04 AM Finan, Sean < sean.fi...@childrens.harvard.edu> wrote: > To which ctakes component(s) are you referring? > ____ > From: Greg Silverman > Sent: Sunday, May 16, 2021 6:02 PM > To: dev@ctakes.apache.org; Himanshu Shekhar Sahoo

rule-based lookup for custom lexicon

2021-05-16 Thread Greg Silverman
I looked all over and could not find any information on how to add this pipeline component to cTAKES. I assume it uses UIMA Ruta? Thanks in advance! Greg-- -- Greg M. Silverman Senior Systems Developer NLP/IE Department of Surgery Universi

Re: access to presentation on cTAKES at scale [EXTERNAL]

2021-02-23 Thread Greg Silverman
lk ended up in the parent > > https://drive.google.com/drive/folders/1ngYeqkNWZNMLNpM69OFC9cDTEXYt5hPz > > > It is named "thayer_miller_ctakes_spark.pdf" > > ____ > From: Greg Silverman > Sent: Tuesday, February 23, 2021 11:07 AM

Re: access to presentation on cTAKES at scale [EXTERNAL]

2021-02-23 Thread Greg Silverman
Thanks! No available powerpoint, I presume? Greg-- On Tue, Feb 23, 2021 at 10:03 AM Finan, Sean < sean.fi...@childrens.harvard.edu> wrote: > https://www.youtube.com/watch?v=kZw42pGzyHs > ____ > From: Greg Silverman > Sent: Tuesday, Febru

access to presentation on cTAKES at scale

2021-02-23 Thread Greg Silverman
Hi, Is this presentation publicly available: "Fault-Tolerant, Distributed, and Scalable Natural Language Processing with cTAKES"? Unfortunately, I'm not an AMIA member. Thanks in advance! Greg-- -- Greg M. Silverman Senior Systems Developer NLP/IE

Re: authentication issue?

2021-02-10 Thread Greg Silverman
, it is working > > 10 Feb 2021 20:18:43 INFO UmlsUserApprover - Checking UMLS Account at > https://utslogin.nlm.nih.gov/cas/v1/api-key: > ..10 Feb 2021 20:18:45 INFO UmlsUserApprover - UMLS Account has been > validated > > > Peter > > On Wed, Feb 10, 2021 at 7:12 PM

authentication issue?

2021-02-10 Thread Greg Silverman
This had been working (4.0.0.1), but now am getting the following error: 10 Feb 2021 18:03:40 INFO SentenceDetector - Sentence detector model file: org/apache/ctakes/core/sentdetect/sd-med-model.zip 10 Feb 2021 18:03:40 INFO TokenizerAnnotatorPTB - Initializing org.apache.ctakes.core.ae.Tokenize

Re: error: CRITICAL [EXTERNAL]

2021-02-10 Thread Greg Silverman
different name this is actually > pretty easy to do - you can find loads of "how to" on the web. > The advantage to #2 is that you just replace the jar file in your bin/ > directory with the new copy. However, there are the disadvantages as > listed above. > > I recomm

Re: error: CRITICAL

2021-02-10 Thread Greg Silverman
r rather than the core cTakes code itself. > > Peter > > On Wed, Feb 10, 2021 at 4:25 PM Greg Silverman > wrote: > > > We're running version 4.0.0.1 on ~12K notes. The first time we ran it I > got > > a heap space error at ~10.5k notes processed (at about ~3

error: CRITICAL

2021-02-10 Thread Greg Silverman
We're running version 4.0.0.1 on ~12K notes. The first time we ran it I got a heap space error at ~10.5k notes processed (at about ~38 hours). I increased the heap space params and then reran. This time it died at the same place, but with a different error (see below): SEVERE: Exception occurred

Re: 4.0.1 build [EXTERNAL]

2021-02-03 Thread Greg Silverman
ady-to-download release of 4.0.1. In any case, thanks for the continuous > work on cTAKES. > > Regards, > Tomasz > > ____ > From: Greg Silverman > Sent: Wednesday, February 3, 2021 2:32 PM > To: dev@ctakes.apache.org > Subject: Re:

Re: 4.0.1 build [EXTERNAL]

2021-02-03 Thread Greg Silverman
easable state. It > requires a team effort, and getting enough volunteers together with > substantial overlapping free time is difficult to say the least. Of > course, a greater number of volunteers helps spread the burden. > > Sean >

Re: 4.0.1 build [EXTERNAL]

2021-02-02 Thread Greg Silverman
"latest official > release"? > > Sean > ____ > From: Greg Silverman > Sent: Tuesday, February 2, 2021 10:51 AM > To: dev@ctakes.apache.org > Subject: 4.0.1 build [EXTERNAL] > > * External Email - Caution * > > > An

4.0.1 build

2021-02-02 Thread Greg Silverman
Any chance of pushing 4.0.1 into the available prebuilt download? Had to revert from it to 4.0.0.1 after the authentication switch, and have to say it is painfully slow in comparison. Thanks for the consideration. Greg-- -- Greg M. Silverman Senior Systems Developer NLP/IE

Re: performance report [EXTERNAL]

2021-01-25 Thread Greg Silverman
dd a cpe.getPerformanceReport() after cpe.process() you should > have a ProcessTrace object. This is where my guessing ends as I have never > used a ProcessTrace and don't know exactly what to beg of it. > > I hope that is a decent start, > Sean >

Re: performance report

2021-01-23 Thread Greg Silverman
> begins a geometric rise in complexity of internal structures that depend on > sentences and a serious elevation of processing time. > > Peter > > Sent from my iPad > > > On Jan 23, 2021, at 18:09, Greg Silverman wrote: > > > > I found this: > > htt

performance report

2021-01-23 Thread Greg Silverman
I found this: https://medium.com/@felix_chan/install-apache-ctakes-924c40967ce2, which states: "A performance report is generated when the process is done." However, we are running this from the command line and no such report is being generated. Thanks! On Sat, Jan 23, 2021 at 11:

getting job information

2021-01-23 Thread Greg Silverman
Hi all, Is there a way to easily generate a performance report similar to the one generated by MetaMap (with timings for each task, etc.)? Thanks in advance! Greg-- -- Greg M. Silverman Senior Systems Developer NLP/IE Department of Surger

real time indexing of notes

2021-01-22 Thread Greg Silverman
I recall a recent thread about how cTAKES was being used for real-time indexing of notes. Does anyone have the references to which this referred? (I believe it was about a SPARC cluster.) Thanks! Greg-- -- Greg M. Silverman Senior Systems Developer NLP/IE

Re: Changes to UTS Authentication for Authorized Content Distributors

2020-11-11 Thread Greg Silverman
as above > > > > On Wed, Nov 11, 2020 at 7:27 PM Greg Silverman > wrote: > > > For example, the user installation guide has not been updated to reflect > > the changes NLM is implementing. The impact for our workflow is pretty > > significant, so without a clear

Re: Changes to UTS Authentication for Authorized Content Distributors [EXTERNAL]

2020-11-11 Thread Greg Silverman
s > developers and we haven't yet mapped out the effort. > > We will endeavor to have both an implementation and documentation > available before the current authentication is no longer supported by the > NLM. > > Sean > ________ >

Re: Changes to UTS Authentication for Authorized Content Distributors

2020-11-11 Thread Greg Silverman
-- On Tue, Nov 10, 2020 at 9:18 AM Greg Silverman wrote: > It's still unclear what this means for me as a user of a piece of software > that uses UTS for authentication purposes. Could someone please, in plain > language, describe what we as normal users who use software r

Re: Changes to UTS Authentication for Authorized Content Distributors

2020-11-10 Thread Greg Silverman
It's still unclear what this means for me as a user of a piece of software that uses UTS for authentication purposes. Could someone please, in plain language, describe what we as normal users who use software reliant on this authentication mechanism will have to do in order to not disrupt any runni

Re: Current thinking on new UMLS authentication

2020-09-18 Thread Greg Silverman
I never received the email you mentioned. I assume this will affect the API call to NLM for UMLS validation? If it does, why not take the NLM's model for UMLS and only require UMLS credentials at the time of download? Greg-- On Fri, Sep 18, 2020 at 12:33 PM Peter Abramowitsch wrote: > Hi All

Re: Looking for literature [EXTERNAL]

2020-01-29 Thread Greg Silverman
drens.harvard.edu > Harvard Scholar: http://scholar.harvard.edu/guergana_k_savova > > > -Original Message- > From: Greg Silverman [mailto:g...@umn.edu] > Sent: Wednesday, January 29, 2020 2:54 PM > To: dev@ctakes.apache.org > Subject: Looking for literature [EX

Looking for literature

2020-01-29 Thread Greg Silverman
I'm digging around for literature on the relationship between cTAKES and MiPACQ, and of course found this paper, "The MiPACQ Clinical Question Answering System," which describes how cTAKES was used with respect to the question-answering component of MiPACQ. However, I'm more interested in how cTAKES wa

Re: UMLS version [EXTERNAL]

2020-01-06 Thread Greg Silverman
> It is fairly easy to create a ctakes dictionary based upon other umls > versions. > > Sean > ________ > From: Greg Silverman > Sent: Monday, January 6, 2020 5:11 PM > To: dev@ctakes.apache.org > Subject: UMLS version [EXTERNAL] > >

UMLS version

2020-01-06 Thread Greg Silverman
Which version of the UMLS is cTAKES v4.0 using? Is it 2017ab? Thanks! Greg-- -- Greg M. Silverman Senior Systems Developer NLP/IE Department of Surgery University of Minnesota g...@umn.edu › evaluate-it.org ‹

score attribute in UmlsConcept annotation

2019-10-15 Thread Greg Silverman
I noticed this was always 0. Is this a place holder for future work? Thanks! Greg-- -- Greg M. Silverman Senior Systems Developer NLP/IE Department of Surgery University of Minnesota g...@umn.edu › evaluate-it.org ‹

Re: Large files taking forever to process [EXTERNAL]

2019-09-30 Thread Greg Silverman
build? > > On Sunday, September 29, 2019, Greg Silverman wrote: > > > Trying to do the maven build and getting the following error: "You're not > > authorized to execute any SonarQube analysis. Please contact your > SonarQube > > administrator." > > >

Re: Large files taking forever to process [EXTERNAL]

2019-09-29 Thread Greg Silverman
. On Sun, Sep 29, 2019 at 11:18 AM Greg Silverman wrote: > Trying to do the maven build and getting the following error: "You're not > authorized to execute any SonarQube analysis. Please contact your SonarQube > administrator." > > Please advise. I'm und

Re: Large files taking forever to process [EXTERNAL]

2019-09-29 Thread Greg Silverman
29, 2019 at 10:56 AM Greg Silverman wrote: > Never mind! I see I have to build from source. > > Greg-- > > On Sun, Sep 29, 2019 at 10:44 AM Greg Silverman wrote: > >> Hi Sean, >> I just ran another set of notes through cTAKES and noticed the following >> err

Re: Large files taking forever to process [EXTERNAL]

2019-09-29 Thread Greg Silverman
Never mind! I see I have to build from source. Greg-- On Sun, Sep 29, 2019 at 10:44 AM Greg Silverman wrote: > Hi Sean, > I just ran another set of notes through cTAKES and noticed the following > error: > > log4j: Setting property [conversionPattern] to [%d{dd MMM HH:mm

Re: Large files taking forever to process [EXTERNAL]

2019-09-29 Thread Greg Silverman
so, how do I construct the WindowedAttributeCleartkSubPipe.piper file? Thanks very much in advance! Greg-- On Tue, Sep 24, 2019 at 7:27 PM Greg Silverman wrote: > Sweet! That was definitely it! It's flying now (granted, our files are not > in the > 1 mb realm, like in the jira i

disambiguated feature for UMLSConcept

2019-09-28 Thread Greg Silverman
This harks back to a question I asked a few months ago about disambiguation. Is this enabled by default for the UMLSConcept annotation type? (I've noticed the value is always "false," so I would guess not.) If not, how does this get enabled? Thanks! Greg-- -- Greg M. Silverman Senior Systems Devel

Re: [EXTERNAL] Large files taking forever to process

2019-09-24 Thread Greg Silverman
that you can achieve linear results if you convert these > classes to use TreeMaps. We actually build the tree maps one time and > cache them in ThreadLocal variables which allows us to process multiple > threads simultaneously. > > Hope this helps, > John > > -Original

Re: Large files taking forever to process [EXTERNAL]

2019-09-24 Thread Greg Silverman
lace it with "load > WindowedAttributeCleartkSubPipe". > > It isn't a full fix for the problem, and I don't know if it will make your > processing faster, but you can give it a try. > > Sean > > > From: Greg Si

Large files taking forever to process

2019-09-24 Thread Greg Silverman
Any suggestions on how to speed up processing large clinical text notes approaching 13K lines? This is a very old corpus culled from EPIC notes back in 2009. I thought about splitting the notes into smaller chunks, but then I would have to deal with the offsets when analyzing system output against
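The chunking idea raised above, together with the offset bookkeeping it requires, can be sketched as follows. This is only an illustration of the general technique, not something cTAKES provides: `split_with_offsets` and `to_document_span` are hypothetical helper names, the 500-line chunk size is arbitrary, and the sketch assumes annotation offsets count the same units as Python string indices (which may not hold for characters outside the Basic Multilingual Plane).

```python
from typing import List, Tuple


def split_with_offsets(text: str, max_lines: int = 500) -> List[Tuple[int, str]]:
    """Split a note into chunks of at most max_lines lines.

    Returns (start_offset, chunk) pairs, where start_offset is the
    character position of the chunk within the original note.
    """
    lines = text.splitlines(keepends=True)  # keepends preserves exact lengths
    chunks = []
    offset = 0
    for i in range(0, len(lines), max_lines):
        chunk = "".join(lines[i:i + max_lines])
        chunks.append((offset, chunk))
        offset += len(chunk)
    return chunks


def to_document_span(chunk_start: int, begin: int, end: int) -> Tuple[int, int]:
    """Map a chunk-relative annotation span back to whole-note offsets."""
    return chunk_start + begin, chunk_start + end


if __name__ == "__main__":
    note = "line one\nline two\nline three\n"
    for start, chunk in split_with_offsets(note, max_lines=2):
        print(start, repr(chunk))
```

Each chunk would be processed as its own document, and every annotation's begin/end shifted by the chunk's start offset before comparing against the reference annotations. Cutting on fixed line counts can split a sentence or section in two, so a real splitter would probably prefer section boundaries.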

Re: acronyms/abbreviations [EXTERNAL]

2019-05-19 Thread Greg Silverman
, or by some kind of rules mechanism, but that > would also be labor intensive - a never-to-be-finished effort. These might > require the creation of an instant/lightweight VMR to structure the > contextual elements from the note that the scoring mechanism would reason > over. But

Re: acronyms/abbreviations [EXTERNAL]

2019-05-17 Thread Greg Silverman
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3111590/ On Fri, May 17, 2019 at 8:23 PM Greg Silverman wrote: > Yes, and regarding your last paragraph: This is where disambiguation comes > into play. Here is one method: > https://www.cs.cmu.edu/afs/cs/project/jair/pub/volume23/montoy

Re: acronyms/abbreviations [EXTERNAL]

2019-05-17 Thread Greg Silverman
is that many acronyms have multiple > meanings. Thus, you may accurately be able to tell that your identified > concept came from an acronym, but it was the wrong concept!! > > Peter > > On Thu, May 16, 2019 at 4:31 AM Greg Silverman wrote: > > > Got it! > > > &g

Re: acronyms/abbreviations [EXTERNAL]

2019-05-15 Thread Greg Silverman
(imio) nlp problem, so call your kudos with a solution! > > Sean > > > From: Greg Silverman > Sent: Wednesday, May 15, 2019 9:21 PM > To: dev@ctakes.apache.org > Subject: Re: acronyms/abbreviations [EXTERNAL] > > I'm just wondering how acronyms are identified as

Re: acronyms/abbreviations [EXTERNAL]

2019-05-15 Thread Greg Silverman
put components that can produce different formats > containing various types of information. > > Do you prefer to parse ml ? Or is columnized text output ok? Does this > go to a post-processing engine or a human user? > > Thanks, > > Sean > ____

acronyms/abbreviations

2019-05-15 Thread Greg Silverman
How can I get these from the XMI annotations? Thanks! Greg-- -- Greg M. Silverman Senior Systems Developer NLP/IE University of Minnesota g...@umn.edu › evaluate-it.org ‹

Re: Negation

2019-03-03 Thread Greg Silverman
Perfect! Thanks. On Fri, Mar 1, 2019 at 9:16 PM Peter Abramowitsch wrote: > In an IdentifiedAnnotation, the attribute "polarity" reflects the negation > value. > > On Fri, Mar 1, 2019 at 5:59 PM Greg Silverman wrote: > > > I found this: > > > https://cw
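Peter's pointer to the "polarity" attribute can be illustrated with a small sketch that reads it from XMI output. The embedded sample is hand-written for illustration rather than real cTAKES output, `mentions_with_polarity` is a hypothetical helper, and the reading of -1 as "negated" is an assumption that should be checked against your own files.

```python
import xml.etree.ElementTree as ET

# Hand-written, illustrative XMI fragment (not real cTAKES output).
SAMPLE_XMI = """<xmi:XMI xmlns:xmi="http://www.omg.org/XMI"
    xmlns:textsem="http:///org/apache/ctakes/typesystem/type/textsem.ecore">
  <textsem:DiseaseDisorderMention xmi:id="1" begin="3" end="12" polarity="-1"/>
  <textsem:SignSymptomMention xmi:id="2" begin="20" end="25" polarity="1"/>
</xmi:XMI>"""


def mentions_with_polarity(xmi_text):
    """Return (type_name, begin, end, polarity) for every element
    that carries a polarity attribute."""
    root = ET.fromstring(xmi_text)
    rows = []
    for elem in root.iter():
        if "polarity" in elem.attrib:
            # ElementTree tags look like "{namespace-uri}LocalName".
            type_name = elem.tag.rsplit("}", 1)[-1]
            rows.append((
                type_name,
                int(elem.get("begin")),
                int(elem.get("end")),
                int(elem.get("polarity")),
            ))
    return rows


if __name__ == "__main__":
    for type_name, begin, end, polarity in mentions_with_polarity(SAMPLE_XMI):
        # Assumption: -1 marks a negated mention.
        print(type_name, begin, end, "negated" if polarity == -1 else "affirmed")
```

Filtering on the attribute rather than on specific element names keeps the sketch independent of which mention types a given pipeline emits.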

Re: Negation

2019-03-01 Thread Greg Silverman
I found this: https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0+-+Assertion ... however, I don't see any attributes or types regarding negation. On Fri, Mar 1, 2019 at 5:41 PM Greg Silverman wrote: > I'm extracting CUI/concepts from our annotations and would like an

Negation

2019-03-01 Thread Greg Silverman
I'm extracting CUI/concepts from our annotations and would like any negations associated with these. We're running the default pipeline, but I can't see anything in the XMI output that resembles a negation for the UmlsConcepts. Are negation annotators not part of the default pipeline? Thanks! Gre

Re: uima-as examples [EXTERNAL]

2019-01-18 Thread Greg Silverman
's a work in progress but it > might be helpful: > https://github.com/tmills/ctakes-docker > Tim > > > -Original Message- > From: Greg Silverman > Reply-to: > To: dev@ctakes.apache.org

Re: uima-as examples

2019-01-18 Thread Greg Silverman
and not just in-process subsystems to the ctakes server process. > Is that right? > > On Thu, Jan 17, 2019 at 4:09 PM Greg Silverman wrote: > > > Anyone out there developed a pipeline using UIMA-AS, as opposed to the > > CPE/CPM file reader? > > > > Thanks in a

uima-as examples

2019-01-17 Thread Greg Silverman
Anyone out there developed a pipeline using UIMA-AS, as opposed to the CPE/CPM file reader? Thanks in advance! Greg-- -- Greg M. Silverman Senior Systems Developer NLP/IE Cardiovascular Informatics Uni