…or should it be spacyTAKES? Horrible name, but potentially useful tool. Thanks 
again to Tim for the inspiration.

Proof-of-principle FastAPI tool here:
https://github.com/twloehfelm/ctakescy

This stands up a FastAPI endpoint to which you send a cTAKES CAS XMI file 
(output from the clinical pipeline, for example), the Typesystem XML file, and a 
list of Types to modify ([…DiseaseDisorderMention, …SignSymptomMention], for 
example). It returns an updated CAS with polarity (and optionally uncertainty, 
historyOf, subject, and conditional) set by either the NegEx (negspacy; 
polarity only) or ConText (medspacy context; polarity plus the other 
attributes) spaCy component.

This is a proof of principle, not production ready, though it does work from 
the FastAPI test panel at least. In a production environment I’d make some 
obvious changes to make it more efficient and snappier (no need to send the 
typesystem with each request, or to initialize a new Language model with each 
request), and those changes are trivial to make.
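
To illustrate the second of those changes, here is a minimal sketch of building the pipeline once and reusing it across requests. It is not code from the repository: `build_pipeline` and `handle_request` are hypothetical names, and the returned dict is only a placeholder for a real spaCy Language object. The only real API used is `functools.lru_cache` from the standard library.

```python
from functools import lru_cache

# Counts how often the expensive constructor actually runs (illustration only).
BUILD_CALLS = {"count": 0}


@lru_cache(maxsize=None)
def build_pipeline(algorithm: str) -> dict:
    """Hypothetical stand-in for building a spaCy Language object.

    In a real service this is where the negation or ConText component would
    be added to the pipeline; here it just returns a placeholder dict.
    """
    BUILD_CALLS["count"] += 1
    return {"algorithm": algorithm}


def handle_request(algorithm: str, text: str) -> dict:
    """Hypothetical request handler that reuses the cached pipeline."""
    nlp = build_pipeline(algorithm)  # built once per algorithm, then reused
    return {"algorithm": nlp["algorithm"], "n_chars": len(text)}


if __name__ == "__main__":
    for note in ["no fever", "denies chest pain", "history of asthma"]:
        handle_request("negex", note)
    print(BUILD_CALLS["count"])  # the constructor ran only once
```

The same idea would apply to the typesystem: load it once at startup and keep it in memory rather than parsing it on every call.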

I was looking for a way to leverage Python-based NLP tools like spaCy while 
preserving the core features and rich annotations of cTAKES. I’m not very 
facile with Java or the cTAKES dev process, so this is my way of moving things 
over to Python where I can iterate and test faster than I can in Java.

The solution I came up with is to piece together tools that allow:

  1.  manipulating CAS in Python (dkpro-cassis library)
  2.  accessing the rich spaCy ecosystem
      *   build a spaCy Doc from existing cTAKES-assigned CAS attributes
          i.   super useful content for understanding the spaCy framework 
               here: 
               https://applied-language-technology.mooc.fi/html/about.html
          ii.  For now I am just adding cTAKES sentences and Entities (as 
               spaCy Spans) – as far as I can tell these are the only 
               required upstream pipeline outputs for the negspacy and 
               medspacy ConText algorithms.
      *   It is not unreasonable to also add POS, Chunks, CoNLL 
          dependencies, etc. to the spaCy Doc object, so if you want to use 
          a spaCy pipe component that requires those, your options are to 
          map them from the cTAKES CAS or use an existing spaCy model that 
          includes them.
  3.  returning the updated but still valid CAS
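
The three steps above can be sketched without any dependencies, just to show the data flow. This is a toy stand-in, not the repository's code. The `Annotation` class is a hypothetical substitute for a CAS annotation (in the real tool dkpro-cassis supplies these), and `char_to_token_span` does by hand roughly what spaCy's `Doc.char_span` does. The three-word trigger list is a drastic simplification of NegEx. It also assumes the cTAKES convention of polarity 1 for affirmed and -1 for negated.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

# Assumed cTAKES convention: 1 = affirmed, -1 = negated.
AFFIRMED, NEGATED = 1, -1

# Tiny illustrative trigger list; real NegEx/ConText rule sets are far larger.
NEGATION_TRIGGERS = ("no", "denies", "without")


@dataclass
class Annotation:
    """Hypothetical stand-in for a CAS annotation with character offsets."""
    begin: int
    end: int
    polarity: int = AFFIRMED


def tokenize(text: str) -> List[Tuple[int, int]]:
    """Return (begin, end) character offsets of word and punctuation tokens."""
    return [m.span() for m in re.finditer(r"\w+|[^\w\s]", text)]


def char_to_token_span(tokens, begin: int, end: int) -> Tuple[int, int]:
    """Map character offsets to a half-open token index range [start, stop)."""
    covered = [i for i, (b, e) in enumerate(tokens) if b < end and e > begin]
    if not covered:
        raise ValueError("offsets do not cover any token")
    return covered[0], covered[-1] + 1


def set_polarity(text: str, sentences: List[Annotation],
                 mentions: List[Annotation]) -> List[Annotation]:
    """Negate mentions preceded by a trigger token in the same sentence."""
    tokens = tokenize(text)
    for mention in mentions:
        start, _ = char_to_token_span(tokens, mention.begin, mention.end)
        # Find the sentence annotation that contains this mention.
        sent = next(s for s in sentences if s.begin <= mention.begin < s.end)
        sent_start, _ = char_to_token_span(tokens, sent.begin, sent.end)
        preceding = [text[b:e].lower() for b, e in tokens[sent_start:start]]
        negated = any(tok in NEGATION_TRIGGERS for tok in preceding)
        mention.polarity = NEGATED if negated else AFFIRMED
    return mentions


def find(text: str, phrase: str) -> Annotation:
    """Build an annotation covering the first occurrence of phrase."""
    begin = text.index(phrase)
    return Annotation(begin, begin + len(phrase))


if __name__ == "__main__":
    text = "Patient denies chest pain. She has a cough."
    sentences = [find(text, "Patient denies chest pain."),
                 find(text, "She has a cough.")]
    mentions = [find(text, "chest pain"), find(text, "cough")]
    set_polarity(text, sentences, mentions)
    print([m.polarity for m in mentions])  # [-1, 1]
```

Because the annotations keep their original character offsets and only the polarity field changes, the updated values can be written back onto the existing CAS annotations, which is what keeps the returned CAS valid.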
