>Finally an explanation that makes sense.
-- It frequently takes a while to get one of those out of me ...

> I don't have check-in privileges so will keep it private for now.
-- We shall have to do something about that.

Cheers,
Sean

________________________________________
From: Peter Abramowitsch <pabramowit...@gmail.com>
Sent: Friday, August 14, 2020 1:17 PM
To: dev@ctakes.apache.org
Subject: Re: Need a little more help on dictionaries [EXTERNAL]

* External Email - Caution *


Hurray!
Finally an explanation that makes sense.  I just couldn't figure out how
you could have made sno_rx with that dictionary creator.   Clearly, those
helper files represent a LOT of work.

I have locally modified the dictionary creator code to look for the system
property ctakes.dictgui_helperdata as a way to point it to another of those
directories.  I don't have check-in privileges so will keep it private for
now.
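
A minimal sketch of that kind of lookup, not the actual patch. Only the
property name ctakes.dictgui_helperdata and the default path come from this
thread; the class HelperDataDir and the method resolveDataDir are hypothetical
names used for illustration, and the sketch assumes the property value is
consumed the same way as the hardwired default string.

```java
// Hypothetical helper, for illustration only; not part of cTAKES.
final class HelperDataDir {
    // System property name mentioned in this thread.
    static final String DATA_DIR_PROPERTY = "ctakes.dictgui_helperdata";
    // Default value quoted from DictionaryBuilder.java in this thread.
    static final String DEFAULT_DATA_DIR =
            "org/apache/ctakes/gui/dictionary/data/tiny";

    private HelperDataDir() {
    }

    // Returns the property value if it is set and non-blank,
    // otherwise falls back to the hardwired default.
    public static String resolveDataDir() {
        final String override = System.getProperty(DATA_DIR_PROPERTY);
        if (override == null || override.trim().isEmpty()) {
            return DEFAULT_DATA_DIR;
        }
        return override.trim();
    }

    public static void main(String[] args) {
        System.out.println(resolveDataDir());
    }
}
```

With something like this in place, the directory could be switched by passing
-Dctakes.dictgui_helperdata=<dir> on the java command line instead of
recompiling.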

Many thanks for your help.

Peter

On Fri, Aug 14, 2020 at 9:51 AM Finan, Sean <
sean.fi...@childrens.harvard.edu> wrote:

> Hi Peter,
>
> shining a flashlight back into the dark ages ...
>
> You have found the advanced configuration directories!
>
> Those actually precede the gui dictionary creator and were a big part of
> formatting with the previous cli dictionary creator.  The cli was versatile
> but not simple.  The default collection of configuration files for the cli
> had a lot more going on.
>
> I think that I made the "tiny/" directory the default for the gui because it
> didn't do as much manipulation and I wanted things to be a closer 1:1
> match with the source.
>
> I obviously used something other than the simple "tiny/" configuration
> when I made sno_rx_16ab.   I remember running repeated tests on some
> corpora as well as manually inspecting the produced databases.
>
> I can't believe that I had forgotten all of this.
>
> You should be able to mix and match files from the different configuration
> directories and just throw them into your own directory (or tiny/) then
> point DEFAULT_.. to your directory and recompile.
>
>
> Sean
>
> ________________________________________
> From: Peter Abramowitsch <pabramowit...@gmail.com>
> Sent: Friday, August 14, 2020 12:22 PM
> To: dev@ctakes.apache.org
> Subject: Re: Need a little more help on dictionaries [EXTERNAL]
>
>
> Hi Sean
>
> I think I found the answer, and I have one question.
>
> In the dictionary creator, the hardwired dir is "tiny", which in fact has an
> empty file for those abbreviations.
>
> In DictionaryBuilder.java:
>
> static private final String DEFAULT_DATA_DIR =
>     "org/apache/ctakes/gui/dictionary/data/tiny";
> ...
> final UmlsTermUtil umlsTermUtil = new UmlsTermUtil( DEFAULT_DATA_DIR );
>
> The command line args are not used in this application, neither are
> sysprops or environment vars so there's no way to change it short of
> recompiling.
>
> So the question is:  do you know why the empty version is the default?
>
> Peter
>
>
>
> On Fri, Aug 14, 2020 at 4:53 AM Finan, Sean <
> sean.fi...@childrens.harvard.edu> wrote:
>
> > Hi Peter,
> >
> > I don't have an answer but I do have a question:
> >
> > In your mrconso.rrf, do you see a snomed line item for "SOB" or only
> > "SOB - Shortness of breath"?
> >
> > I think that the simple "SOB" and "sob" entries might be from other
> > vocabularies.
> >
> > There is (was?) logic in the dictionary creator to multiply things like
> > "SOB - Shortness of breath", "SOB (Shortness of breath)" etc. and create 3
> > synonym entries: full, left and right.  There is a requirement that the
> > left side be all caps and a fitting acronym for the right side.  However, I
> > vacillated on the correctness of this behavior as almost all terms already
> > had the 3 entries.  I am not sure what the current version of the creator
> > does.
> >
> > Dictionary creation is indeed a touchy operation.
> >
> > Sean
> > ________________________________________
> > From: Peter Abramowitsch <pabramowit...@gmail.com>
> > Sent: Thursday, August 13, 2020 11:57 PM
> > To: dev@ctakes.apache.org
> > Subject: Need a little more help on dictionaries [EXTERNAL]
> >
> >
> > Hi All
> >
> > I'm able to create a subset with the UMLS mmsys tool, use the dictionary
> > creator on the full UMLS release, create, install and tweak the scripts
> > adding or removing aliases etc.  My goal is simply to add HUGO gene terms
> > to SNOMED and RXNORM.
> >
> > However I must be missing some bit of information on the use of mmsys or
> > the dictionary creator, because some very common terms are missing from my
> > dictionary but present in the released sno_rx.
> >
> > As an example, the acronym SOB:
> > - In mmsys, the term SOB is present in my subset, and it is mapped into
> >   SNOMED with the expected CUI 13404 and SNOMEDIDs same as sno_rx.
> > - I see the cui_tui mapping it into the correct TUI for a finding:
> >   INSERT INTO TUI VALUES(13404,184)
> > - I see the cui and the preferred term "dyspnea" in my *script file, and I
> >   can resolve it in a note using the default consumer and obtaining the
> >   correct SNOMED ID.
> > - I see lots of cui_term entries for the same CUI, and I can resolve them
> >   too.  But SOB is not present in my cui terms.
> > How did it get there?
> >
> > So either - I am not using one of the tools correctly, or in creating
> > SNO_RX, someone has added SOB by hand rather than using the creator.  And
> > if they have, they have probably also done other tweaks.
> >
> > Sean, Ghandi or Jeff
> > Can you explain this?
> >
> > Peter
> >
>
