Thanks Sean. In no way was the comment "explanation that makes sense" about you! I apologize if it sounded like that.
It is so funny, because many years ago at a former company where I was
architect, Oacis Healthcare (which implemented one of the first HL7
databases and gateways), there was another Sean, and he, too, held the
accumulated memory and wisdom about a vital chunk of historical software.
Everyone bombarded him with questions all day long because he was the one
true source. At the end of the day, his exhaustion was total.

My statement was rhetorical: I was wracking my brain for an explanation I
had possibly missed.

Peter

On Fri, Aug 14, 2020 at 10:27 AM Finan, Sean
<sean.fi...@childrens.harvard.edu> wrote:

>
> >Finally an explanation that makes sense.
> -- It frequently takes a while to get one of those out of me ...
>
> > I don't have check-in privileges so will keep it private for now.
> -- We shall have to do something about that.
>
> Cheers,
> Sean
>
> ________________________________________
> From: Peter Abramowitsch <pabramowit...@gmail.com>
> Sent: Friday, August 14, 2020 1:17 PM
> To: dev@ctakes.apache.org
> Subject: Re: Need a little more help on dictionaries [EXTERNAL]
>
> * External Email - Caution *
>
>
> Hurray!
> Finally an explanation that makes sense. I just couldn't figure out how
> you could have made sno_rx with that dictionary creator. Clearly, those
> helper files represent a LOT of work.
>
> I have locally modified the dictionary creator code to look for the system
> property ctakes.dictgui_helperdata as a way to point it to another of those
> directories. I don't have check-in privileges so will keep it private for
> now.
>
> Many thanks for your help.
>
> Peter
>
> On Fri, Aug 14, 2020 at 9:51 AM Finan, Sean
> <sean.fi...@childrens.harvard.edu> wrote:
>
> > Hi Peter,
> >
> > shining a flashlight back into the dark ages ...
> >
> > You have found the advanced configuration directories!
> >
> > Those actually precede the gui dictionary creator and were a big part of
> > formatting with the previous cli dictionary creator.
> > The cli was versatile but not simple. The default collection of
> > configuration files for the cli had a lot more going on.
> >
> > I think that I made "tiny/" directory the default for the gui because it
> > didn't do as much manipulation and I wanted things to be a greater 1:1
> > match with the source.
> >
> > I obviously used something other than the simple "tiny/" configuration
> > when I made sno_rx_16ab. I remember running repeated tests on some
> > corpora as well as manually inspecting the produced databases.
> >
> > I can't believe that I had forgotten all of this.
> >
> > You should be able to mix and match files from the different configuration
> > directories and just throw them into your own directory (or tiny/) then
> > point DEFAULT_.. to your directory and recompile.
> >
> >
> > Sean
> >
> > ________________________________________
> > From: Peter Abramowitsch <pabramowit...@gmail.com>
> > Sent: Friday, August 14, 2020 12:22 PM
> > To: dev@ctakes.apache.org
> > Subject: Re: Need a little more help on dictionaries [EXTERNAL]
> >
> > * External Email - Caution *
> >
> >
> > Hi Sean
> >
> > I think I found the answer, and I have one question.
> >
> > In dictionary creator, the hardwired dir is "tiny" that in fact has an
> > empty file for those abbreviations
> >
> > In DictionaryBuilder.java:
> >
> >     static private final String DEFAULT_DATA_DIR =
> >         "org/apache/ctakes/gui/dictionary/data/tiny";
> >     ...
> >     final UmlsTermUtil umlsTermUtil = new UmlsTermUtil( DEFAULT_DATA_DIR );
> >
> > The command line args are not used in this application, neither are
> > sysprops or environment vars so there's no way to change it short of
> > recompiling.
> >
> > So the question is: do you know why the empty version is the default?
> >
> > Peter
> >
> >
> >
> > On Fri, Aug 14, 2020 at 4:53 AM Finan, Sean
> > <sean.fi...@childrens.harvard.edu> wrote:
> >
> > > Hi Peter,
> > >
> > > I don't have an answer but I do have a question:
> > >
> > > In your mrconso.rrf, do you see a snomed line item for "SOB" or only
> > > "SOB -Shortness of breath" ?
> > >
> > > I think that the simple "SOB" and "sob" entries might be from other
> > > vocabularies.
> > >
> > > There is (was?) logic in the dictionary creator to multiply things like
> > > "SOB - Shortness of breath", "SOB (Shortness of breath)" etc. and
> > > create 3 synonym entries: full, left and right. There is a requirement
> > > that the left side be all caps and a fitting acronym for the right side.
> > > However, I vacillated on the correctness of this behavior as almost all
> > > terms already had the 3 entries. I am not sure what the current version
> > > of the creator does.
> > >
> > > Dictionary creation is indeed a touchy operation.
> > >
> > > Sean
> > > ________________________________________
> > > From: Peter Abramowitsch <pabramowit...@gmail.com>
> > > Sent: Thursday, August 13, 2020 11:57 PM
> > > To: dev@ctakes.apache.org
> > > Subject: Need a little more help on dictionaries [EXTERNAL]
> > >
> > > * External Email - Caution *
> > >
> > >
> > > Hi All
> > >
> > > I'm able to create a subset with the UMLS mmsys tool, use the dictionary
> > > creator on the full UMLS release, create, install and tweak the scripts
> > > adding or removing aliases etc. My goal is simply to add HUGO gene terms
> > > to SNOMED and RXNORM.
> > >
> > > However I must be missing some bit of information on the use of mmsys or
> > > the dictionary creator, because some very common terms are missing from
> > > my dictionary but present in the released sno_rx
> > >
> > > As an example, the acronym SOB
> > > in mmsys, the term SOB is present in my subset, and it is mapped into
> > > SNOMED with the expected CUI 13404 and SNOMEDIDs same as sno_rx
> > > I see the cui_tui mapping it into the correct TUI for a finding
> > > INSERT INTO TUI VALUES(13404,184)
> > > I see the cui and the preferred term "dyspnea" in my *script file, and I
> > > can resolve it in a note using the default consumer and obtaining the
> > > correct SNOMED ID
> > > I see lots of cui_term entries for the same CUI, and I can resolve them
> > > too. but SOB is not present in my cui terms.
> > > How did it get there?
> > >
> > > So either - I am not using one of the tools correctly, or in creating
> > > SNO_RX, someone has added SOB by hand rather than using the creator. And
> > > if they have, they have probably also done other tweaks.
> > >
> > > Sean, Ghandi or Jeff
> > > Can you explain this?
> > >
> > > Peter
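
A minimal sketch of the kind of change Peter describes earlier in this
thread: letting the system property ctakes.dictgui_helperdata override the
hardwired DEFAULT_DATA_DIR. This is not Peter's actual patch, which the
thread does not show. The class name HelperDataDir and the method resolve()
are hypothetical; only the property name and the default path come from the
messages above. It also assumes that a missing or blank property should fall
back to the compiled-in default.

```java
// Sketch only: HelperDataDir and resolve() are hypothetical names,
// not part of the cTAKES code base.
final class HelperDataDir {

    // System property name mentioned in the thread.
    static final String PROPERTY = "ctakes.dictgui_helperdata";

    // Default quoted from DictionaryBuilder.java in the thread.
    static final String DEFAULT_DATA_DIR =
            "org/apache/ctakes/gui/dictionary/data/tiny";

    private HelperDataDir() {
    }

    /**
     * Returns the trimmed value of the system property if it is set and
     * not blank; otherwise returns the compiled-in default directory.
     */
    static String resolve() {
        final String override = System.getProperty(PROPERTY);
        if (override == null || override.trim().isEmpty()) {
            return DEFAULT_DATA_DIR;
        }
        return override.trim();
    }
}
```

With something like this in place, the hardwired call quoted above could
become new UmlsTermUtil( HelperDataDir.resolve() ), and the directory could
be chosen at launch with -Dctakes.dictgui_helperdata=<your directory>
instead of recompiling. The value would need to be in whatever form
UmlsTermUtil already accepts for its data directory.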