Hi Erin,
Yes, creating your own customized dictionary is the way to go. You can prune by 
semantic types of interest and then remove branches that are not relevant to 
your specific phenotype. I am not aware of a cTAKES tool for building such a 
highly customized dictionary.
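
To make the pruning idea concrete, here is a minimal sketch in Python. It is 
not a cTAKES feature. It assumes you have already loaded (CUI, text) pairs and 
a CUI-to-semantic-type map into memory, for example from the UMLS MRCONSO and 
MRSTY tables. The function name and the argument names are hypothetical.

```python
def prune_dictionary(terms, semantic_types, keep_tuis, excluded_cuis):
    """Keep terms with a wanted semantic type that are not in an excluded branch.

    terms:          iterable of (cui, text) pairs
    semantic_types: dict mapping cui -> set of semantic type ids (TUIs)
    keep_tuis:      set of TUIs of interest, e.g. {"T047"} (Disease or Syndrome)
    excluded_cuis:  set of CUIs from branches judged irrelevant to the phenotype
    """
    kept = []
    for cui, text in terms:
        # Drop anything in a branch that was removed by hand.
        if cui in excluded_cuis:
            continue
        # Keep the term if its concept has at least one wanted semantic type.
        if semantic_types.get(cui, set()) & keep_tuis:
            kept.append((cui, text))
    return kept
```

The excluded set is where the phenotype-specific judgment goes: semantic types 
alone keep every disease, so the irrelevant branches still have to be listed.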

You can also start with a few terms that you know are relevant to your 
phenotype and then find their synonyms in the UMLS. Then, you can further walk 
a specific ontology and take siblings and parents if you think they are relevant.
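
A rough sketch of that seed-and-expand approach, again in Python and again 
only an illustration: it assumes (CUI, text) pairs and (child, parent) edges 
from one ontology are already in memory (in the UMLS these would typically be 
derived from MRCONSO and MRREL). The name expand_seeds is hypothetical.

```python
from collections import defaultdict


def expand_seeds(seed_terms, terms, parent_of):
    """Return (seed_cuis, synonyms, parent_cuis, sibling_cuis) for the seeds.

    seed_terms: strings known to be relevant to the phenotype
    terms:      iterable of (cui, text) pairs
    parent_of:  iterable of (child_cui, parent_cui) edges from ONE ontology
    """
    by_text = defaultdict(set)
    by_cui = defaultdict(set)
    for cui, text in terms:
        by_text[text.lower()].add(cui)
        by_cui[cui].add(text)

    parents = defaultdict(set)
    children = defaultdict(set)
    for child, parent in parent_of:
        parents[child].add(parent)
        children[parent].add(child)

    # Map each seed string to its concepts (case-insensitive exact match).
    seed_cuis = set()
    for term in seed_terms:
        seed_cuis |= by_text.get(term.lower(), set())

    # Synonyms are all strings attached to the seed concepts.
    synonyms = {text for cui in seed_cuis for text in by_cui[cui]}
    # One step up the hierarchy, then back down for the siblings.
    parent_cuis = {p for cui in seed_cuis for p in parents[cui]}
    sibling_cuis = {s for p in parent_cuis for s in children[p]} - seed_cuis
    return seed_cuis, synonyms, parent_cuis, sibling_cuis
```

The parents and siblings come back as candidates only; they still need a 
manual review before they go into the dictionary.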

Then, there is the whole field of using word embeddings to find 
synonyms/related terms from unlabeled data if you want to become really fancy 
:-) At this point, cTAKES does not implement any deep learning algorithms; in 
the future we are planning to release a bridge to Keras. 
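
For the embedding route, the core operation is a nearest-neighbor lookup by 
cosine similarity. The sketch below is plain Python and independent of cTAKES. 
It assumes you already have a term-to-vector dictionary from some embedding 
model trained on your own unlabeled notes; the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_terms(seed, vectors, top_n=5):
    """Return up to top_n terms whose vectors are most similar to the seed's.

    vectors: dict mapping term -> list of floats (assumed to come from an
             embedding model; none is provided here)
    """
    seed_vec = vectors[seed]
    scored = [
        (cosine(seed_vec, vec), term)
        for term, vec in vectors.items()
        if term != seed
    ]
    # Highest similarity first.
    scored.sort(reverse=True)
    return [term for _, term in scored[:top_n]]
```

The neighbors are related terms, not guaranteed synonyms, so they are best 
treated as suggestions to review before adding them to the dictionary.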

I hope this makes sense.

--
Guergana Savova, PhD, FACMI
Associate Professor
PI Natural Language Processing Lab
Boston Children's Hospital and Harvard Medical School
300 Longwood Avenue
Mailstop: BCH3092
Enders 144.1
Boston, MA 02115
Tel: (617) 919-2972
Fax: (617) 730-0817
guergana.sav...@childrens.harvard.edu
Harvard Scholar: http://scholar.harvard.edu/guergana_k_savova/biocv
ctakes.apache.org
thyme.healthnlp.org
cancer.healthnlp.org
share.healthnlp.org


-----Original Message-----
From: Erin Nicole Gustafson [mailto:erin.gustaf...@northwestern.edu] 
Sent: Wednesday, February 15, 2017 1:38 PM
To: dev@ctakes.apache.org
Subject: Phenotype-specific entities

Hi all,

I would like to be able to only identify entities that are relevant for some 
specific phenotype. One step towards achieving this would be to build a custom 
dictionary with a limited set of semantic types. However, this is not quite 
specific enough to only identify mentions related to one disease while ignoring 
those related to some other disease, for example.

Does cTAKES currently have a way to do this sort of filtering? Or, has anyone 
developed their own tools that they'd be willing to share?

Thanks,
Erin
