Hi Akram,

As you have guessed, the specific tags that interest you are created by the dictionary lookup. The ones that you don't want come from the annotators in that tokenizer pipeline.
The dictionary lookup requires the annotations created by the tokenizer pipeline. It looks up words within sentences, using the token types to determine the validity of the candidates.

The tokenizer pipeline should be very fast. Is there some reason that you want to get rid of it?

Sean

________________________________________
From: Akram <as...@yahoo.com.INVALID>
Sent: Thursday, March 5, 2020 1:59 AM
To: dev@ctakes.apache.org
Subject: Generating ICD10 without Tokenizing [EXTERNAL]

Hi,

This is what my .piper looks like:

load DefaultTokenizerPipeline
load DictionarySubPipe
writeHtml
writeXmis

It works fine, and I get these tags in the .XMI file after running runPiperGui:

Sentence
WordToken
PunctuationToken
SymbolToken
NumToken
ContractionToken

and also these tags:

MedicationMention
DiseaseDisorderMention
SignSymptomMention
ProcedureMention
AnatomicalSiteMention
LabMention/EventMention

But when I change the .piper to this:

load DictionarySubPipe
writeHtml
writeXmis

I get none of the tags above!

How can I get an .xmi with only these tags?

MedicationMention
DiseaseDisorderMention
SignSymptomMention
ProcedureMention
AnatomicalSiteMention
LabMention/EventMention

Thanks
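
Since the dictionary lookup needs the tokenizer annotations, one possible workaround for the question above is to leave the tokenizer pipeline in the .piper and pick the mention annotations out of the written .xmi afterwards. The following is only a sketch of that idea, not a cTAKES feature: the function name `collect_mentions` is hypothetical, and it assumes each annotation appears in the XMI as an element whose local tag name is the type name from the thread, with `begin` and `end` attributes.

```python
import xml.etree.ElementTree as ET

# Type names copied from the thread. The thread lists "LabMention/EventMention"
# as one item; treating it as two separate names is an assumption.
WANTED = {
    "MedicationMention", "DiseaseDisorderMention", "SignSymptomMention",
    "ProcedureMention", "AnatomicalSiteMention", "LabMention", "EventMention",
}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def collect_mentions(xmi_text, wanted=WANTED):
    """Return (type, begin, end) for every element whose local name is in `wanted`.

    Matching on the local name alone is an assumption made here so that the
    sketch does not depend on the exact namespace URIs in the file.
    """
    root = ET.fromstring(xmi_text)
    mentions = []
    for element in root.iter():
        name = local_name(element.tag)
        if name in wanted:
            # Offsets are returned as the attribute strings found in the file.
            mentions.append((name, element.get("begin"), element.get("end")))
    return mentions
```

The text of one written .xmi file can be passed to `collect_mentions` to get a list of the mentions and their offsets, with the token and sentence annotations ignored. This reads the file rather than rewriting it, so the original .xmi stays intact for any tool that still needs the full set of annotations.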