Hi,

On 10 March 2016 at 20:29, Azad Dehghan wrote:
Thanks Peter,

The rules were modeled using the training data.

This means both training data folders? I have access to the data but not to the challenge description.

It would be good to incorporate/refactor the two-pass recognition
method for cTAKES (basically, the GATE API needs to be replaced with
the UIMA API to generate annotations), as it has a wider application
on longitudinal data. This method is used on top of a number of NERs.

I'll take a look.

I do not know how much time I can invest this month. Let's see how many phases I can translate.

I added the rules for age. Are there JAPE rules for creating date annotations?

After all rules are translated, they need some major refactoring. JAPE and Ruta are quite different in some aspects.

Best,

Peter




Please let me know where I can help. I will be available again in April.

Cheers,
Azad

On 10 March 2016 at 13:13, Peter Klügl <peter.klu...@averbis.com> wrote:

Hi,

sorry, I was quite busy last month.

I added a new patch, which needs to be applied.

No new rules, but it's possible now to evaluate everything against the
labelled data of the challenge.

@Azad:
Which documents exactly did you use to develop the rules?
training-PHI-Gold-Set1, training-PHI-Gold-Set2 or testing-PHI-Gold-fixed?

Best,

Peter

On 3 February 2016 at 09:05, Peter Klügl wrote:
Hi,

the last patch fixed almost all problems.

I added another one that adds the csv file for the unit test and extends
svn-ignore.

Best,

Peter

On 2 February 2016 at 09:16, Peter Klügl wrote:
Hi,

I added another patch. I forgot to manually add one test file to version
control, and there are still duplicate lines.
I hope this patch fixes the remaining problems.

Best,

Peter


On 29 January 2016 at 10:34, Peter Klügl wrote:
Hi,

the problems were caused by the svn client in my Eclipse. Sorry for the
trouble, I should have looked more closely at the complete patch.

I attached a new patch created with command-line tools, which looks
correct now.

Pei, can you apply the new patch?

Best,

Peter

On 28 January 2016 at 15:57, Peter Klügl wrote:
Thanks Pei.

I fear there was again a problem with the patch. All new files are
missing (and also the svn-ignore settings).

Can you take a look?

Best,

Peter

On 28 January 2016 at 14:43, Pei Chen wrote:
patch applied.
Thanks,
Pei

On Thu, Jan 28, 2016 at 4:14 AM, Peter Klügl <peter.klu...@averbis.com> wrote:
Hi Pei,

can you commit the recent patch for us?

CTAKES-384-20160120.patch

Best,

Peter

On 20 January 2016 at 19:35, Pei Chen wrote:
Hi,
Sorry I was swamped recently.
But yeah, we can even create an extended type system to store these
items temporarily and add them into the main/core type system
afterwards.
There was an existing item to upgrade UIMA, but agreed, it will require
much more testing. If it works, we can upgrade it in our sandbox area
or create a branch if necessary.
—Pei

On Jan 18, 2016, at 9:06 AM, Peter Klügl <peter.klu...@averbis.com> wrote:
Hi,

a new patch is attached.

@Pei:
are there suitable annotation types in the cTAKES type system? Some
project in cTAKES uses something like OntologyMatch... I map it to
IdentifiedAnnotation right now, but there are many empty features...
@Azad:
I changed the rules a bit, especially the capitalization, to match how
I normally write Ruta. The word lists are compiled to a trie by the
Maven plugin. I also added the two regexes for URL and email, and
extended the regex for the URL. I also changed the evaluation order of
some rules (with @). Feel free to add simple examples to examples.csv
for the unit tests.

Let me know if you need more information about the changes.

Do you want help with the other rule sets? Or should we split them up?

Best,

Peter

On 18 January 2016 at 11:04, Peter Klügl wrote:
Hi,

great. I will integrate them into the project and into the next patch.

Best,

Peter

On 18 January 2016 at 00:58, Azad Dehghan wrote:
Three NERs translated and uploaded.

PS. I will validate all NERs once we have them all completed.

Cheers,
Azad

On 24 November 2015 at 10:37, Azad Dehghan <azad.dehg...@gmail.com> wrote:
This is on my todo list for Dec. as well. If there are any more
volunteers for translating JAPE to RUTA, please get in touch.

Cheers,
Azad

On 24 Nov 2015 09:55, "Peter Klügl" <peter.klu...@averbis.com> wrote:
Hi,

I just wanted to mention that I haven't forgotten about it.
Unfortunately, there is just no spare time right now. I hope I will be
able to provide the patches in December.

Best,

Peter

On 6 November 2015 at 16:40, Pei Chen wrote:
Hi Peter,
I think the ctakes-examples is probably a good starting point, at least
in terms of maven modules, etc. I think it would be good if we use
uimaFIT style as the primary approach to wiring components together and
generate descriptors as secondary...
I think the actual components that would be required are probably best
left up to what is actually required for the best-performing c-deid.
The output would be interesting. I'm not sure if we should treat this
as an independent preprocessing component or as part of a pipeline (in
which case, we may need to propose a change to the type system or
perhaps an alternative JCas view. You can probably open up that
discussion to the dev group as you see fit.)

My 2 cents...


On Fri, Nov 6, 2015 at 3:38 AM, Peter Klügl <peter.klu...@averbis.com> wrote:
Hi,

Is there a cTAKES project that may serve as an example of how the
cTAKES community develops or what a project should look like?
I learned that different people set up UIMA projects in quite different
ways, and I do not want to get inspired by "some sort of out-dated"
approach in the cTAKES repo.

Are there restrictions or preferences about the preprocessing
components that should be used and the kind of "output" of the project?
Components: On which components may the components rely: tokenizer,
... parser, ... dict lookup?
"output": Should the project provide a pipeline or a single AE?

More comments below.

On 3 November 2015 at 16:54, Azad Dehghan wrote:
Who else plans to provide patches for it? Just to avoid duplicate work
and to coordinate the efforts ...

I would like to help with translating JAPE to RUTA.
You can already go ahead with the UIMA Ruta Workbench if you want, or
wait until I set up the project with Ruta integration.

If any questions arise, just ask :-)

Is there a development dataset which was utilized for the initial
development, and if yes, is it possible to contribute it too?
The data set is unfortunately not publicly available; i2b2
<https://www.i2b2.org/NLP/DataSets/Main.php> typically releases the
data sets 12 months after a given challenge; this is done on an
individual basis and involves a Data Use Agreement.

However, I will be able to conduct and coordinate the validation.
Ok, I'll investigate whether we already have access to the dataset
here.

My first step would be:
- set up a maven project
- set up a development pipeline in a test (with cTAKES components
  replacing the previous ANNIE preprocessing)


But one item that we need to review is the 3rd-party library jars that
were included, to ensure compatibility. I’ll be sure to take a look at
that over the next few weeks.

—Pei


@Pei - once ANNIE components are replaced there should not be a need to
worry about the 3rd party libs.

Also, just a thought: we may want to create an independent component
for the Two Pass recognition (TwoPass.java), as this method has proven
useful for general NER on longitudinal data and is surely useful
independent of the deid component.


Cheers,
Azad


