Hi,

I need to parse a variety of legal documents in order to identify and
extract each article and sub-paragraph.

The documents are plain text, structured like this:

Act. 123 Bla Bla Bla

Art. 1
(Article title)

Article body with sub-paragraphs (at most three levels of
sub-paragraph, identified by numbers (1, 2, 3...), letters (a, b,
c...) and Roman numerals (i, ii, iii, iv, etc.))

Unfortunately, real life is a bit tougher than this: some documents use
the string "Art." and others "Article"; sometimes the article title is
present and sometimes not; and so on.
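
To make the structure above concrete, here is a rough line-based sketch
in Python (it is not the Perl prototype). Everything in it is an
assumption made for illustration: the function name parse_articles, the
regexes, and in particular the marker punctuation ("1.", "a)", "i)"),
since the real documents may punctuate differently.

```python
import re

# Assumed header forms: "Art. 1" or "Article 1" alone on a line.
ARTICLE_RE = re.compile(r"^\s*(?:Art\.|Article)\s+(\d+)\s*$", re.IGNORECASE)
# Assumed optional title: a parenthesised line right after the header.
TITLE_RE = re.compile(r"^\s*\((.+)\)\s*$")
# Assumed sub-paragraph markers: "1." (level 1), "a)" (level 2), "i)" (level 3).
# Roman numerals are tried before letters, so a letter item "i)", "v)" or "x)"
# would be misread as level 3; this sketch does not resolve that ambiguity.
LEVEL_RES = [
    (1, re.compile(r"^\s*(\d+)\.\s+(.*)$")),
    (3, re.compile(r"^\s*([ivx]+)\)\s+(.*)$")),
    (2, re.compile(r"^\s*([a-z])\)\s+(.*)$")),
]


def parse_articles(text):
    """Return one dict per article: number, title (or None), and a list of
    (level, marker, text) items, where level 0 means unmarked body text."""
    articles = []
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        m = ARTICLE_RE.match(line)
        if m:
            current = {"number": int(m.group(1)), "title": None, "items": []}
            articles.append(current)
            continue
        if current is None:
            continue  # preamble (e.g. the "Act. 123 ..." line) is skipped
        m = TITLE_RE.match(line)
        if m and current["title"] is None and not current["items"]:
            current["title"] = m.group(1)
            continue
        for level, pattern in LEVEL_RES:
            m = pattern.match(line)
            if m:
                current["items"].append((level, m.group(1), m.group(2)))
                break
        else:
            current["items"].append((0, "", line.strip()))
    return articles


sample = """Act. 123 Bla Bla Bla

Art. 1
(Article title)

1. First paragraph
a) first letter
i) first roman
2. Second paragraph

Article 2
1. No title here
"""

for article in parse_articles(sample):
    print(article["number"], article["title"], article["items"])
```

Even this small sketch already needs a special case for the optional
title and leaves the "i)" ambiguity open, which is the kind of exception
handling that grows quickly with each new document variant.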

Do you think ANTLR can help generate a parser that identifies and
extracts the parts of these legal documents, labelling each part with
its place in the hierarchical structure?

So far I have been building a prototype in Perl, but given all the
variations found in the plethora of documents I have to "ingest",
coding every exception by hand is proving quite cumbersome.

Thanks for your support.

Regards

Marco Bagni


