Have you looked at Apache Tika?

> On Nov 26, 2019, at 9:16 AM, sebb <seb...@gmail.com> wrote:
> 
> I have committed some code to extract the form data from ICLAs.
> 
> For example:
> 
> https://whimsy.apache.org/secretary/icla-parse/yyyymm/hash/icla.pdf
> 
> It would be useful if this could somehow be plugged into the workbench,
> for example when a PDF is classified as an ICLA.
> 
> However, I cannot work out how to do this.
> 
> S.
