On Tue, 26 Nov 2019 at 15:21, Dave Fisher <wave4d...@comcast.net> wrote:
> Have you looked at Apache Tika?
>

[This is tangential to my query.
The Whimsy host does not currently include a JRE, so I did not look at
Java solutions.
The code now exists, and works well enough.]

I would still have the same issue with Tika: how to wire it up in the
Secretary workbench?

> Sent from my iPhone
>
> > On Nov 26, 2019, at 9:16 AM, sebb <seb...@gmail.com> wrote:
> >
> > I have committed some code to extract the form data from ICLAs.
> >
> > For example:
> >
> > https://whimsy.apache.org/secretary/icla-parse/yyyymm/hash/icla.pdf
> >
> > It would be useful if this could somehow be plugged into the workbench.
> > For example when a PDF is classified as an ICLA.
> >
> > However I cannot work out how to do this.
> >
> > S.