If you’re suggesting ways to make it easier to use something like YaHPConverter 
with Tika, definitely yes.

If you’re talking about integrating this functionality into Tika, my personal view is no.

I think Tika should focus on extracting content from documents rather than on 
format transformations.

Tika is an attractive location for functionality like this, since it sits in 
the middle of a lot of data processing pipelines, but I worry about a bloated 
code base, with corresponding challenges in maintenance and support.

Regards,

— Ken


> On Oct 14, 2019, at 4:38 AM, Sergey Beryozkin <sberyoz...@gmail.com> wrote:
> 
> Hi All
> 
> I've seen a Quarkus user asking how to convert to PDF, and one of my
> colleagues pointed to
> http://www.allcolor.org/YaHPConverter/doc/org/allcolor/yahp/converter/IHtmlToPdfTransformer.html
> 
> Does it make sense for Tika to offer something related to the text to PDF
> (for a start, something on top of that transformer), and then may be even
> for other formats ?
> 
> Sergey

--------------------------
Ken Krugler
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr
