I’d like the names that distinguish the parser modules to say more about the type of parser and what it is going to drag along or impose on my system. The current names reflect the history of Tika’s evolution more than that.
Starting with the descriptive paragraphs, here is some brainstorming of names:

with the exception of optional OCR, these should be lightish weight dependencies in pure java with no parsers/resources that require network calls.
- tika-parsers-files
- tika-parsers-alljava
- tika-parsers-local
- tika-parsers-simple
- tika-parsers-lightweight
- tika-parsers-aluminum

these can require native libs and/or have heavier dependencies, including network calls.
- tika-parsers-heavy
- tika-parsers-complex
- tika-parsers-extended-dependencies
- tika-parsers-iron

anything goes. dl4j as a dependency, etc.
- tika-parsers-anything-goes
- tika-parsers-sandbox
- tika-parsers-deep
- tika-parsers-model-driven
- tika-parsers-lead

> On Mar 9, 2021, at 12:03 PM, Tim Allison <talli...@apache.org> wrote:
>
> All,
> I was recently chatting about Tika 2.x with some Tika friends and
> they had some hesitation about the names for the three high level
> parser modules.
>
> They are currently:
>
> tika-parsers-classic
> tika-parsers-extended
> tika-parsers-advanced
>
> The quibbles weren't with the delineation, but with the naming.
>
> In my mind, this is what I've been thinking as definitions:
>
> tika-parsers-classic -- with the exception of optional OCR, these
> should be lightish weight dependencies in pure java with no
> parsers/resources that require network calls.
>
> tika-parsers-extended -- these can require native libs and/or have
> heavier dependencies, including network calls.
>
> tika-parsers-advanced -- anything goes. dl4j as a dependency, etc.
>
> Some options for classic-> basic, base, ...what else?
>
> Any other recommendations for these names? Thank you!
>
> Best,
>
> Tim

_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com <http://www.opensourceconnections.com/> | My Free/Busy <http://tinyurl.com/eric-cal>
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>