krickert commented on PR #2916: URL: https://github.com/apache/tika/pull/2916#issuecomment-4860546207
Closing this in favor of a much smaller contract, reshaped around the feedback here.

The new shape drops the per-format metadata taxonomy entirely: one small Document proto (a structured markdown content tree plus typed common metadata plus a lossless multivalue tail), with format specifics living in per-parser mapping code instead of on the wire, so metadata churn never forces a client rebuild.

New PR coming shortly, cut fresh from main and split into individually reviewable pieces per @nddipiazza's suggestion.

Thanks @tballison and @nddipiazza for the review pressure here; it made the design significantly better.
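For readers who want a concrete picture of the "content tree plus typed common metadata plus lossless multivalue tail" shape, here is a minimal Java sketch. It is not the proto from the upcoming PR: every type, field, and method name (`Node`, `CommonMetadata`, `Document`, `fromRawMetadata`) is hypothetical, the choice of which keys get promoted to typed fields is an assumption, and so is the decision to keep promoted keys in the tail as well.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DocumentSketch {

    /** One node of the structured content tree, e.g. a heading or paragraph (hypothetical). */
    record Node(String type, String text, List<Node> children) {}

    /** Typed fields shared across formats; this particular selection is an assumption. */
    record CommonMetadata(String title, String creator, String contentType) {}

    /** Content tree + typed common metadata + multivalue tail holding every raw entry. */
    record Document(Node content, CommonMetadata common, Map<String, List<String>> tail) {}

    /** Returns the first value for a key, or null when the key is absent or empty. */
    static String first(Map<String, List<String>> raw, String key) {
        List<String> values = raw.get(key);
        return (values == null || values.isEmpty()) ? null : values.get(0);
    }

    /**
     * Hypothetical per-parser mapping step: promote a few well-known keys to
     * typed fields and copy every raw entry into the tail so nothing is lost.
     */
    static Document fromRawMetadata(Node content, Map<String, List<String>> raw) {
        CommonMetadata common = new CommonMetadata(
                first(raw, "dc:title"),
                first(raw, "dc:creator"),
                first(raw, "Content-Type"));
        Map<String, List<String>> tail = new LinkedHashMap<>();
        raw.forEach((key, values) -> tail.put(key, List.copyOf(values)));
        return new Document(content, common, tail);
    }

    public static void main(String[] args) {
        Node body = new Node("document", "", List.of(
                new Node("heading", "Quarterly report", List.of()),
                new Node("paragraph", "Revenue grew.", List.of())));

        Map<String, List<String>> raw = new LinkedHashMap<>();
        raw.put("dc:title", List.of("Quarterly report"));
        raw.put("dc:creator", List.of("A. Author", "B. Author"));
        raw.put("xmpTPg:NPages", List.of("12")); // format-specific key, stays in the tail only

        Document doc = fromRawMetadata(body, raw);
        System.out.println(doc.common().title());
        System.out.println(doc.tail().get("dc:creator"));
        System.out.println(doc.tail().get("xmpTPg:NPages"));
    }
}
```

In this sketch a client compiled against `Document` never sees a format-specific type: a new or renamed format-specific key only shows up as another entry in `tail`, which is the property the comment describes as metadata churn not forcing a client rebuild.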
