Nicholas DiPiazza created TIKA-4264:
---------------------------------------

             Summary: Tika Pipes - Structured output (XHTML) support?
                 Key: TIKA-4264
                 URL: https://issues.apache.org/jira/browse/TIKA-4264
             Project: Tika
          Issue Type: Bug
          Components: tika-pipes
            Reporter: Nicholas DiPiazza


So I am able to use Tika Pipes to extract the text content from a document.

But is it possible to use Tika Pipes to obtain structured output from a 
document? I believe Tika produces this as XHTML.

The plain text extracted from the document is great for indexing into a search 
engine.

But what if you want structured output, such as XHTML, instead?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
