[ 
https://issues.apache.org/jira/browse/TIKA-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18086075#comment-18086075
 ] 

Hudson commented on TIKA-4748:
------------------------------

UNSTABLE: Integrated in Jenkins build Tika » tika-main-jdk17 #1402 (See 
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk17/1402/])
TIKA-4748 -- clean up ocr configuration within pdfparser (#2864) (github: 
[https://github.com/apache/tika/commit/48257e37cefb053337f92cda8ade33f0408d6006])
* (edit) tika-serialization/src/main/java/org/apache/tika/serialization/TikaModule.java
* (edit) tika-server/tika-server-standard/src/test/java/org/apache/tika/server/standard/UnpackerResourceWithConfigTest.java
* (edit) docs/modules/ROOT/pages/advanced/integration-testing/tika-app.adoc
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pdf-module/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-integration-tests/src/test/java/org/apache/tika/parser/crypto/TSDParserTest.java
* (edit) tika-serialization/src/test/java/org/apache/tika/config/loader/TikaJsonConfigTest.java
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-integration-tests/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-integration-tests/src/test/resources/configs/tika-config-non-primitives.json
* (edit) tika-serialization/src/main/java/org/apache/tika/config/loader/TikaJsonConfig.java
* (edit) tika-serialization/src/main/java/org/apache/tika/config/loader/TikaLoader.java
* (edit) docs/modules/ROOT/pages/advanced/integration-testing/tika-server.adoc
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pdf-module/src/main/java/org/apache/tika/parser/pdf/PDFParserConfig.java
* (edit) docs/modules/ROOT/pages/developers/serialization.adoc
* (edit) tika-parsers/tika-parsers-ml/tika-inference/src/main/java/org/apache/tika/inference/OpenAIImageEmbeddingParser.java
* (edit) docs/modules/ROOT/pages/using-tika/server/index.adoc
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pdf-module/src/test/resources/org/apache/tika/parser/pdf/tika-inline-config.json
* (edit) tika-server/tika-server-standard/src/test/java/org/apache/tika/server/standard/UnpackerResourceTest.java
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pdf-module/src/main/java/org/apache/tika/parser/pdf/AbstractPDF2XHTML.java
* (edit) tika-app/src/test/java/org/apache/tika/cli/XmlToJsonConfigConverterTest.java
* (edit) tika-server/tika-server-standard/src/test/java/org/apache/tika/server/standard/TikaResourceTest.java
* (edit) tika-serialization/src/test/resources/configs/example-tika-config.json
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-integration-tests/src/test/resources/configs/tika-config-ocr-for-pdf.json
* (edit) docs/modules/ROOT/pages/migration-to-4x/migrating-tika-server-4x.adoc
* (edit) tika-core/src/main/java/org/apache/tika/parser/ParseContext.java
* (edit) tika-server/docker-build/sample-configs/customocr/tika-config-rendered.json
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pdf-module/src/main/java/org/apache/tika/renderer/pdf/pdfbox/PDFBoxRenderer.java
* (edit) tika-server/tika-server-core/src/test/resources/config-examples/server-with-parsers.json
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pdf-module/src/main/java/org/apache/tika/parser/pdf/PDFParser.java
* (edit) tika-serialization/src/test/java/org/apache/tika/serialization/RoundTripSerializationTest.java
* (edit) tika-app/src/main/java/org/apache/tika/cli/XmlToJsonConfigConverter.java
* (edit) docs/modules/ROOT/pages/migration-to-4x/migrating-to-4x.adoc
* (edit) tika-serialization/src/test/java/org/apache/tika/serialization/TestParseContextSerialization.java


> Clean up pdf+ocr config in 4.x
> ------------------------------
>
>                 Key: TIKA-4748
>                 URL: https://issues.apache.org/jira/browse/TIKA-4748
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Minor
>
> [~birdya22] ran into a two-path config issue on TIKA-4747 in how we set ocr 
> options in the pdfconfig. We should clean up our code to allow only a single 
> (non-flat) option.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
