Copilot commented on code in PR #2881:
URL: https://github.com/apache/tika/pull/2881#discussion_r3369329069
##########
tika-eval/tika-eval-app/src/main/java/org/apache/tika/eval/app/ExtractComparerRunner.java:
##########
@@ -131,12 +132,16 @@ public static void main(String[] args) throws Exception {
ResultsReporter.main(new String[]{"-d", dbPath, "-rd", reportsDir});
Path reportsDirPath = Paths.get(reportsDir);
if (Files.isDirectory(reportsDirPath)) {
- Path tgzPath = reportsDirPath.resolveSibling(reportsDir + ".tar.gz");
+ Path tgzPath = reportsDirPath.resolveSibling(reportsDir + ".tgz");
LOG.info("Creating {}", tgzPath);
createTarGz(reportsDirPath, tgzPath);
Review Comment:
`resolveSibling(reportsDir + ".tgz")` produces an incorrect path when
`-rd/--reportsDir` contains path separators: the whole `reportsDir` string is
resolved against the parent directory, so `foo/bar` becomes `foo/foo/bar.tgz`.
Build the sibling archive name from the reports directory path's file name
(`reportsDirPath.getFileName()`) instead.
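A minimal sketch of the suggested fix, assuming the archive should sit next to the reports directory. The class `SiblingArchiveSketch` and the helper `siblingArchive` are hypothetical names used for illustration and are not part of the PR; only `java.nio.file` calls are used.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class SiblingArchiveSketch {

    // Hypothetical helper (not in the PR): returns "<reportsDir><suffix>" as a
    // sibling of the reports directory, using only the directory's last name element.
    static Path siblingArchive(String reportsDir, String suffix) {
        Path dir = Paths.get(reportsDir);
        Path fileName = dir.getFileName();
        if (fileName == null) {
            // getFileName() is null for a path with no name elements, e.g. a root
            throw new IllegalArgumentException("No directory name in: " + reportsDir);
        }
        return dir.resolveSibling(fileName.toString() + suffix);
    }

    public static void main(String[] args) {
        String reportsDir = "foo/bar";
        // Behaviour flagged in the review: the full string is resolved against
        // the parent "foo", which duplicates the parent segment.
        System.out.println(Paths.get(reportsDir).resolveSibling(reportsDir + ".tgz"));
        // Suggested behaviour: only "bar.tgz" is resolved against the parent.
        System.out.println(siblingArchive(reportsDir, ".tgz"));
    }
}
```

With this shape, a plain `reports` argument still yields `reports.tgz`, so the flat case handled by the current code is unchanged.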
##########
tika-eval/tika-eval-app/src/main/java/org/apache/tika/eval/app/ExtractComparerRunner.java:
##########
@@ -263,6 +268,40 @@ private static void deleteDirectory(Path dir) throws IOException {
}
}
+ /**
+ * Gzip the H2 db file (&lt;dbPath&gt;.mv.db -> &lt;dbPath&gt;.mv.db.gz) so it can be
+ * transferred. The db connection is already closed by {@link #execute} before
+ * this runs, so the file is unlocked. No-op (with a warning) when there is no
+ * on-disk file db to gzip: a temp db (no -d) or an explicit jdbc connection string.
+ */
+ private static void gzipDb(String dbPath, boolean usesTempDb) throws IOException {
+ if (usesTempDb) {
+ LOG.warn("-z (gzip) ignored: no -d db specified, so there is no db file to transfer");
+ return;
+ }
+ if (dbPath.startsWith("jdbc:")) {
+ LOG.warn("-z (gzip) ignored: db is an explicit jdbc connection ({}), not a local file", dbPath);
+ return;
+ }
+ Path dbFile = Paths.get(dbPath + ".mv.db");
+ if (!Files.isRegularFile(dbFile)) {
Review Comment:
`-z/--gzip` is ignored for any JDBC connection string, including
`jdbc:h2:file:...` URLs that do point at an on-disk DB file. Since `-d` already
accepts JDBC URLs, consider supporting the common `jdbc:h2:file:` case by
extracting the file base path (stripping any `;` options) instead of
unconditionally warning and returning.
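One way this could look, as a sketch only: the class `H2FileUrlSketch` and the helper `h2FileBasePath` are hypothetical and not part of the PR. It assumes the URL has the form `jdbc:h2:file:<path>[;OPTION=value...]` with a lowercase prefix, and it deliberately does not handle other H2 URL forms (for example `jdbc:h2:~/test` or `jdbc:h2:mem:`), which would still fall through to the existing warning.

```java
import java.util.Optional;

class H2FileUrlSketch {

    private static final String H2_FILE_PREFIX = "jdbc:h2:file:";

    // Hypothetical helper (not in the PR): returns the on-disk base path of a
    // "jdbc:h2:file:" URL with any ";OPTION=value" settings stripped, or
    // Optional.empty() when the string is not such a URL.
    static Optional<String> h2FileBasePath(String db) {
        if (db == null || !db.startsWith(H2_FILE_PREFIX)) {
            return Optional.empty();
        }
        String rest = db.substring(H2_FILE_PREFIX.length());
        int optionsStart = rest.indexOf(';');
        String base = optionsStart < 0 ? rest : rest.substring(0, optionsStart);
        return base.isEmpty() ? Optional.empty() : Optional.of(base);
    }

    public static void main(String[] args) {
        // The caller could append ".mv.db" to the base path, as gzipDb already
        // does for a plain -d path.
        h2FileBasePath("jdbc:h2:file:/tmp/eval;AUTO_SERVER=TRUE")
                .ifPresent(base -> System.out.println(base + ".mv.db"));
    }
}
```

In `gzipDb`, the `jdbc:` branch could then warn and return only when this helper yields an empty result, and otherwise continue with the extracted base path.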
##########
docs/modules/ROOT/pages/advanced/integration-testing/tika-eval-regression.adoc:
##########
@@ -191,7 +191,8 @@ java -jar tika-eval/tika-eval-app/target/tika-eval-app-{tika-version}.jar \
The `Compare` subcommand keyword is optional — the CLI infers it from
the `-a` / `-b` flags. The `-r` flag both runs the Report stage and
-zips the resulting reports directory for easy archiving.
+tgz's the resulting reports directory (`<reportsDir>.tgz`) for easy
+archiving.
Review Comment:
“tgz's” reads as informal and awkward in documentation. Consider rephrasing to
describe creating a `.tgz` archive of the reports directory instead.