[
https://issues.apache.org/jira/browse/IGNITE-28843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nikolay Izhikov updated IGNITE-28843:
-------------------------------------
Labels: IEP-109 ise (was: )
> Improve diagnostics when snapshot/dump directory creation fails
> ---------------------------------------------------------------
>
> Key: IGNITE-28843
> URL: https://issues.apache.org/jira/browse/IGNITE-28843
> Project: Ignite
> Issue Type: Task
> Reporter: Anton Vinogradov
> Assignee: Anton Vinogradov
> Priority: Major
> Labels: IEP-109, ise
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When snapshot or dump directory creation fails, Ignite reports only the
> target path with no OS-level cause, e.g.:
> class org.apache.ignite.IgniteCheckedException: Dump directory can't be
> created: /opt/ignite/nvme5/snapshot/dump_.../db/cell_2_node_2/
> cache-replication.replications_v1
> The real reason (permissions, no space, read-only FS, a file where a
> directory is expected, missing mount) is lost because the code uses
> java.io.File.mkdirs(), which returns a boolean and discards the cause.
> mkdirs() also returns false when the directory already exists, which can
> produce a misleading error on retry.
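
To illustrate the point above, here is a minimal sketch of how the two APIs behave when called twice on the same path. The class and method names are made up for this example and are not part of Ignite.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustration only (hypothetical class): File.mkdirs() vs Files.createDirectories(). */
final class MkdirsVsCreateDirectories {
    /** Creates a scratch directory to work in. */
    static Path newTempBase() {
        try {
            return Files.createTempDirectory("mkdirs-demo");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns what File.mkdirs() reports when called twice on the same path. */
    static boolean[] mkdirsTwice(File dir) {
        boolean first = dir.mkdirs();  // true only if this call created the directory
        boolean second = dir.mkdirs(); // false: nothing was created, although the directory exists
        return new boolean[] {first, second};
    }

    /** Calls Files.createDirectories() twice; an existing directory is not an error. */
    static boolean createDirectoriesTwice(Path dir) {
        try {
            Files.createDirectories(dir);
            Files.createDirectories(dir); // no exception for an already existing directory
            return Files.isDirectory(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path base = newTempBase();
        boolean[] res = mkdirsTwice(base.resolve("a").resolve("b").toFile());
        System.out.println("mkdirs: first=" + res[0] + ", second=" + res[1]);
        System.out.println("createDirectories twice: "
            + createDirectoriesTwice(base.resolve("c").resolve("d")));
    }
}
```

A caller that treats a false return from mkdirs() as a failure would therefore report an error for a directory that is already in place.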
> Fix: create directories via NIO Files.createDirectories(Path). It is
> idempotent for existing directories and throws a typed IOException
> (AccessDeniedException; FileSystemException "No space left on device";
> ReadOnlyFileSystemException; NotDirectoryException /
> FileAlreadyExistsException; NoSuchFileException) that carries the OS
> reason. Include the
> exception class and message in the thrown Ignite exception and attach the
> original exception as the cause.
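
A rough sketch of the approach described above, assuming a simplified helper. This is not the actual IgniteUtils.ensureDirectory code: the class name, the (Path, String) signature and the DirectoryException type (a stand-in for IgniteCheckedException) are hypothetical, and the message layout follows the example quoted in this ticket. Only IOException is handled here; unchecked exceptions are left to propagate.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of a directory helper that keeps the OS-level reason. */
final class EnsureDirectorySketch {
    /** Stand-in for IgniteCheckedException so the sketch compiles without Ignite. */
    static final class DirectoryException extends Exception {
        DirectoryException(String msg, Throwable cause) {
            super(msg, cause);
        }
    }

    /**
     * Creates the directory (and missing parents) if needed.
     *
     * @param dir  directory to create
     * @param what human-readable description, e.g. "dump directory"
     */
    static Path ensureDirectory(Path dir, String what) throws DirectoryException {
        try {
            // Succeeds without error when the directory already exists.
            return Files.createDirectories(dir);
        } catch (IOException e) {
            // Put the exception class and message into the text, and keep the cause.
            throw new DirectoryException("Failed to create " + what + ": " + dir
                + " [reason=" + e.getClass().getSimpleName()
                + ", detail=" + e.getMessage() + "]", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // A regular file used as a path component forces a failure
        // that does not depend on permissions or the running user.
        Path file = Files.createTempFile("not-a-dir", ".tmp");
        try {
            ensureDirectory(file.resolve("child"), "dump directory");
        } catch (DirectoryException e) {
            System.out.println(e.getMessage());
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The exact exception class and detail text in the output depend on the operating system and file system, so none is shown here.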
> Updated:
> - IgniteUtils.ensureDirectory(File, String, IgniteLogger) - shared helper
> used by WAL, binary metadata, maintenance, snapshot and dump config.
> - CreateDumpFutureTask.prepare() - the exact spot from the report; now
> routed through ensureDirectory (also removes the inconsistency with
> saveCacheConfigs, which already used ensureDirectory).
> - SnapshotFutureTask - temp cache-configuration directory.
> - SharedFileTree.mkdir(File, String) - remaining raw-mkdirs throwing site.
> Not changed: IgniteUtils.mkdirs(File) keeps its boolean contract - a
> boolean cannot carry a reason, and 9 call-sites (3 of them ignoring the
> result) rely on it.
> Resulting message example:
> Failed to create dump directory: /opt/ignite/nvme5/snapshot/.../
> cell_2_node_2/cache-replication.replications_v1
> [reason=AccessDeniedException, detail=/opt/ignite/.../cell_2_node_2:
> Permission denied]
> Testing:
> IgniteUtilsUnitTest, 3 new cases (plain JUnit, no grid started):
> - ensureDirectoryIsIdempotentForExistingDirectory - an existing directory
> must not raise an error (guards the mkdirs()-returns-false regression).
> - ensureDirectorySurfacesReasonAndCauseWhenPathComponentIsFile - the
> message contains "reason=" and the cause is an IOException
> (deterministic, independent of permissions or the running user).
> - ensureDirectoryReportsAccessDeniedWhenParentNotWritable - a non-writable
> parent yields AccessDeniedException in both the message and the cause.
> Guarded with assumeTrue(POSIX) and an assumeFalse writable-probe so it
> self-skips on non-POSIX filesystems or when running as root.
> Verified locally against the compiled class: OK (3 tests).
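
For readers unfamiliar with the guard described for the third test case, the following is one possible way to express it in plain Java, without JUnit. It is a sketch under assumptions: the class and method names are invented, and the real test uses assumeTrue/assumeFalse rather than returning a status string.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermissions;

/** Hypothetical sketch of the "POSIX only, not root" guard around an access-denied check. */
final class AccessDeniedGuardSketch {
    /** True if the default file system exposes POSIX permissions. */
    static boolean posixSupported() {
        return FileSystems.getDefault().supportedFileAttributeViews().contains("posix");
    }

    /** Writable probe: a privileged user can still create entries in an r-x directory. */
    static boolean canCreateIn(Path dir) {
        try {
            Files.delete(Files.createFile(dir.resolve("probe")));
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Returns a "skipped: ..." marker, or the simple name of the exception that was raised. */
    static String run() {
        if (!posixSupported())
            return "skipped: non-POSIX file system";
        try {
            Path parent = Files.createTempDirectory("ro-parent");
            Files.setPosixFilePermissions(parent, PosixFilePermissions.fromString("r-xr-xr-x"));
            try {
                if (canCreateIn(parent))
                    return "skipped: parent is still writable";
                Files.createDirectories(parent.resolve("child"));
                return "unexpected success";
            } catch (IOException e) {
                return e.getClass().getSimpleName();
            } finally {
                // Restore permissions so the scratch directory can be cleaned up.
                Files.setPosixFilePermissions(parent, PosixFilePermissions.fromString("rwx------"));
                Files.deleteIfExists(parent.resolve("child"));
                Files.delete(parent);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The probe matters because permission bits alone do not stop a privileged user, so without it the check would fail spuriously when the build runs as root.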
--
This message was sent by Atlassian Jira
(v8.20.10#820010)