Copilot commented on code in PR #2875:
URL: https://github.com/apache/tika/pull/2875#discussion_r3363955646


##########
tika-core/src/main/java/org/apache/tika/io/FilenameUtils.java:
##########
@@ -268,6 +270,51 @@ public static String getSanitizedEmbeddedFilePath(Metadata 
metadata,
         return retPath;
     }
 
+    /**
+     * Resolves {@code name} against {@code dir} and returns the result only 
if it stays
+     * within {@code dir}.
+     * <p>
+     * The result is {@link Path#normalize() normalized} before it is checked. 
This is the
+     * load-bearing step: without it, resolving a name such as {@code ../x} 
yields the literal
+     * path {@code dir/../x}, whose path elements still begin with {@code 
dir}, so the
+     * {@link Path#startsWith(Path)} check below would wrongly accept it. 
Normalizing first
+     * collapses the {@code ..} so the check sees the real location.
+     * <p>
+     * Containment is tested with {@link Path#startsWith(Path)} (element by 
element) rather
+     * than by comparing path strings, so a sibling whose name merely shares a 
textual prefix
+     * with {@code dir} (for example {@code /a/bc} against {@code /a/b}) is 
correctly treated
+     * as being outside {@code dir}.
+     *
+     * @param dir  the directory the resolved path must stay within
+     * @param name the child name to resolve against {@code dir}; may contain 
relative
+     *             segments such as {@code ..}
+     * @return the resolved, normalized path, guaranteed to start with the 
normalized
+     *         {@code dir}
+     * @throws IOException if {@code name} resolves to a location outside 
{@code dir}
+     */

Review Comment:
   The Javadoc for `resolveWithin` says it throws `IOException` only when 
`name` resolves outside `dir`, but the implementation can also throw 
`IOException` from the `toRealPath()` calls in the symlink defense-in-depth 
block (e.g., permission issues, broken symlinks, other I/O errors) even when 
the resolved path is inside `dir`. Please widen the `@throws` description to 
match the actual behavior.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to