tballison commented on code in PR #2875:
URL: https://github.com/apache/tika/pull/2875#discussion_r3363917637
##########
tika-core/src/main/java/org/apache/tika/io/FilenameUtils.java:
##########
@@ -268,6 +270,39 @@ public static String getSanitizedEmbeddedFilePath(Metadata
metadata,
return retPath;
}
+ /**
+ * Resolves {@code name} against {@code dir} and returns the result only
if it stays
+ * within {@code dir}.
+ * <p>
+ * The result is {@link Path#normalize() normalized} before it is checked.
This is the
+ * load-bearing step: without it, resolving a name such as {@code ../x}
yields the literal
+ * path {@code dir/../x}, whose path elements still begin with {@code
dir}, so the
+ * {@link Path#startsWith(Path)} check below would wrongly accept it.
Normalizing first
+ * collapses the {@code ..} so the check sees the real location.
+ * <p>
+ * Containment is tested with {@link Path#startsWith(Path)} (element by
element) rather
+ * than by comparing path strings, so a sibling whose name merely shares a
textual prefix
+ * with {@code dir} (for example {@code /a/bc} against {@code /a/b}) is
correctly treated
+ * as being outside {@code dir}.
+ *
+ * @param dir the directory the resolved path must stay within
+ * @param name the child name to resolve against {@code dir}; may contain
relative
+ * segments such as {@code ..}
+ * @return the resolved, normalized path, guaranteed to start with the
normalized
+ * {@code dir}
+ * @throws IOException if {@code name} resolves to a location outside
{@code dir}
+ */
+ public static Path resolveWithin(Path dir, String name) throws IOException
{
+ Path normalizedDir = dir.normalize();
+ Path resolved = normalizedDir.resolve(name).normalize();
+ if (!resolved.startsWith(normalizedDir)) {
+ throw new IOException(
+ "'" + name + "' resolves to '" + resolved + "', which is
outside of '" +
+ normalizedDir + "'");
+ }
+ return resolved;
+ }
Review Comment:
This doesn't buy us anything for isatab, but it is useful to have in a
general check.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]