https://bz.apache.org/bugzilla/show_bug.cgi?id=66878

            Bug ID: 66878
           Summary: Invalid URL entered by user as hyperlink target causes
                    exception when parsing.
           Product: POI
           Version: unspecified
          Hardware: Other
                OS: Linux
            Status: NEW
          Severity: minor
          Priority: P2
         Component: OPC
          Assignee: dev@poi.apache.org
          Reporter: th...@egnyte.com
  Target Milestone: ---

An invalid value entered into the document for a hyperlink causes an exception.
This is the destination of the hyperlink. As this is a user entered value, I'm
not sure why it should ever be looked at. I don't believe there is any
validation, so this field can have any garbage in it. It should be ignored by
POI.

Version is whatever comes with Apache Tika 2.7.0

org.apache.poi.openxml4j.opc.PackageRelationshipCollection - Cannot convert https://cloud.google.com/bigtable/docs/backups#what-for%20https://cloud.google.com/bigtable/docs/release-notes#December_08_2022 in a valid relationship URI-> dummy-URI used
java.net.URISyntaxException: Illegal character in fragment at index 110: https://cloud.google.com/bigtable/docs/backups#what-for%20https://cloud.google.com/bigtable/docs/release-notes#December_08_2022
        at java.base/java.net.URI$Parser.fail(URI.java:2974)
        at java.base/java.net.URI$Parser.checkChars(URI.java:3145)
        at java.base/java.net.URI$Parser.parse(URI.java:3189)
        at java.base/java.net.URI.<init>(URI.java:623)
        at org.apache.poi.openxml4j.opc.PackagingURIHelper.toURI(PackagingURIHelper.java:723)
        at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.parseRelationshipsPart(PackageRelationshipCollection.java:358)
        at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.<init>(PackageRelationshipCollection.java:160)
        at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.<init>(PackageRelationshipCollection.java:130)
        at org.apache.poi.openxml4j.opc.PackagePart.loadRelationships(PackagePart.java:565)
        at org.apache.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:751)
        at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:322)
        at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:123)
        at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:115)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:195)
        at org.apache.tika.Tika.parseToString(Tika.java:525)
        at org.apache.tika.Tika.parseToString(Tika.java:495)
        at com.purato.index.documenthandler.TikaDocumentHandler.getText(TikaDocumentHandler.java:52)
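
For reference, the URISyntaxException in the trace can be reproduced with plain java.net.URI, without POI or Tika. The sketch below is only an illustration: the class name HyperlinkTargetCheck and the helpers tryParse and firstIllegalIndex are hypothetical and are not part of POI. Index 110 is the second '#' in the hyperlink target, which java.net.URI does not accept inside a fragment. The tryParse helper shows one possible way a caller might tolerate such a value; it is not a description of how POI handles it.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Optional;

public class HyperlinkTargetCheck {

    // Hyperlink target copied from the log message in this report.
    static final String TARGET =
        "https://cloud.google.com/bigtable/docs/backups#what-for%20"
        + "https://cloud.google.com/bigtable/docs/release-notes#December_08_2022";

    /** Hypothetical helper: returns an empty Optional instead of throwing. */
    static Optional<URI> tryParse(String target) {
        try {
            return Optional.of(new URI(target));
        } catch (URISyntaxException e) {
            return Optional.empty();
        }
    }

    /** Index of the character java.net.URI rejects, or -1 if the string parses. */
    static int firstIllegalIndex(String target) {
        try {
            new URI(target);
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    public static void main(String[] args) {
        int index = firstIllegalIndex(TARGET);
        System.out.println("rejected at index: " + index);
        if (index >= 0) {
            System.out.println("offending character: " + TARGET.charAt(index));
        }
        System.out.println("parsed leniently: " + tryParse(TARGET).isPresent());
    }
}
```

Returning an Optional keeps the decision about what to do with an unparseable target (skip it, log it, keep the raw string) with the caller rather than in the parsing step.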