dependabot[bot] opened a new pull request, #1492: URL: https://github.com/apache/incubator-stormcrawler/pull/1492
Bumps [org.jsoup:jsoup](https://github.com/jhy/jsoup) from 1.18.3 to 1.19.1. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/jhy/jsoup/releases">org.jsoup:jsoup's releases</a>.</em></p> <blockquote> <h2>jsoup 1.19.1</h2> <h3>Changes</h3> <ul> <li>Added support for <strong>http/2</strong> requests in <code>Jsoup.connect()</code>, when running on Java 11+, via the Java HttpClient implementation. <a href="https://redirect.github.com/jhy/jsoup/pull/2257">#2257</a>. <ul> <li>In this version of jsoup, the default is to make requests via the HttpURLConnection implementation: use <strong><code>System.setProperty("jsoup.useHttpClient", "true");</code></strong> to enable making requests via the HttpClient instead, which will enable http/2 support, if available. This will become the default in a later version of jsoup, so now is a good time to validate it.</li> <li>If you are repackaging the jsoup jar in your deployment (i.e. creating a shaded- or a fat-jar), make sure to specify that as a Multi-Release JAR.</li> <li>If the <code>HttpClient</code> implementation is not available in your JRE, requests will continue to be made via <code>HttpURLConnection</code> (in <code>http/1.1</code> mode).</li> </ul> </li> <li>Updated the minimum Android API Level validation from 10 to <strong>21</strong>. As with previous jsoup versions, Android developers need to enable core library desugaring. The minimum Java version remains Java 8. <a href="https://redirect.github.com/jhy/jsoup/pull/2173">#2173</a></li> <li>Removed previously deprecated class: <code>org.jsoup.UncheckedIOException</code> (replace with <code>java.io.UncheckedIOException</code>); moved previously deprecated method <code>Element Element#forEach(Consumer)</code> to <code>void Element#forEach(Consumer)</code>.
<a href="https://redirect.github.com/jhy/jsoup/pull/2246">#2246</a></li> <li>Deprecated the methods <code>Document#updateMetaCharsetElement(boolean)</code> and <code>Document#updateMetaCharsetElement()</code>, as the setting had no effect. When <code>Document#charset(Charset)</code> is called, the document's meta charset or XML encoding instruction is always set. <a href="https://redirect.github.com/jhy/jsoup/pull/2247">#2247</a></li> </ul> <h3>Improvements</h3> <ul> <li>When cleaning HTML with a <code>Safelist</code> that preserves relative links, the <code>isValid()</code> method will now consider these links valid. Additionally, the enforced attribute <code>rel=nofollow</code> will only be added to external links when configured in the safelist. <a href="https://redirect.github.com/jhy/jsoup/pull/2245">#2245</a></li> <li>Added <code>Element#selectStream(String query)</code> and <code>Element#selectStream(Evaluator)</code> methods that return a <code>Stream</code> of matching elements. Elements are evaluated and returned as they are found, and the stream can be terminated early. <a href="https://redirect.github.com/jhy/jsoup/pull/2092">#2092</a></li> <li><code>Element</code> objects now implement <code>Iterable</code>, enabling them to be used in enhanced for loops.</li> <li>Added support for fragment parsing from a <code>Reader</code> via <code>Parser#parseFragmentInput(Reader, Element, String)</code>. <a href="https://redirect.github.com/jhy/jsoup/issues/1177">#1177</a></li> <li>Reintroduced CLI executable examples, in <code>jsoup-examples.jar</code>. <a href="https://redirect.github.com/jhy/jsoup/issues/1702">#1702</a></li> <li>Optimized performance of selectors like <code>#id .class</code> (and other similar descendant queries) by around 4.6x, by better balancing the Ancestor evaluator's cost function in the query planner.
<a href="https://redirect.github.com/jhy/jsoup/issues/2254">#2254</a></li> <li>Removed the legacy parsing rules for <code>&lt;isindex&gt;</code> tags, which would autovivify a <code>form</code> element with labels. This is no longer in the spec.</li> <li>Added <code>Elements.selectFirst(String cssQuery)</code> and <code>Elements.expectFirst(String cssQuery)</code>, to select the first matching element from an <code>Elements</code> list. <a href="https://redirect.github.com/jhy/jsoup/pull/2263/">#2263</a></li> <li>When parsing with the XML parser, XML Declarations and Processing Instructions are directly handled, rather than bouncing through the HTML parser's bogus comment handler. Serialization for non-doctype declarations no longer ends with a spurious <code>!</code>. <a href="https://redirect.github.com/jhy/jsoup/pull/2275">#2275</a></li> <li>When converting parsed HTML to XML or the W3C DOM, element names containing <code>&lt;</code> are normalized to <code>_</code> to ensure valid XML. For example, <code>&lt;foo&lt;bar&gt;</code> becomes <code>&lt;foo_bar&gt;</code>, as XML does not allow <code>&lt;</code> in element names, but HTML5 does. <a href="https://redirect.github.com/jhy/jsoup/pull/2276">#2276</a></li> <li>Reimplemented the HTML5 Adoption Agency Algorithm to the current spec. This handles mis-nested formatting / structural elements. <a href="https://redirect.github.com/jhy/jsoup/pull/2278">#2278</a></li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/jhy/jsoup/blob/master/CHANGES.md">org.jsoup:jsoup's changelog</a>.</em></p> <blockquote> <h2>1.19.1 (2025-03-04)</h2> <h3>Changes</h3> <ul> <li>Added support for <strong>http/2</strong> requests in <code>Jsoup.connect()</code>, when running on Java 11+, via the Java HttpClient implementation. <a href="https://redirect.github.com/jhy/jsoup/pull/2257">#2257</a>.
<ul> <li>In this version of jsoup, the default is to make requests via the HttpURLConnection implementation: use <strong><code>System.setProperty("jsoup.useHttpClient", "true");</code></strong> to enable making requests via the HttpClient instead, which will enable http/2 support, if available. This will become the default in a later version of jsoup, so now is a good time to validate it.</li> <li>If you are repackaging the jsoup jar in your deployment (i.e. creating a shaded- or a fat-jar), make sure to specify that as a Multi-Release JAR.</li> <li>If the <code>HttpClient</code> implementation is not available in your JRE, requests will continue to be made via <code>HttpURLConnection</code> (in <code>http/1.1</code> mode).</li> </ul> </li> <li>Updated the minimum Android API Level validation from 10 to <strong>21</strong>. As with previous jsoup versions, Android developers need to enable core library desugaring. The minimum Java version remains Java 8. <a href="https://redirect.github.com/jhy/jsoup/pull/2173">#2173</a></li> <li>Removed previously deprecated class: <code>org.jsoup.UncheckedIOException</code> (replace with <code>java.io.UncheckedIOException</code>); moved previously deprecated method <code>Element Element#forEach(Consumer)</code> to <code>void Element#forEach(Consumer)</code>. <a href="https://redirect.github.com/jhy/jsoup/pull/2246">#2246</a></li> <li>Deprecated the methods <code>Document#updateMetaCharsetElement(boolean)</code> and <code>Document#updateMetaCharsetElement()</code>, as the setting had no effect. When <code>Document#charset(Charset)</code> is called, the document's meta charset or XML encoding instruction is always set. <a href="https://redirect.github.com/jhy/jsoup/pull/2247">#2247</a></li> </ul> <h3>Improvements</h3> <ul> <li>When cleaning HTML with a <code>Safelist</code> that preserves relative links, the <code>isValid()</code> method will now consider these links valid.
Additionally, the enforced attribute <code>rel=nofollow</code> will only be added to external links when configured in the safelist. <a href="https://redirect.github.com/jhy/jsoup/pull/2245">#2245</a></li> <li>Added <code>Element#selectStream(String query)</code> and <code>Element#selectStream(Evaluator)</code> methods that return a <code>Stream</code> of matching elements. Elements are evaluated and returned as they are found, and the stream can be terminated early. <a href="https://redirect.github.com/jhy/jsoup/pull/2092">#2092</a></li> <li><code>Element</code> objects now implement <code>Iterable</code>, enabling them to be used in enhanced for loops.</li> <li>Added support for fragment parsing from a <code>Reader</code> via <code>Parser#parseFragmentInput(Reader, Element, String)</code>. <a href="https://redirect.github.com/jhy/jsoup/issues/1177">#1177</a></li> <li>Reintroduced CLI executable examples, in <code>jsoup-examples.jar</code>. <a href="https://redirect.github.com/jhy/jsoup/issues/1702">#1702</a></li> <li>Optimized performance of selectors like <code>#id .class</code> (and other similar descendant queries) by around 4.6x, by better balancing the Ancestor evaluator's cost function in the query planner. <a href="https://redirect.github.com/jhy/jsoup/issues/2254">#2254</a></li> <li>Removed the legacy parsing rules for <code>&lt;isindex&gt;</code> tags, which would autovivify a <code>form</code> element with labels. This is no longer in the spec.</li> <li>Added <code>Elements.selectFirst(String cssQuery)</code> and <code>Elements.expectFirst(String cssQuery)</code>, to select the first matching element from an <code>Elements</code> list. <a href="https://redirect.github.com/jhy/jsoup/pull/2263/">#2263</a></li> <li>When parsing with the XML parser, XML Declarations and Processing Instructions are directly handled, rather than bouncing through the HTML parser's bogus comment handler. Serialization for non-doctype declarations no longer ends with a spurious <code>!</code>.
<a href="https://redirect.github.com/jhy/jsoup/pull/2275">#2275</a></li> <li>When converting parsed HTML to XML or the W3C DOM, element names containing <code>&lt;</code> are normalized to <code>_</code> to ensure valid XML. For example, <code>&lt;foo&lt;bar&gt;</code> becomes <code>&lt;foo_bar&gt;</code>, as XML does not allow <code>&lt;</code> in element names, but HTML5 does. <a href="https://redirect.github.com/jhy/jsoup/pull/2276">#2276</a></li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/jhy/jsoup/commit/5c4c09a5cbf1271ceeac32452cbc77d0795b81f9"><code>5c4c09a</code></a> [maven-release-plugin] prepare release jsoup-1.19.1</li> <li><a href="https://github.com/jhy/jsoup/commit/7de25be1ed3ee28c1500e465f60ec9968e9e9aeb"><code>7de25be</code></a> Updated changelog in preparation of release</li> <li><a href="https://github.com/jhy/jsoup/commit/6d7a058efb6f3ab97fdd262602b6331dd8750b2d"><code>6d7a058</code></a> Use 'el' instead of 'node' in adoption agency</li> <li><a href="https://github.com/jhy/jsoup/commit/0679bef07f1e29ae72ae54102d5af9a1f80d45d4"><code>0679bef</code></a> Perf: removed redundant lowercase normalization</li> <li><a href="https://github.com/jhy/jsoup/commit/d80275e16ebd34bae5b48f29f3e4437e1b207955"><code>d80275e</code></a> Performance tweak when appending tag names</li> <li><a href="https://github.com/jhy/jsoup/commit/4b733b16a8ac0dd417c1c57db78743a2b13ea1a1"><code>4b733b1</code></a> Updated InScope search basetypes to be namespace aware</li> <li><a href="https://github.com/jhy/jsoup/commit/d89d75794a3daffb5806eaf683d2ad07b66b9647"><code>d89d757</code></a> Changelog tidy</li> <li><a href="https://github.com/jhy/jsoup/commit/5fde3d992626d0d255793d030c241f2021d39faa"><code>5fde3d9</code></a> Changelog for <a href="https://redirect.github.com/jhy/jsoup/issues/2281">#2281</a></li> <li><a
href="https://github.com/jhy/jsoup/commit/d55469aa637a3fea7d2a8a7a6291f6ca1774df4f"><code>d55469a</code></a> Clone the Parser when cloning a Document</li> <li><a href="https://github.com/jhy/jsoup/commit/11a033427bd7e69f485222998ab7ab60e017dbe3"><code>11a0334</code></a> Concurrency note</li> <li>Additional commits viewable in <a href="https://github.com/jhy/jsoup/compare/jsoup-1.18.3...jsoup-1.19.1">compare view</a></li> </ul> </details> <br />

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

---

<details> <summary>Dependabot commands and options</summary> <br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

</details>

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@stormcrawler.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
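
Two of the changes listed in the release notes can affect calling code directly: the opt-in `jsoup.useHttpClient` system property and the removal of `org.jsoup.UncheckedIOException` in favour of `java.io.UncheckedIOException`. The following is a minimal sketch of how both might be exercised when validating the bump. It is not taken from the jsoup or StormCrawler code bases: the class `JsoupUpgradeCheck` and its methods are hypothetical names chosen for illustration, and the failing fetch is simulated so that the snippet runs with the JDK alone.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of jsoup or StormCrawler.
final class JsoupUpgradeCheck {

    // Property name as given in the jsoup 1.19.1 release notes.
    static final String USE_HTTP_CLIENT = "jsoup.useHttpClient";

    // Opts in to the HttpClient-based implementation described in the notes.
    static void enableHttpClient() {
        System.setProperty(USE_HTTP_CLIENT, "true");
    }

    // Runs a fetch and returns the fallback if it fails with
    // java.io.UncheckedIOException, the JDK class that replaces the
    // removed org.jsoup.UncheckedIOException in catch clauses.
    static String fetchOrFallback(Supplier<String> fetch, String fallback) {
        try {
            return fetch.get();
        } catch (UncheckedIOException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        enableHttpClient();
        System.out.println(USE_HTTP_CLIENT + "=" + System.getProperty(USE_HTTP_CLIENT));

        // Stand-in for a jsoup call that fails with an I/O error (simulated).
        String result = fetchOrFallback(() -> {
            throw new UncheckedIOException(new IOException("simulated failure"));
        }, "fallback");
        System.out.println(result);
    }
}
```

In a real project the supplier would wrap an actual jsoup call; any existing `catch (org.jsoup.UncheckedIOException e)` clause would need its import switched to the `java.io` class for the code to compile against 1.19.1.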