rzo1 commented on code in PR #1587: URL: https://github.com/apache/stormcrawler/pull/1587#discussion_r2178397592
########## core/pom.xml: ########## @@ -264,6 +265,12 @@ under the License. <scope>test</scope> </dependency> + <dependency> + <groupId>org.apache.httpcomponents.core5</groupId> Review Comment: StormCrawler is currently using HttpClient version 4, along with core components also on version 4. We should either upgrade to version 5 or maintain consistency by sticking with version 4 throughout. ########## core/src/main/java/org/apache/stormcrawler/protocol/okhttp/HttpProtocol.java: ########## @@ -352,7 +352,7 @@ public ProtocolResponse getProtocolOutput(String url, final Metadata metadata) final String lastModified = metadata.getFirstValue(HttpHeaders.LAST_MODIFIED); if (StringUtils.isNotBlank(lastModified)) { - rb.header("If-Modified-Since", HttpHeaders.formatHttpDate(lastModified)); + rb.header("If-Modified-Since", formatHttpDate(lastModified)); Review Comment: Can this be replaced by https://hc.apache.org/httpcomponents-core-4.4.x/current/httpcore/apidocs/org/apache/http/HttpHeaders.html#IF_MODIFIED_SINCE ? ########## core/src/main/java/org/apache/stormcrawler/protocol/okhttp/HttpProtocol.java: ########## @@ -352,7 +352,7 @@ public ProtocolResponse getProtocolOutput(String url, final Metadata metadata) final String lastModified = metadata.getFirstValue(HttpHeaders.LAST_MODIFIED); if (StringUtils.isNotBlank(lastModified)) { - rb.header("If-Modified-Since", HttpHeaders.formatHttpDate(lastModified)); + rb.header("If-Modified-Since", formatHttpDate(lastModified)); } final String ifNoneMatch = metadata.getFirstValue("etag", protocolMDprefix); Review Comment: https://hc.apache.org/httpcomponents-core-4.4.x/current/httpcore/apidocs/org/apache/http/HttpHeaders.html#ETAG ? ########## core/src/main/java/org/apache/stormcrawler/Metadata.java: ########## @@ -109,14 +109,15 @@ public String[] getValues(String key, String prefix) { } public String[] getValues(String key) { - String[] values = md.get(key); + if (key == null || key.isEmpty()) return null; + String[] values = md.getOrDefault(key, md.get(key.toLowerCase())); Review Comment: I would be fine to drop the backward compatibility here. We could mention in the changelog, that the next SC release have fixed odd http headers (or we stick with the compatibility, but we should drop it some day because it makes our code a bit more complex to re-think / track why it was added). If people would like to have the backward compatibility for those headers, we should maybe add a code comment referencing the issue? wdyt? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@stormcrawler.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org