Re: [I] HttpProtocol (both okhttp and apache) race condition while having different proxies in different threads [incubator-stormcrawler]

2024-07-05 Thread via GitHub
jnioche commented on issue #1247: URL: https://github.com/apache/incubator-stormcrawler/issues/1247#issuecomment-2210397038 thanks @chhsiao90, are you able to suggest a fix for it? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to Git

Re: [I] To allow ProxyManager return null (or empty proxy) for not using a proxy for some specific requests [incubator-stormcrawler]

2024-07-05 Thread via GitHub
jnioche commented on issue #1246: URL: https://github.com/apache/incubator-stormcrawler/issues/1246#issuecomment-2210396039 Yes please. [ProxyManager](https://github.com/apache/incubator-stormcrawler/blob/dc84c569bc4fd94d3935a63478c00c8f3bfcbaca/core/src/main/java/org/apache/stormcrawler

[PR] #1248 - Use pre-compiled patterns for mime type matching in TikaParser [incubator-stormcrawler]

2024-07-05 Thread via GitHub
rzo1 opened a new pull request, #1249: URL: https://github.com/apache/incubator-stormcrawler/pull/1249

Re: [I] HttpProtocol (both okhttp and apache) race condition while having different proxies in different threads [incubator-stormcrawler]

2024-07-05 Thread via GitHub
chhsiao90 commented on issue #1247: URL: https://github.com/apache/incubator-stormcrawler/issues/1247#issuecomment-2210792017 Sure, I can have a PR for it.

Re: [PR] #1248 - Use pre-compiled patterns for mime type matching in TikaParser [incubator-stormcrawler]

2024-07-05 Thread via GitHub
jnioche merged PR #1249: URL: https://github.com/apache/incubator-stormcrawler/pull/1249

Re: [PR] #1248 - Use pre-compiled patterns for mime type matching in TikaParser [incubator-stormcrawler]

2024-07-05 Thread via GitHub
jnioche commented on PR #1249: URL: https://github.com/apache/incubator-stormcrawler/pull/1249#issuecomment-2210972037 thanks @rzo1

Re: [PR] #626: Add routing field in metadata - Solr StatusUpdaterBolt [incubator-stormcrawler]

2024-07-05 Thread via GitHub
jnioche commented on PR #1242: URL: https://github.com/apache/incubator-stormcrawler/pull/1242#issuecomment-2210995521 > > and point it to the key field, otherwise it will take the unique document key. Is that correct? > > Yes, you are right - thanks for noting! where should `