Re: Release?
Hi folks,

I quickly chatted with Julien off-list. If no one objects, I am going to propose a first release candidate next week!

Best
Richard

On 2024/05/02 16:30:22 Richard Zowalla wrote:
> Ok so anything else?
> We have release docs available ;-)
> Anyone want to act as Release Manager for our first ASF release?
>
> Think we are ready issue-wise.
>
> Regards
> Richard
>
>
> On 29 April 2024 16:48:42 CEST, Julien Nioche wrote:
> >Thanks Ayush
> >
> >I have fixed the license headers in
> >https://github.com/apache/incubator-stormcrawler/pull/1201
> >
> >Julien
> >
> >On Mon, 29 Apr 2024 at 15:18, Ayush Saxena wrote:
> >
> >> Should be great.
> >> Was looking around the code & see if there are any potential issues which
> >> can block the vote.
> >> Little bit curious around some files having "Licensed to DigitalPebble Ltd
> >> under one or more" [1]
> >>
> >> Should we ditch such LICENSE headers, not sure if it is allowed or not, [2]
> >> just mentions the standard License header
> >>
> >> There are some files here in this directory [3] referring to DigitalPebble,
> >> if not required we can consider dropping before the release
> >>
> >> Some files tend to have different header as compared to one mentioned in
> >> the official doc [4], it mentions reading the NOTICE file & stuff
> >>
> >> Just reading the incubator vote checklist [4], if everything is good as per
> >> this doc, We should be good to go.
> >>
> >> Thanx Richard for initiating the discussion!!!
> >>
> >> -Ayush
> >>
> >> [1]
> >> https://github.com/apache/incubator-stormcrawler/blob/main/core/src/test/java/org/apache/stormcrawler/indexer/BasicIndexingTest.java
> >> [2] https://www.apache.org/legal/src-headers#headers
> >> [3]
> >> https://github.com/apache/incubator-stormcrawler/tree/main/core/src/test/resources
> >> [4]
> >> https://cwiki.apache.org/confluence/display/INCUBATOR/Incubator+Release+Checklist
> >>
> >> On Mon, 29 Apr 2024 at 19:16, Richard Zowalla wrote:
> >>
> >> > Hi all,
> >> >
> >> > what do we need to do to run our first ASF release?
> >> > Personally, I would love to see [1] in 3.0.
> >> >
> >> > Don't think we have any other formal blockers?
> >> >
> >> > Regards
> >> > Richard
> >> >
> >> >
> >> > [1] https://github.com/apache/incubator-stormcrawler/pull/1199
Re: [I] Newer Elasticsearch Version deprecate the REST High Level Client in favour of the Java API Client [incubator-stormcrawler]
rzo1 closed issue #945: Newer Elasticsearch Version deprecate the REST High Level Client in favour of the Java API Client URL: https://github.com/apache/incubator-stormcrawler/issues/945 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@stormcrawler.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [I] Newer Elasticsearch Version deprecate the REST High Level Client in favour of the Java API Client [incubator-stormcrawler]
rzo1 commented on issue #945: URL: https://github.com/apache/incubator-stormcrawler/issues/945#issuecomment-2092820610 We dropped ES, so closing this issue is ok now.
Re: [I] Add config to shard based on instance number instead of field [incubator-stormcrawler]
rzo1 closed issue #489: Add config to shard based on instance number instead of field URL: https://github.com/apache/incubator-stormcrawler/issues/489
Re: [I] ES IndexerBold - Fix behaviour of afterBulk [incubator-stormcrawler]
rzo1 closed issue #992: ES IndexerBold - Fix behaviour of afterBulk URL: https://github.com/apache/incubator-stormcrawler/issues/992
[PR] #1207 -- add forbidden-apis [incubator-stormcrawler]
tballison opened a new pull request, #1208:
URL: https://github.com/apache/incubator-stormcrawler/pull/1208

This just adds the plugin. I'll update the repo… o pass it in follow-on commits. This is just a WIP.

Thank you for contributing to Apache StormCrawler.

In order to streamline the review of the contribution we ask you to ensure the following steps have been taken:

### For all changes:
- [ ] Is there an issue associated with this PR? Is it referenced in the commit message?
- [ ] Does your PR title start with `#` where `` is the issue number you are trying to resolve?
- [ ] Has your PR been rebased against the latest commit within the target branch (typically main)?
- [ ] Is your initial contribution a single, squashed commit?
- [ ] Is the code properly formatted with `mvn git-code-format:format-code -Dgcf.globPattern=**/*`?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via `mvn clean verify`?
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file?
- [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file?

### Note:
Please ensure that once the PR is submitted, you check GitHub Actions for build issues and submit an update to your PR as soon as possible.
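For readers unfamiliar with the plugin: forbidden-apis fails the build when code calls a method that appears on a configured signature list. The PR description does not say which signature sets are enabled, so the following is only a sketch of the kind of call such checks commonly target (methods that silently use the default locale or default charset) next to the explicit alternative. The class and method names are hypothetical and are not taken from the StormCrawler code base.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper for illustration; not part of StormCrawler.
final class ExplicitDefaults {

    // Lower-cases with a fixed locale instead of the JVM default,
    // so the result does not depend on the machine's settings.
    static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    // Decodes with an explicit charset instead of the platform default.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Locale-sensitive variant for comparison: under a Turkish locale,
        // 'I' lower-cases to a dotless 'ı', so this differs from normalize().
        System.out.println("TITLE".toLowerCase(Locale.forLanguageTag("tr")));
        System.out.println(normalize("TITLE"));
        System.out.println(decode("crawl".getBytes(StandardCharsets.UTF_8)));
    }
}
```

Passing the locale and charset explicitly makes the behaviour identical on every build machine, which is the property such a check is meant to enforce.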
Re: [I] Add forbidden-apis [incubator-stormcrawler]
tballison commented on issue #1207: URL: https://github.com/apache/incubator-stormcrawler/issues/1207#issuecomment-2093053737 Working on this here: https://github.com/apache/incubator-stormcrawler/pull/1208
Re: [I] Add forbidden-apis [incubator-stormcrawler]
tballison commented on issue #1207: URL: https://github.com/apache/incubator-stormcrawler/issues/1207#issuecomment-2093158410 K, that's ready for review.
Re: [PR] #1207 -- add forbidden-apis [incubator-stormcrawler]
jnioche merged PR #1208: URL: https://github.com/apache/incubator-stormcrawler/pull/1208
Re: [PR] #1207 -- add forbidden-apis [incubator-stormcrawler]
jnioche commented on PR #1208: URL: https://github.com/apache/incubator-stormcrawler/pull/1208#issuecomment-2093285732 thanks @tballison
[I] Apple Silicon emulation issue in unit tests [incubator-stormcrawler]
joshfischer1108 opened a new issue, #1209:
URL: https://github.com/apache/incubator-stormcrawler/issues/1209

When compiling StormCrawler from source on Apple Silicon, we hit timeouts in the Selenium tests because the container runs under emulation.

## Steps to reproduce:

Using an Apple M3, from the top-level directory run:

```
mvn clean install
```

First we get this warning:

```
The architecture 'amd64' for image 'selenium/standalone-chrome:120.0' (ID sha256:deff784da2138b912b66e2941cc976ced4ecba3a4e6941ca3bfa2b8c6b75) does not match the Docker server architecture 'arm64'. This will cause the container to execute much more slowly due to emulation and may lead to timeout failures.
```

Then we get this error:

```
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 27.48 s <<< FAILURE! -- in org.apache.stormcrawler.protocol.selenium.ProtocolTest
[ERROR] org.apache.stormcrawler.protocol.selenium.ProtocolTest.testBlocking -- Time elapsed: 27.44 s <<< ERROR!
org.awaitility.core.ConditionTimeoutException: Condition with org.apache.stormcrawler.protocol.selenium.ProtocolTest was not fulfilled within 10 seconds.
```

Then the error should appear
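The warning in the issue compares the image architecture with the Docker server architecture. As a rough sketch of how a test could anticipate that mismatch, the code below maps the JVM's `os.arch` property to Docker's architecture names and reports whether an `amd64`-only image would run emulated. It assumes the Docker server runs on the same host as the JVM, which does not hold for remote Docker setups. `HostArch` and its methods are hypothetical names, not part of StormCrawler or Testcontainers.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of StormCrawler or Testcontainers.
final class HostArch {

    // Maps a JVM os.arch value to the architecture name Docker uses.
    static String dockerArch(String osArch) {
        String a = osArch.toLowerCase(Locale.ROOT);
        if (a.equals("aarch64") || a.equals("arm64")) {
            return "arm64";
        }
        if (a.equals("x86_64") || a.equals("amd64")) {
            return "amd64";
        }
        return a; // unknown values are passed through unchanged
    }

    // True when an image built for imageArch would need emulation on this host.
    static boolean needsEmulation(String imageArch, String osArch) {
        return !imageArch.equals(dockerArch(osArch));
    }

    public static void main(String[] args) {
        String osArch = System.getProperty("os.arch");
        System.out.println("host: " + dockerArch(osArch)
                + ", amd64 image emulated: " + needsEmulation("amd64", osArch));
    }
}
```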
Re: [I] Add forbidden-apis [incubator-stormcrawler]
rzo1 closed issue #1207: Add forbidden-apis URL: https://github.com/apache/incubator-stormcrawler/issues/1207
Re: [I] Apple Silicon emulation issue in unit tests [incubator-stormcrawler]
joshfischer1108 commented on issue #1209:
URL: https://github.com/apache/incubator-stormcrawler/issues/1209#issuecomment-2093827076

I'm looking at the test and see the code below. Are these the timeouts? I've changed them to much higher values, such as `10`, and the tests still seem to time out at about the same point on my machine (around 27 seconds).

```
timeouts.put("implicit", 1);
timeouts.put("pageLoad", 1);
timeouts.put("script", 1);
conf.put("selenium.timeouts", timeouts);
```
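One possible reading, not confirmed in the thread: the stack trace in the issue is an Awaitility `ConditionTimeoutException` reporting a 10 second limit, which would be a wait in the test harness rather than one of the `selenium.timeouts` values. If that is the case, raising the Selenium settings alone would not move the limit. The sketch below is a plain-Java stand-in for such a polling wait (it is not Awaitility's API; `PollingWait.awaitUntil` and its parameters are hypothetical) and shows that the cap is set by the caller of the wait.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a polling wait; this is not Awaitility's API.
final class PollingWait {

    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition held, false on timeout or interruption.
    static boolean awaitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // A condition that only becomes true after 200 ms, standing in for a slow container.
        BooleanSupplier slow = () -> System.nanoTime() - start > Duration.ofMillis(200).toNanos();
        System.out.println(awaitUntil(slow, Duration.ofMillis(50), Duration.ofMillis(10)));
        System.out.println(awaitUntil(slow, Duration.ofSeconds(2), Duration.ofMillis(10)));
    }
}
```

Under this reading, the value to raise on a slow emulated container would be the wait's own timeout, not the driver settings in the snippet above.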
Re: [I] Apple Silicon emulation issue in unit tests [incubator-stormcrawler]
joshfischer1108 commented on issue #1209: URL: https://github.com/apache/incubator-stormcrawler/issues/1209#issuecomment-2093832637 I forgot to add the link. [Here is where I am looking](https://github.com/apache/incubator-stormcrawler/blob/c1088fb3ff3ca9ca99bcce108d8bb2b40b97c094/core/src/test/java/org/apache/stormcrawler/protocol/selenium/ProtocolTest.java#L90)
[PR] #1209 fix for emulation error in tests run on silicon [incubator-stormcrawler]
joshfischer1108 opened a new pull request, #1210:
URL: https://github.com/apache/incubator-stormcrawler/pull/1210

This addresses the container emulation issue referenced in #1209

### For all changes:
- [x] Is there an issue associated with this PR? Is it referenced in the commit message?
- [x] Does your PR title start with `#` where `` is the issue number you are trying to resolve?
- [x] Has your PR been rebased against the latest commit within the target branch (typically main)?
- [x] Is your initial contribution a single, squashed commit?
- [x] Is the code properly formatted with `mvn git-code-format:format-code -Dgcf.globPattern=**/*`?

### For code changes:
- [x] Have you ensured that the full suite of tests is executed via `mvn clean verify`?
- [x] Have you written or updated unit tests to verify your changes?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file?
- [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file?

### Note:
Please ensure that once the PR is submitted, you check GitHub Actions for build issues and submit an update to your PR as soon as possible.
Re: [PR] #1209 fix for emulation error in tests run on silicon [incubator-stormcrawler]
rzo1 commented on code in PR #1210:
URL: https://github.com/apache/incubator-stormcrawler/pull/1210#discussion_r1589882794

## core/src/test/java/org/apache/stormcrawler/protocol/selenium/ProtocolTest.java: ##
@@ -51,7 +51,8 @@ public class ProtocolTest extends AbstractProtocolTest {
     private static final Logger LOG = LoggerFactory.getLogger(ProtocolTest.class);

     private static final DockerImageName SELENIUM_IMAGE =
-            DockerImageName.parse("selenium/standalone-chrome:120.0");
+            DockerImageName.parse("seleniarm/standalone-chromium:latest")

Review Comment:
Wonder if we can use a fixed tag? The reasoning is that "latest" can vary between environments and runs, making reproducibility difficult.
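The concern about `latest` can be turned into a small guard. The following is a minimal sketch using plain string handling only; it treats an image reference as pinned when it carries a digest or an explicit tag other than `latest`. `ImageRefs.isPinned` is a hypothetical helper, not a Testcontainers or StormCrawler API, and the reference parsing is deliberately simplified.

```java
// Hypothetical helper for illustration; not part of StormCrawler or Testcontainers.
final class ImageRefs {

    // True when the reference carries a digest or an explicit tag other than "latest".
    static boolean isPinned(String image) {
        if (image.contains("@sha256:")) {
            return true;
        }
        int lastSlash = image.lastIndexOf('/');
        int colon = image.lastIndexOf(':');
        // A colon before the last slash belongs to a registry port, not a tag.
        if (colon <= lastSlash) {
            return false; // no tag given, so Docker would fall back to "latest"
        }
        String tag = image.substring(colon + 1);
        return !tag.isEmpty() && !tag.equals("latest");
    }

    public static void main(String[] args) {
        System.out.println(isPinned("selenium/standalone-chrome:120.0"));
        System.out.println(isPinned("seleniarm/standalone-chromium:latest"));
        System.out.println(isPinned("localhost:5000/example/image"));
    }
}
```

A check like this could run as a unit test over the image names used in the test suite, so that an unpinned reference fails fast instead of producing run-to-run differences.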