Re: [VOTE] Apache StormCrawler (Incubating) 3.0 Release Candidate 2

2024-05-09 Thread Tim Allison
+1. shasum checks out for source. Built locally on Ubuntu with Java 17. Thank you! On 2024/05/07 09:12:55 Richard Zowalla wrote: > Hi folks, > > I have posted a 2nd release candidate for the Apache StormCrawler > (Incubating) 3.0 release and it is ready for testing. > > The previous VOTE was ca

Next release?

2024-11-04 Thread Tim Allison
There's been quite a bit of recent work. Should we aim for the next release in the next week or so? Are there any blockers? WDYT? Thank you! Best, Tim

Re: Next release?

2024-11-04 Thread Tim Allison
Are the Solr mods ready to go? On Mon, Nov 4, 2024 at 1:39 PM Richard Zowalla wrote: > +1 (no blockers from my side) > > On 4 November 2024 at 19:16:10 CET, Tim Allison wrote: > >There's been quite a bit of recent work. Should we aim for the next > release > >i

Re: log4j2 2.24.1 issues?

2024-11-12 Thread Tim Allison
ded in containers, i.e. but rather in the Storm Runtime environment. > > I cannot remember on which version Storm 2.7.0 is running but if there is > a difference, we should most likely downgrade. > > Regards and thanks > Richard > > On 12 November 2024 at 18:19:47 CET, Tim Allison wrote

Re: Next release?

2024-11-12 Thread Tim Allison
Looks like we should wait for https://github.com/apache/incubator-stormcrawler/issues/1401 before the release? On Sun, Nov 10, 2024 at 5:21 PM Tim Allison wrote: > 3.2.0 sounds good to me. I’m afk tomorrow, but can try to start the > release process on Tuesday (ET). > > On Sun, Nov

log4j2 2.24.1 issues?

2024-11-12 Thread Tim Allison
Over on POI and in other libraries throughout the Java land, there's a problem with log4j2 that may or may not affect us. The POI discussion is here: https://lists.apache.org/thread/bkb5y7mj3v3sld9sbk4r6jgmccs4k61j The log4j2 issues are here: https://github.com/apache/logging-log4j2/issues/3143 h

Re: Next release?

2024-11-10 Thread Tim Allison
> > > there? > > > > > > Julien > > > > > > On Mon, 4 Nov 2024 at 22:19, Julien Nioche > > wrote: > > > > > >> Hi, > > >> > > >> Needs more testing but nearly there I think. > > >> > > >> J >

[PPMC RESULT] [VOTE] Apache StormCrawler (Incubating) 3.2.0 Release Candidate #1

2024-11-18 Thread Tim Allison
All, The PPMC vote has passed. I'll start the IPMC vote shortly. Thank you. +1s PPMC Julien Nioche (binding) PPMC Markos Volikas (binding) PPMC/IPMC Richard Zowalla (binding) PPMC/IPMC Tim Allison (binding) Best, Tim P.S. I forgot to include the steps that I took to reach

[VOTE] Apache StormCrawler (Incubating) 3.2.0 Release Candidate #1

2024-11-15 Thread Tim Allison
t the necessary IPMC votes. Here's my +1 Thanks! Tim Allison

[RESULT][VOTE] Apache StormCrawler (Incubating) 3.2.0 Release Candidate #2

2024-11-26 Thread Tim Allison
The PPMC vote has passed with 4 +1s and no -1s. Julien Nioche (PPMC) Markos Volikas (PPMC) Richard Zowalla (IPMC + PPMC) Tim Allison (IPMC + PPMC) I'll start the IPMC vote shortly. On Sat, Nov 23, 2024 at 2:25 PM Markos Volikas wrote: > +1 (mvolikas PPMC) > > *

[VOTE] Apache StormCrawler (Incubating) 3.2.0 Release Candidate #3

2024-12-03 Thread Tim Allison
ed license, readme, etc. - confirmed full build locally Thanks! Tim Allison

Re: Inquiry About Contributing to Stormcrawler

2025-01-31 Thread Tim Allison
Hi Yongjun, I'm sorry for my delay. One resource that would be handy to have (if it doesn't exist) would be documentation on what parameters can be set per domain in the `seeds.txt` file. I've found a few by looking through the source code, and this might be a useful exercise for you. The output w

playwright and headers?

2025-02-05 Thread Tim Allison
This should be a question for our user@ mailing list, but I don't think we've set that up yet? I'm not sure if this is user error. The playwright protocol does not appear to do anything with "storeHttpHeaders" as the httpclient and okhttp protocols do (e.g. https://github.com/apache/incubator-sto

[USER] refetched content behavior

2025-02-07 Thread Tim Allison
Again, this is more for a user@ list. Sorry. I want to confirm I understand refetching correctly. When the crawler goes to refetch a page, it adds the If-Modified-Since and the If-None-Match (if an etag exists) headers. If the host respects those, it will return a 200 and new content if someth
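
The conditional refetch described in this message can be sketched in plain Java. This is only an illustration of the general HTTP mechanism, not StormCrawler's actual code: the class name `ConditionalRefetch` and both method names are hypothetical. It assumes the crawler has stored the `Last-Modified` and `ETag` values from the previous fetch, and that a server honouring the conditional headers answers 304 (Not Modified) when nothing has changed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of StormCrawler.
final class ConditionalRefetch {

    // Builds the conditional request headers from values stored at the last fetch.
    // A header is only added when the corresponding stored value exists.
    static Map<String, String> conditionalHeaders(String lastModified, String etag) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (lastModified != null && !lastModified.isEmpty()) {
            headers.put("If-Modified-Since", lastModified);
        }
        if (etag != null && !etag.isEmpty()) {
            headers.put("If-None-Match", etag);
        }
        return headers;
    }

    // 304 means the server found the stored copy still valid and sent no body.
    static boolean isUnchanged(int statusCode) {
        return statusCode == 304;
    }

    public static void main(String[] args) {
        Map<String, String> headers =
                conditionalHeaders("Wed, 05 Feb 2025 15:31:00 GMT", "\"abc123\"");
        headers.forEach((name, value) -> System.out.println(name + ": " + value));
        System.out.println("unchanged on 304? " + isUnchanged(304));
        System.out.println("unchanged on 200? " + isUnchanged(200));
    }
}
```

A host that ignores the conditional headers simply returns 200 with the full body every time, so a caller cannot treat 200 alone as proof that the content changed.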

Re: playwright and headers?

2025-02-07 Thread Tim Allison
Thank you! On Thu, Feb 6, 2025 at 2:39 AM Richard Zowalla wrote: > > Hi Tim, > > I think that it is just not implemented. Feel free to add code to support > that configuration. > > Regards > Richard > > > On 05.02.2025 at 15:31, Tim Allison wrote: > > >

Re: user list?

2025-02-07 Thread Tim Allison
Doh. Sorry. Please disregard. Need more coffee. We have a user list. I'm going to step away from the keyboard. On Fri, Feb 7, 2025 at 6:59 AM Tim Allison wrote: > > Should I open an infra ticket to add a user@ list? If so, any > preference for user@ or users@? > >

user list?

2025-02-07 Thread Tim Allison
Should I open an infra ticket to add a user@ list? If so, any preference for user@ or users@? Thank you. Best, Tim

[ANNOUNCE] Apache StormCrawler (Incubating) 3.2.0 released

2024-12-10 Thread Tim Allison
The Apache StormCrawler (Incubating) team is pleased to announce the release of version 3.2.0 of Apache StormCrawler. StormCrawler is a collection of resources for building low-latency, customisable and scalable web crawlers on Apache Storm. Apache StormCrawler (Incubating) 3.2.0 source distributi

[RESULT][VOTE] Apache StormCrawler (Incubating) 3.2.0 Release Candidate

2024-11-22 Thread Tim Allison
Thank you, Justin and all. IIUC, we need a majority of +1s for the release. If there's a -1, the release manager has discretion to cancel the vote. Based on Justin's findings, I'm changing my vote to a -1 (IPMC binding) and canceling this vote. I'll roll an RC2 shortly. Thank you. Best,

[VOTE] Apache StormCrawler (Incubating) 3.2.0 Release Candidate #2

2024-11-22 Thread Tim Allison
ed license, readme, etc. - confirmed full build locally Thanks! Tim Allison

[RESULT][VOTE] Apache StormCrawler (Incubating) 3.2.0 Release Candidate #3

2024-12-06 Thread Tim Allison
The PPMC vote has passed with 3 +1s and no -1s. Julien Nioche (PPMC) Richard Zowalla (IPMC + PPMC) Tim Allison (IPMC + PPMC) I'll start the IPMC vote shortly. On Fri, Dec 6, 2024 at 9:53 AM Julien Nioche wrote: > Thanks Richard > > Checked hashes and signatures >

[ANNOUNCE] Apache StormCrawler (Incubating) 3.3.0 released

2025-03-24 Thread Tim Allison
The Apache StormCrawler (Incubating) team is pleased to announce the release of version 3.3.0 of Apache StormCrawler. StormCrawler is a collection of resources for building low-latency, customisable and scalable web crawlers on Apache Storm. Apache StormCrawler (Incubating) 3.3.0 source distributi

Re: Graduation?

2025-03-18 Thread Tim Allison
+1 On Tue, Mar 18, 2025 at 8:14 PM PJ Fanning wrote: > > Hi everyone, > > Is it time to consider becoming a TLP? The community is small but the > releases are running smoothly. > > Regards, > PJ

[VOTE] Apache StormCrawler (Incubating) 3.3.0 Release Candidate #1

2025-03-18 Thread Tim Allison
Hi folks, I have posted a first release candidate for the Apache StormCrawler (Incubating) 3.3.0 release and it is ready for testing. * Upgrade Storm to 2.8.0 and other dependency upgrades * Fixed a bug in custom configuration of sitemap processing * Fixed a bug when multiple processes attempt

[RESULT][VOTE] Apache StormCrawler (Incubating) 3.3.0 Release Candidate 1

2025-03-18 Thread Tim Allison
The vote has passed with 4 +1s and no -1s Ayush Saxena (IPMC + PPMC) Markos Volikas (PPMC) Richard Zowalla (IPMC + PPMC) Tim Allison (IPMC + PPMC) I'll start the vote in the incubator shortly. On Tue, Mar 18, 2025 at 6:37 AM Ayush Saxena wrote: > > +1 (Binding) > > * Bui

FileSpout and "Local Server connection should not send BackPressure status"

2025-04-05 Thread Tim Allison
All, I recently upgraded to our latest release, and I also bumped OpenSearch to the latest. When injecting several thousand seeds with the FileSpout, I started getting the stacktrace below. I'm not confident that the upgrade caused the problem... The injector spins up a local zookeeper+storm in

Re: Candidates for PMC chair?

2025-04-08 Thread Tim Allison
I am totally willing for Richard to serve as chair. LOL. Thank you! On Fri, Apr 4, 2025 at 12:24 PM Richard Zowalla wrote: > Hi all, > > I would be willing to serve as chair. > If anyone else wants to do it, I am also fine with it. > > Regards > Richard > > On 4 April 2025 at 17:47:54 CEST, J

Re: [VOTE] Graduate Apache StormCrawler (Incubating) as a Top level Project

2025-04-18 Thread Tim Allison
StormCrawler" be and > hereby is created, the person holding such office to serve at the direction > of the Board of Directors as the chair of the Apache StormCrawler Project, > and to have primary responsibility for management of the projects within > the scope of responsibility of th

Re: Graduation?

2025-04-16 Thread Tim Allison
+1 sounds good to me On Sat, Apr 12, 2025 at 2:54 PM Julien Nioche wrote: > thanks PJ. > > We need to discuss the PMC membership as part of the graduation. I would > suggest that in order to be eligible for PMC membership, PPMC members > should have made 3 contributions to code, reviews or discu

Re: Release?

2025-03-13 Thread Tim Allison
Should I wait for: https://github.com/apache/incubator-stormcrawler/pull/1488 before cutting RC1? On Wed, Mar 5, 2025 at 8:54 AM Richard Zowalla wrote: > +1 > > On 5 March 2025 at 14:50:38 CET, Tim Allison wrote: > >I'm hoping to kick off the release process towards t

Re: Release?

2025-03-05 Thread Tim Allison
-stormcrawler/issues/621 in this > release. > > There is still testing and performance comparison to be done, and > probably some design decisions to be made when I open the PR. > > Best, > > Markos > > On 2/21/25 13:36, Richard Zowalla wrote: > > Feel free :-)

[VOTE] Apache StormCrawler (Incubating) 3.3.0 Release Candidate 1

2025-03-11 Thread Tim Allison
Hi folks, I have posted a first release candidate for the Apache StormCrawler (Incubating) 3.3.0 release and it is ready for testing. * Upgrade Storm to 2.8.0 and other dependency upgrades * Fixed a bug in custom configuration of sitemap processing * Fixed a bug when multiple processes attempt

Re: Release?

2025-02-21 Thread Tim Allison
I’d like to get two bug fixes in if possible. Early this coming week? I’m happy to be release manager again unless anyone else would like it? On Fri, Feb 21, 2025 at 4:15 AM Richard Zowalla wrote: > Hi, > > Since Storm 2.8.0 has been out for a few weeks now, what do you think about doing a > SC release?

[RESULT][VOTE] Apache StormCrawler (Incubating) 3.3.0 Release Candidate #1

2025-03-24 Thread Tim Allison
The vote was successful with three +1s from IPMC members and no -1s. Ayush Saxena Richard Zowalla (carried over from the PPMC vote) Tim Allison I'll release the artifacts and update the website shortly. Thank you, all. Best, Tim On Fri, Mar 21, 2025 at 12:27 PM Ayush Saxena wrote:

Re: First TLP Release?

2025-06-18 Thread Tim Allison
+1 On Tue, Jun 17, 2025 at 7:16 AM Richard Zowalla wrote: > > Hi all, > > What do you think about running a release soon? > > We have a few things: > > 1.) Removed incubator notes > 2.) Possibly LLM Support (pending PR) > 3.) Updates in the SOLR area (pending PR) > 4.) Multiple dependency updates