mvolikas commented on PR #1343:
URL: https://github.com/apache/incubator-stormcrawler/pull/1343#issuecomment-2453477771

   An update from my side:
   
   - I have now tested in local mode with 1 and 4 shards.
   - I have updated the `SolrSpout` code so that the query param for shards is omitted when there is only one shard.
   - I verified that compilation succeeds with the Java topologies included.
   - I improved the scripts so that they handle commented-out properties.
   - After the latest commit, the following workflow worked locally without any 
issues for me:
   
     - `mvn archetype:generate -DarchetypeGroupId=org.apache.stormcrawler -DarchetypeArtifactId=stormcrawler-solr-archetype -DarchetypeVersion=3.1.1-SNAPSHOT`
     - `cd test && mvn clean compile`
     - `/opt/apache-storm-2.6.4/bin/storm local target/test-1.0-SNAPSHOT.jar org.apache.storm.flux.Flux crawler.flux --local-ttl 3600` - the provided flux topology starts from the StormCrawler webpage URL and indexes documents in Solr.
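
To illustrate the single-shard change mentioned above, here is a minimal, self-contained sketch of the idea. It is not the actual `SolrSpout` code: the class `ShardParamSketch`, the method `buildParams`, and the `shard<N>` naming scheme are hypothetical and only show how the `shards` parameter could be left out when there is just one shard.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not the actual SolrSpout implementation. */
public class ShardParamSketch {

    /**
     * Builds the query parameters for one spout instance.
     * The "shards" entry is only added when there is more than one shard.
     */
    public static Map<String, String> buildParams(String query, int shardId, int totalShards) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", query);
        if (totalShards > 1) {
            // Assumed naming scheme "shard<N>" (1-based); the real code may differ.
            params.put("shards", "shard" + (shardId + 1));
        }
        return params;
    }

    public static void main(String[] args) {
        // One shard: no "shards" entry is added.
        System.out.println(buildParams("*:*", 0, 1));
        // Four shards: the third instance is restricted to its own shard.
        System.out.println(buildParams("*:*", 2, 4));
    }
}
```

With a single shard the query is sent without any shard restriction, so the same topology configuration can work against both single-shard and multi-shard collections.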
     
   Still to do/decide:
   
   - Should we keep any of the Java topologies? (probably just the `SeedInjector.java` one)
   - Testing with Storm 2.7.0 (so far I have tested with Storm 2.6.4 and Solr 9.7.0).
   - Review the changes and READMEs one more time once the previous two items are done.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@stormcrawler.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
