Re: [PR] SOLR-17685: Remove script creation of solr url based on SOLR_TOOL_HOST in favour of java code in CLI tools [solr]

2025-03-11 Thread via GitHub


epugh commented on PR #3223:
URL: https://github.com/apache/solr/pull/3223#issuecomment-2714863862

   > So instead of using SOLR_TOOL_HOST we could use SOLR_HOST in the 
java code. I think we can remove SOLR_TOOL_HOST altogether, since we set 
it in `bin/solr`:
   > 
   > ```shell
   > SOLR_TOOL_HOST="${SOLR_HOST:-localhost}"
   > export SOLR_TOOL_HOST
   > ```
   > 
   > So since we are already defaulting to `localhost` in the code, I think we 
can remove `SOLR_TOOL_HOST` altogether and change `String host = 
EnvUtils.getProperty("solr.tool.host", "localhost");` to `String host = 
EnvUtils.getProperty("solr.host", "localhost");` in `CLIUtils`
   > 
   > But I agree that unfortunately the AUTH_PORT is a backwards incompatible 
change. Maybe instead we still check for the `auth` command in the 9.x code and 
do `export SOLR_PORT="${AUTH_PORT}"`. And we can keep the code to get that port 
the same. But in 10, we can remove it and the additional export.
   
   Got a green on the scripts, so now looking at the AUTH_PORT.  So, I am 
thinking that on this PR we do the check for AUTH_PORT and do what you suggested.  
Commit and backport that.  And then, in a NEW PR that only goes on main, strip out 
the `export SOLR_PORT="${AUTH_PORT}"`?
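
To make the proposed order of precedence concrete, here is a minimal Python sketch of the fallback logic being discussed. It is an illustration only, not Solr code: the function `resolve_tool_endpoint` and its arguments are hypothetical, and the real logic lives in `bin/solr` and `CLIUtils`.

```python
def resolve_tool_endpoint(env, command):
    """Hypothetical illustration of the fallback order discussed above.

    env     -- a dict standing in for the process environment
    command -- the CLI sub-command being run, e.g. "auth" or "status"
    """
    # Host: SOLR_HOST if set and non-empty, otherwise "localhost",
    # mirroring the shell expansion ${SOLR_HOST:-localhost}.
    host = env.get("SOLR_HOST") or "localhost"

    # Port: for the auth command on 9.x, AUTH_PORT (if set) would be copied
    # into SOLR_PORT; every other command reads SOLR_PORT directly.
    port = env.get("SOLR_PORT")
    if command == "auth" and env.get("AUTH_PORT"):
        port = env["AUTH_PORT"]

    return host, port
```

The sketch returns `None` for the port when nothing is set, leaving the choice of a default to the caller.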
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17655: Remove ExternalFileField [solr]

2025-03-11 Thread via GitHub


epugh commented on PR #3244:
URL: https://github.com/apache/solr/pull/3244#issuecomment-2714841124

   Doh, you are right: 
https://solr.apache.org/guide/solr/latest/indexing-guide/field-types-included-with-solr.html#deprecated-field-types
 doesn't actually list ExternalFileField!
   
   If I make a separate PR to go on `branch_9x` that updates the Ref Guide and 
adds the `@Deprecated` tags, does that work?





Re: [I] Security.json is never copied to Zookeeper while creating Solr 9.8.0 Clusters [solr-operator]

2025-03-11 Thread via GitHub


gerlowskija closed issue #762: Security.json is never copied to Zookeeper while 
creating Solr 9.8.0 Clusters
URL: https://github.com/apache/solr-operator/issues/762





Re: [PR] SOLR-17674: Refresh bin/solr instructions to down play SolrCloud and introduce --user-managed example. [solr]

2025-03-11 Thread via GitHub


gerlowskija commented on code in PR #3190:
URL: https://github.com/apache/solr/pull/3190#discussion_r1989607793


##
solr/core/src/java/org/apache/solr/cli/SolrCLI.java:
##
@@ -411,21 +411,25 @@ private static void printHelp() {
 print(
 "zk ls, zk cp, zk rm , zk mv, zk mkroot, zk upconfig, zk downconfig,");
 print(
-"snapshot-create, snapshot-list, snapshot-delete, snapshot-export, snapshot-prepare-export");
+"snapshot-create, snapshot-list, snapshot-delete, snapshot-export");
 print("");
-print("  Standalone server example (start Solr running in the background on port 8984):");
+print("  Start Solr on default port 8983:");

Review Comment:
   I think I’m fine with --user-managed being second.  Between the help-text 
ordering and the “user-managed” name itself (which IMO correctly implies more 
responsibility+work for the user) - we’re giving new users a pretty substantial 
nudge towards SolrCloud.  Personally, I think that’s great.
   
   But I'm a little uncomfortable with the wording that's here in the “first 
spot”.  It doesn't really clarify the mode one way or another.  And that feels 
a little dangerous in a release that also happens to switch the "default" mode! 
 90% of users coming from 9.x standalone will read that and miss 
that the default mode has changed.  
   
   Please consider adding a tweak here so that the “first spot” does 
_something_ to indicate the mode ("SolrCloud", mention ZooKeeper, etc.)






Re: [PR] SOLR-17655: Remove ExternalFileField [solr]

2025-03-11 Thread via GitHub


gerlowskija commented on PR #3244:
URL: https://github.com/apache/solr/pull/3244#issuecomment-2714808644

   > ExternalFileField is deprecated, remove it.
   
   Is it?  SOLR-17655 looks like it was created to put that deprecation in 
place, but it's not there yet afaict.
   
   (That's not to say that we can't remove it on 'main' in anticipation of 
deprecating it in an eventual 9.9.  I'm not saying we shouldn't remove it 
necessarily - just making sure we don't have wires crossed on what's deprecated 
at the moment.)





Re: [I] Setup SOLR using Basic Authentication doesn't work. [solr-operator]

2025-03-11 Thread via GitHub


gerlowskija commented on issue #757:
URL: https://github.com/apache/solr-operator/issues/757#issuecomment-2714274457

   Hi @irwan-verint - this GitHub repo is specific to issues and improvements 
related to the Solr Operator: a particular way of running and managing Solr 
deployments in Kubernetes.
   
   Don't worry though - there are plenty of other places to ask questions and get 
help!  You might want to check out the "Slack" and "User List" sections of our 
[Community page](https://solr.apache.org/community.html).
   
   
   
   As to your particular issue, `Caused by: 
org.noggit.JSONParser$ParseException` suggests that the contents of your 
security.json file aren't valid JSON.  It might be worth running your file 
through a standalone JSON parser like ['jq'](https://jqlang.org/) to 
double-check.
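
If `jq` isn't at hand, the same kind of check can be sketched with Python's standard `json` module. This is only an illustration: the helper name `check_json` is made up here, and Python's parser is not the noggit parser Solr uses, so the two may disagree on edge cases.

```python
import json


def check_json(text):
    """Return None if text parses as JSON, else a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno/colno point at the position where parsing failed
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None
```

Reading the file and passing its contents to `check_json` returns `None` for well-formed input, or the position of the first problem otherwise (for example a trailing comma before a closing brace).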
   
   Best of luck! 





Re: [PR] fix the solr zk invocation [solr-operator]

2025-03-11 Thread via GitHub


gerlowskija merged PR #756:
URL: https://github.com/apache/solr-operator/pull/756





Re: [PR] fix the solr zk invocation [solr-operator]

2025-03-11 Thread via GitHub


gerlowskija commented on PR #756:
URL: https://github.com/apache/solr-operator/pull/756#issuecomment-2714450348

   Thanks again for all your work on this @elangelo ! 





Re: [I] Setup SOLR using Basic Authentication doesn't work. [solr-operator]

2025-03-11 Thread via GitHub


gerlowskija closed issue #757: Setup SOLR using Basic Authentication doesn't 
work.
URL: https://github.com/apache/solr-operator/issues/757





[jira] [Commented] (SOLR-17679) Request for Documentation/Feature Improvement on Hybrid Lexical and Vector Search with Score Breakdown and Cutoff Logic

2025-03-11 Thread Alessandro Benedetti (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17934215#comment-17934215
 ] 

Alessandro Benedetti commented on SOLR-17679:
-

Hi,
that's interesting and doable.
Right now, you can check the scores through the debug Solr functionality.
Feel free to contribute to the documentation, I'll be happy to review it.
If you want to sponsor it, feel free to reach out to me or other committers, 
and we'll help.

I wrote a blog about it: 
https://sease.io/2023/12/hybrid-search-with-apache-solr.html

> Request for Documentation/Feature Improvement on Hybrid Lexical and Vector 
> Search with Score Breakdown and Cutoff Logic
> ---
>
> Key: SOLR-17679
> URL: https://issues.apache.org/jira/browse/SOLR-17679
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 9.6.1
>Reporter: Khaled Alkhouli
>Priority: Minor
>  Labels: hybrid-search, search, solr, vector-based-search
> Attachments: Screenshot from 2025-02-20 16-31-48.png
>
>
> Hello Apache Solr team,
> I was able to implement a hybrid search engine that combines *lexical search 
> (edismax)* and *vector search (KNN-based embeddings)* within a single 
> request. The idea is simple:
>  * *Lexical Search* retrieves results based on text relevance.
>  * *Vector Search* retrieves results based on semantic similarity.
>  * *Hybrid Scoring* sums both scores, where a missing score (if a document 
> appears in only one search) should be treated as zero.
> This approach is working, but *there is a critical lack of documentation* on 
> how to properly return individual score components of lexical search (score1) 
> vs. vector search (score2 from cosine similarity). Right now, Solr only 
> returns the final combined score, but there is no clear way to see {*}how 
> much of that score comes from lexical search vs. vector search{*}. This is 
> essential for debugging and for fine-tuning ranking strategies.
>  
> I have implemented the following logic using Python:
> {code:python}
> import numpy as np
> import requests
>
> # embed(), SOLR_URL and HEADERS are defined elsewhere in the reporter's code.
> def hybrid_search(query, top_k=10):
>     embedding = np.array(embed([query]), dtype=np.float32)
>     # Convert to plain Python floats so the list formats as [0.1, 0.2, ...]
>     embedding = embedding[0].tolist()
>     lxq = rf"""{{!type=edismax
>                 qf='text'
>                 q.op=OR
>                 tie=0.1
>                 bq=''
>                 bf=''
>                 boost=''
>             }}({query})"""
>     solr_query = {"params": {
>         "q": "{!bool filter=$retrievalStage must=$rankingStage}",
>         "rankingStage": "{!func}sum(query($normalisedLexicalQuery),query($vectorQuery))",
>         # Union of the lexical and vector result sets
>         "retrievalStage": "{!bool should=$lexicalQuery should=$vectorQuery}",
>         "normalisedLexicalQuery": "{!func}scale(query($lexicalQuery),0,1)",
>         "lexicalQuery": lxq,
>         "vectorQuery": f"{{!knn f=all_v512 topK={top_k}}}{embedding}",
>         "fl": "text",
>         "rows": top_k,
>         "fq": [""],
>         "rq": "{!rerank reRankQuery=$rqq reRankDocs=100 reRankWeight=3}",
>         # NOTE: $cutoff is dereferenced here but not set in this snippet;
>         # it has to be supplied as a request parameter.
>         "rqq": "{!frange l=$cutoff}query($rankingStage)",
>         "sort": "score desc",
>     }}
>     response = requests.post(SOLR_URL, headers=HEADERS, json=solr_query)
>     return response.json()
> {code}
> h3. *Issues & Missing Documentation*
>  # *No Way to Retrieve Individual Scores in a Hybrid Search*
> There is no clear documentation on how to return:
>  ** The *lexical search score* separately.
>  ** The *vector search score* separately.
>  ** The *final combined score* (which Solr already provides).
> Right now, we’re left guessing whether the sum of these scores works as 
> expected, making debugging and tuning unnecessarily difficult.
>  # *No Clear Way to Implement Cutoff Logic in Solr*
> In a hybrid search, I need to filter out results that don’t meet a {*}minimum 
> score threshold{*}. Right now, I have to implement this in Python, {*}which 
> defeats the purpose of using Solr for ranking in the first place{*}.
>  ** How can we enforce a {*}score-based cutoff directly in Solr{*}, without 
> external filtering?
>  ** The \{!frange} function is mentioned in the documentation but lacks 
> {*}clear examples on how to apply it to hybrid search{*}.
> h3. *Feature Request / Documentation Improvement*
>  * *Provide a way to return individual scores for lexical and vector search 
> in the response.* This should be as simple as adding fields like 
> {{{}fl=score,lexical_score,vector_score{}}}.
>  * *Clarify how to apply cutoff logic in a hybrid search.* This is an 
> essential ranking mechanism, and yet, there’s little guidance on how to do 
> this efficiently within Solr itself.
> Looking forward to a response.
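
For readers following the issue, the scoring rule described in the report (scale the lexical score to [0, 1], add the vector score, treat a missing score as zero, then apply a minimum-score cutoff) can be written out as a small client-side Python sketch. It only illustrates the arithmetic and is not a Solr feature or API; the function name `hybrid_scores` and its inputs are hypothetical, and returning 0.0 when all lexical scores are equal is an assumption of this sketch rather than documented `scale()` behaviour.

```python
def hybrid_scores(lexical, vector, cutoff=0.0):
    """Combine per-document scores from two searches.

    lexical, vector -- dicts mapping doc id -> raw score from each search
    cutoff          -- minimum combined score a document must reach (inclusive)
    Returns a list of (doc, lexical_score, vector_score, combined) tuples,
    highest combined score first.
    """
    # Min-max scale the lexical scores to [0, 1], analogous to scale(..., 0, 1).
    if lexical:
        lo, hi = min(lexical.values()), max(lexical.values())
        span = hi - lo
        # Assumption of this sketch: if every score is equal, use 0.0.
        scaled = {d: (s - lo) / span if span else 0.0 for d, s in lexical.items()}
    else:
        scaled = {}

    rows = []
    for doc in set(scaled) | set(vector):
        lex = scaled.get(doc, 0.0)   # missing lexical score counts as zero
        vec = vector.get(doc, 0.0)   # missing vector score counts as zero
        total = lex + vec
        if total >= cutoff:
            rows.append((doc, lex, vec, total))

    # Highest combined score first; doc id breaks ties deterministically.
    rows.sort(key=lambda r: (-r[3], r[0]))
    return rows
```

Keeping the two components next to the total is what the report asks Solr to expose; here it simply falls out of computing them separately before summing.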




Re: [PR] fix the solr zk invocation [solr-operator]

2025-03-11 Thread via GitHub


gerlowskija commented on PR #756:
URL: https://github.com/apache/solr-operator/pull/756#issuecomment-2714415769

   The PR jobs are failing currently due to a setup issue:
   
   > Error: This request has been automatically failed because it uses a 
deprecated version of `actions/cache: v2`. Please update your workflow to use 
v3/v4 of actions/cache to avoid interruptions. Learn more: 
https://github.blog/changelog/2024-12-05-notice-of-upcoming-releases-and-breaking-changes-for-github-actions/#actions-cache-v1-v2-and-actions-toolkit-cache-package-closing-down
   
   I'll tackle updating this in a separate PR, so it doesn't get bundled into 
this one.  I've run the unit and integration tests locally, as well as doing a good 
deal of manual testing, so this should be ready to merge and backport!
   
   (I've skipped adding a helm chart changelog entry for this PR, as that would 
make it look like the fix was first made available in 0.10.0, when really it'll 
most likely get released in a 0.9.1.  I *will* add a changelog when I backport 
to the `release-0.9` branch however.  @HoustonPutman does that sound right to 
you, or am I missing something?)





Re: [PR] DefaultPackageRepository: simplify HTTP & JSON [solr]

2025-03-11 Thread via GitHub


epugh commented on code in PR #3253:
URL: https://github.com/apache/solr/pull/3253#discussion_r1989361358


##
solr/packaging/test/test_packages.bats:
##
@@ -58,22 +58,24 @@ teardown() {
 }
 
 # This test is useful if you are debugging/working with packages.
-# We have commented it out for now since it depends on a live internet
+# We have disabled it for now since it depends on a live internet
 # connection to run.  This could be updated with a local Repo server if we had
 # a package that is part of the Solr project to use.
-# @test "deploying and undeploying a cluster level package" {
-#  run solr start -Denable.packages=true
+@test "deploying and undeploying a cluster level package" {
+  skip "For developing package infra; requires a connection to github"

Review Comment:
   Nice!






Re: [PR] Enable picocli for a few tools [solr]

2025-03-11 Thread via GitHub


janhoy commented on PR #3247:
URL: https://github.com/apache/solr/pull/3247#issuecomment-2713858887

   I made a [JIRA](https://issues.apache.org/jira/browse/SOLR-17697) and a 
feature branch with only the gradle stuff.
   Re-purposed this PR to add the first few tools as a POC.
   More PRs should follow targeting that branch.
   
   To see status of the overall feature branch, I made a draft PR #3254 





[jira] [Commented] (SOLR-17143) Streaming with multiple shards can trigger unexpected IdleTimeout

2025-03-11 Thread Alex Deparvu (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17934221#comment-17934221
 ] 

Alex Deparvu commented on SOLR-17143:
-

>  2. Are there any low level "keep-alive" that is already built in? I couldn't 
> find any so far 

yes. it seems you are hitting the `idleTimeout` of the client [0]. could you 
try setting this to a higher value?
I am running your test successfully by only adding the following static block
```
  static {
    System.setProperty("socketTimeout", "12");
  }
```
(I removed the `jetty.withConnectorIdleTimeout` and bumped the nodes up to 75k 
to get it to fail on my machine.)

It would be good to identify if this works first, then we can look at the 
available options. If setting this as a system property is not an option, 
you could also pass a custom SolrClientCache to the StreamContext with a client 
that is already configured for a higher timeout. 


[0] 
https://github.com/apache/solr/blob/main/solr/solrj-streaming/src/java/org/apache/solr/client/solrj/io/SolrClientCache.java#L50

> Streaming with multiple shards can trigger unexpected IdleTimeout
> -
>
> Key: SOLR-17143
> URL: https://issues.apache.org/jira/browse/SOLR-17143
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 9.4.1
>Reporter: Patson Luk
>Priority: Critical
>
> With the new [test case 
> submitted|https://github.com/cowpaths/fullstory-solr/commit/383134928e372f19d96b1b16459a3566169d3ff4]
>  , we re-produced an issue with streaming in our production cloud 
> environment. 
> The test case creates a collection of 2 shards, into which 20k docs are indexed. 
> 10k docs have an id with routing prefix `a`, while the other 10k have `c`. Each 
> of those prefixes hashes to a different shard, producing 2 shards of 10k docs 
> each.
> Now, if we stream by sorting on the id, both shards would send back some data 
> initially, however only one shard (that hosts prefix `a`) will have continued 
> traffic due to the sorted iteration, the other shard would eventually throw 
> {{IdleTimeout}} as the stream was pending w/o network activity.
> If we change the test case `SHARD_COUNT` from 2 to 1, then the case runs 
> fine. 
> In our environment, we have jetty http connector timeout as 120 secs, yet we 
> still run into that occasionally, the client does consume the data in a 
> reasonable rate, however with up to 1024 shards per collection, it's quite 
> easy that some shards might not have data streamed within 120 secs hence 
> triggering the mentioned timeout.
> We assume such an issue with streaming is not uncommon for any distributed 
> system, and are wondering what could be done to fix or mitigate it. 
> Several ideas that we have:
> 1. If possible, we might want to stream per shard instead of per collection. 
> However, there are cases that we do want to stream on the whole collection 
> with sorted ordering
> 2. Are there any low level "keep-alive" that is already built in? I couldn't 
> find any so far :)
> 3. Keep the stream alive by pushing small amount of dummy data from the 
> aggregator (the solr node which distributes the stream request as /export to 
> other nodes) but it got very hacky and is still not working. Didn't dig too 
> deep as I wish to surface this issue to the Solr community and gather some 
> thoughts first!
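
The starvation pattern described in the issue can be reproduced in a few lines of Python, with no Solr involved. This is a simplified model under stated assumptions: each shard is a sorted list of ids, the aggregator is modelled as a plain k-way merge via `heapq.merge`, and the helper name `longest_idle_streak` is made up for this sketch. It counts documents emitted rather than seconds, and ignores the initial buffering the report mentions.

```python
import heapq


def longest_idle_streak(shards):
    """shards: dict of shard name -> sorted list of document ids.

    Merge all shards in sorted id order and return, per shard, the longest
    run of consecutive output documents that came from other shards while
    this shard still had unread documents.
    """
    # Tag every id with its shard so the merge output can be attributed.
    tagged = [[(doc_id, name) for doc_id in ids] for name, ids in shards.items()]
    remaining = {name: len(ids) for name, ids in shards.items()}
    idle = {name: 0 for name in shards}
    longest = {name: 0 for name in shards}

    for _, source in heapq.merge(*tagged):
        remaining[source] -= 1
        for name in shards:
            if name == source:
                idle[name] = 0
            elif remaining[name] > 0:
                # This shard has data waiting but nothing is being read from it.
                idle[name] += 1
                longest[name] = max(longest[name], idle[name])
    return longest
```

With ids prefixed `a!` on one shard and `c!` on the other, the `c` shard sits idle for the entire time the `a` shard is being drained, which is the window in which an idle timeout can fire. With interleaved ids the idle runs stay short.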



--
This message was sent by Atlassian Jira
(v8.20.10#820010)




[jira] [Comment Edited] (SOLR-17143) Streaming with multiple shards can trigger unexpected IdleTimeout

2025-03-11 Thread Alex Deparvu (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17934221#comment-17934221
 ] 

Alex Deparvu edited comment on SOLR-17143 at 3/11/25 1:54 PM:
--

>  2. Are there any low level "keep-alive" that is already built in? I couldn't 
> find any so far 

yes. it seems you are hitting the `idleTimeout` of the client [0]. could you 
try setting this to a higher value?
I am running your test successfully by only adding the following static block 
at line 94 of your test (just before the `@BeforeClass`).

{code}
  static {
    System.setProperty("socketTimeout", "12");
  }
{code}
(I removed the `jetty.withConnectorIdleTimeout` and bumped the nodes up to 75k 
to get it to fail on my machine.)

It would be good to identify if this works first, then we can look at the 
available options. If setting this as a system property is not an option, 
you could also pass a custom SolrClientCache to the StreamContext with a client 
that is already configured for a higher timeout. 


[0] 
https://github.com/apache/solr/blob/main/solr/solrj-streaming/src/java/org/apache/solr/client/solrj/io/SolrClientCache.java#L50





Re: [PR] Simplify BinaryResponseWriter.getParsedResponse [solr]

2025-03-11 Thread via GitHub


dsmiley merged PR #3243:
URL: https://github.com/apache/solr/pull/3243





Re: [PR] POC: Test picocli [solr]

2025-03-11 Thread via GitHub


epugh commented on PR #3247:
URL: https://github.com/apache/solr/pull/3247#issuecomment-2713435505

   > Eric, you have really lifted the CLI in the last versions, none of which 
is wasted if/when moving to picocli!
   > 
   > Also, we have pretty good test coverage in bats tests, which is a great 
way to validate parity. It's not a goal to retain the exact same help/usage 
output. Let the tool generate based on best practices. However, given a CLI 
invocation `MYENV=123 bin/solr foo --bar baz` it should result in the same tool 
behavior as before. It could be a 10.0 only or also a 9.x feature depending on 
whether we feel that tool usage output is part of our back-compat guarantees. 
Perhaps 10.0 is safest, although backporting CLI bugfixes will be harder if 9x 
is still on commons-cli.
   > 
   > I'd be interested in chipping away on this in a central collaborative 
feature branch over the course of a several weeks..
   
   I'd love to work with you on this... I lean towards this being a 10x 
feature, just so we don't have to backport, and we have some time to make it 
"perfect".  I think CLI bugfixes on 9x will just begin and end on the 9x branch 
in this case.
   
   So what is next?  A branch in the main asf repo?   Or do we just push to 
this one?





[jira] [Commented] (SOLR-17632) Text to Vector Update Request Processor

2025-03-11 Thread Alessandro Benedetti (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17934200#comment-17934200
 ] 

Alessandro Benedetti commented on SOLR-17632:
-

Done!

> Text to Vector Update Request Processor
> ---
>
> Key: SOLR-17632
> URL: https://issues.apache.org/jira/browse/SOLR-17632
> Project: Solr
>  Issue Type: New Feature
>  Components: UpdateRequestProcessors
>Reporter: Alessandro Benedetti
>Assignee: Alessandro Benedetti
>Priority: Major
>  Labels: pull-request-available, vector-based-search
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> The scope of this issue is to introduce support for automatic text vectorisation 
> in Apache Solr, directly in an update request processor.
> An LLM fine-tuned for sentence similarity will be accessed to embed the text.
> Apache Solr will host the configuration parameters to access embedding 
> services and the update request processor will use such services to directly 
> encode the document field as a vector.






[jira] [Updated] (SOLR-17632) Text to Vector Update Request Processor

2025-03-11 Thread Alessandro Benedetti (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alessandro Benedetti updated SOLR-17632:

Labels: pull-request-available vector-based-search  (was: 
pull-request-available)







Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-03-11 Thread via GitHub


alessandrobenedetti commented on PR #3151:
URL: https://github.com/apache/solr/pull/3151#issuecomment-2714076350

   OK, there have been no updates, comments or help in the last three weeks, so at the end of 
the week I'll proceed with fixing the tests and merging. Any help with the test 
cleanup is still welcome!





[PR] Use picocli instead of commons-cli [solr]

2025-03-11 Thread via GitHub


janhoy opened a new pull request, #3254:
URL: https://github.com/apache/solr/pull/3254

   https://issues.apache.org/jira/browse/SOLR-17697
   
   This PR is just a way to visualize the status of the feature branch 
`jira/SOLR-17697-picocli`.
   Create PRs targeting that branch to tackle individual tasks, and then merge 
into this feature branch once done.
   
   ## Tasks/milestones:
   - [ ] Add framework
   - [ ] Implement POC for a few initial tools, in parallel with commons-cli
   - [ ] Solve option inheritance / MixIn
   - [ ] Solve option mutual exclusivity
   - [ ] Solve value-fallback to ENV
   - [ ] Document in RefGuide - Auto generate?
   - [ ] All tools covered
   - [ ] BATS tests green
   - [ ] Proofread usage help
   - [ ] Remove commons-cli
   





[jira] [Created] (SOLR-17696) Support Jetty 12 for Solr latest

2025-03-11 Thread Dhoka Pramod (Jira)
Dhoka Pramod created SOLR-17696:
---

 Summary: Support Jetty 12 for Solr latest
 Key: SOLR-17696
 URL: https://issues.apache.org/jira/browse/SOLR-17696
 Project: Solr
  Issue Type: Improvement
Reporter: Dhoka Pramod


The latest Solr needs to support jetty-client-12.0.x. 
Jetty 10 and Jetty 11 have reached end of support.
[https://github.com/jetty/jetty.project/issues/10485]






Re: [PR] SOLR-17655: Remove ExternalFileField [solr]

2025-03-11 Thread via GitHub


epugh commented on PR #3244:
URL: https://github.com/apache/solr/pull/3244#issuecomment-2713457158

   Going to merge later this week barring any feedback!





Re: [PR] SOLR-17683: Remove CurrencyField [solr]

2025-03-11 Thread via GitHub


epugh commented on PR #3212:
URL: https://github.com/apache/solr/pull/3212#issuecomment-2713458846

   Going to merge later this week barring any feedback!





Re: [PR] SOLR-17540: Remove Hadoop Auth Module [solr]

2025-03-11 Thread via GitHub


dsmiley commented on PR #2835:
URL: https://github.com/apache/solr/pull/2835#issuecomment-2711845689

   There are ~4 classes (+ 5 tests) containing "DelegationToken".  @risdenk are 
these now obsolete/useless?  Would that belong here in this issue (for Hadoop 
Auth) or would you recommend another issue?  If another issue, can you please 
create it so that the wording is correct?  These classes were added in 
[SOLR-9200](https://issues.apache.org/jira/browse/SOLR-9200)





Re: [PR] fix the solr zk invocation [solr-operator]

2025-03-11 Thread via GitHub


HoustonPutman commented on PR #756:
URL: https://github.com/apache/solr-operator/pull/756#issuecomment-2711081795

   I will work on adding an integration test, but since you tested it manually, 
it shouldn't block this PR or a release IMO.





[jira] [Commented] (SOLR-17537) Let Curator handle state compression in ZooKeeper

2025-03-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17933951#comment-17933951
 ] 

ASF subversion and git services commented on SOLR-17537:


Commit 5c9664c4eac8e4d29cf048369580ca51487f93e7 in solr's branch 
refs/heads/main from Houston Putman
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=5c9664c4eac ]

SOLR-17537: Manage ZK Compression through Curator (#2849)



> Let Curator handle state compression in ZooKeeper
> -
>
> Key: SOLR-17537
> URL: https://issues.apache.org/jira/browse/SOLR-17537
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Houston Putman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Curator has an interface {{{}CompressionProvider{}}}, that can be provided 
> when building a {{CuratorFramework}} client. We can create a 
> CompressionProvider that wraps our custom compression logic, so that Curator 
> will handle it for us.
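
A minimal sketch of the wrapping idea described above. To stay self-contained it declares a local interface with the same two-method shape as Curator's `CompressionProvider` instead of importing Curator. The names `CompressionProviderLike` and `ZlibCompressionProvider` are hypothetical, and plain zlib stands in for Solr's own compression logic, which is not shown in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Local stand-in with the same two-method shape as Curator's CompressionProvider. */
interface CompressionProviderLike {
  byte[] compress(String path, byte[] data) throws Exception;

  byte[] decompress(String path, byte[] compressedData) throws Exception;
}

/** Hypothetical provider that delegates to "our" compression logic (plain zlib here). */
class ZlibCompressionProvider implements CompressionProviderLike {
  @Override
  public byte[] compress(String path, byte[] data) {
    Deflater deflater = new Deflater();
    deflater.setInput(data);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[1024];
    while (!deflater.finished()) {
      out.write(buf, 0, deflater.deflate(buf));
    }
    deflater.end();
    return out.toByteArray();
  }

  @Override
  public byte[] decompress(String path, byte[] compressedData) {
    Inflater inflater = new Inflater();
    inflater.setInput(compressedData);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[1024];
    try {
      while (!inflater.finished()) {
        int n = inflater.inflate(buf);
        // No progress and no more input means the data is truncated or unsupported.
        if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
          throw new IllegalStateException("cannot decompress data at " + path);
        }
        out.write(buf, 0, n);
      }
    } catch (DataFormatException e) {
      throw new IllegalStateException("corrupt data at " + path, e);
    } finally {
      inflater.end();
    }
    return out.toByteArray();
  }
}

class CompressionDemo {
  public static void main(String[] args) {
    ZlibCompressionProvider provider = new ZlibCompressionProvider();
    byte[] state = "{\"shard1\":{\"state\":\"active\"}}".getBytes(StandardCharsets.UTF_8);
    byte[] roundTrip = provider.decompress("/state.json", provider.compress("/state.json", state));
    System.out.println(Arrays.equals(state, roundTrip));
  }
}
```

With the real library, an object of this shape would be handed to the client builder so that the client, not the caller, decides when to compress. How the merged commit wires that up is not visible in this message.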






Re: [PR] DefaultPackageRepository: simplify HTTP & JSON [solr]

2025-03-11 Thread via GitHub


epugh commented on PR #3253:
URL: https://github.com/apache/solr/pull/3253#issuecomment-2713409914

   There is a commented-out .bats test in `test_packages.bats` that you could 
run.  It exercises GitHub, so it requires a live internet connection, which is why 
we don't run it in CI.
   > This test is useful if you are debugging/working with packages.





Re: [PR] SOLR-17043: Remove SolrClient path pattern matching [solr]

2025-03-11 Thread via GitHub


jkmuriithi commented on code in PR #3238:
URL: https://github.com/apache/solr/pull/3238#discussion_r1987459862


##
solr/solrj/src/java/org/apache/solr/client/solrj/SolrRequest.java:
##
@@ -185,8 +188,31 @@ public void setQueryParams(Set queryParams) {
 this.queryParams = queryParams;
   }
 
-  /** This method defines the type of this Solr request. */
-  public abstract String getRequestType();
+  /**
+   * Defines the intended type of this Solr request.
+   *
+   * Subclasses should typically override this method instead of {@link
+   * SolrRequest#getRequestType}. Note that changing request type can break/impact request routing
+   * within various clients (i.e. {@link CloudSolrClient}).
+   */
+  protected SolrRequestType getBaseRequestType() {
+return SolrRequestType.UNSPECIFIED;
+  }
+
+  /**
+   * Pattern matches on the underlying {@link SolrRequest} to identify ADMIN requests and other
+   * special cases. If no special case is identified, {@link SolrRequest#getBaseRequestType()} is
+   * returned.
+   */
+  public SolrRequestType getRequestType() {

Review Comment:
   Good suggestion!
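
For readers following the diff, the split between `getBaseRequestType()` and `getRequestType()` can be sketched as below. This is only an illustration of the override-plus-fallback shape in the Javadoc: the class names, the enum values and the `/admin/` path rule are all assumptions, not the PR's actual logic.

```java
/** Hypothetical request types; the real enum in the PR may differ. */
enum SketchRequestType { ADMIN, QUERY, UNSPECIFIED }

/** Hypothetical stand-in for a request base class, for illustration only. */
abstract class SketchRequest {
  private final String path;

  SketchRequest(String path) {
    this.path = path;
  }

  /** Subclasses override this to declare their intended type. */
  protected SketchRequestType getBaseRequestType() {
    return SketchRequestType.UNSPECIFIED;
  }

  /** Special cases win; otherwise fall back to the subclass's declared type. */
  public SketchRequestType getRequestType() {
    // Assumption: any path under /admin/ counts as an admin request.
    if (path != null && path.startsWith("/admin/")) {
      return SketchRequestType.ADMIN;
    }
    return getBaseRequestType();
  }
}

class SketchQueryRequest extends SketchRequest {
  SketchQueryRequest(String path) {
    super(path);
  }

  @Override
  protected SketchRequestType getBaseRequestType() {
    return SketchRequestType.QUERY;
  }
}

class RequestTypeDemo {
  public static void main(String[] args) {
    System.out.println(new SketchQueryRequest("/select").getRequestType());
    System.out.println(new SketchQueryRequest("/admin/cores").getRequestType());
  }
}
```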






Re: [PR] Make test connectionLoss logic the same, add waitForLoss option [solr]

2025-03-11 Thread via GitHub


HoustonPutman commented on PR #3225:
URL: https://github.com/apache/solr/pull/3225#issuecomment-2711096019

   > Can/should we just always wait or recommend that we do so? Seems safer to 
wait.
   
   Yeah, that should be fine. I'll remove the option, and always wait.





Re: [PR] SOLR-17688: Http2SolrClient: use Request.Listener not HttpListenerFactory [solr]

2025-03-11 Thread via GitHub


stillalex commented on PR #3233:
URL: https://github.com/apache/solr/pull/3233#issuecomment-2714527929

   interesting change, I think it looks good and the new version of the code 
looks much cleaner, +1 from me.
   
   just 2 thoughts (feel free to ignore if not relevant):
   - there are some places where we create new clients based on existing 
clients settings, so would we need to make sure the listeners are being copied 
over? (not sure if this is a problem here, I remember running into this pattern 
a while back).
   - removing a solr abstraction in favor of a jetty abstraction brings the 
code closer to jetty clients by creating this hard dependency. and while I have 
not been around enough to tell, it feels like long term it would be better to 
depend less on specific http clients, giving the project some freedom in 
picking a new impl if needed (would this work with the new jdk http client?).





Re: [PR] DefaultPackageRepository: simplify HTTP & JSON [solr]

2025-03-11 Thread via GitHub


dsmiley commented on PR #3253:
URL: https://github.com/apache/solr/pull/3253#issuecomment-2714548138

   Thanks for pointing out the commented out test.  It passes now, after I 
updated it (the URL didn't seem to work exactly as it was).
   
   As you can hopefully agree/observe here, Jackson is enough to fetch and parse 
JSON from a URL; Solr doesn't need its own utility.
   
   The code that I encountered (replaced) created a new Http2SolrClient, and 
thus had no security/credential configuration.  To use Solr's configured auth, 
it'd have to get the underlying HttpClient from elsewhere in Solr's 
infrastructure.  But it didn't, and I'd rather keep my PR simple 
until some day (if ever) someone wants to add auth.  Again, I didn't take 
away auth here.





[jira] [Comment Edited] (SOLR-17143) Streaming with multiple shards can trigger unexpected IdleTimeout

2025-03-11 Thread Alex Deparvu (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17934221#comment-17934221
 ] 

Alex Deparvu edited comment on SOLR-17143 at 3/11/25 1:53 PM:
--

>  2. Are there any low level "keep-alive" that is already built in? I couldn't 
> find any so far 

yes. it seems you are hitting the `idleTimeout` of the client [0]. could you 
try setting this to a higher value?
I am running your test successfully by only adding the following static block

{code}
  static {
    System.setProperty("socketTimeout", "12");
  }
{code}
(I removed the `jetty.withConnectorIdleTimeout` and bumped the nodes up to 75k 
to get it to fail on my machine.)

it would be good to identify if this works first, then we can look at the 
available options. if setting this as a system property is not an option 
you could also pass a custom SolrClientCache to the StreamContext with a client 
that is already configured for a higher timeout. 


[0] 
https://github.com/apache/solr/blob/main/solr/solrj-streaming/src/java/org/apache/solr/client/solrj/io/SolrClientCache.java#L50


was (Author: alex.parvulescu):
>  2. Are there any low level "keep-alive" that is already built in? I couldn't 
> find any so far 

yes. it seems you are hitting the `idleTimeout` of the client [0]. could you 
try setting this to a higher value?
I am running your test successfully by only adding the following static block
```
  static {
    System.setProperty("socketTimeout", "12");
  }
```
(I removed the `jetty.withConnectorIdleTimeout` and bumped the nodes up to 75k 
to get it to fail on my machine.)

it would be good to identify if this works first, then we can look at the 
available options. if setting this as a system property is not an option 
you could also pass a custom SolrClientCache to the StreamContext with a client 
that is already configured for a higher timeout. 


[0] 
https://github.com/apache/solr/blob/main/solr/solrj-streaming/src/java/org/apache/solr/client/solrj/io/SolrClientCache.java#L50

> Streaming with multiple shards can trigger unexpected IdleTimeout
> -
>
> Key: SOLR-17143
> URL: https://issues.apache.org/jira/browse/SOLR-17143
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 9.4.1
>Reporter: Patson Luk
>Priority: Critical
>
> With the new [test case 
> submitted|https://github.com/cowpaths/fullstory-solr/commit/383134928e372f19d96b1b16459a3566169d3ff4]
>  , we re-produced an issue with streaming in our production cloud 
> environment. 
> The test case creates a collection of 2 shards, into which 20k docs are indexed. 
> 10k docs have an id with routing prefix `a`, while the other 10k have `c`. Each 
> of those prefixes hashes to a different shard, producing 2 shards of 10k docs 
> each.
> Now, if we stream by sorting on the id, both shards would send back some data 
> initially, however only one shard (that hosts prefix `a`) will have continued 
> traffic due to the sorted iteration, the other shard would eventually throw 
> {{IdleTimeout}} as the stream was pending w/o network activity.
> If we change the test case `SHARD_COUNT` from 2 to 1, then the case runs 
> fine. 
> In our environment, we have jetty http connector timeout as 120 secs, yet we 
> still run into that occasionally, the client does consume the data in a 
> reasonable rate, however with up to 1024 shards per collection, it's quite 
> easy that some shards might not have data streamed within 120 secs hence 
> triggering the mentioned timeout.
> We assume such an issue with streaming is not uncommon for any distributed 
> system, and are wondering what could be done to fix or mitigate that. 
> Several ideas that we have:
> 1. If possible, we might want to stream per shard instead of per collection. 
> However, there are cases that we do want to stream on the whole collection 
> with sorted ordering
> 2. Are there any low level "keep-alive" that is already built in? I couldn't 
> find any so far :)
> 3. Keep the stream alive by pushing small amount of dummy data from the 
> aggregator (the solr node which distributes the stream request as /export to 
> other nodes) but it got very hacky and is still not working. Didn't dig too 
> deep as I wish to surface this issue to the Solr community and gather some 
> thoughts first!
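
The idle-shard effect described in the report can be reproduced without Solr. The sketch below is a toy model, not the streaming code: it merges already-sorted per-shard id lists with a priority queue and records, per shard, the longest run of merge steps in which nothing was pulled from that shard. The class and method names are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class SortedMergeIdleDemo {
  /** Returns, per shard, the longest run of merge steps between two reads from that shard. */
  static int[] longestIdleGaps(List<List<String>> shards) {
    int n = shards.size();
    int[] next = new int[n];      // index of the next unread doc per shard
    int[] lastRead = new int[n];  // merge step at which the shard was last read
    int[] maxGap = new int[n];
    PriorityQueue<Map.Entry<String, Integer>> heads =
        new PriorityQueue<>(Map.Entry.<String, Integer>comparingByKey());
    // Every shard sends its first doc up front (step 0).
    for (int s = 0; s < n; s++) {
      if (!shards.get(s).isEmpty()) {
        heads.add(new SimpleEntry<>(shards.get(s).get(0), s));
        next[s] = 1;
      }
    }
    int step = 0;
    while (!heads.isEmpty()) {
      // Emit the smallest id, then read the next doc only from the shard it came from.
      int s = heads.poll().getValue();
      step++;
      if (next[s] < shards.get(s).size()) {
        maxGap[s] = Math.max(maxGap[s], step - lastRead[s]);
        lastRead[s] = step;
        heads.add(new SimpleEntry<>(shards.get(s).get(next[s]++), s));
      }
    }
    return maxGap;
  }

  public static void main(String[] args) {
    List<List<String>> shards =
        List.of(List.of("a0", "a1", "a2", "a3"), List.of("c0", "c1", "c2", "c3"));
    // Shard "a" is read on every step; shard "c" waits until all of "a" is drained.
    System.out.println(Arrays.toString(longestIdleGaps(shards))); // prints [1, 5]
  }
}
```

In this model the gap for the `c` shard grows with the number of `a` docs, which matches the observation that the waiting shard eventually hits the idle timeout.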






[jira] [Comment Edited] (SOLR-17143) Streaming with multiple shards can trigger unexpected IdleTimeout

2025-03-11 Thread Alex Deparvu (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17934221#comment-17934221
 ] 

Alex Deparvu edited comment on SOLR-17143 at 3/11/25 1:56 PM:
--

>  2. Are there any low level "keep-alive" that is already built in? I couldn't 
> find any so far 

yes. it seems you are hitting the `idleTimeout` of the client [0]. could you 
try setting this to a higher value?
I am running your test successfully by only adding the following static block 
at line 94 of your test (just before the `@BeforeClass`).

{code}
  static {
    System.setProperty("socketTimeout", "12");
  }
{code}
(I removed the `jetty.withConnectorIdleTimeout` and bumped the nodes up to 75k 
to get it to fail on my machine.)

it would be good to confirm if this works first, then we can look at the 
available options. if setting this as a system property is not an option 
you could also pass a custom SolrClientCache to the StreamContext with a client 
that is already configured for a higher timeout. 


[0] 
https://github.com/apache/solr/blob/main/solr/solrj-streaming/src/java/org/apache/solr/client/solrj/io/SolrClientCache.java#L50


was (Author: alex.parvulescu):
>  2. Are there any low level "keep-alive" that is already built in? I couldn't 
> find any so far 

yes. it seems you are hitting the `idleTimeout` of the client [0]. could you 
try setting this to a higher value?
I am running your test successfully by only adding the following static block 
at line 94 of your test (just before the `@BeforeClass`).

{code}
  static {
    System.setProperty("socketTimeout", "12");
  }
{code}
(I removed the `jetty.withConnectorIdleTimeout` and bumped the nodes up to 75k 
to get it to fail on my machine.)

it would be good to identify if this works first, then we can look at the 
available options. if setting this as a system property is not an option 
you could also pass a custom SolrClientCache to the StreamContext with a client 
that is already configured for a higher timeout. 


[0] 
https://github.com/apache/solr/blob/main/solr/solrj-streaming/src/java/org/apache/solr/client/solrj/io/SolrClientCache.java#L50







Re: [PR] SOLR-17685: Remove script creation of solr url based on SOLR_TOOL_HOST in favour of java code in CLI tools [solr]

2025-03-11 Thread via GitHub


HoustonPutman commented on PR #3223:
URL: https://github.com/apache/solr/pull/3223#issuecomment-2714754015

   > Right... So I can remove SOLR_HOST from the `bin/solr` and `bin/solr.cmd` 
because we default to localhost in the Java code!
   
   SOLR_TOOL_HOST, not SOLR_HOST 😅 but yes





Re: [PR] SOLR-17685: Remove script creation of solr url based on SOLR_TOOL_HOST in favour of java code in CLI tools [solr]

2025-03-11 Thread via GitHub


epugh commented on PR #3223:
URL: https://github.com/apache/solr/pull/3223#issuecomment-2714780914

   > > Right... So I can remove SOLR_HOST from the `bin/solr` and 
`bin/solr.cmd` because we default to localhost in the Java code!
   > 
   > SOLR_TOOL_HOST, not SOLR_HOST 😅 but yes
   
   I was thinking we **COULD** remove SOLR_HOST because we look it up as the 
default in the CLIUtils; however, I see that we use SOLR_HOST in lots of other 
places in the `bin/solr` script... If those were all in Java code, then we might 
not need it.   
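
A rough sketch of the "default to localhost in Java" lookup discussed in this thread. It does not use Solr's `EnvUtils`; the helper `ToolHostResolver.resolve`, the `solr.host` to `SOLR_HOST` name mapping and the precedence (system property first, then environment variable, then fallback) are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class ToolHostResolver {
  /**
   * Looks up a dotted key in system properties, then its upper-snake-case form in the
   * environment, and finally returns the fallback. The order is an assumption.
   */
  static String resolve(
      Map<String, String> sysProps, Map<String, String> env, String key, String fallback) {
    String value = sysProps.get(key);
    if (value == null || value.isEmpty()) {
      // e.g. "solr.host" becomes "SOLR_HOST"
      value = env.get(key.toUpperCase(Locale.ROOT).replace('.', '_'));
    }
    return (value == null || value.isEmpty()) ? fallback : value;
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    for (String name : System.getProperties().stringPropertyNames()) {
      props.put(name, System.getProperty(name));
    }
    System.out.println(resolve(props, System.getenv(), "solr.host", "localhost"));
  }
}
```

With a lookup of this kind in the Java tools, the script no longer needs to export a separate tool-host variable just to supply the `localhost` default.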





[jira] [Updated] (SOLR-17698) Remove deprecated EnumField

2025-03-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SOLR-17698:
--
Labels: pull-request-available  (was: )

> Remove deprecated EnumField
> ---
>
> Key: SOLR-17698
> URL: https://issues.apache.org/jira/browse/SOLR-17698
> Project: Solr
>  Issue Type: Task
>Affects Versions: main (10.0)
>Reporter: Eric Pugh
>Assignee: Eric Pugh
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> EnumField is deprecated in favour of EnumFieldType, but the code lingers.   
> Solve the linger.






[PR] SOLR-17698: Remove deprecated EnumField [solr]

2025-03-11 Thread via GitHub


epugh opened a new pull request, #3256:
URL: https://github.com/apache/solr/pull/3256

   https://issues.apache.org/jira/browse/SOLR-17698
   
   
   # Description
   
   Removing deprecated EnumField
   
   # Solution
   EnumField has been replaced by EnumFieldType.  Simplified tests and 
supporting test configurations.
   
   
   # Tests
   
   Existing
   
   





[jira] [Commented] (SOLR-17694) LeaderElector not able to parse node ID correctly when it has a leading dash

2025-03-11 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17934004#comment-17934004
 ] 

Gus Heck commented on SOLR-17694:
-

{quote}I'm inclined to think Solr should reject such node names instead of 
handle it gracefully. The work-around of avoiding such high server IDs seems 
straight-forward.
{quote}
+1

> LeaderElector not able to parse node ID correctly when it has a leading dash
> 
>
> Key: SOLR-17694
> URL: https://issues.apache.org/jira/browse/SOLR-17694
> Project: Solr
>  Issue Type: Bug
>Reporter: Patrick
>Priority: Trivial
>
> This issue was 
> [reported|https://lists.apache.org/thread/0pwxw1rzdffmbxctdzv2rmplzgwt6lpl] 
> on us...@solr.apache.org.
> There can be times (e.g. ZOOKEEPER-4904) when the node ID contains a 
> leading dash
> {noformat}
> -5188057493699159958-1.1.1.15:8983_solr-n_192189
> {noformat}
> instead of just
> {noformat}
> 5188057493699159958-1.1.1.15:8983_solr-n_192189
> {noformat}
> In such a case, LeaderElector.getNodeName returns 
> *5188057493699159958-1.1.1.15:8983_solr* instead of just 
> {*}1.1.1.15:8983_solr{*}.
> The problem is that the regex LeaderElector.NODE_NAME was not designed to 
> handle the leading dash. LeaderElector.LEADER_SEQ and 
> LeaderElector.SESSION_ID seem to have the same problem.
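
The misparse is easy to reproduce with a standalone regex. The pattern below is an assumed shape for a `<sessionId>-<nodeName>-n_<sequence>` matcher and is not copied from `LeaderElector`; `nodeNameStrict` is a hypothetical illustration of the "reject such names" option favoured in the comments.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NodeNameParseDemo {
  // Assumed shape: optional path prefix, "<sessionId>-", "<nodeName>", "-n_<sequence>".
  static final Pattern NODE_NAME_LIKE = Pattern.compile(".*?/?(.*?-)(.*?)-n_\\d+");

  /** Lenient parse: group 1 stops at the first dash, which breaks on a leading dash. */
  static String nodeName(String electionNode) {
    Matcher m = NODE_NAME_LIKE.matcher(electionNode);
    if (!m.matches()) {
      throw new IllegalArgumentException("not an election node: " + electionNode);
    }
    return m.group(2);
  }

  /** Strict parse: refuses ids whose last path segment starts with a dash. */
  static String nodeNameStrict(String electionNode) {
    String id = electionNode.substring(electionNode.lastIndexOf('/') + 1);
    if (id.startsWith("-")) {
      throw new IllegalArgumentException("negative session id not supported: " + id);
    }
    return nodeName(electionNode);
  }

  public static void main(String[] args) {
    System.out.println(nodeName("5188057493699159958-1.1.1.15:8983_solr-n_192189"));
    // The lazy first group matches only the leading "-", so the session id leaks through.
    System.out.println(nodeName("-5188057493699159958-1.1.1.15:8983_solr-n_192189"));
  }
}
```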






[jira] [Created] (SOLR-17697) Use picocli instead of commons-cli

2025-03-11 Thread Jira
Jan Høydahl created SOLR-17697:
--

 Summary: Use picocli instead of commons-cli
 Key: SOLR-17697
 URL: https://issues.apache.org/jira/browse/SOLR-17697
 Project: Solr
  Issue Type: Improvement
  Components: cli
Reporter: Jan Høydahl


Apache commons-cli has served us well for years, but our CLI has out-grown its 
capabilities, with multiple sub commands and a plethora of options and 
arguments. We have much custom code to work around limitations.

By embracing [Picocli|https://picocli.info/], an annotation based cli 
framework, it will be easier to maintain the cli and add further tools. We 
propose to target 10.0 and do work gradually in a feature branch.






[jira] [Resolved] (SOLR-17537) Let Curator handle state compression in ZooKeeper

2025-03-11 Thread Houston Putman (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Houston Putman resolved SOLR-17537.
---
Fix Version/s: main (10.0)
 Assignee: Houston Putman
   Resolution: Fixed

> Let Curator handle state compression in ZooKeeper
> -
>
> Key: SOLR-17537
> URL: https://issues.apache.org/jira/browse/SOLR-17537
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Houston Putman
>Assignee: Houston Putman
>Priority: Major
>  Labels: pull-request-available
> Fix For: main (10.0)
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Curator has an interface {{{}CompressionProvider{}}}, that can be provided 
> when building a {{CuratorFramework}} client. We can create a 
> CompressionProvider that wraps our custom compression logic, so that Curator 
> will handle it for us.






Re: [PR] POC: Test picocli [solr]

2025-03-11 Thread via GitHub


janhoy commented on PR #3247:
URL: https://github.com/apache/solr/pull/3247#issuecomment-2711920203

   Eric, you have really lifted the CLI in the last versions, none of which is 
wasted if/when moving to picocli!
   
   Also, we have pretty good test coverage in bats tests, which is a great way 
to validate parity. It's not a goal to retain the exact same help/usage output. 
Let the tool generate based on best practices. However, given a CLI invocation 
`MYENV=123 bin/solr foo --bar baz` it should result in the same tool behavior 
as before. It could be a 10.0 only or also a 9.x feature depending on whether 
we feel that tool usage output is part of our back-compat guarantees. Perhaps 
10.0 is safest, although backporting CLI bugfixes will be harder if 9x is still 
on commons-cli.
   
   I'd be interested in chipping away at this in a central collaborative 
feature branch over the course of several weeks.





[PR] (no intended to be merged to branch_9_8) SOLR-17626: add RawTFSimilarity[Factory] class(es) [solr]

2025-03-11 Thread via GitHub


cpoerschke opened a new pull request, #3255:
URL: https://github.com/apache/solr/pull/3255

   The `RawTFSimilarityFactory` from (not yet released) Solr 9.9 backported to 
9.8 branch with a local `RawTFSimilarity` variant since Solr 9.8's Lucene 
version does not have the `RawTFSimilarity` class.
   
   https://issues.apache.org/jira/browse/SOLR-17626





[jira] [Commented] (SOLR-17626) add RawTFSimilarityFactory class

2025-03-11 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17934295#comment-17934295
 ] 

Christine Poerschke commented on SOLR-17626:


https://github.com/apache/solr/pull/3255 is not-intended-to-be-merged to 
branch_9_8 but shares how one might have `RawTFSimilarityFactory` from (not yet 
released) Solr 9.9 backported to 9.8 branch with a local `RawTFSimilarity` 
variant since Solr 9.8's Lucene version does not have the `RawTFSimilarity` 
class.

> add RawTFSimilarityFactory class
> 
>
> Key: SOLR-17626
> URL: https://issues.apache.org/jira/browse/SOLR-17626
> Project: Solr
>  Issue Type: Task
>Reporter: Christine Poerschke
>Assignee: Christine Poerschke
>Priority: Minor
>  Labels: pull-request-available
> Fix For: main (10.0), 9.9
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Factory class for the RawTFSimilarity added in Lucene 9.12 version.






Re: [PR] (no intended to be merged to branch_9_8) SOLR-17626: add RawTFSimilarity[Factory] class(es) [solr]

2025-03-11 Thread via GitHub


cpoerschke closed pull request #3255: (no intended to be merged to branch_9_8) 
SOLR-17626: add RawTFSimilarity[Factory] class(es)
URL: https://github.com/apache/solr/pull/3255





[jira] [Resolved] (SOLR-17695) Misleading warning in bin/solr start -e examples about solr url format

2025-03-11 Thread Eric Pugh (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Pugh resolved SOLR-17695.
--
Fix Version/s: 9.9
   Resolution: Fixed

> Misleading warning in bin/solr start -e examples about solr url format
> --
>
> Key: SOLR-17695
> URL: https://issues.apache.org/jira/browse/SOLR-17695
> Project: Solr
>  Issue Type: Improvement
>  Components: cli
>Affects Versions: 9.8
>Reporter: Eric Pugh
>Assignee: Eric Pugh
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 9.9
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We look up the Solr host from Solr, and the "baseUrl" has a trailing /solr. 
> The CLI then warns that you aren't supposed to do this, but since 
> the user isn't submitting this value, the warning is misleading!
>  
> This cleans up what we output to the console.
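
One way to picture the situation is a small normalization step applied to URLs the CLI obtained itself, so that any warning about a trailing `/solr` is reserved for values the user actually typed. This is only a sketch under that assumption; `stripSolrSuffix` is a hypothetical helper and is not claimed to be how the fix was implemented.

```java
final class SolrUrlSketch {

  // Hypothetical helper: drops one trailing slash, then a trailing "/solr" if present.
  static String stripSolrSuffix(String baseUrl) {
    String url = baseUrl.endsWith("/")
        ? baseUrl.substring(0, baseUrl.length() - 1)
        : baseUrl;
    return url.endsWith("/solr")
        ? url.substring(0, url.length() - "/solr".length())
        : url;
  }

  public static void main(String[] args) {
    // A base URL as looked up from a node, with the trailing /solr still attached.
    System.out.println(stripSolrSuffix("http://localhost:8983/solr"));
    // A URL that is already in the expected form is returned unchanged.
    System.out.println(stripSolrSuffix("http://localhost:8983"));
  }
}
```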






Re: [PR] SOLR-17698: Remove deprecated EnumField [solr]

2025-03-11 Thread via GitHub


epugh commented on PR #3256:
URL: https://github.com/apache/solr/pull/3256#issuecomment-2715392532

   Looks like we could eliminate `AbstractEnumField` now?





[PR] SOLR-17655: Mark ExternalFileField as deprecated [solr]

2025-03-11 Thread via GitHub


epugh opened a new pull request, #3257:
URL: https://github.com/apache/solr/pull/3257

   https://issues.apache.org/jira/browse/SOLR-17655
   
   
   # Description
   
   Mark ExternalFileField as deprecated.
   
   # Solution
   
   This pairs with https://github.com/apache/solr/pull/3244, and is meant to go 
on `branch_9x` to mark this as deprecated.  Then #3244 can be committed to 
`main`.
   
   # Tests
   
   no change.
   





Re: [PR] SOLR-17685: Remove script creation of solr url based on SOLR_TOOL_HOST in favour of java code in CLI tools [solr]

2025-03-11 Thread via GitHub


epugh commented on PR #3223:
URL: https://github.com/apache/solr/pull/3223#issuecomment-2715442544

   Successful build! If anyone wants to eyeball solr.cmd, that would be 
great.





[jira] [Commented] (SOLR-17607) HTTP ClusterStateProvider should defer talking to Solr until first use

2025-03-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17934398#comment-17934398
 ] 

ASF subversion and git services commented on SOLR-17607:


Commit ff0f9050075543d64d6ccc760466a233c00863e8 in solr's branch 
refs/heads/branch_9x from David Smiley
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=ff0f9050075 ]

SOLR-17607: Http ClusterStateProvider, lazy connect (#3249)

i.e. stop eagerly connecting when CloudSolrClient is created.
Update urlScheme on successful getLiveNodes

(cherry picked from commit 9918ee640dce7123ff51eb38d9eb1c6f4e06b487)


> HTTP ClusterStateProvider should defer talking to Solr until first use
> --
>
> Key: SOLR-17607
> URL: https://issues.apache.org/jira/browse/SOLR-17607
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud, SolrJ
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> When using CloudSolrClient with HTTP URLs to get the ClusterState (HTTP 
> ClusterStateProvider), it will talk to Solr when the client is created to 
> fetch the live nodes.  But maybe Solr isn't available at this time; it's 
> annoying to require that Solr is available at the time that the 
> CloudSolrClient is constructed.  The ZK one is deferred till first use; I 
> propose the same behavior for HTTP.
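
The deferred behavior proposed above can be illustrated with a lazily initialized holder. This is a sketch only: `LazyLiveNodes` and its fetcher are hypothetical names, and the real ClusterStateProvider code is not shown in this thread. The point is that constructing the holder performs no I/O; the fetch happens on the first `get()` and is cached afterwards.

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class LazyLiveNodesSketch {

  // Hypothetical holder: the fetcher stands in for an HTTP call to Solr.
  static final class LazyLiveNodes {
    private final Supplier<Set<String>> fetcher;
    private volatile Set<String> liveNodes; // stays null until first use

    LazyLiveNodes(Supplier<Set<String>> fetcher) {
      this.fetcher = fetcher; // no connection attempt here
    }

    Set<String> get() {
      Set<String> nodes = liveNodes;
      if (nodes == null) {
        synchronized (this) {
          nodes = liveNodes;
          if (nodes == null) {
            nodes = Set.copyOf(fetcher.get()); // first use: talk to the server
            liveNodes = nodes;
          }
        }
      }
      return nodes;
    }
  }

  public static void main(String[] args) {
    AtomicInteger fetches = new AtomicInteger();
    LazyLiveNodes nodes = new LazyLiveNodes(() -> {
      fetches.incrementAndGet();
      return Set.of("host1:8983_solr", "host2:8983_solr");
    });
    System.out.println("fetches after construction: " + fetches.get());
    nodes.get();
    nodes.get();
    System.out.println("fetches after two reads: " + fetches.get());
  }
}
```

A client built this way can be constructed while the server is still down; a failure would only surface when the live nodes are first needed.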






[jira] [Resolved] (SOLR-17607) HTTP ClusterStateProvider should defer talking to Solr until first use

2025-03-11 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley resolved SOLR-17607.
-
Fix Version/s: 9.9
   Resolution: Fixed

> HTTP ClusterStateProvider should defer talking to Solr until first use
> --
>
> Key: SOLR-17607
> URL: https://issues.apache.org/jira/browse/SOLR-17607
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud, SolrJ
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 9.9
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> When using CloudSolrClient with HTTP URLs to get the ClusterState (HTTP 
> ClusterStateProvider), it will talk to Solr when the client is created to 
> fetch the live nodes.  But maybe Solr isn't available at this time; it's 
> annoying to require that Solr is available at the time that the 
> CloudSolrClient is constructed.  The ZK one is deferred till first use; I 
> propose the same behavior for HTTP.






[jira] [Updated] (SOLR-17515) Recovery fails in Solr 9.7.0 if basic-auth is enabled

2025-03-11 Thread Houston Putman (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Houston Putman updated SOLR-17515:
--
Fix Version/s: 9.8

> Recovery fails in Solr 9.7.0 if basic-auth is enabled
> -
>
> Key: SOLR-17515
> URL: https://issues.apache.org/jira/browse/SOLR-17515
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 9.7
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Major
>  Labels: pull-request-available
> Fix For: 9.8, 9.7.1
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Several reporters on the users@ list recently shared a bug they noticed on 
> upgrading to Solr 9.7.  Replicas would try to recover, but fail with a 
> NullPointerException:
> {code}
> 2024-09-18 09:36:31.238 ERROR 
> (recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr 
> dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts 
> s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] 
> o.a.s.c.RecoveryStrategy Error while trying to recover. 
> core=dovecot_fts_shard5_replica_n61 => java.lang.NullPointerException: Cannot 
> invoke 
> "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
>  because "this.authenticationStore" is null
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
> java.lang.NullPointerException: Cannot invoke 
> "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
>  because "this.authenticationStore" is null
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:97)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:85)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.httpClientBuilderSetup(Http2SolrClient.java:1093)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:1062)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:907)
>  ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:633)
>  ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:333) 
> ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
> - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:309) 
> ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
> - 2024-09-03 15:05:20]
> ...
> {code}
> It turns out that the issue isn't specific to upgrading clusters: *any 9.7.0 
> cluster (new or existing/upgrading) that uses basic-auth will hit this NPE 
> during replica recovery*.  The result is that replicas will fail to recover, 
> and sit marked as "recovering" indefinitely.
> The issue can be reproduced locally in a source-checkout using the following 
> steps:
> {code}
> git checkout branch_9_7
> ./gradlew clean assemble
> cd solr/packaging/build/solr-9.7.0-SNAPSHOT
> # At prompts, I chose: 4 nodes, "gettingstarted", 1 shard, 2 replicas, 
> "_default" configset
> bin/solr start -e cloud
> bin/solr post -c gettingstarted example/exampledocs/books.json
> # Stop the node containing the non-leader replica
> bin/solr stop -p 
> bin/solr post -c gettingstarted example/exampledocs/books.csv
> # Enable auth and trigger recovery by turning the node back on
> bin/solr auth enable -type basicAuth -credentials solr:solrRocks 
> -blockUnknown true
> # This line will need to be tweaked based on which Solr node was previously stopped
> "bin/solr" start --cloud -p  -s "example/cloud//solr" -z 
> 127.0.

[jira] [Closed] (SOLR-17515) Recovery fails in Solr 9.7.0 if basic-auth is enabled

2025-03-11 Thread Houston Putman (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Houston Putman closed SOLR-17515.
-

> Recovery fails in Solr 9.7.0 if basic-auth is enabled
> -
>
> Key: SOLR-17515
> URL: https://issues.apache.org/jira/browse/SOLR-17515
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 9.7
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Major
>  Labels: pull-request-available
> Fix For: 9.8, 9.7.1
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Several reporters on the users@ list recently shared a bug they noticed on 
> upgrading to Solr 9.7.  Replicas would try to recover, but fail with a 
> NullPointerException:
> {code}
> 2024-09-18 09:36:31.238 ERROR 
> (recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr 
> dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts 
> s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] 
> o.a.s.c.RecoveryStrategy Error while trying to recover. 
> core=dovecot_fts_shard5_replica_n61 => java.lang.NullPointerException: Cannot 
> invoke 
> "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
>  because "this.authenticationStore" is null
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
> java.lang.NullPointerException: Cannot invoke 
> "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
>  because "this.authenticationStore" is null
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:97)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:85)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.httpClientBuilderSetup(Http2SolrClient.java:1093)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:1062)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:907)
>  ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:633)
>  ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:333) 
> ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
> - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:309) 
> ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
> - 2024-09-03 15:05:20]
> ...
> {code}
> It turns out that the issue isn't specific to upgrading clusters: *any 9.7.0 
> cluster (new or existing/upgrading) that uses basic-auth will hit this NPE 
> during replica recovery*.  The result is that replicas will fail to recover, 
> and sit marked as "recovering" indefinitely.
> The issue can be reproduced locally in a source-checkout using the following 
> steps:
> {code}
> git checkout branch_9_7
> ./gradlew clean assemble
> cd solr/packaging/build/solr-9.7.0-SNAPSHOT
> # At prompts, I chose: 4 nodes, "gettingstarted", 1 shard, 2 replicas, 
> "_default" configset
> bin/solr start -e cloud
> bin/solr post -c gettingstarted example/exampledocs/books.json
> # Stop the node containing the non-leader replica
> bin/solr stop -p 
> bin/solr post -c gettingstarted example/exampledocs/books.csv
> # Enable auth and trigger recovery by turning the node back on
> bin/solr auth enable -type basicAuth -credentials solr:solrRocks 
> -blockUnknown true
> # This line will need to be tweaked based on which Solr node was previously stopped
> "bin/solr" start --cloud -p  -s "example/cloud//solr" -z 
> 127.0.0.1:9983
> {code}




[jira] [Closed] (SOLR-17530) Prometheus response writer didn't recognize TLOG & PULL replicas

2025-03-11 Thread Houston Putman (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Houston Putman closed SOLR-17530.
-
Assignee: David Smiley

> Prometheus response writer didn't recognize TLOG & PULL replicas
> 
>
> Key: SOLR-17530
> URL: https://issues.apache.org/jira/browse/SOLR-17530
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 9.7
>Reporter: Matthew Biscocho
>Assignee: David Smiley
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 9.8, 9.7.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The regex pattern for Solr cloud mode assumed all core names ended with a 
> {{replica_n[0-9]+}} which is incorrect. Some core names should be able to 
> have any single character letter before the numbers.
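
The difference between the two patterns can be checked with a short sketch. The full expressions below are assumptions built around the `replica_n[0-9]+` fragment quoted in the issue, and the core names are illustrative (a `t` or `p` letter is assumed for TLOG and PULL replicas); the actual pattern in the Prometheus response writer may differ.

```java
import java.util.regex.Pattern;

final class CoreNamePatternSketch {

  // Assumed form of the old pattern: only the "n" letter is accepted before the digits.
  static final Pattern NRT_ONLY = Pattern.compile("^.+_replica_n[0-9]+$");

  // Assumed relaxed form: any single letter is accepted before the digits.
  static final Pattern ANY_REPLICA_TYPE = Pattern.compile("^.+_replica_[a-zA-Z][0-9]+$");

  static boolean matches(Pattern pattern, String coreName) {
    return pattern.matcher(coreName).matches();
  }

  public static void main(String[] args) {
    String[] coreNames = {
      "demo_shard1_replica_n1", // NRT-style name
      "demo_shard1_replica_t3", // assumed TLOG-style name
      "demo_shard1_replica_p5"  // assumed PULL-style name
    };
    for (String core : coreNames) {
      System.out.println(core
          + " old=" + matches(NRT_ONLY, core)
          + " relaxed=" + matches(ANY_REPLICA_TYPE, core));
    }
  }
}
```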






[jira] [Updated] (SOLR-17530) Prometheus response writer didn't recognize TLOG & PULL replicas

2025-03-11 Thread Houston Putman (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Houston Putman updated SOLR-17530:
--
Fix Version/s: 9.8

> Prometheus response writer didn't recognize TLOG & PULL replicas
> 
>
> Key: SOLR-17530
> URL: https://issues.apache.org/jira/browse/SOLR-17530
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 9.7
>Reporter: Matthew Biscocho
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 9.8, 9.7.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The regex pattern for Solr cloud mode assumed all core names ended with a 
> {{replica_n[0-9]+}} which is incorrect. Some core names should be able to 
> have any single character letter before the numbers.






[PR] SOLR-17651: Make sure CLI unit tests don't call System.exit() [solr]

2025-03-11 Thread via GitHub


psalagnac opened a new pull request, #3258:
URL: https://github.com/apache/solr/pull/3258

   https://issues.apache.org/jira/browse/SOLR-17651
   
   
   # Description
   
   End goal is to make sure unit tests for CLI tools never make a call to 
`System.exit()`.
   
   This introduces class `ToolRuntime`, which is instantiated and passed as a 
context to all CLI tools. When running tests, a subclass is used instead 
and blocks calls to `System.exit()`.
   
   For now, I did only the minimal (large enough!) change to intercept calls to 
`System.exit()`. The newly introduced `ToolRuntime` may be leveraged later to 
remove some static calls, I'm mostly thinking about IOs which are pretty messy 
for now. We could improve testability by always using the runtime context for 
outputs.
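
The runtime-context idea described above can be sketched like this. Only the name `ToolRuntime` comes from the PR description; the method name `exit`, the two subclasses, and the exception type are hypothetical and chosen for illustration, so the actual classes in the PR may look different.

```java
final class ToolRuntimeSketch {

  // Hypothetical shape of the context object handed to each CLI tool.
  abstract static class ToolRuntime {
    abstract void exit(int status);
  }

  // Production behavior: really terminate the JVM.
  static final class DefaultToolRuntime extends ToolRuntime {
    @Override
    void exit(int status) {
      System.exit(status);
    }
  }

  // Hypothetical exception used by the test runtime to report the blocked exit.
  static final class ExitBlockedException extends RuntimeException {
    final int status;

    ExitBlockedException(int status) {
      super("System.exit(" + status + ") blocked in tests");
      this.status = status;
    }
  }

  // Test behavior: never terminate the JVM, throw instead.
  static final class TestToolRuntime extends ToolRuntime {
    @Override
    void exit(int status) {
      throw new ExitBlockedException(status);
    }
  }

  // A stand-in tool that exits through the runtime rather than calling System.exit().
  static void runTool(ToolRuntime runtime, boolean fail) {
    if (fail) {
      runtime.exit(1);
    }
  }

  public static void main(String[] args) {
    try {
      runTool(new TestToolRuntime(), true);
    } catch (ExitBlockedException e) {
      System.out.println("blocked exit with status " + e.status);
    }
  }
}
```

Because tools only see the runtime, a test can assert on the exit status without the JVM going away underneath the test runner.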
   
   # Tests
   
   `BasicAuthIntegrationTest` was invoking `System.exit()`, which causes a very 
long timeout when running with Gradle. I figured out this test was doing what 
it should (will comment inline). I added `StatusToolTest` to keep same coverage.





[jira] [Assigned] (SOLR-17651) Obfuscate System.exit(), remove direct usage

2025-03-11 Thread Pierre Salagnac (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pierre Salagnac reassigned SOLR-17651:
--

Assignee: Pierre Salagnac

> Obfuscate System.exit(), remove direct usage
> 
>
> Key: SOLR-17651
> URL: https://issues.apache.org/jira/browse/SOLR-17651
> Project: Solr
>  Issue Type: Improvement
>  Components: cli, Tests
>Reporter: Houston Putman
>Assignee: Pierre Salagnac
>Priority: Major
>
> In SOLR-17379, we are removing the {{TestSecurityManager}} defined in Lucene 
> that prohibits the usage of System.exit() in unit testing. This is important, 
> since gradle will fail with very little information when a test does a 
> System.exit() call.
> In Solr, the CLI will use System.exit() to fail and show the user. This makes 
> sense as a CLI, but not in a unit test. It would be great if we could create 
> a utility that can be called in Solr to mimic System.exit(), but in unit 
> testing throw an exception instead. Then we could add System.exit() to 
> forbidden-apis, and ensure that our testing code won't run into this.






[jira] [Updated] (SOLR-17651) Obfuscate System.exit(), remove direct usage

2025-03-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SOLR-17651:
--
Labels: pull-request-available  (was: )

> Obfuscate System.exit(), remove direct usage
> 
>
> Key: SOLR-17651
> URL: https://issues.apache.org/jira/browse/SOLR-17651
> Project: Solr
>  Issue Type: Improvement
>  Components: cli, Tests
>Reporter: Houston Putman
>Assignee: Pierre Salagnac
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In SOLR-17379, we are removing the {{TestSecurityManager}} defined in Lucene 
> that prohibits the usage of System.exit() in unit testing. This is important, 
> since gradle will fail with very little information when a test does a 
> System.exit() call.
> In Solr, the CLI will use System.exit() to fail and show the user. This makes 
> sense as a CLI, but not in a unit test. It would be great if we could create 
> a utility that can be called in Solr to mimic System.exit(), but in unit 
> testing throw an exception instead. Then we could add System.exit() to 
> forbidden-apis, and ensure that our testing code won't run into this.






Re: [PR] SolrJ ResponseParser API improvements, minor [solr]

2025-03-11 Thread via GitHub


dsmiley merged PR #3248:
URL: https://github.com/apache/solr/pull/3248





Re: [PR] Reimplement CancellableQueryTracker [solr]

2025-03-11 Thread via GitHub


reubent closed pull request #3250: Reimplement CancellableQueryTracker
URL: https://github.com/apache/solr/pull/3250





[jira] [Commented] (SOLR-17607) HTTP ClusterStateProvider should defer talking to Solr until first use

2025-03-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17934372#comment-17934372
 ] 

ASF subversion and git services commented on SOLR-17607:


Commit 9918ee640dce7123ff51eb38d9eb1c6f4e06b487 in solr's branch 
refs/heads/main from David Smiley
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=9918ee640dc ]

SOLR-17607: Http ClusterStateProvider, lazy connect (#3249)

i.e. stop eagerly connecting when CloudSolrClient is created.
Update urlScheme on successful getLiveNodes

> HTTP ClusterStateProvider should defer talking to Solr until first use
> --
>
> Key: SOLR-17607
> URL: https://issues.apache.org/jira/browse/SOLR-17607
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud, SolrJ
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> When using CloudSolrClient with HTTP URLs to get the ClusterState (HTTP 
> ClusterStateProvider), it will talk to Solr when the client is created to 
> fetch the live nodes.  But maybe Solr isn't available at this time; it's 
> annoying to require that Solr is available at the time that the 
> CloudSolrClient is constructed.  The ZK one is deferred till first use; I 
> propose the same behavior for HTTP.






Re: [PR] SOLR-17607: Http ClusterStateProvider, lazy connect [solr]

2025-03-11 Thread via GitHub


dsmiley merged PR #3249:
URL: https://github.com/apache/solr/pull/3249





[jira] [Resolved] (SOLR-17696) Support Jetty 12 for Solr latest

2025-03-11 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-17696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved SOLR-17696.

Resolution: Duplicate

> Support Jetty 12 for Solr latest
> 
>
> Key: SOLR-17696
> URL: https://issues.apache.org/jira/browse/SOLR-17696
> Project: Solr
>  Issue Type: Improvement
>Reporter: Dhoka Pramod
>Priority: Major
>
> Need the latest Solr to support jetty-client-12.0.x. 
> Jetty 10 and Jetty 11 have reached end of support.
> [https://github.com/jetty/jetty.project/issues/10485]






Re: [PR] SOLR-17688: Http2SolrClient: use Request.Listener not HttpListenerFactory [solr]

2025-03-11 Thread via GitHub


dsmiley commented on PR #3233:
URL: https://github.com/apache/solr/pull/3233#issuecomment-2715838346

   Thanks for your feedback Alex!
   
   > there are some places where we create new clients based on existing 
clients settings, so would we need to make sure the listeners are being copied 
over? (not sure if this is a problem here, I remember running into this pattern 
a while back).
   
   I *think* this will be less of an issue now because the HttpClient (thus 
everything attached to it) goes along for the ride.  This PR shows that we no 
longer need to copy them when the httpClient is provided in the builder.  But I 
suppose that if a user creates an HttpClient on their own for whatever reason, 
then the onus is on them to configure it with whatever listeners are desired; 
previously this was separate.  But that's true with all the other settings in 
the builder associated with the HttpClient construction (e.g. everything 
createHttpClient looks at in the builder).  It underscores a point that if you 
provide an HttpClient to the builder, it's an expert use-case; you know what 
you are doing because we don't document every last setting, whether it applies 
or not if someone provides the client.
   
   > removing a solr abstraction in favor of a jetty abstraction brings the 
code closer to jetty clients by creating this hard dependency. and while I have 
not been around enough to tell, it feels like long term it would be better to 
depend less on specific http clients, giving the project some freedom in 
picking a new impl if needed (would this work with the new jdk http client?).
   
   I've only seen one switch in the project's history (Apache -> Jetty) so I 
don't think it's worth insulating the differences.  New abstractions are not 
free: they need to be documented as well, they don't benefit from knowledge 
the user might already have of Jetty, and they can get in the way.  I'd rather 
see Solr reduce code wherever it can; removing frivolous abstractions is one way.  
Solr's listener factory thing being removed here doesn't apply to the 
HttpJdkSolrClient (or else you'd have seen the impact in your review).
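
The point about provided clients can be illustrated with a minimal sketch. It uses hypothetical stand-in types (`SketchHttpClient`, `SketchSolrClientBuilder`, `buildHttpClient`) rather than the real Jetty or SolrJ classes, and assumes only what is described above: listeners live on the HTTP client, so reusing a client reuses its listeners, and a provided client is taken as-is.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the underlying HTTP client: listeners live on the client.
final class SketchHttpClient {
  final List<String> listeners = new ArrayList<>();
}

// Hypothetical stand-in for a SolrJ client builder; not the real Http2SolrClient.Builder.
final class SketchSolrClientBuilder {
  private SketchHttpClient provided;
  private final List<String> listenersForNewClient = new ArrayList<>();

  SketchSolrClientBuilder withHttpClient(SketchHttpClient client) {
    this.provided = client;
    return this;
  }

  SketchSolrClientBuilder addListener(String listener) {
    listenersForNewClient.add(listener);
    return this;
  }

  // Returns the HTTP client the new Solr client would use.
  SketchHttpClient buildHttpClient() {
    if (provided != null) {
      // Expert path: the caller configured this client, so construction-time
      // settings held by the builder (listeners included) are not applied.
      return provided;
    }
    SketchHttpClient created = new SketchHttpClient();
    created.listeners.addAll(listenersForNewClient);
    return created;
  }
}

class ProvidedClientDemo {
  public static void main(String[] args) {
    SketchHttpClient shared = new SketchHttpClient();
    shared.listeners.add("tracing");

    // Deriving a second client from the same HTTP client: nothing needs copying.
    SketchHttpClient reused =
        new SketchSolrClientBuilder().withHttpClient(shared).buildHttpClient();
    System.out.println(reused == shared); // prints "true"
    System.out.println(reused.listeners); // prints "[tracing]"
  }
}
```

In this sketch a listener added through the builder is silently ignored when a client is provided, which is the trade-off described above: the caller who supplies the client owns its configuration.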





[jira] [Created] (SOLR-17699) High Query Time in Solr When Using OR with frange in fq

2025-03-11 Thread Puneet Sharma (Jira)
Puneet Sharma created SOLR-17699:


 Summary: High Query Time in Solr When Using OR with frange in fq
 Key: SOLR-17699
 URL: https://issues.apache.org/jira/browse/SOLR-17699
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 9.6.1
Reporter: Puneet Sharma


h3. High Query Time in Solr When Using {{OR}} with {{frange}} in {{fq}}

I am experiencing high query times in Solr when using an {{fq}} filter that 
combines an {{OR}} condition with {{frange}}. The response time 
significantly increases compared to queries that do not use this combination.
h4. Query Example

 
{{fq={!cache=false tag=prm}field:value OR {!frange l=1 u=1 v=$funcQuery}}}
Here, {{$funcQuery}} is a function query that retrieves documents dynamically 
based on dynamic parameters.
h4. Observations
 * When I use just {{{!frange l=1 u=1 v=$funcQuery}}}, the query executes 
quickly [20ms].
 * When I use {{{!cache=false tag=prm}field:value}} alone, the query is also 
fast.
 * However, when combining both with {{OR}}, the query time increases 
significantly [700ms].
 * The dataset is relatively large, but other queries with similar complexity 
run efficiently.

h4. Questions
 # Why does the {{OR}} operation with {{frange}} cause a significant increase 
in query time?
 # Are there any optimizations or alternative query structures that could 
improve performance?
 # Would it help to restructure how function queries and filters are applied in 
this case?
 # Is there an alternative to frange for this use case? It cannot be handled 
at index time because the function parameters change very frequently.

Any insights or suggestions would be greatly appreciated!
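
To make the three observations easier to reproduce side by side, the filter variants can be written out as plain request parameters. This is a minimal sketch that only builds query strings and does not contact Solr. The `q=*:*` main query, the `funcQuery` value and the class name are assumptions, since the report gives none of them; `debug=timing` is added so that Solr reports per-component timings.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class FrangeFilterVariants {
  // Filter strings as given in the report.
  static final String TERM_ONLY = "{!cache=false tag=prm}field:value";
  static final String FRANGE_ONLY = "{!frange l=1 u=1 v=$funcQuery}";
  static final String COMBINED = TERM_ONLY + " OR " + FRANGE_ONLY;

  // Builds a URL-encoded query string; $funcQuery is dereferenced by Solr
  // from the request parameter of the same name.
  static String toQueryString(String fq, String funcQuery) {
    Map<String, String> params = new LinkedHashMap<>();
    params.put("q", "*:*"); // assumption: the report does not show the main query
    params.put("fq", fq);
    params.put("funcQuery", funcQuery);
    params.put("debug", "timing");
    return params.entrySet().stream()
        .map(e -> e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
        .collect(Collectors.joining("&"));
  }

  public static void main(String[] args) {
    // Hypothetical function query; the real one is not part of the report.
    String funcQuery = "if(exists(some_field),1,0)";
    for (String fq : new String[] {FRANGE_ONLY, TERM_ONLY, COMBINED}) {
      System.out.println("/select?" + toQueryString(fq, funcQuery));
    }
  }
}
```

Running each printed query against the same collection and comparing the timing section of the debug output would show which phase accounts for the difference between the fast and slow variants.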






[jira] [Commented] (SOLR-17562) Unify v2 API streaming support

2025-03-11 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17934354#comment-17934354
 ] 

David Smiley commented on SOLR-17562:
-

This was merged; shouldn't the issue be closed against the corresponding 
release, [~gerlowskija]? 

In retrospect having looked at the changes here and thinking about Jackson, I'm 
inclined to propose something different.  It's the job of the ResponseParser to 
parse the response (obviously).  InputStreamResponseParser is a *very* special 
response parser that I don't think should be used unless you _need_ to process 
the response _after_ the SolrClient.request method returns.  It came into 
existence with streaming expressions, which makes sense as you have this stream 
that outlives the request method.  Note that InputStreamResponseParser 
short-circuits lots of error checking that the Http SolrClients do, thus 
putting the responsibility on the user of InputStreamResponseParser to do (or 
forget to do) that.  We could have a general JacksonResponseParser, instantiated with 
a Class instance to tell Jackson what Java class to instantiate based on the 
JSON.  Then this gets placed onto a holder NamedList as "response" to satisfy 
the ResponseParser contract.  Using Map as that class would basically replicate 
JsonMapResponseParser functionally.  Not using the special 
InputStreamResponseParser means SolrClient's error handling would work if the 
server returns an error instead of the intended object.  JacksonParsingResponse 
(that which implements SolrResponse) would not exist; the SolrResponse is to 
hold the response, not to parse it.  I suppose a subclass of SimpleSolrResponse 
that trivially returns the "response" value as the expected type via a 
getParsed method would be used instead.
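
The shape of that proposal can be sketched in plain Java. Everything here is a hypothetical stand-in so the example runs without Solr or Jackson on the classpath: `SketchDecoder` plays the role of Jackson, a `Map` plays the role of the holder NamedList, and the class names are illustrative only. The sketch shows the split of responsibilities: the parser parses eagerly into a typed object stored under "response", and the response object merely hands it back.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a JSON binder such as Jackson: "read this stream as that class".
interface SketchDecoder {
  <T> T read(InputStream body, Class<T> type) throws IOException;
}

// Toy decoder so the sketch runs without third-party libraries; it only handles String.
final class Utf8StringDecoder implements SketchDecoder {
  @Override
  public <T> T read(InputStream body, Class<T> type) throws IOException {
    return type.cast(new String(body.readAllBytes(), StandardCharsets.UTF_8));
  }
}

// Parses the body eagerly into the requested type and stores it under "response".
// A Map stands in for the holder NamedList.
final class TypedResponseParserSketch<T> {
  private final Class<T> type;
  private final SketchDecoder decoder;

  TypedResponseParserSketch(Class<T> type, SketchDecoder decoder) {
    this.type = type;
    this.decoder = decoder;
  }

  Map<String, Object> processResponse(InputStream body) {
    try {
      Map<String, Object> holder = new LinkedHashMap<>();
      holder.put("response", decoder.read(body, type));
      return holder;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}

// Holds the already-parsed value; it does no parsing of its own.
final class TypedResponseSketch<T> {
  private final Map<String, Object> holder;
  private final Class<T> type;

  TypedResponseSketch(Map<String, Object> holder, Class<T> type) {
    this.holder = holder;
    this.type = type;
  }

  T getParsed() {
    return type.cast(holder.get("response"));
  }
}

class TypedParserDemo {
  public static void main(String[] args) {
    InputStream body = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
    TypedResponseParserSketch<String> parser =
        new TypedResponseParserSketch<>(String.class, new Utf8StringDecoder());
    TypedResponseSketch<String> response =
        new TypedResponseSketch<>(parser.processResponse(body), String.class);
    System.out.println(response.getParsed()); // prints "hello"
  }
}
```

Because the stream is consumed inside `processResponse`, nothing outlives the request call, which is what would let the client's normal error handling run before any parsing is attempted.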

> Unify v2 API streaming support
> --
>
> Key: SOLR-17562
> URL: https://issues.apache.org/jira/browse/SOLR-17562
> Project: Solr
>  Issue Type: Improvement
>  Components: v2 API
>Reporter: Jason Gerlowski
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Several v2 APIs return raw files or streams of data, including: 
> {{ZooKeeperReadAPI}}, {{NodeFileStore}}, and {{CoreReplication.fetchFile}}.
> But the APIs vary slightly in how they support this: ZooKeeperReadAPI uses 
> the deprecated "ContentStream" with "RawResponseWriter", NodeFileStore 
> directly attaches a "SolrCore.RawWriter" to the underlying SolrQueryResponse, 
> and CoreReplication follows the JAX-RS best practice of using the 
> "StreamingOutput" interface.
> This ticket aims to align all of these approaches and document our approach 
> in {{dev-docs/apis.adoc}} or a similar file.
> The preferred approach ([see discussion 
> here|https://github.com/apache/solr/pull/2734]) at the time of writing is to 
> use StreamingOutput.  If this doesn't change, this ticket will need to:
> * modify our Java codegen template and related code to support this new 
> response type.  (Java codegen currently requires that all responses subclass 
> SolrJerseyResponse)
> * remove the "x-omitFromCodegen" tag from any APIs using StreamingOutput (see 
> ReplicationApis.fetchFile for an example)
> * Switch other raw-file/streaming APIs over to using StreamingOutput.  
> Validate v1 and v2 responses. 






Re: [PR] Deprecations [solr]

2025-03-11 Thread via GitHub


dsmiley commented on PR #3159:
URL: https://github.com/apache/solr/pull/3159#issuecomment-2715871761

   Looking to merge this maybe tomorrow; it certainly won't be the last of the 
deprecations.  I'll end up creating another PR.
   Some changes here are not just deprecations but also use the new 
alternative; they should be 9x compatible anyway.





Re: [PR] SOLR-17651: Make sure CLI unit tests don't call System.exit() [solr]

2025-03-11 Thread via GitHub


psalagnac commented on code in PR #3258:
URL: https://github.com/apache/solr/pull/3258#discussion_r1990122985


##
solr/core/src/test/org/apache/solr/security/BasicAuthIntegrationTest.java:
##
@@ -298,23 +295,10 @@ public void testBasicAuth() throws Exception {
       verifySecurityStatus(cl, baseUrl + "/admin/info/key", "key", NOT_NULL_PREDICATE, 20);
       assertAuthMetricsMinimums(17, 8, 8, 1, 0, 0);
 
-      String[] toolArgs =
-          new String[] {"status", "--solr-url", baseUrl, "--credentials", "harry:HarryIsUberCool"};
-      ByteArrayOutputStream baos = new ByteArrayOutputStream();
-      PrintStream stdoutSim = new PrintStream(baos, true, StandardCharsets.UTF_8.name());
-      StatusTool tool = new StatusTool(stdoutSim);
-      try {
-        tool.runTool(SolrCLI.processCommandLineArgs(tool, toolArgs));
-        Map obj = (Map) Utils.fromJSON(new ByteArrayInputStream(baos.toByteArray()));
-        assertTrue(obj.containsKey("version"));
-        assertTrue(obj.containsKey("startTime"));
-        assertTrue(obj.containsKey("uptime"));
-        assertTrue(obj.containsKey("memory"));
-      } catch (Exception e) {
-        log.error(
-            "StatusTool failed due to: {}; stdout from tool prior to failure: {}",
-            e,
-            baos.toString(StandardCharsets.UTF_8.name())); // nowarn
+      String[] toolArgs = new String[] {"status", "--solr-url", baseUrl};

Review Comment:
   Here, I removed credential parameters to force a failure of the tool.
   
   My understanding is that these parameters were wrongly added in #3154. Before 
that change, the first execution of the tool failed, and the test expects that.






Re: [I] Solr registers with incorrect port 80 in Zookeeper [solr-operator]

2025-03-11 Thread via GitHub


edaemon commented on issue #84:
URL: https://github.com/apache/solr-operator/issues/84#issuecomment-2715771545

   I'm seeing a similar issue with version 0.9.0, but I'm wondering if there's 
some configuration I'm missing. The problem I'm having is that requests to the 
common service seem to internally rely on a connection to the headless service, 
but those connections are attempted over port 80 instead of 8983. What would 
cause it to use port 80 for the headless service requests and can that be 
configured?
   
   The headless service is correctly listening on port 8983:
   
   ```
   NAME                                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
   [redacted]-solr-solrcloud-headless   ClusterIP   None                       8983/TCP   3m18s
   ```
   
   However, when I interact with the common service to try and add a collection 
I get an error like this:
   Error message
   
   
   ```
     Creating collection [redacted] failed with error code 400: Solr HTTP error: OK (400)
     {
       "responseHeader":{
         "status":400,
         "QTime":931},
       "failure":{
         "[redacted]-solr-solrcloud-0.[redacted]-solr-solrcloud-headless.[redacted]:80_solr":"org.apache.solr.client.solrj.SolrServerException: Server refused connection at: http://[redacted]-solr-solrcloud-0.[redacted]-solr-solrcloud-headless.[redacted]:80/solr"},
       "Operation create caused exception:":"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Underlying core creation failed while creating collection: [redacted]",
       "exception":{
         "msg":"Underlying core creation failed while creating collection: [redacted]",
         "rspCode":400},
       "error":{
         "metadata":[
           "error-class","org.apache.solr.common.SolrException",
           "root-error-class","org.apache.solr.common.SolrException"],
         "msg":"Underlying core creation failed while creating collection: [redacted]",
         "code":400}}
   ```
   
   

   
   If I try to create a collection directly with the headless service over port 
8983 I get the exact same behavior. Here's the SolrCloud YAML I'm using:
   
   SolrCloud YAML
   
   
   ```
   # Solr Cloud for the Drupal environment; relies on Solr Operator
   ---
   apiVersion: solr.apache.org/v1beta1
   kind: SolrCloud
   metadata:
     name: "{{ .Release.Name }}-solr"
   spec:
     dataStorage:
       persistent:
         reclaimPolicy: "{{ .Values.solr.reclaimPolicy }}"
         pvcTemplate:
           spec:
             storageClassName: "ebs-sc"
             resources:
               requests:
                 storage: "{{ .Values.solr.storage }}"
     replicas: {{ .Values.solr.replicas }}
     solrImage:
       tag: {{ .Values.solr.solrImage }}
     solrJavaMem: "{{ .Values.solr.solrJavaMem }}"
     solrModules:
       - jaegertracer-configurator
       - ltr
     customSolrKubeOptions:
       podOptions:
         resources:
           limits:
             memory: "{{ .Values.solr.resources.limits.memory }}"
           requests:
             cpu: "{{ .Values.solr.resources.requests.cpu }}"
             memory: "{{ .Values.solr.resources.requests.memory }}"
     zookeeperRef:
       provided:
         chroot: "/this/will/be/auto/created"
         persistence:
           spec:
             storageClassName: "ebs-sc"
             resources:
               requests:
                 storage: "{{ .Values.solr.zookeeper.storage }}"
         replicas: {{ .Values.solr.zookeeper.replicas }}
         zookeeperPodPolicy:
           resources:
             limits:
               memory: "{{ .Values.solr.zookeeper.resources.limits.memory }}"
             requests:
               cpu: "{{ .Values.solr.zookeeper.resources.requests.cpu }}"
               memory: "{{ .Values.solr.zookeeper.resources.requests.memory }}"
     solrOpts: "{{ .Values.solr.solrOpts }}"
     solrGCTune: "{{ .Values.solr.solrGCTune }}"
     solrAddressability:
       commonServicePort: 8983
   ```
   
   
   
   
   I also tried setting `SolrCloud.Spec.solrAddressability.podPort` to 80 and 
that did change the port that the headless service was listening on to 80 which 
may have worked, but that threw errors because the port is privileged and the 
pod wasn't able to use it.





[PR] Reimplement CancellableQueryTracker [solr]

2025-03-11 Thread via GitHub


reubent opened a new pull request, #3250:
URL: https://github.com/apache/solr/pull/3250

   I don't have access to JIRA to create a ticket, but please see below. I 
posted on the users list but received no replies, so I've attempted a fix 
myself.
   
   
   
   
   # Description
   
   We noticed memory usage increasing over long periods of time in our Solr 
Clouds. Synthetic tests against them showed that the growth was directly 
linked to the number of abnormally terminated requests received: either 
requests exceeding resource limits or connections closed by a client timeout.
   
   Investigation showed that the length of the response received from the 
tasks/list endpoints was growing. The resource leak was minimal, but the time 
spent processing the entries was rising, so a cleanup mechanism was needed.
   
   Further investigation into why the entries were not being removed showed 
that, when code ran out of time to run, the calls to `checkLimitsBefore` would 
cause requests to return before the finaliser that removes the active query 
from the list runs. This issue only appears to affect:
   1. Explicitly set query IDs
   2. Queries sent to shards which get the node name prepended to the generated 
query ID when adding to the list but not when being removed
   
   # Solution
   
   1. Prevent CancellableQueryTracker from filling with abnormally terminated 
requests by adding a timeout mechanism and making it configurable
   2. Improve logging in that class
   3. Add tests for that class
   4. Add an additional release for activeQueries when the request has been 
finished early
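
The timeout idea in the first solution item can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `ExpiringQueryTracker`, `register`, `release` and `evictExpired` are hypothetical names, and timestamps are passed in explicitly so the behaviour is easy to test.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical tracker: remembers when each query ID was registered and drops
// entries that were never released once they exceed a configurable age.
class ExpiringQueryTracker {
  private final Map<String, Long> startedAtMillis = new ConcurrentHashMap<>();
  private final long maxAgeMillis;

  ExpiringQueryTracker(long maxAgeMillis) {
    this.maxAgeMillis = maxAgeMillis;
  }

  void register(String queryId, long nowMillis) {
    startedAtMillis.put(queryId, nowMillis);
  }

  // Normal path: the request finished and removes its own entry.
  boolean release(String queryId) {
    return startedAtMillis.remove(queryId) != null;
  }

  // Safety net for abnormally terminated requests; returns how many entries were dropped.
  int evictExpired(long nowMillis) {
    int removed = 0;
    for (Map.Entry<String, Long> e : startedAtMillis.entrySet()) {
      boolean expired = nowMillis - e.getValue() > maxAgeMillis;
      // remove(key, value) leaves the entry alone if it was re-registered concurrently.
      if (expired && startedAtMillis.remove(e.getKey(), e.getValue())) {
        removed++;
      }
    }
    return removed;
  }

  int size() {
    return startedAtMillis.size();
  }

  public static void main(String[] args) {
    ExpiringQueryTracker tracker = new ExpiringQueryTracker(1000);
    tracker.register("leaked", 0);
    tracker.register("recent", 900);
    System.out.println(tracker.evictExpired(1500)); // prints "1": only "leaked" is too old
    System.out.println(tracker.release("recent"));  // prints "true"
  }
}
```

A real implementation would also need to call the eviction periodically or on access, and to use the same key when adding and removing an entry, which is the mismatch described above for shard requests.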
   
   Additionally there are clearly issues with the cancellation mechanism more 
generally that I'm not going to tackle now but am more than happy to look at in 
more depth later. Namely:
   
   1. Multithreaded queries are not cancellable
   2. Cancellability is dependent on the cancellation being sent to the same 
node that created the task to be cancelled. It also seems there's something 
weird with shard selection going on here too but I've not got to the bottom of 
it.
   3. activeTaskList is again node dependent so the same UUID can be 
simultaneously sent to two nodes
   
   # Tests
   
   I've added tests for the class and run nearly half a million "broken" and 
half a million non-broken requests against a cloud of four nodes.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [X ] I have reviewed the guidelines for [How to 
Contribute](https://github.com/apache/solr/blob/main/CONTRIBUTING.md) and my 
code conforms to the standards described there to the best of my ability.
   - [ ] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [ ] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended, not available for 
branches on forks living under an organisation)
   - [X ] I have developed this patch against the `main` branch.
   - [X ] I have run `./gradlew check`.
   - [X ] I have added tests for my changes.
   - [ n/a] I have added documentation for the [Reference 
Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
   





[PR] Reimplement CancellableQueryTracker [solr]

2025-03-11 Thread via GitHub


reubent opened a new pull request, #3251:
URL: https://github.com/apache/solr/pull/3251






Re: [PR] SOLR-17537: Manage ZK Compression through Curator [solr]

2025-03-11 Thread via GitHub


HoustonPutman merged PR #2849:
URL: https://github.com/apache/solr/pull/2849





[jira] [Created] (SOLR-17698) Remove deprecated EnumField

2025-03-11 Thread Eric Pugh (Jira)
Eric Pugh created SOLR-17698:


 Summary: Remove deprecated EnumField
 Key: SOLR-17698
 URL: https://issues.apache.org/jira/browse/SOLR-17698
 Project: Solr
  Issue Type: Task
Affects Versions: main (10.0)
Reporter: Eric Pugh
Assignee: Eric Pugh


EnumField is deprecated in favour of EnumFieldType, but the code lingers.   
Solve the linger.






[jira] [Created] (SOLR-17692) DELETEREPLICA should preempt full-recovery instead of waiting for completion

2025-03-11 Thread Jason Gerlowski (Jira)
Jason Gerlowski created SOLR-17692:
--

 Summary: DELETEREPLICA should preempt full-recovery instead of 
waiting for completion
 Key: SOLR-17692
 URL: https://issues.apache.org/jira/browse/SOLR-17692
 Project: Solr
  Issue Type: Bug
  Components: replication (java), SolrCloud
Reporter: Jason Gerlowski


I recently deleted an NRT replica that was in the middle of a full-recovery and 
was a bit surprised to see that the "delete" blocked waiting for the recovery 
to finish.  This is a minor pain when the index is small, but becomes a huge 
waste of administrator time (and network bandwidth!) as index sizes grow.

There's some plumbing in Solr that attempts to preempt recovery during a 
DELETE, but it appears to come into play mostly during 
peer-sync and "background replication" scenarios (i.e. PULL and TLOG replicas 
that do full-recovery during normal operation).  Preemption doesn't seem to 
work once a recovering core is in the midst of a "full recovery".  We should 
modify this code so that it stops full-recovery as well, unless there's some 
compelling reason this was avoided in the initial implementation?
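
The requested behaviour amounts to cooperative cancellation of a long-running copy. The sketch below is a generic illustration of that pattern, not Solr's recovery code: `PreemptibleFetchSketch`, `cancel` and `fetchAll` are hypothetical names, and the file list stands in for the index files a full recovery would download.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical long-running fetch that checks a cancellation flag between files,
// so a concurrent delete can stop it instead of waiting for completion.
class PreemptibleFetchSketch {
  private final AtomicBoolean cancelled = new AtomicBoolean(false);

  // Called by whoever wants the fetch to stop, e.g. a delete request.
  void cancel() {
    cancelled.set(true);
  }

  // Fetches files one by one and returns how many were fetched before stopping.
  int fetchAll(List<String> files, Consumer<String> fetchOne) {
    int fetched = 0;
    for (String file : files) {
      if (cancelled.get()) {
        break; // preempted: skip the remaining files
      }
      fetchOne.accept(file);
      fetched++;
    }
    return fetched;
  }

  public static void main(String[] args) {
    PreemptibleFetchSketch recovery = new PreemptibleFetchSketch();
    // Simulate a delete arriving while the second file is being fetched.
    int fetched =
        recovery.fetchAll(
            List.of("segments_1", "_0.cfs", "_1.cfs"),
            file -> {
              if (file.equals("_0.cfs")) {
                recovery.cancel();
              }
            });
    System.out.println(fetched); // prints "2"
  }
}
```

Checking the flag only between files keeps the sketch simple; stopping in the middle of a large file would need the check inside the copy loop as well.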


