[PR] Update README.md [solr]

2024-01-29 Thread via GitHub


whoami-anoint opened a new pull request, #2228:
URL: https://github.com/apache/solr/pull/2228

   https://issues.apache.org/jira/browse/SOLR-X
   
   
   
   
   # Description
   
   Please provide a short description of the changes you're making with this 
pull request.
   
   # Solution
   
   Please provide a short description of the approach taken to implement your 
solution.
   
   # Tests
   
   Please describe the tests you've developed or run to confirm this patch 
implements the feature or solves the problem.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [ ] I have reviewed the guidelines for [How to 
Contribute](https://github.com/apache/solr/blob/main/CONTRIBUTING.md) and my 
code conforms to the standards described there to the best of my ability.
   - [ ] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [ ] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [ ] I have developed this patch against the `main` branch.
   - [ ] I have run `./gradlew check`.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Reference 
Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-17132) LTR model upload API issue

2024-01-29 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811830#comment-17811830
 ] 

Noble Paul commented on SOLR-17132:
---

[~rajanimaski] yes. it can be used . Is there anything that stops you from 
using it today ?

> LTR model upload API issue
> --
>
> Key: SOLR-17132
> URL: https://issues.apache.org/jira/browse/SOLR-17132
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - LTR
>Affects Versions: 9.4.1
>Reporter: rajanimaski
>Priority: Major
>
> The LTR model upload API does not make the model available across all the 
> nodes. After upload, "model not found exception" is reported by certain 
> nodes. The model becomes available only after a collection "reload" api is 
> requested. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Created] (SOLR-17133) CloudAuthStreamTest fails with PRS=true

2024-01-29 Thread Noble Paul (Jira)
Noble Paul created SOLR-17133:
-

 Summary: CloudAuthStreamTest fails with PRS=true
 Key: SOLR-17133
 URL: https://issues.apache.org/jira/browse/SOLR-17133
 Project: Solr
  Issue Type: Task
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul
Assignee: Noble Paul






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Updated] (SOLR-17133) CloudAuthStreamTest fails with PRS=true

2024-01-29 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-17133:
--
Issue Type: Bug  (was: Task)

> CloudAuthStreamTest fails with PRS=true
> ---
>
> Key: SOLR-17133
> URL: https://issues.apache.org/jira/browse/SOLR-17133
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17131: Optimize rows=0 since score/sort isn't necessary [solr]

2024-01-29 Thread via GitHub


risdenk commented on PR #2221:
URL: https://github.com/apache/solr/pull/2221#issuecomment-1914730907

   > The other code path is actually already optimized for rows==0 and no scores
   
   Which code path are you talking about?
   
   > I think the solution is to not save the results into queryResultCache when 
rows==0; thus the window won't be applied either.
   
   Not sure I agree with that. We should be able to store the result in the 
queryResultCache so that we don't have to recompute for the same exact query a 
second time?
   
   > I also think a queryResultWindowSize of 1 (which is the default if it 
isn't specified in solrconfig and isn't documented AFAICT) 
   
   Its documented here 
https://solr.apache.org/guide/solr/latest/configuration-guide/caches-warming.html#queryresultwindowsize-element
   
   I know in our case we do set `queryResultWindowSize` in `solrconfig.xml`.
   
   Either way I'll look into QueryComponent some. @dsmiley as far as you are 
aware this change even though maybe not the best place to put it - doesn't have 
any negative side effects right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17131: Optimize rows=0 since score/sort isn't necessary [solr]

2024-01-29 Thread via GitHub


dsmiley commented on PR #2221:
URL: https://github.com/apache/solr/pull/2221#issuecomment-1914809682

   > (Me) The other code path is actually already optimized for rows==0 and no 
scores
   
   The method you are attempting to modify calls `getDocListAndSetNC` when 
`useFilterCache=false`, which is well optimized for this.
   
   > Not sure I agree with that. We should be able to store the result in the 
queryResultCache so that we don't have to recompute for the same exact query a 
second time?
   
   Okay; I sympathize.  Another alteration to my suggestion, is that 
`queryResultWindowSize` should be ignored when rows=0 to allow us to run this 
query as fast as possible, even if this means not caching 
`queryResultWindowSize` docs -- because we'd rather not compute their scores 
merely to cache something in hopes it might be used.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] Update README.md [solr]

2024-01-29 Thread via GitHub


epugh closed pull request #2228: Update README.md
URL: https://github.com/apache/solr/pull/2228


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] Update README.md [solr]

2024-01-29 Thread via GitHub


epugh commented on PR #2228:
URL: https://github.com/apache/solr/pull/2228#issuecomment-1914812076

   While I love the sentiment, I think we would need some data to back up the 
statement "most popular" ;-).   However, please tag me with any other doc 
improvements!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17097: Upgrade Solr to use Lucene 9.9.2 [solr]

2024-01-29 Thread via GitHub


NazerkeBS commented on PR #2176:
URL: https://github.com/apache/solr/pull/2176#issuecomment-1914814723

   all tests passed locally


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17131: Optimize rows=0 since score/sort isn't necessary [solr]

2024-01-29 Thread via GitHub


risdenk commented on PR #2221:
URL: https://github.com/apache/solr/pull/2221#issuecomment-1914878120

   I made a slight change to improve the variable name I added - c6618d6
   
   > The method you are attempting to modify calls getDocListAndSetNC when 
useFilterCache=false, which is well optimized for this.
   
   Right but in that case - assuming you are referring to prior to my PR line 
1703 / 1708, we don't get the optimizations from SOLR-14765. This means in the 
case of rows=0, we don't reuse the filter cache dotset and have to recompute 
it. 
   
   With my change, we never get down to `getDocListAndSetNC` for the case of 
`rows=0` and can build the results right from the filterCache docset.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17131: Optimize rows=0 since score/sort isn't necessary [solr]

2024-01-29 Thread via GitHub


risdenk commented on PR #2221:
URL: https://github.com/apache/solr/pull/2221#issuecomment-1914900925

   >  is that queryResultWindowSize should be ignored when rows=0 to allow us 
to run this query as fast as possible, even if this means not caching 
queryResultWindowSize docs -- because we'd rather not compute their scores 
merely to cache something in hopes it might be used.
   
   100% agree with this idea. I'm not sure what is happening with 
`queryResultWindowSize` docs with my change. In theory we definitely shouldn't 
be computing them.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-17132) LTR model upload API issue

2024-01-29 Thread rajanimaski (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811936#comment-17811936
 ] 

rajanimaski commented on SOLR-17132:


Hi Noble,  Collection reload replays warming queries, so the interest is to 
avoid collection reload if possible.

Does the file store api not need collection reload? 

> LTR model upload API issue
> --
>
> Key: SOLR-17132
> URL: https://issues.apache.org/jira/browse/SOLR-17132
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - LTR
>Affects Versions: 9.4.1
>Reporter: rajanimaski
>Priority: Major
>
> The LTR model upload API does not make the model available across all the 
> nodes. After upload, "model not found exception" is reported by certain 
> nodes. The model becomes available only after a collection "reload" api is 
> requested. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17131: Optimize rows=0 since score/sort isn't necessary [solr]

2024-01-29 Thread via GitHub


risdenk commented on PR #2221:
URL: https://github.com/apache/solr/pull/2221#issuecomment-1914939631

   I addressed the queryResultWindowSize with 326afdd when rows=0.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17131: Optimize rows=0 since score/sort isn't necessary [solr]

2024-01-29 Thread via GitHub


dsmiley commented on PR #2221:
URL: https://github.com/apache/solr/pull/2221#issuecomment-1914960070

   You claim that your approach to this uses the filter cache (for `q`).  Yet 
it does not because it calls `getDocSetNC`.  I feel compelled to -1 this 
because that's what `useFilterCache` means.  But we agree on the goals -- don't 
compute the score!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17131: Optimize rows=0 since score/sort isn't necessary [solr]

2024-01-29 Thread via GitHub


risdenk commented on PR #2221:
URL: https://github.com/apache/solr/pull/2221#issuecomment-1914972100

   > You claim that your approach to this uses the filter cache (for q)
   
   With my change, for the case of `rows=0`, the filtercache docset will be 
used to compute the result using q.
   
   I don't think I understand your concerns. `getDocSetNC` uses the 
`filterCache` docset and does not insert the query result into the filter 
cache. If `getDocSetC` is used, the query result will be inserted into the 
filter cache - which we don't want.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17131: Optimize rows=0 since score/sort isn't necessary [solr]

2024-01-29 Thread via GitHub


dsmiley commented on code in PR #2221:
URL: https://github.com/apache/solr/pull/2221#discussion_r1469804423


##
solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java:
##
@@ -1818,7 +1820,7 @@ private void getDocListNC(QueryResult qr, QueryCommand 
cmd) throws IOException {
 int[] ids;
 float[] scores;
 
-boolean needScores = (cmd.getFlags() & GET_SCORES) != 0;
+boolean needScores = ((cmd.getFlags() & GET_SCORES) != 0) && (cmd.getLen() 
!= 0);

Review Comment:
   Arguably, the caller shouldn't be applying GET_SCORES in the first place -- 
which is QueryComponent.  I suggested this idea yesterday; do you not like it?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17131: Optimize rows=0 since score/sort isn't necessary [solr]

2024-01-29 Thread via GitHub


risdenk commented on code in PR #2221:
URL: https://github.com/apache/solr/pull/2221#discussion_r1469808152


##
solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java:
##
@@ -1818,7 +1820,7 @@ private void getDocListNC(QueryResult qr, QueryCommand 
cmd) throws IOException {
 int[] ids;
 float[] scores;
 
-boolean needScores = (cmd.getFlags() & GET_SCORES) != 0;
+boolean needScores = ((cmd.getFlags() & GET_SCORES) != 0) && (cmd.getLen() 
!= 0);

Review Comment:
   I just haven't gotten back to doing that change yet. I will make another 
iteration.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[PR] SOLR-17066: Add GenericSolrRequest.setRequiresCollection [solr]

2024-01-29 Thread via GitHub


gerlowskija opened a new pull request, #2229:
URL: https://github.com/apache/solr/pull/2229

   https://issues.apache.org/jira/browse/SOLR-17066
   
   # Description
   
   SOLR-17066 added a 'defaultCollection' field to each SolrClient 
implementation, similar to the one that has long been in use for SolrJ's 
"cloud" clients.  This default collection (or core) is used on a 
request-by-request basis to build the complete HTTP path, based on the value of 
SolrRequest.requiresCollection().
   
   This is a particular challenge for GenericSolrRequest though, which can be 
used to make both collection-agnostic and collection-aware requests.
   
   # Solution
   
   This commit adds a GenericSolrRequest setter, 
`setRequiresCollection(boolean)`, to allow GSR users to specify which requests 
they would like the client-level default collection to be used on.  It updates 
a number of GSR usages in the project to be a bit more explicit about whether 
they're making a collection/core/index request, or not.
   
   GSR.requiresCollection defaults to `false`, so this PR only touches those 
uses that _do_ requires a collection/core.

   # Tests
   
   Existing tests continue to pass.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://github.com/apache/solr/blob/main/CONTRIBUTING.md) and my 
code conforms to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `main` branch.
   - [x] I have run `./gradlew check`.
   - [x] I have added tests for my changes.
   - [x] I have added documentation for the [Reference 
Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-15781: Document v2 API syntax conventions [solr]

2024-01-29 Thread via GitHub


gerlowskija commented on code in PR #2219:
URL: https://github.com/apache/solr/pull/2219#discussion_r1469894384


##
dev-docs/v2-api-conventions.adoc:
##
@@ -0,0 +1,76 @@
+== HTTP Paths
+
+Where possible, each v2 API is given an HTTP path that reflects the resource 
type and/or name most relevant to its functionality.
+Resource types are typically plural nouns such as "aliases", "collections", 
and "shards".
+Resource names are (typically user-provided) identifiers such as "myAlias", 
"techproducts", and "shard1".
+For example, `/api/collections` is the HTTP path used for all APIs concerned 
with collections generally, but that don't involve any one specific collection 
(e.g. listing all collections).
+APIs that concern themselves with a specific collection use the HTTP path 
`/api/collections/someCollectionName`.
+
+
+Resource types and names are arranged in the HTTP path such that each path 
segment is more specific, or "narrower", than the segment that came before.
+This "narrowing" also extends to resources that have an "is part of" or 
"contains" relationship to one another.
+In these cases all relevant resources and their types are included in the 
path, with the "contained" or "child" resource following its "parent".
+For example, since replicas always belong to a shard, and shards always belong 
to a collection, most v2 APIs pertaining to a specific replica use the HTTP 
path: 
`/api/collections/specificCollectionName/shards/specificShardName/replicas/specificReplicaName`.
+
+Following these guidelines has given us the following (non-exhaustive) list of 
v2 API paths, provided here to give a good sense of the paths currently in use 
and the logic underlying them.
+* `/api/aliases`
+* `/api/aliases/specificAliasName`
+* `/api/aliases/specificAliasName/properties`
+* `/api/aliases/specificAliasName/properties/specificPropertyName`
+* `/api/backups/specificBackupName`
+* `/api/backups/specificBackupName/versions`
+* `/api/backups/specificBackupName/versions/specificVersion`
+* `/api/cluster/nodes/specificNodeName/roles`
+* `/api/cluster/nodes/specificNodeName/roles/specificRoleName`
+* `/api/cluster/properties`
+* `/api/cluster/properties/specificPropertyName`
+* `/api/collections`
+* `/api/collections/specificCollName`
+* `/api/collections/specificCollName/properties`
+* `/api/collections/specificCollName/properties/specificPropertyName`
+* `/api/collections/specificcollName/shards`
+* `/api/collections/specificCollName/shards/specificShardName`
+* `/api/collections/specificCollName/shards/specificShardName/replicas`
+* 
`/api/collections/specificCollName/shards/specificShardName/replicas/specificReplicaName`
+* 
`/api/collections/specificCollName/shards/specificShardName/replicas/specificReplicaName/properties`
+* 
`/api/collections/specificCollName/shards/specificShardName/replicas/specificReplicaName/properties/specificPropertyName`
+* `/api/configsets`
+* `/api/configsets/specificConfigsetName`
+* `/api/cores`
+* `/api/cores/specificCoreName`
+* `/api/node`
+
+=== Unproxied APIs
+
+The last entry on the list above, `/api/node`, exhibits a bit of a special 
case.
+SolrCloud handles most requests as a distributed system, i.e. any request can 
be made to any node in the cluster and Solr will proxy or route the request 
internally in order to serve a response.
+But not all APIs work this way- some functionality is designed to only return 
data from the receiving node, such as `/api/node/key` which returns a 
cryptographic key specific to the receiving node.
+Solr will not proxy these requests.
+To represent this distinction the API design uses the idiosyncratic path 
`/api/node`, to help distinguish these from other node-related APIs.
+
+== HTTP Methods 
+
+Where possible, HTTP methods (colloquially called 'verbs') are used 
semantically to distinguish between APIs available at the same path.
+For example, the API to delete a collection uses the `DELETE` HTTP method, as 
in `DELETE /api/collections/specificCollectionName`.
+The API to modify the collection uses the `PUT` HTTP method, as in `PUT 
/api/collections/specificCollectionName`.
+
+While the best effort is made to use HTTP methods semantically, the v2 API 
currently restricts itself to the better known HTTP methods: `GET`, `POST`, 
`PUT`, and `DELETE`.
+In some situations this leads us to eschew a more semantically appropriate 
verb due to its relative obscurity.
+The most significant example of this is the HTTP method `PATCH`, which 
according to the HTTP spec is used to indicate a partial update (i.e. a 
resource modification request which only provides the part to-be-modified).
+Solr's "modify collection" functionality uses partial update semantics, but 
the v2 API uses `PUT` instead of `PATCH` due to the relative obscurity of the 
latter.
+
+For use within the v2 API, the four "popular" HTTP methods have the following 
semantics and implications:
+
+* `GET` - used for non-mutating (i.e. "read only") requests. Most often 

Re: [PR] SOLR-15781: Document v2 API syntax conventions [solr]

2024-01-29 Thread via GitHub


gerlowskija commented on code in PR #2219:
URL: https://github.com/apache/solr/pull/2219#discussion_r1469903877


##
dev-docs/apis.adoc:
##
@@ -51,27 +51,28 @@ Separating the API "definition" and "implementation" in 
this way allows us to on
 
 Writing a new v2 API may appear daunting, but additions in reality are 
actually pretty simple:
 
-1. *Create POJO ("Plain Old Java Object") classes as needed to represent the 
API's request and response*:
+1. *Agree on Endpoint Syntax*: Before implementing, developers should generate 
consensus around what the new API will look like.  General guidelines and 
conventions for the v2 API syntax are described 
<>.

Review Comment:
   It definitely works in adoc.  The question I'd ask though is whether Github 
is smart enough to do it as well when it's previewing these asciidoc files.
   
   That's the main way we expect these to be consumed, right?  Through the 
Github UI?  (i.e. we don't actually build these asciidoc files into any thing 
analogous to the ref-guide?)
   
   I'll push the change and see how Github handles it; if GH doesn't 
do-the-right-thing I'll revert.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-15781: Document v2 API syntax conventions [solr]

2024-01-29 Thread via GitHub


gerlowskija commented on code in PR #2219:
URL: https://github.com/apache/solr/pull/2219#discussion_r1469907912


##
dev-docs/apis.adoc:
##
@@ -51,27 +51,28 @@ Separating the API "definition" and "implementation" in 
this way allows us to on
 
 Writing a new v2 API may appear daunting, but additions in reality are 
actually pretty simple:
 
-1. *Create POJO ("Plain Old Java Object") classes as needed to represent the 
API's request and response*:
+1. *Agree on Endpoint Syntax*: Before implementing, developers should generate 
consensus around what the new API will look like.  General guidelines and 
conventions for the v2 API syntax are described 
<>.

Review Comment:
   Looks like GH does do the right thing 👍 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-15781: Document v2 API syntax conventions [solr]

2024-01-29 Thread via GitHub


gerlowskija merged PR #2219:
URL: https://github.com/apache/solr/pull/2219


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-15781) Make v2 APIs more REST-ful and migrate to JAX-RS

2024-01-29 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811987#comment-17811987
 ] 

ASF subversion and git services commented on SOLR-15781:


Commit 0830423afd7d29ce2b8f18643b7fd660ce8dc08e in solr's branch 
refs/heads/main from Jason Gerlowski
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=0830423afd7 ]

SOLR-15781: Document v2 API syntax conventions (#2219)

Currently, decisions around what our v2 API should look like are
scattered across a large number of JIRA tickets, PR reviews, and
spreadsheets.  There's no one place giving an overview of the desired
syntax (beyond perhaps the general guidance in SIP-16 to move towards
"REST" where possible).

This commit attempts to address this by introducing docs describing
some of the conventions used in our v2 API.

Room for improvement remains: parameters conventions remain a big gap,
as does sourcing to the original rationale behind many of these
decisions.  But this commit provides a place to build off of, at the
least.

> Make v2 APIs more REST-ful and migrate to JAX-RS
> 
>
> Key: SOLR-15781
> URL: https://issues.apache.org/jira/browse/SOLR-15781
> Project: Solr
>  Issue Type: Improvement
>  Components: v2 API
>Reporter: Jason Gerlowski
>Priority: Major
>  Labels: V2
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> While the v2 API is in many ways an improvement over v1 in terms of 
> readability and API best practices, it still has a lot of room for 
> improvement.  The recent "experimental" designation for v2 opens the door to 
> pursuing many of these improvements. 
> This ticket is intended as an umbrella to track resolution of some of the 
> known warts in the current v2 API and some of the improvements that can be 
> made in terms of being more RESTful, etc.
> While we're touching the API code for these endpoints, it makes sense to 
> convert them to JAX-RS as well.  This JAX-RS migration work was initially 
> covered in a separate ticket (see SOLR-16370), but the changes required ended 
> up overlapping with the "cosmetic-improvement" effort here so significantly 
> that it made sense to track them together.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17131: Optimize rows=0 since score/sort isn't necessary [solr]

2024-01-29 Thread via GitHub


dsmiley commented on PR #2221:
URL: https://github.com/apache/solr/pull/2221#issuecomment-1915205945

   Let's use GH links to ensure we understand each other.  You are referring to 
[this](https://github.com/apache/solr/blob/e53bf109ba2d35e1aff6f9a5c428fff5eff63f7b/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1658)?
   ```
   List filterList = cmd.getFilterList();
   if (filterList != null && !filterList.isEmpty()) {
 out.docSet = 
DocSetUtil.getDocSet(out.docSet.intersection(getDocSet(filterList)), this);
   }
   ```
   This shows that the list of filter queries goes through the filter cache.  
But that's true no matter what (other code paths ultimately lead to the same 
but in different methods deeper).  The question at hand is how is the `q` param 
processed, which in this code gets turned into a DocSet saved to `out.docSet`.  
Immediately before these lines, it's populated via `getDocSet`.  This is what 
`useFilterCache` *means*.  In your PR, you added a condition to not do this, 
thus not use the filter cache for `q`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17131: Optimize rows=0 since score/sort isn't necessary [solr]

2024-01-29 Thread via GitHub


uschindler commented on PR #2221:
URL: https://github.com/apache/solr/pull/2221#issuecomment-1915232387

   Hi,
   I already responded on Slack, but I now found this PR and wanted to say some 
additional things:
   
   There is also a separate collector and on top of that a separate method in 
`IndexSearcher`, which automatically does a lot of the top-docs related stuff. 
The problem with a normal `TopDocs` or `TopFieldDocs` collector is that the 
empty PQ has some additional overhead (although the score is not calculated and 
returned as 0).
   
   First of all: SolrIndexSearcher should also implement support for using the 
optimized methods and collectors, unless a `DocSet` is needed (for 
facets/aggs). Using the `TotalHitCountCollector` is one thing to do (this needs 
to be done more down the line in the search code of Solr, but it's worth to do. 
But as there is now support to short-circuit count queries on the Query level, 
you should also think of using `IndexSearcher#count()` directly: E.g., this one 
returns `IndexReader#numDocs()` for `MatchAllDocsQuery` (thats a stupid simple 
case), but it can also return quick counts when there are no deleted docs in a 
segment and the statistics in index can answer the query.
   
   The changes here look fine (although I can't follow the discussion about 
those crazy flags true/false/whatever), but in addition to the code here, in 
case of no TopDocs and no facets, the code should call `IndexSearcher#count()`. 
In the facet case, it should at least use the `TotalHitCountCollector` as the 
final sink.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17066: Add GenericSolrRequest.setRequiresCollection [solr]

2024-01-29 Thread via GitHub


epugh commented on code in PR #2229:
URL: https://github.com/apache/solr/pull/2229#discussion_r1469954419


##
solr/core/src/java/org/apache/solr/packagemanager/PackageManager.java:
##
@@ -237,8 +237,10 @@ public Map 
getPackagesDeployed(String collection) {
   NamedList result =
   solrClient.request(
   new GenericSolrRequest(
-  SolrRequest.METHOD.GET,
-  PackageUtils.getCollectionParamsPath(collection) + 
"/PKG_VERSIONS"));
+  SolrRequest.METHOD.GET,
+  PackageUtils.getCollectionParamsPath(collection) + 
"/PKG_VERSIONS")
+  .setRequiresCollection(
+  false) /* Making a collection request, but already baked 
into path */);

Review Comment:
   interesting, is this a place we could try and improve things to not have 
this logic?   Seems  a shame to have a collection request and state 'but we 
don't require it'?



##
solr/core/src/java/org/apache/solr/cli/ExportTool.java:
##
@@ -620,7 +622,7 @@ boolean exportDocsFromCore() throws IOException, 
SolrServerException {
 
 try (SolrClient client = SolrCLI.getSolrClient(baseurl, credentials)) {
   expectedDocs = getDocCount(replica.getCoreName(), client, query);
-  GenericSolrRequest request;
+  QueryRequest request;

Review Comment:
   I like that you were able to move away from the `GenericSolrRequest` here!



##
solr/solrj/src/java/org/apache/solr/client/solrj/request/GenericSolrRequest.java:
##


Review Comment:
   ooh, docs!



##
solr/core/src/java/org/apache/solr/packagemanager/PackageUtils.java:
##
@@ -155,9 +155,12 @@ public static String getFileFromJarsAsString(List 
jars, String filename) {
   }
 
   /** Returns JSON string from a given Solr URL */
-  public static String getJsonStringFromUrl(SolrClient client, String path, 
SolrParams params) {
+  public static String getJsonStringFromUrl(

Review Comment:
   I wonder if it would be simpler to just have "getJsonStringFromCollection" 
and "getJsonStringFromSolr" or something?   After all, it's not a generic 
"getJsonStringFromUrl" where I can pass in any URL, despite what the name of 
the method appears to suggest...!Then we could skip passing a magic boolean 
around...



##
solr/solrj/src/java/org/apache/solr/client/solrj/request/GenericSolrRequest.java:
##
@@ -29,16 +29,48 @@ public class GenericSolrRequest extends 
SolrRequest {
   public SolrParams params;
   public SimpleSolrResponse response = new SimpleSolrResponse();
   public ContentWriter contentWriter;
+  public boolean requiresCollection;
 
+  /**
+   * @param m the HTTP method to use for this request
+   * @param path the HTTP path to use for this request. If users are making a 
collection-aware
+   * request (i.e. {@link #setRequiresCollection(boolean)} is called with 
'true'), only the
+   * section of the API path following the collection or core should be 
provided here.
+   */
   public GenericSolrRequest(METHOD m, String path) {
 this(m, path, new ModifiableSolrParams());
   }
 
+  /**
+   * @param m the HTTP method to use for this request
+   * @param path the HTTP path to use for this request. If users are making a 
collection-aware
+   * request (i.e. {@link #setRequiresCollection(boolean)} is called with 
'true'), only the
+   * section of the API path following the collection or core should be 
provided here.
+   * @param params query parameter names and values for making this request.
+   */
   public GenericSolrRequest(METHOD m, String path, SolrParams params) {
 super(m, path);
 this.params = params;
   }
 
+  /**
+   * Determines whether the SolrRequest should use a default collection or 
core from the client
+   *
+   * Should generally be 'true' whenever making a request to a particular 
collection or core, and
+   * 'false' otherwise.

Review Comment:
   Wish `generally` could be `always` !



##
solr/core/src/test/org/apache/solr/security/BasicAuthIntegrationTest.java:
##
@@ -330,7 +330,7 @@ public void testBasicAuth() throws Exception {
   assertPkiAuthMetricsMinimums(3, 3, 0, 0, 0, 0);
 
   // Query that succeeds
-  GenericSolrRequest req = new GenericSolrRequest(SolrRequest.METHOD.GET, 
"/select", params);
+  final var req = new QueryRequest(params);

Review Comment:
   Yay!



##
solr/solrj/src/test/org/apache/solr/client/solrj/impl/Http2SolrClientTest.java:
##
@@ -1013,7 +1013,8 @@ public void testGetRawStream() throws Exception {
 getBaseUrl() + "/debug/foo", DEFAULT_CONNECTION_TIMEOUT, 
DEFAULT_CONNECTION_TIMEOUT)
 .build()) {
   GenericSolrRequest req =
-  new GenericSolrRequest(SolrRequest.METHOD.GET, "/select", 
params("q", "*:*"));
+  new GenericSolrRequest(SolrRequest.METHOD.GET, "/select", 
params("q", "*:*"))

Review Comment:
   Cou

[jira] [Created] (SOLR-17134) Bucketed rate limiting

2024-01-29 Thread Noble Paul (Jira)
Noble Paul created SOLR-17134:
-

 Summary: Bucketed rate limiting
 Key: SOLR-17134
 URL: https://issues.apache.org/jira/browse/SOLR-17134
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul


The current rate limiting functionality is limiting . We either get a rate 
limit or we don't. There are cases where we would like to rate limit only a 
certain type of requests.

The usecase is as follows.

 
We may have normal requests that go through without any rate limiting. When a 
certain header or param is passed ,  the requests are throttled.  we should be 
able to throttle to
 * 'n1' requests in parallel for requests with a header/param value "x" 
 * "n2" requests in parallel for requests with a header/param value "y" 

and so on

The configuration would look as following

 
{code:json}
{
  "rate-limiters": {
"enabled": true,
"readBuckets": [
  {
"name": "x-type",
"header": {"solr-bucket-name": "x-type"},
"allowedRequests": 5,
"slotAcquisitionTimeoutInMS": 100
},
  {
"name": "y-type",
"header": {"solr-bucket-name": "y-type"},  
"allowedRequests": 25,
"slotAcquisitionTimeoutInMS": 100
}
  ]
  }
 }
{code}
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Created] (SOLR-17135) Default PRS at System level as opt in

2024-01-29 Thread Noble Paul (Jira)
Noble Paul created SOLR-17135:
-

 Summary: Default PRS at System level as opt in
 Key: SOLR-17135
 URL: https://issues.apache.org/jira/browse/SOLR-17135
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul
Assignee: Noble Paul


We use the SolrJ API to randomize testing with PRS.  PRS is enabled only in 
certain tests . This means we do not have 100% coverage . We should have a 
system property ({{{}-Dsolr.prs.default=true{}}}) for Solr which decides 
whether collections are created with PRS=true false. Default is false for. The 
test framework may set this to true or false. With this change , we will remove 
all the explicit 

{{.setPerReplicaState(SolrCloudTestCase.USE_PER_REPLICA_STATE)}}

 

from our tests



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[PR] Use a System wide property to enable PRS. the testcase can piggyback … [solr]

2024-01-29 Thread via GitHub


noblepaul opened a new pull request, #2230:
URL: https://github.com/apache/solr/pull/2230

   WIP do not merge


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17131: Optimize rows=0 since score/sort isn't necessary [solr]

2024-01-29 Thread via GitHub


dsmiley commented on PR #2221:
URL: https://github.com/apache/solr/pull/2221#issuecomment-1915447012

   @uschindler See 
[getDocListNC](https://github.com/apache/solr/blob/326afddeddaa2092086e6f6b27f8bc35bf6d79cf/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1812)
 which already seems to be pretty well optimized when `lastDocRequested <= 0` 
(i.e. `rows==0`), and in conjunction also when `!needScores`.  The reason some 
of the Lucene convenience collectors & IndexSearcher.count aren't used is 
because this code is juggling a multitude of scenarios, such as when there 
might be a so-called `PostFilter`, thus leading to another Collector to add to 
the head of the collectors.  Care must be taken before adding yet another 
optimization -- does it *actually* add value or it just to feel good to use 
another Lucene class.  `getDocListAndSetNC` follows this method and is similar.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17068: Resolve mish mash of bin/post and bin/solr post references in favour of bin/solr post. [solr]

2024-01-29 Thread via GitHub


epugh commented on PR #2227:
URL: https://github.com/apache/solr/pull/2227#issuecomment-1915578292

   Just tested that `bin/post` still works!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17131: Optimize rows=0 since score/sort isn't necessary [solr]

2024-01-29 Thread via GitHub


risdenk commented on PR #2221:
URL: https://github.com/apache/solr/pull/2221#issuecomment-1915662767

   Updated this based on talking w/ @dsmiley. the `useFilterCache` part I was 
confused on that it always applies to `q` so my change didn't make sense. I 
pushed a change that made it clear what `useFilterCache` is always used for 
`q`. 
   
   I also made the change to `QueryComponent` to make sure `scores` aren't 
calculated if `rows=0`. I think there is more here potentially. 
   
   TODO:
   * Look at adding some more tests
   * Try adding a benchmark for this


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [I] SolrCloud Pod moved to new Node - Replica Migration pending [solr-operator]

2024-01-29 Thread via GitHub


HoustonPutman commented on issue #668:
URL: https://github.com/apache/solr-operator/issues/668#issuecomment-1915739941

   The only ways that local volumes work as PVs is if the PVs that are created 
have node limitations (i.e. the Pod connected to the PV cannot be rescheduled 
onto another node). Are you sure that the local volume provisioner is setup 
correctly?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-16699: Expose creationTimeMillis in COLSTATUS API [solr]

2024-01-29 Thread via GitHub


dsmiley commented on code in PR #2226:
URL: https://github.com/apache/solr/pull/2226#discussion_r1470367294


##
solr/core/src/java/org/apache/solr/cloud/overseer/ClusterStateMutator.java:
##
@@ -133,9 +134,18 @@ public ZkWriteCommand createCollection(ClusterState 
clusterState, ZkNodeProps me
 }
 
 assert !collectionProps.containsKey(CollectionAdminParams.COLL_CONF);
+
+// This doc collection does not fully capture what will be persisted: the 
zkNodeVersion and

Review Comment:
   ```suggestion
   // This instance does not fully capture what will be persisted: the 
zkNodeVersion and
   ```



##
solr/core/src/test/org/apache/solr/cloud/overseer/ZkStateWriterTest.java:
##
@@ -549,4 +523,9 @@ public void testSingleExternalCollectionCompressedState() 
throws Exception {
   server.shutdown();
 }
   }
+
+  private DocCollection createDocCollection(String name, Map 
props) {

Review Comment:
   glad to see this; simplified the test



##
solr/solrj/src/test/org/apache/solr/client/solrj/impl/BaseHttpClusterStateProviderTest.java:
##
@@ -0,0 +1,153 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.client.solrj.impl;
+
+import java.io.IOException;
+import java.time.Instant;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import org.apache.solr.client.solrj.SolrClient;
+import org.apache.solr.client.solrj.SolrServerException;
+import org.apache.solr.client.solrj.request.CollectionAdminRequest;
+import org.apache.solr.client.solrj.response.CollectionAdminResponse;
+import org.apache.solr.cloud.SolrCloudTestCase;
+import org.apache.solr.common.cloud.ClusterState;
+import org.apache.solr.common.cloud.DocCollection;
+import org.apache.solr.common.util.NamedList;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+public class BaseHttpClusterStateProviderTest extends SolrCloudTestCase {

Review Comment:
   This test does not need to exist or could be converted to something generic. 
 A ClusterStateProvider is an abstraction with equivalent implementations -- 
and thus could be chosen arbitrarily in some other test that checks the time 
that otherwise doesn't really care about ZK or Http for that matter.  Many 
things in Lucene and Solr are tested embracing this philosophy and I count 
myself a convert.  "randomized testing".  Maybe an existing test that checks 
creation time could be retrofitted to use the cluster state provider impl 
randomly?  Or alternatively since there are 2, make a generic test that is 
parameterized for the two (example of the approach -- `TestConfigSetService`)



##
solr/core/src/test/org/apache/solr/cloud/api/collections/TestCollectionAPI.java:
##
@@ -345,6 +346,16 @@ private void clusterStatusWithCollectionAndMultipleShards()
 }
   }
 
+  private Instant getCreationTimeFromClusterStateOrFail(
+  CloudSolrClient client, String collectionName) throws IOException {
+DocCollection docCollection = 
client.getClusterState().getCollectionOrNull(collectionName);

Review Comment:
   Why call the `OrNull` version; why not let the default impl throw its own 
exception?  That's what its for.



##
solr/core/src/java/org/apache/solr/core/backup/BackupManager.java:
##
@@ -221,7 +222,11 @@ public DocCollection readCollectionState(String 
collectionName) throws IOExcepti
 repository.openInput(zkStateDir, COLLECTION_PROPS_FILE, 
IOContext.DEFAULT)) {
   byte[] arr = new byte[(int) is.length()]; // probably ok since the json 
file should be small.
   is.readBytes(arr, 0, (int) is.length());
-  ClusterState c_state = ClusterState.createFromJson(-1, arr, 
Collections.emptySet(), null);
+  // set a default created date, we don't aim at reading actual zookeeper 
state. The restored
+  // collection
+  // will have a new creation date when persisted in zookeeper.

Review Comment:
   reflow



##
solr/solrj/src/java/org/apache/solr/common/cloud/DocCollection.java:
##
@@ -385,6 +394,15 @@ public boolean isReadOnly() {
 return readOnly;
   }
 
+  /**
+   * When this DocCollection is read from Zookeeper, it captures

Re: [PR] SOLR-17066: Add GenericSolrRequest.setRequiresCollection [solr]

2024-01-29 Thread via GitHub


gerlowskija commented on PR #2229:
URL: https://github.com/apache/solr/pull/2229#issuecomment-1915912304

   > What jumped out at me is that we really abuse `GenericSolrRequest` all 
over the place. I also am hoping that as more of the V2 api's come online, we 
could reduce the places that we use `GenericSolrRequest`, kind of like how in 
some places you could swap in a `QueryRequest` in this PR!
   
   Yeah, I was thinking the same thing.  There's a few times we use 
GenericSolrRequest as a way to hit a custom test endpoint or something like 
that, but that's definitely the exception and not the rule.  90% of these GSR 
usages are bandaids covering up the fact that we're missing SolrJ coverage for 
particular APIs. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17066: Add GenericSolrRequest.setRequiresCollection [solr]

2024-01-29 Thread via GitHub


epugh commented on PR #2229:
URL: https://github.com/apache/solr/pull/2229#issuecomment-1915917826

   > > What jumped out at me is that we really abuse `GenericSolrRequest` all 
over the place. I also am hoping that as more of the V2 api's come online, we 
could reduce the places that we use `GenericSolrRequest`, kind of like how in 
some places you could swap in a `QueryRequest` in this PR!
   > 
   > Yeah, I was thinking the same thing. There's a few times we use 
GenericSolrRequest as a way to hit a custom test endpoint or something like 
that, but that's definitely the exception and not the rule. 90% of these GSR 
usages are bandaids covering up the fact that we're missing SolrJ coverage for 
particular APIs.
   > 
   > I think these invocations would be a great place to start sneaking in some 
of our v2 SolrRequest objects, as you suggested. (Would be a great newdev 
ticket!)
   
   Totally agree on getting some of the new v2 objects in!   Also, I wonder if 
looking at uses of GenericSolrRequest would provide insight into which v2 apis 
should be prioritized?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17066: Add GenericSolrRequest.setRequiresCollection [solr]

2024-01-29 Thread via GitHub


gerlowskija commented on code in PR #2229:
URL: https://github.com/apache/solr/pull/2229#discussion_r1470500298


##
solr/core/src/java/org/apache/solr/packagemanager/PackageManager.java:
##
@@ -237,8 +237,10 @@ public Map 
getPackagesDeployed(String collection) {
   NamedList result =
   solrClient.request(
   new GenericSolrRequest(
-  SolrRequest.METHOD.GET,
-  PackageUtils.getCollectionParamsPath(collection) + 
"/PKG_VERSIONS"));
+  SolrRequest.METHOD.GET,
+  PackageUtils.getCollectionParamsPath(collection) + 
"/PKG_VERSIONS")
+  .setRequiresCollection(
+  false) /* Making a collection request, but already baked 
into path */);

Review Comment:
   We could - basically that'd involve changing 
`PackageUtils.getCollectionParamsPath` to return a value of `/config/params` 
instead of the current value 
`/api/collections/someCollectionName/config/params`.  The method has 8 or 10 
call sites, so we'd have to restructure the logic in each of those places.
   
   It's not that big of a lift; I'm happy to go that route if it'd be your 
preference?  Lmk.
   
   That said, I have a slight preference for skipping this now.  Mostly 
because, as you mentioned - this code really shouldn't be using GSR in any 
case.  No point spending time reworking the path-building logic now, if down 
the line we're gonna replace all this code anyways with a more appropriate 
SolrRequest impl.



##
solr/core/src/java/org/apache/solr/packagemanager/PackageManager.java:
##
@@ -237,8 +237,10 @@ public Map 
getPackagesDeployed(String collection) {
   NamedList result =
   solrClient.request(
   new GenericSolrRequest(
-  SolrRequest.METHOD.GET,
-  PackageUtils.getCollectionParamsPath(collection) + 
"/PKG_VERSIONS"));
+  SolrRequest.METHOD.GET,
+  PackageUtils.getCollectionParamsPath(collection) + 
"/PKG_VERSIONS")
+  .setRequiresCollection(
+  false) /* Making a collection request, but already baked 
into path */);

Review Comment:
   We could - basically that'd involve changing 
`PackageUtils.getCollectionParamsPath` to return a value of `/config/params` 
instead of the current value 
`/api/collections/someCollectionName/config/params`.  The method has 8 or 10 
call sites, so we'd have to restructure the logic in each of those places.
   
   It's not that big of a lift; I'm happy to go that route if it'd be your 
preference?  Lmk.
   
   That said, I have a slight preference for skipping this now.  Mostly 
because, as you mentioned - this code really shouldn't be using GSR at all.  No 
point spending time reworking the path-building logic now, if down the line 
we're gonna replace all this code anyways with a more appropriate SolrRequest 
impl.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17066: Add GenericSolrRequest.setRequiresCollection [solr]

2024-01-29 Thread via GitHub


gerlowskija commented on code in PR #2229:
URL: https://github.com/apache/solr/pull/2229#discussion_r1470501485


##
solr/solrj/src/test/org/apache/solr/client/solrj/impl/Http2SolrClientTest.java:
##
@@ -1013,7 +1013,8 @@ public void testGetRawStream() throws Exception {
 getBaseUrl() + "/debug/foo", DEFAULT_CONNECTION_TIMEOUT, 
DEFAULT_CONNECTION_TIMEOUT)
 .build()) {
   GenericSolrRequest req =
-  new GenericSolrRequest(SolrRequest.METHOD.GET, "/select", 
params("q", "*:*"));
+  new GenericSolrRequest(SolrRequest.METHOD.GET, "/select", 
params("q", "*:*"))

Review Comment:
   I didn't try it earlier because I wasn't sure whether this was related to 
any of the `/debug/foo` stuff above, but I'll give it a shot and update if the 
test passes...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17066: Add GenericSolrRequest.setRequiresCollection [solr]

2024-01-29 Thread via GitHub


gerlowskija commented on code in PR #2229:
URL: https://github.com/apache/solr/pull/2229#discussion_r1470505104


##
solr/solrj/src/test/org/apache/solr/client/solrj/impl/Http2SolrClientTest.java:
##
@@ -1013,7 +1013,8 @@ public void testGetRawStream() throws Exception {
 getBaseUrl() + "/debug/foo", DEFAULT_CONNECTION_TIMEOUT, 
DEFAULT_CONNECTION_TIMEOUT)
 .build()) {
   GenericSolrRequest req =
-  new GenericSolrRequest(SolrRequest.METHOD.GET, "/select", 
params("q", "*:*"));
+  new GenericSolrRequest(SolrRequest.METHOD.GET, "/select", 
params("q", "*:*"))

Review Comment:
   Hey it worked - good catch!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17068: Resolve mish mash of bin/post and bin/solr post references in favour of bin/solr post. [solr]

2024-01-29 Thread via GitHub


dsmiley commented on PR #2227:
URL: https://github.com/apache/solr/pull/2227#issuecomment-1915947350

   Maybe I'm being grumpy but I like to think of the unfortunate addition of 
"post" to bin/solr as an implementation detail out of convenience to make 
multi-platform and general supportability easier for bin/post.  Does this mean 
we don't want users to use bin/post either?  I think not.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17130: edismax-matchalldocs-optimization [solr]

2024-01-29 Thread via GitHub


dsmiley commented on code in PR #2218:
URL: https://github.com/apache/solr/pull/2218#discussion_r1470516795


##
solr/core/src/java/org/apache/solr/search/ExtendedDismaxQParser.java:
##
@@ -199,6 +199,18 @@ public Query parse() throws SyntaxError {
   topQuery = FunctionScoreQuery.boostByValue(topQuery, 
boosts.get(0).asDoubleValuesSource());
 }
 
+// If topQuery is a boolean query, unwrap the boolean query to check if it 
is just

Review Comment:
   I recommend moving this to above the section right above here since 
FunctionScoreQuery may wrap the BooleanQuery or whatever `topQuery` happens to 
be.  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17066: Add GenericSolrRequest.setRequiresCollection [solr]

2024-01-29 Thread via GitHub


gerlowskija commented on code in PR #2229:
URL: https://github.com/apache/solr/pull/2229#discussion_r1470520449


##
solr/core/src/java/org/apache/solr/packagemanager/PackageUtils.java:
##
@@ -155,9 +155,12 @@ public static String getFileFromJarsAsString(List 
jars, String filename) {
   }
 
   /** Returns JSON string from a given Solr URL */
-  public static String getJsonStringFromUrl(SolrClient client, String path, 
SolrParams params) {
+  public static String getJsonStringFromUrl(

Review Comment:
   Yeah, the 'magic boolean' is less than ideal for sure.  Done!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17066: Add GenericSolrRequest.setRequiresCollection [solr]

2024-01-29 Thread via GitHub


gerlowskija commented on code in PR #2229:
URL: https://github.com/apache/solr/pull/2229#discussion_r1470505104


##
solr/solrj/src/test/org/apache/solr/client/solrj/impl/Http2SolrClientTest.java:
##
@@ -1013,7 +1013,8 @@ public void testGetRawStream() throws Exception {
 getBaseUrl() + "/debug/foo", DEFAULT_CONNECTION_TIMEOUT, 
DEFAULT_CONNECTION_TIMEOUT)
 .build()) {
   GenericSolrRequest req =
-  new GenericSolrRequest(SolrRequest.METHOD.GET, "/select", 
params("q", "*:*"));
+  new GenericSolrRequest(SolrRequest.METHOD.GET, "/select", 
params("q", "*:*"))

Review Comment:
   Hey it worked - good catch @epugh 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[PR] SOLR-17066: Switch HttpSolrClient away from coreURLs, pt 2 [solr]

2024-01-29 Thread via GitHub


gerlowskija opened a new pull request, #2231:
URL: https://github.com/apache/solr/pull/2231

   https://issues.apache.org/jira/browse/SOLR-17066
   
   # Description
   
   SOLR-17066 introduced a `withDefaultCollection` builder method that can be 
used in lieu of providing a core URL as a "base".  (Core URLs are not ideal as 
they prevent clients from making core-agnostic requests.)
   
   However, Solr's own client usage still needs modified to use this new 
builder method.
   
   # Solution
   
   This PR tackles this for the necessary production (i.e. non-test) code in 
several Solr modules, including: solr-core's tests, the 'main' and 'test' code 
for solrj and solrj-streaming, and the hadoop-auth and hdfs modules.
   
   # Tests
   
   Existing tests continue to pass.  A few additional unit tests for URLUtil
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://github.com/apache/solr/blob/main/CONTRIBUTING.md) and my 
code conforms to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `main` branch.
   - [ ] I have run `./gradlew check`.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17066: Switch HttpSolrClient away from coreURLs, pt 2 [solr]

2024-01-29 Thread via GitHub


gerlowskija commented on PR #2231:
URL: https://github.com/apache/solr/pull/2231#issuecomment-1915971687

   Tests pass for me locally when combined with PR #2229.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17131: Optimize rows=0 since score/sort isn't necessary [solr]

2024-01-29 Thread via GitHub


dsmiley commented on code in PR #2221:
URL: https://github.com/apache/solr/pull/2221#discussion_r1470537795


##
solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java:
##
@@ -169,6 +160,20 @@ public void prepare(ResponseBuilder rb) throws IOException 
{
 
 try {
   QParser parser = QParser.getParser(rb.getQueryString(), defType, req);
+
+  SortSpec sortSpec = parser.getSortSpec(true);
+  rb.setSortSpec(sortSpec);
+  boolean someDocs = sortSpec == null || sortSpec.getCount() != 0;
+
+  // Set field flags

Review Comment:
   The try-catch is about query parsing / syntax errors; sortSpec & 
returnFields is separate; perhaps should go after.



##
solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java:
##
@@ -147,15 +147,6 @@ public void prepare(ResponseBuilder rb) throws IOException 
{
   }
 }
 
-// Set field flags

Review Comment:
   BTW, I cannot figure out why "field" is  used to describe/qualify the flags. 
 If you can't either, I recommend as you move this comment to simply omit the 
"field" word.



##
solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java:
##
@@ -1682,7 +1684,6 @@ private void getDocListC(QueryResult qr, QueryCommand 
cmd) throws IOException {
 }
   }
 } else {
-  fullSortCount.increment();

Review Comment:
   nocommit -- to move elsewhere still



##
solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java:
##
@@ -169,6 +160,20 @@ public void prepare(ResponseBuilder rb) throws IOException 
{
 
 try {
   QParser parser = QParser.getParser(rb.getQueryString(), defType, req);
+
+  SortSpec sortSpec = parser.getSortSpec(true);
+  rb.setSortSpec(sortSpec);
+  boolean someDocs = sortSpec == null || sortSpec.getCount() != 0;
+
+  // Set field flags

Review Comment:
   I was anticipating seeing a rows check in changes to QueryComponent but 
don't see it yet



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-17038) /admin/segments handler: Expose the term count

2024-01-29 Thread Rahul Goswami (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812137#comment-17812137
 ] 

Rahul Goswami commented on SOLR-17038:
--

I am working on this.

> /admin/segments handler: Expose the term count
> --
>
> Key: SOLR-17038
> URL: https://issues.apache.org/jira/browse/SOLR-17038
> Project: Solr
>  Issue Type: Improvement
>Reporter: David Smiley
>Priority: Minor
>  Labels: newdev
>
> The term count for a field is not exposed for diagnostic purposes.  Strangely 
> enough, more obscure statistics like sumDocFreq and sumTotalTermFreq are.
> Just need to add a line like:
> {quote}fieldFlags.add("termCount", terms.size());{quote}
> to SegmentsInfoRequestHandler next to [where those other stats are 
> gathered|https://github.com/apache/solr/blob/releases/solr/9.4.1/solr/core/src/java/org/apache/solr/handler/admin/SegmentsInfoRequestHandler.java#L371-L372].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Comment Edited] (SOLR-17038) /admin/segments handler: Expose the term count

2024-01-29 Thread Rahul Goswami (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812137#comment-17812137
 ] 

Rahul Goswami edited comment on SOLR-17038 at 1/30/24 4:23 AM:
---

I am working on this. Any other stats apart from "termCount" that could be 
useful ?


was (Author: rahul196...@gmail.com):
I am working on this.

> /admin/segments handler: Expose the term count
> --
>
> Key: SOLR-17038
> URL: https://issues.apache.org/jira/browse/SOLR-17038
> Project: Solr
>  Issue Type: Improvement
>Reporter: David Smiley
>Priority: Minor
>  Labels: newdev
>
> The term count for a field is not exposed for diagnostic purposes.  Strangely 
> enough, more obscure statistics like sumDocFreq and sumTotalTermFreq are.
> Just need to add a line like:
> {quote}fieldFlags.add("termCount", terms.size());{quote}
> to SegmentsInfoRequestHandler next to [where those other stats are 
> gathered|https://github.com/apache/solr/blob/releases/solr/9.4.1/solr/core/src/java/org/apache/solr/handler/admin/SegmentsInfoRequestHandler.java#L371-L372].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org