date:20241104

[jira] [Commented] (SOLR-16757) Umbrella Ticket for Revamping Solr CLI's for the Future

2024-11-04 Thread Eric Pugh (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-16757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895274#comment-17895274
 ] 

Eric Pugh commented on SOLR-16757:
--

We are getting close to the end of this effort.  I checked in with 
[~malliaridis] and he is going to push up a PR that moves us to OptionGroups 
for cases where several options exist but only one may be used at a time, so 
that we can leverage the built-in CLI error handling for that situation.  He 
also has a refactoring of SolrCLI to take some of the weight out of it.

I would love to see JWT support added to the CLI (SOLR-13071), to go along with 
our Basic Auth support.  [~Idjeraoui] would you be interested in helping with 
that? 

I also would like to take another run at SOLR-7871, which is the idea of 
having a platform-independent file for all our configuration values. 

> Umbrella Ticket for Revamping Solr CLI's for the Future
> ---
>
> Key: SOLR-16757
> URL: https://issues.apache.org/jira/browse/SOLR-16757
> Project: Solr
>  Issue Type: Task
>  Components: cli
>Reporter: Eric Pugh
>Assignee: Eric Pugh
>Priority: Minor
>
> This is to guide me in revamping the Solr CLI functions by tracking a set of 
> JIRA's.   It's to help me figure out which I am going to work on and which I 
> am not.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

[jira] [Commented] (SOLR-17497) Pull replicas throws AlreadyClosedException

2024-11-04 Thread Jason Gerlowski (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895320#comment-17895320
 ] 

Jason Gerlowski commented on SOLR-17497:


Hey [~sanjaydutt] - I'm going to backport your change above to branch_9_7 as 
well, unless you've got any objections?  I was doing some "beast" runs this 
morning and noticed a big improvement in branch_9x with your change, so I'd 
love to see branch_9_7 get that benefit as well with a 9.7.1 release coming up...

> Pull replicas throws AlreadyClosedException  
> -
>
> Key: SOLR-17497
> URL: https://issues.apache.org/jira/browse/SOLR-17497
> Project: Solr
>  Issue Type: Task
>Reporter: Sanjay Dutt
>Priority: Major
> Attachments: Screenshot 2024-10-23 at 6.01.02 PM.png
>
>
> Recently, a common exception (org.apache.lucene.store.AlreadyClosedException: 
> this Directory is closed) has been seen in multiple failed test cases. 
> FAILED:  org.apache.solr.cloud.TestPullReplica.testKillPullReplica
> FAILED:  
> org.apache.solr.cloud.SplitShardWithNodeRoleTest.testSolrClusterWithNodeRoleWithPull
> FAILED:  org.apache.solr.cloud.TestPullReplica.testAddDocs
>  
>  
> {code:java}
> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an 
> uncaught exception in thread: Thread[id=10271, 
> name=fsyncService-6341-thread-1, state=RUNNABLE, 
> group=TGRP-SplitShardWithNodeRoleTest]
>         at 
> __randomizedtesting.SeedInfo.seed([3F7DACB3BC44C3C4:E5DB3E97188A8EB9]:0)
> Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is 
> closed
>         at __randomizedtesting.SeedInfo.seed([3F7DACB3BC44C3C4]:0)
>         at 
> app//org.apache.lucene.store.BaseDirectory.ensureOpen(BaseDirectory.java:50)
>         at 
> app//org.apache.lucene.store.ByteBuffersDirectory.sync(ByteBuffersDirectory.java:237)
>         at 
> app//org.apache.lucene.tests.store.MockDirectoryWrapper.sync(MockDirectoryWrapper.java:214)
>         at 
> app//org.apache.solr.handler.IndexFetcher$DirectoryFile.sync(IndexFetcher.java:2034)
>         at 
> app//org.apache.solr.handler.IndexFetcher$FileFetcher.lambda$fetch$0(IndexFetcher.java:1803)
>         at 
> app//org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$1(ExecutorUtil.java:449)
>         at 
> java.base@11.0.24/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at 
> java.base@11.0.24/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base@11.0.24/java.lang.Thread.run(Thread.java:829)
>  {code}
>  
> The interesting thing about these test cases is that they all share the same 
> kind of setup: each has one shard and two replicas, one NRT and one PULL.
>  
> Going through the execution steps of one of the test cases:
> FAILED:  org.apache.solr.cloud.TestPullReplica.testKillPullReplica
>  
> Test flow
> 1. Create a collection with 1 NRT and 1 PULL replica
> 2. waitForState
> 3. waitForNumDocsInAllActiveReplicas(0); // *Name says it all*
> 4. Index another document.
> 5. waitForNumDocsInAllActiveReplicas(1);
> 6. Stop Pull replica
> 7. Index another document
> 8. waitForNumDocsInAllActiveReplicas(2);
> 9. Start Pull Replica
> 10. waitForState
> 11. waitForNumDocsInAllActiveReplicas(2);
>  
> As per the logs the whole sequence executed successfully. Here is the link to 
> the logs: 
> [https://ge.apache.org/s/yxydiox3gvlf2/tests/task/:solr:core:test/details/org.apache.solr.cloud.TestPullReplica/testKillPullReplica/1/output]
>  (link may stop working in the future)
>  
> The last step, which makes sure that every active replica has two documents, 
> logged an INFO message for each replica, which is further evidence that it 
> completed successfully. 
>  
> {code:java}
> 616575 INFO 
> (TEST-TestPullReplica.testKillPullReplica-seed#[F30CC837FDD0DC28]) [n: c: s: 
> r: x: t:] o.a.s.c.TestPullReplica Replica core_node3 
> (https://127.0.0.1:35647/solr/pull_replica_test_kill_pull_replica_shard1_replica_n1/)
>  has all 2 docs 616606 INFO (qtp1091538342-13057-null-11348) 
> [n:127.0.0.1:38207_solr c:pull_replica_test_kill_pull_replica s:shard1 
> r:core_node4 x:pull_replica_test_kill_pull_replica_shard1_replica_p2 
> t:null-11348] o.a.s.c.S.Request webapp=/solr path=/select 
> params={q=*:*&wt=javabin&version=2} rid=null-11348 hits=2 status=0 QTime=0 
> 616607 INFO 
> (TEST-TestPullReplica.testKillPullReplica-seed#[F30CC837FDD0DC28]) [n: c: s: 
> r: x: t:] o.a.s.c.TestPullReplica Replica core_node4 
> (https://127.0.0.1:38207/solr/pull_replica_test_kill_pull_replica_shard1_replica_p2/)
>  has all 2 docs{code}
>  
> *Where is the issue then?*
> The logs show that after the PULL replica was restarted, the recovery 
> process started and, after fetching all the file info from the NRT, 
>

Re: [PR] SOLR-17453: Leverage waitForState() instead of busy waiting [solr]

2024-11-04 Thread via GitHub



psalagnac commented on PR #2737:
URL: https://github.com/apache/solr/pull/2737#issuecomment-2454956000

   > One thought, is there a way to enforce the use of waitForState() pattern 
via any of our code quality tools?
   
   I'm not sure how we could automate the decision on whether a given usage of 
`Timeout` is legitimate. We should use `waitForState()` instead of busy waiting 
for changes in ZooKeeper, so that we leverage the registered watchers. There 
are other cases, mostly Solr-to-Solr requests, where we should keep `Timeout`.
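The difference between busy waiting and a watcher-driven wait can be sketched in plain Java. This is only an illustration under stated assumptions: `WatchedState` and its methods are hypothetical names, not Solr's `ZkStateReader` API, and a string stands in for the watched state.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Hypothetical stand-in for watched cluster state; not Solr's ZkStateReader. */
public class WatchedState {
  private final List<Consumer<String>> watchers = new CopyOnWriteArrayList<>();
  private volatile String state = "down";

  /** Publishes a new state and notifies every registered watcher. */
  public void set(String newState) {
    state = newState;
    for (Consumer<String> w : watchers) {
      w.accept(newState);
    }
  }

  /**
   * Blocks until the predicate matches or the timeout expires, without polling.
   * Returns true if the state was reached, false on timeout or interruption.
   */
  public boolean waitForState(Predicate<String> predicate, long timeoutMs) {
    CountDownLatch reached = new CountDownLatch(1);
    Consumer<String> watcher = s -> {
      if (predicate.test(s)) {
        reached.countDown();
      }
    };
    watchers.add(watcher);
    try {
      // Check the current value only after registering, so an update that
      // lands in between is caught by either this check or the watcher.
      if (predicate.test(state)) {
        return true;
      }
      return reached.await(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      watchers.remove(watcher);
    }
  }

  public static void main(String[] args) {
    WatchedState ws = new WatchedState();
    new Thread(() -> ws.set("active")).start();
    System.out.println("reached: " + ws.waitForState("active"::equals, 5000));
  }
}
```

The waiting thread sleeps on the latch until a notification arrives, so there is no sleep-and-recheck loop to tune. A plain timeout is still the natural tool when there is nothing to watch, which matches the Solr-to-Solr request case mentioned above.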


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

Re: [PR] SOLR-17535 ClusterState.collectionStream() in lieu of getCollectionStates() [solr]

2024-11-04 Thread via GitHub



murblanc commented on code in PR #2834:
URL: https://github.com/apache/solr/pull/2834#discussion_r1827321369


##
solr/solrj/src/java/org/apache/solr/common/cloud/ClusterState.java:
##
@@ -400,29 +402,14 @@ public Set getHostAllowList() {
 return hostAllowList;
   }
 
-  /**
-   * Iterate over collections. Unlike {@link #getCollectionStates()} 
collections passed to the
-   * consumer are guaranteed to exist.
-   *
-   * @param consumer collection consumer.
-   */
+  /** Streams the resolved DocCollections. Use this sparingly in case there 
are many collections. */

Review Comment:
   Better to use Javadoc tags when referring to other classes.




Re: [PR] SOLR-17535 ClusterState.collectionStream() in lieu of getCollectionStates() [solr]

2024-11-04 Thread via GitHub



murblanc commented on code in PR #2834:
URL: https://github.com/apache/solr/pull/2834#discussion_r1827926712


##
solr/core/src/java/org/apache/solr/handler/designer/SchemaDesignerConfigSetHelper.java:
##
@@ -168,24 +167,12 @@ Map analyzeField(String configSet, String 
fieldName, String fiel
   }
 
   List listCollectionsForConfig(String configSet) {
-final List collections = new ArrayList<>();
-Map states =
-zkStateReader().getClusterState().getCollectionStates();
-for (Map.Entry e : states.entrySet()) {
-  final String coll = e.getKey();
-  if (coll.startsWith(DESIGNER_PREFIX)) {
-continue; // ignore temp
-  }
-
-  try {
-if (configSet.equals(e.getValue().get().getConfigName()) && 
e.getValue().get() != null) {
-  collections.add(coll);
-}
-  } catch (Exception exc) {
-log.warn("Failed to get config name for {}", coll, exc);
-  }
-}
-return collections;
+return zkStateReader()
+.getClusterState()
+.collectionStream()

Review Comment:
   My comment is unrelated to the name filter.
   This code collects collection names. The change forces it to read the 
`state.json` of the collections even though it doesn't need the additional 
info. The behavior also changes: partially created collections will no longer 
be considered, whereas previously they were.
   
   I don't know what this specific class does with the collection name, but 
since this PR deprecates a method without providing an alternate way of 
achieving the same result (with comparable performance), I stand by my comment 
to add `getCollectionNames()` in this PR.
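The cost difference this comment refers to can be illustrated with a small model of lazily resolved references. It is a sketch under stated assumptions, not Solr's `ClusterState`: `LazyCollections`, `resolveAll()` and `resolveCount()` are hypothetical names, and a counter stands in for reading a collection's `state.json`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.stream.Collectors;

/** Hypothetical model of lazily resolved collection references; not Solr's ClusterState. */
public class LazyCollections {
  private final Map<String, Supplier<String>> refs = new LinkedHashMap<>();
  private final AtomicInteger resolves = new AtomicInteger();

  /** Registers a collection whose config name is only available by resolving its reference. */
  public void add(String name, String configName) {
    refs.put(name, () -> {
      resolves.incrementAndGet(); // stands in for reading state.json
      return configName;
    });
  }

  /** Names only: no reference is resolved. */
  public Set<String> getCollectionNames() {
    return refs.keySet();
  }

  /** Resolves every reference, as a stream of resolved collections would. */
  public List<String> resolveAll() {
    return refs.values().stream().map(Supplier::get).collect(Collectors.toList());
  }

  /** Number of (simulated) state reads performed so far. */
  public int resolveCount() {
    return resolves.get();
  }

  public static void main(String[] args) {
    LazyCollections c = new LazyCollections();
    c.add("a", "cfg1");
    c.add("b", "cfg2");
    System.out.println(c.getCollectionNames() + " reads=" + c.resolveCount());
    System.out.println(c.resolveAll() + " reads=" + c.resolveCount());
  }
}
```

In this model, listing names costs no reads, while resolving costs one read per collection. Whether a given caller needs the resolved data is a separate question for each call site.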




[jira] [Commented] (SOLR-17497) Pull replicas throws AlreadyClosedException

2024-11-04 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895334#comment-17895334
 ] 

ASF subversion and git services commented on SOLR-17497:


Commit 4c211ba5be43c63d3b9a7ccf5810b4aa73960e80 in solr's branch 
refs/heads/branch_9_7 from Sanjay Dutt
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=4c211ba5be4 ]

SOLR-17448 SOLR-17497: IndexFetcher, catch exception instead of bubbling up 
uncaught (#2800)

(cherry picked from commit cc30093c5ee988555389b50cf2333edf743bb50f)



[jira] [Commented] (SOLR-17448) Audit usage of ExecutorService#submit in Solr codebase

2024-11-04 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-17448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895333#comment-17895333
 ] 

ASF subversion and git services commented on SOLR-17448:


Commit 4c211ba5be43c63d3b9a7ccf5810b4aa73960e80 in solr's branch 
refs/heads/branch_9_7 from Sanjay Dutt
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=4c211ba5be4 ]

SOLR-17448 SOLR-17497: IndexFetcher, catch exception instead of bubbling up 
uncaught (#2800)

(cherry picked from commit cc30093c5ee988555389b50cf2333edf743bb50f)


> Audit usage of ExecutorService#submit in Solr codebase
> --
>
> Key: SOLR-17448
> URL: https://issues.apache.org/jira/browse/SOLR-17448
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 9.7
>Reporter: Andrey Bozhko
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 9.8
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> There are quite a few places in Solr codebase where the background task is 
> created by invoking `ExecutorService#submit(...)` method - but where the 
> reference to the returned future is not retained.
> So if the background task fails for any reason, and the task doesn't itself 
> have a try-catch block to log the failure, - the failure will go completely 
> unnoticed.
>  
> This ticket is to review the usage of ExecutorService#submit method in the 
> codebase, and replace those with Executor#execute where appropriate.
>  
> Originally brought up in the dev mailing list: 
> [https://lists.apache.org/thread/5f1965rltcspgw0j8nzcn2qnz9l4s8qm]
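The behavior described in this ticket can be reproduced with the JDK alone. The sketch below is illustrative only: `SubmitVsExecute` and `runFailingTask` are made-up names, and the uncaught-exception handler stands in for whatever logging a real thread pool would do.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class SubmitVsExecute {

  /**
   * Runs one failing task through submit() or execute() and returns the
   * exception that reached the worker thread's uncaught-exception handler,
   * or null if nothing reached it.
   */
  public static Throwable runFailingTask(boolean useSubmit) {
    AtomicReference<Throwable> seen = new AtomicReference<>();
    List<Thread> workers = new CopyOnWriteArrayList<>();
    ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r);
      t.setUncaughtExceptionHandler((thread, ex) -> seen.set(ex));
      workers.add(t);
      return t;
    });
    Runnable failing = () -> {
      throw new IllegalStateException("boom");
    };
    try {
      if (useSubmit) {
        pool.submit(failing); // the returned Future is dropped, and the failure with it
      } else {
        pool.execute(failing); // the failure propagates out of the worker thread
      }
      pool.shutdown();
      pool.awaitTermination(5, TimeUnit.SECONDS);
      for (Thread t : workers) {
        t.join(5000); // the handler runs before the thread is considered dead
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return seen.get();
  }

  public static void main(String[] args) {
    System.out.println("submit:  " + runFailingTask(true));
    System.out.println("execute: " + runFailingTask(false));
  }
}
```

With `submit()`, the exception is captured inside the returned `Future` and surfaces only if someone calls `get()` on it. With `execute()`, it reaches the thread's uncaught-exception handler, so the failure is at least visible.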




Re: [PR] SOLR-17535 ClusterState.collectionStream() in lieu of getCollectionStates() [solr]

2024-11-04 Thread via GitHub



dsmiley commented on code in PR #2834:
URL: https://github.com/apache/solr/pull/2834#discussion_r182794


##
solr/core/src/java/org/apache/solr/handler/designer/SchemaDesignerConfigSetHelper.java:
##
@@ -168,24 +167,12 @@ Map analyzeField(String configSet, String 
fieldName, String fiel
   }
 
   List listCollectionsForConfig(String configSet) {
-final List collections = new ArrayList<>();
-Map states =
-zkStateReader().getClusterState().getCollectionStates();
-for (Map.Entry e : states.entrySet()) {
-  final String coll = e.getKey();
-  if (coll.startsWith(DESIGNER_PREFIX)) {
-continue; // ignore temp
-  }
-
-  try {
-if (configSet.equals(e.getValue().get().getConfigName()) && 
e.getValue().get() != null) {
-  collections.add(coll);
-}
-  } catch (Exception exc) {
-log.warn("Failed to get config name for {}", coll, exc);
-  }
-}
-return collections;
+return zkStateReader()
+.getClusterState()
+.collectionStream()

Review Comment:
   The DocCollection was being read before as well.  It's where the configSet 
name is.




Re: [PR] SOLR-16116: Use apache curator to manage the Solr Zookeeper interactions [solr]

2024-11-04 Thread via GitHub



HoustonPutman commented on PR #760:
URL: https://github.com/apache/solr/pull/760#issuecomment-2455058672

   Yes, that is absolutely related, but I thoroughly tested that before 
pushing 🙄 I'll take a look.



Re: [PR] SOLR-17535 ClusterState.collectionStream() in lieu of getCollectionStates() [solr]

2024-11-04 Thread via GitHub



dsmiley commented on code in PR #2834:
URL: https://github.com/apache/solr/pull/2834#discussion_r1827951977


##
solr/solrj/src/java/org/apache/solr/common/cloud/ClusterState.java:
##
@@ -382,6 +383,7 @@ void setLiveNodes(Set liveNodes) {
* Be aware that this may return collections which may not exist now. You 
can confirm that this
* collection exists after verifying CollectionRef.get() != null
*/
+  @Deprecated // see collectionStream()

Review Comment:
   Indeed; this PR shall not be merged until getCollectionNames is (today!)




[jira] [Updated] (SOLR-17487) Can't POST a dense vector that contains two or more occurences of the same float value

2024-11-04 Thread Pierre Salagnac (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pierre Salagnac updated SOLR-17487:
---
Attachment: image.png

> Can't POST a dense vector that contains two or more occurences of the same 
> float value
> --
>
> Key: SOLR-17487
> URL: https://issues.apache.org/jira/browse/SOLR-17487
> Project: Solr
>  Issue Type: Bug
>  Components: UpdateRequestProcessors
>Affects Versions: 9.7, 9.6.1
>Reporter: Guillaume Jactat
>Priority: Major
> Attachments: image-2024-10-10-18-05-01-195.png, 
> image-2024-10-10-18-07-14-904.png, image-2024-10-10-18-07-19-370.png, 
> image-2024-10-10-23-27-26-566.png, image.png, vector-384.json, 
> vector-384.xml, vector-768.json
>
>
> *EDIT 10/10/2024* : 
> After a detailed analysis of the problematic vectors, I found that the 
> “missing” dimensions were actually dimensions of the same value.
> In concrete terms, the values present several times in the posted vectors are 
> deduplicated by Solr.
> You can see for yourself that the vectors supplied as attachments have the 
> common characteristic of containing {*}two or more occurrences of the very 
> same float value{*}. The embedding model I use (all-minilm:33m) seems to 
> generate many such cases. 
> It seems that {*}Solr only takes into account the first occurrence of these 
> values{*}. As a result, the length of the final vector is no longer correct.
> The following screenshot shows exactly what happens with a smaller vector 
> field type of size 5: the vector [1, 5, 3, 4, 5] becomes [1, 5, 3, 4].
> !image-2024-10-10-23-27-26-566.png!
>  
> -
> Hello,
>  
> I'm using Solr 9.7 as a vector database. I've come across something I can't 
> explain : I POST my documents as JSON and I've got a vector field of 
> dimension {*}768{*}.
>  
> The JSON document I POST has a vector field, which is an array of length 768. 
> Each value is a float.
>  
> Solr complains that my array is only *767* long...
> I've compared the JSON I POST with the array parsed by Solr and written in 
> the logs, and indeed one of the 768 values has simply disappeared in the 
> process.
>  
> The problem can easily be reproduced. All you have to do is :
>  * In your "schema.xml", declare the following dense vector field type :
> {code:java}
>  vectorDimension="768" similarityFunction="cosine"/>{code}
>  * In your schema.xml, declare the following dense vector dynamic field :
> {code:java}
>  stored="true"/>{code}
>  * Use the Solr Admin UI to post the *attached document* to your Solr core.
>  * You should get the following error : "{*}incorrect vector dimension. The 
> vector value has size 767 while it is expected a vector with size 768"{*}
>  
>  * Furthermore, while the POSTed vector has size 768, the vector written in 
> the logs has only 767 values... One value is missing. You can easily spot the 
> missing value with a simple diff.
> Maybe someone will find the reason why this specific vector leads to this 
> issue. Of course, I have plenty of other documents that get indexed without 
> any issue.
> In case it helps, the value that disappears from the 768 vector is 
> "0.0335415453". It's the 384th dimension (starting from 1)
> !image-2024-10-10-18-07-19-370.png!
> Thanks for reading
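The symptom reported above ([1, 5, 3, 4, 5] becoming [1, 5, 3, 4]) is what happens whenever ordered values pass through a set-like structure. The sketch below only demonstrates that general effect. It is an assumption for illustration, not a confirmed description of the Solr code path involved, and `VectorDedup` is a made-up name.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class VectorDedup {
  /** Copies values through a set, which silently drops repeated values. */
  public static List<Float> viaSet(List<Float> vector) {
    return new ArrayList<>(new LinkedHashSet<>(vector));
  }

  /** Copies values through a list, which keeps every position. */
  public static List<Float> viaList(List<Float> vector) {
    return new ArrayList<>(vector);
  }

  public static void main(String[] args) {
    List<Float> vector = List.of(1f, 5f, 3f, 4f, 5f);
    System.out.println(viaSet(vector));  // 4 values: the second 5.0 is gone
    System.out.println(viaList(vector)); // 5 values
  }
}
```

A dense vector is positional, so any intermediate step that treats its values as a set changes the dimension as soon as one value repeats.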




[jira] [Updated] (SOLR-17487) Can't POST a dense vector that contains two or more occurences of the same float value

2024-11-04 Thread Pierre Salagnac (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pierre Salagnac updated SOLR-17487:
---
Attachment: (was: image.png)

> Can't POST a dense vector that contains two or more occurences of the same 
> float value
> --
>
> Key: SOLR-17487
> URL: https://issues.apache.org/jira/browse/SOLR-17487
> Project: Solr
>  Issue Type: Bug
>  Components: UpdateRequestProcessors
>Affects Versions: 9.7, 9.6.1
>Reporter: Guillaume Jactat
>Priority: Major
> Attachments: image-2024-10-10-18-05-01-195.png, 
> image-2024-10-10-18-07-14-904.png, image-2024-10-10-18-07-19-370.png, 
> image-2024-10-10-23-27-26-566.png, vector-384.json, vector-384.xml, 
> vector-768.json
>
>




Re: [PR] SOLR-17516: LBHttpSolrClient: support HttpJdkSolrClient (Generic Version) [solr]

2024-11-04 Thread via GitHub



dsmiley commented on code in PR #2828:
URL: https://github.com/apache/solr/pull/2828#discussion_r1827982607


##
solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBSolrClient.java:
##
@@ -699,11 +699,20 @@ public NamedList request(
 if (e.getRootCause() instanceof IOException) {
   ex = e;
   moveAliveToDead(wrapper);
-  if (justFailed == null) justFailed = new HashMap<>();
+  if (justFailed == null) {
+justFailed = new HashMap<>();
+  }

Review Comment:
   It's a shame to see duplication in getRootCause IOException detection with 
the dedicated clause.  Maybe it'd be cleaner to have another try-catch just 
around calling request() above that detects the IOException and throws it?  
Just a thought; I leave it to you.
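One possible shape for that suggestion, sketched with JDK types only. The names `RootCauseIo`, `rootCause` and `call` are hypothetical, and `Callable` stands in for the actual `request()` invocation, so this is not the code under review.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RootCauseIo {
  /** Walks the cause chain to its last element. */
  public static Throwable rootCause(Throwable t) {
    Throwable cur = t;
    while (cur.getCause() != null) {
      cur = cur.getCause();
    }
    return cur;
  }

  /**
   * Runs the request and, if the failure is rooted in an IOException,
   * rethrows that IOException so callers need only one catch clause for it.
   * Any other failure is rethrown unchanged.
   */
  public static <T> T call(Callable<T> request) throws Exception {
    try {
      return request.call();
    } catch (Exception e) {
      Throwable root = rootCause(e);
      if (root instanceof IOException) {
        throw (IOException) root;
      }
      throw e;
    }
  }
}
```

The trade-off in this sketch is that the wrapping exception is discarded. If its message or stack trace matters, it could be attached to the rethrown `IOException` with `addSuppressed`.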




Re: [PR] SOLR-17538: CloudHttp2SolrClient needs a custom ClusterStateProvider option [solr]

2024-11-04 Thread via GitHub



dsmiley commented on code in PR #2832:
URL: https://github.com/apache/solr/pull/2832#discussion_r1827994098


##
solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudHttp2SolrClient.java:
##
@@ -233,6 +240,11 @@ public Builder(List zkHosts, Optional 
zkChroot) {
   if (zkChroot.isPresent()) this.zkChroot = zkChroot.get();
 }
 
+/* for an expert use-case */

Review Comment:
   Missing another asterisk




Re: [PR] Prevent conflicting connection flags from being used via OptionGroup [solr]

2024-11-04 Thread via GitHub



malliaridis commented on PR #2725:
URL: https://github.com/apache/solr/pull/2725#issuecomment-2455326616

   @epugh this one is ready I believe. The merge with main was ugly. >.<
   
   I ended up with more lines added than removed, even though I have simplified 
and removed redundant elements. This is probably only because of the extraction 
of options to a separate file or variables.
   
   I made a few cleanups and migrations for consistency and also fixed a bug in 
StatusTool (see 31b1becd357e3f6d22950e9868cd1ff07686b129). Now we are using 
`getOptionValue` only with `Option` as parameter, not strings. And I also 
migrated to `getParsedOptionValue` wherever possible (booleans are not 
supported and the File parsing seems buggy).



Re: [PR] Prevent conflicting connection flags from being used via OptionGroup [solr]

2024-11-04 Thread via GitHub



malliaridis commented on PR #2725:
URL: https://github.com/apache/solr/pull/2725#issuecomment-2455332282

   I didn't find any mutually exclusive options that would call for groups. 
Groups were relevant for the deprecated options, to simplify `getOptionValue` 
with a single option group. But since we have removed the deprecated options, 
we are back to using only `Option`s.



[jira] [Commented] (SOLR-15929) Clean up NOTICE and LICENSE files for Solr

2024-11-04 Thread Christos Malliaridis (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-15929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895377#comment-17895377
 ] 

Christos Malliaridis commented on SOLR-15929:
-

We should also make sure that all files are up-to-date. I noticed a few 
differences in some libraries.

Since we may not have a single source of truth for these files, we may end up 
with inconsistencies anyway, but standardizing the approach of fetching the 
content of these files should help with that (more of a documentation task 
probably).

> Clean up NOTICE and LICENSE files for Solr
> --
>
> Key: SOLR-15929
> URL: https://issues.apache.org/jira/browse/SOLR-15929
> Project: Solr
>  Issue Type: Improvement
>Reporter: Jan Høydahl
>Priority: Major
>
> Spinoff from SOLR-15862 and SOLR-2406:
> We need a total cleanup of both these files
>  * Move lots of (C) notices from NOTICE to LICENSE file
>  * Cross-check that we list all dependencies, and that removed deps (such as 
> for DIH etc) are removed from NOTICE/LICENSE
> I wonder if 
> [https://github.com/apache/solr/blob/main/solr/licenses/README.committers.txt]
>  should also be relocated to either `dev-docs/` or `help/` to make it easier 
> to find. It is hard to get the license/notice stuff right, so we need a good 
> guide for committers!
> See [https://infra.apache.org/licensing-howto.html] for the requirements. PS: 
> Any preference whether we should rename the files without {{.txt}} suffix?
> Also, our source and binary distributions are quite different, and would 
> ideally have different LICENSE and NOTICE files compared to the binary 
> distro. I think the Apache Whisker tool could potentially help with this 
> [https://creadur.apache.org/whisker/index.html] but have not looked deeply.




[jira] [Created] (SOLR-17542) AccessControlException when attempting to post document

2024-11-04 Thread Ivo Janssen (Jira)

Ivo Janssen created SOLR-17542:
--

 Summary: AccessControlException when attempting to post document
 Key: SOLR-17542
 URL: https://issues.apache.org/jira/browse/SOLR-17542
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: contrib - Solr Cell (Tika extraction), security
Affects Versions: 9.7
 Environment: * Solr 9.7
 * MacOS 15.0.1
 * M1 Max CPU
 * 64GB RAM
Reporter: Ivo Janssen


I'm using Solr 9.7 on MacOS 15.0.1, with Cell enabled, and it returns a 500 
error when I try to add a document. The error on Solr's side is as follows:

{noformat}
2024-10-18 00:49:03.350 INFO  (qtp1955990522-40-localhost-1) [c: s: r: 
x:test_docstore t:localhost-1] o.a.s.c.PluginBag Going to create a new 
requestHandler with {type = requestHandler,name = /update/extract,class = 
solr.extraction.ExtractingRequestHandler,attributes = {startup=lazy, 
name=/update/extract, class=solr.extraction.ExtractingRequestHandler},args = 
{defaults={fmap.Last-Modified=last_modified, uprefix=ignored_, df=_text_}}}
2024-10-18 00:49:03.653 ERROR (qtp1955990522-40-localhost-1) [c: s: r: 
x:test_docstore t:localhost-1] o.a.s.s.HttpSolrCall 500 Exception => 
java.lang.IllegalStateException: java.security.AccessControlException: access 
denied ("java.io.FilePermission" 
"/private/var/folders/8y/0166d0yx0wd7lxycs42l6t9cgs/T/jetty-127_0_0_1-8983-webapp-_solr-any-16097010865664396603"
 "read")
at 
org.eclipse.jetty.server.MultiPartFormInputStream.throwIfError(MultiPartFormInputStream.java:526)
java.lang.IllegalStateException: java.security.AccessControlException: access 
denied ("java.io.FilePermission" 
"/private/var/folders/8y/0166d0yx0wd7lxycs42l6t9cgs/T/jetty-127_0_0_1-8983-webapp-_solr-any-16097010865664396603"
 "read")
at 
org.eclipse.jetty.server.MultiPartFormInputStream.throwIfError(MultiPartFormInputStream.java:526)
 ~[jetty-server-10.0.22.jar:10.0.22]
at 
org.eclipse.jetty.server.MultiPartFormInputStream.getParts(MultiPartFormInputStream.java:491)
 ~[jetty-server-10.0.22.jar:10.0.22]
at 
org.eclipse.jetty.server.MultiParts$MultiPartsHttpParser.getParts(MultiParts.java:90)
 ~[jetty-server-10.0.22.jar:10.0.22]
at org.eclipse.jetty.server.Request.getParts(Request.java:2354) 
~[jetty-server-10.0.22.jar:10.0.22]
at org.eclipse.jetty.server.Request.getParts(Request.java:2328) 
~[jetty-server-10.0.22.jar:10.0.22]
at 
javax.servlet.http.HttpServletRequestWrapper.getParts(HttpServletRequestWrapper.java:317)
 ~[jetty-servlet-api-4.0.6.jar:?]
at 
org.apache.solr.servlet.SolrRequestParsers$MultipartRequestParser.parseParamsAndFillStreams(SolrRequestParsers.java:649)
 ~[?:?]
at 
org.apache.solr.servlet.SolrRequestParsers$StandardRequestParser.parseParamsAndFillStreams(SolrRequestParsers.java:893)
 ~[?:?]
at 
org.apache.solr.servlet.SolrRequestParsers.parse(SolrRequestParsers.java:169) 
~[?:?]
at org.apache.solr.servlet.HttpSolrCall.init(HttpSolrCall.java:313) ~[?:?]
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:524) ~[?:?]
at 
org.apache.solr.servlet.SolrDispatchFilter.dispatch(SolrDispatchFilter.java:251)
 ~[?:?]
at 
org.apache.solr.servlet.SolrDispatchFilter.lambda$doFilter$0(SolrDispatchFilter.java:208)
 ~[?:?]
at 
org.apache.solr.servlet.ServletUtils.traceHttpRequestExecution2(ServletUtils.java:243)
 ~[?:?]
at org.apache.solr.servlet.ServletUtils.rateLimitRequest(ServletUtils.java:213) 
~[?:?]
{noformat}

I've confirmed that this is related to the security policy, since I'm able to 
work around it by running Solr with `-Djava.security.manager=allow`, but 
looking at the policy nothing jumps out at me for being wrong or missing.

[Link to discussion on the mailing 
list|https://lists.apache.org/thread/8grxnpnxtyb2c1wb4j4vpl88vktzfy13] 
(disregard my attempted fix in that thread - it was incorrect)




Re: [PR] SOLR-17534 ClusterState.getCollectionNames [solr]

2024-11-04 Thread via GitHub



dsmiley merged PR #2826:
URL: https://github.com/apache/solr/pull/2826



Re: [PR] Update org.hamcrest:* to v3 (major) [solr]

2024-11-04 Thread via GitHub



malliaridis merged PR #2617:
URL: https://github.com/apache/solr/pull/2617



[jira] [Commented] (SOLR-7871) Platform independent config file instead of solr.in.sh and solr.in.cmd

2024-11-04 Thread Eric Pugh (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895278#comment-17895278
 ] 

Eric Pugh commented on SOLR-7871:
-

Hi all. I'd like to take another, fresh run at this idea.   I'm going to 
come from the perspective that I've formed from using Ruby on Rails, that you 
have a configuration that establishes what your environment looks like.  To 
learn more about how Rails does it, check out this (GPT) summary: 
[https://poe.com/s/MvWTBLoLz7S60kCKaKPE]

 

What does this look like for Solr?

We introduce two default files, development.yml and production.yml.   They 
live in the ./server/solr/environments directory.   When you run bin/solr 
start, you automagically read in development.yml.   When you run bin/solr 
start -e production, you read in production.yml.   You can create your own as well.  

How do we adopt this?

Start with the bin/solr start and bin/solr stop commands.  

Ensure that bin/solr start feeds all the various environment/system properties 
into a running Solr.

Then expand to the other bin/solr commands to make sure they are using this 
configuration.

Then, deprecate in 9x solr.in.sh and solr.in.cmd.

Make sure the start service continues to function.

Then, in 10x remove solr.in.sh and solr.in.cmd in favour of the environment 
based configuration files.

 

The configuration files will list out ALL of the various system properties with 
some short comments, and have them enabled or disabled as makes sense for that 
environment.  
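
A rough sketch of how the lookup described above might work, under stated assumptions: it uses `java.util.Properties` as a stand-in for YAML (the JDK has no YAML parser), and the class name, the `.properties` suffix, and the "explicit -D flags win" rule are all hypothetical choices for illustration, not anything decided on this ticket.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

final class SolrEnvironmentConfig {
  static final String DEFAULT_ENVIRONMENT = "development";

  /** Maps the value of the proposed -e flag to a file; null or blank means the default. */
  static Path fileFor(Path environmentsDir, String environment) {
    String name =
        (environment == null || environment.trim().isEmpty())
            ? DEFAULT_ENVIRONMENT
            : environment.trim();
    return environmentsDir.resolve(name + ".properties");
  }

  /** Parses key=value settings; lines starting with # are comments. */
  static Properties parse(Reader reader) {
    Properties props = new Properties();
    try {
      props.load(reader);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }

  /** Reads the configuration file for the given environment. */
  static Properties load(Path environmentsDir, String environment) {
    Path file = fileFor(environmentsDir, environment);
    try (Reader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
      return parse(reader);
    } catch (IOException e) {
      throw new UncheckedIOException("Cannot read " + file, e);
    }
  }

  /** Copies settings into system properties without overriding explicit -D flags. */
  static void applyAsSystemProperties(Properties config) {
    for (String key : config.stringPropertyNames()) {
      if (System.getProperty(key) == null) {
        System.setProperty(key, config.getProperty(key));
      }
    }
  }
}
```

Under these assumptions, bin/solr start would resolve development.properties and bin/solr start -e production would resolve production.properties from the same directory, on every platform.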

> Platform independent config file instead of solr.in.sh and solr.in.cmd
> --
>
> Key: SOLR-7871
> URL: https://issues.apache.org/jira/browse/SOLR-7871
> Project: Solr
>  Issue Type: Improvement
>  Components: scripts and tools
>Affects Versions: 5.2.1
>Reporter: Jan Høydahl
>Priority: Major
>  Labels: bin/solr
> Attachments: SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, 
> SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, 
> SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, 
> SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch
>
>
> Spinoff from SOLR-7043
> The config files {{solr.in.sh}} and {{solr.in.cmd}} are currently executable 
> batch files, but all they do is to set environment variables for the start 
> scripts on the format {{key=value}}
> Suggest to instead have one central platform independent config file e.g. 
> {{bin/solr.yml}} or {{bin/solrstart.properties}} which is parsed by 
> {{SolrCLI.java}}.




Re: [PR] SOLR-17256: Remove usages of deprecated SolrRequest.setBasePath [solr]

2024-11-04 Thread via GitHub



gerlowskija commented on code in PR #2811:
URL: https://github.com/apache/solr/pull/2811#discussion_r1827843121


##
solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBSolrClient.java:
##
@@ -480,14 +480,35 @@ private static boolean isTimeExceeded(long 
timeAllowedNano, long timeOutTime) {
 return timeAllowedNano > 0 && System.nanoTime() > timeOutTime;
   }
 
+  private NamedList doMakeRequest(Endpoint endpoint, SolrRequest 
solrRequest)
+  throws SolrServerException, IOException {
+final var solrClient = getClient(endpoint);
+return doMakeRequest(solrClient, endpoint.getBaseUrl(), 
endpoint.getCore(), solrRequest);
+  }
+
+  // TODO This special casing can be removed if either: (1) SOLR-16367 is 
completed, or (2)
+  // LBHttp2SolrClient.getClient() is modified to return a client already 
pointed at the correct URL
+  private NamedList doMakeRequest(
+  SolrClient solrClient, String baseUrl, String collection, SolrRequest 
solrRequest)
+  throws SolrServerException, IOException {
+// Some implementations of LBSolrClient.getClient(...) return a 
Http2SolrClient that may not be
+// pointed at the desired URL (or any URL for that matter).  We special 
case that here to ensure
+// the appropriate URL is provided.
+if (solrClient instanceof Http2SolrClient) {
+  final var httpSolrClient = (Http2SolrClient) solrClient;
+  return httpSolrClient.requestWithBaseUrl(baseUrl, (c) -> 
c.request(solrRequest, collection));
+}
+
+return solrClient.request(solrRequest, collection);

Review Comment:
   Great - just created SOLR-17541, and updated the TODO comment around this 
method accordingly




Re: [PR] SOLR-17256: Remove usages of deprecated SolrRequest.setBasePath [solr]

2024-11-04 Thread via GitHub



gerlowskija commented on PR #2811:
URL: https://github.com/apache/solr/pull/2811#issuecomment-2454900565

   > I know it's out of scope... sort of... but @iamsanjay recently updated 
SolrClientNodeStateProvider.invoke to not call setBasePath but the result is 
more lines of code / complexity than needed.
   
   I haven't run tests to validate the change yet, but I've included this for 
now.  Will walk it back if it causes any complications though...



Re: [PR] SOLR-17453: Leverage waitForState() instead of busy waiting [solr]

2024-11-04 Thread via GitHub



psalagnac commented on code in PR #2737:
URL: https://github.com/apache/solr/pull/2737#discussion_r1827843673


##
solr/core/src/java/org/apache/solr/cloud/api/collections/CreateCollectionCmd.java:
##
@@ -221,24 +223,19 @@ public void call(ClusterState clusterState, ZkNodeProps 
message, NamedList

Re: [PR] SOLR-17453: Leverage waitForState() instead of busy waiting [solr]

2024-11-04 Thread via GitHub



psalagnac commented on code in PR #2737:
URL: https://github.com/apache/solr/pull/2737#discussion_r1827855744


##
solr/test-framework/src/java/org/apache/solr/cloud/AbstractDistribZkTestBase.java:
##
@@ -242,45 +240,15 @@ public static void waitForCollectionToDisappear(
 log.info("Collection has disappeared - collection:{}", collection);
   }
 
-  static void waitForNewLeader(
-  CloudSolrClient cloudClient, String shardName, Replica oldLeader, 
TimeOut timeOut)
+  static void waitForNewLeader(CloudSolrClient cloudClient, String shardName, 
Replica oldLeader)
   throws Exception {
-log.info("Will wait for a node to become leader for {} secs", 
timeOut.timeLeft(SECONDS));
+log.info("Will wait for a node to become leader for 15 secs");
 ZkStateReader zkStateReader = ZkStateReader.from(cloudClient);
-zkStateReader.forceUpdateCollection(DEFAULT_COLLECTION);
-
-for (; ; ) {
-  ClusterState clusterState = zkStateReader.getClusterState();
-  DocCollection coll = clusterState.getCollection("collection1");
-  Slice slice = coll.getSlice(shardName);
-  if (slice.getLeader() != null
-  && !slice.getLeader().equals(oldLeader)
-  && slice.getLeader().getState() == Replica.State.ACTIVE) {
-if (log.isInfoEnabled()) {
-  log.info(
-  "Old leader {}, new leader {}. New leader got elected in {} ms",
-  oldLeader,
-  slice.getLeader(),
-  timeOut.timeElapsed(MILLISECONDS));
-}
-break;
-  }
-
-  if (timeOut.hasTimedOut()) {

Review Comment:
   Good point. But this is at the cost of potentially slower test execution 
since we don't unblock the test thread when we receive the ZK watch.
   
   We would get the same if we call `logThreadDumps(...)` and 
`printLayoutToStream(...)` in case of error. I can do such a change.
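
For readers less familiar with the trade-off, here is a small stdlib-only model of notification-based waiting. `WatchedState` is a hypothetical class, not `ZkStateReader`; it only shows that a waiter blocked on a monitor wakes as soon as the state changes, whereas a sleep-based polling loop wakes only at its next interval. On timeout it reports the last state seen, which is where extra diagnostics could be attached.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Predicate;

/** Hypothetical holder whose waiters are woken on every update (models a watch). */
final class WatchedState<T> {
  private T current;

  WatchedState(T initial) {
    this.current = initial;
  }

  /** Publishes a new state and wakes every thread blocked in waitForState. */
  synchronized void update(T next) {
    current = next;
    notifyAll();
  }

  /**
   * Blocks until the predicate matches the current state or the timeout expires.
   * Returns the matching state; the timeout exception carries the last state seen.
   */
  synchronized T waitForState(Predicate<T> predicate, long timeout, TimeUnit unit)
      throws InterruptedException, TimeoutException {
    long deadline = System.nanoTime() + unit.toNanos(timeout);
    while (!predicate.test(current)) {
      long remainingNanos = deadline - System.nanoTime();
      if (remainingNanos <= 0) {
        throw new TimeoutException("State never matched; last seen: " + current);
      }
      // Releases the monitor while waiting, so update() can run and notify us.
      TimeUnit.NANOSECONDS.timedWait(this, remainingNanos);
    }
    return current;
  }
}
```

The loop re-checks the predicate after every wake-up, so spurious wake-ups and unrelated updates are harmless.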




Re: [PR] SOLR-17453: Leverage waitForState() instead of busy waiting [solr]

2024-11-04 Thread via GitHub



psalagnac commented on code in PR #2737:
URL: https://github.com/apache/solr/pull/2737#discussion_r1827860425


##
solr/core/src/test/org/apache/solr/cloud/TestRebalanceLeaders.java:
##
@@ -604,74 +572,61 @@ private void forceUpdateCollectionStatus() {
 
   // Since we have to restart jettys, we don't want to try re-balancing etc. 
until we're sure all
   // jettys that should be up are and all replicas are active.
-  private void checkReplicasInactive(List downJettys) throws 
InterruptedException {
-TimeOut timeout = new TimeOut(timeoutMs, TimeUnit.MILLISECONDS, 
TimeSource.NANO_TIME);
-DocCollection docCollection = null;
-Set liveNodes = null;
+  private void checkReplicasInactive(List downJettys) {
 
 Set downJettyNodes = new TreeSet<>();
 for (JettySolrRunner jetty : downJettys) {
   downJettyNodes.add(
   jetty.getBaseUrl().getHost() + ":" + jetty.getBaseUrl().getPort() + 
"_solr");
 }
-while (timeout.hasTimedOut() == false) {
-  forceUpdateCollectionStatus();
-  docCollection = 
cluster.getSolrClient().getClusterState().getCollection(COLLECTION_NAME);
-  liveNodes = cluster.getSolrClient().getClusterState().getLiveNodes();
-  boolean expectedInactive = true;
-
-  for (Slice slice : docCollection.getSlices()) {
-for (Replica rep : slice.getReplicas()) {
-  if (downJettyNodes.contains(rep.getNodeName()) == false) {
-continue; // We are on a live node
-  }
-  // A replica on an allegedly down node is reported as active.
-  if (rep.isActive(liveNodes)) {
-expectedInactive = false;
+
+waitForState(
+"Waiting for all replicas to become inactive",
+COLLECTION_NAME,
+(liveNodes, docCollection) -> {
+  boolean expectedInactive = true;
+
+  for (Slice slice : docCollection.getSlices()) {
+for (Replica rep : slice.getReplicas()) {
+  if (!downJettyNodes.contains(rep.getNodeName())) {

Review Comment:
   I flipped this because it is reported by my IDE. I didn't know such a 
convention was embraced at some point. Will change back to `== false` form.




Re: [PR] SOLR-17516: LBHttpSolrClient: support HttpJdkSolrClient (Generic Version) [solr]

2024-11-04 Thread via GitHub



jdyer1 commented on code in PR #2828:
URL: https://github.com/apache/solr/pull/2828#discussion_r1827873836


##
solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpJdkSolrClient.java:
##
@@ -173,8 +173,8 @@ public NamedList request(SolrRequest 
solrRequest, String collection)
   "Timeout occurred while waiting response from server at: " + 
pReq.url, e);
 } catch (SolrException se) {
   throw se;
-} catch (RuntimeException re) {
-  throw new SolrServerException(re);
+} catch (IOException | RuntimeException e) {

Review Comment:
   Indeed, the doc-comment on the `LBSolrClient` request method says,
   
   >  If a request fails due to an IOException, the server is moved to the dead 
pool
   
   So I restored the previous behavior in `HttpJdkSolrClient` and made the 
modification in `LBSolrClient`.  This new behavior is covered by 
`LBHttp2SolrClientIntegrationTest`.
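
As a sketch of the contract quoted above, the following stdlib-only model moves an endpoint to a dead pool when a request to it fails with an `IOException` and then tries the next one. `DeadPoolBalancer` and `Transport` are hypothetical names, not SolrJ classes; letting `RuntimeException`s propagate untouched is an assumption of this sketch, and the model is not thread-safe.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class DeadPoolBalancer {
  /** Hypothetical transport; stands in for a per-endpoint client. */
  interface Transport {
    String send(String endpoint, String request) throws IOException;
  }

  private final List<String> endpoints;
  private final Set<String> deadPool = new LinkedHashSet<>();
  private final Transport transport;

  DeadPoolBalancer(List<String> endpoints, Transport transport) {
    this.endpoints = new ArrayList<>(endpoints);
    this.transport = transport;
  }

  /** Tries live endpoints in order; an IOException condemns the endpoint. */
  String request(String request) {
    IOException last = null;
    for (String endpoint : endpoints) {
      if (deadPool.contains(endpoint)) {
        continue; // skip endpoints already known to be down
      }
      try {
        return transport.send(endpoint, request);
      } catch (IOException e) {
        deadPool.add(endpoint); // I/O failure: treat the server as dead
        last = e;
      }
      // Assumption: any RuntimeException propagates and leaves the pools unchanged.
    }
    throw new IllegalStateException("No live endpoints left", last);
  }

  /** Snapshot of the endpoints currently considered dead. */
  Set<String> deadEndpoints() {
    return new LinkedHashSet<>(deadPool);
  }
}
```

This is why the exception type surfaced by the underlying client matters: if it wraps an `IOException` in something else before the balancer sees it, the endpoint never reaches the dead pool.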




[jira] [Commented] (SOLR-16790) Umbrella - Improve Solr CLI tools

2024-11-04 Thread Eric Pugh (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-16790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895260#comment-17895260
 ] 

Eric Pugh commented on SOLR-16790:
--

I want to do a bit of Jira tidying, and get rid of this issue by moving tickets 
to SOLR-16757.

> Umbrella - Improve Solr CLI tools
> -
>
> Key: SOLR-16790
> URL: https://issues.apache.org/jira/browse/SOLR-16790
> Project: Solr
>  Issue Type: Improvement
>  Components: scripts and tools
>Affects Versions: 9.2.1
>Reporter: Shawn Heisey
>Assignee: Shawn Heisey
>Priority: Major
>
> There are a lot of things we can do to improve Solr's startup.  We can use 
> this issue as an umbrella to keep track of various pieces of that work.
>  




[jira] [Resolved] (SOLR-16790) Umbrella - Improve Solr CLI tools

2024-11-04 Thread Eric Pugh (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-16790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Pugh resolved SOLR-16790.
--
Resolution: Duplicate

I am marking this as a duplicate of SOLR-16757 and have moved the child tickets 
over to that JIRA.   It's great that we are all getting the CLI to a healthy 
place in the 9x and 10x lines of Solr!

> Umbrella - Improve Solr CLI tools
> -
>
> Key: SOLR-16790
> URL: https://issues.apache.org/jira/browse/SOLR-16790
> Project: Solr
>  Issue Type: Improvement
>  Components: scripts and tools
>Affects Versions: 9.2.1
>Reporter: Shawn Heisey
>Assignee: Shawn Heisey
>Priority: Major
>
> There are a lot of things we can do to improve Solr's startup.  We can use 
> this issue as an umbrella to keep track of various pieces of that work.
>  




[jira] [Created] (SOLR-17541) LBSolrClient implementations should agree on 'getClient()' semantics

2024-11-04 Thread Jason Gerlowski (Jira)

Jason Gerlowski created SOLR-17541:
--

 Summary: LBSolrClient implementations should agree on 
'getClient()' semantics 
 Key: SOLR-17541
 URL: https://issues.apache.org/jira/browse/SOLR-17541
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrJ
Affects Versions: 9.7
Reporter: Jason Gerlowski


LBSolrClient has an abstract "getClient(String url)" method that is used to 
fetch a "Http" SolrClient appropriate for the specified URL.

But implementations of this method differ in the client that is returned.  
LBHttpSolrClient returns a client that is already pointed at the specified URL 
and can be used without modification. But LBHttp2SolrClient returns a client 
with no URL at all, which must be pointed at the right endpoint prior to 
use.  This is a bit messy, and complicates the calling code in LBSolrClient 
quite a bit.

We should choose one of these approaches and use it for all LBSolrClient 
implementations.
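
To make the two contracts concrete, here is a stdlib-only sketch. All names are hypothetical stand-ins rather than SolrJ classes, and the adapter shows just one possible way to unify the behavior, not a decision: it wraps a URL-less shared client so that callers always receive something already pointed at the requested URL.

```java
final class GetClientContracts {
  /** Hypothetical minimal client: sends a request path and returns a response. */
  interface Client {
    String request(String path);
  }

  /** Hypothetical shared client with no URL of its own; callers supply one per call. */
  static final class UnboundClient {
    String requestWithBaseUrl(String baseUrl, String path) {
      return "GET " + baseUrl + path;
    }
  }

  /**
   * Adapts the URL-less style to the bound style: the returned client can be
   * used without modification, so calling code needs no special casing.
   */
  static Client boundClient(UnboundClient shared, String baseUrl) {
    return path -> shared.requestWithBaseUrl(baseUrl, path);
  }

  public static void main(String[] args) {
    Client client = boundClient(new UnboundClient(), "http://localhost:8983/solr");
    System.out.println(client.request("/select"));
  }
}
```

The opposite choice (every implementation returns an unbound client and the caller always supplies the URL) would be equally consistent; the sketch only shows that either convention removes the `instanceof` branching in the caller.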




[PR] Backport: Update org.hamcrest:* to v3 (major) (#2617) [solr]

2024-11-04 Thread via GitHub



malliaridis opened a new pull request, #2845:
URL: https://github.com/apache/solr/pull/2845

   * Update org.hamcrest:* to v3
   
   * Update hamcrest license file
   
   -
   
   Co-authored-by: Christos Malliaridis 
   (cherry picked from commit f216c984d348c12cc4c4c24e24ee6bf014cc9b01)
   



Re: [PR] SOLR-17535 ClusterState.collectionStream() in lieu of getCollectionStates() [solr]

2024-11-04 Thread via GitHub



murblanc commented on code in PR #2834:
URL: https://github.com/apache/solr/pull/2834#discussion_r1827322144


##
solr/solrj/src/java/org/apache/solr/common/cloud/ClusterState.java:
##
@@ -400,29 +402,14 @@ public Set getHostAllowList() {
 return hostAllowList;
   }
 
-  /**
-   * Iterate over collections. Unlike {@link #getCollectionStates()} 
collections passed to the
-   * consumer are guaranteed to exist.
-   *
-   * @param consumer collection consumer.
-   */
+  /** Streams the resolved DocCollections. Use this sparingly in case there 
are many collections. */
+  public Stream collectionStream() {
+return 
collectionStates.values().stream().map(CollectionRef::get).filter(Objects::nonNull);
+  }
+
+  /** Streams the resolved DocCollections. Use this sparingly in case there 
are many collections. */

Review Comment:
   Seems that Javadoc comment was copied from above method and needs updating
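
For context on what the streamed variant does, here is a stdlib-only model of the same pattern: references are resolved as the stream is traversed, and references that resolve to null are skipped. `LazyCollections` and its `Supplier<String>` values are hypothetical stand-ins for `ClusterState` and `CollectionRef`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Supplier;
import java.util.stream.Stream;

final class LazyCollections {
  // name -> lazily resolved state; a supplier may return null if the collection is gone
  private final Map<String, Supplier<String>> refs = new LinkedHashMap<>();

  void put(String name, Supplier<String> ref) {
    refs.put(name, ref);
  }

  /** Resolves each reference during traversal and drops those that no longer exist. */
  Stream<String> collectionStream() {
    return refs.values().stream().map(Supplier::get).filter(Objects::nonNull);
  }
}
```

Because each element is resolved when the stream reaches it, a full traversal costs one resolution per collection, which is presumably why the Javadoc advises using it sparingly when there are many collections.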




[jira] [Assigned] (SOLR-17256) Remove SolrRequest.getBasePath setBasePath

2024-11-04 Thread Jason Gerlowski (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski reassigned SOLR-17256:
--

Assignee: Jason Gerlowski

> Remove SolrRequest.getBasePath setBasePath
> --
>
> Key: SOLR-17256
> URL: https://issues.apache.org/jira/browse/SOLR-17256
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrJ
>Reporter: David Smiley
>Assignee: Jason Gerlowski
>Priority: Minor
>  Labels: newdev, pull-request-available
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> SolrRequest has a getBasePath & setBasePath.  The naming is poor; it's the 
> URL base to the Solr node like "http://localhost:8983/solr".  It's only 
> recognized by HttpSolrClient; LBSolrClient (used by CloudSolrClient) ignores 
> it and will in fact mutate the passed in request to its liking, which is 
> rather ugly because it means a request cannot be used concurrently if the 
> user wants to.  But moreover I think there's a conceptual discordance of 
> placing this concept on SolrRequest given that some clients want to route 
> requests to nodes *they* choose.  I propose removing this from SolrRequest 
> and instead adding a method specific to HttpSolrClient.  Almost all existing 
> usages of setBasePath immediately execute the request on an HttpSolrClient, 
> so should be easy to change.




Re: [PR] SOLR-17256: Remove usages of deprecated SolrRequest.setBasePath [solr]

2024-11-04 Thread via GitHub



gerlowskija commented on PR #2811:
URL: https://github.com/apache/solr/pull/2811#issuecomment-2455573937

   Tests and 'check' all look good; will aim to merge tomorrow pending any last 
comments!  Thanks @dsmiley for all the review!



[jira] [Comment Edited] (SOLR-15929) Clean up NOTICE and LICENSE files for Solr

2024-11-04 Thread Jira



[ 
https://issues.apache.org/jira/browse/SOLR-15929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895429#comment-17895429
 ] 

Jan Høydahl edited comment on SOLR-15929 at 11/4/24 9:22 PM:
-

I think we may need a spreadsheet with a line per Java dependency and also the 
adminui ones, with columns for license-type, whether it has notice, whether it 
is a test only dep, whether the licenses/txt files are correct, whether it is 
part of slim distro etc.

We should also iterate all (C) notices in current LICENSE/NOTICE files to find 
which are no longer relevant and which must be retained/updated - may be some 
forked piece of source code not mentioned in version catalog.

Then we can scrap current LICENSE.txt and NOTICE.txt and make a script that 
generates these in three variants during tarball build:
- Binary, full
- Binary, slim
- Src - only for src code we ship, not for binary maven deps


was (Author: janhoy):
I think we may need a spreadsheet with a line per Java dependency and also the 
adminui ones, with columns for license-type, whether it has notice, whether it 
is a test only dep, whether the licenses/txt files are correct, whether it is 
part of slim distro etc.

Then we can scrap current LICENSE.txt and NOTICE.txt and make a script that 
generates these in three variants during tarball build:
- Binary, full
- Binary, slim
- Src (superset including test deps)

> Clean up NOTICE and LICENSE files for Solr
> --
>
> Key: SOLR-15929
> URL: https://issues.apache.org/jira/browse/SOLR-15929
> Project: Solr
>  Issue Type: Improvement
>Reporter: Jan Høydahl
>Priority: Major
>
> Spinoff from SOLR-15862 and SOLR-2406:
> We need a total cleanup of both these files
>  * Move lots of (C) notices from NOTICE to LICENSE file
>  * Cross-check that we list all dependencies, and that removed deps (such as 
> for DIH etc) are removed from NOTICE/LICENSE
> I wonder if 
> [https://github.com/apache/solr/blob/main/solr/licenses/README.committers.txt]
>  should also be relocated to either `dev-docs/` or `help/` to make it easier 
> to find. It is hard to get the license/notice stuff right, so we need a good 
> guide for committers!
> See [https://infra.apache.org/licensing-howto.html] for the requirements. PS: 
> Any preference whether we should rename the files without {{.txt}} suffix?
> Also, our source and binary distributions are quite different, so the source 
> distro would ideally have different LICENSE and NOTICE files than the binary 
> distro. I think the Apache Whisker tool could potentially help with this 
> [https://creadur.apache.org/whisker/index.html] but I have not looked deeply.




[jira] [Comment Edited] (SOLR-15929) Clean up NOTICE and LICENSE files for Solr

2024-11-04 Thread Jira



[ 
https://issues.apache.org/jira/browse/SOLR-15929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895429#comment-17895429
 ] 

Jan Høydahl edited comment on SOLR-15929 at 11/4/24 9:27 PM:
-

I think we may need a spreadsheet with a line per Java dependency and also the 
adminui ones, with columns for license-type, whether it has notice, whether it 
is a test only dep, whether the licenses/txt files are correct, whether it is 
part of slim distro etc.

We should also go through all (C) notices in the current LICENSE/NOTICE files 
to find which are no longer relevant and which must be retained/updated - there 
may be some forked piece of source code not mentioned in the version catalog.

Then we can scrap current LICENSE.txt and NOTICE.txt and make a script that 
generates these in three variants during tarball build:
- Binary, full
- Binary, slim
- Src - only for src code we ship, not for binary maven deps

I wonder if we could get help from SBOM tooling here? It would be cool to also 
publish a structured SBOM file with our tarballs.
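
To make the spreadsheet idea above more concrete, here is a minimal sketch of 
one row per dependency plus a filter per generated variant. Everything in it is 
an assumption for illustration: the `Dep` class, its column names, the rule 
that test-only deps are left out of binaries, and the sample rows are 
hypothetical and not taken from Solr's build.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch only: models the proposed spreadsheet and the three generated variants. */
final class LicenseVariants {
  enum Variant { BINARY_FULL, BINARY_SLIM, SRC }

  /** One spreadsheet row. The column names are hypothetical. */
  static final class Dep {
    final String name;
    final boolean testOnly;      // used by tests only, so never shipped in a binary
    final boolean inSlim;        // shipped in the slim binary distribution
    final boolean sourceShipped; // third-party source code kept in our source tree

    Dep(String name, boolean testOnly, boolean inSlim, boolean sourceShipped) {
      this.name = name;
      this.testOnly = testOnly;
      this.inSlim = inSlim;
      this.sourceShipped = sourceShipped;
    }
  }

  /** Decides whether a dependency must be listed in the given variant. */
  static boolean belongsIn(Dep dep, Variant variant) {
    switch (variant) {
      case BINARY_FULL:
        return !dep.testOnly;
      case BINARY_SLIM:
        return !dep.testOnly && dep.inSlim;
      case SRC:
        return dep.sourceShipped; // source we ship, not binary Maven deps
      default:
        throw new IllegalArgumentException("unknown variant: " + variant);
    }
  }

  /** Returns the names to list for a variant, in spreadsheet order. */
  static List<String> select(List<Dep> deps, Variant variant) {
    List<String> names = new ArrayList<>();
    for (Dep dep : deps) {
      if (belongsIn(dep, variant)) {
        names.add(dep.name);
      }
    }
    return names;
  }

  public static void main(String[] args) {
    // Hypothetical sample rows; the names do not refer to real libraries.
    List<Dep> sheet = Arrays.asList(
        new Dep("example-json-lib", false, true, false),
        new Dep("example-extraction-lib", false, false, false),
        new Dep("example-test-lib", true, false, false),
        new Dep("forked-util-source", false, true, true));
    for (Variant v : Variant.values()) {
      System.out.println(v + " -> " + select(sheet, v));
    }
  }
}
```

A real generator would also need the license text and NOTICE content per row; 
the sketch only shows how one table could drive all three outputs.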


was (Author: janhoy):
I think we may need a spreadsheet with a line per Java dependency and also the 
adminui ones, with columns for license-type, whether it has notice, whether it 
is a test only dep, whether the licenses/txt files are correct, whether it is 
part of slim distro etc.

We should also go through all (C) notices in the current LICENSE/NOTICE files 
to find which are no longer relevant and which must be retained/updated - there 
may be some forked piece of source code not mentioned in the version catalog.

Then we can scrap current LICENSE.txt and NOTICE.txt and make a script that 
generates these in three variants during tarball build:
- Binary, full
- Binary, slim
- Src - only for src code we ship, not for binary maven deps


Re: [PR] SOLR-17456: TransactionLog ctor integrity [solr]

2024-11-04 Thread via GitHub



dsmiley merged PR #2762:
URL: https://github.com/apache/solr/pull/2762


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

Re: [PR] SOLR-17390: EmbeddedSolrServer now considers the ResponseParser [solr]

2024-11-04 Thread via GitHub



dsmiley merged PR #2774:
URL: https://github.com/apache/solr/pull/2774



[jira] [Commented] (SOLR-17456) TransactionLog NPE

2024-11-04 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895443#comment-17895443
 ] 

ASF subversion and git services commented on SOLR-17456:


Commit e849fd0540fb4f0e013a1f73e93c3e85a933ed83 in solr's branch 
refs/heads/main from David Smiley
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=e849fd0540f ]

SOLR-17456: TransactionLog ctor integrity (#2762)

The TransactionLog constructor can't handle an existing file being present; it 
shouldn't be there.
The constructor should throw an exception in this case, NOT log a warning, 
which would leave the object in a partially constructed state.
This shouldn't happen in the first place, of course.  I see no evidence it has 
occurred.
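
The fail-fast idea in this commit message can be sketched in plain Java. The 
class below is a hypothetical stand-in, not Solr's TransactionLog: it relies on 
`StandardOpenOption.CREATE_NEW`, which makes the open call throw 
`FileAlreadyExistsException` when the file is already present, so the 
constructor either completes fully or does not return an object at all.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical stand-in for a log writer; this is not Solr's TransactionLog. */
final class StrictLogFile implements AutoCloseable {
  private final FileChannel channel;

  StrictLogFile(Path path) throws IOException {
    // CREATE_NEW fails with FileAlreadyExistsException if the file exists,
    // so no partially constructed instance can escape the constructor.
    this.channel =
        FileChannel.open(path, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
  }

  boolean isOpen() {
    return channel.isOpen();
  }

  @Override
  public void close() throws IOException {
    channel.close();
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("tlog-demo");
    Path log = dir.resolve("tlog.0000000001");

    // First construction succeeds because the file does not exist yet.
    try (StrictLogFile first = new StrictLogFile(log)) {
      System.out.println("created, open=" + first.isOpen());
    }

    // Second construction hits the leftover file and must be rejected.
    try (StrictLogFile second = new StrictLogFile(log)) {
      System.out.println("unexpected: constructed over an existing file");
    } catch (FileAlreadyExistsException e) {
      System.out.println("rejected existing file: " + e.getFile());
    }
  }
}
```

Throwing from the constructor means callers never hold a reference whose 
fields were left unset, which is the situation that can surface later as an 
NPE.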

> TransactionLog NPE
> --
>
> Key: SOLR-17456
> URL: https://issues.apache.org/jira/browse/SOLR-17456
> Project: Solr
>  Issue Type: Bug
>Reporter: David Smiley
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In an erroneous case, a TransactionLog should throw an exception if an 
> unexpected log file exists instead of merely logging a warning in its 
> constructor.  The latter leaves the object in a partially constructed state 
> that leads to NPEs when it's used later.




[jira] [Commented] (SOLR-17487) Can't POST a dense vector that contains two or more occurences of the same float value

2024-11-04 Thread David Smiley (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895390#comment-17895390
 ] 

David Smiley commented on SOLR-17487:
-

Solr's [default 
solrconfig.xml|https://github.com/apache/solr/blob/branch_9x/solr/server/solr/configsets/_default/conf/solrconfig.xml]
 does *not* do this.

> Can't POST a dense vector that contains two or more occurences of the same 
> float value
> --
>
> Key: SOLR-17487
> URL: https://issues.apache.org/jira/browse/SOLR-17487
> Project: Solr
>  Issue Type: Bug
>  Components: UpdateRequestProcessors
>Affects Versions: 9.7, 9.6.1
>Reporter: Guillaume Jactat
>Priority: Major
> Attachments: image-2024-10-10-18-05-01-195.png, 
> image-2024-10-10-18-07-14-904.png, image-2024-10-10-18-07-19-370.png, 
> image-2024-10-10-23-27-26-566.png, vector-384.json, vector-384.xml, 
> vector-768.json
>
>
> *EDIT 10/10/2024* : 
> After a detailed analysis of the problematic vectors, I found that the 
> “missing” dimensions were actually dimensions whose value already appears 
> elsewhere in the same vector.
> In concrete terms, values that occur several times in a posted vector are 
> deduplicated by Solr.
> You can see for yourself that the vectors supplied as attachments have the 
> common characteristic of containing {*}two or more occurrences of the very 
> same float value{*}. The embedding model I use (all-minilm:33m) seems to 
> generate many such cases. 
> It seems that {*}Solr only takes into account the first occurrence of these 
> values{*}. As a result, the length of the final vector is no longer correct.
> The following screenshot shows exactly what happens with a smaller vector 
> field type of size 5: the vector [1, 5, 3, 4, 5] becomes [1, 
> 5, 3, 4].
> !image-2024-10-10-23-27-26-566.png!
>  
> -
> Hello,
>  
> I'm using Solr 9.7 as a vector database. I've come across something I can't 
> explain : I POST my documents as JSON and I've got a vector field of 
> dimension {*}768{*}.
>  
> The JSON document I POST has a vector field, which is an array of length 768. 
> Each value is a float.
>  
> Solr complains that my array is only *767* long...
> I've compared the JSON I POST with the array parsed by Solr and written in 
> the logs, and indeed, one of the 768 values has simply disappeared in the 
> process.
>  
> The problem can easily be reproduced. All you have to do is :
>  * In your "schema.xml", declare the following dense vector field type :
> {code:java}
>  vectorDimension="768" similarityFunction="cosine"/>{code}
>  * In your schema.xml, declare the following dense vector dynamic field :
> {code:java}
>  stored="true"/>{code}
>  * Use the Solr Admin UI to post the *attached document* to your Solr core.
>  * You should get the following error : "{*}incorrect vector dimension. The 
> vector value has size 767 while it is expected a vector with size 768"{*}
>  
>  * Furthermore, while the POSTed vector has size 768, the vector written in 
> the logs has only 767 values... One value is missing. You can easily spot the 
> missing value with a simple diff.
> Maybe someone will find the reason why this specific vector leads to this 
> issue. Of course, I have plenty of other documents that get indexed without 
> any issue.
> In case it helps, the value that disappears from the 768 vector is 
> "0.0335415453". It's the 384th dimension (starting from 1)
> !image-2024-10-10-18-07-19-370.png!
> Thanks for reading
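
The symptom reported above, [1, 5, 3, 4, 5] turning into [1, 5, 3, 4], is what 
any value-based de-duplication that keeps first occurrences would produce. The 
sketch below only reproduces that effect in plain Java; it is not Solr code, 
and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

/** Illustration only: shows how value de-duplication shortens a vector. */
final class VectorDedupDemo {
  /** Keeps the first occurrence of each value and preserves the order. */
  static List<Float> dedupKeepFirst(List<Float> values) {
    return new ArrayList<>(new LinkedHashSet<>(values));
  }

  /** A length check similar in spirit to the dimension error quoted in the report. */
  static void checkDimension(List<Float> vector, int expected) {
    if (vector.size() != expected) {
      throw new IllegalArgumentException(
          "vector has size " + vector.size() + " but expected " + expected);
    }
  }

  public static void main(String[] args) {
    List<Float> posted = Arrays.asList(1f, 5f, 3f, 4f, 5f);
    List<Float> afterDedup = dedupKeepFirst(posted);

    System.out.println(afterDedup); // prints [1.0, 5.0, 3.0, 4.0]
    checkDimension(posted, 5);      // the original vector passes

    try {
      checkDimension(afterDedup, 5); // the shortened vector does not
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage()); // prints "vector has size 4 but expected 5"
    }
  }
}
```

For a dense vector the position of each value matters, so repeated values are 
legitimate data and must not be collapsed before the dimension check.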




Re: [PR] Prevent conflicting connection flags from being used via OptionGroup [solr]

2024-11-04 Thread via GitHub



malliaridis commented on code in PR #2725:
URL: https://github.com/apache/solr/pull/2725#discussion_r1828214894


##
solr/core/src/java/org/apache/solr/cli/ApiTool.java:
##
@@ -36,6 +36,16 @@
  * Used to send an arbitrary HTTP request to a Solr API endpoint.
  */
 public class ApiTool extends ToolBase {
+
+  private static final Option SOLR_URL_OPTION =
+  Option.builder()

Review Comment:
   We should probably add the `-s` option here.




Re: [PR] Prevent conflicting connection flags from being used via OptionGroup [solr]

2024-11-04 Thread via GitHub



epugh commented on PR #2725:
URL: https://github.com/apache/solr/pull/2725#issuecomment-2455461246

   I go back and forth on making everything a Java object.  
`cli.getOption("my-option")` to me reads better than 
`cli.getOption(MY_OPTION)`. Having said that, I think the enhanced IDE 
integration is the way to go! Glad to see the tests get fixed! I will 
review in the AM and merge.
   
   Is this for branch_9x as well?



[jira] [Commented] (SOLR-15929) Clean up NOTICE and LICENSE files for Solr

2024-11-04 Thread Christos Malliaridis (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-15929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895417#comment-17895417
 ] 

Christos Malliaridis commented on SOLR-15929:
-

What is the correct way to "Move lots of (C) notices from NOTICE to LICENSE 
file"? Just cut out the text block and paste it at the end of the license file?


[jira] [Closed] (SOLR-17487) Can't POST a dense vector that contains two or more occurences of the same float value

2024-11-04 Thread Guillaume Jactat (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guillaume Jactat closed SOLR-17487.
---


Re: [PR] SOLR-17453: Leverage waitForState() instead of busy waiting [solr]

2024-11-04 Thread via GitHub



dsmiley commented on PR #2737:
URL: https://github.com/apache/solr/pull/2737#issuecomment-2455879410

   Probably just a one-liner left and I'll merge away :-)



[jira] [Commented] (SOLR-17456) TransactionLog NPE

2024-11-04 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895451#comment-17895451
 ] 

ASF subversion and git services commented on SOLR-17456:


Commit 919bc994cc3618638d678e03d8a96f244786faad in solr's branch 
refs/heads/branch_9x from David Smiley
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=919bc994cc3 ]

SOLR-17456: TransactionLog ctor integrity (#2762)

The TransactionLog constructor can't handle an existing file being present; it 
shouldn't be there.
The constructor should throw an exception in this case, NOT log a warning, 
which would leave the object in a partially constructed state.
This shouldn't happen in the first place, of course.  I see no evidence it has 
occurred.

(cherry picked from commit e849fd0540fb4f0e013a1f73e93c3e85a933ed83)



[jira] [Commented] (SOLR-17390) EmbeddedSolrServer should support a ResponseParser

2024-11-04 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-17390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895446#comment-17895446
 ] 

ASF subversion and git services commented on SOLR-17390:


Commit c5c538a9e025bda77ad591ee82beaa6a6732c408 in solr's branch 
refs/heads/main from David Smiley
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=c5c538a9e02 ]

SOLR-17390: EmbeddedSolrServer now considers the ResponseParser (#2774)

And
* Moved HttpSolrCall.getResponseWriter to SolrQueryRequest
* Subtle improvements to make ContentStream work when they might not have

> EmbeddedSolrServer should support a ResponseParser
> --
>
> Key: SOLR-17390
> URL: https://issues.apache.org/jira/browse/SOLR-17390
> Project: Solr
>  Issue Type: Improvement
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> By default, a SolrRequest has a null/unspecified ResponseParser; it's handled 
> automatically within SolrJ.  But an explicit one communicates an intent for 
> the client code to need it, like JsonMapResponseParser, 
> InputStreamResponseParser, or NoOpResponseParser (particularly those 3).  
> EmbeddedSolrServer doesn't look at this; the NamedList right out of the 
> core/handler is normalized (via javabin round-trip) and returned.  While that 
> makes sense _normally_, a ResponseParser should also be supported.  This 
> enables tests that want to use EmbeddedSolrServer but that need 
> to test JSON or XML (for convenience of xpath/json expressions, for example). 
>  Also, the newer V2 API generated clients would need this to support 
> EmbeddedSolrServer as they are currently based off of 
> InputStreamResponseParser.
> Doing this means determining the correct ResponseWriter (not assuming JavaBin 
> during normalization).




Re: [PR] SOLR-17453: Leverage waitForState() instead of busy waiting [solr]

2024-11-04 Thread via GitHub



dsmiley commented on code in PR #2737:
URL: https://github.com/apache/solr/pull/2737#discussion_r1828513954


##
solr/test-framework/src/java/org/apache/solr/cloud/AbstractDistribZkTestBase.java:
##
@@ -242,45 +240,15 @@ public static void waitForCollectionToDisappear(
 log.info("Collection has disappeared - collection:{}", collection);
   }
 
-  static void waitForNewLeader(
-  CloudSolrClient cloudClient, String shardName, Replica oldLeader, TimeOut timeOut)
+  static void waitForNewLeader(CloudSolrClient cloudClient, String shardName, Replica oldLeader)
   throws Exception {
-log.info("Will wait for a node to become leader for {} secs", timeOut.timeLeft(SECONDS));
+log.info("Will wait for a node to become leader for 15 secs");
 ZkStateReader zkStateReader = ZkStateReader.from(cloudClient);
-zkStateReader.forceUpdateCollection(DEFAULT_COLLECTION);
-
-for (; ; ) {
-  ClusterState clusterState = zkStateReader.getClusterState();
-  DocCollection coll = clusterState.getCollection("collection1");
-  Slice slice = coll.getSlice(shardName);
-  if (slice.getLeader() != null
-  && !slice.getLeader().equals(oldLeader)
-  && slice.getLeader().getState() == Replica.State.ACTIVE) {
-if (log.isInfoEnabled()) {
-  log.info(
-  "Old leader {}, new leader {}. New leader got elected in {} ms",
-  oldLeader,
-  slice.getLeader(),
-  timeOut.timeElapsed(MILLISECONDS));
-}
-break;
-  }
-
-  if (timeOut.hasTimedOut()) {

Review Comment:
   I like your change except for one small thing:  You propagate the exception 
(e.g. TimeoutException) but previously the test code here would explicitly 
fail.  I think we should explicitly fail.
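
A minimal sketch of the distinction raised in this review comment, using only 
the JDK: a blocking wait that reports a timeout as `TimeoutException`, wrapped 
by a test helper that turns the timeout into an explicit failure. The names 
`awaitState` and `waitForNewLeaderOrFail` are hypothetical, the 
`CountDownLatch` merely stands in for a cluster-state condition, and 
`AssertionError` is used here because it is what JUnit's `fail()` throws.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Sketch only: the latch stands in for "a new leader has been elected". */
final class WaitOrFail {
  /** Blocks until the state is reached, or throws TimeoutException. */
  static void awaitState(CountDownLatch stateReached, long timeout, TimeUnit unit)
      throws InterruptedException, TimeoutException {
    if (!stateReached.await(timeout, unit)) {
      throw new TimeoutException("state not reached within " + timeout + " " + unit);
    }
  }

  /** Test helper: a timeout becomes an explicit test failure, not a propagated exception. */
  static void waitForNewLeaderOrFail(CountDownLatch leaderElected, long timeout, TimeUnit unit)
      throws InterruptedException {
    try {
      awaitState(leaderElected, timeout, unit);
    } catch (TimeoutException e) {
      throw new AssertionError("No new leader was elected before the timeout", e);
    }
  }

  public static void main(String[] args) throws InterruptedException {
    // Case 1: another thread signals the state, so the wait returns normally.
    CountDownLatch elected = new CountDownLatch(1);
    Thread signaller = new Thread(elected::countDown);
    signaller.start();
    waitForNewLeaderOrFail(elected, 5, TimeUnit.SECONDS);
    signaller.join();
    System.out.println("leader elected");

    // Case 2: nobody signals, so the helper fails explicitly.
    try {
      waitForNewLeaderOrFail(new CountDownLatch(1), 50, TimeUnit.MILLISECONDS);
    } catch (AssertionError e) {
      System.out.println("test failure: " + e.getMessage());
    }
  }
}
```

Keeping the original exception as the cause preserves the timeout details 
while the test report still shows a plain assertion failure.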




[jira] [Resolved] (SOLR-17390) EmbeddedSolrServer should support a ResponseParser

2024-11-04 Thread David Smiley (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-17390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley resolved SOLR-17390.
-
Fix Version/s: 9.8
   Resolution: Fixed


[jira] [Resolved] (SOLR-17456) TransactionLog NPE

2024-11-04 Thread David Smiley (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley resolved SOLR-17456.
-
Fix Version/s: 9.8
 Assignee: David Smiley
   Resolution: Fixed


[jira] [Commented] (SOLR-17390) EmbeddedSolrServer should support a ResponseParser

2024-11-04 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-17390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895452#comment-17895452
 ] 

ASF subversion and git services commented on SOLR-17390:


Commit 337c3d6ccb2e6157f79f3488ef0deb7b4a852734 in solr's branch 
refs/heads/branch_9x from David Smiley
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=337c3d6ccb2 ]

SOLR-17390: EmbeddedSolrServer now considers the ResponseParser (#2774)

And
* Moved HttpSolrCall.getResponseWriter to SolrQueryRequest
* Subtle improvements to make ContentStream work when they might not have

(cherry picked from commit c5c538a9e025bda77ad591ee82beaa6a6732c408)



Re: [PR] Prevent conflicting connection flags from being used via OptionGroup [solr]

2024-11-04 Thread via GitHub



malliaridis commented on PR #2725:
URL: https://github.com/apache/solr/pull/2725#issuecomment-2455611441

   > I go back and forth on making everything a Java object. 
cli.getOption("my-option") to me reads better than cli.getOption(MY_OPTION)
   
   I agree on that, but I also feel that having long and short variants of 
options makes `cli.getOption("my-option")` kinda confusing. Working with 
strings has also proven to be error-prone: forgetting to update a reference 
is easier with strings than with object references. This is probably the most 
important reason to go for objects.
   
   > Is this for branch_9x as well???
   
   No, 9x would require all the deprecated options as well, resulting in 
`OptionGroup`s. If we make a completely different PR we could introduce similar 
changes, but backporting won't do here without breaking backwards compatibility.
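
The trade-off discussed above can be illustrated with a small, library-free 
sketch. It does not use the Commons CLI API; the `Opt` class, the 
`CONNECTION_GROUP` list, and the `z`/`zk-host` names are assumptions made up 
for the example. It shows an option object that covers both its short and long 
spelling, and a group check that rejects two conflicting options.

```java
import java.util.Arrays;
import java.util.List;

/** Library-free sketch; Opt and CONNECTION_GROUP are hypothetical, not Commons CLI types. */
final class OptionObjectsDemo {
  /** An option definition that knows both of its spellings. */
  static final class Opt {
    final String shortName;
    final String longName;

    Opt(String shortName, String longName) {
      this.shortName = shortName;
      this.longName = longName;
    }

    boolean matches(String arg) {
      return arg.equals("-" + shortName) || arg.equals("--" + longName);
    }
  }

  // Referencing these constants is checked by the compiler, unlike string keys.
  static final Opt SOLR_URL = new Opt("s", "solr-url");
  static final Opt ZK_HOST = new Opt("z", "zk-host");

  /** Options of which at most one may be given. */
  static final List<Opt> CONNECTION_GROUP = Arrays.asList(SOLR_URL, ZK_HOST);

  static boolean hasOption(String[] args, Opt opt) {
    for (String arg : args) {
      if (opt.matches(arg)) {
        return true;
      }
    }
    return false;
  }

  /** Returns the one selected option of the group, or null if none was given. */
  static Opt selectOne(String[] args, List<Opt> group) {
    Opt selected = null;
    for (Opt opt : group) {
      if (hasOption(args, opt)) {
        if (selected != null) {
          throw new IllegalArgumentException(
              "--" + opt.longName + " cannot be combined with --" + selected.longName);
        }
        selected = opt;
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    String[] ok = {"-s", "http://localhost:8983/solr"};
    System.out.println("selected: --" + selectOne(ok, CONNECTION_GROUP).longName);

    String[] conflicting = {"--solr-url", "http://localhost:8983/solr", "-z", "localhost:2181"};
    try {
      selectOne(conflicting, CONNECTION_GROUP);
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

With an object reference, the caller does not need to know whether the user 
typed the short or the long form, and renaming an option cannot leave a stale 
string behind.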



[jira] [Comment Edited] (SOLR-15929) Clean up NOTICE and LICENSE files for Solr

2024-11-04 Thread Jira



[ 
https://issues.apache.org/jira/browse/SOLR-15929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895429#comment-17895429
 ] 

Jan Høydahl edited comment on SOLR-15929 at 11/4/24 9:05 PM:
-

I think we may need a spreadsheet with a line per Java dependency and also the 
adminui ones, with columns for license-type, whether it has notice, whether it 
is a test only dep, whether the licenses/txt files are correct, whether it is 
part of slim distro etc.

Then we can scrap current LICENSE.txt and NOTICE.txt and make a script that 
generates these in three variants during tarball build:
- Binary, full
- Binary, slim
- Src (superset including test deps)


was (Author: janhoy):
I think we may need a spreadsheet with a line per Java dependency and also the 
adminui ones, with columns for license-type, whether it has notice, whether it 
is a test only dep, whether the licenses/txt files are correct, whether it is 
part of slim distro etc.

Then we can scrap current LICENSE.txt and NOTICE.txt and make a script that 
generates these in three variants during tarball build:
- Binary, full
- Binary, slim
- Src 

> Clean up NOTICE and LICENSE files for Solr
> --
>
> Key: SOLR-15929
> URL: https://issues.apache.org/jira/browse/SOLR-15929
> Project: Solr
>  Issue Type: Improvement
>Reporter: Jan Høydahl
>Priority: Major
>
> Spinoff from SOLR-15862 and SOLR-2406:
> We need a total cleanup of both these files
>  * Move lots of (C) notices from NOTICE to LICENSE file
>  * Cross-check that we list all dependencies, and that removed deps (such as 
> for DIH etc) are removed from NOTICE/LICENSE
> I wonder if 
> [https://github.com/apache/solr/blob/main/solr/licenses/README.committers.txt]
>  should also be relocated to either `dev-docs/` or `help/` to make it easier 
> to find. It is hard to get the license/notice stuff right, so we need a good 
> guide for committers!
> See [https://infra.apache.org/licensing-howto.html] for the requirements. PS: 
> Any preference whether we should rename the files without {{.txt}} suffix?
> Also, our source and binary distributions are quite different, and the source 
> distribution would ideally have different LICENSE and NOTICE files than the binary 
> distro. I think the Apache Whisker tool could potentially help with this 
> [https://creadur.apache.org/whisker/index.html] but I have not looked deeply.




[jira] [Commented] (SOLR-15929) Clean up NOTICE and LICENSE files for Solr

2024-11-04 Thread Jira



[ 
https://issues.apache.org/jira/browse/SOLR-15929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895429#comment-17895429
 ] 

Jan Høydahl commented on SOLR-15929:


I think we may need a spreadsheet with one line per Java dependency, plus the 
Admin UI ones, with columns for license type, whether it has a notice, whether it 
is a test-only dep, whether the licenses/txt files are correct, whether it is 
part of the slim distro, etc.

Then we can scrap the current LICENSE.txt and NOTICE.txt and write a script that 
generates these in three variants during the tarball build:
- Binary, full
- Binary, slim
- Src 


[jira] [Commented] (SOLR-7871) Platform independent config file instead of solr.in.sh and solr.in.cmd

2024-11-04 Thread Jira



[ 
https://issues.apache.org/jira/browse/SOLR-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895424#comment-17895424
 ] 

Jan Høydahl commented on SOLR-7871:
---

As long as the bike shed is YAML I’m all in 🤣🤣🤣 Not sure I see the need for 
per-environment config, but it would be nice to piggyback on production 
mode to do stricter startup checks than in dev, etc. Would the YAML file have 
SOLR_foo-style keys and manipulate the environment, or solr.foo-style keys and set sysprops?
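
The two key styles in the question can be mapped onto each other mechanically. As a rough sketch only (the conversion rule and the class name `KeyStyles` are assumptions for illustration, not an agreed design or existing Solr code), an environment-style key could be translated to a sysprop-style key and back like this:

```java
import java.util.Locale;

/** Hypothetical helper converting between env-style and sysprop-style keys. */
final class KeyStyles {
  /** SOLR_FOO_BAR -> solr.foo.bar (assumed rule: lower-case, '_' becomes '.'). */
  static String envToSysprop(String envKey) {
    return envKey.toLowerCase(Locale.ROOT).replace('_', '.');
  }

  /** solr.foo.bar -> SOLR_FOO_BAR (assumed rule: upper-case, '.' becomes '_'). */
  static String syspropToEnv(String syspropKey) {
    return syspropKey.toUpperCase(Locale.ROOT).replace('.', '_');
  }
}
```

Note that this naive rule is lossy: a sysprop-style key containing capital letters or underscores would not survive a round trip, which is one reason the choice of key style matters.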

> Platform independent config file instead of solr.in.sh and solr.in.cmd
> --
>
> Key: SOLR-7871
> URL: https://issues.apache.org/jira/browse/SOLR-7871
> Project: Solr
>  Issue Type: Improvement
>  Components: scripts and tools
>Affects Versions: 5.2.1
>Reporter: Jan Høydahl
>Priority: Major
>  Labels: bin/solr
> Attachments: SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, 
> SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, 
> SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, 
> SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch
>
>
> Spinoff from SOLR-7043
> The config files {{solr.in.sh}} and {{solr.in.cmd}} are currently executable 
> batch files, but all they do is set environment variables for the start 
> scripts, in the format {{key=value}}.
> I suggest instead having one central, platform-independent config file, e.g. 
> {{bin/solr.yml}} or {{bin/solrstart.properties}}, which is parsed by 
> {{SolrCLI.java}}.
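
A minimal sketch of the properties-file variant, assuming plain java.util.Properties syntax. The class name `StartConfig` and the sample keys used with it are hypothetical and are not taken from any patch attached to this ticket:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical loader for a platform-independent key=value start config. */
final class StartConfig {
  /** Parses key=value text and returns the entries sorted by key. */
  static Map<String, String> parse(String text) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(text));
    } catch (IOException e) {
      // A StringReader should not fail, but load() declares IOException.
      throw new UncheckedIOException(e);
    }
    Map<String, String> result = new TreeMap<>();
    for (String name : props.stringPropertyNames()) {
      result.put(name, props.getProperty(name));
    }
    return result;
  }
}
```

Because the same parser runs on every platform, a file like this would not need separate shell and batch variants; what to do with the parsed entries (environment or sysprops) is left open here.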




Re: [PR] SOLR-17535 ClusterState.collectionStream() in lieu of getCollectionStates() [solr]

2024-11-04 Thread via GitHub



dsmiley commented on code in PR #2834:
URL: https://github.com/apache/solr/pull/2834#discussion_r1827774012


##
solr/core/src/java/org/apache/solr/handler/designer/SchemaDesignerConfigSetHelper.java:
##
@@ -168,24 +167,12 @@ Map analyzeField(String configSet, String fieldName, String fiel
   }
 
   List listCollectionsForConfig(String configSet) {
-    final List collections = new ArrayList<>();
-    Map states =
-        zkStateReader().getClusterState().getCollectionStates();
-    for (Map.Entry e : states.entrySet()) {
-      final String coll = e.getKey();
-      if (coll.startsWith(DESIGNER_PREFIX)) {
-        continue; // ignore temp
-      }
-
-      try {
-        if (configSet.equals(e.getValue().get().getConfigName()) && e.getValue().get() != null) {
-          collections.add(coll);
-        }
-      } catch (Exception exc) {
-        log.warn("Failed to get config name for {}", coll, exc);
-      }
-    }
-    return collections;
+    return zkStateReader()
+        .getClusterState()
+        .collectionStream()

Review Comment:
   You are saying this because of the collection name filter that was there.  
As I indicated in my self-PR review, I think that was bogus/erroneous.
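
For readers following the diff, here is a self-contained sketch of the general shape of a stream-based lookup, under stated assumptions: `Coll` is a hypothetical stand-in for DocCollection, and because the quoted hunk stops after `.collectionStream()`, the filter and map steps below are a guess at the pattern rather than the PR's actual code. The sketch also avoids an ordering problem visible in the removed loop, which called `get().getConfigName()` before checking `get() != null`.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical stand-in for a resolved collection (not Solr's DocCollection). */
final class Coll {
  final String name;
  final String configName;

  Coll(String name, String configName) {
    this.name = name;
    this.configName = configName;
  }
}

final class ConfigSetLookup {
  /** Returns the names of all collections that use the given config set. */
  static List<String> listCollectionsForConfig(Stream<Coll> collections, String configSet) {
    return collections
        .filter(Objects::nonNull) // null check happens before any dereference
        .filter(c -> configSet.equals(c.configName))
        .map(c -> c.name)
        .collect(Collectors.toList());
  }
}
```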




Re: [PR] SOLR-17256: Remove usages of deprecated SolrRequest.setBasePath [solr]

2024-11-04 Thread via GitHub



dsmiley commented on code in PR #2811:
URL: https://github.com/apache/solr/pull/2811#discussion_r1827794290


##
solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBSolrClient.java:
##
@@ -480,14 +480,35 @@ private static boolean isTimeExceeded(long timeAllowedNano, long timeOutTime) {
     return timeAllowedNano > 0 && System.nanoTime() > timeOutTime;
   }
 
+  private NamedList doMakeRequest(Endpoint endpoint, SolrRequest solrRequest)
+      throws SolrServerException, IOException {
+    final var solrClient = getClient(endpoint);
+    return doMakeRequest(solrClient, endpoint.getBaseUrl(), endpoint.getCore(), solrRequest);
+  }
+
+  // TODO This special casing can be removed if either: (1) SOLR-16367 is completed, or (2)
+  // LBHttp2SolrClient.getClient() is modified to return a client already pointed at the correct URL
+  private NamedList doMakeRequest(
+      SolrClient solrClient, String baseUrl, String collection, SolrRequest solrRequest)
+      throws SolrServerException, IOException {
+    // Some implementations of LBSolrClient.getClient(...) return a Http2SolrClient that may not be
+    // pointed at the desired URL (or any URL for that matter).  We special case that here to ensure
+    // the appropriate URL is provided.
+    if (solrClient instanceof Http2SolrClient) {
+      final var httpSolrClient = (Http2SolrClient) solrClient;
+      return httpSolrClient.requestWithBaseUrl(baseUrl, (c) -> c.request(solrRequest, collection));
+    }
+
+    return solrClient.request(solrRequest, collection);

Review Comment:
   I agree it's definitely worth its own ticket!




Re: [PR] SOLR-17256: Remove usages of deprecated SolrRequest.setBasePath [solr]

2024-11-04 Thread via GitHub



gerlowskija commented on code in PR #2811:
URL: https://github.com/apache/solr/pull/2811#discussion_r1827753327


##
solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBSolrClient.java:
##
@@ -480,14 +480,35 @@ private static boolean isTimeExceeded(long timeAllowedNano, long timeOutTime) {
     return timeAllowedNano > 0 && System.nanoTime() > timeOutTime;
   }
 
+  private NamedList doMakeRequest(Endpoint endpoint, SolrRequest solrRequest)
+      throws SolrServerException, IOException {
+    final var solrClient = getClient(endpoint);
+    return doMakeRequest(solrClient, endpoint.getBaseUrl(), endpoint.getCore(), solrRequest);
+  }
+
+  // TODO This special casing can be removed if either: (1) SOLR-16367 is completed, or (2)
+  // LBHttp2SolrClient.getClient() is modified to return a client already pointed at the correct URL
+  private NamedList doMakeRequest(
+      SolrClient solrClient, String baseUrl, String collection, SolrRequest solrRequest)
+      throws SolrServerException, IOException {
+    // Some implementations of LBSolrClient.getClient(...) return a Http2SolrClient that may not be
+    // pointed at the desired URL (or any URL for that matter).  We special case that here to ensure
+    // the appropriate URL is provided.
+    if (solrClient instanceof Http2SolrClient) {
+      final var httpSolrClient = (Http2SolrClient) solrClient;
+      return httpSolrClient.requestWithBaseUrl(baseUrl, (c) -> c.request(solrRequest, collection));
+    }
+
+    return solrClient.request(solrRequest, collection);

Review Comment:
   Agreed, but IMO that probably deserves its own ticket.
   
   Switching the Jetty LB client to work this way would probably let us 
reuse/share some of the client-management code from the Apache LB 
client...which is great!...but it'd also turn things into a slightly larger 
refactor than I want to tackle here.
   
   If we're agreed on this approach I can create a ticket for that work and 
update the TODO comment here to say essentially: "rip this out when tackling 
SOLR-12345"?
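
The special case under discussion can be reduced to the following sketch. All types here (`Client`, `RetargetableClient`, `Dispatcher`) are hypothetical simplifications and not SolrJ classes; only the control flow mirrors the diff. A client that may not be bound to the desired URL is given the base URL explicitly, while any other client is assumed to be pointed at it already.

```java
import java.util.function.Function;

/** Hypothetical minimal client: sends a request and returns a response string. */
interface Client {
  String request(String request);
}

/** Hypothetical client that is not tied to one URL and can be pointed at one per call. */
final class RetargetableClient implements Client {
  private final String defaultUrl;

  RetargetableClient(String defaultUrl) {
    this.defaultUrl = defaultUrl;
  }

  @Override
  public String request(String request) {
    return defaultUrl + " <- " + request;
  }

  /** Runs the callback against a view of this client bound to baseUrl. */
  String requestWithBaseUrl(String baseUrl, Function<Client, String> callback) {
    return callback.apply(new RetargetableClient(baseUrl));
  }
}

final class Dispatcher {
  static String doMakeRequest(Client client, String baseUrl, String request) {
    if (client instanceof RetargetableClient) {
      // Special case: supply the URL explicitly, since the client may point elsewhere.
      RetargetableClient retargetable = (RetargetableClient) client;
      return retargetable.requestWithBaseUrl(baseUrl, c -> c.request(request));
    }
    // Any other client is assumed to already target the right URL.
    return client.request(request);
  }
}
```

If every client returned by the lookup were already bound to its URL, the `instanceof` branch would become unnecessary, which is the simplification the TODO comment points at.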





Re: [PR] SOLR-17535 ClusterState.collectionStream() in lieu of getCollectionStates() [solr]

2024-11-04 Thread via GitHub



dsmiley commented on code in PR #2834:
URL: https://github.com/apache/solr/pull/2834#discussion_r1827771870


##
solr/solrj/src/java/org/apache/solr/common/cloud/ClusterState.java:
##
@@ -400,29 +402,14 @@ public Set getHostAllowList() {
     return hostAllowList;
   }
 
-  /**
-   * Iterate over collections. Unlike {@link #getCollectionStates()} collections passed to the
-   * consumer are guaranteed to exist.
-   *
-   * @param consumer collection consumer.
-   */
+  /** Streams the resolved DocCollections. Use this sparingly in case there are many collections. */
+  public Stream collectionStream() {
+    return collectionStates.values().stream().map(CollectionRef::get).filter(Objects::nonNull);
+  }
+
+  /** Streams the resolved DocCollections. Use this sparingly in case there are many collections. */

Review Comment:
   It was deliberate, as I thought it was good enough, but I'll add more words.
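
As a sketch of why the Javadoc advises sparing use: every element pulled from such a stream resolves a reference, which may be costly when references are loaded lazily. The example below uses hypothetical stand-ins (`LazyRef`, `MiniClusterState`) rather than Solr's CollectionRef and ClusterState, and counts resolutions to make the cost visible. The pipeline itself follows the one in the diff: map each reference to its value and drop nulls.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

/** Hypothetical lazy reference; get() may return null if the target no longer exists. */
final class LazyRef {
  static final AtomicInteger RESOLVED = new AtomicInteger();
  private final String value;

  LazyRef(String value) {
    this.value = value;
  }

  String get() {
    RESOLVED.incrementAndGet(); // stands in for a potentially expensive lookup
    return value;
  }
}

final class MiniClusterState {
  private final Map<String, LazyRef> refs = new LinkedHashMap<>();

  void add(String name, String valueOrNull) {
    refs.put(name, new LazyRef(valueOrNull));
  }

  /** Streams resolved, non-null values; each element consumed triggers one get(). */
  Stream<String> collectionStream() {
    return refs.values().stream().map(LazyRef::get).filter(Objects::nonNull);
  }
}
```

A short-circuiting terminal operation such as `findFirst()` resolves only as many references as it needs, whereas collecting the whole stream resolves every one of them.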



