[jira] [Commented] (SOLR-16757) Umbrella Ticket for Revamping Solr CLI's for the Future
[ https://issues.apache.org/jira/browse/SOLR-16757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895274#comment-17895274 ] Eric Pugh commented on SOLR-16757: -- We are getting close to the end of this effort. I checked in with [~malliaridis] and he is going to push up a PR to move us to OptionGroups for cases where you have multiple options but only one is valid, so that we leverage the built-in CLI error handling for that situation. He also has a refactoring of SolrCLI to take some of the weight out of it. I would love to see JWT support added to the CLI (SOLR-13071), to go along with our Basic Auth support... [~Idjeraoui] would that be of interest to you to help on? I would also like to take another run at SOLR-7871, which is the idea of having a platform-independent file for all our configuration values. > Umbrella Ticket for Revamping Solr CLI's for the Future > --- > > Key: SOLR-16757 > URL: https://issues.apache.org/jira/browse/SOLR-16757 > Project: Solr > Issue Type: Task > Components: cli >Reporter: Eric Pugh >Assignee: Eric Pugh >Priority: Minor > > This is to guide me in revamping the Solr CLI functions by tracking a set of > JIRA's. It's to help me figure out which I am going to work on and which I > am not. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17497) Pull replicas throws AlreadyClosedException
[ https://issues.apache.org/jira/browse/SOLR-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895320#comment-17895320 ] Jason Gerlowski commented on SOLR-17497: Hey [~sanjaydutt] - I'm going to backport your change above to branch_9_7 as well, unless you've got any objections? Was doing some "beast" runs this morning and noticed a big improvement in branch_9x with your change, so I'd love to see branch_9_7 get that benefit as well with a 9.7.1 release coming up... > Pull replicas throws AlreadyClosedException > - > > Key: SOLR-17497 > URL: https://issues.apache.org/jira/browse/SOLR-17497 > Project: Solr > Issue Type: Task >Reporter: Sanjay Dutt >Priority: Major > Attachments: Screenshot 2024-10-23 at 6.01.02 PM.png > > > Recently, a common exception (org.apache.lucene.store.AlreadyClosedException: > this Directory is closed) has been seen in multiple failed test cases. > FAILED: org.apache.solr.cloud.TestPullReplica.testKillPullReplica > FAILED: > org.apache.solr.cloud.SplitShardWithNodeRoleTest.testSolrClusterWithNodeRoleWithPull > FAILED: org.apache.solr.cloud.TestPullReplica.testAddDocs > > > {code:java} > com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an > uncaught exception in thread: Thread[id=10271, > name=fsyncService-6341-thread-1, state=RUNNABLE, > group=TGRP-SplitShardWithNodeRoleTest] > at > __randomizedtesting.SeedInfo.seed([3F7DACB3BC44C3C4:E5DB3E97188A8EB9]:0) > Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is > closed > at __randomizedtesting.SeedInfo.seed([3F7DACB3BC44C3C4]:0) > at > app//org.apache.lucene.store.BaseDirectory.ensureOpen(BaseDirectory.java:50) > at > app//org.apache.lucene.store.ByteBuffersDirectory.sync(ByteBuffersDirectory.java:237) > at > app//org.apache.lucene.tests.store.MockDirectoryWrapper.sync(MockDirectoryWrapper.java:214) > at > app//org.apache.solr.handler.IndexFetcher$DirectoryFile.sync(IndexFetcher.java:2034) > at > 
app//org.apache.solr.handler.IndexFetcher$FileFetcher.lambda$fetch$0(IndexFetcher.java:1803) > at > app//org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$1(ExecutorUtil.java:449) > at > java.base@11.0.24/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base@11.0.24/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base@11.0.24/java.lang.Thread.run(Thread.java:829) > {code} > > Interesting thing about these test cases is that they all share same kind of > setup where each has one shard and two replicas – one NRT and another is PULL. > > Going through one of the test case execution step. > FAILED: org.apache.solr.cloud.TestPullReplica.testKillPullReplica > > Test flow > 1. Create a collection with 1 NRT and 1 PULL replica > 2. waitForState > 3. waitForNumDocsInAllActiveReplicas(0); // *Name says it all* > 4. Index another document. > 5. waitForNumDocsInAllActiveReplicas(1); > 6. Stop Pull replica > 7. Index another document > 8. waitForNumDocsInAllActiveReplicas(2); > 9. Start Pull Replica > 10. waitForState > 11. waitForNumDocsInAllActiveReplicas(2); > > As per the logs the whole sequence executed successfully. Here is the link to > the logs: > [https://ge.apache.org/s/yxydiox3gvlf2/tests/task/:solr:core:test/details/org.apache.solr.cloud.TestPullReplica/testKillPullReplica/1/output] > (link may stop working in the future) > > Last step where they are making sure that all the active replicas should have > two documents each has logged a info which is another proof that it completed > successfully. 
> > {code:java} > 616575 INFO > (TEST-TestPullReplica.testKillPullReplica-seed#[F30CC837FDD0DC28]) [n: c: s: > r: x: t:] o.a.s.c.TestPullReplica Replica core_node3 > (https://127.0.0.1:35647/solr/pull_replica_test_kill_pull_replica_shard1_replica_n1/) > has all 2 docs 616606 INFO (qtp1091538342-13057-null-11348) > [n:127.0.0.1:38207_solr c:pull_replica_test_kill_pull_replica s:shard1 > r:core_node4 x:pull_replica_test_kill_pull_replica_shard1_replica_p2 > t:null-11348] o.a.s.c.S.Request webapp=/solr path=/select > params={q=*:*&wt=javabin&version=2} rid=null-11348 hits=2 status=0 QTime=0 > 616607 INFO > (TEST-TestPullReplica.testKillPullReplica-seed#[F30CC837FDD0DC28]) [n: c: s: > r: x: t:] o.a.s.c.TestPullReplica Replica core_node4 > (https://127.0.0.1:38207/solr/pull_replica_test_kill_pull_replica_shard1_replica_p2/) > has all 2 docs{code} > > *Where is the issue then?* > In the logs it has been observed, that after restarting the PULL replica. The > recovery process started and after fetching all the files info from the NRT, >
Re: [PR] SOLR-17453: Leverage waitForState() instead of busy waiting [solr]
psalagnac commented on PR #2737: URL: https://github.com/apache/solr/pull/2737#issuecomment-2454956000 > One thought, is there a way to enforce the use of waitForState() pattern via any of our code quality tools? Not sure how we can automate the decision on whether usages of `Timeout` are legitimate or not. We should use `waitForState()` instead of busy waiting for changes in ZooKeeper, so that we leverage the registered watchers. There are other cases, mostly when doing Solr-to-Solr requests, where we should keep `Timeout`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
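The difference between watcher-driven waiting and busy waiting can be sketched with plain JDK classes. This is only an illustration of the pattern discussed in the comment above: `WatchedState`, its `set` method, and this `waitForState` are hypothetical stand-ins and are not Solr's or ZooKeeper's API.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Small driver: one thread changes the state, the caller waits for it. */
final class WaitForStateDemo {
  public static void main(String[] args) throws Exception {
    WatchedState state = new WatchedState();
    Thread updater = new Thread(() -> state.set("active"));
    updater.start();
    boolean reached = state.waitForState("active"::equals, 5, TimeUnit.SECONDS);
    updater.join();
    System.out.println("reached active: " + reached);
  }
}

/** Hypothetical stand-in for a watched piece of cluster state; not a Solr class. */
final class WatchedState {
  private volatile String value = "down";
  private final List<Consumer<String>> watchers = new CopyOnWriteArrayList<>();

  /** Updates the value and notifies every registered watcher. */
  void set(String newValue) {
    value = newValue;
    for (Consumer<String> watcher : watchers) {
      watcher.accept(newValue);
    }
  }

  /** Blocks until the predicate matches or the timeout elapses, without a polling loop. */
  boolean waitForState(Predicate<String> predicate, long timeout, TimeUnit unit)
      throws InterruptedException {
    CountDownLatch matched = new CountDownLatch(1);
    Consumer<String> watcher = v -> {
      if (predicate.test(v)) {
        matched.countDown();
      }
    };
    watchers.add(watcher);
    try {
      // Check the current value only after registering, so no update can be missed.
      if (predicate.test(value)) {
        return true;
      }
      return matched.await(timeout, unit);
    } finally {
      watchers.remove(watcher);
    }
  }
}
```

The caller sleeps on the latch and is woken by the notification, so there is no sleep-and-recheck loop and no polling interval to tune. A timeout is still needed, but only as an upper bound.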
Re: [PR] SOLR-17535 ClusterState.collectionStream() in lieu of getCollectionStates() [solr]
murblanc commented on code in PR #2834: URL: https://github.com/apache/solr/pull/2834#discussion_r1827321369 ## solr/solrj/src/java/org/apache/solr/common/cloud/ClusterState.java: ## @@ -400,29 +402,14 @@ public Set getHostAllowList() { return hostAllowList; } - /** - * Iterate over collections. Unlike {@link #getCollectionStates()} collections passed to the - * consumer are guaranteed to exist. - * - * @param consumer collection consumer. - */ + /** Streams the resolved DocCollections. Use this sparingly in case there are many collections. */ Review Comment: Better use Javadoc tags to refer to other classes
Re: [PR] SOLR-17535 ClusterState.collectionStream() in lieu of getCollectionStates() [solr]
murblanc commented on code in PR #2834: URL: https://github.com/apache/solr/pull/2834#discussion_r1827926712 ## solr/core/src/java/org/apache/solr/handler/designer/SchemaDesignerConfigSetHelper.java: ## @@ -168,24 +167,12 @@ Map analyzeField(String configSet, String fieldName, String fiel } List listCollectionsForConfig(String configSet) { -final List collections = new ArrayList<>(); -Map states = -zkStateReader().getClusterState().getCollectionStates(); -for (Map.Entry e : states.entrySet()) { - final String coll = e.getKey(); - if (coll.startsWith(DESIGNER_PREFIX)) { -continue; // ignore temp - } - - try { -if (configSet.equals(e.getValue().get().getConfigName()) && e.getValue().get() != null) { - collections.add(coll); -} - } catch (Exception exc) { -log.warn("Failed to get config name for {}", coll, exc); - } -} -return collections; +return zkStateReader() +.getClusterState() +.collectionStream() Review Comment: My comment is unrelated to the name filter. This code collects collection names. The change forces it to read the `state.json` of the collections even though it doesn't need the additional info. The behavior also changes: partially created collections will no longer be considered, whereas previously they were. I don't know what this specific class does with the collection name, but since this PR deprecates a method without providing an alternate way of achieving the same result (with comparable performance), I stand by my comment to add `getCollectionNames()` in this PR.
[jira] [Commented] (SOLR-17497) Pull replicas throws AlreadyClosedException
[ https://issues.apache.org/jira/browse/SOLR-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895334#comment-17895334 ] ASF subversion and git services commented on SOLR-17497: Commit 4c211ba5be43c63d3b9a7ccf5810b4aa73960e80 in solr's branch refs/heads/branch_9_7 from Sanjay Dutt [ https://gitbox.apache.org/repos/asf?p=solr.git;h=4c211ba5be4 ] SOLR-17448 SOLR-17497: IndexFetcher, catch exception instead of bubbling up uncaught (#2800) (cherry picked from commit cc30093c5ee988555389b50cf2333edf743bb50f) > Pull replicas throws AlreadyClosedException > - > > Key: SOLR-17497 > URL: https://issues.apache.org/jira/browse/SOLR-17497 > Project: Solr > Issue Type: Task >Reporter: Sanjay Dutt >Priority: Major > Attachments: Screenshot 2024-10-23 at 6.01.02 PM.png
[jira] [Commented] (SOLR-17448) Audit usage of ExecutorService#submit in Solr codebase
[ https://issues.apache.org/jira/browse/SOLR-17448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895333#comment-17895333 ] ASF subversion and git services commented on SOLR-17448: Commit 4c211ba5be43c63d3b9a7ccf5810b4aa73960e80 in solr's branch refs/heads/branch_9_7 from Sanjay Dutt [ https://gitbox.apache.org/repos/asf?p=solr.git;h=4c211ba5be4 ] SOLR-17448 SOLR-17497: IndexFetcher, catch exception instead of bubbling up uncaught (#2800) (cherry picked from commit cc30093c5ee988555389b50cf2333edf743bb50f) > Audit usage of ExecutorService#submit in Solr codebase > -- > > Key: SOLR-17448 > URL: https://issues.apache.org/jira/browse/SOLR-17448 > Project: Solr > Issue Type: Improvement >Affects Versions: 9.7 >Reporter: Andrey Bozhko >Priority: Minor > Labels: pull-request-available > Fix For: 9.8 > > Time Spent: 2.5h > Remaining Estimate: 0h > > There are quite a few places in the Solr codebase where a background task is > created by invoking the `ExecutorService#submit(...)` method, but where the > reference to the returned future is not retained. > So if the background task fails for any reason, and the task doesn't itself > have a try-catch block to log the failure, the failure will go completely > unnoticed. > > This ticket is to review the usage of the ExecutorService#submit method in the > codebase, and replace it with Executor#execute where appropriate. > > Originally brought up in the dev mailing list: > [https://lists.apache.org/thread/5f1965rltcspgw0j8nzcn2qnz9l4s8qm]
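The failure mode described in the ticket can be reproduced with the JDK alone. The sketch below is not taken from the Solr codebase. The class and method names are made up for illustration, and it uses a plain `Executors.newSingleThreadExecutor` rather than Solr's `ExecutorUtil` pools, which may behave differently.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

final class SubmitVsExecuteDemo {

  /**
   * Runs a failing task and returns what the thread's uncaught-exception
   * handler observed, or null if the handler was never invoked.
   */
  static Throwable handlerSaw(boolean useSubmit) throws InterruptedException {
    AtomicReference<Throwable> seen = new AtomicReference<>();
    CountDownLatch handlerRan = new CountDownLatch(1);
    ExecutorService pool = Executors.newSingleThreadExecutor(runnable -> {
      Thread t = new Thread(runnable);
      t.setUncaughtExceptionHandler((thread, e) -> {
        seen.set(e);
        handlerRan.countDown();
      });
      return t;
    });

    Runnable failing = () -> {
      throw new IllegalStateException("boom");
    };
    if (useSubmit) {
      // The exception is stored in the returned Future, which is dropped here.
      pool.submit(failing);
    } else {
      // The exception escapes the worker thread and reaches the handler.
      pool.execute(failing);
    }

    pool.shutdown();
    pool.awaitTermination(5, TimeUnit.SECONDS);
    // The handler runs after the worker exits, so wait on it explicitly.
    handlerRan.await(useSubmit ? 200 : 5000, TimeUnit.MILLISECONDS);
    return seen.get();
  }

  public static void main(String[] args) throws Exception {
    System.out.println("submit:  handler saw " + handlerSaw(true));
    System.out.println("execute: handler saw " + handlerSaw(false));
  }
}
```

With `submit`, the only way to learn about the failure is to call `get()` on the returned `Future`. With `execute`, the exception propagates out of the worker thread, where a handler or logging wrapper can report it.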
Re: [PR] SOLR-17535 ClusterState.collectionStream() in lieu of getCollectionStates() [solr]
dsmiley commented on code in PR #2834: URL: https://github.com/apache/solr/pull/2834#discussion_r182794 ## solr/core/src/java/org/apache/solr/handler/designer/SchemaDesignerConfigSetHelper.java: ## @@ -168,24 +167,12 @@ Map analyzeField(String configSet, String fieldName, String fiel } List listCollectionsForConfig(String configSet) { -final List collections = new ArrayList<>(); -Map states = -zkStateReader().getClusterState().getCollectionStates(); -for (Map.Entry e : states.entrySet()) { - final String coll = e.getKey(); - if (coll.startsWith(DESIGNER_PREFIX)) { -continue; // ignore temp - } - - try { -if (configSet.equals(e.getValue().get().getConfigName()) && e.getValue().get() != null) { - collections.add(coll); -} - } catch (Exception exc) { -log.warn("Failed to get config name for {}", coll, exc); - } -} -return collections; +return zkStateReader() +.getClusterState() +.collectionStream() Review Comment: The DocCollection was being read before as well. It's where the configSet name is.
Re: [PR] SOLR-16116: Use apache curator to manage the Solr Zookeeper interactions [solr]
HoustonPutman commented on PR #760: URL: https://github.com/apache/solr/pull/760#issuecomment-2455058672 Yes, that is absolutely related, but I thoroughly tested that before pushing 🙄 I'll take a look.
Re: [PR] SOLR-17535 ClusterState.collectionStream() in lieu of getCollectionStates() [solr]
dsmiley commented on code in PR #2834: URL: https://github.com/apache/solr/pull/2834#discussion_r1827951977 ## solr/solrj/src/java/org/apache/solr/common/cloud/ClusterState.java: ## @@ -382,6 +383,7 @@ void setLiveNodes(Set liveNodes) { * Be aware that this may return collections which may not exist now. You can confirm that this * collection exists after verifying CollectionRef.get() != null */ + @Deprecated // see collectionStream() Review Comment: Indeed; this PR shall not be merged until getCollectionNames is (today!)
[jira] [Updated] (SOLR-17487) Can't POST a dense vector that contains two or more occurences of the same float value
[ https://issues.apache.org/jira/browse/SOLR-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pierre Salagnac updated SOLR-17487: --- Attachment: image.png > Can't POST a dense vector that contains two or more occurences of the same > float value > -- > > Key: SOLR-17487 > URL: https://issues.apache.org/jira/browse/SOLR-17487 > Project: Solr > Issue Type: Bug > Components: UpdateRequestProcessors >Affects Versions: 9.7, 9.6.1 >Reporter: Guillaume Jactat >Priority: Major > Attachments: image-2024-10-10-18-05-01-195.png, > image-2024-10-10-18-07-14-904.png, image-2024-10-10-18-07-19-370.png, > image-2024-10-10-23-27-26-566.png, image.png, vector-384.json, > vector-384.xml, vector-768.json > > > *EDIT 10/10/2024* : > After a detailed analysis of the problematic vectors, I found that the > “missing” dimensions were actually dimensions of the same value. > In concrete terms, values that appear several times in the posted vectors are > deduplicated by Solr. > You can see for yourself that the vectors supplied as attachments have the > common characteristic of containing {*}two or more occurrences of the very > same float value{*}. The embedding model I use (all-minilm:33m) seems to > generate many such cases. > It seems that {*}Solr only takes into account the first occurrence of these > values{*}. As a result, the length of the final vector is no longer correct. > The following screenshot shows exactly what happens with a smaller vector > field type of size 5: the vector [1, 5, 3, 4, 5] becomes [1, > 5, 3, 4]. > !image-2024-10-10-23-27-26-566.png! > > - > Hello, > > I'm using Solr 9.7 as a vector database. I've come across something I can't > explain: I POST my documents as JSON and I've got a vector field of > dimension {*}768{*}. > > The JSON document I POST has a vector field, which is an array of length 768. > Each value is a float. > > Solr complains that my array is only *767* long... 
> I've compared the JSON I POST and the array parsed by Solr and written in the > logs. And indeed, one of the 768 values has simply disappeared in the > process. > > The problem can easily be reproduced. All you have to do is: > * In your "schema.xml", declare the following dense vector field type : > {code:java} > vectorDimension="768" similarityFunction="cosine"/>{code} > * In your schema.xml, declare the following dense vector dynamic field : > {code:java} > stored="true"/>{code} > * Use the Solr Admin UI to post the *attached document* to your Solr core. > * You should get the following error : "{*}incorrect vector dimension. The > vector value has size 767 while it is expected a vector with size 768"{*} > > * Furthermore, while the POSTed vector has size 768, the vector written in > the logs has only 767 values... One value is missing. You can easily spot the missing > value with a simple diff. > Maybe someone will find the reason why this specific vector leads to this > issue. Of course, I have plenty of other documents that get indexed without > any issue. > In case it helps, the value that disappears from the 768 vector is > "0.0335415453". It's the 384th dimension (starting from 1) > !image-2024-10-10-18-07-19-370.png! > Thanks for reading
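The symptom in the report (only the first occurrence of each value survives) is what you get when a list of floats passes through a set. The sketch below only reproduces that symptom on the small example from the description. It is not a claim about where, or whether, Solr does this internally, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class DuplicateVectorValuesDemo {

  /** Keeps only the first occurrence of each value, mimicking the reported symptom. */
  static List<Float> firstOccurrencesOnly(List<Float> vector) {
    return new ArrayList<>(new LinkedHashSet<>(vector));
  }

  /** Returns the values that appear more than once, in the order they first repeat. */
  static Set<Float> repeatedValues(List<Float> vector) {
    Set<Float> seen = new HashSet<>();
    Set<Float> repeated = new LinkedHashSet<>();
    for (Float v : vector) {
      if (!seen.add(v)) {
        repeated.add(v);
      }
    }
    return repeated;
  }

  public static void main(String[] args) {
    List<Float> posted = List.of(1f, 5f, 3f, 4f, 5f);
    System.out.println("posted size:     " + posted.size());
    System.out.println("after dedup:     " + firstOccurrencesOnly(posted));
    System.out.println("repeated values: " + repeatedValues(posted));
  }
}
```

A check like `repeatedValues` can be run against an embedding before posting it, to tell whether a given document is likely to trigger the dimension error described above.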
[jira] [Updated] (SOLR-17487) Can't POST a dense vector that contains two or more occurences of the same float value
[ https://issues.apache.org/jira/browse/SOLR-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pierre Salagnac updated SOLR-17487: --- Attachment: (was: image.png) > Can't POST a dense vector that contains two or more occurences of the same > float value > -- > > Key: SOLR-17487 > URL: https://issues.apache.org/jira/browse/SOLR-17487 > Project: Solr > Issue Type: Bug > Components: UpdateRequestProcessors >Affects Versions: 9.7, 9.6.1 >Reporter: Guillaume Jactat >Priority: Major > Attachments: image-2024-10-10-18-05-01-195.png, > image-2024-10-10-18-07-14-904.png, image-2024-10-10-18-07-19-370.png, > image-2024-10-10-23-27-26-566.png, vector-384.json, vector-384.xml, > vector-768.json
Re: [PR] SOLR-17516: LBHttpSolrClient: support HttpJdkSolrClient (Generic Version) [solr]
dsmiley commented on code in PR #2828: URL: https://github.com/apache/solr/pull/2828#discussion_r1827982607 ## solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBSolrClient.java: ## @@ -699,11 +699,20 @@ public NamedList request( if (e.getRootCause() instanceof IOException) { ex = e; moveAliveToDead(wrapper); - if (justFailed == null) justFailed = new HashMap<>(); + if (justFailed == null) { +justFailed = new HashMap<>(); + } Review Comment: It's a shame to see duplication in getRootCause IOException detection with the dedicated clause. Maybe it'd be cleaner to have another try-catch just around calling request() above that detects the IOException and throws it? Just a thought; I leave it to you.
Re: [PR] SOLR-17538: CloudHttp2SolrClient needs a custom ClusterStateProvider option [solr]
dsmiley commented on code in PR #2832: URL: https://github.com/apache/solr/pull/2832#discussion_r1827994098 ## solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudHttp2SolrClient.java: ## @@ -233,6 +240,11 @@ public Builder(List zkHosts, Optional zkChroot) { if (zkChroot.isPresent()) this.zkChroot = zkChroot.get(); } +/* for an expert use-case */ Review Comment: Missing another asterisk
Re: [PR] Prevent conflicting connection flags from being used via OptionGroup [solr]
malliaridis commented on PR #2725: URL: https://github.com/apache/solr/pull/2725#issuecomment-2455326616 @epugh this one is ready I believe. The merge with main was ugly. >.< I ended up with more lines added than removed, even though I have simplified and removed redundant elements. This is probably only because of the extraction of options to a separate file or variables. I made a few cleanups and migrations for consistency and also fixed a bug in StatusTool (see 31b1becd357e3f6d22950e9868cd1ff07686b129). Now we are using `getOptionValue` only with `Option` as parameter, not strings. And I also migrated to `getParsedOptionValue` wherever possible (booleans are not supported and the File parsing seems buggy).
Re: [PR] Prevent conflicting connection flags from being used via OptionGroup [solr]
malliaridis commented on PR #2725: URL: https://github.com/apache/solr/pull/2725#issuecomment-2455332282 I didn't find any mutually exclusive options that would call for groups. Groups were relevant for the deprecated options, to simplify the `getOptionValue` call with a single option group. But since we have removed the deprecated options, we are back to using only `Option`s.
[jira] [Commented] (SOLR-15929) Clean up NOTICE and LICENSE files for Solr
[ https://issues.apache.org/jira/browse/SOLR-15929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895377#comment-17895377 ] Christos Malliaridis commented on SOLR-15929: - We should also make sure that all files are up-to-date. I noticed a few differences in some libraries. Since we may not have a single source of truth for these files, we may end up with inconsistencies anyway, but standardizing the approach of fetching the content of these files should help with that (more of a documentation task probably). > Clean up NOTICE and LICENSE files for Solr > -- > > Key: SOLR-15929 > URL: https://issues.apache.org/jira/browse/SOLR-15929 > Project: Solr > Issue Type: Improvement >Reporter: Jan Høydahl >Priority: Major > > Spinoff from SOLR-15862 and SOLR-2406: > We need a total cleanup of both these files > * Move lots of (C) notices from NOTICE to LICENSE file > * Cross-check that we list all dependencies, and that removed deps (such as > for DIH etc) are removed from NOTICE/LICENSE > I wonder if > [https://github.com/apache/solr/blob/main/solr/licenses/README.committers.txt] > should also be relocated to either `dev-docs/` or `help/` to make it easier > to find. It is hard to get the license/notice stuff right, so we need a good > guide for committers! > See [https://infra.apache.org/licensing-howto.html] for the requirements. PS: > Any preference whether we should rename the files without {{.txt}} suffix? > Also, our source and binary distributions are quite different, and would > ideally have different LICENSE and NOTICE files compared to the binary > distro. I think the Apache Whisker tool could potentially help with this > [https://creadur.apache.org/whisker/index.html] but have not looked deeply.
[jira] [Created] (SOLR-17542) AccessControlException when attempting to post document
Ivo Janssen created SOLR-17542: -- Summary: AccessControlException when attempting to post document Key: SOLR-17542 URL: https://issues.apache.org/jira/browse/SOLR-17542 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: contrib - Solr Cell (Tika extraction), security Affects Versions: 9.7 Environment: * Solr 9.7 * MacOS 15.0.1 * M1 Max CPU * 64GB RAM Reporter: Ivo Janssen I'm using Solr 9.7 on MacOS 15.0.1, with Cell enabled, and it returns a 500 error when I try to add a document. The error on Solr's side is as follows: {noformat} 2024-10-18 00:49:03.350 INFO (qtp1955990522-40-localhost-1) [c: s: r: x:test_docstore t:localhost-1] o.a.s.c.PluginBag Going to create a new requestHandler with {type = requestHandler,name = /update/extract,class = solr.extraction.ExtractingRequestHandler,attributes = {startup=lazy, name=/update/extract, class=solr.extraction.ExtractingRequestHandler},args = {defaults={fmap.Last-Modified=last_modified, uprefix=ignored_, df=_text_}}} 2024-10-18 00:49:03.653 ERROR (qtp1955990522-40-localhost-1) [c: s: r: x:test_docstore t:localhost-1] o.a.s.s.HttpSolrCall 500 Exception => java.lang.IllegalStateException: java.security.AccessControlException: access denied ("java.io.FilePermission" "/private/var/folders/8y/0166d0yx0wd7lxycs42l6t9cgs/T/jetty-127_0_0_1-8983-webapp-_solr-any-16097010865664396603" "read") at org.eclipse.jetty.server.MultiPartFormInputStream.throwIfError(MultiPartFormInputStream.java:526) java.lang.IllegalStateException: java.security.AccessControlException: access denied ("java.io.FilePermission" "/private/var/folders/8y/0166d0yx0wd7lxycs42l6t9cgs/T/jetty-127_0_0_1-8983-webapp-_solr-any-16097010865664396603" "read") at org.eclipse.jetty.server.MultiPartFormInputStream.throwIfError(MultiPartFormInputStream.java:526) ~[jetty-server-10.0.22.jar:10.0.22] at org.eclipse.jetty.server.MultiPartFormInputStream.getParts(MultiPartFormInputStream.java:491) 
~[jetty-server-10.0.22.jar:10.0.22] at org.eclipse.jetty.server.MultiParts$MultiPartsHttpParser.getParts(MultiParts.java:90) ~[jetty-server-10.0.22.jar:10.0.22] at org.eclipse.jetty.server.Request.getParts(Request.java:2354) ~[jetty-server-10.0.22.jar:10.0.22] at org.eclipse.jetty.server.Request.getParts(Request.java:2328) ~[jetty-server-10.0.22.jar:10.0.22] at javax.servlet.http.HttpServletRequestWrapper.getParts(HttpServletRequestWrapper.java:317) ~[jetty-servlet-api-4.0.6.jar:?] at org.apache.solr.servlet.SolrRequestParsers$MultipartRequestParser.parseParamsAndFillStreams(SolrRequestParsers.java:649) ~[?:?] at org.apache.solr.servlet.SolrRequestParsers$StandardRequestParser.parseParamsAndFillStreams(SolrRequestParsers.java:893) ~[?:?] at org.apache.solr.servlet.SolrRequestParsers.parse(SolrRequestParsers.java:169) ~[?:?] at org.apache.solr.servlet.HttpSolrCall.init(HttpSolrCall.java:313) ~[?:?] at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:524) ~[?:?] at org.apache.solr.servlet.SolrDispatchFilter.dispatch(SolrDispatchFilter.java:251) ~[?:?] at org.apache.solr.servlet.SolrDispatchFilter.lambda$doFilter$0(SolrDispatchFilter.java:208) ~[?:?] at org.apache.solr.servlet.ServletUtils.traceHttpRequestExecution2(ServletUtils.java:243) ~[?:?] at org.apache.solr.servlet.ServletUtils.rateLimitRequest(ServletUtils.java:213) ~[?:?] {noformat} I've confirmed that this is related to the security policy, since I'm able to work around it by running Solr with `-Djava.security.manager=allow`, but looking at the policy nothing jumps out at me for being wrong or missing. [Link to discussion on the mailing list|https://lists.apache.org/thread/8grxnpnxtyb2c1wb4j4vpl88vktzfy13] (disregard my attempted fix in that thread - it was incorrect) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17534 ClusterState.getCollectionNames [solr]
dsmiley merged PR #2826: URL: https://github.com/apache/solr/pull/2826 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] Update org.hamcrest:* to v3 (major) [solr]
malliaridis merged PR #2617: URL: https://github.com/apache/solr/pull/2617 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-7871) Platform independent config file instead of solr.in.sh and solr.in.cmd
[ https://issues.apache.org/jira/browse/SOLR-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895278#comment-17895278 ] Eric Pugh commented on SOLR-7871: - Hi all. I'd like to take another, fresh run at this idea. I'm coming from the perspective I've formed from using Ruby on Rails: you have a configuration that establishes what your environment looks like. To learn more about how Rails does it, check out this (GPT) summary: [https://poe.com/s/MvWTBLoLz7S60kCKaKPE] What does this look like for Solr? We introduce two default files, development.yml and production.yml. They live in the ./server/solr/environments directory. When you run bin/solr start, you automagically read in development.yml. When you run bin/solr start -e production, you read in production.yml. You can create your own as well. How do we adopt this? Start with the bin/solr start and bin/solr stop commands. Ensure that bin/solr start feeds all the various environment/system properties into a running Solr. Then expand to the other bin/solr commands to make sure they are using this configuration. Then, deprecate solr.in.sh and solr.in.cmd in 9x. Make sure the start service continues to function. Then, in 10x, remove solr.in.sh and solr.in.cmd in favour of the environment-based configuration files. The configuration files will list out ALL of the various system properties with some short comments, and have them enabled or disabled as makes sense for that environment.
> Platform independent config file instead of solr.in.sh and solr.in.cmd > -- > > Key: SOLR-7871 > URL: https://issues.apache.org/jira/browse/SOLR-7871 > Project: Solr > Issue Type: Improvement > Components: scripts and tools >Affects Versions: 5.2.1 >Reporter: Jan Høydahl >Priority: Major > Labels: bin/solr > Attachments: SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, > SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, > SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, > SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch > > > Spinoff from SOLR-7043 > The config files {{solr.in.sh}} and {{solr.in.cmd}} are currently executable > batch files, but all they do is to set environment variables for the start > scripts on the format {{key=value}} > Suggest to instead have one central platform independent config file e.g. > {{bin/solr.yml}} or {{bin/solrstart.properties}} which is parsed by > {{SolrCLI.java}}. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
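As a rough illustration of the environment-file idea discussed in this thread, here is a minimal sketch. It assumes a flat key=value file per environment (the issue description mentions a .properties option) instead of YAML, because the JDK ships no YAML parser. The class name EnvConfigSketch, its method names, and the sample keys are all hypothetical; none of this is SolrCLI code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical sketch of an environment-based config loader; not part of Solr.
final class EnvConfigSketch {

  /** File name for an environment; falls back to "development" when none is given. */
  static String fileNameFor(String environment) {
    String env = (environment == null || environment.isBlank()) ? "development" : environment;
    return env + ".properties";
  }

  /** Parses key=value content; lines starting with '#' are treated as disabled settings. */
  static Properties parse(String content) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(content));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }

  /** Turns each enabled setting into a -Dkey=value JVM argument, sorted by key. */
  static List<String> toJvmArgs(Properties props) {
    List<String> args = new ArrayList<>();
    for (String key : new TreeSet<>(props.stringPropertyNames())) {
      args.add("-D" + key + "=" + props.getProperty(key));
    }
    return args;
  }

  public static void main(String[] argv) {
    // Sample keys are made up for illustration only.
    String sample = "# example.host=0.0.0.0\nexample.port=8983\n";
    System.out.println(fileNameFor(null));
    System.out.println(toJvmArgs(parse(sample)));
  }
}
```

Commented-out lines stay in the file as documentation of what could be enabled, which matches the "list out ALL of the properties, enabled or disabled per environment" idea above.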
Re: [PR] SOLR-17256: Remove usages of deprecated SolrRequest.setBasePath [solr]
gerlowskija commented on code in PR #2811: URL: https://github.com/apache/solr/pull/2811#discussion_r1827843121 ## solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBSolrClient.java: ## @@ -480,14 +480,35 @@ private static boolean isTimeExceeded(long timeAllowedNano, long timeOutTime) { return timeAllowedNano > 0 && System.nanoTime() > timeOutTime; } + private NamedList doMakeRequest(Endpoint endpoint, SolrRequest solrRequest) + throws SolrServerException, IOException { +final var solrClient = getClient(endpoint); +return doMakeRequest(solrClient, endpoint.getBaseUrl(), endpoint.getCore(), solrRequest); + } + + // TODO This special casing can be removed if either: (1) SOLR-16367 is completed, or (2) + // LBHttp2SolrClient.getClient() is modified to return a client already pointed at the correct URL + private NamedList doMakeRequest( + SolrClient solrClient, String baseUrl, String collection, SolrRequest solrRequest) + throws SolrServerException, IOException { +// Some implementations of LBSolrClient.getClient(...) return a Http2SolrClient that may not be +// pointed at the desired URL (or any URL for that matter). We special case that here to ensure +// the appropriate URL is provided. +if (solrClient instanceof Http2SolrClient) { + final var httpSolrClient = (Http2SolrClient) solrClient; + return httpSolrClient.requestWithBaseUrl(baseUrl, (c) -> c.request(solrRequest, collection)); +} + +return solrClient.request(solrRequest, collection); Review Comment: Great - just created SOLR-17541, and updated the TODO comment around this method accordingly -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
Re: [PR] SOLR-17256: Remove usages of deprecated SolrRequest.setBasePath [solr]
gerlowskija commented on PR #2811: URL: https://github.com/apache/solr/pull/2811#issuecomment-2454900565 > I know it's out of scope... sort of... but @iamsanjay recently updated SolrClientNodeStateProvider.invoke to not call setBasePath but the result is more lines of code / complexity than needed. I haven't run tests to validate the change yet, but I've included this for now. Will walk it back if it causes any complications though... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17453: Leverage waitForState() instead of busy waiting [solr]
psalagnac commented on code in PR #2737: URL: https://github.com/apache/solr/pull/2737#discussion_r1827843673 ## solr/core/src/java/org/apache/solr/cloud/api/collections/CreateCollectionCmd.java: ## @@ -221,24 +223,19 @@ public void call(ClusterState clusterState, ZkNodeProps message, NamedList
Re: [PR] SOLR-17453: Leverage waitForState() instead of busy waiting [solr]
psalagnac commented on code in PR #2737: URL: https://github.com/apache/solr/pull/2737#discussion_r1827855744 ## solr/test-framework/src/java/org/apache/solr/cloud/AbstractDistribZkTestBase.java: ## @@ -242,45 +240,15 @@ public static void waitForCollectionToDisappear( log.info("Collection has disappeared - collection:{}", collection); } - static void waitForNewLeader( - CloudSolrClient cloudClient, String shardName, Replica oldLeader, TimeOut timeOut) + static void waitForNewLeader(CloudSolrClient cloudClient, String shardName, Replica oldLeader) throws Exception { -log.info("Will wait for a node to become leader for {} secs", timeOut.timeLeft(SECONDS)); +log.info("Will wait for a node to become leader for 15 secs"); ZkStateReader zkStateReader = ZkStateReader.from(cloudClient); -zkStateReader.forceUpdateCollection(DEFAULT_COLLECTION); - -for (; ; ) { - ClusterState clusterState = zkStateReader.getClusterState(); - DocCollection coll = clusterState.getCollection("collection1"); - Slice slice = coll.getSlice(shardName); - if (slice.getLeader() != null - && !slice.getLeader().equals(oldLeader) - && slice.getLeader().getState() == Replica.State.ACTIVE) { -if (log.isInfoEnabled()) { - log.info( - "Old leader {}, new leader {}. New leader got elected in {} ms", - oldLeader, - slice.getLeader(), - timeOut.timeElapsed(MILLISECONDS)); -} -break; - } - - if (timeOut.hasTimedOut()) { Review Comment: Good point. But this is at the cost of potentially slower test execution since we don't unblock the test thread when we receive the ZK watch. We would get the same if we call `logThreadDumps(...)` and `printLayoutToStream(...)` in case of error. I can do such a change. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
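For readers following the busy-wait versus watch discussion in this PR, the difference can be sketched without any Solr or ZooKeeper classes. The class below is a hypothetical stand-in (WatchedStateSketch is not ZkStateReader, and its method names are made up): the waiting thread registers a watcher and blocks on a latch, so it is released as soon as a matching update is published and never sleeps in a poll loop.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical stand-in for a watched cluster state; not Solr code.
final class WatchedStateSketch<T> {
  private final List<Consumer<T>> watchers = new CopyOnWriteArrayList<>();
  private volatile T current;

  WatchedStateSketch(T initial) {
    this.current = initial;
  }

  /** Publishes a new state and notifies every registered watcher. */
  void update(T next) {
    current = next;
    for (Consumer<T> watcher : watchers) {
      watcher.accept(next);
    }
  }

  /** Returns true once the predicate matches, or false on timeout or interruption. */
  boolean waitForState(Predicate<T> predicate, long timeoutMillis) {
    CountDownLatch matched = new CountDownLatch(1);
    Consumer<T> watcher = state -> {
      if (predicate.test(state)) {
        matched.countDown();
      }
    };
    watchers.add(watcher);
    try {
      // Check the current state after registering so an update in between is not missed.
      watcher.accept(current);
      return matched.await(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      watchers.remove(watcher);
    }
  }
}
```

On a false return a test could still dump threads or print the layout before failing, which is the diagnostic step discussed in the review comment above.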
Re: [PR] SOLR-17453: Leverage waitForState() instead of busy waiting [solr]
psalagnac commented on code in PR #2737: URL: https://github.com/apache/solr/pull/2737#discussion_r1827860425 ## solr/core/src/test/org/apache/solr/cloud/TestRebalanceLeaders.java: ## @@ -604,74 +572,61 @@ private void forceUpdateCollectionStatus() { // Since we have to restart jettys, we don't want to try re-balancing etc. until we're sure all // jettys that should be up are and all replicas are active. - private void checkReplicasInactive(List downJettys) throws InterruptedException { -TimeOut timeout = new TimeOut(timeoutMs, TimeUnit.MILLISECONDS, TimeSource.NANO_TIME); -DocCollection docCollection = null; -Set liveNodes = null; + private void checkReplicasInactive(List downJettys) { Set downJettyNodes = new TreeSet<>(); for (JettySolrRunner jetty : downJettys) { downJettyNodes.add( jetty.getBaseUrl().getHost() + ":" + jetty.getBaseUrl().getPort() + "_solr"); } -while (timeout.hasTimedOut() == false) { - forceUpdateCollectionStatus(); - docCollection = cluster.getSolrClient().getClusterState().getCollection(COLLECTION_NAME); - liveNodes = cluster.getSolrClient().getClusterState().getLiveNodes(); - boolean expectedInactive = true; - - for (Slice slice : docCollection.getSlices()) { -for (Replica rep : slice.getReplicas()) { - if (downJettyNodes.contains(rep.getNodeName()) == false) { -continue; // We are on a live node - } - // A replica on an allegedly down node is reported as active. - if (rep.isActive(liveNodes)) { -expectedInactive = false; + +waitForState( +"Waiting for all replicas to become inactive", +COLLECTION_NAME, +(liveNodes, docCollection) -> { + boolean expectedInactive = true; + + for (Slice slice : docCollection.getSlices()) { +for (Replica rep : slice.getReplicas()) { + if (!downJettyNodes.contains(rep.getNodeName())) { Review Comment: I flipped this because it is reported by my IDE. I didn't know such a convention was embraced at some point. Will change back to `== false` form. -- This is an automated message from the Apache Git Service. 
Re: [PR] SOLR-17516: LBHttpSolrClient: support HttpJdkSolrClient (Generic Version) [solr]
jdyer1 commented on code in PR #2828: URL: https://github.com/apache/solr/pull/2828#discussion_r1827873836 ## solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpJdkSolrClient.java: ## @@ -173,8 +173,8 @@ public NamedList request(SolrRequest solrRequest, String collection) "Timeout occurred while waiting response from server at: " + pReq.url, e); } catch (SolrException se) { throw se; -} catch (RuntimeException re) { - throw new SolrServerException(re); +} catch (IOException | RuntimeException e) { Review Comment: Indeed, the doc-comment on the `LBSolrClient` request method says, > If a request fails due to an IOException, the server is moved to the dead pool So I restored the previous behavior in `HttpJdkSolrClient` and made the modification in `LBSolrClient`. This new behavior is covered by `LBHttp2SolrClientIntegrationTest`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-16790) Umbrella - Improve Solr CLI tools
[ https://issues.apache.org/jira/browse/SOLR-16790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895260#comment-17895260 ] Eric Pugh commented on SOLR-16790: -- I want to do a bit of Jira tidying, and get rid of this issue by moving tickets to SOLR-16757. > Umbrella - Improve Solr CLI tools > - > > Key: SOLR-16790 > URL: https://issues.apache.org/jira/browse/SOLR-16790 > Project: Solr > Issue Type: Improvement > Components: scripts and tools >Affects Versions: 9.2.1 >Reporter: Shawn Heisey >Assignee: Shawn Heisey >Priority: Major > > There are a lot of things we can do to improve Solr's startup. We can use > this issue as an umbrella to keep track of various pieces of that work. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Resolved] (SOLR-16790) Umbrella - Improve Solr CLI tools
[ https://issues.apache.org/jira/browse/SOLR-16790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Pugh resolved SOLR-16790. -- Resolution: Duplicate I am marking this as a duplicate of SOLR-16757 and have moved the child tickets over to that JIRA. It's great that we are all getting the CLI to a healthy place in the 9x and 10x lines of Solr! > Umbrella - Improve Solr CLI tools > - > > Key: SOLR-16790 > URL: https://issues.apache.org/jira/browse/SOLR-16790 > Project: Solr > Issue Type: Improvement > Components: scripts and tools >Affects Versions: 9.2.1 >Reporter: Shawn Heisey >Assignee: Shawn Heisey >Priority: Major > > There are a lot of things we can do to improve Solr's startup. We can use > this issue as an umbrella to keep track of various pieces of that work. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Created] (SOLR-17541) LBSolrClient implementations should agree on 'getClient()' semantics
Jason Gerlowski created SOLR-17541: -- Summary: LBSolrClient implementations should agree on 'getClient()' semantics Key: SOLR-17541 URL: https://issues.apache.org/jira/browse/SOLR-17541 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: SolrJ Affects Versions: 9.7 Reporter: Jason Gerlowski LBSolrClient has an abstract "getClient(String url)" method that is used to fetch a "Http" SolrClient appropriate for the specified URL. But implementations of this method differ in the client that is returned. LBHttpSolrClient returns a client that is already pointed at the specified URL and can be used without modification. But LBHttp2SolrClient returns a client with no URL altogether, that must be pointed at the right endpoint prior to use. This is a bit messy, and complicates the calling code in LBSolrClient quite a bit. We should choose one of these approaches and use it for all LBSolrClient implementations. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[PR] Backport: Update org.hamcrest:* to v3 (major) (#2617) [solr]
malliaridis opened a new pull request, #2845: URL: https://github.com/apache/solr/pull/2845 * Update org.hamcrest:* to v3 * Update hamcrest license file - Co-authored-by: Christos Malliaridis (cherry picked from commit f216c984d348c12cc4c4c24e24ee6bf014cc9b01) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17535 ClusterState.collectionStream() in lieu of getCollectionStates() [solr]
murblanc commented on code in PR #2834: URL: https://github.com/apache/solr/pull/2834#discussion_r1827322144 ## solr/solrj/src/java/org/apache/solr/common/cloud/ClusterState.java: ## @@ -400,29 +402,14 @@ public Set getHostAllowList() { return hostAllowList; } - /** - * Iterate over collections. Unlike {@link #getCollectionStates()} collections passed to the - * consumer are guaranteed to exist. - * - * @param consumer collection consumer. - */ + /** Streams the resolved DocCollections. Use this sparingly in case there are many collections. */ + public Stream collectionStream() { +return collectionStates.values().stream().map(CollectionRef::get).filter(Objects::nonNull); + } + + /** Streams the resolved DocCollections. Use this sparingly in case there are many collections. */ Review Comment: Seems that Javadoc comment was copied from above method and needs updating -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
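The map-then-filter pattern in the diff above can be tried in isolation. The sketch below is only an approximation under stated assumptions: plain Supplier<String> values stand in for lazily resolved collection references, and ResolvedStreamSketch and resolvedStream are hypothetical names, not SolrJ API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; Supplier<String> stands in for a lazily resolved reference.
final class ResolvedStreamSketch {

  /** Streams resolved values, skipping references that resolve to null. */
  static Stream<String> resolvedStream(Map<String, Supplier<String>> refs) {
    return refs.values().stream().map(Supplier::get).filter(Objects::nonNull);
  }

  public static void main(String[] argv) {
    Map<String, Supplier<String>> refs = new LinkedHashMap<>();
    refs.put("a", () -> "A");
    refs.put("gone", () -> null); // a reference whose target no longer exists
    refs.put("b", () -> "B");
    List<String> resolved = resolvedStream(refs).collect(Collectors.toList());
    System.out.println(resolved);
  }
}
```

Because every reference is resolved as the stream is consumed, the cost grows with the number of entries, which is presumably why the Javadoc in the diff advises using the method sparingly.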
[jira] [Assigned] (SOLR-17256) Remove SolrRequest.getBasePath setBasePath
[ https://issues.apache.org/jira/browse/SOLR-17256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gerlowski reassigned SOLR-17256: -- Assignee: Jason Gerlowski > Remove SolrRequest.getBasePath setBasePath > -- > > Key: SOLR-17256 > URL: https://issues.apache.org/jira/browse/SOLR-17256 > Project: Solr > Issue Type: Improvement > Components: SolrJ >Reporter: David Smiley >Assignee: Jason Gerlowski >Priority: Minor > Labels: newdev, pull-request-available > Time Spent: 4h 40m > Remaining Estimate: 0h > > SolrRequest has a getBasePath & setBasePath. The naming is poor; it's the > URL base to the Solr node like "http://localhost:8983/solr". It's only > recognized by HttpSolrClient; LBSolrClient (used by CloudSolrClient) ignores > it and will in fact mutate the passed in request to its liking, which is > rather ugly because it means a request cannot be used concurrently if the > user wants to. But moreover I think there's a conceptual discordance of > placing this concept on SolrRequest given that some clients want to route > requests to nodes *they* choose. I propose removing this from SolrRequest > and instead adding a method specific to HttpSolrClient. Almost all existing > usages of setBasePath immediately execute the request on an HttpSolrClient, > so should be easy to change. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17256: Remove usages of deprecated SolrRequest.setBasePath [solr]
gerlowskija commented on PR #2811: URL: https://github.com/apache/solr/pull/2811#issuecomment-2455573937 Tests and 'check' all look good; will aim to merge tomorrow pending any last comments! Thanks @dsmiley for all the review! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Comment Edited] (SOLR-15929) Clean up NOTICE and LICENSE files for Solr
[ https://issues.apache.org/jira/browse/SOLR-15929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895429#comment-17895429 ] Jan Høydahl edited comment on SOLR-15929 at 11/4/24 9:22 PM: - I think we may need a spreadsheet with a line per Java dependency and also the adminui ones, with columns for license-type, whether it has notice, whether it is a test only dep, whether the licenses/txt files are correct, whether it is part of slim distro etc. We should also iterate all (C) notices in current LICENSE/NOTICE files to find which are no longer relevant and which must be retained/updated - may be some forked piece of source code not mentioned in version catalog. Then we can scrap current LICENSE.txt and NOTICE.txt and make a script that generates these in three variants during tarball build: - Binary, full - Binary, slim - Src - only for src code we ship, not for binary maven deps was (Author: janhoy): I think we may need a spreadsheet with a line per Java dependency and also the adminui ones, with columns for license-type, whether it has notice, whether it is a test only dep, whether the licenses/txt files are correct, whether it is part of slim distro etc. 
Then we can scrap current LICENSE.txt and NOTICE.txt and make a script that generates these in three variants during tarball build: - Binary, full - Binary, slim - Src (superset including test deps) > Clean up NOTICE and LICENSE files for Solr > -- > > Key: SOLR-15929 > URL: https://issues.apache.org/jira/browse/SOLR-15929 > Project: Solr > Issue Type: Improvement >Reporter: Jan Høydahl >Priority: Major > > Spinoff from SOLR-15862 and SOLR-2406: > We need a total cleanup of both these files > * Move lots of (C) notices from NOTICE to LICENSE file > * Cross-check that we list all dependencies, and that removed deps (such as > for DIH etc) are removed from NOTICE/LICENSE > I wonder if > [https://github.com/apache/solr/blob/main/solr/licenses/README.committers.txt] > should also be relocated to either `dev-docs/` or `help/` to make it easier > to find. It is hard to get the license/notice stuff right, so we need a good > guide for committers! > See [https://infra.apache.org/licensing-howto.html] for the requirements. PS: > Any preference whether we should rename the files without {{.txt}} suffix? > Also, our source and binary distributions are quite different, and would > ideally have different LICENSE and NOTICE files compared to the binary > distro. I think the Apache Whisker tool could potentially help with this > [https://creadur.apache.org/whisker/index.html] but have not looked deeply. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Comment Edited] (SOLR-15929) Clean up NOTICE and LICENSE files for Solr
[ https://issues.apache.org/jira/browse/SOLR-15929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895429#comment-17895429 ] Jan Høydahl edited comment on SOLR-15929 at 11/4/24 9:27 PM: - I think we may need a spreadsheet with a line per Java dependency and also the adminui ones, with columns for license-type, whether it has notice, whether it is a test only dep, whether the licenses/txt files are correct, whether it is part of slim distro etc. We should also iterate all (C) notices in current LICENSE/NOTICE files to find which are no longer relevant and which must be retained/updated - may be some forked piece of source code not mentioned in version catalog. Then we can scrap current LICENSE.txt and NOTICE.txt and make a script that generates these in three variants during tarball build: - Binary, full - Binary, slim - Src - only for src code we ship, not for binary maven deps I wonder if we could get help from SBOM tooling here? Would be cool to also publish a structured SBOM file with our tarballs. was (Author: janhoy): I think we may need a spreadsheet with a line per Java dependency and also the adminui ones, with columns for license-type, whether it has notice, whether it is a test only dep, whether the licenses/txt files are correct, whether it is part of slim distro etc. We should also iterate all (C) notices in current LICENSE/NOTICE files to find which are no longer relevant and which must be retained/updated - may be some forked piece of source code not mentioned in version catalog. 
Then we can scrap current LICENSE.txt and NOTICE.txt and make a script that generates these in three variants during tarball build: - Binary, full - Binary, slim - Src - only for src code we ship, not for binary maven deps > Clean up NOTICE and LICENSE files for Solr > -- > > Key: SOLR-15929 > URL: https://issues.apache.org/jira/browse/SOLR-15929 > Project: Solr > Issue Type: Improvement >Reporter: Jan Høydahl >Priority: Major > > Spinoff from SOLR-15862 and SOLR-2406: > We need a total cleanup of both these files > * Move lots of (C) notices from NOTICE to LICENSE file > * Cross-check that we list all dependencies, and that removed deps (such as > for DIH etc) are removed from NOTICE/LICENSE > I wonder if > [https://github.com/apache/solr/blob/main/solr/licenses/README.committers.txt] > should also be relocated to either `dev-docs/` or `help/` to make it easier > to find. It is hard to get the license/notice stuff right, so we need a good > guide for committers! > See [https://infra.apache.org/licensing-howto.html] for the requirements. PS: > Any preference whether we should rename the files without {{.txt}} suffix? > Also, our source and binary distributions are quite different, and would > ideally have different LICENSE and NOTICE files compared to the binary > distro. I think the Apache Whisker tool could potentially help with this > [https://creadur.apache.org/whisker/index.html] but have not looked deeply. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17456: TransactionLog ctor integrity [solr]
dsmiley merged PR #2762: URL: https://github.com/apache/solr/pull/2762 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17390: EmbeddedSolrServer now considers the ResponseParser [solr]
dsmiley merged PR #2774: URL: https://github.com/apache/solr/pull/2774 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17456) TransactionLog NPE
[ https://issues.apache.org/jira/browse/SOLR-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895443#comment-17895443 ] ASF subversion and git services commented on SOLR-17456: Commit e849fd0540fb4f0e013a1f73e93c3e85a933ed83 in solr's branch refs/heads/main from David Smiley [ https://gitbox.apache.org/repos/asf?p=solr.git;h=e849fd0540f ] SOLR-17456: TransactionLog ctor integrity (#2762) The TransactionLog constructor can't handle an existing file being present; it shouldn't be there. Should throw an exception in this case, NOT log a warning which would leave the object in a partially constructed state. This should happen in the first place, of course. I see no evidence it has occurred. > TransactionLog NPE > -- > > Key: SOLR-17456 > URL: https://issues.apache.org/jira/browse/SOLR-17456 > Project: Solr > Issue Type: Bug >Reporter: David Smiley >Priority: Minor > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > In an erroneous case, a TransactionLog should throw an exception if an > unexpected log file exists instead of merely log a warning in its > constructor. The latter leaves the file in a partially constructed state > that leads to NPEs when it's used later. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
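The commit message above argues for throwing from the constructor rather than logging a warning and leaving a half-built object behind. A minimal sketch of that fail-fast idea follows. `FailFastLog` is a hypothetical stand-in, not Solr's `TransactionLog`, and using `StandardOpenOption.CREATE_NEW` is an assumption about one way to get the behavior, not a description of what the commit does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical log object used only to illustrate constructor integrity. */
final class FailFastLog implements AutoCloseable {
  private final FileChannel channel;

  FailFastLog(Path file) throws IOException {
    // CREATE_NEW makes the open fail with FileAlreadyExistsException when the
    // file is already present, so no partially constructed object can escape.
    this.channel =
        FileChannel.open(file, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
  }

  @Override
  public void close() throws IOException {
    channel.close();
  }

  /** Returns true if constructing a second log over the same file is rejected. */
  static boolean secondOpenIsRejected() {
    try {
      Path dir = Files.createTempDirectory("tlog-sketch");
      Path file = dir.resolve("tlog.0000000000000000001");
      try (FailFastLog first = new FailFastLog(file)) {
        try (FailFastLog second = new FailFastLog(file)) {
          return false; // the existing file was silently accepted
        } catch (FileAlreadyExistsException expected) {
          return true; // the constructor refused to proceed
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

With this shape, a caller either gets a fully usable object or an exception at construction time, instead of an NPE later on.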
[jira] [Commented] (SOLR-17487) Can't POST a dense vector that contains two or more occurences of the same float value
[ https://issues.apache.org/jira/browse/SOLR-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895390#comment-17895390 ] David Smiley commented on SOLR-17487: - Solr's [default solrconfig.xml|https://github.com/apache/solr/blob/branch_9x/solr/server/solr/configsets/_default/conf/solrconfig.xml] does *not* do this. > Can't POST a dense vector that contains two or more occurences of the same > float value > -- > > Key: SOLR-17487 > URL: https://issues.apache.org/jira/browse/SOLR-17487 > Project: Solr > Issue Type: Bug > Components: UpdateRequestProcessors >Affects Versions: 9.7, 9.6.1 >Reporter: Guillaume Jactat >Priority: Major > Attachments: image-2024-10-10-18-05-01-195.png, > image-2024-10-10-18-07-14-904.png, image-2024-10-10-18-07-19-370.png, > image-2024-10-10-23-27-26-566.png, vector-384.json, vector-384.xml, > vector-768.json > > > *EDIT 10/10/2024* : > After a detailed analysis of the problematic vectors, I found that the > “missing” dimensions were actually dimensions of the same value. > In concrete terms, the values present several times in the posted vectors are > deduplicated by Solr. > You can see for yourself that the vectors supplied as attachments have the > common characteristic of containing {*}two or more occurences of the very > same float value{*}. The embedding model I use (all-minilm:33m) seems to > generate many such cases. > It seems that {*}Solr only takes into account the first occurrence of these > values{*}. As a result, the length of the final vector is no longer correct. > The following screenshot show exactly what happens. With a smaller vector > field type of size 5. We can see that the vector [1, 5, 3, 4, 5] becomes [1, > 5, 3, 4]. > !image-2024-10-10-23-27-26-566.png! > > - > Hello, > > I'm using Solr 9.7 as a vector database. I've come across something I can't > explain : I POST my documents as JSON and I've got a vector field of > dimension {*}768{*}. 
> > The JSON document I POST has a vector field, which is an array of length 768. > Each value is a float. > > Solr complains that my array is only *767* long... > I've compared the JSON I POST and the array parsed by Solr and written in the > logs. And indeed, one of the 768 values has simply disappeared in the > process. > > The problem can easily be reproduced. All you have to do is : > * In your "schema.xml", declare the following dense vector field type : > {code:java} > vectorDimension="768" similarityFunction="cosine"/>{code} > * In your schema.xml, declare the following dense vector dynamic field : > {code:java} > stored="true"/>{code} > * Use the Solr Admin UI to post the *attached document* to your Solr core. > * You should get the following error : "{*}incorrect vector dimension. The > vector value has size 767 while it is expected a vector with size 768"{*} > > * Furthermore, while the POSTed vector has size 768, the vector written in > the logs is only 767... One value is missing. You can easily spot the missing > value with a simple diff. > Maybe someone will find the reason why this specific vector leads to this > issue. Of course, I have plenty of other documents that get indexed without > any issue. > In case it helps, the value that disappears from the 768 vector is > "0.0335415453". It's the 384th dimension (starting from 1) > !image-2024-10-10-18-07-19-370.png! > Thanks for reading
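The symptom in the report ([1, 5, 3, 4, 5] becoming [1, 5, 3, 4]) is what you get from any step that keeps only the first occurrence of each value. The sketch below only reproduces that effect in isolation. `VectorDedupSketch` is a made-up name, and nothing here is taken from Solr's code or claims where in the update chain the deduplication happens.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

/** Illustration only: shows how first-occurrence deduplication shortens a vector. */
final class VectorDedupSketch {

  /** Keeps the first occurrence of each value and preserves the original order. */
  static List<Float> firstOccurrencesOnly(List<Float> vector) {
    // LinkedHashSet drops repeated values but remembers insertion order.
    return new ArrayList<>(new LinkedHashSet<>(vector));
  }

  public static void main(String[] args) {
    List<Float> posted = List.of(1f, 5f, 3f, 4f, 5f);
    List<Float> seen = firstOccurrencesOnly(posted);
    // The declared dimension is 5, but only 4 values survive.
    System.out.println(posted.size() + " -> " + seen.size() + " " + seen);
  }
}
```

A dense vector is positional, so a repeated float is a legitimate value in its own dimension; treating the field values as a set changes the length and would trigger exactly the "incorrect vector dimension" error quoted above.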
Re: [PR] Prevent conflicting connection flags from being used via OptionGroup [solr]
malliaridis commented on code in PR #2725:
URL: https://github.com/apache/solr/pull/2725#discussion_r1828214894

## solr/core/src/java/org/apache/solr/cli/ApiTool.java:
@@ -36,6 +36,16 @@
  * Used to send an arbitrary HTTP request to a Solr API endpoint.
  */
 public class ApiTool extends ToolBase {
+
+  private static final Option SOLR_URL_OPTION =
+      Option.builder()

Review Comment:
We should probably add the `-s` option here.
Re: [PR] Prevent conflicting connection flags from being used via OptionGroup [solr]
epugh commented on PR #2725: URL: https://github.com/apache/solr/pull/2725#issuecomment-2455461246 I go back and forth on making everything a Java object. `cli.getOption("my-option")` to me reads better than `cli.getOption(MY_OPTION)`. Having said that, I think the enhanced IDE integration is the way to go...! Glad to see the tests get fixed! I will review in the AM and merge. Is this for branch_9x as well???
[jira] [Commented] (SOLR-15929) Clean up NOTICE and LICENSE files for Solr
[ https://issues.apache.org/jira/browse/SOLR-15929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895417#comment-17895417 ] Christos Malliaridis commented on SOLR-15929: - What is the correct way for "Move lots of (C) notices from NOTICE to LICENSE file"? Just cut out the text block and paste it at the end of the license file? > Clean up NOTICE and LICENSE files for Solr > -- > > Key: SOLR-15929 > URL: https://issues.apache.org/jira/browse/SOLR-15929 > Project: Solr > Issue Type: Improvement >Reporter: Jan Høydahl >Priority: Major > > Spinoff from SOLR-15862 and SOLR-2406: > We need a total cleanup of both these files > * Move lots of (C) notices from NOTICE to LICENSE file > * Cross-check that we list all dependencies, and that removed deps (such as > for DIH etc) are removed from NOTICE/LICENSE > I wonder if > [https://github.com/apache/solr/blob/main/solr/licenses/README.committers.txt] > should also be relocated to either `dev-docs/` or `help/` to make it easier > to find. It is hard to get the license/notice stuff right, so we need a good > guide for committers! > See [https://infra.apache.org/licensing-howto.html] for the requirements. PS: > Any preference whether we should rename the files without {{.txt}} suffix? > Also, our source and binary distributions are quite different, and would > ideally have different LICENSE and NOTICE files compared to the binary > distro. I think the Apache Whisker tool could potentially help with this > [https://creadur.apache.org/whisker/index.html] but have not looked deeply.
[jira] [Closed] (SOLR-17487) Can't POST a dense vector that contains two or more occurences of the same float value
[ https://issues.apache.org/jira/browse/SOLR-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guillaume Jactat closed SOLR-17487. --- > Can't POST a dense vector that contains two or more occurences of the same > float value > -- > > Key: SOLR-17487 > URL: https://issues.apache.org/jira/browse/SOLR-17487 > Project: Solr > Issue Type: Bug > Components: UpdateRequestProcessors >Affects Versions: 9.7, 9.6.1 >Reporter: Guillaume Jactat >Priority: Major > Attachments: image-2024-10-10-18-05-01-195.png, > image-2024-10-10-18-07-14-904.png, image-2024-10-10-18-07-19-370.png, > image-2024-10-10-23-27-26-566.png, vector-384.json, vector-384.xml, > vector-768.json > > > *EDIT 10/10/2024* : > After a detailed analysis of the problematic vectors, I found that the > “missing” dimensions were actually dimensions of the same value. > In concrete terms, the values present several times in the posted vectors are > deduplicated by Solr. > You can see for yourself that the vectors supplied as attachments have the > common characteristic of containing {*}two or more occurences of the very > same float value{*}. The embedding model I use (all-minilm:33m) seems to > generate many such cases. > It seems that {*}Solr only takes into account the first occurrence of these > values{*}. As a result, the length of the final vector is no longer correct. > The following screenshot show exactly what happens. With a smaller vector > field type of size 5. We can see that the vector [1, 5, 3, 4, 5] becomes [1, > 5, 3, 4]. > !image-2024-10-10-23-27-26-566.png! > > - > Hello, > > I'm using Solr 9.7 as a vector database. I've come across something I can't > explain : I POST my documents as JSON and I've got a vector field of > dimension {*}768{*}. > > The JSON document I POST has a vector field, which is an array of length 768. > Each value is a float. > > Solr complains that my array is only *767* long... 
> I've compared the JSON I POST and the array parsed by Solr and written in the > logs. And indeed, one of the 768 values has simply disappeared in the > process. > > The problem can easily be reproduced. All you have to do is : > * In your "schema.xml", declare the following dense vector field type : > {code:java} > vectorDimension="768" similarityFunction="cosine"/>{code} > * In your schema.xml, declare the following dense vector dynamic field : > {code:java} > stored="true"/>{code} > * Use the Solr Admin UI to post the *attached document* to your Solr core. > * You should get the following error : "{*}incorrect vector dimension. The > vector value has size 767 while it is expected a vector with size 768"{*} > > * Furthermore, while the POSTed vector has size 768, the vector written in > the logs is only 767... One value is missing. You can easily spot the missing > value with a simple diff. > Maybe someone will find the reason why this specific vector leads to this > issue. Of course, I have plenty of other documents that get indexed without > any issue. > In case it helps, the value that disappears from the 768 vector is > "0.0335415453". It's the 384th dimension (starting from 1) > !image-2024-10-10-18-07-19-370.png! > Thanks for reading
Re: [PR] SOLR-17453: Leverage waitForState() instead of busy waiting [solr]
dsmiley commented on PR #2737: URL: https://github.com/apache/solr/pull/2737#issuecomment-2455879410 Probably just a one-liner left and I'll merge away :-) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17456) TransactionLog NPE
[ https://issues.apache.org/jira/browse/SOLR-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895451#comment-17895451 ] ASF subversion and git services commented on SOLR-17456: Commit 919bc994cc3618638d678e03d8a96f244786faad in solr's branch refs/heads/branch_9x from David Smiley [ https://gitbox.apache.org/repos/asf?p=solr.git;h=919bc994cc3 ] SOLR-17456: TransactionLog ctor integrity (#2762) The TransactionLog constructor can't handle an existing file being present; it shouldn't be there. Should throw an exception in this case, NOT log a warning which would leave the object in a partially constructed state. This should happen in the first place, of course. I see no evidence it has occurred. (cherry picked from commit e849fd0540fb4f0e013a1f73e93c3e85a933ed83) > TransactionLog NPE > -- > > Key: SOLR-17456 > URL: https://issues.apache.org/jira/browse/SOLR-17456 > Project: Solr > Issue Type: Bug >Reporter: David Smiley >Priority: Minor > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > In an erroneous case, a TransactionLog should throw an exception if an > unexpected log file exists instead of merely log a warning in its > constructor. The latter leaves the file in a partially constructed state > that leads to NPEs when it's used later. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17390) EmbeddedSolrServer should support a ResponseParser
[ https://issues.apache.org/jira/browse/SOLR-17390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895446#comment-17895446 ] ASF subversion and git services commented on SOLR-17390: Commit c5c538a9e025bda77ad591ee82beaa6a6732c408 in solr's branch refs/heads/main from David Smiley [ https://gitbox.apache.org/repos/asf?p=solr.git;h=c5c538a9e02 ] SOLR-17390: EmbeddedSolrServer now considers the ResponseParser (#2774) And * Moved HttpSolrCall.getResponseWriter to SolrQueryRequest * Subtle improvements to make ContentStream work when they might not have > EmbeddedSolrServer should support a ResponseParser > -- > > Key: SOLR-17390 > URL: https://issues.apache.org/jira/browse/SOLR-17390 > Project: Solr > Issue Type: Improvement >Reporter: David Smiley >Assignee: David Smiley >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > By default, a SolrRequest has a null/unspecified ResponseParser; it's handled > automatically within SolrJ. But an explicit one communicates an intent for > the client code to need it, like JsonMapResponseParser, > InputStreamResponseParser, or NoOpResponseParser (particularly those 3). > EmbeddedSolrServer doesn't look at this; the NamedList right out of the > core/handler is normalized (via javabin round-trip) and returned. While that > makes sense _normally_, a ResponseParser should also be supported. This > enables tests that might want to use EmbeddedSolrServer but that which need > to test JSON or XML (for convenience of xpath/json expressions, for example). > Also, the newer V2 API generated clients would need this to support > EmbeddedSolrServer as they are currently based off of > InputStreamResponseParser. > Doing this means determining the correct ResponseWriter (not assuming JavaBin > during normalization). 
Re: [PR] SOLR-17453: Leverage waitForState() instead of busy waiting [solr]
dsmiley commented on code in PR #2737:
URL: https://github.com/apache/solr/pull/2737#discussion_r1828513954

## solr/test-framework/src/java/org/apache/solr/cloud/AbstractDistribZkTestBase.java:
@@ -242,45 +240,15 @@ public static void waitForCollectionToDisappear(
     log.info("Collection has disappeared - collection:{}", collection);
   }

-  static void waitForNewLeader(
-      CloudSolrClient cloudClient, String shardName, Replica oldLeader, TimeOut timeOut)
+  static void waitForNewLeader(CloudSolrClient cloudClient, String shardName, Replica oldLeader)
       throws Exception {
-    log.info("Will wait for a node to become leader for {} secs", timeOut.timeLeft(SECONDS));
+    log.info("Will wait for a node to become leader for 15 secs");
     ZkStateReader zkStateReader = ZkStateReader.from(cloudClient);
-    zkStateReader.forceUpdateCollection(DEFAULT_COLLECTION);
-
-    for (; ; ) {
-      ClusterState clusterState = zkStateReader.getClusterState();
-      DocCollection coll = clusterState.getCollection("collection1");
-      Slice slice = coll.getSlice(shardName);
-      if (slice.getLeader() != null
-          && !slice.getLeader().equals(oldLeader)
-          && slice.getLeader().getState() == Replica.State.ACTIVE) {
-        if (log.isInfoEnabled()) {
-          log.info(
-              "Old leader {}, new leader {}. New leader got elected in {} ms",
-              oldLeader,
-              slice.getLeader(),
-              timeOut.timeElapsed(MILLISECONDS));
-        }
-        break;
-      }
-
-      if (timeOut.hasTimedOut()) {

Review Comment:
I like your change except for one small thing: You propagate the exception (e.g. TimeoutException) but previously the test code here would explicitly fail. I think we should explicitly fail.
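The review comment above asks for an explicit test failure instead of a propagated `TimeoutException`. One way to read that request is sketched below. `WaitSketch`, `waitFor` and `assertEventually` are hypothetical names; `waitFor` is a small polling stand-in so the example is self-contained, and is not `ZkStateReader.waitForState()`, whose exact signature is not shown in this thread.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Sketch: turn a timeout from a wait helper into an explicit test failure. */
final class WaitSketch {

  /** Hypothetical stand-in for a wait-for-state call that throws on timeout. */
  static void waitFor(BooleanSupplier condition, long timeout, TimeUnit unit)
      throws TimeoutException, InterruptedException {
    long deadline = System.nanoTime() + unit.toNanos(timeout);
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline > 0) {
        throw new TimeoutException("condition not met within " + timeout + " " + unit);
      }
      Thread.sleep(10);
    }
  }

  /** Test-side wrapper: a timeout becomes an AssertionError with a readable message. */
  static void assertEventually(
      String message, BooleanSupplier condition, long timeout, TimeUnit unit) {
    try {
      waitFor(condition, timeout, unit);
    } catch (TimeoutException e) {
      // Fail explicitly and keep the original exception as the cause.
      throw new AssertionError(message, e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new AssertionError("Interrupted while waiting: " + message, e);
    }
  }
}
```

A test would then call something like `assertEventually("Could not find new leader", () -> leaderChanged(), 15, TimeUnit.SECONDS)`, where `leaderChanged()` is whatever predicate the test needs. The failure shows up as a test assertion rather than as an unexpected exception.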
[jira] [Resolved] (SOLR-17390) EmbeddedSolrServer should support a ResponseParser
[ https://issues.apache.org/jira/browse/SOLR-17390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-17390. - Fix Version/s: 9.8 Resolution: Fixed > EmbeddedSolrServer should support a ResponseParser > -- > > Key: SOLR-17390 > URL: https://issues.apache.org/jira/browse/SOLR-17390 > Project: Solr > Issue Type: Improvement >Reporter: David Smiley >Assignee: David Smiley >Priority: Major > Labels: pull-request-available > Fix For: 9.8 > > Time Spent: 1h > Remaining Estimate: 0h > > By default, a SolrRequest has a null/unspecified ResponseParser; it's handled > automatically within SolrJ. But an explicit one communicates an intent for > the client code to need it, like JsonMapResponseParser, > InputStreamResponseParser, or NoOpResponseParser (particularly those 3). > EmbeddedSolrServer doesn't look at this; the NamedList right out of the > core/handler is normalized (via javabin round-trip) and returned. While that > makes sense _normally_, a ResponseParser should also be supported. This > enables tests that might want to use EmbeddedSolrServer but that which need > to test JSON or XML (for convenience of xpath/json expressions, for example). > Also, the newer V2 API generated clients would need this to support > EmbeddedSolrServer as they are currently based off of > InputStreamResponseParser. > Doing this means determining the correct ResponseWriter (not assuming JavaBin > during normalization). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Resolved] (SOLR-17456) TransactionLog NPE
[ https://issues.apache.org/jira/browse/SOLR-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-17456. - Fix Version/s: 9.8 Assignee: David Smiley Resolution: Fixed > TransactionLog NPE > -- > > Key: SOLR-17456 > URL: https://issues.apache.org/jira/browse/SOLR-17456 > Project: Solr > Issue Type: Bug >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > Labels: pull-request-available > Fix For: 9.8 > > Time Spent: 40m > Remaining Estimate: 0h > > In an erroneous case, a TransactionLog should throw an exception if an > unexpected log file exists instead of merely log a warning in its > constructor. The latter leaves the file in a partially constructed state > that leads to NPEs when it's used later. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17390) EmbeddedSolrServer should support a ResponseParser
[ https://issues.apache.org/jira/browse/SOLR-17390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895452#comment-17895452 ] ASF subversion and git services commented on SOLR-17390: Commit 337c3d6ccb2e6157f79f3488ef0deb7b4a852734 in solr's branch refs/heads/branch_9x from David Smiley [ https://gitbox.apache.org/repos/asf?p=solr.git;h=337c3d6ccb2 ] SOLR-17390: EmbeddedSolrServer now considers the ResponseParser (#2774) And * Moved HttpSolrCall.getResponseWriter to SolrQueryRequest * Subtle improvements to make ContentStream work when they might not have (cherry picked from commit c5c538a9e025bda77ad591ee82beaa6a6732c408) > EmbeddedSolrServer should support a ResponseParser > -- > > Key: SOLR-17390 > URL: https://issues.apache.org/jira/browse/SOLR-17390 > Project: Solr > Issue Type: Improvement >Reporter: David Smiley >Assignee: David Smiley >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > By default, a SolrRequest has a null/unspecified ResponseParser; it's handled > automatically within SolrJ. But an explicit one communicates an intent for > the client code to need it, like JsonMapResponseParser, > InputStreamResponseParser, or NoOpResponseParser (particularly those 3). > EmbeddedSolrServer doesn't look at this; the NamedList right out of the > core/handler is normalized (via javabin round-trip) and returned. While that > makes sense _normally_, a ResponseParser should also be supported. This > enables tests that might want to use EmbeddedSolrServer but that which need > to test JSON or XML (for convenience of xpath/json expressions, for example). > Also, the newer V2 API generated clients would need this to support > EmbeddedSolrServer as they are currently based off of > InputStreamResponseParser. > Doing this means determining the correct ResponseWriter (not assuming JavaBin > during normalization). 
Re: [PR] Prevent conflicting connection flags from being used via OptionGroup [solr]
malliaridis commented on PR #2725:
URL: https://github.com/apache/solr/pull/2725#issuecomment-2455611441

> I go back and forth on making everything a Java object. cli.getOption("my-option") to me reads better than cli.getOption(MY_OPTION)

I agree on that, but I also feel that having long and short variants of options makes `cli.getOption("my-option")` kinda confusing. Working with strings has also proven to be error-prone: forgetting to update a reference is easier with strings than with object references. This is probably the most important reason to go for objects.

> Is this for branch_9x as well???

No, 9x would require all the deprecated options as well, resulting in `OptionGroup`s. If we make a completely different PR we could introduce similar changes, but backporting won't do here without breaking backwards compatibility.
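To make the strings-versus-objects argument concrete, here is a small stand-alone sketch. It deliberately does not use Commons CLI; `Opt`, `OptionSketch`, and the `-s`/`--solr-url` and `-z`/`--zk-host` spellings are illustrative assumptions. It shows two things: a lookup by object works no matter which spelling the user typed, and a group of mutually exclusive options can reject conflicting flags in one place.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustration only: object-keyed options with a mutually exclusive group. */
final class OptionSketch {

  /** Hypothetical option with a short and a long spelling. */
  static final class Opt {
    final String shortName;
    final String longName;

    Opt(String shortName, String longName) {
      this.shortName = shortName;
      this.longName = longName;
    }

    boolean matches(String arg) {
      return arg.equals("-" + shortName) || arg.equals("--" + longName);
    }
  }

  static final Opt SOLR_URL = new Opt("s", "solr-url");
  static final Opt ZK_HOST = new Opt("z", "zk-host");

  /** At most one option from this group may be present. */
  static final List<Opt> CONNECTION_GROUP = List.of(SOLR_URL, ZK_HOST);

  /** Parses "flag value" pairs; flags outside the group are ignored in this sketch. */
  static Map<Opt, String> parse(String... args) {
    Map<Opt, String> values = new HashMap<>();
    for (int i = 0; i + 1 < args.length; i += 2) {
      for (Opt opt : CONNECTION_GROUP) {
        if (opt.matches(args[i])) {
          if (!values.isEmpty() && !values.containsKey(opt)) {
            throw new IllegalArgumentException(
                "--" + opt.longName + " conflicts with an option that was already given");
          }
          values.put(opt, args[i + 1]);
        }
      }
    }
    return values;
  }
}
```

With this shape, `parse("-s", url).get(OptionSketch.SOLR_URL)` and `parse("--solr-url", url).get(OptionSketch.SOLR_URL)` read the same value, and renaming a flag only touches the constant's definition.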
[jira] [Comment Edited] (SOLR-15929) Clean up NOTICE and LICENSE files for Solr
[ https://issues.apache.org/jira/browse/SOLR-15929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895429#comment-17895429 ] Jan Høydahl edited comment on SOLR-15929 at 11/4/24 9:05 PM: - I think we may need a spreadsheet with a line per Java dependency and also the adminui ones, with columns for license-type, whether it has notice, whether it is a test only dep, whether the licenses/txt files are correct, whether it is part of slim distro etc. Then we can scrap current LICENSE.txt and NOTICE.txt and make a script that generates these in three variants during tarball build: - Binary, full - Binary, slim - Src (superset including test deps) was (Author: janhoy): I think we may need a spreadsheet with a line per Java dependency and also the adminui ones, with columns for license-type, whether it has notice, whether it is a test only dep, whether the licenses/txt files are correct, whether it is part of slim distro etc. Then we can scrap current LICENSE.txt and NOTICE.txt and make a script that generates these in three variants during tarball build: - Binary, full - Binary, slim - Src > Clean up NOTICE and LICENSE files for Solr > -- > > Key: SOLR-15929 > URL: https://issues.apache.org/jira/browse/SOLR-15929 > Project: Solr > Issue Type: Improvement >Reporter: Jan Høydahl >Priority: Major > > Spinoff from SOLR-15862 and SOLR-2406: > We need a total cleanup of both these files > * Move lots of (C) notices from NOTICE to LICENSE file > * Cross-check that we list all dependencies, and that removed deps (such as > for DIH etc) are removed from NOTICE/LICENSE > I wonder if > [https://github.com/apache/solr/blob/main/solr/licenses/README.committers.txt] > should also be relocated to either `dev-docs/` or `help/` to make it easier > to find. It is hard to get the license/notice stuff right, so we need a good > guide for committers! > See [https://infra.apache.org/licensing-howto.html] for the requirements. 
PS: > Any preference whether we should rename the files without {{.txt}} suffix? > Also, our source and binary distributions are quite different, and would > ideally have different LICENSE and NOTICE files compared to the binary > distro. I think the Apache Whisker tool could potentially help with this > [https://creadur.apache.org/whisker/index.html] but have not looked deeply.
[jira] [Commented] (SOLR-15929) Clean up NOTICE and LICENSE files for Solr
[ https://issues.apache.org/jira/browse/SOLR-15929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895429#comment-17895429 ] Jan Høydahl commented on SOLR-15929: I think we may need a spreadsheet with a line per Java dependency and also the adminui ones, with columns for license-type, whether it has notice, whether it is a test only dep, whether the licenses/txt files are correct, whether it is part of slim distro etc. Then we can scrap current LICENSE.txt and NOTICE.txt and make a script that generates these in three variants during tarball build: - Binary, full - Binary, slim - Src > Clean up NOTICE and LICENSE files for Solr > -- > > Key: SOLR-15929 > URL: https://issues.apache.org/jira/browse/SOLR-15929 > Project: Solr > Issue Type: Improvement >Reporter: Jan Høydahl >Priority: Major > > Spinoff from SOLR-15862 and SOLR-2406: > We need a total cleanup of both these files > * Move lots of (C) notices from NOTICE to LICENSE file > * Cross-check that we list all dependencies, and that removed deps (such as > for DIH etc) are removed from NOTICE/LICENSE > I wonder if > [https://github.com/apache/solr/blob/main/solr/licenses/README.committers.txt] > should also be relocated to either `dev-docs/` or `help/` to make it easier > to find. It is hard to get the license/notice stuff right, so we need a good > guide for committers! > See [https://infra.apache.org/licensing-howto.html] for the requirements. PS: > Any preference whether we should rename the files without {{.txt}} suffix? > Also, our source and binary distributions are quite different, and would > ideally have different LICENSE and NOTICE files compared to the binary > distro. I think the Apache Whisker tool could potentially help with this > [https://creadur.apache.org/whisker/index.html] but have not looked deeply.
[jira] [Commented] (SOLR-7871) Platform independent config file instead of solr.in.sh and solr.in.cmd
[ https://issues.apache.org/jira/browse/SOLR-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17895424#comment-17895424 ] Jan Høydahl commented on SOLR-7871: --- As long as the bike shed is yaml I’m all in 🤣🤣🤣 Not sure I see the need for config per environments but it would be nice to piggy back on the production mode to do more strict startup checks than in dev etc. Would the yaml file have SOLR_foo style keys and manipulate env or solr.foo style keys and set sysprops? > Platform independent config file instead of solr.in.sh and solr.in.cmd > -- > > Key: SOLR-7871 > URL: https://issues.apache.org/jira/browse/SOLR-7871 > Project: Solr > Issue Type: Improvement > Components: scripts and tools >Affects Versions: 5.2.1 >Reporter: Jan Høydahl >Priority: Major > Labels: bin/solr > Attachments: SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, > SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, > SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, > SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch, SOLR-7871.patch > > > Spinoff from SOLR-7043 > The config files {{solr.in.sh}} and {{solr.in.cmd}} are currently executable > batch files, but all they do is to set environment variables for the start > scripts on the format {{key=value}} > Suggest to instead have one central platform independent config file e.g. > {{bin/solr.yml}} or {{bin/solrstart.properties}} which is parsed by > {{SolrCLI.java}}. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17535 ClusterState.collectionStream() in lieu of getCollectionStates() [solr]
dsmiley commented on code in PR #2834:
URL: https://github.com/apache/solr/pull/2834#discussion_r1827774012

## solr/core/src/java/org/apache/solr/handler/designer/SchemaDesignerConfigSetHelper.java:
@@ -168,24 +167,12 @@ Map analyzeField(String configSet, String fieldName, String fiel
   }

   List listCollectionsForConfig(String configSet) {
-    final List collections = new ArrayList<>();
-    Map states =
-        zkStateReader().getClusterState().getCollectionStates();
-    for (Map.Entry e : states.entrySet()) {
-      final String coll = e.getKey();
-      if (coll.startsWith(DESIGNER_PREFIX)) {
-        continue; // ignore temp
-      }
-
-      try {
-        if (configSet.equals(e.getValue().get().getConfigName()) && e.getValue().get() != null) {
-          collections.add(coll);
-        }
-      } catch (Exception exc) {
-        log.warn("Failed to get config name for {}", coll, exc);
-      }
-    }
-    return collections;
+    return zkStateReader()
+        .getClusterState()
+        .collectionStream()

Review Comment:
You are saying this because of the collection name filter that was there. As I indicated in my self-PR review, I think that was bogus/erroneous.
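The quoted diff is cut off right after `.collectionStream()`, so the rest of the pipeline is not visible here. As a rough sketch of the general shape such a loop-to-stream rewrite tends to take, the example below filters by configset name and collects the collection names. `Coll` and `collectionsForConfig` are hypothetical stand-ins, not Solr's `DocCollection` or the PR's actual code.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustration only: replacing a map-iterating loop with a stream pipeline. */
final class CollectionStreamSketch {

  /** Hypothetical minimal collection state: a name and the configset it uses. */
  static final class Coll {
    final String name;
    final String configName;

    Coll(String name, String configName) {
      this.name = name;
      this.configName = configName;
    }
  }

  /** Returns the names of all collections that use the given configset. */
  static List<String> collectionsForConfig(Stream<Coll> collections, String configSet) {
    return collections
        .filter(c -> configSet.equals(c.configName)) // null-safe on configName
        .map(c -> c.name)
        .collect(Collectors.toList());
  }
}
```

Compared with the removed loop, there is no intermediate list to manage and no per-entry try/catch. Whether dropping the name-prefix filter is right is the point under discussion in the comment above, and this sketch takes no position on it.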
Re: [PR] SOLR-17256: Remove usages of deprecated SolrRequest.setBasePath [solr]
dsmiley commented on code in PR #2811:
URL: https://github.com/apache/solr/pull/2811#discussion_r1827794290

## solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBSolrClient.java: ##
@@ -480,14 +480,35 @@ private static boolean isTimeExceeded(long timeAllowedNano, long timeOutTime) {
     return timeAllowedNano > 0 && System.nanoTime() > timeOutTime;
   }

+  private NamedList doMakeRequest(Endpoint endpoint, SolrRequest solrRequest)
+      throws SolrServerException, IOException {
+    final var solrClient = getClient(endpoint);
+    return doMakeRequest(solrClient, endpoint.getBaseUrl(), endpoint.getCore(), solrRequest);
+  }
+
+  // TODO This special casing can be removed if either: (1) SOLR-16367 is completed, or (2)
+  // LBHttp2SolrClient.getClient() is modified to return a client already pointed at the correct URL
+  private NamedList doMakeRequest(
+      SolrClient solrClient, String baseUrl, String collection, SolrRequest solrRequest)
+      throws SolrServerException, IOException {
+    // Some implementations of LBSolrClient.getClient(...) return a Http2SolrClient that may not be
+    // pointed at the desired URL (or any URL for that matter). We special case that here to ensure
+    // the appropriate URL is provided.
+    if (solrClient instanceof Http2SolrClient) {
+      final var httpSolrClient = (Http2SolrClient) solrClient;
+      return httpSolrClient.requestWithBaseUrl(baseUrl, (c) -> c.request(solrRequest, collection));
+    }
+
+    return solrClient.request(solrRequest, collection);

Review Comment:
   I agree it's definitely worth its own ticket!
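The special casing in the quoted diff boils down to this: if the client may not be bound to the right URL, run the request through a call that supplies the URL for that one request; otherwise call the client directly. The following is a self-contained sketch of that dispatch with stand-in types. `Client`, `UrlAwareClient`, `FixedClient`, and the string-based request are all hypothetical simplifications for illustration, not SolrJ classes or signatures.

```java
import java.util.function.Function;

// Hypothetical stand-ins that mimic only the shape of the dispatch, not SolrJ's API.
class BaseUrlDispatchSketch {

  interface Client {
    String request(String req, String collection);
  }

  /** Stand-in for a client that may have no usable default URL. */
  static final class UrlAwareClient implements Client {
    private final String defaultUrl; // may be null

    UrlAwareClient(String defaultUrl) {
      this.defaultUrl = defaultUrl;
    }

    @Override
    public String request(String req, String collection) {
      return send(defaultUrl, req, collection);
    }

    /** Runs the action against a view of this client scoped to the given base URL. */
    String requestWithBaseUrl(String baseUrl, Function<Client, String> action) {
      Client scoped = (r, c) -> send(baseUrl, r, c);
      return action.apply(scoped);
    }

    private static String send(String url, String req, String collection) {
      return url + "/" + collection + "?" + req;
    }
  }

  /** Stand-in for a client already pointed at the correct URL. */
  static final class FixedClient implements Client {
    private final String url;

    FixedClient(String url) {
      this.url = url;
    }

    @Override
    public String request(String req, String collection) {
      return url + "/" + collection + "?" + req;
    }
  }

  /** Supplies the base URL explicitly only for clients that need it. */
  static String doMakeRequest(Client client, String baseUrl, String collection, String req) {
    if (client instanceof UrlAwareClient) {
      final UrlAwareClient urlAware = (UrlAwareClient) client;
      return urlAware.requestWithBaseUrl(baseUrl, c -> c.request(req, collection));
    }
    return client.request(req, collection);
  }

  public static void main(String[] args) {
    System.out.println(
        doMakeRequest(new UrlAwareClient(null), "http://host-a:8983/solr", "films", "q=x"));
    System.out.println(
        doMakeRequest(new FixedClient("http://host-b:8983/solr"), "ignored", "films", "q=x"));
  }
}
```

As the TODO in the diff says, the branch becomes unnecessary once every client handed out is already pointed at the right URL.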
Re: [PR] SOLR-17256: Remove usages of deprecated SolrRequest.setBasePath [solr]
gerlowskija commented on code in PR #2811:
URL: https://github.com/apache/solr/pull/2811#discussion_r1827753327

## solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBSolrClient.java: ##
@@ -480,14 +480,35 @@ private static boolean isTimeExceeded(long timeAllowedNano, long timeOutTime) {
     return timeAllowedNano > 0 && System.nanoTime() > timeOutTime;
   }

+  private NamedList doMakeRequest(Endpoint endpoint, SolrRequest solrRequest)
+      throws SolrServerException, IOException {
+    final var solrClient = getClient(endpoint);
+    return doMakeRequest(solrClient, endpoint.getBaseUrl(), endpoint.getCore(), solrRequest);
+  }
+
+  // TODO This special casing can be removed if either: (1) SOLR-16367 is completed, or (2)
+  // LBHttp2SolrClient.getClient() is modified to return a client already pointed at the correct URL
+  private NamedList doMakeRequest(
+      SolrClient solrClient, String baseUrl, String collection, SolrRequest solrRequest)
+      throws SolrServerException, IOException {
+    // Some implementations of LBSolrClient.getClient(...) return a Http2SolrClient that may not be
+    // pointed at the desired URL (or any URL for that matter). We special case that here to ensure
+    // the appropriate URL is provided.
+    if (solrClient instanceof Http2SolrClient) {
+      final var httpSolrClient = (Http2SolrClient) solrClient;
+      return httpSolrClient.requestWithBaseUrl(baseUrl, (c) -> c.request(solrRequest, collection));
+    }
+
+    return solrClient.request(solrRequest, collection);

Review Comment:
   Agreed, but IMO that probably deserves its own ticket.

   Switching the Jetty LB client to work this way would probably let us reuse/share some of the client-management code from the Apache LB client...which is great!...but it'd also turn things into a slightly larger refactor than I want to tackle here.

   If we're agreed on this approach I can create a ticket for that work and update the TODO comment here to say essentially: "rip this out when tackling SOLR-12345"?
Re: [PR] SOLR-17535 ClusterState.collectionStream() in lieu of getCollectionStates() [solr]
dsmiley commented on code in PR #2834:
URL: https://github.com/apache/solr/pull/2834#discussion_r1827771870

## solr/solrj/src/java/org/apache/solr/common/cloud/ClusterState.java: ##
@@ -400,29 +402,14 @@ public Set getHostAllowList() {
     return hostAllowList;
   }

-  /**
-   * Iterate over collections. Unlike {@link #getCollectionStates()} collections passed to the
-   * consumer are guaranteed to exist.
-   *
-   * @param consumer collection consumer.
-   */
+  /** Streams the resolved DocCollections. Use this sparingly in case there are many collections. */
+  public Stream collectionStream() {
+    return collectionStates.values().stream().map(CollectionRef::get).filter(Objects::nonNull);
+  }
+
+  /** Streams the resolved DocCollections. Use this sparingly in case there are many collections. */

Review Comment:
   It was deliberate as I thought it was good enough, but I'll add more words.