Re: [PR] SOLR-17649: Fix Json faceting on multivalue number types [solr]

2025-02-05 Thread via GitHub


thomaswoeckinger commented on PR #3158:
URL: https://github.com/apache/solr/pull/3158#issuecomment-2636198561

   @dsmiley @gerlowskija Do you know who is responsible for this part of the 
faceting code, so that we can find a reviewer?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Updated] (SOLR-17649) Multivalue facets on enum field type returns empty result when using JsonFacet API

2025-02-05 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-17649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Wöckinger updated SOLR-17649:

Summary: Multivalue facets on enum field type returns empty result when 
using JsonFacet API  (was: Multivalue facets on enum field type returns empty 
result when using JsonFacet)

> Multivalue facets on enum field type returns empty result when using 
> JsonFacet API
> --
>
> Key: SOLR-17649
> URL: https://issues.apache.org/jira/browse/SOLR-17649
> Project: Solr
>  Issue Type: Bug
>  Components: Facet Module, FacetComponent, faceting
>Affects Versions: 9.4, 9.5, 9.4.1, 9.6, 9.7, 9.6.1, 9.8
>Reporter: Thomas Wöckinger
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When using the JSON Facet API on a multivalued EnumFieldType with facet 
> method 'enum', FacetFieldProcessorByArrayDV will be used.
> At line 96 (FacetFieldProcessorByArrayDV.java) there is a check for allBuckets 
> or missing buckets, which simply skips the collect process.
> So at the moment there is no support for multivalued faceting on facets that 
> use the FacetFieldProcessorByArrayDV facet processor for collecting facets.
> I tested this behavior from 9.4 onwards; the feature was working in the 8.x 
> releases.
> Multivalued faceting on EnumFieldType should be supported again.
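
For anyone trying to reproduce this, the request shape described above can be sketched roughly as follows. This is only an illustration under assumptions: the facet name `by_severity` and the field name `severity_multi` are hypothetical, and the helper does no JSON escaping, so it expects plain identifiers.

```java
// A minimal sketch, not taken from the PR: builds the body of a JSON Facet API
// request with a terms facet that asks for facet method 'enum'.
final class EnumFacetRequestSketch {

  /**
   * Returns a JSON request body with a single terms facet.
   * facetName and field are assumed to be plain identifiers (no escaping is done).
   */
  static String termsFacetBody(String facetName, String field) {
    return "{\"query\":\"*:*\",\"limit\":0,\"facet\":{\""
        + facetName
        + "\":{\"type\":\"terms\",\"field\":\""
        + field
        + "\",\"method\":\"enum\"}}}";
  }
}
```

Calling `EnumFacetRequestSketch.termsFacetBody("by_severity", "severity_multi")` yields a body that could be posted to a collection's query endpoint; per the report, the facet comes back empty when the field is a multivalued EnumFieldType.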



--
This message was sent by Atlassian Jira
(v8.20.10#820010)




[jira] [Commented] (SOLR-17649) Multivalue facets on enum field type returns empty result when using JsonFacet

2025-02-05 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-17649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17923979#comment-17923979
 ] 

Thomas Wöckinger commented on SOLR-17649:
-

[~dsmiley] [~gerlowskija] Do you know who is responsible for this part of the 
faceting code, so that we can find a reviewer for it?







Re: [PR] SOLR-17351: Decompose filestore "get file" API [solr]

2025-02-05 Thread via GitHub


gerlowskija commented on code in PR #3047:
URL: https://github.com/apache/solr/pull/3047#discussion_r1943253094


##
solr/core/src/java/org/apache/solr/filestore/NodeFileStore.java:
##
@@ -76,135 +66,54 @@ public SolrJerseyResponse getFile(String path, Boolean sync, String getFrom, Boo
 final var response = instantiateJerseyResponse(SolrJerseyResponse.class);
 
 if (Boolean.TRUE.equals(sync)) {
-  try {
-fileStore.syncToAllNodes(path);
-return response;
-  } catch (IOException e) {
-throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Error getting file ", e);
-  }
+  ClusterFileStore.syncToAllNodes(fileStore, path);

Review Comment:
   Alright, the most recent iteration removes NodeFileStore/NodeFileStoreApi.  
There's still some room for confusion between ClusterFileStoreApi, 
ClusterFileStore, and DistribFileStore, but overall this PR leaves things 
better than it found them in that regard 👍 






Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


epugh commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943278670


##
solr/test-framework/src/java/org/apache/solr/util/RestTestBase.java:
##
@@ -88,13 +88,33 @@ private static void checkUpdateU(String message, String update, boolean shouldSu
 if (response != null) fail(m + "update was not successful: " + response);
   } else {
 String response = restTestHarness.validateErrorUpdate(update);
-if (response != null) fail(m + "update succeeded, but should have failed: " + response);
+if (response == null) fail(m + "update succeeded, but should have failed: " + response);
   }
 } catch (SAXException e) {
   throw new RuntimeException("Invalid XML", e);
 }
   }
 
+  public static void checkUpdateU(String update, String... tests) {

Review Comment:
   Not specific to this per se, but I wish we had a clearer plan about the 
future of RestTestBase.  Are we embracing it?



##
solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java:
##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.SolrInputField;
+import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel;
+import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.update.AddUpdateCommand;
+import org.apache.solr.update.processor.UpdateRequestProcessor;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.ArrayList;
+import java.util.List;
+
+
+class TextToVectorUpdateProcessor extends UpdateRequestProcessor {
+private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+private final String inputField;
+private final String outputField;
+private final String model;
+private SolrTextToVectorModel textToVector;
+private ManagedTextToVectorModelStore modelStore = null;
+
+public TextToVectorUpdateProcessor(
+String inputField,
+String outputField,
+String model,
+SolrQueryRequest req,
+UpdateRequestProcessor next) {
+super(next);
+this.inputField = inputField;
+this.outputField = outputField;
+this.model = model;
+this.modelStore = ManagedTextToVectorModelStore.getManagedModelStore(req.getCore());
+}
+
+/**
+ * @param cmd the update command in input containing the Document to process
+ * @throws IOException If there is a low-level I/O error
+ */
+@Override
+public void processAdd(AddUpdateCommand cmd) throws IOException {
+this.textToVector = modelStore.getModel(model);
+if (textToVector == null) {
+throw new SolrException(
+SolrException.ErrorCode.BAD_REQUEST,
+"The model requested '"
++ model
++ "' can't be found in the store: "
++ ManagedTextToVectorModelStore.REST_END_POINT);
+}
+
+SolrInputDocument doc = cmd.getSolrInputDocument();
+SolrInputField inputFieldContent = doc.get(inputField);
+if (!isNullOrEmpty(inputFieldContent, doc, inputField)) {
+String textToVectorise = inputFieldContent.getValue().toString();//add null checks and
+float[] vector = textToVector.vectorise(textToVectorise);

Review Comment:
   I was chatting with @iamsanjay this morning, and I was expounding on the 
thought that a lot of folks might want to first index the document with just 
the core text/string/numbers and then, since enrichment is SLOW, come back 
with a streaming expression to do things like vectorization and an atomic 
update. That way you pump your data in as fast as possible, and then enrich 
at your leisure... This model cons

[jira] [Commented] (SOLR-13681) make Lucene's index sorting directly configurable in Solr

2025-02-05 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17923973#comment-17923973
 ] 

Christine Poerschke commented on SOLR-13681:


bq. ticket cross-referencing: SOLR-12239 details how enabling of index sorting 
for an existing collection is problematic. if (speculation) LUCENE-9484 perhaps 
solves that then that would also be beneficial for the changes here it seems.

I don't know whether enabling it for an existing collection remains problematic 
or not. However, if it does, might "Solr 10" perhaps provide an "opportunity", 
e.g. to deprecate the SortingMergePolicy in 9.x and remove it in 10.x, and 
then to have index sorting directly configurable from 10 onwards only?

> make Lucene's index sorting directly configurable in Solr
> -
>
> Key: SOLR-13681
> URL: https://issues.apache.org/jira/browse/SOLR-13681
> Project: Solr
>  Issue Type: New Feature
>Reporter: Christine Poerschke
>Priority: Minor
> Attachments: SOLR-13681-refguide-skel.patch, SOLR-13681.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> History/Background:
> * SOLR-5730 made Lucene's SortingMergePolicy and 
> EarlyTerminatingSortingCollector configurable in Solr 6.0 or later.
> * LUCENE-6766 made index sorting a first-class citizen in Lucene 6.2 or later.
> Current status:
> * In Solr 8.2 use of index sorting is only available via configuration of a 
> (top-level) merge policy that is a SortingMergePolicy and that policy's sort 
> is then passed to the index writer config via the 
> {code}
> if (mergePolicy instanceof SortingMergePolicy) {
>   Sort indexSort = ((SortingMergePolicy) mergePolicy).getSort();
>   iwc.setIndexSort(indexSort);
> }
> {code}
> https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java#L241-L244
>  code path.
> Proposed change:
> * in-scope for this ticket: To add direct support for index sorting 
> configuration in Solr.
> * out-of-scope for this ticket: deprecation and removal of SortingMergePolicy 
> support






[jira] [Commented] (SOLR-17653) New DockerSolrServerTestRule that uses TestContainers/Docker

2025-02-05 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-17653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17924309#comment-17924309
 ] 

Jan Høydahl commented on SOLR-17653:


I have some experience with JUnit and Testcontainers. In general it works like 
magic, but there are a few things to watch out for:
 * You can start a container from a Dockerfile, but the test then takes ages to 
run because of the image build and pulling from Docker Hub.
 * When addressing the new container from a test, "localhost" may work when run 
locally, but in CI/CD the host may be something else.
 * Solr code running in the container cannot use localhost to address another 
node in another container. It must use either Docker networking (preferred) or 
"host.docker.internal".
 * Cleaning up (docker system prune or similar) as part of the test harness may 
avoid buildup and disks filling up with old images or volumes.
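
The "localhost" point above could be handled along these lines. This is a sketch under assumptions: the host and mapped port are passed in by the caller (with Testcontainers they would typically come from the container's `getHost()` and `getMappedPort(8983)`), and the class and method names here are made up for illustration.

```java
// A minimal sketch: build the Solr base URL from whatever host and port the
// container runtime reports, instead of hard-coding "localhost:8983".
final class ContainerAddressSketch {

  /** Returns a base URL such as http://some-host:32768/solr. */
  static String solrBaseUrl(String host, int mappedPort) {
    if (host == null || host.isBlank()) {
      throw new IllegalArgumentException("host must not be blank");
    }
    if (mappedPort < 1 || mappedPort > 65535) {
      throw new IllegalArgumentException("port out of range: " + mappedPort);
    }
    return "http://" + host + ":" + mappedPort + "/solr";
  }
}
```

The same test then works unchanged whether Docker runs on the developer's machine or on a remote CI host, because nothing in it assumes where the container is reachable.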

> New DockerSolrServerTestRule that uses TestContainers/Docker
> 
>
> Key: SOLR-17653
> URL: https://issues.apache.org/jira/browse/SOLR-17653
> Project: Solr
>  Issue Type: Test
>  Components: SolrJ, test-framework
>Reporter: David Smiley
>Priority: Major
>
> We've got a {{SolrClientTestRule}} abstraction in our test infrastructure 
> that makes it easy for a test to work with Solr in an abstracted sense, using 
> a SolrClient to talk to it.  This issue proposes a new implementation that 
> uses an Http SolrClient (of configurable implementation) to talk to SolrCloud 
> (embedded ZK) running on a single node in a Docker container, and using 
> TestContainers to facilitate the integration.  Future extensibility: the 
> implementation could consider multiple nodes thus multiple containers.  The 
> location of this utility could be a new artifact (JAR) living in the docker 
> module with appropriate dependencies, or just put in solr-test-framework.
> This would be extremely useful!  We could then write a hello-world SolrJ 
> SolrCloud test in which Solr is running semi-realistically.  This is all we 
> need for easily adding some backwards-compatibility tests, in either 
> direction (SolrJ 9 to Solr 10, and SolrJ 10 to Solr 9).  Both directions are 
> useful in a rolling upgrade.  Tests using this in our test suite should be 
> demarcated / separated somehow so that running such tests is opt-in, not part 
> of the "check" task.
> The addition of this may draw into question the validity/utility of our 
> Docker bats tests, that could perhaps instead be using this.






[jira] [Commented] (SOLR-17649) Multivalue facets on enum field type returns empty result when using JsonFacet API

2025-02-05 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17924310#comment-17924310
 ] 

David Smiley commented on SOLR-17649:
-

Might you surmise how this regression came to be?







Re: [PR] SOLR-17649: Fix Json faceting on multivalue number types [solr]

2025-02-05 Thread via GitHub


dsmiley commented on code in PR #3158:
URL: https://github.com/apache/solr/pull/3158#discussion_r1943824313


##
solr/core/src/test/org/apache/solr/search/facet/TestJsonFacets.java:
##
@@ -4907,6 +4907,38 @@ public void testQueryJoinBooksAndPages() throws Exception {
 + ", books2:{ buckets:[ {val:q,count:1}, {val:w,count:1} ] }"
 + "}");
   }
+  
+  @Test
+  public void testMultivalueEnumTypes() throws Exception {
+final Client client = Client.localClient();
+
+final SolrParams p = params("rows", "0");
+
+client.deleteByQuery("*:*", null);
+
+List docsToAdd = new ArrayList<>(6);

Review Comment:
   Can you add them to an UpdateRequest so they are batched instead of sending 
them individually?  Same LOC in the end but avoids an anti-pattern.  Or is the 
separation pertinent to what's being tested?






Re: [PR] SOLR-17654: DistribFileStore._getRealPath() has issues on Windows [solr]

2025-02-05 Thread via GitHub


janhoy commented on PR #3160:
URL: https://github.com/apache/solr/pull/3160#issuecomment-2638285227

   > _(thinking out loud)_ I'd love to get a JVM running in Windows in a 
container maybe (not possible with macOS host?) or I suppose VirtualBox if I 
have to. Actually, I could probably use AWS free tier and use an AMI with Java 
to do some temporary tinkering. Or maybe someone recommends another option.
   
   I use UTM (https://mac.getutm.app) on an M1 Mac to spin up Win11 ARM edition. 
You can share your disk and test stuff. PS: I wonder if we could tell 
GitHub Actions to run tests on a Windows runner for a PR? Perhaps triggered by 
some file being touched?





Re: [PR] SOLR-17130: edismax-matchalldocs-optimization [solr]

2025-02-05 Thread via GitHub


github-actions[bot] closed pull request #2218: SOLR-17130: 
edismax-matchalldocs-optimization
URL: https://github.com/apache/solr/pull/2218





Re: [PR] SOLR-17130: edismax-matchalldocs-optimization [solr]

2025-02-05 Thread via GitHub


github-actions[bot] commented on PR #2218:
URL: https://github.com/apache/solr/pull/2218#issuecomment-2638303460

   This PR is now closed due to 60 days of inactivity after being marked as 
stale.  Re-opening this PR is still possible, in which case it will be marked 
as active again.





Re: [PR] SOLR-16810: Under certain situations Solr produces managed schema XML with duplicate fields [solr]

2025-02-05 Thread via GitHub


github-actions[bot] closed pull request #1654: SOLR-16810: Under certain 
situations Solr produces managed schema XML with duplicate fields
URL: https://github.com/apache/solr/pull/1654





Re: [PR] SOLR-16810: Under certain situations Solr produces managed schema XML with duplicate fields [solr]

2025-02-05 Thread via GitHub


github-actions[bot] commented on PR #1654:
URL: https://github.com/apache/solr/pull/1654#issuecomment-2638303508

   This PR is now closed due to 60 days of inactivity after being marked as 
stale.  Re-opening this PR is still possible, in which case it will be marked 
as active again.





Re: [PR] SOLR-17131: Optimize rows=0 since score/sort isn't necessary [solr]

2025-02-05 Thread via GitHub


github-actions[bot] commented on PR #2221:
URL: https://github.com/apache/solr/pull/2221#issuecomment-2638303410

   This PR is now closed due to 60 days of inactivity after being marked as 
stale.  Re-opening this PR is still possible, in which case it will be marked 
as active again.





Re: [PR] SOLR-17131: Optimize rows=0 since score/sort isn't necessary [solr]

2025-02-05 Thread via GitHub


github-actions[bot] closed pull request #2221: SOLR-17131: Optimize rows=0 
since score/sort isn't necessary
URL: https://github.com/apache/solr/pull/2221





Re: [PR] SOLR-17649: Fix Json faceting on multivalue number types [solr]

2025-02-05 Thread via GitHub


thomaswoeckinger commented on PR #3158:
URL: https://github.com/apache/solr/pull/3158#issuecomment-2639051999

   > It's unfortunate we can't check the docValues format, but can we at least 
add a comment explaining this. It's a bit confusing just looking at it without 
context.
   
   FacetFieldProcessorByArrayDV does not handle DocValuesType.SORTED_NUMERIC 
because the method findStartAndEndOrds uses FieldUtil.getSortedSetDocValues, 
which cannot handle this DocValuesType.
   
   In detail, FacetFieldProcessorByArrayDV extends FacetFieldProcessorByArray, 
which seems to be designed for DocValuesType.SORTED_SET, because the method 
lookupOrd is only available for that type.
   
   That said, you may have some suggestions for a code comment.
   
   To me it seems this has not worked since the beginning of the 9.x branch; 
there was simply no test for it. 
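
As a starting point for that code comment, the constraint could be written down roughly like this. It is a sketch only: `DvType` is a local stand-in that mirrors the constant names of Lucene's `DocValuesType`, and `supportsOrdinalLookup` is a hypothetical helper, not code from the PR. It assumes that only the ordinal-based types (SORTED and SORTED_SET) offer `lookupOrd`.

```java
// A minimal sketch of the guard the comment could document.
final class ArrayDvSupportSketch {

  /** Local stand-in for Lucene's DocValuesType (assumption: same constant names). */
  enum DvType { NONE, NUMERIC, BINARY, SORTED, SORTED_NUMERIC, SORTED_SET }

  /**
   * The array-based DV processor works on term ordinals, so it can only serve
   * fields whose doc values expose ordinals. SORTED_NUMERIC stores raw numbers
   * per document and has no ordinals to look up.
   */
  static boolean supportsOrdinalLookup(DvType type) {
    return type == DvType.SORTED || type == DvType.SORTED_SET;
  }
}
```

A multivalued numeric-backed field would report SORTED_NUMERIC here and therefore needs a different processor, which is the situation the comment should explain.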





[jira] [Commented] (SOLR-17635) javabin should deserialize maps as SimpleOrderedMap

2025-02-05 Thread Renato Haeberli (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17924383#comment-17924383
 ] 

Renato Haeberli commented on SOLR-17635:


The default behavior should be to use SimpleOrderedMap, and the property is to 
switch it back to NamedList?

> javabin should deserialize maps as SimpleOrderedMap
> ---
>
> Key: SOLR-17635
> URL: https://issues.apache.org/jira/browse/SOLR-17635
> Project: Solr
>  Issue Type: Improvement
>Reporter: David Smiley
>Priority: Major
>
> Once SimpleOrderedMap actually implements Map (SOLR-17623), Solr's "javabin" 
> format should deserialize all maps as a SimpleOrderedMap.  This will make it 
> easier to transition away from NamedList/SimpleOrderedMap in responses (such 
> as to a Map or MapWriter) without worry of impacting javabin clients that 
> still expect a NamedList.  
> It may also increase deserialization performance & lower memory at the 
> expense of any former Maps (which were deserialized as LinkedHashMap, with 
> O(1) lookup) becoming O(N) lookup.
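
The lookup trade-off mentioned in the description can be illustrated with plain JDK collections. This is not Solr's NamedList or SimpleOrderedMap implementation; it is a generic sketch that assumes lookup by name over ordered name/value pairs is a linear scan, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// A minimal sketch of O(N) pair-list lookup versus O(1) hashed lookup.
final class LookupCostSketch {

  /** Linear scan over ordered name/value pairs: O(N) per lookup, first match wins. */
  static Object scanGet(List<Map.Entry<String, Object>> pairs, String name) {
    for (Map.Entry<String, Object> pair : pairs) {
      if (pair.getKey().equals(name)) {
        return pair.getValue();
      }
    }
    return null;
  }

  /** Copies the pairs into a LinkedHashMap (order kept) for O(1) expected lookups. */
  static Map<String, Object> toHashed(List<Map.Entry<String, Object>> pairs) {
    Map<String, Object> map = new LinkedHashMap<>();
    for (Map.Entry<String, Object> pair : pairs) {
      // Keep the first value for a repeated name, matching scanGet.
      map.putIfAbsent(pair.getKey(), pair.getValue());
    }
    return map;
  }
}
```

For the small maps typical of responses the scan is usually cheap and avoids the hash table's memory overhead; a client that does many lookups on a large map could copy it once with something like `toHashed`.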






[PR] SOLR-17654: DistribFileStore._getRealPath() has issues on Windows [solr]

2025-02-05 Thread via GitHub


mlbiscoc opened a new pull request, #3160:
URL: https://github.com/apache/solr/pull/3160

   https://issues.apache.org/jira/browse/SOLR-17654
   
   # Description
   
   A number of tests fail on Windows because the leading slash in a path is 
not stripped to make the path relative.
   
   # Solution
   
   Revert to the old logic, but use the more modern 
`FileSystems.getDefault().getSeparator()` instead.
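
   The string-based approach described above can be sketched as follows. This is an illustration under assumptions, not the code in this PR: the class and method names are hypothetical, and the UNC check and the resolution against the file store directory that the real method performs are left out.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;

// A minimal sketch: normalise forward slashes to the platform separator,
// trim every leading separator, then resolve the remainder under a base dir.
final class LeadingSeparatorSketch {

  /** Converts '/' to the given separator if needed and trims leading separators. */
  static String stripLeading(String path, String separator) {
    if (!"/".equals(separator)) {
      path = path.replace("/", separator);
    }
    while (path.startsWith(separator)) {
      path = path.substring(separator.length());
    }
    return path;
  }

  /** Resolves the now-relative path against the given base directory. */
  static Path realPath(String path, Path base) {
    String separator = FileSystems.getDefault().getSeparator();
    return base.resolve(stripLeading(path, separator));
  }
}
```

   Taking the separator as a parameter in `stripLeading` means the Windows behaviour can be unit-tested on any platform by passing a backslash explicitly.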
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [ ] I have reviewed the guidelines for [How to 
Contribute](https://github.com/apache/solr/blob/main/CONTRIBUTING.md) and my 
code conforms to the standards described there to the best of my ability.
   - [ ] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [ ] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended, not available for 
branches on forks living under an organisation)
   - [ ] I have developed this patch against the `main` branch.
   - [ ] I have run `./gradlew check`.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Reference 
Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
   





Re: [PR] SOLR-16903: Migrate off java.io.File to java.nio.file.Path from core files [solr]

2025-02-05 Thread via GitHub


mlbiscoc commented on code in PR #2924:
URL: https://github.com/apache/solr/pull/2924#discussion_r1943703861


##
solr/core/src/java/org/apache/solr/filestore/DistribFileStore.java:
##
@@ -80,19 +80,21 @@ public DistribFileStore(CoreContainer coreContainer) {
   }
 
   @Override
-  public Path getRealpath(String path) {
+  public Path getRealPath(String path) {
 return _getRealPath(path, solrHome);
   }
 
-  private static Path _getRealPath(String path, Path solrHome) {
-if (File.separatorChar == '\\') {
-  path = path.replace('/', File.separatorChar);
-}
-SolrPaths.assertNotUnc(Path.of(path));
-while (path.startsWith(File.separator)) { // Trim all leading slashes
-  path = path.substring(1);
+  private static Path _getRealPath(String dir, Path solrHome) {
+Path path = Path.of(dir);
+SolrPaths.assertNotUnc(path);
+
+if (path.isAbsolute()) {
+  // Strip the path of from being absolute to become relative to resolve with SolrHome
+  path = path.subpath(0, path.getNameCount());
 }

Review Comment:
   [PR](https://github.com/apache/solr/pull/3160) to bring back that old logic.






[jira] [Updated] (SOLR-17654) DistribFileStore._getRealPath() has issues on Windows

2025-02-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SOLR-17654:
--
Labels: pull-request-available  (was: )

> DistribFileStore._getRealPath() has issues on Windows
> -
>
> Key: SOLR-17654
> URL: https://issues.apache.org/jira/browse/SOLR-17654
> Project: Solr
>  Issue Type: Improvement
>Reporter: Houston Putman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> On Windows, many tests that use the DistribFileStore, such as 
> {{TestPackages}}, {{TestDistribFileStore}} and {{PackageToolTest}} are 
> failing because of an issue in {{DistribFileStore._getRealPath()}}.
> This method tries to remove the beginning slashes from the path, and then 
> tries to make a new path relative to the file store location. However, in the 
> tests, it's failing and showing stuff like "Illegal path 
> \mypkg\v.0.12\jar_a.jar". Clearly in the code, the first "\" should have been 
> removed, so this code is having an issue with Windows.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)




Re: [PR] SOLR-17351: Decompose filestore "get file" API [solr]

2025-02-05 Thread via GitHub


gerlowskija commented on PR #3047:
URL: https://github.com/apache/solr/pull/3047#issuecomment-2637426498

   Alright, I think I've addressed the feedback so far.  If I've missed 
anything, let me know.  I've brought it up to date with `main` and will aim to 
merge in the next few days, pending any objections.





Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


alessandrobenedetti commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943491138


##
solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java:
##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.SolrInputField;
+import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel;
+import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.update.AddUpdateCommand;
+import org.apache.solr.update.processor.UpdateRequestProcessor;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.ArrayList;
+import java.util.List;
+
+
+class TextToVectorUpdateProcessor extends UpdateRequestProcessor {
+private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+private final String inputField;
+private final String outputField;
+private final String model;
+private SolrTextToVectorModel textToVector;
+private ManagedTextToVectorModelStore modelStore = null;

Review Comment:
   Sure!






[jira] [Created] (SOLR-17654) DistribFileStore._getRealPath() has issues on Windows

2025-02-05 Thread Houston Putman (Jira)
Houston Putman created SOLR-17654:
-

 Summary: DistribFileStore._getRealPath() has issues on Windows
 Key: SOLR-17654
 URL: https://issues.apache.org/jira/browse/SOLR-17654
 Project: Solr
  Issue Type: Improvement
Reporter: Houston Putman


On Windows, many tests that use the DistribFileStore, such as {{TestPackages}}, 
{{TestDistribFileStore}} and {{PackageToolTest}} are failing because of an 
issue in {{DistribFileStore._getRealPath()}}.

This method tries to remove the beginning slashes from the path, and then tries 
to make a new path relative to the file store location. However, in the tests, 
it's failing and showing stuff like "Illegal path \mypkg\v.0.12\jar_a.jar". 
Clearly in the code, the first "\" should have been removed, so this code is 
having an issue with Windows.
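
A minimal sketch of the behaviour described above, assuming the goal is to treat any rooted path as relative to the file store directory. The class and method names (`RealPathSketch`, `resolveUnder`) are hypothetical and this is not the actual `DistribFileStore` code. It relies on the fact that, on Windows, a path such as `\mypkg` has a root but no drive letter, so `Path.isAbsolute()` returns false for it while `Path.getRoot()` is non-null.

```java
import java.nio.file.Path;

// Hypothetical helper for illustration only; not the DistribFileStore implementation.
class RealPathSketch {

  /** Resolves dir under base, treating a rooted dir as relative to base. */
  static Path resolveUnder(Path base, String dir) {
    Path path = Path.of(dir);
    Path root = path.getRoot();
    if (root != null) {
      // Checking getRoot() instead of isAbsolute() also covers Windows paths
      // like "\mypkg", which have a root but are not considered absolute.
      path = root.relativize(path);
    }
    Path resolved = base.resolve(path).normalize();
    // Reject anything that escapes the base directory (for example via "..").
    if (!resolved.startsWith(base.normalize())) {
      throw new IllegalArgumentException("Illegal path " + dir);
    }
    return resolved;
  }

  public static void main(String[] args) {
    Path base = Path.of("filestore");
    System.out.println(resolveUnder(base, "/mypkg/v.0.12/jar_a.jar"));
  }
}
```

Whether stripping the root this way is the right fix for the failing tests is an open question in the ticket; the sketch only shows why an `isAbsolute()` check alone can leave the leading separator in place on Windows.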







Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


alessandrobenedetti commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943577028


##
solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorFactory.java:
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.params.SolrParams;
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.schema.DenseVectorField;
+import org.apache.solr.schema.FieldType;
+import org.apache.solr.schema.SchemaField;
+import org.apache.solr.update.processor.UpdateRequestProcessor;
+import org.apache.solr.update.processor.UpdateRequestProcessorFactory;
+
+/**
+ * This class implements an UpdateProcessorFactory for the Text To Vector Update Processor.
+ */
+public class TextToVectorUpdateProcessorFactory extends UpdateRequestProcessorFactory {
+private static final String INPUT_FIELD_PARAM = "inputField";
+private static final String OUTPUT_FIELD_PARAM = "outputField";
+private static final String MODEL_NAME = "model";
+
+String inputField;
+String outputField;
+String modelName;
+SolrParams params;
+
+
+@Override
+public void init(final NamedList args) {
+if (args != null) {

Review Comment:
   My bad, I took inspiration from an old factory. I'll remove this useless 
check in the next commit!






Re: [PR] SOLR-17654: DistribFileStore._getRealPath() has issues on Windows [solr]

2025-02-05 Thread via GitHub


dsmiley commented on PR #3160:
URL: https://github.com/apache/solr/pull/3160#issuecomment-2638256078

   _(thinking out loud)_
   I'd love to get a JVM running in Windows in a container maybe (not possible 
with macOS host?) or I suppose VirtualBox if I have to.  Actually, I could 
probably use AWS free tier and use an AMI with Java to do some temporary 
tinkering.  Or maybe someone recommends another option.





Re: [PR] Deprecations [solr]

2025-02-05 Thread via GitHub


epugh commented on PR #3159:
URL: https://github.com/apache/solr/pull/3159#issuecomment-2636956809

   I ❤️ deprecations.  How can we have a strategy to make sure these removals 
happen?   I.e., how can we help folks see these areas of work and then move them 
forward?  
   
   Are you thinking that once this is backported to 9, we can just go ahead in 
`main` and start removing the deprecated code?
   
   In my head, any deprecated code existing in 10 is a shame.  We should rip 
those band aids off, but maybe I need a more nuanced view of what deprecation 
means.   Every time I see in a test a deprecated method (especially when it's 
not clear exactly how to fix it) it makes me sad ;-).





Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


alessandrobenedetti commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943628396


##
solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java:
##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.SolrInputField;
+import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel;
+import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.update.AddUpdateCommand;
+import org.apache.solr.update.processor.UpdateRequestProcessor;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.ArrayList;
+import java.util.List;
+
+
+class TextToVectorUpdateProcessor extends UpdateRequestProcessor {
+private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+private final String inputField;
+private final String outputField;
+private final String model;
+private SolrTextToVectorModel textToVector;
+private ManagedTextToVectorModelStore modelStore = null;
+
+public TextToVectorUpdateProcessor(
+String inputField,
+String outputField,
+String model,
+SolrQueryRequest req,
+UpdateRequestProcessor next) {
+super(next);
+this.inputField = inputField;
+this.outputField = outputField;
+this.model = model;
+this.modelStore = ManagedTextToVectorModelStore.getManagedModelStore(req.getCore());
+}
+
+/**
+ * @param cmd the update command in input containing the Document to process
+ * @throws IOException If there is a low-level I/O error
+ */
+@Override
+public void processAdd(AddUpdateCommand cmd) throws IOException {
+this.textToVector = modelStore.getModel(model);
+if (textToVector == null) {
+throw new SolrException(
+SolrException.ErrorCode.BAD_REQUEST,
+"The model requested '"
++ model
++ "' can't be found in the store: "
++ ManagedTextToVectorModelStore.REST_END_POINT);
+}
+
+SolrInputDocument doc = cmd.getSolrInputDocument();
+SolrInputField inputFieldContent = doc.get(inputField);
+if (!isNullOrEmpty(inputFieldContent, doc, inputField)) {
+String textToVectorise = inputFieldContent.getValue().toString();//add null checks and
+float[] vector = textToVector.vectorise(textToVectorise);
+List<Float> vectorAsList = new ArrayList<>(vector.length);
+for (float f : vector) {
+vectorAsList.add(f);
+}
+doc.addField(outputField, vectorAsList);
+}
+super.processAdd(cmd);
+}
+
+protected boolean isNullOrEmpty(SolrInputField inputFieldContent, SolrInputDocument doc, String fieldName) {

Review Comment:
   Hmm, I see your point. Would it be better to just log a warning saying 
"vectorisation failed", with the reason "null or empty source field"?
   
   
   I suspect a silent failure would be equally problematic, since it would be 
hard to understand why there are no vectors. (I also just discovered that the 
`vectorise` method can throw a runtime exception.)
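
   A rough sketch of the "log a warning instead of failing" idea, written 
without any Solr classes so it stands alone. Everything here is hypothetical: 
`VectoriseOrWarnSketch`, `vectoriseOrWarn`, and the `Function` standing in for 
the model are illustration names, and `java.util.logging` is used only to keep 
the example dependency-free.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;
import java.util.logging.Logger;

// Hypothetical stand-in for the processor logic; the model is assumed to be a plain function.
class VectoriseOrWarnSketch {
  private static final Logger log = Logger.getLogger(VectoriseOrWarnSketch.class.getName());

  /** Returns the vector for the field value, or empty when vectorisation is skipped or fails. */
  static Optional<List<Float>> vectoriseOrWarn(
      String docId, String fieldName, Object fieldValue, Function<String, float[]> model) {
    if (fieldValue == null || fieldValue.toString().isEmpty()) {
      log.warning(
          "Vectorisation failed for document '" + docId
              + "': null or empty source field '" + fieldName + "'");
      return Optional.empty();
    }
    try {
      float[] vector = model.apply(fieldValue.toString());
      List<Float> vectorAsList = new ArrayList<>(vector.length);
      for (float f : vector) {
        vectorAsList.add(f);
      }
      return Optional.of(vectorAsList);
    } catch (RuntimeException e) {
      // The model call may throw at runtime; log it and index the document without a vector.
      log.warning("Vectorisation failed for document '" + docId + "': " + e.getMessage());
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    Function<String, float[]> dummyModel = text -> new float[] {1.0f, 2.0f, 3.0f, 4.0f};
    System.out.println(vectoriseOrWarn("99", "_text_", "", dummyModel).isPresent());
    System.out.println(vectoriseOrWarn("98", "_text_", "Vegeta is the saiyan prince.", dummyModel));
  }
}
```

   With this shape the caller adds the output field only when a vector is 
present, so the document is still indexed either way and the reason for a 
missing vector ends up in the log.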






Re: [PR] Deprecations [solr]

2025-02-05 Thread via GitHub


dsmiley commented on PR #3159:
URL: https://github.com/apache/solr/pull/3159#issuecomment-2637969249

   > we can just go ahead in main and start removing the deprecated code?
   
   Sure.  Some have JIRAs, even.  There are plenty of other deprecations outside 
this PR.
   
   > any deprecated code existing in 10 is a shame.
   
   I've learned to be realistic in a large code base.  The positive spin I'm at 
peace with is that spending just a little bit of deprecation time in advance 
gives us permission later to have the fun of deleting stuff when it suits us.





Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


alessandrobenedetti commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943633074


##
solr/modules/llm/src/test/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorFactoryTest.java:
##
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.params.MultiMapSolrParams;
+import org.apache.solr.common.params.SolrParams;
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.llm.TestLlmBase;
+import org.apache.solr.request.SolrQueryRequestBase;
+import org.junit.AfterClass;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.util.HashMap;
+import java.util.Map;
+
+
+public class TextToVectorUpdateProcessorFactoryTest extends TestLlmBase {
+  private TextToVectorUpdateProcessorFactory factoryToTest =
+  new TextToVectorUpdateProcessorFactory();
+  private NamedList args = new NamedList<>();
+  
+  @BeforeClass
+  public static void initArgs() throws Exception {
+setupTest("solrconfig-llm.xml", "schema.xml", false, false);
+  }
+
+  @AfterClass
+  public static void after() throws Exception {
+afterTest();
+  }
+
+  @Test
+  public void init_fullArgs_shouldInitFullClassificationParams() {
+args.add("inputField", "_text_");
+args.add("outputField", "vector");
+args.add("model", "model1");
+factoryToTest.init(args);
+
+assertEquals("_text_", factoryToTest.getInputField());
+assertEquals("vector", factoryToTest.getOutputField());
+assertEquals("model1", factoryToTest.getModelName());
+  }
+
+  @Test
+  public void init_nullInputField_shouldThrowExceptionWithDetailedMessage() {
+args.add("outputField", "vector");
+args.add("model", "model1");
+
+SolrException e = assertThrows(SolrException.class, () -> factoryToTest.init(args));
+assertEquals("Text to Vector UpdateProcessor 'inputField' can not be null", e.getMessage());
+  }
+
+  @Test
+  public void init_notExistentInputField_shouldThrowExceptionWithDetailedMessage() throws Exception {
+args.add("inputField", "notExistentInput");
+args.add("outputField", "vector");
+args.add("model", "model1");
+
+Map<String, String[]> params = new HashMap<>();
+MultiMapSolrParams mmparams = new MultiMapSolrParams(params);
+SolrQueryRequestBase req = new SolrQueryRequestBase(solrClientTestRule.getCoreContainer().getCore("collection1"), (SolrParams) mmparams) {};

Review Comment:
   I admit I don't know, I'm not that Java savvy. I suspect it has to do with 
instantiating a subclass of an abstract class?
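
   For reference, a small stand-alone sketch of what the trailing `{}` does, 
assuming (as I believe is the case) that `SolrQueryRequestBase` is declared 
abstract. `RequestBase` and `AnonymousSubclassSketch` are hypothetical names 
used only for illustration, not Solr types.

```java
// Hypothetical abstract base class standing in for a class like SolrQueryRequestBase.
abstract class RequestBase {
  private final String core;

  protected RequestBase(String core) {
    this.core = core;
  }

  String describe() {
    return "request for " + core;
  }
}

class AnonymousSubclassSketch {
  static RequestBase newRequest(String core) {
    // "new RequestBase(core)" on its own would not compile: abstract classes
    // cannot be instantiated. The trailing {} declares an anonymous subclass
    // with an empty body and instantiates that subclass instead.
    return new RequestBase(core) {};
  }

  public static void main(String[] args) {
    RequestBase req = newRequest("collection1");
    System.out.println(req.describe());
    System.out.println(req.getClass().isAnonymousClass());
  }
}
```

   So the `{}` is not decoration: it is the body of an anonymous subclass, 
which is what makes the `new` expression legal for an abstract type.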






[jira] [Commented] (SOLR-13681) make Lucene's index sorting directly configurable in Solr

2025-02-05 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17924252#comment-17924252
 ] 

David Smiley commented on SOLR-13681:
-

Yes please do that.  Major releases are indeed a time to simplify something, 
removing old things, even if it breaks someone.  If we're not sure if existing 
users of some advanced feature like this will be compatible, I still think a 
major release is okay.  Just call out the risk/unknown in the upgrade notes 
asciidoc file.

> make Lucene's index sorting directly configurable in Solr
> -
>
> Key: SOLR-13681
> URL: https://issues.apache.org/jira/browse/SOLR-13681
> Project: Solr
>  Issue Type: New Feature
>Reporter: Christine Poerschke
>Priority: Minor
> Attachments: SOLR-13681-refguide-skel.patch, SOLR-13681.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> History/Background:
> * SOLR-5730 made Lucene's SortingMergePolicy and 
> EarlyTerminatingSortingCollector configurable in Solr 6.0 or later.
> * LUCENE-6766 made index sorting a first-class citizen in Lucene 6.2 or later.
> Current status:
> * In Solr 8.2 use of index sorting is only available via configuration of a 
> (top-level) merge policy that is a SortingMergePolicy and that policy's sort 
> is then passed to the index writer config via the 
> {code}
> if (mergePolicy instanceof SortingMergePolicy) {
>   Sort indexSort = ((SortingMergePolicy) mergePolicy).getSort();
>   iwc.setIndexSort(indexSort);
> }
> {code}
> https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java#L241-L244
>  code path.
> Proposed change:
> * in-scope for this ticket: To add direct support for index sorting 
> configuration in Solr.
> * out-of-scope for this ticket: deprecation and removal of SortingMergePolicy 
> support






Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


alessandrobenedetti commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943636345


##
solr/modules/llm/src/test/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorTest.java:
##
@@ -0,0 +1,117 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.client.solrj.SolrQuery;
+import org.apache.solr.llm.TestLlmBase;
+import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+
+public class TextToVectorUpdateProcessorTest extends TestLlmBase {
+
+@BeforeClass
+public static void init() throws Exception {
+setupTest("solrconfig-llm-indexing.xml", "schema.xml", false, false);
+
+}
+
+@Test
+public void processAdd_inputField_shouldVectoriseInputField()
+throws Exception {
+loadModel("dummy-model.json");
+assertU(adoc("id", "99", "_text_", "Vegeta is the saiyan prince."));
+assertU(adoc("id", "98", "_text_", "Vegeta is the saiyan prince."));
+assertU(commit());
+
+final String solrQuery = "*:*";
+final SolrQuery query = new SolrQuery();
+query.setQuery(solrQuery);
+query.add("fl", "id,vector");
+
+assertJQ(
+"/query" + query.toQueryString(),
+"/response/numFound==2]",
+"/response/docs/[0]/id=='99'",
+"/response/docs/[0]/vector==[1.0, 2.0, 3.0, 4.0]",
+"/response/docs/[1]/id=='98'",
+"/response/docs/[1]/vector==[1.0, 2.0, 3.0, 4.0]");
+
+restTestHarness.delete(ManagedTextToVectorModelStore.REST_END_POINT + "/dummy-1");

Review Comment:
   It was cleanup, but not all tests need it, so I added it explicitly. I also 
added a line comment to make it clearer.






Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


alessandrobenedetti commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943637205


##
solr/modules/llm/src/test/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorTest.java:
##
@@ -0,0 +1,117 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.client.solrj.SolrQuery;
+import org.apache.solr.llm.TestLlmBase;
+import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+
+public class TextToVectorUpdateProcessorTest extends TestLlmBase {
+
+@BeforeClass
+public static void init() throws Exception {
+setupTest("solrconfig-llm-indexing.xml", "schema.xml", false, false);
+
+}
+
+@Test
+public void processAdd_inputField_shouldVectoriseInputField()
+throws Exception {
+loadModel("dummy-model.json");
+assertU(adoc("id", "99", "_text_", "Vegeta is the saiyan prince."));
+assertU(adoc("id", "98", "_text_", "Vegeta is the saiyan prince."));
+assertU(commit());
+
+final String solrQuery = "*:*";
+final SolrQuery query = new SolrQuery();
+query.setQuery(solrQuery);
+query.add("fl", "id,vector");
+
+assertJQ(
+"/query" + query.toQueryString(),
+"/response/numFound==2]",
+"/response/docs/[0]/id=='99'",
+"/response/docs/[0]/vector==[1.0, 2.0, 3.0, 4.0]",
+"/response/docs/[1]/id=='98'",
+"/response/docs/[1]/vector==[1.0, 2.0, 3.0, 4.0]");
+
+restTestHarness.delete(ManagedTextToVectorModelStore.REST_END_POINT + "/dummy-1");
+}
+
+/*
+This test looks for the 'dummy-1' model, but such model is not loaded, the model store is empty, so the update fails
+ */
+@Test
+public void processAdd_modelNotFound_shouldRaiseException() {
+assertFailedU("This update should fail but actually succeeded", adoc("id", "99", "_text_", "Vegeta is the saiyan prince."));
+
+checkUpdateU(adoc("id", "99", "_text_", "Vegeta is the saiyan prince."),
+"/response/lst[@name='error']/str[@name='msg']=\"The model requested 'dummy-1' can't be found in the store: /schema/text-to-vector-model-store\"",
+"/response/lst[@name='error']/int[@name='code']='400'");
+}
+
+@Test
+public void processAdd_emptyInputField_shouldLogAndIndexWithNoVector() throws Exception {
+loadModel("dummy-model.json");
+assertU(adoc("id", "99", "_text_", ""));
+assertU(adoc("id", "98", "_text_", "Vegeta is the saiyan prince."));
+assertU(commit());
+
+final String solrQuery = "*:*";
+final SolrQuery query = new SolrQuery();
+query.setQuery(solrQuery);
+query.add("fl", "id,vector");
+
+assertJQ(
+"/query" + query.toQueryString(),
+"/response/numFound==2]",
+"/response/docs/[0]/id=='99'",
+"!/response/docs/[0]/vector==", //no vector field for the document 99

Review Comment:
   It took almost an afternoon to find that, so it deserved a comment :)






[jira] [Created] (SOLR-17655) Deprecate ExternalFileField in lieu of in-place docValue updates

2025-02-05 Thread David Smiley (Jira)
David Smiley created SOLR-17655:
---

 Summary: Deprecate ExternalFileField in lieu of in-place docValue updates
 Key: SOLR-17655
 URL: https://issues.apache.org/jira/browse/SOLR-17655
 Project: Solr
  Issue Type: Task
Reporter: David Smiley


ExternalFileField is an old capability of Solr that pre-dated in-place partial 
updates of numeric DocValue fields.  There are some issues with it, and it has 
code to be maintained.  It's time to deprecate it in 9.9 so it can be removed 
in 10.






[jira] [Updated] (SOLR-17655) Deprecate ExternalFileField in lieu of in-place docValue updates

2025-02-05 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-17655:

Description: ExternalFileField is an old capability of Solr that pre-dated 
in-place partial updates of numeric DocValue fields.  There are [some issues 
with 
it|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SOLR%20AND%20text%20~%20%22FileFloatSource%22%20AND%20resolution%20is%20null%20ORDER%20BY%20created%20DESC],
 and it has code to be maintained.  It's time to deprecate it in 9.9 so it can 
be removed in 10.  (was: ExternalFileField is an old capability of Solr that 
pre-dated in-place partial updates of numeric DocValue fields.  There are some 
issues with it, and it has code to be maintained.  It's time to deprecate it in 
9.9 so it can be removed in 10.)

> Deprecate ExternalFileField in lieu of in-place docValue updates
> 
>
> Key: SOLR-17655
> URL: https://issues.apache.org/jira/browse/SOLR-17655
> Project: Solr
>  Issue Type: Task
>Reporter: David Smiley
>Priority: Major
>
> ExternalFileField is an old capability of Solr that pre-dated in-place 
> partial updates of numeric DocValue fields.  There are [some issues with 
> it|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SOLR%20AND%20text%20~%20%22FileFloatSource%22%20AND%20resolution%20is%20null%20ORDER%20BY%20created%20DESC],
>  and it has code to be maintained.  It's time to deprecate it in 9.9 so it 
> can be removed in 10.






[jira] [Commented] (SOLR-17655) Deprecate ExternalFileField in lieu of in-place docValue updates

2025-02-05 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17924257#comment-17924257
 ] 

David Smiley commented on SOLR-17655:
-

Some users [have 
issues|https://lists.apache.org/thread/xwr8j0ydmxjrq7q74020728jg1bqv2k3] with 
ExternalFileField.

> Deprecate ExternalFileField in lieu of in-place docValue updates
> 
>
> Key: SOLR-17655
> URL: https://issues.apache.org/jira/browse/SOLR-17655
> Project: Solr
>  Issue Type: Task
>Reporter: David Smiley
>Priority: Major
>
> ExternalFileField is an old capability of Solr that pre-dated in-place 
> partial updates of numeric DocValue fields.  There are [some issues with 
> it|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SOLR%20AND%20text%20~%20%22FileFloatSource%22%20AND%20resolution%20is%20null%20ORDER%20BY%20created%20DESC],
>  and it has code to be maintained.  It's time to deprecate it in 9.9 so it 
> can be removed in 10.






[jira] [Updated] (SOLR-17655) Deprecate ExternalFileField in lieu of in-place docValue updates

2025-02-05 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-17655:

Priority: Blocker  (was: Major)

> Deprecate ExternalFileField in lieu of in-place docValue updates
> 
>
> Key: SOLR-17655
> URL: https://issues.apache.org/jira/browse/SOLR-17655
> Project: Solr
>  Issue Type: Task
>Reporter: David Smiley
>Priority: Blocker
> Fix For: 9.9
>
>
> ExternalFileField is an old capability of Solr that pre-dated in-place 
> partial updates of numeric DocValue fields.  There are [some issues with 
> it|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SOLR%20AND%20text%20~%20%22FileFloatSource%22%20AND%20resolution%20is%20null%20ORDER%20BY%20created%20DESC],
>  and it has code to be maintained.  It's time to deprecate it in 9.9 so it 
> can be removed in 10.






[jira] [Commented] (SOLR-17655) Deprecate ExternalFileField in lieu of in-place docValue updates

2025-02-05 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17924259#comment-17924259
 ] 

David Smiley commented on SOLR-17655:
-

BTW there are a variety of pieces of code here that can be removed.  I'm 
thinking VersionedFile and FileFloatSource, in addition to obviously 
ExternalFileField.

> Deprecate ExternalFileField in lieu of in-place docValue updates
> 
>
> Key: SOLR-17655
> URL: https://issues.apache.org/jira/browse/SOLR-17655
> Project: Solr
>  Issue Type: Task
>Reporter: David Smiley
>Priority: Major
>
> ExternalFileField is an old capability of Solr that pre-dated in-place 
> partial updates of numeric DocValue fields.  There are [some issues with 
> it|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SOLR%20AND%20text%20~%20%22FileFloatSource%22%20AND%20resolution%20is%20null%20ORDER%20BY%20created%20DESC],
>  and it has code to be maintained.  It's time to deprecate it in 9.9 so it 
> can be removed in 10.






[jira] [Updated] (SOLR-17655) Deprecate ExternalFileField in lieu of in-place docValue updates

2025-02-05 Thread David Smiley (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-17655:

Fix Version/s: 9.9

> Deprecate ExternalFileField in lieu of in-place docValue updates
> 
>
> Key: SOLR-17655
> URL: https://issues.apache.org/jira/browse/SOLR-17655
> Project: Solr
>  Issue Type: Task
>Reporter: David Smiley
>Priority: Major
> Fix For: 9.9
>
>
> ExternalFileField is an old capability of Solr that pre-dated in-place 
> partial updates of numeric DocValue fields.  There are [some issues with 
> it|https://issues.apache.org/jira/issues/?jql=project%20%3D%20SOLR%20AND%20text%20~%20%22FileFloatSource%22%20AND%20resolution%20is%20null%20ORDER%20BY%20created%20DESC],
>  and it has code to be maintained.  It's time to deprecate it in 9.9 so it 
> can be removed in 10.






Re: [PR] SOLR-16903: Migrate off java.io.File to java.nio.file.Path from core files [solr]

2025-02-05 Thread via GitHub


mlbiscoc commented on code in PR #2924:
URL: https://github.com/apache/solr/pull/2924#discussion_r1943652914


##
solr/core/src/java/org/apache/solr/filestore/DistribFileStore.java:
##
@@ -80,19 +80,21 @@ public DistribFileStore(CoreContainer coreContainer) {
   }
 
   @Override
-  public Path getRealpath(String path) {
+  public Path getRealPath(String path) {
 return _getRealPath(path, solrHome);
   }
 
-  private static Path _getRealPath(String path, Path solrHome) {
-if (File.separatorChar == '\\') {
-  path = path.replace('/', File.separatorChar);
-}
-SolrPaths.assertNotUnc(Path.of(path));
-while (path.startsWith(File.separator)) { // Trim all leading slashes
-  path = path.substring(1);
+  private static Path _getRealPath(String dir, Path solrHome) {
+Path path = Path.of(dir);
+SolrPaths.assertNotUnc(path);
+
+if (path.isAbsolute()) {
+  // Strip the path of from being absolute to become relative to resolve with SolrHome
+  path = path.subpath(0, path.getNameCount());
 }

Review Comment:
   Thanks for catching that, Houston. I can open a PR to revert it as the 
safest option, but I don't have a Windows machine on hand to test this easily.






Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


alessandrobenedetti commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943595213


##
solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorFactory.java:
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.params.SolrParams;
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.schema.DenseVectorField;
+import org.apache.solr.schema.FieldType;
+import org.apache.solr.schema.SchemaField;
+import org.apache.solr.update.processor.UpdateRequestProcessor;
+import org.apache.solr.update.processor.UpdateRequestProcessorFactory;
+
+/**
+ * This class implements an UpdateProcessorFactory for the Text To Vector Update Processor.
+ */
+public class TextToVectorUpdateProcessorFactory extends UpdateRequestProcessorFactory {
+private static final String INPUT_FIELD_PARAM = "inputField";
+private static final String OUTPUT_FIELD_PARAM = "outputField";
+private static final String MODEL_NAME = "model";
+
+String inputField;
+String outputField;
+String modelName;
+SolrParams params;
+
+
+@Override
+public void init(final NamedList args) {
+if (args != null) {
+params = args.toSolrParams();
+inputField = params.get(INPUT_FIELD_PARAM);
+checkNotNull(INPUT_FIELD_PARAM, inputField);
+
+outputField = params.get(OUTPUT_FIELD_PARAM);
+checkNotNull(OUTPUT_FIELD_PARAM, outputField);
+
+modelName = params.get(MODEL_NAME);
+checkNotNull(MODEL_NAME, modelName);

Review Comment:
   much cleaner, thanks David!






Re: [PR] SOLR-16903: Migrate off java.io.File to java.nio.file.Path from core files [solr]

2025-02-05 Thread via GitHub


HoustonPutman commented on code in PR #2924:
URL: https://github.com/apache/solr/pull/2924#discussion_r1943584596


##
solr/core/src/java/org/apache/solr/filestore/DistribFileStore.java:
##
@@ -80,19 +80,21 @@ public DistribFileStore(CoreContainer coreContainer) {
   }
 
   @Override
-  public Path getRealpath(String path) {
+  public Path getRealPath(String path) {
 return _getRealPath(path, solrHome);
   }
 
-  private static Path _getRealPath(String path, Path solrHome) {
-if (File.separatorChar == '\\') {
-  path = path.replace('/', File.separatorChar);
-}
-SolrPaths.assertNotUnc(Path.of(path));
-while (path.startsWith(File.separator)) { // Trim all leading slashes
-  path = path.substring(1);
+  private static Path _getRealPath(String dir, Path solrHome) {
+Path path = Path.of(dir);
+SolrPaths.assertNotUnc(path);
+
+if (path.isAbsolute()) {
+  // Strip the path of from being absolute to become relative to resolve with SolrHome
+  path = path.subpath(0, path.getNameCount());
 }

Review Comment:
   This is broken on Windows. A number of tests fail because it's not stripping 
leading "\" characters. Not sure if we should revert to the previous logic, or 
do it another way.
   
   I made https://issues.apache.org/jira/browse/SOLR-17654, but it seems this 
is a new thing, so we can probably close that.
   
   @mlbiscoc 
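
One possible alternative to reverting, offered only as a sketch: on Windows a path such as `\foo\bar` has a root but no drive letter, so `Path.isAbsolute()` returns false for it and the leading separator is never stripped. Checking `getRoot()` instead covers both the absolute and the rooted-but-not-absolute case. The names `RealPathSketch`, `toRelative`, and `resolveUnder` are hypothetical and not part of the PR, and this has not been run against Solr's Windows tests.

```java
import java.nio.file.Path;

final class RealPathSketch {
  private RealPathSketch() {}

  /** Drops any root component so the remainder can be resolved under a base directory. */
  static Path toRelative(Path path) {
    // getRoot() is non-null for every rooted path, including "\foo" on Windows,
    // which has a root but is not absolute because it lacks a drive letter.
    if (path.getRoot() == null) {
      return path;
    }
    int names = path.getNameCount();
    // A bare root ("/") has no name elements; subpath(0, 0) would throw.
    return names == 0 ? Path.of("") : path.subpath(0, names);
  }

  /** Hypothetical helper: resolves {@code dir} under {@code base} after stripping its root. */
  static Path resolveUnder(Path base, String dir) {
    return base.resolve(toRelative(Path.of(dir)));
  }
}
```

With this shape, both `resolveUnder(solrHome, "/a/b")` and `resolveUnder(solrHome, "a/b")` land under `solrHome`; whether that matches the behavior the pre-PR string-based logic had in every edge case would still need checking.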
   
   






[jira] [Resolved] (SOLR-17497) Pull replicas throws AlreadyClosedException

2025-02-05 Thread Jason Gerlowski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski resolved SOLR-17497.

Fix Version/s: main (10.0)
   9.8
 Assignee: Sanjay Dutt
   Resolution: Fixed

> Pull replicas throws AlreadyClosedException  
> -
>
> Key: SOLR-17497
> URL: https://issues.apache.org/jira/browse/SOLR-17497
> Project: Solr
>  Issue Type: Task
>Reporter: Sanjay Dutt
>Assignee: Sanjay Dutt
>Priority: Major
> Fix For: main (10.0), 9.8
>
> Attachments: Screenshot 2024-10-23 at 6.01.02 PM.png
>
>
> Recently, a common exception (org.apache.lucene.store.AlreadyClosedException: 
> this Directory is closed) has been seen in multiple failed test cases. 
> FAILED:  org.apache.solr.cloud.TestPullReplica.testKillPullReplica
> FAILED:  
> org.apache.solr.cloud.SplitShardWithNodeRoleTest.testSolrClusterWithNodeRoleWithPull
> FAILED:  org.apache.solr.cloud.TestPullReplica.testAddDocs
>  
>  
> {code:java}
> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an 
> uncaught exception in thread: Thread[id=10271, 
> name=fsyncService-6341-thread-1, state=RUNNABLE, 
> group=TGRP-SplitShardWithNodeRoleTest]
>         at 
> __randomizedtesting.SeedInfo.seed([3F7DACB3BC44C3C4:E5DB3E97188A8EB9]:0)
> Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is 
> closed
>         at __randomizedtesting.SeedInfo.seed([3F7DACB3BC44C3C4]:0)
>         at 
> app//org.apache.lucene.store.BaseDirectory.ensureOpen(BaseDirectory.java:50)
>         at 
> app//org.apache.lucene.store.ByteBuffersDirectory.sync(ByteBuffersDirectory.java:237)
>         at 
> app//org.apache.lucene.tests.store.MockDirectoryWrapper.sync(MockDirectoryWrapper.java:214)
>         at 
> app//org.apache.solr.handler.IndexFetcher$DirectoryFile.sync(IndexFetcher.java:2034)
>         at 
> app//org.apache.solr.handler.IndexFetcher$FileFetcher.lambda$fetch$0(IndexFetcher.java:1803)
>         at 
> app//org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$1(ExecutorUtil.java:449)
>         at 
> java.base@11.0.24/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at 
> java.base@11.0.24/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base@11.0.24/java.lang.Thread.run(Thread.java:829)
>  {code}
>  
> An interesting thing about these test cases is that they all share the same 
> kind of setup: each has one shard and two replicas – one NRT and one PULL.
>  
> Going through the execution steps of one of the test cases:
> FAILED:  org.apache.solr.cloud.TestPullReplica.testKillPullReplica
>  
> Test flow
> 1. Create a collection with 1 NRT and 1 PULL replica
> 2. waitForState
> 3. waitForNumDocsInAllActiveReplicas(0); // *Name says it all*
> 4. Index another document.
> 5. waitForNumDocsInAllActiveReplicas(1);
> 6. Stop Pull replica
> 7. Index another document
> 8. waitForNumDocsInAllActiveReplicas(2);
> 9. Start Pull Replica
> 10. waitForState
> 11. waitForNumDocsInAllActiveReplicas(2);
>  
> As per the logs the whole sequence executed successfully. Here is the link to 
> the logs: 
> [https://ge.apache.org/s/yxydiox3gvlf2/tests/task/:solr:core:test/details/org.apache.solr.cloud.TestPullReplica/testKillPullReplica/1/output]
>  (link may stop working in the future)
>  
> The last step, which makes sure that all active replicas have two documents 
> each, logged an INFO message, which is further proof that it completed 
> successfully. 
>  
> {code:java}
> 616575 INFO (TEST-TestPullReplica.testKillPullReplica-seed#[F30CC837FDD0DC28]) [n: c: s: r: x: t:] o.a.s.c.TestPullReplica Replica core_node3 (https://127.0.0.1:35647/solr/pull_replica_test_kill_pull_replica_shard1_replica_n1/) has all 2 docs
> 616606 INFO (qtp1091538342-13057-null-11348) [n:127.0.0.1:38207_solr c:pull_replica_test_kill_pull_replica s:shard1 r:core_node4 x:pull_replica_test_kill_pull_replica_shard1_replica_p2 t:null-11348] o.a.s.c.S.Request webapp=/solr path=/select params={q=*:*&wt=javabin&version=2} rid=null-11348 hits=2 status=0 QTime=0
> 616607 INFO (TEST-TestPullReplica.testKillPullReplica-seed#[F30CC837FDD0DC28]) [n: c: s: r: x: t:] o.a.s.c.TestPullReplica Replica core_node4 (https://127.0.0.1:38207/solr/pull_replica_test_kill_pull_replica_shard1_replica_p2/) has all 2 docs
> {code}
>  
> *Where is the issue then?*
> In the logs it was observed that, after the PULL replica was restarted, the 
> recovery process started and, after fetching all the file info from the NRT 
> replica, the replication aborted and logged "User aborted replication".
>  
> {code:java}
> o.a.s.h.IndexFetcher User aborted Replication => 
> org.apache.solr.handler.IndexFetcher$Replica

Re: [PR] SOLR-17641: Disable the Security Manager for Java 24+ [solr]

2025-02-05 Thread via GitHub


HoustonPutman commented on PR #3153:
URL: https://github.com/apache/solr/pull/3153#issuecomment-2637917612

   > Agree with @epugh - isn't Security Manager all no-ops even with 21 (which 
IIRC is the minimum Java for Solr 10)
   
   I've never heard that. Do you have a link for that?





Re: [PR] SOLR-17641: Disable the Security Manager for Java 24+ [solr]

2025-02-05 Thread via GitHub


uschindler commented on PR #3153:
URL: https://github.com/apache/solr/pull/3153#issuecomment-2637926350

   > > Agree with @epugh - isn't Security Manager all no-ops even with 21 
(which IIRC is the minimum Java for Solr 10)
   > 
   > I've never heard that. Do you have a link for that?
   
   No, it isn't. It's fully working in 21. It becomes a no-op in 24.
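
For illustration only, a minimal sketch of how code could branch on this, assuming the behavior stated above together with JEP 486 (which permanently disables the Security Manager in JDK 24). `SecurityManagerSupport` and its members are hypothetical names and are not taken from the PR.

```java
final class SecurityManagerSupport {
  private SecurityManagerSupport() {}

  /** First Java feature release in which the Security Manager can no longer be enabled (JEP 486). */
  static final int FIRST_RELEASE_WITHOUT_SECURITY_MANAGER = 24;

  /** Returns true if the given Java feature release still supports enabling the Security Manager. */
  static boolean canEnableSecurityManager(int javaFeatureVersion) {
    return javaFeatureVersion < FIRST_RELEASE_WITHOUT_SECURITY_MANAGER;
  }

  /** Same check for the JVM this code is running on; Runtime.version().feature() needs Java 10+. */
  static boolean canEnableSecurityManager() {
    return canEnableSecurityManager(Runtime.version().feature());
  }
}
```

Taking the version as a parameter keeps the decision testable without having to start JVMs of different releases.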





Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


alessandrobenedetti commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943608528


##
solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorFactory.java:
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.params.SolrParams;
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.schema.DenseVectorField;
+import org.apache.solr.schema.FieldType;
+import org.apache.solr.schema.SchemaField;
+import org.apache.solr.update.processor.UpdateRequestProcessor;
+import org.apache.solr.update.processor.UpdateRequestProcessorFactory;
+
+/**
+ * This class implements an UpdateProcessorFactory for the Text To Vector Update Processor.
+ */
+public class TextToVectorUpdateProcessorFactory extends UpdateRequestProcessorFactory {
+private static final String INPUT_FIELD_PARAM = "inputField";
+private static final String OUTPUT_FIELD_PARAM = "outputField";
+private static final String MODEL_NAME = "model";
+
+String inputField;
+String outputField;
+String modelName;
+SolrParams params;
+
+
+@Override
+public void init(final NamedList args) {
+if (args != null) {
+params = args.toSolrParams();
+inputField = params.get(INPUT_FIELD_PARAM);
+checkNotNull(INPUT_FIELD_PARAM, inputField);
+
+outputField = params.get(OUTPUT_FIELD_PARAM);
+checkNotNull(OUTPUT_FIELD_PARAM, outputField);
+
+modelName = params.get(MODEL_NAME);
+checkNotNull(MODEL_NAME, modelName);
+}
+}
+
+private void checkNotNull(String paramName, Object param) {
+if (param == null) {
+throw new SolrException(
+SolrException.ErrorCode.SERVER_ERROR,
+"Text to Vector UpdateProcessor '" + paramName + "' can not be null");
+}
+}
+
+@Override
+public UpdateRequestProcessor getInstance(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next) {
+req.getCore().getLatestSchema().getField(inputField);

Review Comment:
   It checks that 'inputField' is defined in the schema.
   With the latest commit I changed it to make that more explicit, but I am 
open to suggestions.






Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


alessandrobenedetti commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943614530


##
solr/modules/llm/src/test-files/solr/collection1/conf/solrconfig-llm-indexing-notDenseVectorField.xml:
##


Review Comment:
   I could, but the reason I added it is that I struggled to find testing 
methods such as org.apache.solr.util.RestTestBase#assertU(java.lang.String) 
that take the chain as a parameter.
   So I added it as the default so that I could test it.
   
   I would not want it to be the default when indexing docs for the query-time 
test.
   If you have any suggestions, I'm open to changes.






Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


alessandrobenedetti commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943616272


##
solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java:
##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.llm.texttovector.update.processor;

Review Comment:
   I agree, 'texttovector' is horribly unreadable; maybe 'textvectorisation'? 
I'll add it in the coming commit.






Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


dsmiley commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943529534


##
solr/modules/llm/src/test/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorFactoryTest.java:
##
@@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.params.MultiMapSolrParams;
+import org.apache.solr.common.params.SolrParams;
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.llm.TestLlmBase;
+import org.apache.solr.request.SolrQueryRequestBase;
+import org.junit.AfterClass;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.util.HashMap;
+import java.util.Map;
+
+
+public class TextToVectorUpdateProcessorFactoryTest extends TestLlmBase {
+  private TextToVectorUpdateProcessorFactory factoryToTest =
+  new TextToVectorUpdateProcessorFactory();
+  private NamedList args = new NamedList<>();
+  
+  @BeforeClass
+  public static void initArgs() throws Exception {
+setupTest("solrconfig-llm.xml", "schema.xml", false, false);
+  }
+
+  @AfterClass
+  public static void after() throws Exception {
+afterTest();
+  }
+
+  @Test
+  public void init_fullArgs_shouldInitFullClassificationParams() {
+args.add("inputField", "_text_");
+args.add("outputField", "vector");
+args.add("model", "model1");
+factoryToTest.init(args);
+
+assertEquals("_text_", factoryToTest.getInputField());
+assertEquals("vector", factoryToTest.getOutputField());
+assertEquals("model1", factoryToTest.getModelName());
+  }
+
+  @Test
+  public void init_nullInputField_shouldThrowExceptionWithDetailedMessage() {
+args.add("outputField", "vector");
+args.add("model", "model1");
+
+SolrException e = assertThrows(SolrException.class, () -> factoryToTest.init(args));
+assertEquals("Text to Vector UpdateProcessor 'inputField' can not be null", e.getMessage());
+  }
+
+  @Test
+  public void init_notExistentInputField_shouldThrowExceptionWithDetailedMessage() throws Exception {
+args.add("inputField", "notExistentInput");
+args.add("outputField", "vector");
+args.add("model", "model1");
+
+Map params = new HashMap<>();
+MultiMapSolrParams mmparams = new MultiMapSolrParams(params);
+SolrQueryRequestBase req = new SolrQueryRequestBase(solrClientTestRule.getCoreContainer().getCore("collection1"), (SolrParams) mmparams) {};

Review Comment:
   It's an anonymous inner class.  What's probably throwing you off is that 
there are no method overrides, which 99% of the time is the point of using an 
anonymous inner class.  Here it's because SolrQueryRequestBase (SQRB) is 
abstract, so he's forced to subclass it in order to use it.  I've been thinking 
about this case recently and I think we should simply make that impl 
non-abstract.
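
A minimal, self-contained sketch of the pattern described here. `RequestBase` is a hypothetical stand-in for an abstract class that declares no abstract methods; it is not Solr's `SolrQueryRequestBase`, and `AnonymousSubclassDemo` is likewise an illustrative name.

```java
abstract class RequestBase {
  private final String core;

  protected RequestBase(String core) {
    this.core = core;
  }

  String getCore() {
    return core;
  }
}

final class AnonymousSubclassDemo {
  private AnonymousSubclassDemo() {}

  static RequestBase newRequest(String core) {
    // `new RequestBase(core)` alone does not compile because the class is abstract.
    // The trailing `{}` declares an anonymous subclass with an empty body,
    // which is concrete because there are no abstract methods left to implement.
    return new RequestBase(core) {};
  }
}
```

If the base class were made non-abstract, as suggested, the `{}` could simply be dropped at every call site.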






Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


dsmiley commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943539435


##
solr/test-framework/src/java/org/apache/solr/util/RestTestBase.java:
##
@@ -88,13 +88,33 @@ private static void checkUpdateU(String message, String update, boolean shouldSu
 if (response != null) fail(m + "update was not successful: " + response);
   } else {
 String response = restTestHarness.validateErrorUpdate(update);
-if (response != null) fail(m + "update succeeded, but should have failed: " + response);
+if (response == null) fail(m + "update succeeded, but should have failed: " + response);
   }
 } catch (SAXException e) {
   throw new RuntimeException("Invalid XML", e);
 }
   }
 
+  public static void checkUpdateU(String update, String... tests) {

Review Comment:
   At the moment, RestTestBase is common to basically any test using a 
"REST-based model store", which the LLM stuff recently added a new variant of, 
and hence RestTestBase is used.  RestTestBase is used a lot.  Preferably we 
wouldn't depend too much on our class hierarchy to accomplish reusable things. 
 But there's no realistic action to take right now.






Re: [PR] SOLR-16391: Convert create-core, core-status, /luke to JAX-RS [solr]

2025-02-05 Thread via GitHub


gerlowskija commented on code in PR #3054:
URL: https://github.com/apache/solr/pull/3054#discussion_r1943369180


##
solr/core/src/java/org/apache/solr/handler/admin/CoreAdminOperation.java:
##
@@ -92,32 +82,12 @@ public enum CoreAdminOperation implements CoreAdminOp {
   CREATE_OP(
   CREATE,
   it -> {
-assert TestInjection.injectRandomDelayInCoreCreation();

Review Comment:
   Eric found this later on his own, but for anyone else: this line was moved, 
not deleted.  See 
[here](https://github.com/apache/solr/pull/3054#discussion_r1925326281)






Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


alessandrobenedetti commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943488899


##
solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessorFactory.java:
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.params.SolrParams;
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.schema.DenseVectorField;
+import org.apache.solr.schema.FieldType;
+import org.apache.solr.schema.SchemaField;
+import org.apache.solr.update.processor.UpdateRequestProcessor;
+import org.apache.solr.update.processor.UpdateRequestProcessorFactory;
+
+/**
+ * This class implements an UpdateProcessorFactory for the Text To Vector Update Processor.
+ */
+public class TextToVectorUpdateProcessorFactory extends UpdateRequestProcessorFactory {
+private static final String INPUT_FIELD_PARAM = "inputField";
+private static final String OUTPUT_FIELD_PARAM = "outputField";
+private static final String MODEL_NAME = "model";
+
+String inputField;
+String outputField;
+String modelName;
+SolrParams params;

Review Comment:
   Done!






Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


alessandrobenedetti commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943553277


##
solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java:
##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.SolrInputField;
+import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel;
+import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.update.AddUpdateCommand;
+import org.apache.solr.update.processor.UpdateRequestProcessor;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.ArrayList;
+import java.util.List;
+
+
+class TextToVectorUpdateProcessor extends UpdateRequestProcessor {
+private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+private final String inputField;
+private final String outputField;
+private final String model;
+private SolrTextToVectorModel textToVector;
+private ManagedTextToVectorModelStore modelStore = null;
+
+public TextToVectorUpdateProcessor(
+String inputField,
+String outputField,
+String model,
+SolrQueryRequest req,
+UpdateRequestProcessor next) {
+super(next);
+this.inputField = inputField;
+this.outputField = outputField;
+this.model = model;
+this.modelStore = ManagedTextToVectorModelStore.getManagedModelStore(req.getCore());
+}
+
+/**
+ * @param cmd the update command in input containing the Document to process
+ * @throws IOException If there is a low-level I/O error
+ */
+@Override
+public void processAdd(AddUpdateCommand cmd) throws IOException {
+this.textToVector = modelStore.getModel(model);

Review Comment:
   I was debugging the flow to get a better understanding of the lifecycle of 
an update request processor.
   
   From what I see in the test, the factory instantiates a new update request 
processor every time a new update request is received.
   I think it's OK to keep it as a class member, but let me see if I can move the 
instantiation to the factory.
   Ideally I wanted that to happen when the factory is initialized, but it seems 
that the update request processor factory is not compatible with resource 
loading (as far as I debugged and checked).
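   
   For anyone following along, moving the model lookup into the factory could 
look roughly like the sketch below. It is only a sketch under assumptions: 
`VectorModel`, `ModelStore`, `SketchProcessor` and `SketchProcessorFactory` are 
hypothetical stand-ins written in plain Java, not the Solr or PR classes, and it 
assumes `getInstance` runs once per update request, as observed in the test.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a text-to-vector model; not a Solr class.
interface VectorModel {
    float[] vectorise(String text);
}

// Hypothetical stand-in for a model store; counts lookups for illustration.
class ModelStore {
    private final Map<String, VectorModel> models = new HashMap<>();
    final AtomicInteger lookups = new AtomicInteger();

    void put(String name, VectorModel model) {
        models.put(name, model);
    }

    VectorModel getModel(String name) {
        lookups.incrementAndGet();
        return models.get(name);
    }
}

// The processor receives an already-resolved model, so the field can be final.
class SketchProcessor {
    private final VectorModel model;

    SketchProcessor(VectorModel model) {
        this.model = model;
    }

    float[] processAdd(String text) {
        return model.vectorise(text);
    }
}

// Assumption: getInstance is called once per update request.
class SketchProcessorFactory {
    private final ModelStore store;
    private final String modelName;

    SketchProcessorFactory(ModelStore store, String modelName) {
        this.store = store;
        this.modelName = modelName;
    }

    SketchProcessor getInstance() {
        VectorModel model = store.getModel(modelName);
        if (model == null) {
            throw new IllegalStateException(
                "The model requested '" + modelName + "' can't be found in the store");
        }
        return new SketchProcessor(model);
    }
}

class LifecycleSketch {
    public static void main(String[] args) {
        ModelStore store = new ModelStore();
        store.put("dummy", text -> new float[] {text.length()});
        SketchProcessorFactory factory = new SketchProcessorFactory(store, "dummy");

        // One request, two documents: the store is consulted once, not per document.
        SketchProcessor processor = factory.getInstance();
        processor.processAdd("first doc");
        processor.processAdd("second doc");
        System.out.println("model lookups: " + store.lookups.get());
    }
}
```

   With this shape the store is consulted once per request rather than once per 
document; whether the real factory can reach the model store at that point is 
exactly the open question above.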
   






Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


alessandrobenedetti commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943565203


##
solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java:
##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.SolrInputField;
+import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel;
+import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.update.AddUpdateCommand;
+import org.apache.solr.update.processor.UpdateRequestProcessor;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.ArrayList;
+import java.util.List;
+
+
+class TextToVectorUpdateProcessor extends UpdateRequestProcessor {
+private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+private final String inputField;
+private final String outputField;
+private final String model;
+private SolrTextToVectorModel textToVector;
+private ManagedTextToVectorModelStore modelStore = null;
+
+public TextToVectorUpdateProcessor(
+String inputField,
+String outputField,
+String model,
+SolrQueryRequest req,
+UpdateRequestProcessor next) {
+super(next);
+this.inputField = inputField;
+this.outputField = outputField;
+this.model = model;
+this.modelStore = ManagedTextToVectorModelStore.getManagedModelStore(req.getCore());
+}
+
+/**
+ * @param cmd the update command in input containing the Document to process
+ * @throws IOException If there is a low-level I/O error
+ */
+@Override
+public void processAdd(AddUpdateCommand cmd) throws IOException {
+this.textToVector = modelStore.getModel(model);
+if (textToVector == null) {
+throw new SolrException(
+SolrException.ErrorCode.BAD_REQUEST,
+"The model requested '"
++ model
++ "' can't be found in the store: "
++ ManagedTextToVectorModelStore.REST_END_POINT);
+}
+
+SolrInputDocument doc = cmd.getSolrInputDocument();
+SolrInputField inputFieldContent = doc.get(inputField);
+if (!isNullOrEmpty(inputFieldContent, doc, inputField)) {
+String textToVectorise = inputFieldContent.getValue().toString();//add null checks and
+float[] vector = textToVector.vectorise(textToVectorise);

Review Comment:
   1) @cpoerschke : I double checked and the langchain4j library 'embed' method 
(that's used in our 'vectorise' method) doesn't throw any exception, but I 
agree we should investigate what happens if that request fails (my best guess is 
we get an empty vector or null; I'll add that to tests).
   
   2) @epugh : given that 'update.chain' is a parameter, if you configure a 
chain with no vector enrichment and a chain with vector enrichment, what 
prevents you from first indexing with the 'no vectors' chain and then slowly 
updating the index with atomic updates that add vectors (using the 
vector-chain)? We should double check and add this to the documentation once we 
consolidate the code. What do you think?






Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


alessandrobenedetti commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943565203


##
solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java:
##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.SolrInputField;
+import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel;
+import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.update.AddUpdateCommand;
+import org.apache.solr.update.processor.UpdateRequestProcessor;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.ArrayList;
+import java.util.List;
+
+
+class TextToVectorUpdateProcessor extends UpdateRequestProcessor {
+private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+private final String inputField;
+private final String outputField;
+private final String model;
+private SolrTextToVectorModel textToVector;
+private ManagedTextToVectorModelStore modelStore = null;
+
+public TextToVectorUpdateProcessor(
+String inputField,
+String outputField,
+String model,
+SolrQueryRequest req,
+UpdateRequestProcessor next) {
+super(next);
+this.inputField = inputField;
+this.outputField = outputField;
+this.model = model;
+this.modelStore = ManagedTextToVectorModelStore.getManagedModelStore(req.getCore());
+}
+
+/**
+ * @param cmd the update command in input containing the Document to process
+ * @throws IOException If there is a low-level I/O error
+ */
+@Override
+public void processAdd(AddUpdateCommand cmd) throws IOException {
+this.textToVector = modelStore.getModel(model);
+if (textToVector == null) {
+throw new SolrException(
+SolrException.ErrorCode.BAD_REQUEST,
+"The model requested '"
++ model
++ "' can't be found in the store: "
++ ManagedTextToVectorModelStore.REST_END_POINT);
+}
+
+SolrInputDocument doc = cmd.getSolrInputDocument();
+SolrInputField inputFieldContent = doc.get(inputField);
+if (!isNullOrEmpty(inputFieldContent, doc, inputField)) {
+String textToVectorise = inputFieldContent.getValue().toString();//add null checks and
+float[] vector = textToVector.vectorise(textToVectorise);

Review Comment:
   1) @cpoerschke : I double checked and the langchain4j library 'embed' method 
(that's used in our 'vectorise' method) doesn't throw any exception, but I 
agree we should investigate what happens if that request fails (my best guess is 
we get an empty vector or null; I'll add that to tests).
   
   2) @epugh : given that 'update.chain' is a parameter, if you configure a 
chain with no vector enrichment and a chain with vector enrichment, what 
prevents you from first indexing with the 'no vectors' chain and then slowly 
updating the index with atomic updates that add vectors (using the 
vector-chain)? We should double check and add this to the documentation once we 
consolidate the code. What do you think?






[jira] [Commented] (SOLR-17379) ParsingFieldUpdateProcessorsTest failures using CLDR locale provider

2025-02-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17924234#comment-17924234
 ] 

ASF subversion and git services commented on SOLR-17379:


Commit eb07d72af281ea426b8d44006636fcf81094b745 in solr's branch 
refs/heads/branch_9x from Houston Putman
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=eb07d72af28 ]

SOLR-17379: Fix date parsing in Java 23, remove Lucene TestSecurityManager 
(#3154)

* Fix system exit in test - by removing that part of the test

(cherry picked from commit b779ed0590e36f69f7d1ce17e99dc936ab46752f)

Co-authored-by: Chris Hostetter 


> ParsingFieldUpdateProcessorsTest failures using CLDR locale provider
> 
>
> Key: SOLR-17379
> URL: https://issues.apache.org/jira/browse/SOLR-17379
> Project: Solr
>  Issue Type: Test
>Reporter: Chris M. Hostetter
>Priority: Major
>  Labels: pull-request-available
> Attachments: SOLR-17379.test-1.patch, SOLR-17379.test.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Background: https://lists.apache.org/thread/o7xwz8df6j0bx7w2m3w8ptrp4r7q957n
> Test failures from {{ParsingFieldUpdateProcessorsTest.testAKSTZone}} and 
> {{ParsingFieldUpdateProcessorsTest.testParseFrenchDate}} are seemingly 
> guaranteed on JDK23, due to the removal of the {{COMPAT}} locale provider 
> option.
> On (some) earlier JDKs, these failures can be reproduced using...
> {noformat}
> ./gradlew test --tests ParsingFieldUpdateProcessorsTest  
> -Ptests.jvmargs="-Djava.locale.providers=CLDR -XX:TieredStopAtLevel=1 
> -XX:+UseParallelGC -XX:ActiveProcessorCount=1 -XX:ReservedCodeCacheSize=120m"
> {noformat}
> ...to force the use of {{CLDR}} and exclude the use of {{COMPAT}}
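
As a rough illustration of why the active locale provider matters for these tests: locale-sensitive text such as month abbreviations comes from whichever provider the JVM is using, so a hard-coded date string that parses under one provider is not guaranteed to parse under another. The sketch below is not taken from the Solr test; `LocaleProviderSketch` and its helpers are hypothetical names. It only shows the round trip through a French formatter, which should hold under any provider, and prints the text the current JVM produces so it can be compared across runs with different `-Djava.locale.providers` values.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical demo class; not part of Solr or its tests.
class LocaleProviderSketch {
    private static final DateTimeFormatter FRENCH =
        DateTimeFormatter.ofPattern("d MMM uuuu", Locale.FRENCH);

    // The month text produced here comes from the active locale provider.
    static String formatFrench(LocalDate date) {
        return FRENCH.format(date);
    }

    // Parsing succeeds only if the text matches the active provider's data.
    static LocalDate parseFrench(String text) {
        return LocalDate.parse(text, FRENCH);
    }

    public static void main(String[] args) {
        System.out.println("java.locale.providers = "
            + System.getProperty("java.locale.providers", "(JDK default)"));

        LocalDate date = LocalDate.of(2010, 8, 12);
        String text = formatFrench(date);
        System.out.println("formatted: " + text);
        System.out.println("round trip ok: " + parseFrench(text).equals(date));
    }
}
```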



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Resolved] (SOLR-17379) ParsingFieldUpdateProcessorsTest failures using CLDR locale provider

2025-02-05 Thread Houston Putman (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Houston Putman resolved SOLR-17379.
---
Fix Version/s: 9.9
 Assignee: Houston Putman
   Resolution: Fixed

> ParsingFieldUpdateProcessorsTest failures using CLDR locale provider
> 
>
> Key: SOLR-17379
> URL: https://issues.apache.org/jira/browse/SOLR-17379
> Project: Solr
>  Issue Type: Test
>Reporter: Chris M. Hostetter
>Assignee: Houston Putman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 9.9
>
> Attachments: SOLR-17379.test-1.patch, SOLR-17379.test.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Background: https://lists.apache.org/thread/o7xwz8df6j0bx7w2m3w8ptrp4r7q957n
> Test failures from {{ParsingFieldUpdateProcessorsTest.testAKSTZone}} and 
> {{ParsingFieldUpdateProcessorsTest.testParseFrenchDate}} are seemingly 
> guaranteed on JDK23, due to the removal of the {{COMPAT}} locale provider 
> option.
> On (some) earlier JDKs, these failures can be reproduced using...
> {noformat}
> ./gradlew test --tests ParsingFieldUpdateProcessorsTest  
> -Ptests.jvmargs="-Djava.locale.providers=CLDR -XX:TieredStopAtLevel=1 
> -XX:+UseParallelGC -XX:ActiveProcessorCount=1 -XX:ReservedCodeCacheSize=120m"
> {noformat}
> ...to force the use of {{CLDR}} and exclude the use of {{COMPAT}}






Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


alessandrobenedetti commented on PR #3151:
URL: https://github.com/apache/solr/pull/3151#issuecomment-2637867724

   > I wanted to follow-up on my feedback to the LLM module concerning use of 
the word "embedding". I first tried to say that I was not familiar with the 
word, and your response was to remove it (completely?) from the module. If 
"embedding" is an appropriate word then use it. The documentation should 
reference it in the ref guide, even if just an "AKA".
   
   Embedding is widely used in the field, but it's a bit ambiguous and, to be 
honest, I'm with you in not using any term that can cause confusion. Do you 
mean I added "embedding" in here somewhere? If that's the case, it's a mistake; 
point it out to me and I'll remove it!





Re: [PR] SOLR-17632: Text to Vector Update Request Processor [solr]

2025-02-05 Thread via GitHub


alessandrobenedetti commented on code in PR #3151:
URL: https://github.com/apache/solr/pull/3151#discussion_r1943565203


##
solr/modules/llm/src/java/org/apache/solr/llm/texttovector/update/processor/TextToVectorUpdateProcessor.java:
##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.llm.texttovector.update.processor;
+
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.SolrInputField;
+import org.apache.solr.llm.texttovector.model.SolrTextToVectorModel;
+import org.apache.solr.llm.texttovector.store.rest.ManagedTextToVectorModelStore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.update.AddUpdateCommand;
+import org.apache.solr.update.processor.UpdateRequestProcessor;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.ArrayList;
+import java.util.List;
+
+
+class TextToVectorUpdateProcessor extends UpdateRequestProcessor {
+private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+private final String inputField;
+private final String outputField;
+private final String model;
+private SolrTextToVectorModel textToVector;
+private ManagedTextToVectorModelStore modelStore = null;
+
+public TextToVectorUpdateProcessor(
+String inputField,
+String outputField,
+String model,
+SolrQueryRequest req,
+UpdateRequestProcessor next) {
+super(next);
+this.inputField = inputField;
+this.outputField = outputField;
+this.model = model;
+this.modelStore = ManagedTextToVectorModelStore.getManagedModelStore(req.getCore());
+}
+
+/**
+ * @param cmd the update command in input containing the Document to process
+ * @throws IOException If there is a low-level I/O error
+ */
+@Override
+public void processAdd(AddUpdateCommand cmd) throws IOException {
+this.textToVector = modelStore.getModel(model);
+if (textToVector == null) {
+throw new SolrException(
+SolrException.ErrorCode.BAD_REQUEST,
+"The model requested '"
++ model
++ "' can't be found in the store: "
++ ManagedTextToVectorModelStore.REST_END_POINT);
+}
+
+SolrInputDocument doc = cmd.getSolrInputDocument();
+SolrInputField inputFieldContent = doc.get(inputField);
+if (!isNullOrEmpty(inputFieldContent, doc, inputField)) {
+String textToVectorise = inputFieldContent.getValue().toString();//add null checks and
+float[] vector = textToVector.vectorise(textToVectorise);

Review Comment:
   1) @cpoerschke : I double checked and the langchain4j library 'embed' method 
(that's used in our 'vectorise' method) throws a RuntimeException.
   That's bad, as it could not be detected without investigating the internals of 
the code (I hate these practices).
   I'll give it some thought; any suggestion is welcome!
   
   2) @epugh : given that 'update.chain' is a parameter, if you configure a 
chain with no vector enrichment and a chain with vector enrichment, what 
prevents you from first indexing with the 'no vectors' chain and then slowly 
updating the index with atomic updates that add vectors (using the 
vector-chain)? We should double check and add this to the documentation once we 
consolidate the code. What do you think?
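   
   On point 1, one possible direction, as a minimal sketch only: catch the 
unchecked exception at the call site and turn it into something the processor 
has to handle explicitly. `SafeVectoriser` and `VectorisationException` are 
hypothetical names, and the `Function<String, float[]>` stands in for the real 
embedding call; nothing here is langchain4j or Solr API.

```java
import java.util.function.Function;

// Hypothetical checked exception, so callers must decide what a failure means.
class VectorisationException extends Exception {
    VectorisationException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical wrapper; 'embed' stands in for the remote embedding call.
class SafeVectoriser {
    private final Function<String, float[]> embed;

    SafeVectoriser(Function<String, float[]> embed) {
        this.embed = embed;
    }

    float[] vectorise(String text) throws VectorisationException {
        float[] vector;
        try {
            vector = embed.apply(text);
        } catch (RuntimeException e) {
            // Surface the hidden unchecked failure as an explicit, documented one.
            throw new VectorisationException(
                "Embedding request failed: " + e.getMessage(), e);
        }
        if (vector == null || vector.length == 0) {
            throw new VectorisationException("Embedding request returned no vector", null);
        }
        return vector;
    }
}

class SafeVectoriserDemo {
    public static void main(String[] args) {
        SafeVectoriser ok = new SafeVectoriser(text -> new float[] {1.0f, 2.0f});
        SafeVectoriser failing = new SafeVectoriser(text -> {
            throw new IllegalStateException("connection refused");
        });
        try {
            System.out.println("dimensions: " + ok.vectorise("hello").length);
            failing.vectorise("hello");
        } catch (VectorisationException e) {
            System.out.println("handled: " + e.getMessage());
        }
    }
}
```

   Whether a failed request should reject the whole document or index it without 
the vector is a separate decision; the sketch only makes the failure visible.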


