Re: [PR] SOLR-17682: QueryResponseWriter hierarchy refactor [solr]

2025-02-24 Thread via GitHub


dsmiley commented on code in PR #3209:
URL: https://github.com/apache/solr/pull/3209#discussion_r1968827196


##
solr/core/src/java/org/apache/solr/response/RawResponseWriter.java:
##
@@ -88,41 +85,27 @@ protected QueryResponseWriter 
getBaseWriter(SolrQueryRequest request) {
   @Override
   public String getContentType(SolrQueryRequest request, SolrQueryResponse 
response) {
 Object obj = response.getValues().get(CONTENT);
-if (obj != null && (obj instanceof ContentStream)) {
-  return ((ContentStream) obj).getContentType();
+if (obj instanceof ContentStream content) {
+  return content.getContentType();
 }
 return getBaseWriter(request).getContentType(request, response);
   }
 
   @Override
-  public void write(Writer writer, SolrQueryRequest request, SolrQueryResponse 
response)

Review Comment:
   We don't need a Writer/Reader variant of this (I think).
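A minimal sketch of why a byte-only path can be enough for raw content, assuming a hypothetical `RawContent` interface as a stand-in for Solr's `ContentStream` (the class and method names here are illustrative, not the PR's code). It also shows the pattern-matching `instanceof` from the hunk above, which already rejects `null`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class RawPassThroughSketch {

  /** Hypothetical stand-in for Solr's ContentStream; only what the sketch needs. */
  interface RawContent {
    InputStream open() throws IOException;
  }

  /** Copies raw content to out; returns false when value is not raw content. */
  static boolean writeRaw(Object value, OutputStream out) throws IOException {
    // instanceof is false for null, so no separate null check is needed.
    if (value instanceof RawContent content) {
      try (InputStream in = content.open()) {
        in.transferTo(out); // byte-for-byte copy, no character decoding involved
      }
      return true;
    }
    return false;
  }
}
```

Because the bytes are copied as-is, nothing in this path needs a `Writer` or a charset.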



##
solr/core/src/java/org/apache/solr/response/JacksonJsonWriter.java:
##
@@ -43,17 +47,42 @@ public JacksonJsonWriter() {
 jsonfactory = new JsonFactory();
   }
 
+  // let's also implement the binary version since Jackson supports that 
(probably faster)
   @Override
-  public void write(OutputStream out, SolrQueryRequest request, 
SolrQueryResponse response)
+  public void write(
+  OutputStream out, SolrQueryRequest request, SolrQueryResponse response, 
String contentType)
   throws IOException {
-WriterImpl sw = new WriterImpl(jsonfactory, out, request, response);
+out = new NonFlushingStream(out);
+// resolve the encoding

Review Comment:
   This charset check logic is new.  Since Jackson can specify the encoding, and 
since the contentType (possibly with the encoding) is now passed in (new), it made 
sense to choose the right one.
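One way to picture the charset resolution described above, as a sketch only: `resolveCharset` is a hypothetical helper, and the UTF-8 fallback is an assumption, not something the PR states. It uses only the JDK, so the Jackson side is left out.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class ContentTypeCharsetSketch {

  /** Returns the charset named in a content type such as "application/json; charset=UTF-8". */
  static Charset resolveCharset(String contentType) {
    if (contentType != null) {
      for (String param : contentType.split(";")) {
        String p = param.trim();
        if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
          String name = p.substring("charset=".length()).trim();
          // Strip optional quotes, as in charset="utf-8".
          if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
            name = name.substring(1, name.length() - 1);
          }
          try {
            return Charset.forName(name);
          } catch (IllegalArgumentException e) {
            // Illegal or unsupported charset name: fall through to the default.
          }
        }
      }
    }
    return StandardCharsets.UTF_8; // assumed default for this sketch
  }
}
```

The resolved charset would then have to be mapped to one of the encodings Jackson's `JsonEncoding` enum offers; how the PR handles charsets outside that set is not shown in this excerpt.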



##
solr/core/src/java/org/apache/solr/response/TextQueryResponseWriter.java:
##
@@ -81,7 +83,8 @@ private static Writer buildWriter(OutputStream outputStream, 
String charset)
*
* See SOLR-8669.
*/
-  private static class NonFlushingStream extends OutputStream {
+  // nocommit discuss moving to SolrDispatchFilter wrapper.  If keep them move?

Review Comment:
   This is an observation about how to do this better.  Basically, don't we want 
Solr to always prevent a flush, not just here and only for some 
QueryResponseWriters?  If so, it can easily go on the close shield thing 
(see elsewhere in this PR for how).  Then we can drop this NonFlushingStream 
thing and not worry about whether we should use it in other scenarios.  CC 
@magibney, wondering if you have thoughts on this one.
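For readers unfamiliar with the idea, a flush-suppressing wrapper can be sketched as below. `FlushIgnoringStream` is a hypothetical name for illustration; it is neither Solr's `NonFlushingStream` nor the close-shield class in `ServletUtils`, whose actual code is not shown in this thread.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Sketch of an OutputStream wrapper that swallows flush() calls. */
final class FlushIgnoringStream extends FilterOutputStream {

  FlushIgnoringStream(OutputStream delegate) {
    super(delegate);
  }

  @Override
  public void write(byte[] b, int off, int len) throws IOException {
    // FilterOutputStream writes byte-by-byte by default; forward the whole range instead.
    out.write(b, off, len);
  }

  @Override
  public void flush() {
    // Intentionally a no-op: whoever owns the underlying stream decides when to flush.
  }
}
```

Placing such a wrapper once at the servlet boundary, as the comment suggests, would cover every response writer rather than only those that remember to wrap their stream.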



##
solr/core/src/test/org/apache/solr/request/TestWriterPerf.java:
##
@@ -179,18 +175,9 @@ void doPerf(String writerName, SolrQueryRequest req, int 
encIter, int decIter) t
 System.gc();
 RTimer timer = new RTimer();
 for (int i = 0; i < encIter; i++) {
-  if (w instanceof BinaryQueryResponseWriter binWriter) {
-out = new ByteArrayOutputStream();
-binWriter.write(out, req, rsp);
-out.close();
-  } else {
-out = new ByteArrayOutputStream();
-// to be fair, from my previous tests, much of the performance will be 
sucked up
-// by java's UTF-8 encoding/decoding, not the actual writing
-Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);

Review Comment:
   I dropped this branch because in normal use we'll write to an OutputStream, 
not a Writer.



##
solr/core/src/java/org/apache/solr/servlet/ServletUtils.java:
##
@@ -125,6 +125,12 @@ public void close() {
 : CLOSE_STREAM_MSG;
 stream = ClosedServletOutputStream.CLOSED_SERVLET_OUTPUT_STREAM;
   }
+
+  @Override
+  public void flush() throws IOException {
+// nocommit discuss

Review Comment:
   And here's the front door of Solr; super simple to prevent flushes here too



##
solr/core/src/java/org/apache/solr/response/BinaryResponseWriter.java:
##
@@ -59,12 +60,6 @@ public void write(OutputStream out, SolrQueryRequest req, 
SolrQueryResponse resp
 }
   }
 
-  @Override
-  public void write(Writer writer, SolrQueryRequest request, SolrQueryResponse 
response)
-  throws IOException {
-throw new RuntimeException("This is a binary writer , Cannot write to a 
characterstream");

Review Comment:
   The existence of this was a code smell indicating that the hierarchy was 
wrong, which prompted my efforts here.



##
solr/core/src/java/org/apache/solr/response/QueryResponseWriter.java:
##
@@ -39,23 +41,26 @@
  * A single instance of any registered QueryResponseWriter is created via 
the default constructor
  * and is reused for all relevant queries.
  */
+@ThreadSafe
 public interface QueryResponseWriter extends NamedListInitializedPlugin {
   public static String CONTENT_TYPE_XML_UTF8 = "application/xml; 
charset=UTF-8";
   public static String CONTENT_TYPE_TEX

[PR] GitHub Action: precommit: --continue [solr]

2025-02-24 Thread via GitHub


dsmiley opened a new pull request, #3210:
URL: https://github.com/apache/solr/pull/3210

   We'd like to see all errors found when the "precommit" GitHub Action runs, 
not just the first error.
   
   note: I'll revert the intentionally bad commit that's there to validate it 
works


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17682: QueryResponseWriter hierarchy refactor [solr]

2025-02-24 Thread via GitHub


dsmiley commented on code in PR #3209:
URL: https://github.com/apache/solr/pull/3209#discussion_r1969009364


##
solr/core/src/java/org/apache/solr/response/PrometheusResponseWriter.java:
##
@@ -47,12 +47,15 @@
  * org.apache.solr.handler.admin.MetricsHandler}
  */
 @SuppressWarnings(value = "unchecked")
-public class PrometheusResponseWriter extends RawResponseWriter {

Review Comment:
   @mlbiscoc why did you choose RawResponseWriter?  It's a very special 
QueryResponseWriter and I don't see that this response writer utilizes it.






[PR] SOLR-17682: QueryResponseWriter hierarchy refactor [solr]

2025-02-24 Thread via GitHub


dsmiley opened a new pull request, #3209:
URL: https://github.com/apache/solr/pull/3209

   QRW now does write(OutputStream) and NOT write(Writer).
   BinaryQueryResponseWriter is removed.
   QRW.writeToString for tests.
   TextQueryResponseWriter added as base for all text QRWs.
   QueryResponseWriterUtil is removed.
   
   https://issues.apache.org/jira/browse/SOLR-17682
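A test helper in the spirit of the `writeToString` item above might look like the following sketch. `ByteWriter` is a hypothetical stand-in for the refactored interface, and decoding as UTF-8 is an assumption; the real method's signature is not shown in this message.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class WriteToStringSketch {

  /** Hypothetical byte-oriented writer; stands in for the refactored QueryResponseWriter. */
  interface ByteWriter {
    void write(OutputStream out, Object response) throws IOException;
  }

  /** Renders a response into memory and decodes it as UTF-8 (assumed encoding). */
  static String writeToString(ByteWriter writer, Object response) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    writer.write(buffer, response);
    return buffer.toString(StandardCharsets.UTF_8);
  }
}
```

With only `write(OutputStream)` on the interface, tests that want a `String` need exactly this kind of buffer-and-decode step.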





[jira] [Updated] (SOLR-17682) Refactor QueryResponseWriter hierarchy to put binary at the base and add TextQueryResponseWriter sub

2025-02-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SOLR-17682:
--
Labels: pull-request-available  (was: )

> Refactor QueryResponseWriter hierarchy to put binary at the base and add 
> TextQueryResponseWriter sub
> 
>
> Key: SOLR-17682
> URL: https://issues.apache.org/jira/browse/SOLR-17682
> Project: Solr
>  Issue Type: Improvement
>Reporter: David Smiley
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The QueryResponseWriter hierarchy should be inverted.  Instead of Writer/Text 
> being at the base with a subclass (BinaryResponseWriter) doing 
> OutputStream/Binary, it should be inverted.  QueryResponseWriter should have 
> write(OutputStream,...) and there should be a subclass/interface 
> TextResponseWriter for the textual formats.  Once this is done, there are 
> some awkward methods that do casting (a code smell) that will instead be 
> simplified.  There will be no use for QueryResponseWriterUtil.  This is all 
> best shown in a PR to see why it's better.
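The inversion described above can be sketched with two small interfaces. The names `ResponseWriter` and `TextWriter` are illustrative stand-ins, and the fixed UTF-8 encoding is an assumption; the actual Solr signatures take request and response objects and differ from this.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class HierarchySketch {

  /** Base type: every writer can emit bytes. */
  interface ResponseWriter {
    void write(OutputStream out, Object response) throws IOException;
  }

  /** Text formats implement the Writer variant; the byte variant is derived from it. */
  interface TextWriter extends ResponseWriter {
    void write(Writer writer, Object response) throws IOException;

    @Override
    default void write(OutputStream out, Object response) throws IOException {
      Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
      write(writer, response);
      writer.flush(); // push buffered chars into out without closing it
    }
  }
}
```

With binary at the base, callers only ever see `write(OutputStream, ...)`, so a binary writer no longer needs a `write(Writer, ...)` method that can only throw.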



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17669: Reduce Memory Consumption by 80-90% when using Dynamic fields (DocumentObjectBinder) [solr]

2025-02-24 Thread via GitHub


dsmiley commented on PR #3179:
URL: https://github.com/apache/solr/pull/3179#issuecomment-2680779168

   I tried to move the CHANGES.txt entry below to Optimizations but I don't 
have permissions to push to your fork of Solr -- [see 
this](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 (which is also in our PR checklist).





Re: [PR] SOLR-16903: Switch CoreContainer#getSolrHome to return Path instead of String [solr]

2025-02-24 Thread via GitHub


dsmiley merged PR #3204:
URL: https://github.com/apache/solr/pull/3204





[jira] [Commented] (SOLR-16903) Use Path instead of File in Java Code

2025-02-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-16903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930070#comment-17930070
 ] 

ASF subversion and git services commented on SOLR-16903:


Commit 53fe9cddc7b309f03ba92926bc02e6af100cdf1a in solr's branch 
refs/heads/main from Andrey Bozhko
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=53fe9cddc7b ]

SOLR-16903: Switch CoreContainer.getSolrHome to return Path instead of String 
(#3204)



> Use Path instead of File in Java Code
> -
>
> Key: SOLR-16903
> URL: https://issues.apache.org/jira/browse/SOLR-16903
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 9.3
>Reporter: Eric Pugh
>Priority: Minor
>  Labels: newdev, pull-request-available
> Fix For: main (10.0)
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> As a community, we have decided to migrate to using java.nio.file.Path in 
> place of java.io.File in our codebase.
> This ticket is to go through the codebase and make that change.  We'd also 
> like to add the java.io.File pattern to our forbidden-apis setup.






Re: [PR] SOLR-16391: Convert "modify-coll" API to JAX-RS [solr]

2025-02-24 Thread via GitHub


gerlowskija commented on PR #2930:
URL: https://github.com/apache/solr/pull/2930#issuecomment-2679415513

   > It has zero consequences for any caller.
   
   It's not really about consequences, it's about being consistent/intuitive.  
Even users who know about the `PATCH` verb and its semantics will probably 
reach for `PUT` if they're already used to using it on update-field, and 
update-config, and update-collprop, and update-configset-file, and 
update-clusterprop, and update-filestore-entry, and update-noderole, and so on.
   
   But like I said - this isn't a hill I'm willing to die on: if anyone else 
votes for PATCH to tilt the cumulative consensus, or we find another API where 
it'd work, I'm happy to go that route.
   
   In the meantime I'm going to try out the "raw-Map" approach and follow up on 
SOLR-13271 regarding whether we need to support the arbitrary properties now 
that we have "Collection Props" as a more formalized thing.





Re: [PR] SOLR-16391: Convert create-core, core-status to JAX-RS [solr]

2025-02-24 Thread via GitHub


HoustonPutman commented on PR #3054:
URL: https://github.com/apache/solr/pull/3054#issuecomment-2679306623

   Hey @gerlowskija this is causing a few errors.
   
   The first one I've found is that the `getCoreStatus()` API takes a `Boolean`, but 
then unboxes it to a `boolean`. So requests that don't provide the option 
get a NullPointerException for `indexInfo` at line 92 of `CoreStatus`.
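The failure mode is plain Java auto-unboxing, reproduced in the sketch below. The method names are hypothetical and the default value is the caller's choice here; which default `CoreStatus` should use is not stated in this thread.

```java
final class UnboxingSketch {

  /** Unsafe: auto-unboxing throws NullPointerException when the option is absent. */
  static boolean unsafe(Boolean indexInfo) {
    return indexInfo; // implicit indexInfo.booleanValue()
  }

  /** Null-safe: an absent option falls back to an explicit default. */
  static boolean orDefault(Boolean indexInfo, boolean defaultValue) {
    return indexInfo != null ? indexInfo : defaultValue;
  }
}
```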





[jira] [Resolved] (SOLR-3954) Option to have updateHandler and DIH skip updateLog

2025-02-24 Thread Eric Pugh (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Pugh resolved SOLR-3954.
-
Resolution: Won't Fix

DIH has moved to https://github.com/SearchScale/dataimporthandler

> Option to have updateHandler and DIH skip updateLog
> ---
>
> Key: SOLR-3954
> URL: https://issues.apache.org/jira/browse/SOLR-3954
> Project: Solr
>  Issue Type: Improvement
>  Components: update
>Affects Versions: 4.0
>Reporter: Shawn Heisey
>Priority: Major
> Fix For: 6.0, 4.9
>
>
> The updateLog feature makes updates take longer, likely because of the I/O 
> time required to write the additional information to disk.  It may take as 
> much as three times as long for the indexing portion of the process.  I'm not 
> sure whether it affects the time to commit, but I would imagine that the 
> difference there is small or zero.  When doing incremental updates/deletes on 
> an existing index, the time lag is probably very small and unimportant.
> When doing a full reindex (which may happen via DIH), especially if this is 
> done in a build core that is then swapped with a live core, this performance 
> hit is unacceptable.  It seems to make the import take about three times as 
> long.
> An option to have an update skip the updateLog would be very useful for these 
> situations.  It should have a method in SolrJ and be exposed in DIH as well.






Re: [PR] SOLR-17518: Deprecate UpdateRequest.getXml() and replace it with XMLRequestWriter [solr]

2025-02-24 Thread via GitHub


psalagnac commented on code in PR #3200:
URL: https://github.com/apache/solr/pull/3200#discussion_r1968195273


##
solr/solrj/src/java/org/apache/solr/client/solrj/request/UpdateRequest.java:
##
@@ -368,147 +368,39 @@ public void setDeleteQuery(List deleteQuery) {
   // --
   // --
 
+  /**
+   * @deprecated Method will be removed in Solr 10.0. Use {@link 
XMLRequestWriter} instead.
+   */
+  @Deprecated(since = "9.9")
   @Override
   public Collection getContentStreams() throws IOException {
 return ClientUtils.toContentStreams(getXML(), ClientUtils.TEXT_XML);
   }
 
+  /**
+   * @deprecated Method will be removed in Solr 10.0. Use {@link 
XMLRequestWriter} instead.
+   */
+  @Deprecated(since = "9.9")
   public String getXML() throws IOException {
 StringWriter writer = new StringWriter();
 writeXML(writer);
 writer.flush();
 
 // If action is COMMIT or OPTIMIZE, it is sent with params
 String xml = writer.toString();
-// System.out.println( "SEND:"+xml );
 return (xml.length() > 0) ? xml : null;
   }
 
-  private List>> getDocLists(
-  Map> documents) {
-List>> docLists = new 
ArrayList<>();
-Map> docList = null;
-if (this.documents != null) {
-
-  Boolean lastOverwrite = true;
-  Integer lastCommitWithin = -1;
-
-  Set>> entries = 
this.documents.entrySet();
-  for (Entry> entry : entries) {
-Map map = entry.getValue();
-Boolean overwrite = null;
-Integer commitWithin = null;
-if (map != null) {
-  overwrite = (Boolean) entry.getValue().get(OVERWRITE);
-  commitWithin = (Integer) entry.getValue().get(COMMIT_WITHIN);
-}
-if (!Objects.equals(overwrite, lastOverwrite)
-|| !Objects.equals(commitWithin, lastCommitWithin)
-|| docLists.isEmpty()) {
-  docList = new LinkedHashMap<>();
-  docLists.add(docList);
-}
-docList.put(entry.getKey(), entry.getValue());
-lastCommitWithin = commitWithin;
-lastOverwrite = overwrite;
-  }
-}
-
-if (docIterator != null) {
-  docList = new LinkedHashMap<>();
-  docLists.add(docList);
-  while (docIterator.hasNext()) {
-SolrInputDocument doc = docIterator.next();
-if (doc != null) {
-  docList.put(doc, null);
-}
-  }
-}
-
-return docLists;
-  }
-
   /**
-   * @since solr 1.4
+   * @deprecated Method will be removed in Solr 10.0. Use {@link 
XMLRequestWriter} instead.
*/
+  @Deprecated(since = "9.9")
   public UpdateRequest writeXML(Writer writer) throws IOException {
-List>> getDocLists = 
getDocLists(documents);
-
-for (Map> docs : getDocLists) {
-
-  if ((docs != null && docs.size() > 0)) {
-Entry> firstDoc = 
docs.entrySet().iterator().next();
-Map map = firstDoc.getValue();
-Integer cw = null;
-Boolean ow = null;
-if (map != null) {
-  cw = (Integer) firstDoc.getValue().get(COMMIT_WITHIN);
-  ow = (Boolean) firstDoc.getValue().get(OVERWRITE);
-}
-if (ow == null) ow = true;
-int commitWithin = (cw != null && cw != -1) ? cw : this.commitWithin;
-boolean overwrite = ow;
-if (commitWithin > -1 || overwrite != true) {
-  writer.write(
-  "");
-} else {
-  writer.write("");
-}
-
-Set>> entries = 
docs.entrySet();
-for (Entry> entry : entries) {
-  ClientUtils.writeXML(entry.getKey(), writer);
-}
-
-writer.write("");
-  }
-}
-
-// Add the delete commands
-boolean deleteI = deleteById != null && deleteById.size() > 0;
-boolean deleteQ = deleteQuery != null && deleteQuery.size() > 0;
-if (deleteI || deleteQ) {
-  if (commitWithin > 0) {
-writer.append("");
-  } else {
-writer.append("");
-  }
-  if (deleteI) {
-for (Map.Entry> entry : 
deleteById.entrySet()) {
-  writer.append(" map = entry.getValue();
-  if (map != null) {
-Long version = (Long) map.get(VER);
-String route = (String) map.get(_ROUTE_);
-if (version != null) {
-  writer.append(" 
version=\"").append(String.valueOf(version)).append('"');
-}
-
-if (route != null) {
-  writer.append(" _route_=\"").append(route).append('"');
-}
-  }
-  writer.append(">");
-
-  XML.escapeCharData(entry.getKey(), writer);
-  writer.append("");
-}
-  }
-  if (deleteQ) {
-for (String q : deleteQuery) {
-  writer.append("");
-  XML.escapeCharData(q, writer);
-  writer.append("");
-}
-  }
-  writer.append("");
-}
+XMLRequestWriter requestWriter = new XM

Re: [PR] SOLR-16391: Convert create-core, core-status to JAX-RS [solr]

2025-02-24 Thread via GitHub


HoustonPutman commented on PR #3054:
URL: https://github.com/apache/solr/pull/3054#issuecomment-2679559395

   Also `TestReplicationHandler.doTestIndexAndConfigAliasReplication`:
   
   ```
   TestReplicationHandler > doTestIndexAndConfigAliasReplication FAILED
   java.lang.ClassCastException: class java.util.LinkedHashMap cannot be 
cast to class org.apache.solr.common.util.NamedList (java.util.LinkedHashMap is 
in module java.base of loader 'bootstrap'; 
org.apache.solr.common.util.NamedList is in unnamed module of loader 'app')
   at 
__randomizedtesting.SeedInfo.seed([AD42828812B09E3B:5A316CD0D45831DD]:0)
   at 
org.apache.solr.handler.TestReplicationHandler.watchCoreStartAt(TestReplicationHandler.java:1688)
   at 
org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigAliasReplication(TestReplicationHandler.java:1360)
   ```
   
   We either need to fix the return type or fix the test to use the new type.
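If the test side is the one to change, one option is to read the section through the `Map` interface instead of casting to a concrete class. The sketch below assumes a hypothetical `lookup` helper and avoids Solr types entirely; a branch for `NamedList` would be needed to accept both shapes and is omitted here.

```java
import java.util.Map;

final class StatusLookupSketch {

  /** Reads a key from a response section without assuming its concrete class. */
  static Object lookup(Object section, String key) {
    // Checking against the Map interface accepts LinkedHashMap and any other Map.
    if (section instanceof Map<?, ?> map) {
      return map.get(key);
    }
    throw new IllegalArgumentException(
        "Unexpected section type: " + (section == null ? "null" : section.getClass().getName()));
  }
}
```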





[jira] [Commented] (SOLR-13271) Implement a read-only mode for a collection

2025-02-24 Thread Jason Gerlowski (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929897#comment-17929897
 ] 

Jason Gerlowski commented on SOLR-13271:


I noticed recently that MODIFYCOLLECTION allows users to set arbitrary 
properties for a collection, that are distinct from that collection's 
"[collection 
properties|https://solr.apache.org/guide/solr/latest/deployment-guide/collection-management.html#collectionprop]".

As far as I can tell from a bit of code spelunking, that came about as a part 
of this ticket.  The first draft/patch here added {{readOnly}} support by 
adding support for arbitrary properties more generally. Later revisions pivoted 
to handling {{readOnly}} as a "first-class" property, similar to 
replicationFactor, collection.configName, etc, but the plumbing for arbitrary 
props stuck around into the final commit and wasn't commented on (afaict).

("Collection props" had been around for about a year at the time this ticket 
was merged, having been previously added by 
[SOLR-11960|https://issues.apache.org/jira/browse/SOLR-11960]...but they didn't 
have a ton of use at that point, which maybe explains why they never came up as 
an option here?)

Anyway, my question at this point: is there a reason to keep the arbitrary-prop 
support in MODIFYCOLLECTION, or is there a reason to support these two 
different avenues of prop-setting?  Absent any objections or corrections here, 
I'll file a separate ticket for that and discussion can continue there...

> Implement a read-only mode for a collection
> ---
>
> Key: SOLR-13271
> URL: https://issues.apache.org/jira/browse/SOLR-13271
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 8.0, 9.0
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Major
> Fix For: 8.1, 9.0
>
> Attachments: SOLR-13271.patch, SOLR-13271.patch
>
>
> Spin-off from SOLR-11127. In some scenarios it's useful to be able to block 
> any index updates to a collection, while still being able to search its 
> contents.
> Currently the scope of this issue is SolrCloud, ie. standalone Solr will not 
> be supported.






Re: [PR] SOLR-17650: Fix tests for unordered buffered updates [solr]

2025-02-24 Thread via GitHub


HoustonPutman commented on code in PR #3197:
URL: https://github.com/apache/solr/pull/3197#discussion_r1968316944


##
solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java:
##
@@ -1027,6 +1028,64 @@ public static String assertJQ(SolrQueryRequest req, 
double delta, String... test
 }
   }
 
+  public static  String assertThatJQ(SolrQueryRequest req, Matcher test) 
throws Exception {
+return assertThatJQ(req, "", test);
+  }
+
+  /**
+   * Validates a query completes and, using JSON deserialization, returns an 
object that passes the
+   * given Matcher test.
+   *
+   * Please use this with care: this makes it easy to match complete 
structures, but doing so can
+   * result in fragile tests if you are matching more than what you want to 
test.
+   *
+   * @param req Solr request to execute
+   * @param message Failure message for test
+   * @param test Matcher for the given object returned from deserializing the 
response
+   * @return The request response as a JSON String if the test matcher passes
+   */
+  @SuppressWarnings("unchecked")
+  public static  String assertThatJQ(SolrQueryRequest req, String message, 
Matcher test)

Review Comment:
   I thought about this, but the entire test is set up to use the test harness, 
so I wanted to implement this test to fit in with the surrounding tests as best 
as possible. 
   
   Adding one more convenience method to the test harness to fit the existing 
workflow I think is pretty minor compared to starting a whole new way for this 
test to behave. Deprecating the TestHarness (and actually removing it) is going 
to be a MAJOR ordeal, and honestly IMO changing a test to 1/2 use the test 
harness and 1/2 use something else will probably make that harder in the end.






Re: [PR] SOLR-17518: Deprecate UpdateRequest.getXml() and replace it with XMLRequestWriter [solr]

2025-02-24 Thread via GitHub


psalagnac commented on code in PR #3200:
URL: https://github.com/apache/solr/pull/3200#discussion_r1968216711


##
solr/solrj/src/java/org/apache/solr/client/solrj/impl/XMLRequestWriter.java:
##
@@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.client.solrj.impl;
+
+import java.io.BufferedWriter;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.io.OutputStreamWriter;
+import java.io.Writer;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.Set;
+import org.apache.solr.client.solrj.SolrRequest;
+import org.apache.solr.client.solrj.request.RequestWriter;
+import org.apache.solr.client.solrj.request.UpdateRequest;
+import org.apache.solr.client.solrj.util.ClientUtils;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.ShardParams;
+import org.apache.solr.common.util.ContentStream;
+import org.apache.solr.common.util.XML;
+
+public class XMLRequestWriter extends RequestWriter {
+
+  /**
+   * Use this to do a push writing instead of pull. If this method returns 
null {@link

Review Comment:
   That's not new code, and I'm not sure of its history. This javadoc is also in 
the base class `ContentWriter`.
   I agree it does not make sense... I'll update it in this PR.
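For context on the "push writing instead of pull" wording in the javadoc under discussion, the two styles can be contrasted as below. `PushBody` and `PullBody` are hypothetical interfaces made up for this sketch; they are not SolrJ's `ContentWriter` or `ContentStream`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class PushPullSketch {

  /** Push style: the request is handed the transport's stream and writes into it. */
  interface PushBody {
    void write(OutputStream out) throws IOException;
  }

  /** Pull style: the request exposes its body and the transport reads from it. */
  interface PullBody {
    InputStream open() throws IOException;
  }

  static byte[] sendPush(PushBody body) throws IOException {
    ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the HTTP stream
    body.write(wire);
    return wire.toByteArray();
  }

  static byte[] sendPull(PullBody body) throws IOException {
    try (InputStream in = body.open()) {
      return in.readAllBytes();
    }
  }
}
```

Both calls deliver the same bytes; the difference is which side drives the copy, and therefore whether the body has to exist as a readable stream before sending starts.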






Re: [PR] SOLR-17518: Deprecate UpdateRequest.getXml() and replace it with XMLRequestWriter [solr]

2025-02-24 Thread via GitHub


epugh commented on code in PR #3200:
URL: https://github.com/apache/solr/pull/3200#discussion_r1968218397


##
solr/solrj/src/java/org/apache/solr/client/solrj/impl/XMLRequestWriter.java:
##
@@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.client.solrj.impl;
+
+import java.io.BufferedWriter;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.io.OutputStreamWriter;
+import java.io.Writer;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.Set;
+import org.apache.solr.client.solrj.SolrRequest;
+import org.apache.solr.client.solrj.request.RequestWriter;
+import org.apache.solr.client.solrj.request.UpdateRequest;
+import org.apache.solr.client.solrj.util.ClientUtils;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.ShardParams;
+import org.apache.solr.common.util.ContentStream;
+import org.apache.solr.common.util.XML;
+
+public class XMLRequestWriter extends RequestWriter {
+
+  /**
+   * Use this to do a push writing instead of pull. If this method returns 
null {@link

Review Comment:
   awesome!   






Re: [PR] SOLR-17518: Deprecate UpdateRequest.getXml() and replace it with XMLRequestWriter [solr]

2025-02-24 Thread via GitHub


psalagnac commented on code in PR #3200:
URL: https://github.com/apache/solr/pull/3200#discussion_r1968216711


##
solr/solrj/src/java/org/apache/solr/client/solrj/impl/XMLRequestWriter.java:
##
@@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.client.solrj.impl;
+
+import java.io.BufferedWriter;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.io.OutputStreamWriter;
+import java.io.Writer;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.Set;
+import org.apache.solr.client.solrj.SolrRequest;
+import org.apache.solr.client.solrj.request.RequestWriter;
+import org.apache.solr.client.solrj.request.UpdateRequest;
+import org.apache.solr.client.solrj.util.ClientUtils;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.ShardParams;
+import org.apache.solr.common.util.ContentStream;
+import org.apache.solr.common.util.XML;
+
+public class XMLRequestWriter extends RequestWriter {
+
+  /**
+   * Use this to do a push writing instead of pull. If this method returns null {@link

Review Comment:
   That's not new code, not sure on history. This javadoc is also in base class 
`RequestWriter`.
   I agree it does not make sense... I'll update it with this PR.






Re: [PR] SOLR-17518: Deprecate UpdateRequest.getXml() and replace it with XMLRequestWriter [solr]

2025-02-24 Thread via GitHub


psalagnac commented on code in PR #3200:
URL: https://github.com/apache/solr/pull/3200#discussion_r1968216711


##
solr/solrj/src/java/org/apache/solr/client/solrj/impl/XMLRequestWriter.java:
##
@@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.client.solrj.impl;
+
+import java.io.BufferedWriter;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.io.OutputStreamWriter;
+import java.io.Writer;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.Set;
+import org.apache.solr.client.solrj.SolrRequest;
+import org.apache.solr.client.solrj.request.RequestWriter;
+import org.apache.solr.client.solrj.request.UpdateRequest;
+import org.apache.solr.client.solrj.util.ClientUtils;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.ShardParams;
+import org.apache.solr.common.util.ContentStream;
+import org.apache.solr.common.util.XML;
+
+public class XMLRequestWriter extends RequestWriter {
+
+  /**
+   * Use this to do a push writing instead of pull. If this method returns null {@link

Review Comment:
   That's not new code, not sure on history. This javadoc is also in base class 
`RequestWriter`.
   I agree it does not make sense... I'll update it with this PR.






Re: [PR] SOLR-17518: Deprecate UpdateRequest.getXml() and replace it with XMLRequestWriter [solr]

2025-02-24 Thread via GitHub


psalagnac commented on PR #3200:
URL: https://github.com/apache/solr/pull/3200#issuecomment-2679343190

   > I assume you moved some things around but didn't really write code here 
(i.e. all code, variable names was the choice of original authors)?
   
   Yes, just moved things around.
   





Re: [PR] SOLR-17023: Use Modern NLP Models via ONNX and Apache OpenNLP with Solr [solr]

2025-02-24 Thread via GitHub


epugh commented on PR #1999:
URL: https://github.com/apache/solr/pull/1999#issuecomment-2679572157

   Yep, need to wait for Lucene 10, otherwise we get some unit test failures:
   
   ```
   gradlew test --tests TestOpenNLPExtractNamedEntitiesUpdateProcessorFactory.testExtractFieldRegexReplaceAll -Dtests.seed=F5DD0B40AC590A66 -Dtests.locale=pt-GW -Dtests.timezone=PRT -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   ```
   
   ```
   > java.lang.NoSuchMethodError: 'opennlp.tools.util.Span[] opennlp.tools.sentdetect.SentenceDetectorME.sentPosDetect(java.lang.String)'
   >     at __randomizedtesting.SeedInfo.seed([F5DD0B40AC590A66:4351D7F53AC9F7A4]:0)
   >     at org.apache.lucene.analysis.opennlp.tools.NLPSentenceDetectorOp.splitSentences(NLPSentenceDetectorOp.java:41)
   >     at org.apache.lucene.analysis.opennlp.OpenNLPSentenceBreakIterator.setText(OpenNLPSentenceBreakIterator.java:199)
   >     at org.apache.lucene.analysis.util.SegmentingTokenizerBase.reset(SegmentingTokenizerBase.java:89)
   
   ```





Re: [PR] Finally fix TestCoordinatorRole for good. [solr]

2025-02-24 Thread via GitHub


HoustonPutman merged PR #3205:
URL: https://github.com/apache/solr/pull/3205





[jira] [Resolved] (SOLR-14202) UpdateProcessor/also in DIH with a ScriptTransformer that does Atomic Updates leaks searchers

2025-02-24 Thread Eric Pugh (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Pugh resolved SOLR-14202.
--
Resolution: Won't Fix

DIH has moved to https://github.com/SearchScale/dataimporthandler

> UpdateProcessor/also in DIH with a ScriptTransformer that does Atomic Updates 
> leaks searchers
> -
>
> Key: SOLR-14202
> URL: https://issues.apache.org/jira/browse/SOLR-14202
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 8.3, 8.4
>Reporter: Jörn Franke
>Priority: Major
> Attachments: eoe.zip, eoedihleak.zip
>
>
> The data directory of a collection keeps growing. It seems that old 
> segments are not deleted; they are only deleted during startup of Solr.
> How to reproduce: take any collection (e.g. the example collection) and start 
> indexing documents. Even during indexing the data directory grows 
> significantly, much more than expected (by several orders of magnitude). If certain 
> documents are updated (without significantly increasing the amount of data), 
> the index data directory again grows by several orders of magnitude. Even for small 
> collections the needed space explodes.
> The size drops significantly if Solr is stopped and then started. During 
> startup (not shutdown) Solr purges all segments that are no longer needed 
> (sometimes some, but not a significant amount, are deleted during shutdown). This 
> is of course not a good workaround for normal operations.
> It does not seem to have an effect on queries (their performance does not seem 
> to change).
> The configs did not change across the upgrade (e.g. from Solr 8.2 
> to 8.3 to 8.4, not across major versions), so I assume it could be related to 
> Solr 8.4. It may also have been present in Solr 8.3 (not sure), but not in 8.2.
>  
> IndexConfig is pretty much default: Lock type: native, autoCommit: 15000, 
> openSearcher=false, autoSoftCommit -1 (reproducible with autoCommit 5000).
> Nevertheless, it did not happen in previous versions of Solr and the config 
> did not change.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Resolved] (SOLR-4241) Add object to SolrJ for interpreting DIH status

2025-02-24 Thread Eric Pugh (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Pugh resolved SOLR-4241.
-
Resolution: Won't Fix

DIH has moved to https://github.com/SearchScale/dataimporthandler

> Add object to SolrJ for interpreting DIH status
> ---
>
> Key: SOLR-4241
> URL: https://issues.apache.org/jira/browse/SOLR-4241
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java, SolrJ
>Reporter: Shawn Heisey
>Priority: Major
> Fix For: 6.0, 4.9
>
>
> Objects exist in SolrJ for easy interpretation of special handlers - 
> SolrPing/SolrPingResponse is a prime example.  I believe it would be a good 
> idea to add similar capabilities for easily interpreting DIH status.
> The only sticky point I can see is the fact that the dataimport handler is a 
> contrib module.  This might mean that this new capability would have to be 
> separated into a small jar file in a solrj contrib section.






[jira] [Created] (SOLR-17682) Refactor QueryResponseWriter hierarchy to put binary at the base and add TextQueryResponseWriter sub

2025-02-24 Thread David Smiley (Jira)
David Smiley created SOLR-17682:
---

 Summary: Refactor QueryResponseWriter hierarchy to put binary at 
the base and add TextQueryResponseWriter sub
 Key: SOLR-17682
 URL: https://issues.apache.org/jira/browse/SOLR-17682
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley


The QueryResponseWriter hierarchy should be inverted.  Instead of Writer/Text 
being at the base with a subclass (BinaryResponseWriter) doing 
OutputStream/Binary, QueryResponseWriter should have 
write(OutputStream,...) and there should be a subclass/interface 
TextQueryResponseWriter for the textual formats.  Once this is done, some 
awkward methods that do casting (a code smell) can be simplified, and 
there will be no use for QueryResponseWriterUtil.  This is all best shown in a 
PR to see why it's better.
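
A minimal sketch of the inverted shape described above, under stated assumptions: the names `ResponseWriterSketch`, `TextResponseWriterSketch`, `bracketWriter` and `render` are hypothetical, the signatures omit the request/response arguments, and UTF-8 is assumed as the encoding. It is not the code from the PR; it only illustrates how a text sub-interface can bridge to a binary base method so that callers never need to cast.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class HierarchySketch {

  /** Hypothetical base type: every writer can emit bytes. */
  public interface ResponseWriterSketch {
    void write(OutputStream out, String payload) throws IOException;
  }

  /** Hypothetical text sub-interface: implementors only deal with a Writer. */
  public interface TextResponseWriterSketch extends ResponseWriterSketch {
    void write(Writer writer, String payload) throws IOException;

    // Bridge from the binary base method to the text method (UTF-8 assumed).
    @Override
    default void write(OutputStream out, String payload) throws IOException {
      Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
      write(writer, payload);
      writer.flush(); // flush the encoder; closing the stream is left to the caller
    }
  }

  /** A tiny text writer that wraps the payload in angle brackets. */
  public static TextResponseWriterSketch bracketWriter() {
    return (writer, payload) -> writer.write("<" + payload + ">");
  }

  /** Renders through the base interface only; no instanceof checks or casts. */
  public static String render(ResponseWriterSketch w, String payload) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
      w.write(out, payload);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    System.out.println(render(bracketWriter(), "ok"));
  }
}
```

With the byte-oriented method at the base, a caller such as `render` works for binary and text writers alike, which is the simplification the description is pointing at.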






[jira] [Commented] (SOLR-13731) javabin must support a 1:1 mapping of the JSON update format

2025-02-24 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929615#comment-17929615
 ] 

Noble Paul commented on SOLR-13731:
---

Constructing a SolrInputDocument etc. is much more complex than streaming a 
bunch of maps. Yes, users can just construct a javabin payload and 
post it.

> javabin  must support a 1:1 mapping of the JSON update format
> -
>
> Key: SOLR-13731
> URL: https://issues.apache.org/jira/browse/SOLR-13731
> Project: Solr
>  Issue Type: Task
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: 8.4
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Objects like SolrInputDocument are serialized in such a way that the size is 
> known in advance. All objects should ideally support streaming-friendly types.
> This is backward compatible: javabin will continue to serialize 
> using the old format, but will accept more efficient formats as input.






Re: [PR] SOLR-17677: HashRangeQuery doesn't NEED SolrIndexSearcher [solr]

2025-02-24 Thread via GitHub


gerlowskija commented on code in PR #3206:
URL: https://github.com/apache/solr/pull/3206#discussion_r1967557670


##
solr/core/src/java/org/apache/solr/search/join/HashRangeQuery.java:
##
@@ -91,6 +96,9 @@ private int[] getCache(LeafReaderContext context) throws IOException {
 if (cacheHelper == null) {
   return null;
 }
+if (!(searcher instanceof SolrIndexSearcher)) { // e.g. delete-by-query

Review Comment:
   [Q] How important is the caching, do you know?  Is it enough of a change 
that it's worth documenting the degraded performance when run as a DBQ?
   
   (Not implying it is, just asking from a due-diligence perspective.)






[jira] [Created] (SOLR-17681) Text to Vector Filter for Indexing

2025-02-24 Thread Yugank (Jira)
Yugank created SOLR-17681:
-

 Summary: Text to Vector Filter for Indexing
 Key: SOLR-17681
 URL: https://issues.apache.org/jira/browse/SOLR-17681
 Project: Solr
  Issue Type: New Feature
Reporter: Yugank


The scope of this issue is to introduce support for automatic text vectorisation in 
Apache Solr, directly in the Analyzer chain during indexing.

Since Solr already has a Text to Vector Query Parser, thanks to 
Alessandro Benedetti, we should start looking at a Filter that can do the 
encoding using the same model we have uploaded for the query parser. This would 
make Solr self-sufficient, without the use of an external LLM service for encoding, 
even at index time.






Re: [PR] SOLR-16903: Switch CoreContainer#getSolrHome to return Path instead of String [solr]

2025-02-24 Thread via GitHub


AndreyBozhko commented on code in PR #3204:
URL: https://github.com/apache/solr/pull/3204#discussion_r1967834442


##
solr/core/src/java/org/apache/solr/handler/admin/SystemInfoHandler.java:
##
@@ -145,7 +145,7 @@ public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throw
   rsp.add("zkHost", getCoreContainer(req).getZkController().getZkServerAddress());
 }
 if (cc != null) {
-  rsp.add("solr_home", cc.getSolrHome());
+  rsp.add("solr_home", cc.getSolrHome().toString());

Review Comment:
   Makes sense - the JavaBinCodec looks to be doing the same thing. 
https://github.com/apache/solr/blob/76c09a35dba42913a6bcb281b52b00f87564624a/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L411-L413






Re: [PR] SOLR-17677: HashRangeQuery doesn't NEED SolrIndexSearcher [solr]

2025-02-24 Thread via GitHub


dsmiley merged PR #3206:
URL: https://github.com/apache/solr/pull/3206





Re: [PR] SOLR-17677: HashRangeQuery doesn't NEED SolrIndexSearcher [solr]

2025-02-24 Thread via GitHub


dsmiley commented on code in PR #3206:
URL: https://github.com/apache/solr/pull/3206#discussion_r1967881072


##
solr/core/src/java/org/apache/solr/search/join/HashRangeQuery.java:
##
@@ -91,6 +96,9 @@ private int[] getCache(LeafReaderContext context) throws IOException {
 if (cacheHelper == null) {
   return null;
 }
+if (!(searcher instanceof SolrIndexSearcher)) { // e.g. delete-by-query

Review Comment:
   It's telling that the cache this thing uses is completely optional.  You 
have to go out of your way to register it in your solrconfig.xml.  So this 
query definitely doesn't need it, and it's likely people are using this query 
without this cache given how easy it would be to forget to configure it or even 
know about it.
   
   There's no degraded performance concern since it hasn't been working at all 
in a DBQ :-)
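
To illustrate the guard in the diff above, here is a minimal sketch with stand-in classes. `PlainSearcher` and `CachingSearcher` are hypothetical and are not Solr or Lucene types. The assumption is simply that an optional cache lives on the richer searcher type, so the lookup returns null (no cache) whenever a plain searcher is passed in, rather than attempting a cast.

```java
import java.util.HashMap;
import java.util.Map;

public class CacheGuardSketch {

  /** Stand-in for a plain searcher with no extra caches (hypothetical). */
  public static class PlainSearcher {}

  /** Stand-in for a searcher that may expose an optional, user-configured cache (hypothetical). */
  public static class CachingSearcher extends PlainSearcher {
    private final Map<String, int[]> hashCache; // null when the cache is not configured

    public CachingSearcher(Map<String, int[]> hashCache) {
      this.hashCache = hashCache;
    }
  }

  /** Returns cached hashes for the field, or null when no cache is available. */
  public static int[] getCache(PlainSearcher searcher, String field) {
    if (!(searcher instanceof CachingSearcher)) { // e.g. the delete-by-query path
      return null; // no cast is attempted, so no ClassCastException can occur
    }
    Map<String, int[]> cache = ((CachingSearcher) searcher).hashCache;
    return cache == null ? null : cache.get(field);
  }

  public static void main(String[] args) {
    Map<String, int[]> cache = new HashMap<>();
    cache.put("id", new int[] {7, 11});

    // Plain searcher: the guard short-circuits and the caller works uncached.
    System.out.println(getCache(new PlainSearcher(), "id") == null);
    // Cache-aware searcher with a configured cache: the entry is found.
    System.out.println(getCache(new CachingSearcher(cache), "id").length);
    // Cache-aware searcher without a configured cache: also no cache.
    System.out.println(getCache(new CachingSearcher(null), "id") == null);
  }
}
```

Because a null result already means "compute without the cache", the plain-searcher case needs no special handling in the caller.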






[jira] [Commented] (SOLR-17677) {!join} in delete-by-query throws ClassCastException and closes IndexWriter

2025-02-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929815#comment-17929815
 ] 

ASF subversion and git services commented on SOLR-17677:


Commit 3a492203cf4d9b7ed9431de9378049992f7da355 in solr's branch 
refs/heads/main from David Smiley
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=3a492203cf4 ]

SOLR-17677: HashRangeQuery doesn't NEED SolrIndexSearcher (#3206)



> {!join} in delete-by-query throws ClassCastException and closes IndexWriter
> ---
>
> Key: SOLR-17677
> URL: https://issues.apache.org/jira/browse/SOLR-17677
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 9.8
>Reporter: Jason Gerlowski
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Solr's JoinQuery implementation explicitly casts the provided "IndexSearcher" 
> to a "SolrIndexSearcher".  In most contexts this assumption bears out, but 
> not always.
> One counter-example is Solr's "Delete By Query" codepath, which runs the 
> deletion query using a "raw" Lucene IndexSearcher.  (Presumably this is 
> because the new searcher has just been opened?).  Any DBQ containing a join 
> query will throw a ClassCastException, which then bubbles up to the 
> IndexWriter as a "tragic" Lucene exception, force-closing the IndexWriter and 
> throwing the surrounding SolrCore into a bad state:
> {code}
> 2025-02-18 19:39:25.339 ERROR (qtp1426725223-177-localhost-73) 
> [c:techproducts s:shard2 r:core_node3 x:techproducts_shard2_replica_n1 
> t:localhost-73] o.a.s.h.RequestHandlerBase Server exception => 
> org.apache.solr.common.SolrException: this IndexWriter is closed
> at 
> org.apache.solr.common.SolrException.wrapLuceneTragicExceptionIfNecessary(SolrException.java:218)
> org.apache.solr.common.SolrException: this IndexWriter is closed
> at 
> org.apache.solr.common.SolrException.wrapLuceneTragicExceptionIfNecessary(SolrException.java:218)
>  ~[?:?]
> at 
> org.apache.solr.handler.RequestHandlerBase.normalizeReceivedException(RequestHandlerBase.java:272)
>  ~[?:?]
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:238)
>  ~[?:?]
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2880) ~[?:?]
> at 
> org.apache.solr.servlet.HttpSolrCall.executeCoreRequest(HttpSolrCall.java:890)
>  ~[?:?]
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:576) 
> ~[?:?]
> at 
> org.apache.solr.servlet.SolrDispatchFilter.dispatch(SolrDispatchFilter.java:241)
>  ~[?:?]
> at 
> org.apache.solr.servlet.SolrDispatchFilter.lambda$doFilterRetry$0(SolrDispatchFilter.java:198)
>  ~[?:?]
> at 
> org.apache.solr.servlet.ServletUtils.traceHttpRequestExecution2(ServletUtils.java:227)
>  ~[?:?]
> at 
> org.apache.solr.servlet.ServletUtils.rateLimitRequest(ServletUtils.java:197) 
> ~[?:?]
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilterRetry(SolrDispatchFilter.java:192)
>  ~[?:?]
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:181)
>  ~[?:?]
> at javax.servlet.http.HttpFilter.doFilter(HttpFilter.java:97) 
> ~[jetty-servlet-api-4.0.6.jar:?]
> at 
> org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:210) 
> ~[jetty-servlet-10.0.22.jar:10.0.22]
> at 
> org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635)
>  ~[jetty-servlet-10.0.22.jar:10.0.22]
> at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:527) 
> ~[jetty-servlet-10.0.22.jar:10.0.22]
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:131) 
> ~[jetty-server-10.0.22.jar:10.0.22]
> at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:598) 
> ~[jetty-security-10.0.22.jar:10.0.22]
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
>  ~[jetty-server-10.0.22.jar:10.0.22]
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223)
>  ~[jetty-server-10.0.22.jar:10.0.22]
> at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1580)
>  ~[jetty-server-10.0.22.jar:10.0.22]
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221)
>  ~[jetty-server-10.0.22.jar:10.0.22]
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1384)
>  ~[jetty-server-10.0.22.jar:10.0.22]
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176)
>  ~[jetty-server-10.0.22.j

Re: [PR] SOLR-16903: Switch CoreContainer#getSolrHome to return Path instead of String [solr]

2025-02-24 Thread via GitHub


dsmiley commented on PR #3204:
URL: https://github.com/apache/solr/pull/3204#issuecomment-2678884231

   I plan to merge this tonight; it's very straightforward





Re: [PR] SOLR-17669: Reduce Memory Consumption by 80-90% when using Dynamic fields (DocumentObjectBinder) [solr]

2025-02-24 Thread via GitHub


dsmiley commented on PR #3179:
URL: https://github.com/apache/solr/pull/3179#issuecomment-2678890820

   I'll merge this tonight.  If you don't get to CHANGES.txt, I'll do it.  I 
edited the JIRA issue description to better identify what this is about; it was 
confusing to speak of "dynamic fields" -- everyone will think you mean the 
schema itself.





Re: [PR] SOLR-17669: Reduce Memory Consumption by 80-90% when using Dynamic fields (DocumentObjectBinder) [solr]

2025-02-24 Thread via GitHub


ds-manzinger commented on PR #3179:
URL: https://github.com/apache/solr/pull/3179#issuecomment-2678911271

   Hi, I committed CHANGES.txt about an hour ago.





Re: [PR] SOLR-17023: Use Modern NLP Models via ONNX and Apache OpenNLP with Solr [solr]

2025-02-24 Thread via GitHub


epugh commented on code in PR #1999:
URL: https://github.com/apache/solr/pull/1999#discussion_r1967948785


##
solr/modules/analysis-extras/src/java/org/apache/solr/update/processor/DocumentCategorizerUpdateProcessorFactory.java:
##
@@ -0,0 +1,569 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.update.processor;
+
+import static org.apache.solr.common.SolrException.ErrorCode.SERVER_ERROR;
+
+import ai.onnxruntime.OrtException;
+import java.io.File;
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+import java.util.regex.PatternSyntaxException;
+import opennlp.dl.InferenceOptions;
+import opennlp.dl.doccat.DocumentCategorizerDL;
+import opennlp.dl.doccat.scoring.AverageClassificationScoringStrategy;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.SolrInputField;
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.common.util.Pair;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.filestore.ClusterFileStore;
+import org.apache.solr.filestore.FileStore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.update.AddUpdateCommand;
+import org.apache.solr.update.processor.FieldMutatingUpdateProcessor.FieldNameSelector;
+import org.apache.solr.update.processor.FieldMutatingUpdateProcessorFactory.SelectorParams;
+import org.apache.solr.util.plugin.SolrCoreAware;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class DocumentCategorizerUpdateProcessorFactory extends UpdateRequestProcessorFactory
+implements SolrCoreAware {
+
+  private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  public static final String SOURCE_PARAM = "source";
+  public static final String DEST_PARAM = "dest";
+  public static final String PATTERN_PARAM = "pattern";
+  public static final String REPLACEMENT_PARAM = "replacement";
+  public static final String MODEL_PARAM = "modelFile";
+  public static final String VOCAB_PARAM = "vocabFile";
+
+  private Path solrHome;
+
+  private SelectorParams srcInclusions = new SelectorParams();
+  private Collection<SelectorParams> srcExclusions = new ArrayList<>();
+
+  private FieldNameSelector srcSelector = null;
+
+  private String model = null;
+  private String vocab = null;
+  private String analyzerFieldType = null;
+
+  /**
+   * If pattern is null, then this is a literal field name. If pattern is non-null then this is a
+   * replacement string that may contain meta-characters (ie: capture group identifiers)
+   *
+   * @see #pattern
+   */
+  private String dest = null;
+
+  /**
+   * @see #dest
+   */
+  private Pattern pattern = null;
+
+  protected final FieldNameSelector getSourceSelector() {
+if (null != srcSelector) return srcSelector;
+
+throw new SolrException(
+SERVER_ERROR, "selector was never initialized, inform(SolrCore) never called???");
+  }
+
+  @Override
+  public void init(NamedList<?> args) {
+
+// high level (loose) check for which type of config we have.
+//
+// individual init methods do more strict syntax checking
+if (0 <= args.indexOf(SOURCE_PARAM, 0) && 0 <= args.indexOf(DEST_PARAM, 0)) {
+  initSourceSelectorSyntax(args);
+} else if (0 <= args.indexOf(PATTERN_PARAM, 0) && 0 <= args.indexOf(REPLACEMENT_PARAM, 0)) {
+  initSimpleRegexReplacement(args);
+} else {
+  throw new SolrException(
+  SERVER_ERROR,
+  "A combination of either '"
+  + SOURCE_PARAM
+  + "' + '"
+  + DEST_PARAM
+  + "', or '"
+  + REPLACEMENT_PARAM
+  + "' + '"
+  + PATTERN_PARAM
+  + "' init params are mandatory");
+}
+
+Object

[jira] [Commented] (SOLR-13731) javabin must support a 1:1 mapping of the JSON update format

2025-02-24 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929833#comment-17929833
 ] 

David Smiley commented on SOLR-13731:
-

There are many classes in SolrJ and it's not always clear to us/users which 
ones are internal, not to mention that in this case, we may disagree.  I don't 
see why a user using SolrJ would go out of their way to use JavaBinCodec as you 
describe.  Constructing a SolrInputDocument is easy :).  
{{ConcurrentUpdateHttp2SolrClient}} streams updates.

> javabin must support a 1:1 mapping of the JSON update format
> -
>
> Key: SOLR-13731
> URL: https://issues.apache.org/jira/browse/SOLR-13731
> Project: Solr
>  Issue Type: Task
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: 8.4
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Objects like SolrInputDocument are serialized in such a way that the size is 
> known in advance. All objects should ideally support streaming-friendly types.
> This is backward compatible: javabin will continue to serialize 
> using the old format, but will accept more efficient formats as input.
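
To illustrate the distinction the issue description draws, here is a minimal sketch of size-prefixed versus streaming-friendly framing. It is not the javabin wire format: the class name StreamingFraming and the ITEM/END tag bytes are hypothetical and exist only for this example, and plain strings stand in for documents.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class StreamingFraming {
  // Hypothetical tag bytes for this sketch only; not javabin's real tag values.
  static final byte ITEM = 1;
  static final byte END = 0;

  /** Size-prefixed: the element count is written first, so it must be known up front. */
  static byte[] writeSized(List<String> items) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(items.size());
      for (String item : items) {
        out.writeUTF(item);
      }
    }
    return bytes.toByteArray();
  }

  static List<String> readSized(byte[] data) throws IOException {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      int count = in.readInt();
      List<String> items = new ArrayList<>(count);
      for (int i = 0; i < count; i++) {
        items.add(in.readUTF());
      }
      return items;
    }
  }

  /** Streaming: every element carries a tag and an END marker closes the sequence. */
  static byte[] writeStreamed(Iterator<String> items) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      while (items.hasNext()) {
        out.writeByte(ITEM);
        out.writeUTF(items.next());
      }
      out.writeByte(END);
    }
    return bytes.toByteArray();
  }

  static List<String> readStreamed(byte[] data) throws IOException {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      List<String> items = new ArrayList<>();
      // Keep reading tagged elements until the END marker is seen.
      while (in.readByte() == ITEM) {
        items.add(in.readUTF());
      }
      return items;
    }
  }

  public static void main(String[] args) throws IOException {
    List<String> docs = List.of("id=1", "id=2");
    System.out.println(readSized(writeSized(docs)).equals(docs));
    System.out.println(readStreamed(writeStreamed(docs.iterator())).equals(docs));
  }
}
```

The streamed writer takes an Iterator, so the producer never needs to hold the whole collection or know its length, which is the property the description asks for. A reader that accepts both framings would stay backward compatible with the size-prefixed form.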



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[PR] Add CVE-2024-6763 to our vex file [solr-site]

2025-02-24 Thread via GitHub


gerlowskija opened a new pull request, #143:
URL: https://github.com/apache/solr-site/pull/143

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

