Re: [PR] SOLR-17682: QueryResponseWriter hierarchy refactor [solr]
dsmiley commented on code in PR #3209: URL: https://github.com/apache/solr/pull/3209#discussion_r1968827196 ## solr/core/src/java/org/apache/solr/response/RawResponseWriter.java: ## @@ -88,41 +85,27 @@ protected QueryResponseWriter getBaseWriter(SolrQueryRequest request) { @Override public String getContentType(SolrQueryRequest request, SolrQueryResponse response) { Object obj = response.getValues().get(CONTENT); -if (obj != null && (obj instanceof ContentStream)) { - return ((ContentStream) obj).getContentType(); +if (obj instanceof ContentStream content) { + return content.getContentType(); } return getBaseWriter(request).getContentType(request, response); } @Override - public void write(Writer writer, SolrQueryRequest request, SolrQueryResponse response) Review Comment: We don't need a Writer/Reader variant of this (I think). ## solr/core/src/java/org/apache/solr/response/JacksonJsonWriter.java: ## @@ -43,17 +47,42 @@ public JacksonJsonWriter() { jsonfactory = new JsonFactory(); } + // let's also implement the binary version since Jackson supports that (probably faster) @Override - public void write(OutputStream out, SolrQueryRequest request, SolrQueryResponse response) + public void write( + OutputStream out, SolrQueryRequest request, SolrQueryResponse response, String contentType) throws IOException { -WriterImpl sw = new WriterImpl(jsonfactory, out, request, response); +out = new NonFlushingStream(out); +// resolve the encoding Review Comment: this charSet check logic is new. Since Jackson can specify the encoding and since the contentType (with possibly the encoding) is now passed in (new), made sense to choose the right one. ## solr/core/src/java/org/apache/solr/response/TextQueryResponseWriter.java: ## @@ -81,7 +83,8 @@ private static Writer buildWriter(OutputStream outputStream, String charset) * * See SOLR-8669. */ - private static class NonFlushingStream extends OutputStream { + // nocommit discuss moving to SolrDispatchFilter wrapper. 
If keep them move? Review Comment: This is an idea I had for doing this better. Basically, don't we want Solr to always prevent a flush, rather than only here, and only sometimes, for some QueryResponseWriters? If so, it can easily go on the close shield thing (see elsewhere in this PR for how). Then we can drop this NonFlushingStream thing and not worry about whether we should use it in other scenarios. CC @magibney wonder if you have thoughts on this one ## solr/core/src/test/org/apache/solr/request/TestWriterPerf.java: ## @@ -179,18 +175,9 @@ void doPerf(String writerName, SolrQueryRequest req, int encIter, int decIter) t System.gc(); RTimer timer = new RTimer(); for (int i = 0; i < encIter; i++) { - if (w instanceof BinaryQueryResponseWriter binWriter) { -out = new ByteArrayOutputStream(); -binWriter.write(out, req, rsp); -out.close(); - } else { -out = new ByteArrayOutputStream(); -// to be fair, from my previous tests, much of the performance will be sucked up -// by java's UTF-8 encoding/decoding, not the actual writing -Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8); Review Comment: I dropped this branch because in normal use we'll write to an OutputStream, not a Writer. 
## solr/core/src/java/org/apache/solr/servlet/ServletUtils.java: ## @@ -125,6 +125,12 @@ public void close() { : CLOSE_STREAM_MSG; stream = ClosedServletOutputStream.CLOSED_SERVLET_OUTPUT_STREAM; } + + @Override + public void flush() throws IOException { +// nocommit discuss Review Comment: And here's the front door of Solr; super simple to prevent flushes here too ## solr/core/src/java/org/apache/solr/response/BinaryResponseWriter.java: ## @@ -59,12 +60,6 @@ public void write(OutputStream out, SolrQueryRequest req, SolrQueryResponse resp } } - @Override - public void write(Writer writer, SolrQueryRequest request, SolrQueryResponse response) - throws IOException { -throw new RuntimeException("This is a binary writer , Cannot write to a characterstream"); Review Comment: The existence of this was a code-smell that the hierarchy was wrong; promoting my efforts here ## solr/core/src/java/org/apache/solr/response/QueryResponseWriter.java: ## @@ -39,23 +41,26 @@ * A single instance of any registered QueryResponseWriter is created via the default constructor * and is reused for all relevant queries. */ +@ThreadSafe public interface QueryResponseWriter extends NamedListInitializedPlugin { public static String CONTENT_TYPE_XML_UTF8 = "application/xml; charset=UTF-8"; public static String CONTENT_TYPE_TEX
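The flush-suppression idea raised in the review comments above (on NonFlushingStream and the `flush()` override in ServletUtils) can be illustrated with a small, standalone sketch. This is not the code from the PR: the class name `FlushSuppressingStream`, its counter, and the `demo()` helper are hypothetical names made up for illustration, and the sketch only assumes standard `java.io` behavior.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical wrapper (not Solr's class): forwards writes, turns flush() into a no-op. */
class FlushSuppressingStream extends FilterOutputStream {
  private int suppressedFlushes = 0;

  FlushSuppressingStream(OutputStream delegate) {
    super(delegate);
  }

  @Override
  public void write(byte[] b, int off, int len) throws IOException {
    // Delegate the bulk write directly; FilterOutputStream would otherwise write byte by byte.
    out.write(b, off, len);
  }

  @Override
  public void flush() {
    // Intentionally do nothing: whoever owns the underlying stream decides when to flush.
    suppressedFlushes++;
  }

  int suppressedFlushes() {
    return suppressedFlushes;
  }

  /** Usage sketch: returns {bytes that reached the sink, number of flushes swallowed}. */
  static int[] demo() {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    FlushSuppressingStream stream = new FlushSuppressingStream(sink);
    try {
      stream.write(new byte[] {1, 2, 3});
      stream.flush();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new int[] {sink.size(), stream.suppressedFlushes()};
  }
}
```

Placing such a wrapper once at the outermost stream, as the comment on ServletUtils suggests, would mean individual response writers no longer need to decide whether to wrap their own output.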
[PR] GitHub Action: precommit: --continue [solr]
dsmiley opened a new pull request, #3210: URL: https://github.com/apache/solr/pull/3210 We'd like to see all errors found when the "precommit" GitHub Action runs, not just the first error. note: I'll revert the intentionally bad commit that's there to validate it works -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17682: QueryResponseWriter hierarchy refactor [solr]
dsmiley commented on code in PR #3209: URL: https://github.com/apache/solr/pull/3209#discussion_r1969009364 ## solr/core/src/java/org/apache/solr/response/PrometheusResponseWriter.java: ## @@ -47,12 +47,15 @@ * org.apache.solr.handler.admin.MetricsHandler} */ @SuppressWarnings(value = "unchecked") -public class PrometheusResponseWriter extends RawResponseWriter { Review Comment: @mlbiscoc why did you choose RawResponseWriter? It's a very special QueryResponseWriter and I don't see that this response writer utilizes it.
[PR] SOLR-17682: QueryResponseWriter hierarchy refactor [solr]
dsmiley opened a new pull request, #3209: URL: https://github.com/apache/solr/pull/3209

* QRW now does write(OutputStream) and NOT write(Writer).
* BinaryQueryResponseWriter is removed
* QRW.writeToString for tests
* TextQueryResponseWriter added as base for all text QRWs
* QueryResponseWriterUtil is removed

https://issues.apache.org/jira/browse/SOLR-17682
[jira] [Updated] (SOLR-17682) Refactor QueryResponseWriter hierarchy to put binary at the base and add TextQueryResponseWriter sub
[ https://issues.apache.org/jira/browse/SOLR-17682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated SOLR-17682: -- Labels: pull-request-available (was: ) > Refactor QueryResponseWriter hierarchy to put binary at the base and add > TextQueryResponseWriter sub > > > Key: SOLR-17682 > URL: https://issues.apache.org/jira/browse/SOLR-17682 > Project: Solr > Issue Type: Improvement >Reporter: David Smiley >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The QueryResponseWriter hierarchy should be inverted. Instead of Writer/Text > being at the base with a subclass (BinaryResponseWriter) doing > OutputStream/Binary, it should be inverted. QueryResponseWriter should have > write(OutputStream,...) and there should be a subclass/interface > TextResponseWriter for the textual formats. Once this is done, there are > some awkward methods that do casting (a code smell) that will instead be > simplified. There will be no use for QueryResponseWriterUtil. This is all > best shown in a PR to see why it's better. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
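The inversion described in the issue above (binary/OutputStream at the base, text as a specialization) can be sketched in a few lines. This is only an illustration of the shape, not Solr's actual interfaces: `ByteResponseWriter`, `TextResponseWriter`, `UpperCaseTextWriter`, and the simplified `String payload` parameter are all hypothetical stand-ins, and the real methods take request and response objects.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Hypothetical base type: every writer can emit bytes to an OutputStream. */
interface ByteResponseWriter {
  void write(OutputStream out, String payload) throws IOException;

  /** Test convenience, loosely modeled on the writeToString idea from the PR description. */
  default String writeToString(String payload) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try {
      write(buffer, payload);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toString(StandardCharsets.UTF_8);
  }
}

/** Hypothetical text specialization: implementors only deal with a Writer. */
interface TextResponseWriter extends ByteResponseWriter {
  void write(Writer writer, String payload) throws IOException;

  @Override
  default void write(OutputStream out, String payload) throws IOException {
    // Bridge bytes to chars once, here, so no caller has to cast or branch on the writer type.
    Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
    write(writer, payload);
    writer.flush(); // push encoded chars through; the caller still owns and closes the stream
  }
}

/** Toy text format used only to exercise the hierarchy. */
class UpperCaseTextWriter implements TextResponseWriter {
  @Override
  public void write(Writer writer, String payload) throws IOException {
    writer.write(payload.toUpperCase(Locale.ROOT));
  }
}
```

With this shape a binary writer implements only the OutputStream method, so it never needs a Writer overload that throws, which is the code smell the issue calls out.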
Re: [PR] SOLR-17669: Reduce Memory Consumption by 80-90% when using Dynamic fields (DocumentObjectBinder) [solr]
dsmiley commented on PR #3179: URL: https://github.com/apache/solr/pull/3179#issuecomment-2680779168 I tried to move the CHANGES.txt entry below to Optimizations but I don't have permissions to push to your fork of Solr -- [see this](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork) (which is also in our PR checklist).
Re: [PR] SOLR-16903: Switch CoreContainer#getSolrHome to return Path instead of String [solr]
dsmiley merged PR #3204: URL: https://github.com/apache/solr/pull/3204
[jira] [Commented] (SOLR-16903) Use Path instead of File in Java Code
[ https://issues.apache.org/jira/browse/SOLR-16903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930070#comment-17930070 ] ASF subversion and git services commented on SOLR-16903: Commit 53fe9cddc7b309f03ba92926bc02e6af100cdf1a in solr's branch refs/heads/main from Andrey Bozhko [ https://gitbox.apache.org/repos/asf?p=solr.git;h=53fe9cddc7b ] SOLR-16903: Switch CoreContainer.getSolrHome to return Path instead of String (#3204) > Use Path instead of File in Java Code > - > > Key: SOLR-16903 > URL: https://issues.apache.org/jira/browse/SOLR-16903 > Project: Solr > Issue Type: Improvement >Affects Versions: 9.3 >Reporter: Eric Pugh >Priority: Minor > Labels: newdev, pull-request-available > Fix For: main (10.0) > > Time Spent: 6h 10m > Remaining Estimate: 0h > > As a community, we have decided to migrate to using java.nio.file.Path in > place of java.io.File in our codebase. > This ticket is to go through the codebase and make that change. We'd also > like to add the java.io.File pattern to our forbidden-apis setup.
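As a rough illustration of the File-to-Path migration this ticket describes, the sketch below shows the kind of change involved. The class `SolrHomePaths` and both helper methods are hypothetical and do not appear in Solr; only `java.nio.file.Path` and `java.io.File` calls from the JDK are used.

```java
import java.io.File;
import java.nio.file.Path;

/** Hypothetical helpers illustrating Path-based code; not taken from the Solr codebase. */
class SolrHomePaths {
  /** Resolve a core directory under a home directory without string concatenation. */
  static Path coreDir(Path solrHome, String coreName) {
    return solrHome.resolve(coreName).normalize();
  }

  /** Bridge for legacy callers that still hold a java.io.File. */
  static Path fromLegacy(File legacy) {
    return legacy.toPath();
  }
}
```

Returning `Path` from an accessor, as the merged commit does for `getSolrHome`, lets callers use `resolve` directly instead of rebuilding a path from a `String`.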
Re: [PR] SOLR-16391: Convert "modify-coll" API to JAX-RS [solr]
gerlowskija commented on PR #2930: URL: https://github.com/apache/solr/pull/2930#issuecomment-2679415513 > It has zero consequences for any caller. It's not really about consequences, it's about being consistent/intuitive. Even users who know about the `PATCH` verb and its semantics will probably reach for `PUT` if they're already used to using it on update-field, and update-config, and update-collprop, and update-configset-file, and update-clusterprop, and update-filestore-entry, and update-clusterprop, and update-noderole, and so on. But like I said - this isn't a hill I'm willing to die on: if anyone else votes for PATCH to tilt the cumulative consensus, or we find another API where it'd work, I'm happy to go that route. In the meantime I'm going to try out the "raw-Map" approach and follow up on SOLR-13271 regarding whether we need to support the arbitrary properties now that we have "Collection Props" as a more formalized thing.
Re: [PR] SOLR-16391: Convert create-core, core-status to JAX-RS [solr]
HoustonPutman commented on PR #3054: URL: https://github.com/apache/solr/pull/3054#issuecomment-2679306623 Hey @gerlowskija this is causing a few errors. The first I've found is that the `getCoreStatus()` API takes a `Boolean`, but then dereferences it to a `boolean`. So requests that don't provide the option get a NullPointerException for `indexInfo` at line 92 of `CoreStatus`.
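The failure mode described above is ordinary Java auto-unboxing: assigning a null `Boolean` to a `boolean` throws. A minimal sketch follows; the class `CoreStatusFlags` and its methods are hypothetical, and the choice of default value is an assumption, since the comment does not say what `indexInfo` should default to.

```java
/** Hypothetical illustration of the unboxing bug and one null-safe alternative. */
class CoreStatusFlags {
  /** Mirrors the reported bug: auto-unboxing a null Boolean throws NullPointerException. */
  static boolean unsafe(Boolean indexInfo) {
    return indexInfo; // implicit indexInfo.booleanValue()
  }

  /** Null-safe variant: an omitted parameter falls back to the supplied default. */
  static boolean orDefault(Boolean indexInfo, boolean defaultValue) {
    return indexInfo != null ? indexInfo : defaultValue;
  }
}
```

Calling `CoreStatusFlags.orDefault(null, true)` returns the default instead of throwing, which is the behavior a request that omits the option would need.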
[jira] [Resolved] (SOLR-3954) Option to have updateHandler and DIH skip updateLog
[ https://issues.apache.org/jira/browse/SOLR-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Pugh resolved SOLR-3954. - Resolution: Won't Fix DIH has moved to https://github.com/SearchScale/dataimporthandler > Option to have updateHandler and DIH skip updateLog > --- > > Key: SOLR-3954 > URL: https://issues.apache.org/jira/browse/SOLR-3954 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 4.0 >Reporter: Shawn Heisey >Priority: Major > Fix For: 6.0, 4.9 > > > The updateLog feature makes updates take longer, likely because of the I/O > time required to write the additional information to disk. It may take as > much as three times as long for the indexing portion of the process. I'm not > sure whether it affects the time to commit, but I would imagine that the > difference there is small or zero. When doing incremental updates/deletes on > an existing index, the time lag is probably very small and unimportant. > When doing a full reindex (which may happen via DIH), especially if this is > done in a build core that is then swapped with a live core, this performance > hit is unacceptable. It seems to make the import take about three times as > long. > An option to have an update skip the updateLog would be very useful for these > situations. It should have a method in SolrJ and be exposed in DIH as well.
Re: [PR] SOLR-17518: Deprecate UpdateRequest.getXml() and replace it with XMLRequestWriter [solr]
psalagnac commented on code in PR #3200: URL: https://github.com/apache/solr/pull/3200#discussion_r1968195273 ## solr/solrj/src/java/org/apache/solr/client/solrj/request/UpdateRequest.java: ## @@ -368,147 +368,39 @@ public void setDeleteQuery(List deleteQuery) { // -- // -- + /** + * @deprecated Method will be removed in Solr 10.0. Use {@link XMLRequestWriter} instead. + */ + @Deprecated(since = "9.9") @Override public Collection getContentStreams() throws IOException { return ClientUtils.toContentStreams(getXML(), ClientUtils.TEXT_XML); } + /** + * @deprecated Method will be removed in Solr 10.0. Use {@link XMLRequestWriter} instead. + */ + @Deprecated(since = "9.9") public String getXML() throws IOException { StringWriter writer = new StringWriter(); writeXML(writer); writer.flush(); // If action is COMMIT or OPTIMIZE, it is sent with params String xml = writer.toString(); -// System.out.println( "SEND:"+xml ); return (xml.length() > 0) ? xml : null; } - private List>> getDocLists( - Map> documents) { -List>> docLists = new ArrayList<>(); -Map> docList = null; -if (this.documents != null) { - - Boolean lastOverwrite = true; - Integer lastCommitWithin = -1; - - Set>> entries = this.documents.entrySet(); - for (Entry> entry : entries) { -Map map = entry.getValue(); -Boolean overwrite = null; -Integer commitWithin = null; -if (map != null) { - overwrite = (Boolean) entry.getValue().get(OVERWRITE); - commitWithin = (Integer) entry.getValue().get(COMMIT_WITHIN); -} -if (!Objects.equals(overwrite, lastOverwrite) -|| !Objects.equals(commitWithin, lastCommitWithin) -|| docLists.isEmpty()) { - docList = new LinkedHashMap<>(); - docLists.add(docList); -} -docList.put(entry.getKey(), entry.getValue()); -lastCommitWithin = commitWithin; -lastOverwrite = overwrite; - } -} - -if (docIterator != null) { - docList = new LinkedHashMap<>(); - docLists.add(docList); - while (docIterator.hasNext()) { -SolrInputDocument doc = docIterator.next(); -if (doc != null) { - docList.put(doc, 
null); -} - } -} - -return docLists; - } - /** - * @since solr 1.4 + * @deprecated Method will be removed in Solr 10.0. Use {@link XMLRequestWriter} instead. */ + @Deprecated(since = "9.9") public UpdateRequest writeXML(Writer writer) throws IOException { -List>> getDocLists = getDocLists(documents); - -for (Map> docs : getDocLists) { - - if ((docs != null && docs.size() > 0)) { -Entry> firstDoc = docs.entrySet().iterator().next(); -Map map = firstDoc.getValue(); -Integer cw = null; -Boolean ow = null; -if (map != null) { - cw = (Integer) firstDoc.getValue().get(COMMIT_WITHIN); - ow = (Boolean) firstDoc.getValue().get(OVERWRITE); -} -if (ow == null) ow = true; -int commitWithin = (cw != null && cw != -1) ? cw : this.commitWithin; -boolean overwrite = ow; -if (commitWithin > -1 || overwrite != true) { - writer.write( - ""); -} else { - writer.write(""); -} - -Set>> entries = docs.entrySet(); -for (Entry> entry : entries) { - ClientUtils.writeXML(entry.getKey(), writer); -} - -writer.write(""); - } -} - -// Add the delete commands -boolean deleteI = deleteById != null && deleteById.size() > 0; -boolean deleteQ = deleteQuery != null && deleteQuery.size() > 0; -if (deleteI || deleteQ) { - if (commitWithin > 0) { -writer.append(""); - } else { -writer.append(""); - } - if (deleteI) { -for (Map.Entry> entry : deleteById.entrySet()) { - writer.append(" map = entry.getValue(); - if (map != null) { -Long version = (Long) map.get(VER); -String route = (String) map.get(_ROUTE_); -if (version != null) { - writer.append(" version=\"").append(String.valueOf(version)).append('"'); -} - -if (route != null) { - writer.append(" _route_=\"").append(route).append('"'); -} - } - writer.append(">"); - - XML.escapeCharData(entry.getKey(), writer); - writer.append(""); -} - } - if (deleteQ) { -for (String q : deleteQuery) { - writer.append(""); - XML.escapeCharData(q, writer); - writer.append(""); -} - } - writer.append(""); -} +XMLRequestWriter requestWriter = new XM
Re: [PR] SOLR-16391: Convert create-core, core-status to JAX-RS [solr]
HoustonPutman commented on PR #3054: URL: https://github.com/apache/solr/pull/3054#issuecomment-2679559395

Also `TestReplicationHandler.doTestIndexAndConfigAliasReplication`:

```
TestReplicationHandler > doTestIndexAndConfigAliasReplication FAILED
    java.lang.ClassCastException: class java.util.LinkedHashMap cannot be cast to class org.apache.solr.common.util.NamedList (java.util.LinkedHashMap is in module java.base of loader 'bootstrap'; org.apache.solr.common.util.NamedList is in unnamed module of loader 'app')
        at __randomizedtesting.SeedInfo.seed([AD42828812B09E3B:5A316CD0D45831DD]:0)
        at org.apache.solr.handler.TestReplicationHandler.watchCoreStartAt(TestReplicationHandler.java:1688)
        at org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigAliasReplication(TestReplicationHandler.java:1360)
```

We either need to fix the return type or fix the test to use the new type.
[jira] [Commented] (SOLR-13271) Implement a read-only mode for a collection
[ https://issues.apache.org/jira/browse/SOLR-13271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929897#comment-17929897 ] Jason Gerlowski commented on SOLR-13271: I noticed recently that MODIFYCOLLECTION allows users to set arbitrary properties for a collection, that are distinct from that collection's "[collection properties|https://solr.apache.org/guide/solr/latest/deployment-guide/collection-management.html#collectionprop]". As far as I can tell from a bit of code spelunking, that came about as a part of this ticket. The first draft/patch here added {{readOnly}} support by adding support for arbitrary properties more generally. Later revisions pivoted to handling {{readOnly}} as a "first-class" property, similar to replicationFactor, collection.configName, etc, but the plumbing for arbitrary props stuck around into the final commit and wasn't commented on (afaict). ("Collection props" had been around for about a year at the time this ticket was merged, having been previously added by [SOLR-11960|https://issues.apache.org/jira/browse/SOLR-11960]...but they didn't have a ton of use at that point, which maybe explains why they never came up as an option here?) Anyway, my question at this point: is there a reason to keep the arbitrary-prop support in MODIFYCOLLECTION, or is there a reason to support these two different avenues of prop-setting? Absent any objections or corrections here, I'll file a separate ticket for that and discussion can continue there... > Implement a read-only mode for a collection > --- > > Key: SOLR-13271 > URL: https://issues.apache.org/jira/browse/SOLR-13271 > Project: Solr > Issue Type: New Feature >Affects Versions: 8.0, 9.0 >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Fix For: 8.1, 9.0 > > Attachments: SOLR-13271.patch, SOLR-13271.patch > > > Spin-off from SOLR-11127. In some scenarios it's useful to be able to block > any index updates to a collection, while still being able to search its > contents. > Currently the scope of this issue is SolrCloud, ie. standalone Solr will not > be supported.
Re: [PR] SOLR-17650: Fix tests for unordered buffered updates [solr]
HoustonPutman commented on code in PR #3197: URL: https://github.com/apache/solr/pull/3197#discussion_r1968316944 ## solr/test-framework/src/java/org/apache/solr/SolrTestCaseJ4.java: ## @@ -1027,6 +1028,64 @@ public static String assertJQ(SolrQueryRequest req, double delta, String... test } } + public static String assertThatJQ(SolrQueryRequest req, Matcher test) throws Exception { +return assertThatJQ(req, "", test); + } + + /** + * Validates a query completes and, using JSON deserialization, returns an object that passes the + * given Matcher test. + * + * Please use this with care: this makes it easy to match complete structures, but doing so can + * result in fragile tests if you are matching more than what you want to test. + * + * @param req Solr request to execute + * @param message Failure message for test + * @param test Matcher for the given object returned from deserializing the response + * @return The request response as a JSON String if the test matcher passes + */ + @SuppressWarnings("unchecked") + public static String assertThatJQ(SolrQueryRequest req, String message, Matcher test) Review Comment: I thought about this, but the entire test is set up to use the test harness, so I wanted to implement this test to fit in with the surrounding tests as best as possible. Adding one more convenience method to the test harness to fit the existing workflow I think is pretty minor compared to starting a whole new way for this test to behave. Deprecating the TestHarness (and actually removing it) is going to be a MAJOR ordeal, and honestly IMO changing a test to 1/2 use the test harness and 1/2 use something else will probably make that harder in the end.
Re: [PR] SOLR-17518: Deprecate UpdateRequest.getXml() and replace it with XMLRequestWriter [solr]
psalagnac commented on code in PR #3200: URL: https://github.com/apache/solr/pull/3200#discussion_r1968216711 ## solr/solrj/src/java/org/apache/solr/client/solrj/impl/XMLRequestWriter.java: ## @@ -0,0 +1,216 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.solr.client.solrj.impl; + +import java.io.BufferedWriter; +import java.io.IOException; +import java.io.OutputStream; +import java.io.OutputStreamWriter; +import java.io.Writer; +import java.nio.charset.StandardCharsets; +import java.util.ArrayList; +import java.util.Collection; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.Objects; +import java.util.Set; +import org.apache.solr.client.solrj.SolrRequest; +import org.apache.solr.client.solrj.request.RequestWriter; +import org.apache.solr.client.solrj.request.UpdateRequest; +import org.apache.solr.client.solrj.util.ClientUtils; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.params.ShardParams; +import org.apache.solr.common.util.ContentStream; +import org.apache.solr.common.util.XML; + +public class XMLRequestWriter extends RequestWriter { + + /** + * Use this to do a push writing instead of pull. If this method returns null {@link Review Comment: That's not new code, not sure on history. This javadoc is also in base class `ContentWriter`. I agree it does not make sense... I'll update it with this PR.
Re: [PR] SOLR-17518: Deprecate UpdateRequest.getXml() and replace it with XMLRequestWriter [solr]
epugh commented on code in PR #3200: URL: https://github.com/apache/solr/pull/3200#discussion_r1968218397 ## solr/solrj/src/java/org/apache/solr/client/solrj/impl/XMLRequestWriter.java: ## @@ -0,0 +1,216 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.solr.client.solrj.impl; + +import java.io.BufferedWriter; +import java.io.IOException; +import java.io.OutputStream; +import java.io.OutputStreamWriter; +import java.io.Writer; +import java.nio.charset.StandardCharsets; +import java.util.ArrayList; +import java.util.Collection; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.Objects; +import java.util.Set; +import org.apache.solr.client.solrj.SolrRequest; +import org.apache.solr.client.solrj.request.RequestWriter; +import org.apache.solr.client.solrj.request.UpdateRequest; +import org.apache.solr.client.solrj.util.ClientUtils; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.params.ShardParams; +import org.apache.solr.common.util.ContentStream; +import org.apache.solr.common.util.XML; + +public class XMLRequestWriter extends RequestWriter { + + /** + * Use this to do a push writing instead of pull. If this method returns null {@link Review Comment: awesome!
Re: [PR] SOLR-17518: Deprecate UpdateRequest.getXml() and replace it with XMLRequestWriter [solr]
psalagnac commented on code in PR #3200: URL: https://github.com/apache/solr/pull/3200#discussion_r1968216711 ## solr/solrj/src/java/org/apache/solr/client/solrj/impl/XMLRequestWriter.java: ## @@ -0,0 +1,216 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.solr.client.solrj.impl; + +import java.io.BufferedWriter; +import java.io.IOException; +import java.io.OutputStream; +import java.io.OutputStreamWriter; +import java.io.Writer; +import java.nio.charset.StandardCharsets; +import java.util.ArrayList; +import java.util.Collection; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.Objects; +import java.util.Set; +import org.apache.solr.client.solrj.SolrRequest; +import org.apache.solr.client.solrj.request.RequestWriter; +import org.apache.solr.client.solrj.request.UpdateRequest; +import org.apache.solr.client.solrj.util.ClientUtils; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.params.ShardParams; +import org.apache.solr.common.util.ContentStream; +import org.apache.solr.common.util.XML; + +public class XMLRequestWriter extends RequestWriter { + + /** + * Use this to do a push writing instead of pull. If this method returns null {@link Review Comment: That's not new code, not sure on history. This javadoc is also in base class `RequestWriter`. I agree it does not make sense... I'll update it with this PR.
Re: [PR] SOLR-17518: Deprecate UpdateRequest.getXml() and replace it with XMLRequestWriter [solr]
psalagnac commented on code in PR #3200: URL: https://github.com/apache/solr/pull/3200#discussion_r1968216711 ## solr/solrj/src/java/org/apache/solr/client/solrj/impl/XMLRequestWriter.java: ## @@ -0,0 +1,216 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.solr.client.solrj.impl; + +import java.io.BufferedWriter; +import java.io.IOException; +import java.io.OutputStream; +import java.io.OutputStreamWriter; +import java.io.Writer; +import java.nio.charset.StandardCharsets; +import java.util.ArrayList; +import java.util.Collection; +import java.util.Iterator; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.Objects; +import java.util.Set; +import org.apache.solr.client.solrj.SolrRequest; +import org.apache.solr.client.solrj.request.RequestWriter; +import org.apache.solr.client.solrj.request.UpdateRequest; +import org.apache.solr.client.solrj.util.ClientUtils; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.params.ShardParams; +import org.apache.solr.common.util.ContentStream; +import org.apache.solr.common.util.XML; + +public class XMLRequestWriter extends RequestWriter { + + /** + * Use this to do a push writing instead of pull. If this method returns null {@link Review Comment: That's not new code, not sure on history. This javadoc is also in base class `RequestWriter`. I agree it does not make sense... I'll update it with this PR.
Re: [PR] SOLR-17518: Deprecate UpdateRequest.getXml() and replace it with XMLRequestWriter [solr]
psalagnac commented on PR #3200: URL: https://github.com/apache/solr/pull/3200#issuecomment-2679343190 > I assume you moved some things around but didn't really write code here (i.e. all code, variable names was the choice of original authors)? Yes, just moved things around.
Re: [PR] SOLR-17023: Use Modern NLP Models via ONNX and Apache OpenNLP with Solr [solr]
epugh commented on PR #1999: URL: https://github.com/apache/solr/pull/1999#issuecomment-2679572157 Yep, need to wait for Lucene 10, otherwise we get some unit test failures:
```
gradlew test --tests TestOpenNLPExtractNamedEntitiesUpdateProcessorFactory.testExtractFieldRegexReplaceAll -Dtests.seed=F5DD0B40AC590A66 -Dtests.locale=pt-GW -Dtests.timezone=PRT -Dtests.asserts=true -Dtests.file.encoding=UTF-8
```
```
> java.lang.NoSuchMethodError: 'opennlp.tools.util.Span[] opennlp.tools.sentdetect.SentenceDetectorME.sentPosDetect(java.lang.String)'
> at __randomizedtesting.SeedInfo.seed([F5DD0B40AC590A66:4351D7F53AC9F7A4]:0)
> at org.apache.lucene.analysis.opennlp.tools.NLPSentenceDetectorOp.splitSentences(NLPSentenceDetectorOp.java:41)
> at org.apache.lucene.analysis.opennlp.OpenNLPSentenceBreakIterator.setText(OpenNLPSentenceBreakIterator.java:199)
> at org.apache.lucene.analysis.util.SegmentingTokenizerBase.reset(SegmentingTokenizerBase.java:89)
```
Re: [PR] Finally fix TestCoordinatorRole for good. [solr]
HoustonPutman merged PR #3205: URL: https://github.com/apache/solr/pull/3205
[jira] [Resolved] (SOLR-14202) UpdateProcessor/also in DIH with a ScriptTransformer that does Atomic Updates leaks searchers
[ https://issues.apache.org/jira/browse/SOLR-14202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Pugh resolved SOLR-14202. -- Resolution: Won't Fix DIH has moved to https://github.com/SearchScale/dataimporthandler > UpdateProcessor/also in DIH with a ScriptTransformer that does Atomic Updates > leaks searchers > - > > Key: SOLR-14202 > URL: https://issues.apache.org/jira/browse/SOLR-14202 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 8.3, 8.4 >Reporter: Jörn Franke >Priority: Major > Attachments: eoe.zip, eoedihleak.zip > > > The data directory of a collection is growing and growing. It seems that old > segments are not deleted. They are only deleting during start of Solr. > How to reproduce. Have any collection (e.g. the example collection) and start > indexing documents. Even during the indexing the data directory is growing > significantly - much more than expected (several magnitudes). if certain > documents are updated (without significantly increasing the amount of data) > the index data directory grows again several magnitudes. Even for small > collections the needed space explodes. > This reduces significantly if Solr is stopped and then started. During > startup (not shutdown) Solr purges all those segments if not needed (* > sometimes some but not a significant amount is deleted during shutdown). This > is of course not a good workaround for normal operations. > It does not seem to have a affect on queries (their performance do not seem > to change). > The configs have not changed before the upgrade and after (e.g. from Solr 8.2 > to 8.3 to 8.4, not cross major versions), so I assume it could be related to > Solr 8.4. It may have been also in Solr 8.3 (not sure), but not in 8.2. > > IndexConfig is pretty much default: Lock type: native, autoCommit: 15000, > openSearcher=false, autoSoftCommit -1 (reproducible with autoCommit 5000). 
> Nevertheless, it did not happen in previous versions of Solr and the config > did not change. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Resolved] (SOLR-4241) Add object to SolrJ for interpreting DIH status
[ https://issues.apache.org/jira/browse/SOLR-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Pugh resolved SOLR-4241. - Resolution: Won't Fix DIH has moved to https://github.com/SearchScale/dataimporthandler > Add object to SolrJ for interpreting DIH status > --- > > Key: SOLR-4241 > URL: https://issues.apache.org/jira/browse/SOLR-4241 > Project: Solr > Issue Type: Improvement > Components: clients - java, SolrJ >Reporter: Shawn Heisey >Priority: Major > Fix For: 6.0, 4.9 > > > Objects exist in SolrJ for easy interpretation of special handlers - > SolrPing/SolrPingResponse is a prime example. I believe it would be a good > idea to add similar capabilities for easily interpreting DIH status. > The only sticky point I can see is the fact that the dataimport handler is a > contrib module. This might mean that this new capability would have to be > separated into a small jar file in a solrj contrib section.
[jira] [Created] (SOLR-17682) Refactor QueryResponseWriter hierarchy to put binary at the base and add TextQueryResponseWriter sub
David Smiley created SOLR-17682: --- Summary: Refactor QueryResponseWriter hierarchy to put binary at the base and add TextQueryResponseWriter sub Key: SOLR-17682 URL: https://issues.apache.org/jira/browse/SOLR-17682 Project: Solr Issue Type: Improvement Reporter: David Smiley The QueryResponseWriter hierarchy should be inverted. Instead of Writer/Text being at the base with a subclass (BinaryResponseWriter) doing OutputStream/Binary, it should be inverted. QueryResponseWriter should have write(OutputStream,...) and there should be a subclass/interface TextQueryResponseWriter for the textual formats. Once this is done, there are some awkward methods that do casting (a code smell) that will instead be simplified. There will be no use for QueryResponseWriterUtil. This is all best shown in a PR to see why it's better.
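The inversion described in SOLR-17682 can be sketched in plain Java. This is only a minimal illustration under stated assumptions, not the code from the PR: every type name below (`ResponseWriterSketch`, `TextResponseWriterSketch`, `JsonishWriterSketch`, `RawBytesWriterSketch`, `HierarchyDemo`) is hypothetical, and the `String payload` parameter stands in for Solr's request and response objects. The byte-oriented `write(OutputStream, ...)` sits at the base, and a text sub-interface derives it from a `Writer` method, so callers do not need to cast.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical base type: every writer can emit bytes.
interface ResponseWriterSketch {
  void write(OutputStream out, String payload) throws IOException;
}

// Hypothetical text sub-type: implementors only deal with a Writer;
// the byte-oriented method is derived from the character-oriented one.
interface TextResponseWriterSketch extends ResponseWriterSketch {
  void write(Writer writer, String payload) throws IOException;

  @Override
  default void write(OutputStream out, String payload) throws IOException {
    Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
    write(writer, payload);
    writer.flush(); // push encoded characters to the stream; the caller still owns and closes it
  }
}

// A textual format implements only the Writer variant.
final class JsonishWriterSketch implements TextResponseWriterSketch {
  @Override
  public void write(Writer writer, String payload) throws IOException {
    writer.write("{\"value\":\"" + payload + "\"}");
  }
}

// A binary format implements the base type directly.
final class RawBytesWriterSketch implements ResponseWriterSketch {
  @Override
  public void write(OutputStream out, String payload) throws IOException {
    out.write(payload.getBytes(StandardCharsets.UTF_8));
  }
}

final class HierarchyDemo {
  // The caller sees only the base type: no instanceof check, no cast.
  static String render(ResponseWriterSketch writer, String payload) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
      writer.write(out, payload);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    System.out.println(render(new JsonishWriterSketch(), "ok"));
    System.out.println(render(new RawBytesWriterSketch(), "ok"));
  }
}
```

In this sketch `render` works for both writers through the base interface alone, which is the kind of simplification the issue attributes to removing the casting helpers.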
[jira] [Commented] (SOLR-13731) javabin must support a 1:1 mapping of the JSON update format
[ https://issues.apache.org/jira/browse/SOLR-13731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929615#comment-17929615 ] Noble Paul commented on SOLR-13731: --- Constructing a SolrInputDocument etc is much more complex than streaming a bunch of maps. Yes. users can just construct a payload of a javabin file and post it > javabin must support a 1:1 mapping of the JSON update format > - > > Key: SOLR-13731 > URL: https://issues.apache.org/jira/browse/SOLR-13731 > Project: Solr > Issue Type: Task >Reporter: Noble Paul >Assignee: Noble Paul >Priority: Major > Fix For: 8.4 > > Time Spent: 20m > Remaining Estimate: 0h > > Objects like SolrInputDocument is serialized in such a way that the size is > known in advance. All objects should ideally support streaming friendly types. > This is backward compatible . basically javabin will continue to serialize > using the old format , but will accept more efficient formats as input
Re: [PR] SOLR-17677: HashRangeQuery doesn't NEED SolrIndexSearcher [solr]
gerlowskija commented on code in PR #3206: URL: https://github.com/apache/solr/pull/3206#discussion_r1967557670 ## solr/core/src/java/org/apache/solr/search/join/HashRangeQuery.java: ## @@ -91,6 +96,9 @@ private int[] getCache(LeafReaderContext context) throws IOException { if (cacheHelper == null) { return null; } +if (!(searcher instanceof SolrIndexSearcher)) { // e.g. delete-by-query Review Comment: [Q] How important is the caching, do you know? Is it enough of a change that it's worth documenting the degraded performance when run as a DBQ? (Not implying it is, just asking from a due-diligence perspective.)
[jira] [Created] (SOLR-17681) Text to Vector Filter for Indexing
Yugank created SOLR-17681: - Summary: Text to Vector Filter for Indexing Key: SOLR-17681 URL: https://issues.apache.org/jira/browse/SOLR-17681 Project: Solr Issue Type: New Feature Reporter: Yugank The scope of this issue is to introduce support for automatic text vectorisation in Apache Solr, directly in the Analyzer chain during indexing. Since Solr already has the Text to Vector Query Parser capability, thanks to Alessandro Benedetti, we should start looking at a Filter that can do the encoding using the same model we have uploaded for the query parser. This would make Solr self-sufficient, without the use of an external LLM service for encoding, even at index time.
Re: [PR] SOLR-16903: Switch CoreContainer#getSolrHome to return Path instead of String [solr]
AndreyBozhko commented on code in PR #3204: URL: https://github.com/apache/solr/pull/3204#discussion_r1967834442 ## solr/core/src/java/org/apache/solr/handler/admin/SystemInfoHandler.java: ## @@ -145,7 +145,7 @@ public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throw rsp.add("zkHost", getCoreContainer(req).getZkController().getZkServerAddress()); } if (cc != null) { - rsp.add("solr_home", cc.getSolrHome()); + rsp.add("solr_home", cc.getSolrHome().toString()); Review Comment: Makes sense - the JavaBinCodec looks to be doing the same thing. https://github.com/apache/solr/blob/76c09a35dba42913a6bcb281b52b00f87564624a/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L411-L413
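The `toString()` call in this diff keeps a `java.nio.file.Path` object out of the response values once `getSolrHome()` returns a `Path`. A minimal sketch of that pattern follows, assuming a plain `Map` as a stand-in for `SolrQueryResponse`; the class and method names (`SolrHomeResponseSketch`, `describe`) are hypothetical and not Solr APIs.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a handler that reports the Solr home directory.
final class SolrHomeResponseSketch {
  // Converts the Path to a String at the boundary, so whatever serializes the
  // map later only ever sees a plain String value.
  static Map<String, Object> describe(Path solrHome) {
    Map<String, Object> rsp = new LinkedHashMap<>();
    rsp.put("solr_home", solrHome.toString());
    return rsp;
  }

  public static void main(String[] args) {
    // "var/solr" is an arbitrary example path, not a real default.
    System.out.println(describe(Paths.get("var", "solr")));
  }
}
```

Converting at the point where the value enters the response keeps `Path` as the internal type while leaving the serialized form unchanged for clients.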
Re: [PR] SOLR-17677: HashRangeQuery doesn't NEED SolrIndexSearcher [solr]
dsmiley merged PR #3206: URL: https://github.com/apache/solr/pull/3206
Re: [PR] SOLR-17677: HashRangeQuery doesn't NEED SolrIndexSearcher [solr]
dsmiley commented on code in PR #3206: URL: https://github.com/apache/solr/pull/3206#discussion_r1967881072 ## solr/core/src/java/org/apache/solr/search/join/HashRangeQuery.java: ## @@ -91,6 +96,9 @@ private int[] getCache(LeafReaderContext context) throws IOException { if (cacheHelper == null) { return null; } +if (!(searcher instanceof SolrIndexSearcher)) { // e.g. delete-by-query Review Comment: It's telling that the cache this thing uses is completely optional. You have to go out of your way to register it in your solrconfig.xml. So this query definitely doesn't need it, and it's likely people are using this query without this cache given how easy it would be to forget to configure it or even know about it. There's no degraded performance concern since it hasn't been working at all in a DBQ :-)
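The guard discussed in this thread treats the cache as optional: when the searcher is not the richer type, the lookup returns null and the value is simply computed. A minimal sketch of that idea follows, under loud assumptions: `PlainSearcher`, `CachingSearcher` and `HashLookupSketch` are hypothetical stand-ins, not Lucene or Solr classes, and the "hash" is just `String.hashCode()`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a basic searcher with no cache support.
class PlainSearcher {}

// Hypothetical stand-in for the richer searcher that carries an optional cache.
class CachingSearcher extends PlainSearcher {
  private final Map<String, int[]> cache = new HashMap<>();

  int[] lookup(String key) {
    return cache.get(key);
  }

  void store(String key, int[] hashes) {
    cache.put(key, hashes);
  }
}

final class HashLookupSketch {
  // Returns the cached value, or null when no cache is available.
  static int[] getCache(PlainSearcher searcher, String key) {
    if (!(searcher instanceof CachingSearcher)) {
      return null; // e.g. a plain searcher, as on the delete-by-query path
    }
    return ((CachingSearcher) searcher).lookup(key);
  }

  // Works with either searcher type; the cache only saves recomputation.
  static int[] hashes(PlainSearcher searcher, String key) {
    int[] cached = getCache(searcher, key);
    if (cached != null) {
      return cached;
    }
    int[] computed = {key.hashCode()};
    if (searcher instanceof CachingSearcher) {
      ((CachingSearcher) searcher).store(key, computed);
    }
    return computed;
  }
}
```

With a `PlainSearcher` the sketch recomputes on every call instead of throwing a `ClassCastException`; with a `CachingSearcher` the second call for the same key returns the stored array.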
[jira] [Commented] (SOLR-17677) {!join} in delete-by-query throws ClassCastException and closes IndexWriter
[ https://issues.apache.org/jira/browse/SOLR-17677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929815#comment-17929815 ] ASF subversion and git services commented on SOLR-17677: Commit 3a492203cf4d9b7ed9431de9378049992f7da355 in solr's branch refs/heads/main from David Smiley [ https://gitbox.apache.org/repos/asf?p=solr.git;h=3a492203cf4 ] SOLR-17677: HashRangeQuery doesn't NEED SolrIndexSearcher (#3206) > {!join} in delete-by-query throws ClassCastException and closes IndexWriter > --- > > Key: SOLR-17677 > URL: https://issues.apache.org/jira/browse/SOLR-17677 > Project: Solr > Issue Type: Improvement >Affects Versions: 9.8 >Reporter: Jason Gerlowski >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Solr's JoinQuery implementation explicitly casts the provided "IndexSearcher" > to a "SolrIndexSearcher". In most contexts this assumption bears out, but > not always. > One counter-example is Solr's "Delete By Query" codepath, which runs the > deletion query using a "raw" Lucene IndexSearcher. (Presumably this is > because the new searcher has just been opened?). Any DBQ containing a join > query will throw a ClassCastException, which then bubbles up to the > IndexWriter as a "tragic" Lucene exception, force-closing the IndexWriter and > throwing the surrounding SolrCore in to a bad state: > {code} > 2025-02-18 19:39:25.339 ERROR (qtp1426725223-177-localhost-73) > [c:techproducts s:shard2 r:core_node3 x:techproducts_shard2_replica_n1 > t:localhost-73] o.a.s.h.RequestHandlerBase Server exception => > org.apache.solr.common.SolrException: this IndexWriter is closed > at > org.apache.solr.common.SolrException.wrapLuceneTragicExceptionIfNecessary(SolrException.java:218) > org.apache.solr.common.SolrException: this IndexWriter is closed > at > org.apache.solr.common.SolrException.wrapLuceneTragicExceptionIfNecessary(SolrException.java:218) > ~[?:?] 
> at > org.apache.solr.handler.RequestHandlerBase.normalizeReceivedException(RequestHandlerBase.java:272) > ~[?:?] > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:238) > ~[?:?] > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2880) ~[?:?] > at > org.apache.solr.servlet.HttpSolrCall.executeCoreRequest(HttpSolrCall.java:890) > ~[?:?] > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:576) > ~[?:?] > at > org.apache.solr.servlet.SolrDispatchFilter.dispatch(SolrDispatchFilter.java:241) > ~[?:?] > at > org.apache.solr.servlet.SolrDispatchFilter.lambda$doFilterRetry$0(SolrDispatchFilter.java:198) > ~[?:?] > at > org.apache.solr.servlet.ServletUtils.traceHttpRequestExecution2(ServletUtils.java:227) > ~[?:?] > at > org.apache.solr.servlet.ServletUtils.rateLimitRequest(ServletUtils.java:197) > ~[?:?] > at > org.apache.solr.servlet.SolrDispatchFilter.doFilterRetry(SolrDispatchFilter.java:192) > ~[?:?] > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:181) > ~[?:?] > at javax.servlet.http.HttpFilter.doFilter(HttpFilter.java:97) > ~[jetty-servlet-api-4.0.6.jar:?] 
> at > org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:210) > ~[jetty-servlet-10.0.22.jar:10.0.22] > at > org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635) > ~[jetty-servlet-10.0.22.jar:10.0.22] > at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:527) > ~[jetty-servlet-10.0.22.jar:10.0.22] > at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:131) > ~[jetty-server-10.0.22.jar:10.0.22] > at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:598) > ~[jetty-security-10.0.22.jar:10.0.22] > at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122) > ~[jetty-server-10.0.22.jar:10.0.22] > at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223) > ~[jetty-server-10.0.22.jar:10.0.22] > at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1580) > ~[jetty-server-10.0.22.jar:10.0.22] > at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221) > ~[jetty-server-10.0.22.jar:10.0.22] > at > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1384) > ~[jetty-server-10.0.22.jar:10.0.22] > at > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176) > ~[jetty-server-10.0.22.j
Re: [PR] SOLR-16903: Switch CoreContainer#getSolrHome to return Path instead of String [solr]
dsmiley commented on PR #3204: URL: https://github.com/apache/solr/pull/3204#issuecomment-2678884231 I plan to merge this tonight; it's very straightforward
Re: [PR] SOLR-17669: Reduce Memory Consumption by 80-90% when using Dynamic fields (DocumentObjectBinder) [solr]
dsmiley commented on PR #3179: URL: https://github.com/apache/solr/pull/3179#issuecomment-2678890820 I'll merge this tonight. If you don't get to CHANGES.txt; I'll do it. I edited the JIRA issue description to better identify what this is about; it was confusing to speak of "dynamic fields" -- everyone will think you mean the schema itself.
Re: [PR] SOLR-17669: Reduce Memory Consumption by 80-90% when using Dynamic fields (DocumentObjectBinder) [solr]
ds-manzinger commented on PR #3179: URL: https://github.com/apache/solr/pull/3179#issuecomment-2678911271 Hi, I committed changes.txt about 1 hour ago
Re: [PR] SOLR-17023: Use Modern NLP Models via ONNX and Apache OpenNLP with Solr [solr]
epugh commented on code in PR #1999: URL: https://github.com/apache/solr/pull/1999#discussion_r1967948785 ## solr/modules/analysis-extras/src/java/org/apache/solr/update/processor/DocumentCategorizerUpdateProcessorFactory.java: ## @@ -0,0 +1,569 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.solr.update.processor; + +import static org.apache.solr.common.SolrException.ErrorCode.SERVER_ERROR; + +import ai.onnxruntime.OrtException; +import java.io.File; +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.util.ArrayList; +import java.util.Collection; +import java.util.Collections; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.regex.Matcher; +import java.util.regex.Pattern; +import java.util.regex.PatternSyntaxException; +import opennlp.dl.InferenceOptions; +import opennlp.dl.doccat.DocumentCategorizerDL; +import opennlp.dl.doccat.scoring.AverageClassificationScoringStrategy; +import org.apache.solr.common.SolrException; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.SolrInputField; +import org.apache.solr.common.util.NamedList; +import org.apache.solr.common.util.Pair; +import org.apache.solr.core.SolrCore; +import org.apache.solr.filestore.ClusterFileStore; +import org.apache.solr.filestore.FileStore; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.response.SolrQueryResponse; +import org.apache.solr.update.AddUpdateCommand; +import org.apache.solr.update.processor.FieldMutatingUpdateProcessor.FieldNameSelector; +import org.apache.solr.update.processor.FieldMutatingUpdateProcessorFactory.SelectorParams; +import org.apache.solr.util.plugin.SolrCoreAware; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +public class DocumentCategorizerUpdateProcessorFactory extends UpdateRequestProcessorFactory +implements SolrCoreAware { + + private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + public static final String SOURCE_PARAM = "source"; + public static final String DEST_PARAM = "dest"; + public static final String PATTERN_PARAM = 
"pattern"; + public static final String REPLACEMENT_PARAM = "replacement"; + public static final String MODEL_PARAM = "modelFile"; + public static final String VOCAB_PARAM = "vocabFile"; + + private Path solrHome; + + private SelectorParams srcInclusions = new SelectorParams(); + private Collection srcExclusions = new ArrayList<>(); + + private FieldNameSelector srcSelector = null; + + private String model = null; + private String vocab = null; + private String analyzerFieldType = null; + + /** + * If pattern is null, this this is a literal field name. If pattern is non-null then this is a + * replacement string that may contain meta-characters (ie: capture group identifiers) + * + * @see #pattern + */ + private String dest = null; + + /** + * @see #dest + */ + private Pattern pattern = null; + + protected final FieldNameSelector getSourceSelector() { +if (null != srcSelector) return srcSelector; + +throw new SolrException( +SERVER_ERROR, "selector was never initialized, inform(SolrCore) never called???"); + } + + @Override + public void init(NamedList args) { + +// high level (loose) check for which type of config we have. +// +// individual init methods do more strict syntax checking +if (0 <= args.indexOf(SOURCE_PARAM, 0) && 0 <= args.indexOf(DEST_PARAM, 0)) { + initSourceSelectorSyntax(args); +} else if (0 <= args.indexOf(PATTERN_PARAM, 0) && 0 <= args.indexOf(REPLACEMENT_PARAM, 0)) { + initSimpleRegexReplacement(args); +} else { + throw new SolrException( + SERVER_ERROR, + "A combination of either '" + + SOURCE_PARAM + + "' + '" + + DEST_PARAM + + "', or '" + + REPLACEMENT_PARAM + + "' + '" + + PATTERN_PARAM + + "' init params are mandatory"); +} + +Object
[jira] [Commented] (SOLR-13731) javabin must support a 1:1 mapping of the JSON update format
[ https://issues.apache.org/jira/browse/SOLR-13731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929833#comment-17929833 ] David Smiley commented on SOLR-13731: - There are many classes in SolrJ and it's not always clear to us/users which ones are internal, not to mention that in this case, we may disagree. I don't see why a user using SolrJ would go out of their way to use JavaBinCodec as you describe. Constructing a SolrInputDocument is easy :). {{ConcurrentUpdateHttp2SolrClient}} streams updates. > javabin must support a 1:1 mapping of the JSON update format > - > > Key: SOLR-13731 > URL: https://issues.apache.org/jira/browse/SOLR-13731 > Project: Solr > Issue Type: Task >Reporter: Noble Paul >Assignee: Noble Paul >Priority: Major > Fix For: 8.4 > > Time Spent: 20m > Remaining Estimate: 0h > > Objects like SolrInputDocument is serialized in such a way that the size is > known in advance. All objects should ideally support streaming friendly types. > This is backward compatible . basically javabin will continue to serialize > using the old format , but will accept more efficient formats as input
[PR] Add CVE-2024-6763 to our vex file [solr-site]
gerlowskija opened a new pull request, #143: URL: https://github.com/apache/solr-site/pull/143 (no comment)