date:20241209

Re: [PR] [WIP] Jetty12 + EE8 [solr]

2024-12-09 Thread via GitHub



epugh commented on PR #2876:
URL: https://github.com/apache/solr/pull/2876#issuecomment-2527829121

   > I picked changes from #2835 i.e. Hadoop-auth module being removed and this 
one and pushed a new branch to my fork.
   > 
   > Branch contains code for Jetty 12 + EE10 (Jakarta servlet api) 
https://github.com/iamsanjay/solr/tree/jetty12_ee10
   > 
   > I started working on ee8 because the hadoop-auth module does not have any 
jakarta api support; however, if #2835 is merged then I believe we can 
transition to ee10 without any issue.
   
   I've been sitting on merging the code for no really good reason. Thanks 
for highlighting that it's a bit of a blocker to your progress; I will update 
the PR and merge it today.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

Re: [PR] SOLR-15546: ShortestPathStream to handle ids with colons. [solr]

2024-12-09 Thread via GitHub



epugh commented on PR #236:
URL: https://github.com/apache/solr/pull/236#issuecomment-2527855227

   I'd love to see this get into 10 without the `legacyJoin` stuff, since that 
seems hard to explain?   Thoughts?   If you don't have bandwidth/energy on this 
one, let me know and I can take it on..



Re: [PR] SOLR-15625 Improve documentation for the benchmark module. [solr]

2024-12-09 Thread via GitHub



epugh commented on PR #406:
URL: https://github.com/apache/solr/pull/406#issuecomment-2527859524

   I might have violated etiquette, but I emailed @markrmiller  directly ;-)



Re: [PR] User Behavior Insights implementation for Apache Solr [solr]

2024-12-09 Thread via GitHub



epugh commented on code in PR #2452:
URL: https://github.com/apache/solr/pull/2452#discussion_r1875948497


##
solr/core/src/java/org/apache/solr/handler/component/UBIComponent.java:
##
@@ -0,0 +1,424 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.handler.component;
+
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.io.LineNumberReader;
+import java.lang.invoke.MethodHandles;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import org.apache.solr.client.solrj.io.SolrClientCache;
+import org.apache.solr.client.solrj.io.Tuple;
+import org.apache.solr.client.solrj.io.stream.StreamContext;
+import org.apache.solr.client.solrj.io.stream.TupleStream;
+import org.apache.solr.client.solrj.io.stream.expr.DefaultStreamFactory;
+import org.apache.solr.client.solrj.io.stream.expr.StreamExpression;
+import org.apache.solr.client.solrj.io.stream.expr.StreamExpressionParser;
+import org.apache.solr.client.solrj.io.stream.expr.StreamFactory;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.params.SolrParams;
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.common.util.SimpleOrderedMap;
+import org.apache.solr.core.CoreContainer;
+import org.apache.solr.core.PluginInfo;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.handler.LoggingStream;
+import org.apache.solr.response.ResultContext;
+import org.apache.solr.schema.IndexSchema;
+import org.apache.solr.search.DocIterator;
+import org.apache.solr.search.DocList;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * User Behavior Insights (UBI) is an open standard for gathering query and event data from users
+ * and storing it in a structured format. UBI can be used for in session personalization, implicit
+ * judgements, powering recommendation systems among others. Learn more about the UBI standard at
+ * <a href="https://ubisearch.dev">https://ubisearch.dev</a>.
+ *
+ * The response from Solr is augmented by this component, and optionally the query details can be
+ * tracked and logged to various systems including log files or other backend systems.
+ *
+ * Data tracked is a unique query_id for the search request, the end user's query, metadata about
+ * the query as a JSON map, and the resulting document id's.
+ *
+ * You provide a streaming expression that is parsed and loaded by the component to stream query
+ * data to a target of your choice. If you do not, then the default expression of
+ * 'logging(ubi_queries.jsonl,ubiQuery())' is used which logs data to
+ * $SOLR_HOME/userfiles/ubi_queries.jsonl file.
+ *
+ * You must source your streaming events using the 'ubiQuery()' streaming expression to retrieve
+ * the {@link UBIQuery} object that contains the data for recording.
+ *
+ * Event data is tracked by letting the user write events directly to the event repository of
+ * your choice, it could be a Solr collection, it could be a file or S3 bucket, and that is NOT
+ * handled by this component.
+ *
+ * Add the component to a requestHandler in solrconfig.xml like this:
+ *
+ * 
+ * 
+ *
+ * 
+ *   
+ *
+ * ...
+ *
+ *   
+ *   
+ * ubi
+ *   
+ * 
+ *
+ * It can then be enabled at query time by supplying
+ *
+ * ubi=true
+ *
+ * query parameter.
+ *
+ * Ideally this component is used with the JSON Query syntax, as that facilitates passing in the
+ * additional data to be tracked with a query. Here is an example:
+ *
+ * 
+ * {
+ * "query" : "apple AND ipod",
+ * "limit":2,
+ * "start":2,
+ * "filter": [
+ *"inStock:true"
+ *  ]
+ * params: {
+ *   "ubi": "true"
+ *   "user_query": "Apple iPod",
+ *   "query_attributes": {
+ * "experiment_name": "super_secret",
+ * "page": 2

Re: [PR] User Behavior Insights implementation for Apache Solr [solr]

2024-12-09 Thread via GitHub



epugh commented on code in PR #2452:
URL: https://github.com/apache/solr/pull/2452#discussion_r1875956112



Re: [PR] [WIP] Jetty12 + EE8 [solr]

2024-12-09 Thread via GitHub



iamsanjay commented on PR #2876:
URL: https://github.com/apache/solr/pull/2876#issuecomment-2527421496

   I picked changes from #2835 i.e. Hadoop-auth module being removed and this 
one and pushed a new branch to my fork.
   
   https://github.com/iamsanjay/solr/tree/jetty12_ee10
   
   I started working on ee8 because the hadoop-auth module does not have any 
jakarta api support; however, if #2835 is merged then I believe we can 
transition to ee10 without any issue.
   
   



Re: [I] [Regression] security.json is not uploaded during the first initialization of SolrCloud [solr-operator]

2024-12-09 Thread via GitHub



janhoy commented on issue #720:
URL: https://github.com/apache/solr-operator/issues/720#issuecomment-2527357456

   I created https://github.com/apache/solr-operator/issues/731 to track this



Re: [PR] User Behavior Insights implementation for Apache Solr [solr]

2024-12-09 Thread via GitHub



mkhludnev commented on code in PR #2452:
URL: https://github.com/apache/solr/pull/2452#discussion_r1875992464



Re: [PR] User Behavior Insights implementation for Apache Solr [solr]

2024-12-09 Thread via GitHub



mkhludnev commented on code in PR #2452:
URL: https://github.com/apache/solr/pull/2452#discussion_r1876020574



Re: [PR] SOLR-15625 Improve documentation for the benchmark module. [solr]

2024-12-09 Thread via GitHub



markrmiller commented on code in PR #406:
URL: https://github.com/apache/solr/pull/406#discussion_r1876519771


##
solr/benchmark/docs/jmh-profilers-setup.md:
##
@@ -0,0 +1,406 @@
+
+
+# JMH Profiler Setup (Async-Profiler and Perfasm)
+
+JMH ships with a number of built-in profiler options that have grown in number over time. The profiler system is also pluggable,
+allowing for "after-market" profiler implementations to be added on the fly.
+
+Many of these profilers, most often the ones that stay in the realm of Java, will work across platforms and architectures and do
+so right out of the box. Others may be targeted at a specific OS, though there is a good chance a similar profiler for other OS's
+may exist where possible. A couple of very valuable profilers also require additional setup and environment to either work fully
+or at all.
+
+[TODO: link to page that only lists commands with simple section]
+
+- [JMH Profiler Setup (Async-Profiler and Perfasm)](#jmh-profiler-setup-async-profiler-and-perfasm)
+- [Async-Profiler](#async-profiler)
+  - [Install async-profiler](#install-async-profiler)
+  - [Install Java Debug Symbols](#install-java-debug-symbols)
+- [Ubuntu](#ubuntu)
+- [Arch](#arch)
+  - [Perfasm](#perfasm)
+- [Arch](#arch-1)
+- [Ubuntu](#ubuntu-1)
+
+
+This guide will cover setting up both the async-profiler and the Perfasm profiler. Currently, we roughly cover two Linux family trees,
+but much of the information can be extrapolated or help point in the right direction for other systems.
+
+ 
+
+|Path 1: Arch, Manjaro, etc|Path 2: Debian, Ubuntu, etc|
+| :--: | :--: |
+| ![](https://user-images.githubusercontent.com/448788/137563725-0195a732-da40-4c8b-a5e8-fd904a43bb79.png)![](https://user-images.githubusercontent.com/448788/137563722-665de88f-46a4-4939-88b0-3f96e56989ea.png) | ![](https://user-images.githubusercontent.com/448788/137563909-6c2d2729-2747-47a0-b2bd-f448a958b5be.png)![](https://user-images.githubusercontent.com/448788/137563908-738a7431-88db-47b0-96a4-baaed7e5024b.png) |
+
+
+
+If you run `jmh.sh` with the `-lprof` argument, it will make an attempt to only list the profilers that it detects will work in your particular environment.
+
+You should do this first to see where you stand.
+
+
+
+
+![](https://user-images.githubusercontent.com/448788/137610116-eff6d0b7-e862-40fb-af04-452aaf585387.png)
+
+```Shell
+./jmh.sh -lprof
+```
+
+
+
+
+
+
+In our case, we will start with very **minimal** Arch and Ubuntu clean installations, and so we already know there is _**no chance**_ that async-profiler or Perfasm
+are going to run.
+
+In fact, first we have to install a few project build requirements before thinking too much about JMH profiler support.
+
+We will run on **Arch/Manjaro**, but there should not be any difference from **Debian/Ubuntu** for this stage.
+
+
+
+![](https://user-images.githubusercontent.com/448788/137610116-eff6d0b7-e862-40fb-af04-452aaf585387.png)
+
+```Shell
+sudo pacman -S wget jdk11-openjdk
+```
+
+
+
+
+
+Here we give **async-profiler** a try on **Arch** anyway and observe the failure indicating that we need to obtain the async-profiler library and
+put it in the correct location at a minimum.
+
+
+
+![](https://user-images.githubusercontent.com/448788/137607441-f083e1fe-b3e5-4326-a9ca-2109c9cef985.png)
+
+```Shell
+./jmh.sh BenchMark -prof async
+```
+
+
+![](https://user-images.githubusercontent.com/448788/137534191-01c2bc7a-5c1f-42a2-8d66-a5d1a5280db4.png)
+Profilers failed to initialize, exiting.
+
+Unable to load async-profiler. Ensure asyncProfiler library is on LD_LIBRARY_PATH (Linux)
+DYLD_LIBRARY_PATH (Mac OS), or -Djava.library.path.
+
+Alternatively, point to explicit library location with: '-prof async:libPath={path}'
+
+no asyncProfiler in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
+
+
+
+
+### Async-Profiler
+
+#### Install async-profiler
+
+
+
+![](https://user-images.githubusercontent.com/448788/137610116-eff6d0b7-e862-40fb-af04-452aaf585387.png)
+
+```Shell
+wget -c https://github.com/jvm-profiling-tools/async-profiler/releases/download/v2.5/async-profiler-2.5-linux-x64.tar.gz -O - | tar -xz
+sudo mkdir -p /usr/java/packages/lib
+sudo cp async-profiler-2.5-linux-x64/build/* /usr/java/packages/lib
+```
+
+
+
+
+
+That should work out better, but there is still an issue that will prevent a 
successful profiling run. async-profiler relies on Linux's perf,
+and in any recent

[jira] [Created] (SOLR-17587) Prometheus Writer Duplicate TYPE exposition format

2024-12-09 Thread Matthew Biscocho (Jira)

Matthew Biscocho created SOLR-17587:
---

 Summary: Prometheus Writer Duplicate TYPE exposition format
 Key: SOLR-17587
 URL: https://issues.apache.org/jira/browse/SOLR-17587
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 9.7
Reporter: Matthew Biscocho


Solr's Prometheus writer duplicates `# TYPE  ` in its exposition format for core registry metrics.

For example, this appears twice in its output:

 
{code:java}
# TYPE solr_metrics_core_average_request_time gauge
solr_metrics_core_average_request_time{category="ADMIN",collection="foo",core="core_foo_shard9_replica_t351",handler="/admin/file",replica="replica_t351",shard="shard9"} 0.0
...
# TYPE solr_metrics_core_average_request_time gauge{code}
 

 

This is technically not allowed per [Prometheus Exposition 
format|https://github.com/prometheus/docs/blob/main/content/docs/instrumenting/exposition_formats.md#:~:text=Only%20one%20TYPE%20line%20may%20exist%20for%20a%20given%20metric%20name.]
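
To make the rule concrete, here is a minimal sketch of a check for repeated `# TYPE` lines in exposition text. The class and method names are hypothetical and are not part of Solr or any Prometheus library; it assumes the whole response is available as a single string.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not part of Solr): finds metric names declared by more than one TYPE line. */
public class DuplicateTypeCheck {

  /** Returns each metric name whose "# TYPE" line occurs more than once, in order of first repeat. */
  public static List<String> findDuplicateTypes(String exposition) {
    Set<String> seen = new HashSet<>();
    Set<String> reported = new HashSet<>();
    List<String> duplicates = new ArrayList<>();
    for (String line : exposition.split("\n")) {
      String trimmed = line.trim();
      if (!trimmed.startsWith("# TYPE ")) {
        continue; // samples, HELP lines and blank lines are not relevant here
      }
      // Expected shape: "# TYPE <metric_name> <type>"
      String[] parts = trimmed.split("\\s+");
      if (parts.length < 3) {
        continue; // malformed TYPE line, skipped in this sketch
      }
      String name = parts[2];
      // seen.add returns false when the name was already declared
      if (!seen.add(name) && reported.add(name)) {
        duplicates.add(name);
      }
    }
    return duplicates;
  }

  public static void main(String[] args) {
    String text =
        "# TYPE solr_metrics_core_average_request_time gauge\n"
            + "solr_metrics_core_average_request_time{core=\"a\"} 0.0\n"
            + "# TYPE solr_metrics_core_average_request_time gauge\n"
            + "solr_metrics_core_average_request_time{core=\"b\"} 0.0\n";
    // prints [solr_metrics_core_average_request_time]
    System.out.println(findDuplicateTypes(text));
  }
}
```

Running a check like this against the `wt=prometheus` response would list the metric names that a strict parser is likely to reject.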

This happens because each Dropwizard registry is per core, but the 
Prometheus-compatible exposition format needs a single registry for all cores 
on a host when exporting; otherwise there will be duplicate `TYPE` lines even 
though all metrics are unique by their tags/attributes.
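
As an illustration only (this is not the fix in Solr, and the class name is hypothetical), the effect of a single registry can be sketched by merging per-core outputs so that each metric name gets one `TYPE` line followed by all of its samples. The sketch assumes every sample follows the `TYPE` line of its metric and drops other comment lines such as `# HELP`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch (not Solr code): merges per-core exposition text into one valid block. */
public class ExpositionMerger {

  /** Emits one TYPE line per metric name, followed by that metric's samples from all inputs. */
  public static String merge(List<String> perCoreOutputs) {
    Map<String, String> typeLines = new LinkedHashMap<>();
    Map<String, List<String>> samples = new LinkedHashMap<>();
    for (String output : perCoreOutputs) {
      String current = null; // metric name of the most recent TYPE line in this input
      for (String line : output.split("\n")) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
          continue;
        }
        if (trimmed.startsWith("# TYPE ")) {
          String[] parts = trimmed.split("\\s+");
          if (parts.length < 4) {
            current = null; // malformed TYPE line, following samples are skipped
            continue;
          }
          current = parts[2];
          typeLines.putIfAbsent(current, trimmed); // keep only the first TYPE line
          samples.computeIfAbsent(current, k -> new ArrayList<>());
        } else if (!trimmed.startsWith("#") && current != null) {
          samples.get(current).add(trimmed); // sample belongs to the current metric
        }
      }
    }
    StringBuilder out = new StringBuilder();
    for (Map.Entry<String, String> entry : typeLines.entrySet()) {
      out.append(entry.getValue()).append('\n');
      for (String sample : samples.get(entry.getKey())) {
        out.append(sample).append('\n');
      }
    }
    return out.toString();
  }
}
```

Grouping by metric name also keeps all samples of a metric together, which strict parsers tend to expect alongside the single `TYPE` line.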

Funnily enough, the upstream Prometheus collector does not do this verification 
and accepts the metrics just fine (Solr -> Prometheus -> Grafana).

But depending on how strictly a given technology verifies the Prometheus 
exposition format, this will fail. For example, 
[Telegraf|https://github.com/influxdata/telegraf/blob/master/plugins/inputs/prometheus/README.md]:
{code:java}
-12-09T16:56:01Z E! [inputs.prometheus] Error in plugin: error reading metrics 
for "http://127.0.0.1:8983/solr/admin/metrics?wt=prometheus": decoding response 
failed: text format parsing error in line 568: second TYPE line for metric name 
"solr_metrics_core_average_request_time", or TYPE reported after samples {code}
This shouldn't be a blocker if you are pushing metrics to the Prometheus 
collector directly.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

[jira] [Resolved] (SOLR-17540) Remove hadoop-auth module

2024-12-09 Thread Eric Pugh (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-17540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Pugh resolved SOLR-17540.
--
Fix Version/s: main (10.0)
 Assignee: Eric Pugh
   Resolution: Fixed

Hadoop Auth removed in 10.  No backporting to 9x.

> Remove hadoop-auth module
> -
>
> Key: SOLR-17540
> URL: https://issues.apache.org/jira/browse/SOLR-17540
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: main (10.0)
>Reporter: Eric Pugh
>Assignee: Eric Pugh
>Priority: Minor
>  Labels: pull-request-available
> Fix For: main (10.0)
>
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> One of the outcomes of the 2024 Community Survey is that we learned (from our 
> admittedly fairly unscientific responses) that hadoop-auth is not used.
> This PR is to understand the impact of removing hadoop-auth in Solr 10.
> See [https://lists.apache.org/thread/vnd73j0nq3losfc17lzqp48g10r5tdgg] for 
> outreach to users mailing list.
> See [https://lists.apache.org/thread/lltc0wjdghq18tt37zlrsd8ty35qsytl] for 
> discussion on Dev.
>  
> I won't merge this PR till we have more consensus.




[jira] [Commented] (SOLR-17540) Remove hadoop-auth module

2024-12-09 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-17540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17904250#comment-17904250
 ] 

ASF subversion and git services commented on SOLR-17540:


Commit cf68a7f9ec2878931a0386069c4b326cdd70f75c in solr's branch 
refs/heads/main from Eric Pugh
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=cf68a7f9ec2 ]

SOLR-17540: Remove Hadoop Auth Module (#2835)

Remove the module hadoop-auth, and supporting code, such as the customizations 
in the Solr Admin app for Kerberos-based login. Updates the bin/solr auth 
tool to just support basic auth, but leaves the overall structure to facilitate 
adding other auth types like JWT in the future. Removes Kerberos-specific 
functions from HttpSolrClient. useShortName is removed as only Kerberos 
supported it.

-

Co-authored-by: Christos Malliaridis 

> Remove hadoop-auth module
> -
>
> Key: SOLR-17540
> URL: https://issues.apache.org/jira/browse/SOLR-17540
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: main (10.0)
>Reporter: Eric Pugh
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> One of the outcomes of the 2024 Community Survey is that we learned (from our 
> admittedly fairly unscientific responses) that hadoop-auth is not used.
> This PR is to understand the impact of removing hadoop-auth in Solr 10.
> See [https://lists.apache.org/thread/vnd73j0nq3losfc17lzqp48g10r5tdgg] for 
> outreach to users mailing list.
> See [https://lists.apache.org/thread/lltc0wjdghq18tt37zlrsd8ty35qsytl] for 
> discussion on Dev.
>  
> I won't merge this PR till we have more consensus.




Re: [PR] [WIP] Jetty12 + EE8 [solr]

2024-12-09 Thread via GitHub



epugh commented on PR #2876:
URL: https://github.com/apache/solr/pull/2876#issuecomment-2529064270

   > > I picked changes from #2835 (i.e. the Hadoop-auth module being removed) and 
this one, and pushed a new branch to my fork.
   > > The branch contains code for Jetty 12 + EE10 (Jakarta servlet API): 
https://github.com/iamsanjay/solr/tree/jetty12_ee10
   > > I started working on EE8 because the hadoop-auth module does not have any 
Jakarta API support; however, if #2835 is merged then I believe we can 
transition to EE10 without any issue.
   > 
   > I've been sitting on merging the code, for no really good reason. Thanks 
for highlighting that it's a bit of a blocker to your progress; I will update 
the PR and merge it today.
   
   Merge is done!



Re: [PR] SOLR-17541: LBSolrClient implementations should agree on 'getClient()' semantics [solr]

2024-12-09 Thread via GitHub



dsmiley commented on code in PR #2899:
URL: https://github.com/apache/solr/pull/2899#discussion_r1876531872


##
solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBHttp2SolrClient.java:
##
@@ -94,35 +97,74 @@
  *
  * @since solr 8.0
  */
-public class LBHttp2SolrClient extends 
LBSolrClient {
+public class LBHttp2SolrClient> 
extends LBSolrClient {
 
-  protected final C solrClient;
+  private final Map urlToClient;
+  private final Set urlParamNames;
+
+  private final HttpSolrClientBuilderBase solrClientBuilder;
 
   @SuppressWarnings("unchecked")
   private LBHttp2SolrClient(Builder builder) {
 super(Arrays.asList(builder.solrEndpoints));
-this.solrClient = (C) builder.solrClient;
+this.solrClientBuilder = builder.solrClientBuilder;
+
+this.urlToClient = new ConcurrentHashMap<>();
+for (LBSolrClient.Endpoint endpoint : builder.solrEndpoints) {
+  buildClient(endpoint);
+}
+
 this.aliveCheckIntervalMillis = builder.aliveCheckIntervalMillis;
 this.defaultCollection = builder.defaultCollection;
+
+if (builder.solrClientBuilder.urlParamNames == null) {
+  this.urlParamNames = Collections.emptySet();
+} else {
+  this.urlParamNames = 
Collections.unmodifiableSet(builder.solrClientBuilder.urlParamNames);
+}
+  }
+
+  private synchronized HttpSolrClientBase buildClient(Endpoint endpoint) {
+var client = urlToClient.get(endpoint.toString());
+if (client == null) {
+  String tmpBaseSolrUrl = solrClientBuilder.baseSolrUrl;
+  solrClientBuilder.baseSolrUrl = endpoint.getBaseUrl();
+  client = solrClientBuilder.build();
+  urlToClient.put(endpoint.getBaseUrl(), client);
+  solrClientBuilder.baseSolrUrl = tmpBaseSolrUrl;
+}
+return client;

Review Comment:
   Prefer calling ConcurrentHashMap.computeIfAbsent or similar to get-then-put 
because of its nice atomicity properties; it also avoids the need for synchronization.
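
A minimal sketch of the `computeIfAbsent` approach suggested above, not the PR's actual code. `ClientCacheSketch` and `FakeClient` are hypothetical stand-ins for `LBHttp2SolrClient` and the per-URL Http Solr Client; only `ConcurrentHashMap.computeIfAbsent` is a real API here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ClientCacheSketch {

  /** Hypothetical stand-in for a per-URL Http Solr Client. */
  public static final class FakeClient {
    public final String baseUrl;

    FakeClient(String baseUrl) {
      this.baseUrl = baseUrl;
    }
  }

  private final Map<String, FakeClient> urlToClient = new ConcurrentHashMap<>();
  private final AtomicInteger builds = new AtomicInteger();

  /** Returns the cached client for baseUrl, building it at most once per key. */
  public FakeClient clientFor(String baseUrl) {
    // Lookup and insert happen as one atomic step for this key,
    // so no synchronized block is needed around the map itself.
    return urlToClient.computeIfAbsent(
        baseUrl,
        url -> {
          builds.incrementAndGet(); // counts how often a client is really built
          return new FakeClient(url);
        });
  }

  public int buildCount() {
    return builds.get();
  }

  public static void main(String[] args) {
    ClientCacheSketch cache = new ClientCacheSketch();
    FakeClient first = cache.clientFor("http://localhost:8983/solr");
    FakeClient second = cache.clientFor("http://localhost:8983/solr");
    System.out.println("same instance: " + (first == second));
    System.out.println("builds: " + cache.buildCount());
  }
}
```

Note that `computeIfAbsent` is atomic per key only. If the mapping function mutates shared state (as the quoted `buildClient` does with `solrClientBuilder.baseSolrUrl`), that state would still need its own guard.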



##
solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBHttp2SolrClient.java:
##
@@ -94,35 +97,74 @@
  *
  * @since solr 8.0
  */
-public class LBHttp2SolrClient extends 
LBSolrClient {
+public class LBHttp2SolrClient> 
extends LBSolrClient {
 
-  protected final C solrClient;
+  private final Map urlToClient;
+  private final Set urlParamNames;
+
+  private final HttpSolrClientBuilderBase solrClientBuilder;
 
   @SuppressWarnings("unchecked")
   private LBHttp2SolrClient(Builder builder) {
 super(Arrays.asList(builder.solrEndpoints));
-this.solrClient = (C) builder.solrClient;
+this.solrClientBuilder = builder.solrClientBuilder;
+
+this.urlToClient = new ConcurrentHashMap<>();
+for (LBSolrClient.Endpoint endpoint : builder.solrEndpoints) {
+  buildClient(endpoint);
+}
+
 this.aliveCheckIntervalMillis = builder.aliveCheckIntervalMillis;
 this.defaultCollection = builder.defaultCollection;
+
+if (builder.solrClientBuilder.urlParamNames == null) {
+  this.urlParamNames = Collections.emptySet();
+} else {
+  this.urlParamNames = 
Collections.unmodifiableSet(builder.solrClientBuilder.urlParamNames);

Review Comment:
   probably fine but I'd prefer Set.copyOf here
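
A small sketch of the difference behind this preference, using only JDK APIs (`SetCopySketch` and its method names are made up for illustration). `Collections.unmodifiableSet` returns a read-only view that still reflects later changes to the backing set, while `Set.copyOf` (Java 10+) takes an independent snapshot.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class SetCopySketch {

  /** Read-only view: later changes to source remain visible through it. */
  public static Set<String> viewOf(Set<String> source) {
    return Collections.unmodifiableSet(source);
  }

  /** Immutable snapshot: later changes to source are not visible. */
  public static Set<String> snapshotOf(Set<String> source) {
    return Set.copyOf(source);
  }

  public static void main(String[] args) {
    Set<String> source = new HashSet<>();
    source.add("q");

    Set<String> view = viewOf(source);
    Set<String> snapshot = snapshotOf(source);

    source.add("fq"); // the builder's set is mutated after the client was built

    System.out.println("view sees fq: " + view.contains("fq"));
    System.out.println("snapshot sees fq: " + snapshot.contains("fq"));
  }
}
```

One caveat when swapping one for the other: `Set.copyOf` throws `NullPointerException` for a null collection or null elements, so the existing null check on `urlParamNames` would still be needed.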



##
solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBHttp2SolrClient.java:
##
@@ -94,35 +97,74 @@
  *
  * @since solr 8.0
  */
-public class LBHttp2SolrClient extends 
LBSolrClient {
+public class LBHttp2SolrClient> 
extends LBSolrClient {
 
-  protected final C solrClient;
+  private final Map urlToClient;
+  private final Set urlParamNames;
+
+  private final HttpSolrClientBuilderBase solrClientBuilder;
 
   @SuppressWarnings("unchecked")
   private LBHttp2SolrClient(Builder builder) {
 super(Arrays.asList(builder.solrEndpoints));
-this.solrClient = (C) builder.solrClient;
+this.solrClientBuilder = builder.solrClientBuilder;
+
+this.urlToClient = new ConcurrentHashMap<>();
+for (LBSolrClient.Endpoint endpoint : builder.solrEndpoints) {
+  buildClient(endpoint);
+}
+
 this.aliveCheckIntervalMillis = builder.aliveCheckIntervalMillis;
 this.defaultCollection = builder.defaultCollection;
+
+if (builder.solrClientBuilder.urlParamNames == null) {
+  this.urlParamNames = Collections.emptySet();
+} else {
+  this.urlParamNames = 
Collections.unmodifiableSet(builder.solrClientBuilder.urlParamNames);
+}
+  }
+
+  private synchronized HttpSolrClientBase buildClient(Endpoint endpoint) {

Review Comment:
   synchronized is guarding what here?



##
solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBHttp2SolrClient.java:
##
@@ -94,35 +97,74 @@
  *
  * @since solr 8.0
  */
-public class LBHttp2SolrClient extends 
LBSolrClient {
+public class LBHttp2SolrClient> 
extends LBSolrClient {
 
-  protected final C solrClient;
+  private final Map urlToClient;
+  private final Set urlParamNames;
+
+  private final HttpSolrClientBuilderBase solrClientBuilder;
 
   @SuppressWarnings("unchecked"

Re: [PR] SOLR-17381 SolrJ fix to fetch entire ClusterState if asked [solr]

2024-12-09 Thread via GitHub



dsmiley commented on PR #2853:
URL: https://github.com/apache/solr/pull/2853#issuecomment-2528550933

   [The test failure is known 
flaky](http://fucit.org/solr-jenkins-reports/history-trend-of-recent-failures.html#series/org.apache.solr.hdfs.cloud.api.collections.TestHdfsCloudBackupRestore.testRestoreFailure)
   I'll merge.



[PR] SOLR-17541: LBSolrClient implementations should agree on 'getClient()' semantics [solr]

2024-12-09 Thread via GitHub



jdyer1 opened a new pull request, #2899:
URL: https://github.com/apache/solr/pull/2899

   This PR makes `LBHttp2SolrClient` maintain an instance of 
`HttpSolrClientBuilderBase` per "Base Url".  This makes the semantics of 
`LBHttp2SolrClient#getClient` consistent with that of the older 
`LBHttpSolrClient`.
   
   Behavior changes:
   - `LBHttp2SolrClient` generates an Http Solr Client per base url
     - (1) at construction
     - (2) whenever a previously-unseen base url is encountered
     - NOTE: The Map holding the clients may grow unbounded; there is no 
cleanup short of calling `close()`.
   - `LBHttp2SolrClient` mutates the `baseSolrUrl` variable of the 
caller-supplied instance of `HttpSolrClientBuilderBase` whenever it creates a 
new Http Solr Client.
   - Both `LBHttp2SolrClient` and `CloudHttp2SolrClient` always own the 
internal/delegate clients.  They are now always closed by us on `close()`.
   
   The following class definitions are changed in the SolrJ public API:
   - `LBHttp2SolrClient`
     - from: `LBHttp2SolrClient<C extends HttpSolrClientBase> extends 
LBSolrClient`
     - to: `LBHttp2SolrClient<B extends HttpSolrClientBuilderBase<?, ?>> 
extends LBSolrClient`
   - `LBHttp2SolrClient.Builder`
     - from: `Builder<C extends HttpSolrClientBase>`
     - to: `Builder<B extends HttpSolrClientBuilderBase<?, ?>>`
   - All methods on `LBHttp2SolrClient.Builder` return a `Builder<B>` instead 
of a `Builder<C>`
   
   The following are removed from the SolrJ public API:
   - method `CloudHttp2SolrClient.Builder#withHttpClient`
   - constructor `LBHttp2SolrClient.Builder(C solrClient, Endpoint... 
endpoints)`
   - replaced with: `public Builder(B solrClientBuilder, Endpoint... 
endpoints)`
   - method `HttpJdkSolrClient#requestWithBaseUrl` (main only, never released)
   
   The following were not removed, but perhaps should be considered to be 
marked "internal" or "Experimental"
   - method `Http2SolrClient#requestWithBaseUrl` (main only, never released)
   - interface `SolrClientFunction` (main only, never released)
   



[jira] [Updated] (SOLR-17541) LBSolrClient implementations should agree on 'getClient()' semantics

2024-12-09 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-17541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SOLR-17541:
--
Labels: pull-request-available  (was: )

> LBSolrClient implementations should agree on 'getClient()' semantics 
> -
>
> Key: SOLR-17541
> URL: https://issues.apache.org/jira/browse/SOLR-17541
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 9.7
>Reporter: Jason Gerlowski
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> LBSolrClient has an abstract "getClient(String url)" method that is used to 
> fetch a "Http" SolrClient appropriate for the specified URL.
> But implementations of this method differ in the client that is returned.  
> LBHttpSolrClient returns a client that is already pointed at the specified 
> URL and can be used without modification. But LBHttp2SolrClient returns a 
> client with no URL altogether, that must be pointed at the right endpoint 
> prior to use.  This is a bit messy, and complicates the calling code in 
> LBSolrClient quite a bit.
> We should choose one of these approaches and use it for all LBSolrClient 
> implementations.




[jira] [Commented] (SOLR-17381) Make CLUSTERSTATUS request configurable

2024-12-09 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17904217#comment-17904217
 ] 

ASF subversion and git services commented on SOLR-17381:


Commit d2045f679e55e15cf4e974947b9417c4465fccd2 in solr's branch 
refs/heads/main from aparnasuresh85
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=d2045f679e5 ]

SOLR-17381: Fix CloudSolrClient.getClusterState when HTTP CSP (not ZK) (#2853)

When using the HTTP ClusterStateProvider (not ZK), getClusterState() wasn't 
working correctly; a regression from the first PR for this JIRA issue (not 
released).
Also,
* Optimization: fix O(N^2) algorithm to be O(N) for the number of collections 
when calling getClusterState
* liveNodes is now immutable, and probably a non-sorted ordering
* removed "health" and some other keys from DocCollection that aren't present 
when using ZK CSP

Minor details:
* clearly differentiate internal ClusterState processing from getting one 
DocCollection
* Use GenericSolrRequest not QueryRequest
* test both CSPs better
-

Co-authored-by: David Smiley 

> Make CLUSTERSTATUS request configurable
> ---
>
> Key: SOLR-17381
> URL: https://issues.apache.org/jira/browse/SOLR-17381
> Project: Solr
>  Issue Type: Improvement
>Reporter: Aparna Suresh
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 9.8
>
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> Fetching {{CLUSTERSTATUS}} remotely is resource-intensive and should be done 
> with caution. Currently, if no parameters are specified, the call returns all 
> information, including collections, shards, replicas, aliases, cluster 
> properties, roles, and more. This can have significant performance 
> implications for clients using a Solr cluster with thousands of collections.
> Several performance [issues|https://issues.apache.org/jira/browse/SOLR-14985] 
> have been identified when switching {{CloudSolrClient}} to use HTTP-based 
> CSP, particularly in two instances where the entire cluster state is fetched 
> unnecessarily.
> *Proposal:* Modify the requests to retrieve only the necessary information, 
> such as the cluster status for a specific collection, live nodes, or cluster 
> properties. Ensure these changes maintain backward compatibility. 
> Additionally, update the HTTP CSP to reflect these optimizations.




Re: [PR] SOLR-17381 SolrJ fix to fetch entire ClusterState if asked [solr]

2024-12-09 Thread via GitHub



dsmiley merged PR #2853:
URL: https://github.com/apache/solr/pull/2853



[jira] [Updated] (SOLR-17587) Prometheus Writer Duplicate TYPE exposition format

2024-12-09 Thread Matthew Biscocho (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-17587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Biscocho updated SOLR-17587:

Description: 
Solr's Prometheus writer duplicates `# TYPE <metric name> <metric type>` in its exposition format for core registry metrics.

For example, this appears twice in its output:
{code:java}
# TYPE solr_metrics_core_average_request_time gauge
solr_metrics_core_average_request_time{category="ADMIN",collection="foo",core="core_foo_shard9_replica_t351",handler="/admin/file",replica="replica_t351",shard="shard9"}
 0.0 
...
# TYPE solr_metrics_core_average_request_time gauge{code}
This is technically not allowed per [Prometheus Exposition 
format|https://github.com/prometheus/docs/blob/main/content/docs/instrumenting/exposition_formats.md#:~:text=Only%20one%20TYPE%20line%20may%20exist%20for%20a%20given%20metric%20name.]

This happens because there is one Dropwizard registry per core, but a 
Prometheus-compatible export needs a single registry for all cores on a host; 
otherwise there will be duplicate `TYPE` lines even though every metric is 
unique by its tags/attributes.

Funnily enough, the upstream Prometheus collector does not do this verification and 
accepts the metrics just fine (Solr -> Prometheus -> Grafana).

But depending on how strictly a given technology verifies the Prometheus 
exposition format, this will fail. For example, 
[Telegraf|https://github.com/influxdata/telegraf/blob/master/plugins/inputs/prometheus/README.md]:
{code:java}
-12-09T16:56:01Z E! [inputs.prometheus] Error in plugin: error reading metrics 
for "http://127.0.0.1:8983/solr/admin/metrics?wt=prometheus": decoding response 
failed: text format parsing error in line 568: second TYPE line for metric name 
"solr_metrics_core_average_request_time", or TYPE reported after samples {code}
This shouldn't be a blocker if you are pushing metrics to a Prometheus collector 
directly.
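
A rough way to check an exposition payload for this problem, sketched in Java under the assumption that `# TYPE` lines are whitespace-separated as `# TYPE <name> <type>`. `DuplicateTypeCheck` and `duplicateTypeNames` are hypothetical names, not part of Solr or any Prometheus client library.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DuplicateTypeCheck {

  /** Returns metric names that carry more than one "# TYPE" line, in order of first repeat. */
  public static List<String> duplicateTypeNames(String exposition) {
    Set<String> seen = new HashSet<>();
    List<String> duplicates = new ArrayList<>();
    for (String line : exposition.split("\n")) {
      String trimmed = line.trim();
      if (!trimmed.startsWith("# TYPE ")) {
        continue; // sample lines, HELP lines and blanks are ignored
      }
      String[] parts = trimmed.split("\\s+"); // "#", "TYPE", name, type
      if (parts.length < 3) {
        continue; // malformed TYPE line, nothing to compare
      }
      String name = parts[2];
      // Set.add returns false when the name already had a TYPE line
      if (!seen.add(name) && !duplicates.contains(name)) {
        duplicates.add(name);
      }
    }
    return duplicates;
  }

  public static void main(String[] args) {
    String payload =
        "# TYPE solr_metrics_core_average_request_time gauge\n"
            + "solr_metrics_core_average_request_time{core=\"a\"} 0.0\n"
            + "# TYPE solr_metrics_core_average_request_time gauge\n"
            + "solr_metrics_core_average_request_time{core=\"b\"} 0.0\n";
    System.out.println(duplicateTypeNames(payload));
  }
}
```

A stricter consumer such as the Telegraf parser quoted above also rejects a `TYPE` line that appears after samples for the same metric; this sketch only covers the repeated-`TYPE` case.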

 


> Prometheus Writer Duplicate TYPE exposition format
> --
>
> Key: SOLR-17587
> URL: https://issues.apache.org/jira/browse/SOLR-17587
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 9.7
>Reporter: Matthew Biscocho
>Priority: Minor
>

Re: [PR] SOLR-17540: Remove Hadoop Auth Module [solr]

2024-12-09 Thread via GitHub



epugh merged PR #2835:
URL: https://github.com/apache/solr/pull/2835



[jira] [Commented] (SOLR-17587) Prometheus Writer Duplicate TYPE exposition format

2024-12-09 Thread Matthew Biscocho (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-17587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17904242#comment-17904242
 ] 

Matthew Biscocho commented on SOLR-17587:
-

Working on a fix for this now. Going to try and get a PR out for this soon.

> Prometheus Writer Duplicate TYPE exposition format
> --
>
> Key: SOLR-17587
> URL: https://issues.apache.org/jira/browse/SOLR-17587
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 9.7
>Reporter: Matthew Biscocho
>Priority: Minor
>
> Solr's Prometheus writer duplicates `# TYPE <metric name> <metric type>` in its exposition format for core registry metrics.
> For example, this appears twice in its output:
> {code:java}
> # TYPE solr_metrics_core_average_request_time gauge
> solr_metrics_core_average_request_time{category="ADMIN",collection="foo",core="core_foo_shard9_replica_t351",handler="/admin/file",replica="replica_t351",shard="shard9"}
>  0.0 
> ...
> # TYPE solr_metrics_core_average_request_time gauge{code}
> This is technically not allowed per [Prometheus Exposition 
> format|https://github.com/prometheus/docs/blob/main/content/docs/instrumenting/exposition_formats.md#:~:text=Only%20one%20TYPE%20line%20may%20exist%20for%20a%20given%20metric%20name.]
> This happens because there is one Dropwizard registry per core, but a 
> Prometheus-compatible export needs a single registry for all cores on a 
> host; otherwise there will be duplicate `TYPE` lines even though every 
> metric is unique by its tags/attributes.
> Funnily enough, the upstream Prometheus collector does not do this verification 
> and accepts the metrics just fine (Solr -> Prometheus -> Grafana).
> But depending on how strictly a given technology verifies the Prometheus 
> exposition format, this will fail. For example, 
> [Telegraf|https://github.com/influxdata/telegraf/blob/master/plugins/inputs/prometheus/README.md]:
> {code:java}
> -12-09T16:56:01Z E! [inputs.prometheus] Error in plugin: error reading 
> metrics for "http://127.0.0.1:8983/solr/admin/metrics?wt=prometheus": 
> decoding response failed: text format parsing error in line 568: second TYPE 
> line for metric name "solr_metrics_core_average_request_time", or TYPE 
> reported after samples {code}
> This shouldn't be a blocker if you are pushing metrics to a Prometheus 
> collector directly.
>  




[jira] [Commented] (SOLR-16441) Upgrade Jetty to 11.x

2024-12-09 Thread Jira



[ 
https://issues.apache.org/jira/browse/SOLR-16441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17904266#comment-17904266
 ] 

Paul Rütter commented on SOLR-16441:


For the package change, you might consider using something like 
[https://mvnrepository.com/artifact/org.apache.felix/org.apache.felix.http.wrappers]
 to go between javax and jakarta.

This would allow you to move to Jetty 12 EE10 with the jakarta namespace where 
dependencies allow, and use the wrapper classes to transition from jakarta to 
javax for those still using javax. The overhead is negligible and this allows 
for moving to the latest spec and namespace where possible, thus updating 
dependencies to newer versions once these become available.

 

This path has worked out great for our product, which moved to Java 21 
and Jetty 12 EE10 (via Apache Felix). We're using the wrapper classes to still 
use the Solr 8.x client in this application. 

> Upgrade Jetty to 11.x
> -
>
> Key: SOLR-16441
> URL: https://issues.apache.org/jira/browse/SOLR-16441
> Project: Solr
>  Issue Type: Improvement
>  Components: Server
>Reporter: Tomas Eduardo Fernandez Lobbe
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Solr is currently using Jetty 9.4.x and is upgrading to Jetty 10.x in 
> SOLR-15955; we should look at upgrading to Jetty 11, which moves from the 
> javax to the jakarta namespace for servlets.
>  




Re: [PR] User Behavior Insights implementation for Apache Solr [solr]

2024-12-09 Thread via GitHub



epugh commented on code in PR #2452:
URL: https://github.com/apache/solr/pull/2452#discussion_r1876714632


##
solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java:
##
@@ -126,19 +125,24 @@ public class SearchHandler extends RequestHandlerBase
   private PluginInfo shfInfo;
   private SolrCore core;
 
+  /**
+   * The default set of components that every handler gets. You can change 
this by defining the
+   * specific components for a handler. It puts the {@link QueryComponent} 
first as subsequent
+   * components assume that the QueryComponent ran and populated the document 
list.
+   *
+   * @return A list of component names.
+   */
   protected List getDefaultComponents() {
-ArrayList names = new ArrayList<>(9);
-names.add(QueryComponent.COMPONENT_NAME);
-names.add(FacetComponent.COMPONENT_NAME);
-names.add(FacetModule.COMPONENT_NAME);
-names.add(MoreLikeThisComponent.COMPONENT_NAME);
-names.add(HighlightComponent.COMPONENT_NAME);
-names.add(StatsComponent.COMPONENT_NAME);
-names.add(DebugComponent.COMPONENT_NAME);
-names.add(ExpandComponent.COMPONENT_NAME);
-names.add(TermsComponent.COMPONENT_NAME);
-
-return names;
+List l = new 
ArrayList(SearchComponent.STANDARD_COMPONENTS.keySet());

Review Comment:
   humm...Maybe I just back out this optimization...




Re: [PR] fix: do not return metrics for prom-exporter probes re-fix (#694) [solr-operator]

2024-12-09 Thread via GitHub



gerlowskija merged PR #729:
URL: https://github.com/apache/solr-operator/pull/729



Re: [PR] fix: do not return metrics for prom-exporter probes re-fix (#694) [solr-operator]

2024-12-09 Thread via GitHub



gerlowskija commented on PR #729:
URL: https://github.com/apache/solr-operator/pull/729#issuecomment-2528514511

   Thanks @smoldenhauer-ish !



Re: [I] gen-pkcs12-keystore init container fails if the tls secret contains no ca.crt [solr-operator]

2024-12-09 Thread via GitHub



gerlowskija closed issue #684: gen-pkcs12-keystore init container fails if the 
tls secret contains no ca.crt
URL: https://github.com/apache/solr-operator/issues/684



Re: [PR] fix: gen-pkcs12-keystore adds ca.crt input option if it exists (#684) [solr-operator]

2024-12-09 Thread via GitHub



gerlowskija merged PR #685:
URL: https://github.com/apache/solr-operator/pull/685



Re: [I] Adding ObservedGeneration in the CRD [solr-operator]

2024-12-09 Thread via GitHub



gerlowskija commented on issue #650:
URL: https://github.com/apache/solr-operator/issues/650#issuecomment-2528822755

   Hey @Shashankft9 , @RavinaChidambaram - can you elaborate a little on your 
broader use case?  What sort of changes/states are you trying to monitor, that 
are tough to track with the existing 'status'?



Re: [PR] fix: gen-pkcs12-keystore adds ca.crt input option if it exists (#684) [solr-operator]

2024-12-09 Thread via GitHub



gerlowskija commented on code in PR #685:
URL: https://github.com/apache/solr-operator/pull/685#discussion_r1876310454


##
helm/solr-operator/Chart.yaml:
##
@@ -54,13 +54,21 @@ annotations:
   # Add change log for a single release here.
   # Allowed syntax is described at: 
https://artifacthub.io/docs/topics/annotations/helm/#example
   artifacthub.io/changes: |
+- kind: fixed
+  description: gen-pkcs12-keystore init container fails if the tls secret 
contains no ca.crt

Review Comment:
   ```suggestion
 description: gen-pkcs12-keystore initContainer now supports 
'ca.crt'-less TLS secrets
   ```
   
   Most of the other 'descriptions' here highlight the correct/improved 
behavior, so I've suggested a similar change above to stay in line with that.




Re: [PR] fix: gen-pkcs12-keystore adds ca.crt input option if it exists (#684) [solr-operator]

2024-12-09 Thread via GitHub



gerlowskija commented on PR #685:
URL: https://github.com/apache/solr-operator/pull/685#issuecomment-2528643576

   Hi @smoldenhauer-ish - this LGTM overall.
   
   I did try to push one minor change to your branch, to reword the 
`helm/solr-operator/Chart.yaml` entry slightly.  But it looks like I don't have 
the requisite permissions to collaborate on that branch.  If you're willing to 
share write-access on the PR branch, I'll try sharing again?  (Or alternately, 
you could incorporate the ["Suggested Change" 
here](https://github.com/apache/solr-operator/pull/685/files#r1876310454).)
   
   Otherwise this is ready to merge IMO.  I'll hold off for a bit in hopes you 
see this and answer, but won't block merging the PR too long, as it's just a 
minor doc tweak.



Re: [PR] User Behavior Insights implementation for Apache Solr [solr]

2024-12-09 Thread via GitHub



epugh commented on code in PR #2452:
URL: https://github.com/apache/solr/pull/2452#discussion_r1876178441


##
solr/core/src/java/org/apache/solr/handler/component/UBIComponent.java:
##
@@ -0,0 +1,424 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.handler.component;
+
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.io.LineNumberReader;
+import java.lang.invoke.MethodHandles;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import org.apache.solr.client.solrj.io.SolrClientCache;
+import org.apache.solr.client.solrj.io.Tuple;
+import org.apache.solr.client.solrj.io.stream.StreamContext;
+import org.apache.solr.client.solrj.io.stream.TupleStream;
+import org.apache.solr.client.solrj.io.stream.expr.DefaultStreamFactory;
+import org.apache.solr.client.solrj.io.stream.expr.StreamExpression;
+import org.apache.solr.client.solrj.io.stream.expr.StreamExpressionParser;
+import org.apache.solr.client.solrj.io.stream.expr.StreamFactory;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.params.SolrParams;
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.common.util.SimpleOrderedMap;
+import org.apache.solr.core.CoreContainer;
+import org.apache.solr.core.PluginInfo;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.handler.LoggingStream;
+import org.apache.solr.response.ResultContext;
+import org.apache.solr.schema.IndexSchema;
+import org.apache.solr.search.DocIterator;
+import org.apache.solr.search.DocList;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * User Behavior Insights (UBI) is an open standard for gathering query and 
event data from users
+ * and storing it in a structured format. UBI can be used for in session 
personalization, implicit
+ * judgements, powering recommendation systems among others. Learn more about 
the UBI standard at <a href="https://ubisearch.dev">https://ubisearch.dev</a>.
+ *
+ * The response from Solr is augmented by this component, and optionally 
the query details can be
+ * tracked and logged to various systems including log files or other backend 
systems.
+ *
+ * Data tracked is a unique query_id for the search request, the end user's 
query, metadata about
+ * the query as a JSON map, and the resulting document id's.
+ *
+ * You provide a streaming expression that is parsed and loaded by the 
component to stream query
+ * data to a target of your choice. If you do not, then the default expression 
of
+ * 'logging(ubi_queries.jsonl,ubiQuery())' is used which logs data to
+ * $SOLR_HOME/userfiles/ubi_queries.jsonl file.
+ *
+ * You must source your streaming events using the 'ubiQuery()' streaming 
expression to retrieve
+ * the {@link UBIQuery} object that contains the data for recording.
+ *
+ * Event data is tracked by letting the user write events directly to the 
event repository of
+ * your choice, it could be a Solr collection, it could be a file or S3 
bucket, and that is NOT
+ * handled by this component.
+ *
+ * Add the component to a requestHandler in solrconfig.xml like this:
+ *
+ * 
+ * 
+ *
+ * 
+ *   
+ *
+ * ...
+ *
+ *   
+ *   
+ * ubi
+ *   
+ * 
+ *
+ * It can then be enabled at query time by supplying
+ *
+ * ubi=true
+ *
+ * query parameter.
+ *
+ * Ideally this component is used with the JSON Query syntax, as that 
facilitates passing in the
+ * additional data to be tracked with a query. Here is an example:
+ *
+ * 
+ * {
+ *   "query" : "apple AND ipod",
+ *   "limit": 2,
+ *   "start": 2,
+ *   "filter": [
+ *     "inStock:true"
+ *   ],
+ *   "params": {
+ *     "ubi": "true",
+ *     "user_query": "Apple iPod",
+ *     "query_attributes": {
+ *       "experiment_name": "super_secret",
+ *       "page": 2

Re: [PR] User Behavior Insights implementation for Apache Solr [solr]

2024-12-09 Thread via GitHub



epugh commented on code in PR #2452:
URL: https://github.com/apache/solr/pull/2452#discussion_r1876187008


##
solr/core/src/java/org/apache/solr/handler/component/UBIComponent.java:
##
@@ -0,0 +1,424 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.handler.component;
+
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.io.LineNumberReader;
+import java.lang.invoke.MethodHandles;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import org.apache.solr.client.solrj.io.SolrClientCache;
+import org.apache.solr.client.solrj.io.Tuple;
+import org.apache.solr.client.solrj.io.stream.StreamContext;
+import org.apache.solr.client.solrj.io.stream.TupleStream;
+import org.apache.solr.client.solrj.io.stream.expr.DefaultStreamFactory;
+import org.apache.solr.client.solrj.io.stream.expr.StreamExpression;
+import org.apache.solr.client.solrj.io.stream.expr.StreamExpressionParser;
+import org.apache.solr.client.solrj.io.stream.expr.StreamFactory;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.params.SolrParams;
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.common.util.SimpleOrderedMap;
+import org.apache.solr.core.CoreContainer;
+import org.apache.solr.core.PluginInfo;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.handler.LoggingStream;
+import org.apache.solr.response.ResultContext;
+import org.apache.solr.schema.IndexSchema;
+import org.apache.solr.search.DocIterator;
+import org.apache.solr.search.DocList;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * User Behavior Insights (UBI) is an open standard for gathering query and 
event data from users
+ * and storing it in a structured format. UBI can be used for in session 
personalization, implicit
+ * judgements, powering recommendation systems among others. Learn more about 
the UBI standard at <a href="https://ubisearch.dev">https://ubisearch.dev</a>.
+ *
+ * The response from Solr is augmented by this component, and optionally 
the query details can be
+ * tracked and logged to various systems including log files or other backend 
systems.
+ *
+ * Data tracked is a unique query_id for the search request, the end user's 
query, metadata about
+ * the query as a JSON map, and the resulting document id's.
+ *
+ * You provide a streaming expression that is parsed and loaded by the 
component to stream query
+ * data to a target of your choice. If you do not, then the default expression 
of
+ * 'logging(ubi_queries.jsonl,ubiQuery())' is used which logs data to
+ * $SOLR_HOME/userfiles/ubi_queries.jsonl file.
+ *
+ * You must source your streaming events using the 'ubiQuery()' streaming 
expression to retrieve
+ * the {@link UBIQuery} object that contains the data for recording.
+ *
+ * Event data is tracked by letting the user write events directly to the 
event repository of
+ * your choice, it could be a Solr collection, it could be a file or S3 
bucket, and that is NOT
+ * handled by this component.
+ *
+ * Add the component to a requestHandler in solrconfig.xml like this:
+ *
+ * 
+ * 
+ *
+ * 
+ *   
+ *
+ * ...
+ *
+ *   
+ *   
+ * ubi
+ *   
+ * 
+ *
+ * It can then be enabled at query time by supplying
+ *
+ * ubi=true
+ *
+ * query parameter.
+ *
+ * Ideally this component is used with the JSON Query syntax, as that 
facilitates passing in the
+ * additional data to be tracked with a query. Here is an example:
+ *
+ * 
+ * {
+ *   "query" : "apple AND ipod",
+ *   "limit": 2,
+ *   "start": 2,
+ *   "filter": [
+ *     "inStock:true"
+ *   ],
+ *   "params": {
+ *     "ubi": "true",
+ *     "user_query": "Apple iPod",
+ *     "query_attributes": {
+ *       "experiment_name": "super_secret",
+ *       "page": 2

Re: [PR] SOLR-17381 SolrJ fix to fetch entire ClusterState if asked [solr]

2024-12-09 Thread via GitHub



aparnasuresh85 commented on PR #2853:
URL: https://github.com/apache/solr/pull/2853#issuecomment-2528479499

   > Should be ready, assuming tests pass. @aparnasuresh85 happy to review with 
you.
   
   LGTM - could be merged when all tests pass



[jira] [Resolved] (SOLR-17381) Make CLUSTERSTATUS request configurable

2024-12-09 Thread David Smiley (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley resolved SOLR-17381.
-
  Assignee: David Smiley
Resolution: Fixed

> Make CLUSTERSTATUS request configurable
> ---
>
> Key: SOLR-17381
> URL: https://issues.apache.org/jira/browse/SOLR-17381
> Project: Solr
>  Issue Type: Improvement
>Reporter: Aparna Suresh
>Assignee: David Smiley
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 9.8
>
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> Fetching {{CLUSTERSTATUS}} remotely is resource-intensive and should be done 
> with caution. Currently, if no parameters are specified, the call returns all 
> information, including collections, shards, replicas, aliases, cluster 
> properties, roles, and more. This can have significant performance 
> implications for clients using a Solr cluster with thousands of collections.
> Several performance [issues|https://issues.apache.org/jira/browse/SOLR-14985] 
> have been identified when switching {{CloudSolrClient}} to use HTTP-based 
> CSP, particularly in two instances where the entire cluster state is fetched 
> unnecessarily.
> *Proposal:* Modify the requests to retrieve only the necessary information, 
> such as the cluster status for a specific collection, live nodes, or cluster 
> properties. Ensure these changes maintain backward compatibility. 
> Additionally, update the HTTP CSP to reflect these optimizations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

[jira] [Commented] (SOLR-17381) Make CLUSTERSTATUS request configurable

2024-12-09 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17904224#comment-17904224
 ] 

ASF subversion and git services commented on SOLR-17381:


Commit a9e7506c618e392d357dd5cb636a0c8772e79fc5 in solr's branch 
refs/heads/branch_9x from aparnasuresh85
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=a9e7506c618 ]

SOLR-17381: Fix CloudSolrClient.getClusterState when HTTP CSP (not ZK) (#2853)

When using the HTTP ClusterStateProvider (not ZK), getClusterState() wasn't 
working correctly; a regression from the first PR for this JIRA issue (not 
released).
Also,
* Optimization: fix O(N^2) algorithm to be O(N) for the number of collections 
when calling getClusterState
* liveNodes is now immutable, and probably a non-sorted ordering
* removed "health" and some other keys from DocCollection that aren't present 
when using ZK CSP

Minor details:
* clearly differentiate internal ClusterState processing from getting one 
DocCollection
* Use GenericSolrRequest not QueryRequest
* test both CSPs better
-

Co-authored-by: David Smiley 

(cherry picked from commit d2045f679e55e15cf4e974947b9417c4465fccd2)
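
The commit message does not show the code behind the O(N^2) to O(N) change, so the following is only a generic illustration of that kind of fix, not the actual SolrJ implementation. It assumes a hypothetical `CollectionState` holder and contrasts a per-name scan over all states with a single pass that builds a lookup map.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LinearClusterStateSketch {

  // Hypothetical stand-in for one collection's state; not a Solr class.
  static final class CollectionState {
    final String name;
    final int shardCount;

    CollectionState(String name, int shardCount) {
      this.name = name;
      this.shardCount = shardCount;
    }
  }

  // Quadratic shape: each requested name triggers a scan over all states.
  static Map<String, CollectionState> indexQuadratic(
      List<String> names, List<CollectionState> states) {
    Map<String, CollectionState> byName = new LinkedHashMap<>();
    for (String name : names) {
      for (CollectionState state : states) {
        if (state.name.equals(name)) {
          byName.put(name, state);
          break;
        }
      }
    }
    return byName;
  }

  // Linear shape: a single pass over the states builds the same lookup table.
  static Map<String, CollectionState> indexLinear(List<CollectionState> states) {
    Map<String, CollectionState> byName = new LinkedHashMap<>();
    for (CollectionState state : states) {
      byName.put(state.name, state);
    }
    return byName;
  }

  public static void main(String[] args) {
    List<CollectionState> states =
        List.of(new CollectionState("products", 2), new CollectionState("logs", 4));
    Map<String, CollectionState> byName = indexLinear(states);
    System.out.println("shards in logs: " + byName.get("logs").shardCount);
  }
}
```

Both methods return the same map for the same input; only the amount of work per collection differs, which is what matters for clusters with thousands of collections.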


> Make CLUSTERSTATUS request configurable
> ---
>
> Key: SOLR-17381
> URL: https://issues.apache.org/jira/browse/SOLR-17381
> Project: Solr
>  Issue Type: Improvement
>Reporter: Aparna Suresh
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 9.8
>
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> Fetching {{CLUSTERSTATUS}} remotely is resource-intensive and should be done 
> with caution. Currently, if no parameters are specified, the call returns all 
> information, including collections, shards, replicas, aliases, cluster 
> properties, roles, and more. This can have significant performance 
> implications for clients using a Solr cluster with thousands of collections.
> Several performance [issues|https://issues.apache.org/jira/browse/SOLR-14985] 
> have been identified when switching {{CloudSolrClient}} to use HTTP-based 
> CSP, particularly in two instances where the entire cluster state is fetched 
> unnecessarily.
> *Proposal:* Modify the requests to retrieve only the necessary information, 
> such as the cluster status for a specific collection, live nodes, or cluster 
> properties. Ensure these changes maintain backward compatibility. 
> Additionally, update the HTTP CSP to reflect these optimizations.




Re: [PR] fix: gen-pkcs12-keystore adds ca.crt input option if it exists (#684) [solr-operator]

2024-12-09 Thread via GitHub



smoldenhauer-ish commented on PR #685:
URL: https://github.com/apache/solr-operator/pull/685#issuecomment-2528760533

   Hi @gerlowskija - added your suggestion.



Re: [PR] User Behavior Insights implementation for Apache Solr [solr]

2024-12-09 Thread via GitHub



epugh commented on code in PR #2452:
URL: https://github.com/apache/solr/pull/2452#discussion_r1876795494


##
solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java:
##
@@ -126,19 +125,24 @@ public class SearchHandler extends RequestHandlerBase
   private PluginInfo shfInfo;
   private SolrCore core;
 
+  /**
+   * The default set of components that every handler gets. You can change 
this by defining the
+   * specific components for a handler. It puts the {@link QueryComponent} 
first as subsequent
+   * components assume that the QueryComponent ran and populated the document 
list.
+   *
+   * @return A list of component names.
+   */
   protected List<String> getDefaultComponents() {
-ArrayList<String> names = new ArrayList<>(9);
-names.add(QueryComponent.COMPONENT_NAME);
-names.add(FacetComponent.COMPONENT_NAME);
-names.add(FacetModule.COMPONENT_NAME);
-names.add(MoreLikeThisComponent.COMPONENT_NAME);
-names.add(HighlightComponent.COMPONENT_NAME);
-names.add(StatsComponent.COMPONENT_NAME);
-names.add(DebugComponent.COMPONENT_NAME);
-names.add(ExpandComponent.COMPONENT_NAME);
-names.add(TermsComponent.COMPONENT_NAME);
-
-return names;
+List<String> l = new 
ArrayList<String>(SearchComponent.STANDARD_COMPONENTS.keySet());

Review Comment:
   check out the changes I made to back this out, but still keep the name 
`STANDARD_COMPONENTS`...
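
As a rough sketch of the pattern in the diff above: the default component names can be derived by copying the key set of an ordered map, so that the query component stays first. The map below is a hypothetical stand-in for `SearchComponent.STANDARD_COMPONENTS`; its contents, value type, and the assumption that it preserves insertion order are illustrative and not taken from the PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DefaultComponentsSketch {

  // Hypothetical registry: name -> implementation class, in the order components should run.
  // A LinkedHashMap is assumed so that keySet() iterates in insertion order.
  static final Map<String, Class<?>> STANDARD_COMPONENTS = new LinkedHashMap<>();

  static {
    STANDARD_COMPONENTS.put("query", Object.class); // must come first
    STANDARD_COMPONENTS.put("facet", Object.class);
    STANDARD_COMPONENTS.put("highlight", Object.class);
    STANDARD_COMPONENTS.put("debug", Object.class);
  }

  // Returns a mutable copy, so callers can adjust the list without touching the registry.
  static List<String> getDefaultComponents() {
    return new ArrayList<>(STANDARD_COMPONENTS.keySet());
  }

  public static void main(String[] args) {
    System.out.println(getDefaultComponents());
  }
}
```

Copying the key set, rather than returning it directly, keeps the shared registry from being modified when a handler customizes its own component list.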



##
solr/core/src/java/org/apache/solr/handler/component/UBIComponent.java:
##
@@ -0,0 +1,424 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.handler.component;
+
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.io.LineNumberReader;
+import java.lang.invoke.MethodHandles;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import org.apache.solr.client.solrj.io.SolrClientCache;
+import org.apache.solr.client.solrj.io.Tuple;
+import org.apache.solr.client.solrj.io.stream.StreamContext;
+import org.apache.solr.client.solrj.io.stream.TupleStream;
+import org.apache.solr.client.solrj.io.stream.expr.DefaultStreamFactory;
+import org.apache.solr.client.solrj.io.stream.expr.StreamExpression;
+import org.apache.solr.client.solrj.io.stream.expr.StreamExpressionParser;
+import org.apache.solr.client.solrj.io.stream.expr.StreamFactory;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.params.SolrParams;
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.common.util.SimpleOrderedMap;
+import org.apache.solr.core.CoreContainer;
+import org.apache.solr.core.PluginInfo;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.handler.LoggingStream;
+import org.apache.solr.response.ResultContext;
+import org.apache.solr.schema.IndexSchema;
+import org.apache.solr.search.DocIterator;
+import org.apache.solr.search.DocList;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * User Behavior Insights (UBI) is an open standard for gathering query and 
event data from users
+ * and storing it in a structured format. UBI can be used for in session 
personalization, implicit
+ * judgements, powering recommendation systems among others. Learn more about 
the UBI standard at <a href="https://ubisearch.dev">https://ubisearch.dev</a>.
+ *
+ * The response from Solr is augmented by this component, and optionally 
the query details can be
+ * tracked and logged to various systems including log files or other backend 
systems.
+ *
+ * Data tracked is a unique query_id for the search request, the end user's 
query, metadata about
+ * the query as a JSON map, and the resulting document id's.
+ *
+ * You provide a streaming expression that is parsed and loaded by the 
component to stream query
+ * data to a target of your choice. If you do not, then the default expression 
of
+ * 'logging(ubi_queries.jsonl,ubiQuery())' is used which logs data to
+ * $SOLR_HOME/userfiles/ubi_queries.jsonl file.
+ *
+ * You must source your streaming events using the

Re: [I] metadata.annotations: Too long: must have at most 262144 bytes [solr-operator]

2024-12-09 Thread via GitHub



HoustonPutman closed issue #732: metadata.annotations: Too long: must have at 
most 262144 bytes
URL: https://github.com/apache/solr-operator/issues/732



Re: [I] metadata.annotations: Too long: must have at most 262144 bytes [solr-operator]

2024-12-09 Thread via GitHub



HoustonPutman commented on issue #732:
URL: https://github.com/apache/solr-operator/issues/732#issuecomment-2529644607

   Please refer to the answer here: 
https://github.com/apache/solr-operator/issues/502#issuecomment-1339758128



[I] metadata.annotations: Too long: must have at most 262144 bytes [solr-operator]

2024-12-09 Thread via GitHub



andrewrothstein opened a new issue, #732:
URL: https://github.com/apache/solr-operator/issues/732

   kubectl apply -f "https://solr.apache.org/operator/downloads/crds/v0.8.1/all-with-dependencies.yaml"
   
   customresourcedefinition.apiextensions.k8s.io/solrbackups.solr.apache.org configured
   customresourcedefinition.apiextensions.k8s.io/solrprometheusexporters.solr.apache.org configured
   customresourcedefinition.apiextensions.k8s.io/zookeeperclusters.zookeeper.pravega.io configured
   The CustomResourceDefinition "solrclouds.solr.apache.org" is invalid: 
metadata.annotations: Too long: must have at most 262144 bytes


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

[jira] [Updated] (SOLR-17588) javabin must support primitive arrays

2024-12-09 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-17588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SOLR-17588:
--
Labels: pull-request-available  (was: )

> javabin must support primitive arrays
> -
>
> Key: SOLR-17588
> URL: https://issues.apache.org/jira/browse/SOLR-17588
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> javabin does not support primitive arrays such as float[], long[], int[] etc. 
> So, now we need to convert them to List<Float>, List<Long>, List<Integer> etc 
> which is inefficient and inconvenient




[jira] [Created] (SOLR-17588) javabin must support primitive arrays

2024-12-09 Thread Noble Paul (Jira)

Noble Paul created SOLR-17588:
-

 Summary: javabin must support primitive arrays
 Key: SOLR-17588
 URL: https://issues.apache.org/jira/browse/SOLR-17588
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrJ
Reporter: Noble Paul
Assignee: Noble Paul


javabin does not support primitive arrays such as float[], long[], int[] etc. 
So, now we need to convert them to List<Float>, List<Long>, List<Integer> etc 
which is inefficient and inconvenient
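
To make the cost concrete, here is a minimal sketch in plain Java (no Solr classes; the class and method names are illustrative and not part of SolrJ) of the conversion a caller has to do when only list types can be written: every element of a `float[]` is boxed into a separate `Float` object.

```java
import java.util.ArrayList;
import java.util.List;

public class PrimitiveArrayBoxingSketch {

  // Hypothetical helper: copies a float[] into a List<Float>, boxing every element.
  static List<Float> toBoxedList(float[] values) {
    List<Float> boxed = new ArrayList<>(values.length);
    for (float v : values) {
      boxed.add(v); // autoboxing wraps each primitive float in a Float object
    }
    return boxed;
  }

  public static void main(String[] args) {
    float[] vector = {0.1f, 0.2f, 0.3f};
    List<Float> boxed = toBoxedList(vector);
    System.out.println(boxed.size() + " boxed elements from a float[" + vector.length + "]");
  }
}
```

The extra copy and the per-element objects are the overhead the issue refers to; native support for primitive arrays would let callers pass the array as-is.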





Re: [PR] upgrade-log4j2-2.24.2 [solr]

2024-12-09 Thread via GitHub



janhoy commented on PR #2895:
URL: https://github.com/apache/solr/pull/2895#issuecomment-2530705029

   I have no preference when license is ambiguous. Guess whatever is in the 
download tar for that version is safe.



Re: [PR] [WIP] Jetty12 + EE8 [solr]

2024-12-09 Thread via GitHub



iamsanjay commented on PR #2876:
URL: https://github.com/apache/solr/pull/2876#issuecomment-2530557275

   
[c3c2f1c](https://github.com/apache/solr/pull/2876/commits/c3c2f1c11ef40a34484a4d322be56f4384857db6)
 removed hadoop-auth
   
[f7d900b](https://github.com/apache/solr/pull/2876/commits/f7d900b54fbeb31a8f5ee244a0166276f10f70f0)
 is where the branch is upgraded to EE10, all the usage of javax.servlet has 
been replaced with jakarta.servlet
   
   With this, changing the title of this PR as well.
   


