Can I insert query result into another collection of the same Solr?
Hello,

I ran a query against one collection and want to insert its results into another collection in the same Solr instance. The query results have the same fields as the target collection. Is there a simple way to do this?

If the query results have to be moved outside Solr before being added to the other collection, it would not be very efficient when the result set is very large.

Any relevant information would be welcome.

Kind regards,
Zhiqing
Re: Can I insert query result into another collection of the same Solr?
Hi,

You can look at the update() stream from the streaming API of Solr. It can take a search expression, and the emitted tuples can be added to a new collection.

https://solr.apache.org/guide/8_4/stream-decorator-reference.html

Sent from my iPhone
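To make that concrete: a sketch of such an expression, using hypothetical collection names ("source", "target") and made-up field names -- the update() decorator pushes tuples from the inner search() into the target collection, and commit() commits them:

```
commit(target,
  update(target, batchSize=500,
    search(source,
           q="*:*",
           fl="id,field_a,field_b",
           sort="id asc",
           qt="/export")))
```

Using qt="/export" lets the inner stream return the full result set rather than a single page, which matters for large results.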
RE: Using Schema API breaks the Upload of Config Set file
> > This does sound like a bug ;-(
>
> So I will create a bug report for this.

Tracked in JIRA as SOLR-16110, "Using Schema/Config API breaks the File-Upload of Config Set File":

https://issues.apache.org/jira/browse/SOLR-16110
Representative filtering of very large result sets
We are using the collapse query parser for consolidating results based on a field value, and are also faceting on a number of other fields. The collapse field and the facet fields all have docValues=true. For very large result sets (millions of documents), the heap usage gets a little out of hand, and the resulting GC is problematic.

I am trying to figure out how to reduce the number of documents that are being faceted over, and still display facets that are "representative" of the entire result set. Some sort of filter query seems to be the obvious answer, but what? I don't want to accidentally exclude my most relevant results.

How can I facet over only the top N results?

Thanks for any tips.

--
Jeremy Buckley
Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17
I have seen the emails about Solr not being affected by the DoS vulnerability associated with Log4j 2.16, but Solr failed a security scan because of it, and the bosses want it upgraded.

Can someone tell me where I can download an upgrade or patch for Log4j, along with instructions on how to apply it?

Thanks,
George
Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17
> On Mar 23, 2022, at 1:36 PM, Heller, George A III CTR (USA) wrote:
>
> Can someone tell me where I can download an upgrade or patch for LOG4J and
> instructions on how to implement it?

See https://solr.apache.org/security.html
RE: Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17
Whatever you sent got removed by our email filters. Can you please resend as text?

Thanks,
George

-----Original Message-----
From: Andy Lester
Sent: Wednesday, March 23, 2022 2:55 PM
To: users@solr.apache.org
Subject: Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17

> See https://solr.apache.org/security.html
Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17
Go to the https://solr.apache.org/security.html URL and you will find instructions there on what to do.

Andy
Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17
Here's the issue where Log4j was upgraded. You can look at the pull request there to find out what you need to change. After that, you can build your own Solr binaries: apply the fix to branch_8_11 of github.com/apache/lucene-solr and build with "ant ivy-bootstrap; cd solr; ant package", which will generate a .tgz file.

https://issues.apache.org/jira/browse/SOLR-15843
Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17
And feel free to open a new JIRA for this Log4j upgrade; it will get picked up in 8.11.2 (whenever someone gets time to release it).
Re: Can I insert query result into another collection of the same Solr?
Hi Susmit,

Thanks for your reply. Since I do not have much experience with the streaming API of Solr, I can only understand part of that page, and I do not know how to implement the relevant parts with SolrJ. Could you recommend some books or webpages that contain examples for the streaming API?

Looking forward to your reply.

Kind regards,
Zhiqing
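One note that may help while you look into SolrJ: you do not strictly need SolrJ to run a streaming expression, since any collection exposes a /stream handler over HTTP. A hedged sketch -- the collection names "source" and "target", the field list, and the host/port are all placeholders for your setup:

```shell
# Build the streaming expression that copies tuples from "source" to "target".
# (Hypothetical collection and field names; adjust to your schema.)
EXPR='commit(target, update(target, search(source, q="*:*", fl="id,title", sort="id asc", qt="/export")))'

# It would be sent to the /stream handler of a collection, e.g.:
# curl --data-urlencode "expr=$EXPR" "http://localhost:8983/solr/source/stream"

printf '%s\n' "$EXPR"
```

The whole copy then runs inside Solr; only the small JSON status response of the /stream call crosses the wire back to the client.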
Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17
Please do not create another JIRA; it is already committed, just waiting on the 8.11.2 release.

https://issues.apache.org/jira/browse/SOLR-15871

The suggestion across multiple threads on the users list has been to remove the Log4j jar and replace it with the 2.17.1 jar, which will pass security checks.
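The remove-and-replace pattern can be rehearsed safely before touching a real install. The sketch below is purely illustrative -- it uses empty placeholder files in a scratch directory, and the jar names/versions are the usual ones but should be checked against your own install (in Solr 8.x the Log4j jars live under server/lib/ext):

```shell
# Rehearse the jar swap in a throwaway directory with empty placeholder files.
demo=$(mktemp -d)
cd "$demo"
touch log4j-api-2.16.0.jar log4j-core-2.16.0.jar log4j-slf4j-impl-2.16.0.jar

# Remove every 2.16.0 Log4j jar...
rm -f log4j-*-2.16.0.jar

# ...and put the 2.17.1 jars (downloaded from the Log4j site) in their place.
touch log4j-api-2.17.1.jar log4j-core-2.17.1.jar log4j-slf4j-impl-2.17.1.jar

ls "$demo"
```

The key point, as noted elsewhere in this thread, is to swap every Log4j jar to the same version rather than leaving a mix.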
Re: Representative filtering of very large result sets
It sounds like you are collapsing on a high-cardinality field and/or faceting on high-cardinality fields. Can you describe the cardinality of the fields so we can get an idea of how large the problem is?

Joel Bernstein
http://joelsolr.blogspot.com/
Re: Representative filtering of very large result sets
The number of documents in the collection is about 90 million. The collapse field has about 30 million distinct values, so I guess that qualifies as high cardinality. We used to use result grouping but switched to collapse for improved performance.

The faceting fields are more of a mix: 5-10 fields ranging from around a dozen to around 250,000 distinct values.
Re: Representative filtering of very large result sets
Collapsing on 30 million distinct values is going to cause memory problems for sure. If the heap grows as the result set grows, you are likely using a newer version of Solr, which collapses into a hashmap. Older versions of Solr collapsed into an array 30 million elements long, which probably would have blown up memory even with small result sets.

I think you're going to need to shard to get this to perform well. With SolrCloud you can shard on the collapse key (https://solr.apache.org/guide/8_7/shards-and-indexing-data-in-solrcloud.html#document-routing). This will send all documents with the same collapse key to the same shard. Then run the collapse query on the sharded collection.

Joel Bernstein
http://joelsolr.blogspot.com/
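For reference, with SolrCloud's default compositeId router, routing on the collapse key just means prefixing each document id with that key as "<key>!<id>". A small sketch -- the collection name, field names, and ids here are all made up:

```shell
# Documents sharing the "grp42!" prefix land on the same shard under the
# compositeId router. (Hypothetical ids and fields.)
cat > /tmp/routed-docs.json <<'EOF'
[
  {"id": "grp42!doc1", "groupKey": "grp42", "title": "first"},
  {"id": "grp42!doc2", "groupKey": "grp42", "title": "second"}
]
EOF

# They would then be indexed as usual, e.g.:
# curl -X POST -H 'Content-Type: application/json' \
#   'http://localhost:8983/solr/mycoll/update?commit=true' \
#   --data-binary @/tmp/routed-docs.json

cat /tmp/routed-docs.json
```

Because every document with a given collapse key sits on one shard, each shard can collapse its own slice independently without cross-shard coordination on the key.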
Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17
On 3/23/2022 12:36 PM, Heller, George A III CTR (USA) wrote:
> Can someone tell me where I can download an upgrade or patch for LOG4J and
> instructions on how to implement it?

Did you try googling? If I enter "log4j download" (minus the quotes) into Google, the first hit looks like exactly what you want. You'll want the "binary" download, in either .tar.gz or .zip format.

As for what to do with it once you download it: find all the log4j jars in your Solr directory and replace them with the jars from the Log4j archive that have the same names but different version numbers. There has been a fair amount of user testing, and we have determined that this is a safe operation, as long as you don't leave some jars at a different version than the rest. The Log4j public API is very stable, which is why this is safe to do, but I have no idea how stable their internal APIs are.

Depending on the exact Solr version you have, you may have a jar whose name starts with "log4j-layout-template-json"; this jar is not in the Log4j download. If you have not changed Solr's logging configuration to output JSON-formatted logs, you can safely delete this one jar. If you actually need an upgraded version of it, you can find it on Maven Central:

https://repo1.maven.org/maven2/org/apache/logging/log4j/log4j-layout-template-json/2.17.2/log4j-layout-template-json-2.17.2.jar

Thanks,
Shawn

https://lmgtfy.app/?q=log4j+download
Secure SSL connections between Solr and ZooKeeper
According to the "Enabling SSL" section of the Apache Solr 8.11 Reference Guide [1]:

> ZooKeeper does not support encrypted communication with clients like
> Solr. There are several related JIRA tickets where SSL support is
> being planned/worked on:
> https://issues.apache.org/jira/browse/ZOOKEEPER-235
> https://issues.apache.org/jira/browse/ZOOKEEPER-236
> https://issues.apache.org/jira/browse/ZOOKEEPER-1000
> https://issues.apache.org/jira/browse/ZOOKEEPER-2120

However, that appears to be outdated information, since Apache ZooKeeper has supported encrypted communication with clients since around ZooKeeper 3.5 (the current stable ZooKeeper is version 3.6.3). How do I configure Solr to use SSL when communicating with ZooKeeper?

In my ZooKeeper configuration (zoo.cfg), I have this:

secureClientPort=2182
#clientPort=2181 # Disabled. Allow secure connections only.
ssl.clientAuth=need
ssl.keystore.location=/opt/zookeeper/conf/zk-keystore.jks
ssl.keystore.password=123456
ssl.truststore.location=/opt/zookeeper/conf/zk-truststore.jks
ssl.truststore.password=123456
# ...

Now, what should I do on the SolrCloud side to connect to ZooKeeper using SSL?

[1]: https://solr.apache.org/guide/8_11/enabling-ssl.html#configure-zookeeper
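I can't confirm this for every Solr 8.x release, but on the Solr side the embedded ZooKeeper client is configured through JVM system properties, so one hedged approach is a solr.in.sh fragment like the one below. It uses the standard ZooKeeper 3.5+ client property names; the keystore/truststore paths, passwords, and hostnames are placeholders. Note that client-side SSL requires the Netty connection socket, hence the clientCnxnSocket property:

```shell
# Config fragment for solr.in.sh (paths, passwords, hosts are placeholders).
SOLR_OPTS="$SOLR_OPTS -Dzookeeper.client.secure=true"
SOLR_OPTS="$SOLR_OPTS -Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty"
SOLR_OPTS="$SOLR_OPTS -Dzookeeper.ssl.keyStore.location=/opt/solr/server/etc/solr-keystore.jks"
SOLR_OPTS="$SOLR_OPTS -Dzookeeper.ssl.keyStore.password=123456"
SOLR_OPTS="$SOLR_OPTS -Dzookeeper.ssl.trustStore.location=/opt/solr/server/etc/solr-truststore.jks"
SOLR_OPTS="$SOLR_OPTS -Dzookeeper.ssl.trustStore.password=123456"

# Point ZK_HOST at the secureClientPort from zoo.cfg, not the plain port:
ZK_HOST="zk1:2182,zk2:2182,zk3:2182"
```

Since your zoo.cfg sets ssl.clientAuth=need, the Solr keystore certificate must also be trusted by ZooKeeper's truststore (mutual TLS), not just the other way around.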