Solr Error
Hi All, I am getting the error below in Solr, which is affecting the services that access the dqmStore collection. Please suggest if you have come across this issue. 2022-03-24 08:48:36.629 ERROR (qtp1198197478-620853) [c:dqmStore s:shard1 r:core_node4 x:dqmStore_shard1_replica_n2] o.a.s.u.SolrCmdDistributor java.io.IOException: Request processing has stalled for 20036ms with 100 remaining elements in the queue. -- Thanks and Regards, Hari Mobile: 9790756568
Re: Representative filtering of very large result sets
Thanks, Joel, that is exactly what we are doing. We have four shards and are sharding on the collapse key. Performance is fine (subsecond) as long as the result set is relatively small. I am really looking for the best way to ensure that this is always true. On Wed, Mar 23, 2022 at 10:18 PM Joel Bernstein wrote: > To collapse on 30 million distinct values is going to cause memory problems > for sure. If the heap is growing as the result set grows that means you are > likely using a newer version of Solr which collapses into a hashmap. Older > versions of Solr would collapse into an array 30 million in length which > probably would have blown up memory with even small result sets. > > I think you're going to need to shard to get this to perform well. With > SolrCloud you can shard on the collapse key ( > > https://solr.apache.org/guide/8_7/shards-and-indexing-data-in-solrcloud.html#document-routing > ). > This will send all documents with the same collapse key to the same shard. > Then run the collapse query on the sharded collection. > > Joel Bernstein > http://joelsolr.blogspot.com/ > >
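To make the document routing Joel references concrete: with SolrCloud's default compositeId router, the part of the document ID before "!" is hashed to choose the shard, so every document sharing a collapse-key prefix lands on the same shard and the collapse can be computed shard-locally. A minimal SolrJ sketch under stated assumptions; the URL, collection name, and field names (id, dedupKey) are hypothetical:

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class CollapseKeyRouting {
    public static void main(String[] args) throws Exception {
        try (SolrClient client =
                 new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
            SolrInputDocument doc = new SolrInputDocument();
            // compositeId router: the prefix before "!" is hashed to pick the
            // shard, so every doc with collapse key "k123" (hypothetical)
            // lands on the same shard.
            doc.addField("id", "k123" + "!" + "doc456");
            doc.addField("dedupKey", "k123");
            client.add("mycollection", doc);
            client.commit("mycollection");
        }
    }
}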
RE: [URL Verdict: Neutral][Non-DoD Source] Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17
What happens if we need to deploy to production before 8.11.2 is released? -Original Message- From: Houston Putman Sent: Wednesday, March 23, 2022 7:15 PM To: users@solr.apache.org Subject: [URL Verdict: Neutral][Non-DoD Source] Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17 All active links contained in this email were disabled. Please verify the identity of the sender, and confirm the authenticity of all links contained within the message prior to copying and pasting the address to a Web browser. Please do not create another JIRA, it is already committed, just waiting on the 8.11.2 release. Caution-https://issues.apache.org/jira/browse/SOLR-15871 The suggestion across multiple threads in the users list has been to remove the log4j jar, and replace it with the 2.17.1 jar, which will pass security checks. On Wed, Mar 23, 2022 at 5:53 PM Ishan Chattopadhyaya < ichattopadhy...@gmail.com> wrote: > And feel free to open a new JIRA for this log4j upgrade, it will get > picked up in 8.11.2 (whenever someone gets time to release it). > > On Thu, Mar 24, 2022 at 3:18 AM Ishan Chattopadhyaya < > ichattopadhy...@gmail.com> wrote: > > > Here's the issue where Log4J was upgraded. You can look at the pull > > request there to find out what you need to change. After that, you > > can build your own Solr binaries for your use (fix in > > github.com/apache/lucene-solr's branch_8_11 and build using "ant > > ivy-bootstrap; cd solr; ant package" which will generate a .tgz file). > > Caution-https://issues.apache.org/jira/browse/SOLR-15843 > > > > On Thu, Mar 24, 2022 at 12:42 AM Andy Lester wrote: > > > >> Go to the Caution-https://solr.apache.org/security.html URL and you > >> will find instructions there on what to do. > >> > >> Andy > > > >
Re: [URL Verdict: Neutral][Non-DoD Source] Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17
You need to manage your risk in that case -- which is worse: a potential log4j vulnerability, your own "hacked" Solr WAR, deploying a pre-release, or delaying the prod rollout? Will your security scan team allow you to give a mitigation plan and a timeline for a prod upgrade? On Thu, Mar 24, 2022 at 8:33 AM Heller, George A III CTR (USA) wrote: > What happens if we need to deploy to production before 8.11.2 is released? > > -Original Message- > From: Houston Putman > Sent: Wednesday, March 23, 2022 7:15 PM > To: users@solr.apache.org > Subject: [URL Verdict: Neutral][Non-DoD Source] Re: Solr 8.11.1 upgrading > LOG4J from 2.16 to 2.17 > > All active links contained in this email were disabled. Please verify the > identity of the sender, and confirm the authenticity of all links contained > within the message prior to copying and pasting the address to a Web > browser. > > > > > > > Please do not create another JIRA, it is already committed, just waiting > on the 8.11.2 release. > > Caution-https://issues.apache.org/jira/browse/SOLR-15871 > > The suggestion across multiple threads in the users list has been to > remove the log4j jar, and replace it with the 2.17.1 jar, which will pass > security checks. > > On Wed, Mar 23, 2022 at 5:53 PM Ishan Chattopadhyaya < > ichattopadhy...@gmail.com> wrote: > > > And feel free to open a new JIRA for this log4j upgrade, it will get > > picked up in 8.11.2 (whenever someone gets time to release it). > > > > On Thu, Mar 24, 2022 at 3:18 AM Ishan Chattopadhyaya < > > ichattopadhy...@gmail.com> wrote: > > > > > Here's the issue where Log4J was upgraded. You can look at the pull > > > request there to find out what you need to change. After that, you > > > can build your own Solr binaries for your use (fix in > > > github.com/apache/lucene-solr's branch_8_11 and build using "ant > > > ivy-bootstrap; cd solr; ant package" which will generate a .tgz file). > > > Caution-https://issues.apache.org/jira/browse/SOLR-15843 > > > > > > On Thu, Mar 24, 2022 at 12:42 AM Andy Lester > wrote: > > > > > >> Go to the Caution-https://solr.apache.org/security.html URL and you > > >> will find instructions there on what to do. > > >> > > >> Andy > > > > > > > > >
Re: Representative filtering of very large result sets
Yeah, that's a tricky problem: keeping the result set small without losing results. I don't have an answer beyond what you already mentioned, which would be to limit the query in some way. Joel Bernstein http://joelsolr.blogspot.com/ On Thu, Mar 24, 2022 at 8:24 AM Jeremy Buckley - IQ-C wrote: > Thanks, Joel, that is exactly what we are doing. We have four shards and > are sharding on the collapse key. Performance is fine (subsecond) as long > as the result set is relatively small. I am really looking for the best > way to ensure that this is always true. > > On Wed, Mar 23, 2022 at 10:18 PM Joel Bernstein > wrote: > > > To collapse on 30 million distinct values is going to cause memory > problems > > for sure. If the heap is growing as the result set grows that means you > are > > likely using a newer version of Solr which collapses into a hashmap. > Older > > versions of Solr would collapse into an array 30 million in length which > > probably would have blown up memory with even small result sets. > > > > I think you're going to need to shard to get this to perform well. With > > SolrCloud you can shard on the collapse key ( > > > > > https://solr.apache.org/guide/8_7/shards-and-indexing-data-in-solrcloud.html#document-routing > > ). > > This will send all documents with the same collapse key to the same > shard. > > Then run the collapse query on the sharded collection. > > > > Joel Bernstein > > http://joelsolr.blogspot.com/ > > > > >
Re: Representative filtering of very large result sets
Are you determining your "top doc" for each collapsed group based on score? If your use case is such that you determine the "top doc" based on a static field with a manageable number of values, you may have other options available to you. (For some use cases it can be acceptable to "pre-filter" the domain with creative fq params. This works iff your "collapse" could be considered a type of "deduplication" with doc priority determined by a static field; but it's a non-starter if you know you need to search over the full uncollapsed domain.) Michael On Thu, Mar 24, 2022 at 9:11 AM Joel Bernstein wrote: > Yeah, that's a tricky problem. Keeping the result set small without losing > results. I don't have an answer except as you already mentioned which would > be to limit the query in some way. > > > Joel Bernstein > http://joelsolr.blogspot.com/ > > > On Thu, Mar 24, 2022 at 8:24 AM Jeremy Buckley - IQ-C > wrote: > > > Thanks, Joel, that is exactly what we are doing. We have four shards and > > are sharding on the collapse key. Performance is fine (subsecond) as > long > > as the result set is relatively small. I am really looking for the best > > way to ensure that this is always true. > > > > On Wed, Mar 23, 2022 at 10:18 PM Joel Bernstein > > wrote: > > > > > To collapse on 30 million distinct values is going to cause memory > > problems > > > for sure. If the heap is growing as the result set grows that means you > > are > > > likely using a newer version of Solr which collapses into a hashmap. > > Older > > > versions of Solr would collapse into an array 30 million in length > which > > > probably would have blown up memory with even small result sets. > > > > > > I think you're going to need to shard to get this to perform well. With > > > SolrCloud you can shard on the collapse key ( > > > > > > > > > https://solr.apache.org/guide/8_7/shards-and-indexing-data-in-solrcloud.html#document-routing > > > ). > > > This will send all documents with the same collapse key to the same > > shard. > > > Then run the collapse query on the sharded collection. > > > > > > Joel Bernstein > > > http://joelsolr.blogspot.com/ > > > > > > > > >
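A small sketch of the contrast Michael is drawing: query-time collapse versus a plain pre-filtering fq when the per-group "top doc" is determined by a static field. The field names (dedupKey, priority) and the filter value are hypothetical:

import org.apache.solr.client.solrj.SolrQuery;

public class PreFilterSketch {
    public static void main(String[] args) {
        // Query-time collapse: pick the top doc per group by the highest
        // value of a static "priority" field (hypothetical names throughout).
        SolrQuery collapsed = new SolrQuery("user query terms");
        collapsed.addFilterQuery("{!collapse field=dedupKey max=priority}");

        // Pre-filter alternative: if the "top" doc per group is always the
        // one with priority=1, a plain fq restricts the domain up front and
        // avoids building per-group collapse structures at all. This only
        // works when the collapse is really deduplication by a static field.
        SolrQuery prefiltered = new SolrQuery("user query terms");
        prefiltered.addFilterQuery("priority:1");
    }
}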
Re: Representative filtering of very large result sets
Thanks, Michael. I think this will work, and it is the direction I am heading. We are collapsing for deduplication, sort of. We do need to search over the full uncollapsed domain, but I am pretty sure that nobody needs to see 40 million results, and if they're dumb enough to enter a query that matches that many documents, they deserve whatever they get. So my strategy is: 1. Check the query to see if it looks "safe" based on some heuristics. 2. If (1) fails, do a search to get only the result count with rows=0 and no faceting or sorting. This is usually pretty fast. 3. If the count returned in (2) is above a certain threshold, add my extra filter query before executing the full faceted search. Thanks, everyone! On Thu, Mar 24, 2022 at 10:04 AM Michael Gibney wrote: > Are you determining your "top doc" for each collapsed group based on score? > If your use case is such that you determine the "top doc" based on a static > field with a manageable number of values, you may have other options > available to you. (For some use cases it can be acceptable to "pre-filter" > the domain with creative fq params. This works iff your "collapse" could be > considered a type of "deduplication" with doc priority determined by a > static field; but it's a non-starter if you know you need to search over > the full uncollapsed domain.) > > Michael >
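To make that three-step guard concrete, here is a rough SolrJ sketch under stated assumptions: the URL/collection, the 1,000,000 threshold, the facet field "category", the collapse field dedupKey, and the guard filter priority:1 are all hypothetical, and the step-1 heuristic is a stand-in for whatever "looks safe" check the application uses:

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class GuardedCollapseSearch {
    private static final long THRESHOLD = 1_000_000L; // hypothetical cutoff

    public static void main(String[] args) throws Exception {
        String userQuery = "some user query";
        try (SolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/mycollection").build()) {
            boolean needsGuard = !looksSafe(userQuery); // step 1
            if (needsGuard) {
                // Step 2: cheap count-only probe; no rows, facets, or sorting.
                SolrQuery probe = new SolrQuery(userQuery);
                probe.setRows(0);
                QueryResponse rsp = client.query(probe);
                needsGuard = rsp.getResults().getNumFound() > THRESHOLD;
            }
            // Step 3: the full faceted, collapsed search, with the extra
            // restricting fq only when the result set is too large.
            SolrQuery full = new SolrQuery(userQuery);
            full.addFilterQuery("{!collapse field=dedupKey}");
            full.addFacetField("category");
            if (needsGuard) {
                full.addFilterQuery("priority:1"); // hypothetical guard filter
            }
            QueryResponse results = client.query(full);
            System.out.println("hits: " + results.getResults().getNumFound());
        }
    }

    // Step 1 placeholder: real heuristics are application-specific.
    private static boolean looksSafe(String q) {
        return q.trim().split("\\s+").length >= 3;
    }
}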
RE: [URL Verdict: Neutral][Non-DoD Source] Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17
Thanks for the helpful info. I did notice there was no equivalent for log4j-layout-template-json-2.16.0. We have changed the query return from XML to JSON, but have done nothing to the logging. I will implement this solution after lunch and test it (make sure Solr runs and populates the logs). BTW, do you know of anything other than nssm or AlwaysUp that would create a Windows service to start Solr when the server is rebooted? NSSM failed our security scan and not sure if cheap bosses want to pay the small fee for AlwaysUp. Thanks Again for Your Helpful Info, George -Original Message- From: Shawn Heisey Sent: Wednesday, March 23, 2022 10:55 PM To: users@solr.apache.org Subject: [URL Verdict: Neutral][Non-DoD Source] Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17 All active links contained in this email were disabled. Please verify the identity of the sender, and confirm the authenticity of all links contained within the message prior to copying and pasting the address to a Web browser. On 3/23/2022 12:36 PM, Heller, George A III CTR (USA) wrote: > Can someone tell me where I can download an upgrade or patch for LOG4J > and instructions on how to implement it? Did you try googling? Because if I enter "log4j download" (minus the quotes) into Google, the first hit looks like it is exactly what you want. You'll want the "binary" download, either .tar.gz or .zip format. As for what to do with it once you download it, just find all the log4j jars in your Solr directory and replace them with jars from the log4j archive that have the same names and different version numbers. There has been a fair amount of user testing and we have determined that this is a safe operation, as long as you don't leave some jars at a different version than the rest. The log4j public API is very stable, which is why this is safe to do, but I have no idea how stable their internal APIs are. Depending on the exact Solr version you have, you may have a jar that starts with "log4j-layout-template-json" ... this jar won't be in the log4j download. If you have not changed Solr's logging configuration so that it outputs JSON formatted logs, you can safely delete this one jar. If you actually need an upgraded version of that jar, you can find it on Maven Central. Caution-https://repo1.maven.org/maven2/org/apache/logging/log4j/log4j-layout-template-json/2.17.2/log4j-layout-template-json-2.17.2.jar Thanks, Shawn https://lmgtfy.app/?q=log4j+download
Re: [URL Verdict: Neutral][Non-DoD Source] Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17
On 3/24/22 09:38, Heller, George A III CTR (USA) wrote: BTW, do you know of anything other than nssm or AlwaysUp that would create a Windows service to start Solr when the server is rebooted? NSSM failed our security scan and not sure if cheap bosses want to pay the small fee for AlwaysUp. My honest opinion for how you can best deal with a Windows server? Don't run Windows. Put the workload onto one of the open source operating systems. I use Linux, but there are also other choices. Since you're probably in a situation where you can't follow that advice... NSSM is what I've seen used quite a bit. Apache has a project that I think will work as well. https://commons.apache.org/proper/commons-daemon/procrun.html I found a number of resources with "run java application as a service on windows" as a google search. I couldn't find any mention of a confirmed vulnerability in NSSM. I did find something about a vulnerability in the CouchDB installer related to installing NSSM, but I haven't yet found anything for NSSM itself. Thanks, Shawn
Re: Can I insert query result into another collection of the same Solr?
Hi Zhiqing, You can use the 'Stream' menu option in the Solr admin console to run the streaming query: paste the streaming expression and execute it. SolrJ can also execute the expression; here is a page that explains it: https://lucidworks.com/post/streaming-expressions-in-solrj/ search(collection1, q="*:*", qt="/export", fl="id,a_s,a_i,a_f", sort="a_f asc, a_i asc") On Wed, Mar 23, 2022 at 4:08 PM WU, Zhiqing wrote: > Hi Susmit, > Thanks for your reply. > Since I do not have much experience with the streaming api of solr, I only > can understand a part of the page and do not know how to implement related > parts with SolrJ. > Is it possible you could recommend some books or webpages which contain > examples for streaming api? > Looking forward to your reply. > Kind regards, > Zhiqing > > On Wed, 23 Mar 2022 at 14:34, Susmit wrote: > > > Hi, > > you can look at the updatestream from streaming api of solr, it can take > a > > search expression and emitted tuples can be added to a new collection. > > > > https://solr.apache.org/guide/8_4/stream-decorator-reference.html > > > > Sent from my iPhone > > > > > On Mar 23, 2022, at 4:06 AM, WU, Zhiqing wrote: > > > > > > Hello, > > > I did a query based on one collection and want to insert its result > into > > > another collection of the same Solr. The query result has the same > fields > > > as that of another collection. Is there a simple way to do the job? > > > If the query result has to be moved outside Solr before being added to > > > another collection of the same Solr, it would not be very efficient > when > > > the query result is very large. > > > Relevant information would be welcome. > > > Kind regards, > > > Zhiqing > > >
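For the original question (copying query results into another collection on the same Solr), the update and commit stream decorators Susmit mentioned can wrap that search so documents stream directly from collection1 into collection2 without leaving Solr except to pass through the client. A minimal SolrJ sketch, assuming the hypothetical collections and fields from the example expression above:

import org.apache.solr.client.solrj.io.Tuple;
import org.apache.solr.client.solrj.io.stream.SolrStream;
import org.apache.solr.common.params.ModifiableSolrParams;

public class CopyCollectionSketch {
    public static void main(String[] args) throws Exception {
        // update() re-indexes each tuple emitted by search() into the target
        // collection; commit() commits when the stream finishes. Collection
        // and field names are hypothetical.
        String expr = "commit(collection2, update(collection2, batchSize=500,"
                + " search(collection1, q=\"*:*\", qt=\"/export\","
                + " fl=\"id,a_s,a_i,a_f\", sort=\"a_f asc, a_i asc\")))";

        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("expr", expr);
        params.set("qt", "/stream");

        SolrStream stream =
                new SolrStream("http://localhost:8983/solr/collection1", params);
        try {
            stream.open();
            // update() emits progress tuples; drain them until EOF.
            long tuples = 0;
            Tuple tuple = stream.read();
            while (!tuple.EOF) {
                tuples++;
                tuple = stream.read();
            }
            System.out.println("update batches acknowledged: " + tuples);
        } finally {
            stream.close();
        }
    }
}

Using qt="/export" in the inner search assumes the requested fields are docValues, but it lets the copy stream the entire result set instead of paging.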
Re: Secure SSL connections between Solr and ZooKeeper
On 2022-03-24 04:42 +, Sam Lee wrote: > In my ZooKeeper configuration (zoo.cfg), I have this: > > secureClientPort=2182 > #clientPort=2181 # Disabled. Allow secure connections only. > > ssl.clientAuth=need > ssl.keystore.location=/opt/zookeeper/conf/zk-keystore.jks > ssl.keystore.password=123456 > ssl.truststore.location=/opt/zookeeper/conf/zk-truststore.jks > ssl.truststore.password=123456 > > # ... Sorry, I made some typos in the zoo.cfg file. It's supposed to be: secureClientPort=2182 #clientPort=2181 # Disabled. Allow secure connections only. ssl.clientAuth=need ssl.keyStore.location=/opt/zookeeper/conf/zk-keystore.jks ssl.keyStore.password=123456 ssl.trustStore.location=/opt/zookeeper/conf/zk-truststore.jks ssl.trustStore.password=123456 # ... Notice the change in capitalization. But the question still stands. How do I connect SolrCloud to ZooKeeper via SSL?
Re: Secure SSL connections between Solr and ZooKeeper
I think I've found the way to connect SolrCloud to an external ZooKeeper ensemble via SSL. By default, Solr does not use SSL to connect to ZooKeeper. So if the ZooKeeper configuration requires SSL for client connections, Solr will complain like this when it tries to connect to ZooKeeper: --8<---cut here---start->8--- WARN - 2022-03-25 12:34:43.681; org.apache.zookeeper.ClientCnxn; Session 0x0 for sever localhost/127.0.0.1:2182, Closing socket connection. Attempting reconnect except it is a SessionExpiredException. => EndOfStreamException: Unable to read additional data from server sessionid 0x0, likely server has closed socket at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77) org.apache.zookeeper.ClientCnxn$EndOfStreamException: Unable to read additional data from server sessionid 0x0, likely server has closed socket at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77) ~[zookeeper-3.6.2.jar:3.6.2] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350) ~[zookeeper-3.6.2.jar:3.6.2] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1275) ~[zookeeper-3.6.2.jar:3.6.2] --8<---cut here---end--->8--- On the ZooKeeper side, the corresponding log entry is something like this: --8<---cut here---start->8--- 2022-03-25 12:34:43,652 [myid:1] - ERROR [nioEventLoopGroup-4-2:NettyServerCnxnFactory$CertificateVerifier@448] - Unsuccessful handshake with session 0x0 2022-03-25 12:34:43,682 [myid:1] - WARN [nioEventLoopGroup-4-2:NettyServerCnxnFactory$CnxnChannelHandler@284] - Exception caught io.netty.handler.codec.DecoderException: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 002d7530001000 at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:478) at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719) at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655) at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Thread.java:829) Caused by: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 002d7530001000 at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1232) at 
io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1300) at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:508) at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:447) ... 17 more --8<---cut here---end--->8--- This error message indicates that ZooKeeper was expecting an SSL connection, but the client (i.e. Solr) was connecting without SSL. The solution is to add the appropriate ZooKeeper Java properties. Notice that these are exactly the same properties needed by standalone ZooKeeper's 'zkServer.sh' and 'zkCli.sh' to connect to ZooKeeper via SSL [1] [2]. Add the following to bin/solr.in.sh (the keystore and truststore paths and passwords are the ones from the zoo.cfg shown earlier): --8<---cut here---start->8--- SOLR_OPTS="$SOLR_OPTS -Dzookeeper.client.secure=true -Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty -Dzookeeper.ssl.keyStore.location=/opt/zookeeper/conf/zk-keystore.jks -Dzookeeper.ssl.keyStore.password=123456 -Dzookeeper.ssl.trustStore.location=/opt/zookeeper/conf/zk-truststore.jks -Dzookeeper.ssl.trustStore.password=123456" --8<---cut here---end--->8---