Solr Error

2022-03-24 Thread HariBabu kuruva
Hi All,

I am getting the below error in Solr, which is affecting the services that
access the dqmStore. Please suggest if you have come across this issue.


2022-03-24 08:48:36.629 ERROR (qtp1198197478-620853) [c:dqmStore s:shard1 r:core_node4 x:dqmStore_shard1_replica_n2] o.a.s.u.SolrCmdDistributor java.io.IOException: Request processing has stalled for 20036ms with 100 remaining elements in the queue.

-- 

Thanks and Regards,
 Hari
Mobile:9790756568


Re: Representative filtering of very large result sets

2022-03-24 Thread Jeremy Buckley - IQ-C
Thanks, Joel, that is exactly what we are doing.  We have four shards and
are sharding on the collapse key.  Performance is fine (subsecond) as long
as the result set is relatively small.  I am really looking for the best
way to ensure that this is always true.

On Wed, Mar 23, 2022 at 10:18 PM Joel Bernstein  wrote:

> To collapse on 30 million distinct values is going to cause memory problems
> for sure. If the heap is growing as the result set grows that means you are
> likely using a newer version of Solr which collapses into a hashmap. Older
> versions of Solr would collapse into an array 30 million in length which
> probably would have blown up memory with even small result sets.
>
> I think you're going to need to shard to get this to perform well. With
> SolrCloud you can shard on the collapse key (
>
> https://solr.apache.org/guide/8_7/shards-and-indexing-data-in-solrcloud.html#document-routing
> ).
> This will send all documents with the same collapse key to the same shard.
> Then run the collapse query on the sharded collection.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
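Joel's advice above (route on the collapse key, then collapse on the sharded collection) can be sketched concretely. The sketch below only assembles and prints the two requests; the collection name, field name, and shard count are hypothetical, and nothing contacts a live Solr:

```shell
# Hedged sketch only: hypothetical names, commands are printed, not executed.
SOLR=http://localhost:8983/solr
COLL=dqm
ROUTE_FIELD=dedupKey            # the collapse key, used for document routing

# 1. Create the collection routed on the collapse key, so all documents
#    sharing a dedupKey land on the same shard.
create="curl '$SOLR/admin/collections?action=CREATE&name=$COLL&numShards=4&router.field=$ROUTE_FIELD'"

# 2. Run the collapse query against the sharded collection
#    (fq={!collapse field=dedupKey}, URL-encoded).
query="curl '$SOLR/$COLL/select?q=*:*&fq=%7B!collapse%20field=$ROUTE_FIELD%7D'"

echo "$create"
echo "$query"
```

Because routing guarantees each collapse group is shard-local, each shard can collapse independently and no cross-shard merge of groups is needed.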


RE: [URL Verdict: Neutral][Non-DoD Source] Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17

2022-03-24 Thread Heller, George A III CTR (USA)
What happens if we need to deploy to production before 8.11.2 is released?

-Original Message-
From: Houston Putman  
Sent: Wednesday, March 23, 2022 7:15 PM
To: users@solr.apache.org
Subject: [URL Verdict: Neutral][Non-DoD Source] Re: Solr 8.11.1 upgrading LOG4J 
from 2.16 to 2.17

All active links contained in this email were disabled.  Please verify the 
identity of the sender, and confirm the authenticity of all links contained 
within the message prior to copying and pasting the address to a Web browser.  






Please do not create another JIRA; it is already committed, just waiting on the
8.11.2 release.

Caution-https://issues.apache.org/jira/browse/SOLR-15871

The suggestion across multiple threads in the users list has been to remove the 
log4j jar, and replace it with the 2.17.1 jar, which will pass security checks.

On Wed, Mar 23, 2022 at 5:53 PM Ishan Chattopadhyaya < 
ichattopadhy...@gmail.com> wrote:

> And feel free to open a new JIRA for this log4j upgrade, it will get 
> picked up in 8.11.2 (whenever someone gets time to release it).
>
> On Thu, Mar 24, 2022 at 3:18 AM Ishan Chattopadhyaya < 
> ichattopadhy...@gmail.com> wrote:
>
> > Here's the issue where Log4J was upgraded. You can look at the pull 
> > request there to find out what you need to change. After that, you 
> > can build your own Solr binaries for your use (fix in 
> > github.com/apache/lucene-solr's branch_8_11 and build using "ant 
> > ivy-bootstrap; cd solr; ant package" which will generate a .tgz file).
> > Caution-https://issues.apache.org/jira/browse/SOLR-15843
> >
> > On Thu, Mar 24, 2022 at 12:42 AM Andy Lester  wrote:
> >
> >> Go to the Caution-https://solr.apache.org/security.html URL and you 
> >> will find instructions there on what to do.
> >>
> >> Andy
> >
> >
>
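The build steps Ishan quotes above can be sketched end-to-end. This block only assembles and prints the commands (it assumes ant, ivy, and network access to the GitHub mirror; the clone URL and output path are my reading of the quoted instructions, not verified):

```shell
# Hedged sketch of the quoted branch_8_11 build; printed, not executed here.
steps=$(cat <<'EOF'
git clone --branch branch_8_11 https://github.com/apache/lucene-solr.git
cd lucene-solr
ant ivy-bootstrap     # one-time: install ivy for the build
cd solr
ant package           # produces the distributable .tgz under solr/package/
EOF
)
echo "$steps"
```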




Re: [URL Verdict: Neutral][Non-DoD Source] Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17

2022-03-24 Thread matthew sporleder
You need to manage your risk in that case --

Which is worse: a potential log4j vulnerability, your own "hacked" Solr
WAR, deploying a pre-release, or delaying the prod rollout?

Will your security scan team allow you to give a mitigation plan and a
timeline for a prod upgrade?




On Thu, Mar 24, 2022 at 8:33 AM Heller, George A III CTR (USA)
 wrote:

> What happens if we need to deploy to production before 8.11.2 is released?
>
> -Original Message-
> From: Houston Putman 
> Sent: Wednesday, March 23, 2022 7:15 PM
> To: users@solr.apache.org
> Subject: [URL Verdict: Neutral][Non-DoD Source] Re: Solr 8.11.1 upgrading
> LOG4J from 2.16 to 2.17
>
>
> Please do not create another JIRA, it is already committed, just waiting
> on the 8.11.2 release.
>
> Caution-https://issues.apache.org/jira/browse/SOLR-15871
>
> The suggestion across multiple threads in the users list has been to
> remove the log4j jar, and replace it with the 2.17.1 jar, which will pass
> security checks.
>
> On Wed, Mar 23, 2022 at 5:53 PM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
> > And feel free to open a new JIRA for this log4j upgrade, it will get
> > picked up in 8.11.2 (whenever someone gets time to release it).
> >
> > On Thu, Mar 24, 2022 at 3:18 AM Ishan Chattopadhyaya <
> > ichattopadhy...@gmail.com> wrote:
> >
> > > Here's the issue where Log4J was upgraded. You can look at the pull
> > > request there to find out what you need to change. After that, you
> > > can build your own Solr binaries for your use (fix in
> > > github.com/apache/lucene-solr's branch_8_11 and build using "ant
> > > ivy-bootstrap; cd solr; ant package" which will generate a .tgz file).
> > > Caution-https://issues.apache.org/jira/browse/SOLR-15843
> > >
> > > On Thu, Mar 24, 2022 at 12:42 AM Andy Lester 
> wrote:
> > >
> > >> Go to the Caution-https://solr.apache.org/security.html URL and you
> > >> will find instructions there on what to do.
> > >>
> > >> Andy
> > >
> > >
> >
>


Re: Representative filtering of very large result sets

2022-03-24 Thread Joel Bernstein
Yeah, that's a tricky problem: keeping the result set small without losing
results. I don't have an answer beyond what you already mentioned, which would
be to limit the query in some way.


Joel Bernstein
http://joelsolr.blogspot.com/


On Thu, Mar 24, 2022 at 8:24 AM Jeremy Buckley - IQ-C
 wrote:

> Thanks, Joel, that is exactly what we are doing.  We have four shards and
> are sharding on the collapse key.  Performance is fine (subsecond) as long
> as the result set is relatively small.  I am really looking for the best
> way to ensure that this is always true.
>
> On Wed, Mar 23, 2022 at 10:18 PM Joel Bernstein 
> wrote:
>
> > To collapse on 30 million distinct values is going to cause memory
> problems
> > for sure. If the heap is growing as the result set grows that means you
> are
> > likely using a newer version of Solr which collapses into a hashmap.
> Older
> > versions of Solr would collapse into an array 30 million in length which
> > probably would have blown up memory with even small result sets.
> >
> > I think you're going to need to shard to get this to perform well. With
> > SolrCloud you can shard on the collapse key (
> >
> >
> https://solr.apache.org/guide/8_7/shards-and-indexing-data-in-solrcloud.html#document-routing
> > ).
> > This will send all documents with the same collapse key to the same
> shard.
> > Then run the collapse query on the sharded collection.
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
> >
> >
>


Re: Representative filtering of very large result sets

2022-03-24 Thread Michael Gibney
Are you determining your "top doc" for each collapsed group based on score?
If your use case is such that you determine the "top doc" based on a static
field with a manageable number of values, you may have other options
available to you. (For some use cases it can be acceptable to "pre-filter"
the domain with creative fq params. This works iff your "collapse" could be
considered a type of "deduplication" with doc priority determined by a
static field; but it's a non-starter if you know you need to search over
the full uncollapsed domain.)

Michael

On Thu, Mar 24, 2022 at 9:11 AM Joel Bernstein  wrote:

> Yeah, that's a tricky problem. Keeping the result set small without losing
> results. I don't have an answer except as you already mentioned which would
> be to limit the query in some way.
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Thu, Mar 24, 2022 at 8:24 AM Jeremy Buckley - IQ-C
>  wrote:
>
> > Thanks, Joel, that is exactly what we are doing.  We have four shards and
> > are sharding on the collapse key.  Performance is fine (subsecond) as
> long
> > as the result set is relatively small.  I am really looking for the best
> > way to ensure that this is always true.
> >
> > On Wed, Mar 23, 2022 at 10:18 PM Joel Bernstein 
> > wrote:
> >
> > > To collapse on 30 million distinct values is going to cause memory
> > problems
> > > for sure. If the heap is growing as the result set grows that means you
> > are
> > > likely using a newer version of Solr which collapses into a hashmap.
> > Older
> > > versions of Solr would collapse into an array 30 million in length
> which
> > > probably would have blown up memory with even small result sets.
> > >
> > > I think you're going to need to shard to get this to perform well. With
> > > SolrCloud you can shard on the collapse key (
> > >
> > >
> >
> https://solr.apache.org/guide/8_7/shards-and-indexing-data-in-solrcloud.html#document-routing
> > > ).
> > > This will send all documents with the same collapse key to the same
> > shard.
> > > Then run the collapse query on the sharded collection.
> > >
> > > Joel Bernstein
> > > http://joelsolr.blogspot.com/
> > >
> > >
> >
>


Re: Representative filtering of very large result sets

2022-03-24 Thread Jeremy Buckley - IQ-C
Thanks, Michael. I think this will work, and it is the direction I am
heading.  We are collapsing for deduplication, sort of.

We do need to search over the full uncollapsed domain, but I am pretty sure
that nobody needs to see 40 million results, and if they're dumb enough to
enter a query that matches that many documents, they deserve whatever they
get.

So my strategy is:
1. Check the query to see if it looks "safe" based on some heuristics.
2. If (1) fails, do a search to get only the result count with rows=0 and no
faceting or sorting. This is usually pretty fast.
3. If the count returned in (2) is above a certain threshold, add my extra
filter query before executing the full faceted search.
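A minimal sketch of that guard logic: the threshold, field names, and extra filter below are hypothetical, and the count is hard-coded where a real rows=0 probe request would return numFound.

```shell
# Hedged sketch of the two-phase guard; nothing contacts a live Solr.
THRESHOLD=1000000
USER_Q='title_s:foo'                               # hypothetical user query
params="q=$USER_Q&fq={!collapse field=dedupKey}&facet=true"

# Step 2: a rows=0, facet=false probe would normally return numFound;
# hard-coded here to simulate a huge result set.
count=40000000

# Step 3: above the threshold, narrow the domain before the real search.
if [ "$count" -gt "$THRESHOLD" ]; then
  params="$params&fq=modified_dt:[NOW-1YEAR TO NOW]"   # hypothetical filter
fi
echo "$params"
```

The extra fq only kicks in for pathological queries, so well-behaved searches pay nothing beyond the cheap count probe.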

Thanks, everyone!

On Thu, Mar 24, 2022 at 10:04 AM Michael Gibney 
wrote:

> Are you determining your "top doc" for each collapsed group based on score?
> If your use case is such that you determine the "top doc" based on a static
> field with a manageable number of values, you may have other options
> available to you. (For some use cases it can be acceptable to "pre-filter"
> the domain with creative fq params. This works iff your "collapse" could be
> considered a type of "deduplication" with doc priority determined by a
> static field; but it's a non-starter if you know you need to search over
> the full uncollapsed domain.)
>
> Michael
>


RE: [URL Verdict: Neutral][Non-DoD Source] Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17

2022-03-24 Thread Heller, George A III CTR (USA)
Thanks for the helpful info. I did notice there was no equivalent for
log4j-layout-template-json-2.16.0.

We have changed the query return from XML to JSON, but have done nothing to the
logging.

I will implement this solution after lunch and test it (make sure Solr runs and
populates the logs).

BTW, do you know of anything other than NSSM or AlwaysUp that would create a
Windows service to start Solr when the server is rebooted?

NSSM failed our security scan, and I'm not sure if the cheap bosses want to pay
the small fee for AlwaysUp.

Thanks Again for Your Helpful Info,
George

-Original Message-
From: Shawn Heisey  
Sent: Wednesday, March 23, 2022 10:55 PM
To: users@solr.apache.org
Subject: [URL Verdict: Neutral][Non-DoD Source] Re: Solr 8.11.1 upgrading LOG4J 
from 2.16 to 2.17




On 3/23/2022 12:36 PM, Heller, George A III CTR (USA) wrote:
> Can someone tell me where I can download an upgrade or patch for LOG4J 
> and instructions on how to implement it?

Did you try googling?  Because if I enter "log4j download" (minus the
quotes) into Google, the first hit looks like it is exactly what you want.  
You'll want the "binary" download, either .tar.gz or .zip format.

As for what to do with it once you download it, just find all the log4j jars in 
your Solr directory and replace them with jars from the log4j archive that have 
the same names and different version numbers.  There has been a fair amount of 
user testing and we have determined that this is a safe operation, as long as 
you don't leave some jars at a different version than the rest.  The log4j 
public API is very stable, which is why this is safe to do, but I have no idea 
how stable their internal APIs are.

Depending on the exact Solr version you have, you may have a jar that starts 
with "log4j-layout-template-json" ... this jar won't be in the log4j download.  
If you have not changed Solr's logging configuration so that it outputs JSON 
formatted logs, you can safely delete this one jar.  If you actually need an 
upgraded version of that jar, you can find it on Maven Central.

Caution-https://repo1.maven.org/maven2/org/apache/logging/log4j/log4j-layout-template-json/2.17.2/log4j-layout-template-json-2.17.2.jar
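The jar swap Shawn describes can be walked through as a simulation. The block below works entirely in a scratch directory with empty placeholder files; the jar names and the `server/lib/ext` path are stand-ins for whatever your real Solr install and unpacked log4j download contain:

```shell
# Hedged simulation of the jar swap; operates on a throwaway temp directory.
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/server/lib/ext" "$tmp/download"

# Fake "installed" 2.16.0 jars and fake "downloaded" 2.17.1 replacements.
for n in log4j-core log4j-api log4j-slf4j-impl; do
  touch "$tmp/server/lib/ext/$n-2.16.0.jar"
  touch "$tmp/download/$n-2.17.1.jar"
done

# Remove every old log4j jar, then copy in the same-named 2.17.1 jars.
# Per Shawn's warning: never leave a mixed set of log4j versions behind.
rm "$tmp/server/lib/ext/"log4j-*-2.16.0.jar
cp "$tmp/download/"log4j-*-2.17.1.jar "$tmp/server/lib/ext/"
ls "$tmp/server/lib/ext"
```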

Thanks,
Shawn

https://lmgtfy.app/?q=log4j+download




Re: [URL Verdict: Neutral][Non-DoD Source] Re: Solr 8.11.1 upgrading LOG4J from 2.16 to 2.17

2022-03-24 Thread Shawn Heisey

On 3/24/22 09:38, Heller, George A III CTR (USA) wrote:

BTW, Do you know of anything other than nssm or AlwaysUp that would create a 
Windows service to start Solr when the server is rebooted?

NSSM failed our security scan and not sure if cheap bosses want to pay the 
small fee for AlwaysUp.


My honest opinion for how you can best deal with a Windows server?  
Don't run Windows. Put the workload onto one of the open source 
operating systems. I use Linux, but there are also other choices.


Since you're probably in a situation where you can't follow that 
advice...  NSSM is what I've seen used quite a bit.  Apache has a 
project that I think will work as well.


https://commons.apache.org/proper/commons-daemon/procrun.html

I found a number of resources with "run java application as a service on 
windows" as a google search.
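For the commons-daemon route, a service install with procrun looks roughly like the command assembled below. It is only printed, not run (this is Windows-only tooling); the service name and paths are hypothetical, and the flags should be checked against the procrun documentation for your version:

```shell
# Hedged sketch: assemble (do not run) a procrun service-install command.
cmd="prunsrv.exe //IS//Solr \
  --DisplayName=\"Apache Solr\" \
  --Startup=auto \
  --StartMode=exe \
  --StartImage=C:\solr\bin\solr.cmd \
  --StartParams=start;-f;-p;8983"
echo "$cmd"
```

`-f` keeps Solr in the foreground so procrun, rather than a detached process, owns its lifecycle.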


I couldn't find any mention of a confirmed vulnerability in NSSM. I did 
find something about a vulnerability in the CouchDB installer related to 
installing NSSM but I haven't yet found anything for NSSM itself.


Thanks,
Shawn



Re: Can I insert query result into another collection of the same Solr?

2022-03-24 Thread Susmit Shukla
Hi Zhiqing,

You can use the 'Stream' menu option in the Solr admin console to run the
streaming query: paste the streaming expression and execute it.
SolrJ can also execute the expression; here is a page that explains it:
https://lucidworks.com/post/streaming-expressions-in-solrj/

  search(collection1,
 q="*:*",
 qt="/export",
 fl="id,a_s,a_i,a_f",
 sort="a_f asc, a_i asc")
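That search expression only reads. To answer the original question (insert the result into another collection), it can be wrapped in the update and commit stream decorators. The collection names and batch size below are hypothetical; the expression is only assembled and printed here, nothing is sent to Solr:

```shell
# Hedged sketch: build a streaming expression that pushes search results
# from collection1 into collection2, then commits. Names are made up.
expr='commit(collection2,
        update(collection2, batchSize=500,
          search(collection1,
            q="*:*",
            qt="/export",
            fl="id,a_s,a_i,a_f",
            sort="a_f asc, a_i asc")))'
# It would be POSTed to the /collection2/stream handler; just print it here.
echo "$expr"
```

Because the tuples stream shard-to-shard inside Solr, the result set never has to be exported to the client, which addresses the efficiency concern for very large results.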


On Wed, Mar 23, 2022 at 4:08 PM WU, Zhiqing  wrote:

> Hi Susmit,
> Thanks for your reply.
> Since I do not have much experience with the streaming api of solr, I only
> can understand a part of the page and do not know how to implement related
> parts with SolrJ.
> Is it possible you could recommend some books or webpages which contain
> examples for streaming api?
> Looking forward to your reply.
> Kind regards,
> Zhiqing
>
> On Wed, 23 Mar 2022 at 14:34, Susmit  wrote:
>
> > Hi,
> > you can look at the updatestream from streaming api of solr, it can take
> a
> > search expression and emitted tuples can be added to a new collection.
> >
> > https://solr.apache.org/guide/8_4/stream-decorator-reference.html
> >
> > Sent from my iPhone
> >
> > > On Mar 23, 2022, at 4:06 AM, WU, Zhiqing  wrote:
> > >
> > > Hello,
> > > I did a query based on one collection and want to insert its result
> into
> > > another collection of the same Solr. The query result has the same
> fields
> > > as that of another collection. Is there a simple way to do the job?
> > > If the query result has to be moved outside Solr before being added to
> > > another collection of the same Solr, it would not be very efficient
> when
> > > the query result is very large.
> > > Relevant information would be welcome.
> > > Kind regards,
> > > Zhiqing
> >
>


Re: Secure SSL connections between Solr and ZooKeeper

2022-03-24 Thread Sam Lee
On 2022-03-24 04:42 +, Sam Lee wrote:
> In my ZooKeeper configuration (zoo.cfg), I have this:
>
> secureClientPort=2182
> #clientPort=2181  # Disabled. Allow secure connections only.
>
> ssl.clientAuth=need
> ssl.keystore.location=/opt/zookeeper/conf/zk-keystore.jks
> ssl.keystore.password=123456
> ssl.truststore.location=/opt/zookeeper/conf/zk-truststore.jks
> ssl.truststore.password=123456
>
> # ...

Sorry, I made some typos in the zoo.cfg file. It's supposed to be:

 secureClientPort=2182
 #clientPort=2181  # Disabled. Allow secure connections only.

 ssl.clientAuth=need
 ssl.keyStore.location=/opt/zookeeper/conf/zk-keystore.jks
 ssl.keyStore.password=123456
 ssl.trustStore.location=/opt/zookeeper/conf/zk-truststore.jks
 ssl.trustStore.password=123456

 # ...

Notice the change in capitalization.

But the question still stands. How do I connect SolrCloud to ZooKeeper
via SSL?


Re: Secure SSL connections between Solr and ZooKeeper

2022-03-24 Thread Sam Lee
I think I've found the way to connect SolrCloud to an external ZooKeeper
ensemble via SSL.

By default, Solr does not use SSL to connect to ZooKeeper. So if the
ZooKeeper configuration requires SSL for client connections, Solr will
complain like this when it tries to connect to ZooKeeper:

--8<---cut here---start->8---
WARN  - 2022-03-25 12:34:43.681; org.apache.zookeeper.ClientCnxn; Session 0x0 
for sever localhost/127.0.0.1:2182, Closing socket connection. Attempting 
reconnect except it is a SessionExpiredException. => EndOfStreamException: 
Unable to read additional data from server sessionid 0x0, likely server has 
closed socket
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77)
org.apache.zookeeper.ClientCnxn$EndOfStreamException: Unable to read additional 
data from server sessionid 0x0, likely server has closed socket
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77) 
~[zookeeper-3.6.2.jar:3.6.2]
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
 ~[zookeeper-3.6.2.jar:3.6.2]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1275) 
~[zookeeper-3.6.2.jar:3.6.2]
--8<---cut here---end--->8---

On the ZooKeeper side, the corresponding log entry is something like
this:

--8<---cut here---start->8---
2022-03-25 12:34:43,652 [myid:1] - ERROR 
[nioEventLoopGroup-4-2:NettyServerCnxnFactory$CertificateVerifier@448] - 
Unsuccessful handshake with session 0x0
2022-03-25 12:34:43,682 [myid:1] - WARN  
[nioEventLoopGroup-4-2:NettyServerCnxnFactory$CnxnChannelHandler@284] - 
Exception caught
io.netty.handler.codec.DecoderException: 
io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 
002d7530001000
at 
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:478)
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at 
io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at 
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at 
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 
002d7530001000
at 
io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1232)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1300)
at 
io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:508)
at 
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:447)
... 17 more
--8<---cut here---end--->8---

This error message indicates that ZooKeeper was expecting an SSL
connection, but the client (i.e. Solr) was connecting without SSL.

The solution is to add the appropriate ZooKeeper Java properties. Notice
that these are exactly the same properties needed by standalone
ZooKeeper's 'zkServer.sh' and 'zkCli.sh' to connect to ZooKeeper via
SSL [1] [2]. Add the following to bin/solr.in.sh:

--8<---cut here---start->8---
SOLR_OPTS="$SOLR_OPTS
-Dzookeeper.client.secure=true
-Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty
-Dzoo