Thanks!! I’ll work on it. Right now I’m playing around with modifying the bytecode of SolrClientCache directly. I don’t know if it will work, though.
Sent from Mail for Windows 10

From: ufuk yılmaz
Sent: 06 March 2021 20:25
To: users@solr.apache.org
Subject: RE: Idle timeout expired and Early Client Disconnect errors

How? O_O

From: Susmit
Sent: 06 March 2021 18:35
To: solr-u...@lucene.apache.org
Subject: Re: Idle timeout expired and Early Client Disconnect errors

I have used a workaround to increase the default (hard-coded) timeout of 2 min in SolrClientCache. I can run 9+ hour long streaming queries with no issues.

Sent from my iPhone

> On Mar 2, 2021, at 5:32 PM, ufuk yılmaz <uyil...@vivaldi.net.invalid> wrote:
>
> I divided the query into 1000 pieces and removed the parallel stream clause; it seems to be working without timeouts so far. If it fails again, I can just divide it into even smaller pieces.
>
> I tried sending all 1000 pieces in a “list” expression to be executed sequentially. It didn’t work, but I was just curious whether it could handle such a large query 😃
>
> Now I’m just generating expression strings from Java code and sending them one by one. I tried to use SolrJ for this, but encountered a weird problem where even the simplest expression (echo) stops working after a few iterations in a loop. I’m guessing the underlying HttpClient is not closing connections in time, hitting the OS per-host connection limit; I asked a separate question about this. I was following the example on Lucidworks: https://lucidworks.com/post/streaming-expressions-in-solrj/
>
> I just modified my code to use plain REST calls via okhttp3. It’s a shame that I couldn’t use SolrJ, since it truly streams every result one by one, continuously, while REST just returns a single large response at the very end of the stream.
>
> Thanks again for your help.
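[Editor's sketch: the "generate expression strings from Java code and send them one by one" approach above can be outlined in plain Java. This is a hypothetical helper, not the poster's code — it splits the id keyspace at precomputed boundary values and emits one small /export search expression per slice; collection and field names are placeholders.]

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: build one small streaming expression per id-range
// slice, so each piece stays well under the 2-minute idle timeout.
public class BatchedExpressions {

    static List<String> buildExpressions(String collection, String field, List<String> bounds) {
        List<String> exprs = new ArrayList<>();
        String lower = "*";
        // k boundary values produce k+1 contiguous slices covering the keyspace.
        for (int i = 0; i <= bounds.size(); i++) {
            boolean last = (i == bounds.size());
            String upper = last ? "*" : bounds.get(i);
            // Exclusive upper bound ('}') so adjacent slices never overlap;
            // the final slice is closed with ']' to include the last id.
            String fq = field + ":[" + lower + " TO " + upper + (last ? "]" : "}");
            exprs.add("search(" + collection
                    + ", q=\"*:*\", qt=\"/export\", fq=\"" + fq
                    + "\", sort=\"" + field + " asc\", fl=\"" + field + "\")");
            lower = upper;
        }
        return exprs;
    }

    public static void main(String[] args) {
        // Two boundaries -> three slices; each string can then be POSTed
        // to /stream one at a time by any HTTP client.
        for (String e : buildExpressions("sourceCollection", "id_str", List.of("g", "p"))) {
            System.out.println(e);
        }
    }
}
```

Each returned string is an independent query, so a failed slice can be retried on its own without restarting the whole job.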
> From: Joel Bernstein
> Sent: 02 March 2021 00:19
> To: solr-u...@lucene.apache.org
> Subject: Re: Idle timeout expired and Early Client Disconnect errors
>
> Also, the parallel function builds hash partitioning filters that can lead to timeouts if they take too long to build. Try the query without the parallel function if you're still getting timeouts after making the query smaller.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>> On Mon, Mar 1, 2021 at 4:03 PM Joel Bernstein <joels...@gmail.com> wrote:
>>
>> The settings in your version are 30 seconds and 15 seconds for the socket and connection timeouts.
>>
>> Typically, timeouts occur because one or more shards in the query sit idle beyond the timeout threshold. This happens when lots of data is being read from other shards.
>>
>> Breaking the query into small parts would be a good strategy.
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>> On Mon, Mar 1, 2021 at 3:30 PM ufuk yılmaz <uyil...@vivaldi.net.invalid> wrote:
>>
>>> Hello Mr. Bernstein,
>>>
>>> I’m using version 8.4. So, if I understand correctly, I can’t increase the timeouts and they are bound to happen in such a large stream. Should I just reduce the output of my search expressions?
>>>
>>> Maybe I can split my search results into ~100 parts and run the same query 100 times in series. Each part would emit ~3M documents, so they should finish before the timeout?
>>>
>>> Is this a reasonable solution?
>>>
>>> By the way, how long is the default hard-coded timeout value? Yesterday I ran another query which took more than 1 hour without any timeouts and finished successfully.
>>>
>>> From: Joel Bernstein
>>> Sent: 01 March 2021 23:03
>>> To: solr-u...@lucene.apache.org
>>> Subject: Re: Idle timeout expired and Early Client Disconnect errors
>>>
>>> Oh wait, I misread your email.
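[Editor's sketch: the "try the query without the parallel function" suggestion above amounts to running the inner stream directly, in the thread's own streaming-expression syntax. Names are taken from the full expression later in the thread; filters remain elided as in the original.]

```
update(
  DNM,
  batchSize=1000,
  complement(
    search(sourceCollection, q="*:*", qt="/export", fq="...some filters...", sort="id_str asc", fl="id_str"),
    search(DNM, q="*:*", qt="/export", sort="id_str asc", fl="id_str"),
    on="id_str"
  )
)
```

The partitionKeys parameters can be dropped in this variant, since they only matter when parallel() partitions the stream across workers.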
>>> The idle timeout issue is made configurable in:
>>>
>>> https://issues.apache.org/jira/browse/SOLR-14672
>>>
>>> This unfortunately missed the 8.8 release and will be in 8.9.
>>>
>>> Joel Bernstein
>>> http://joelsolr.blogspot.com/
>>>
>>>> On Mon, Mar 1, 2021 at 2:56 PM Joel Bernstein <joels...@gmail.com> wrote:
>>>>
>>>> What version are you using?
>>>>
>>>> Solr 8.7 has changes that caused these errors to hit the logs; they used to be suppressed. This has been fixed in Solr 9.0, but it has not been backported to Solr 8.x.
>>>>
>>>> The errors are actually normal operational occurrences when doing joins, so they should be suppressed in the logs, as they were before that release.
>>>>
>>>> It might make sense to do a release that specifically suppresses these errors without backporting the full Solr 9.0 changes, which impact the memory footprint of export.
>>>>
>>>> Joel Bernstein
>>>> http://joelsolr.blogspot.com/
>>>>
>>>> On Mon, Mar 1, 2021 at 10:29 AM ufuk yılmaz <uyil...@vivaldi.net.invalid> wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> I’m running a large streaming expression and feeding the result to an update expression:
>>>>>
>>>>> update(targetCollection, ...long running stream here...,
>>>>>
>>>>> I tried sending the exact same query multiple times. It sometimes works and indexes some results before throwing an exception; other times it fails with an exception after 2 minutes.
>>>>>
>>>>> The response is like:
>>>>>
>>>>> "EXCEPTION":"java.util.concurrent.ExecutionException: java.io.IOException: params distrib=false&numWorkers=4....
>>>>> ...and my long stream expression.
>>>>>
>>>>> Server log (short):
>>>>>
>>>>> [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
>>>>> o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
>>>>>
>>>>> I tried increasing the Jetty idle timeout value to something like an hour on the node which hosts my target collection. It had no effect.
>>>>>
>>>>> Server log (long):
>>>>>
>>>>> ERROR (qtp832292933-589) [c:DNM s:shard1 r:core_node2 x:DNM_shard1_replica_n1] o.a.s.s.HttpSolrCall null:java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
>>>>>   at org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:235)
>>>>>   at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:226)
>>>>>   at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:524)
>>>>>   at org.apache.solr.servlet.ServletOutputStreamWrapper.write(ServletOutputStreamWrapper.java:134)
>>>>>   at java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:233)
>>>>>   at java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:303)
>>>>>   at java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:281)
>>>>>   at java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
>>>>>   at java.base/java.io.OutputStreamWriter.write(OutputStreamWriter.java:211)
>>>>>   at org.apache.solr.common.util.FastWriter.flush(FastWriter.java:140)
>>>>>   at org.apache.solr.common.util.FastWriter.write(FastWriter.java:54)
>>>>>   at org.apache.solr.response.JSONWriter._writeChar(JSONWriter.java:173)
>>>>>   at org.apache.solr.common.util.JsonTextWriter.writeStr(JsonTextWriter.java:86)
>>>>>   at org.apache.solr.common.util.TextWriter.writeVal(TextWriter.java:52)
>>>>>   at org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:152)
>>>>>   at org.apache.solr.common.util.JsonTextWriter$2.put(JsonTextWriter.java:176)
>>>>>   at org.apache.solr.common.MapWriter$EntryWriter.put(MapWriter.java:154)
>>>>>   at org.apache.solr.handler.export.StringFieldWriter.write(StringFieldWriter.java:77)
>>>>>   at org.apache.solr.handler.export.ExportWriter.writeDoc(ExportWriter.java:313)
>>>>>   at org.apache.solr.handler.export.ExportWriter.lambda$addDocsToItemWriter$4(ExportWriter.java:263)
>>>>> --
>>>>>   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
>>>>>   at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
>>>>>   at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>>>>>   at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>>>>>   at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>>>>>   at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
>>>>>   at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
>>>>>   at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:781)
>>>>>   at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:917)
>>>>>   at java.base/java.lang.Thread.run(Thread.java:834)
>>>>> Caused by: java.util.concurrent.TimeoutException: Idle timeout expired: 120000/120000 ms
>>>>>   at org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:171)
>>>>>   at org.eclipse.jetty.io.IdleTimeout.idleCheck(IdleTimeout.java:113)
>>>>>   at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>>>>>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>>>>>   at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>>>>>   at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>>>>>   at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>>>>>   ... 1 more
>>>>>
>>>>> My expression, in case it helps. To summarize: it finds the document ids which exist in sourceCollection but not in the target collection (DNM), joins the result on itself to duplicate some fields (I couldn’t find another way to duplicate the value of a field into 2 fields), then sends the result to update. The source collection has about 300M documents, a 24GB heap, 2 shards, and 2 replicas of each shard.
>>>>> update(
>>>>>   DNM,
>>>>>   batchSize=1000,
>>>>>   parallel(
>>>>>     WorkerCollection,
>>>>>     leftOuterJoin(
>>>>>       fetch(
>>>>>         sourceCollection,
>>>>>         complement(
>>>>>           search(sourceCollection, q="*:*", qt="/export", fq="...some filters...", sort="id_str asc", fl="id_str", partitionKeys="id_str"),
>>>>>           search(DNM, q="*:*", qt="/export", sort="id_str asc", fl="id_str", partitionKeys="id_str"),
>>>>>           on="id_str"
>>>>>         ),
>>>>>         fl="...my many fields...",
>>>>>         on="id_str",
>>>>>         batchSize="1000"
>>>>>       ),
>>>>>       select(
>>>>>         fetch(
>>>>>           sourceCollection,
>>>>>           complement(
>>>>>             search(sourceCollection, q="*:*", qt="/export", fq="...some other filters...", sort="id_str asc", fl="id_str", partitionKeys="id_str"),
>>>>>             search(DNM, q="*:*", qt="/export", sort="id_str asc", fl="id_str", partitionKeys="id_str"),
>>>>>             on="id_str"
>>>>>           ),
>>>>>           fl="...some other fields...",
>>>>>           on="id_str",
>>>>>           batchSize="1000"
>>>>>         ),
>>>>>         id_str, ..some other fields as...
>>>>>       ),
>>>>>       on="id_str"
>>>>>     ),
>>>>>     workers="4", sort="id_str asc"
>>>>>   )
>>>>> )
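[Editor's note: the 120000/120000 ms in the trace above is Jetty's default idle timeout. In a stock Solr install that default is referenced through the solr.jetty.http.idleTimeout property in server/etc/jetty-http.xml, so one way to raise it is via solr.in.sh — a sketch assuming a standard bin/solr install, and it likely needs to be applied on every node participating in the stream, which may be why changing it only on the target collection's node had no effect. It also does not touch the hard-coded client-side SolrClientCache timeout discussed at the top of the thread.]

```shell
# In solr.in.sh on each node (assumption: standard bin/solr install);
# restart the node afterwards. The value is in milliseconds (1 hour here).
SOLR_OPTS="$SOLR_OPTS -Dsolr.jetty.http.idleTimeout=3600000"
```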