[
https://issues.apache.org/jira/browse/SOLR-9830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065085#comment-16065085
]
Erick Erickson commented on SOLR-9830:
--------------------------------------
Usually this means you have not raised the limit for open files on a *nix system.
What do you get when you run "ulimit -a"? There should be an entry for "open
files". Does the problem go away if you set it higher than what you currently
have? I have 10,000 as the limit on my laptop, for instance.
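
If it helps to see the numbers from inside the JVM rather than from the shell,
here is a minimal sketch that prints the open and maximum file descriptor
counts of the running process. It assumes an OpenJDK/HotSpot-style JVM on a
Unix-like system, where the platform bean implements
com.sun.management.UnixOperatingSystemMXBean. The class name FdUsage is made
up for illustration and is not part of Solr.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

final class FdUsage {
    /** Returns {open, max} descriptor counts, or null if this JVM does not expose them. */
    static long[] fdCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null; // e.g. on Windows or a JVM without this bean
    }

    public static void main(String[] args) {
        long[] counts = fdCounts();
        if (counts == null) {
            System.out.println("file descriptor counts are not available on this JVM");
            return;
        }
        System.out.printf("open=%d max=%d%n", counts[0], counts[1]);
    }
}
```

The max value is the limit in effect for the JVM process itself, which may
differ from what "ulimit -a" shows in your login shell, so it is worth
comparing the two.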
> Once IndexWriter is closed due to some RunTimeException like
> FileSystemException, It never return to normal unless restart the Solr JVM
> ---------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-9830
> URL: https://issues.apache.org/jira/browse/SOLR-9830
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: update
> Affects Versions: 6.2
> Environment: Red Hat 4.4.7-3,SolrCloud
> Reporter: Daisy.Yuan
>
> 1. Collection coll_test has 9 shards, each with two replicas on different
> Solr instances.
> 2. While updating documents in the collection using SolrJ, inject a
> file-handle exhaustion fault into one Solr instance, e.g. solr1.
> 3. The update to col_test_shard3_replica1 (the leader) fails with a
> FileSystemException, and the IndexWriter is closed.
> 4. After the fault is cleared, col_test_shard3_replica1 (still the leader)
> can no longer accept document updates, and its numDocs stays lower than that
> of the standby replica.
> 5. After the Solr instance is restarted, documents can be updated again and
> numDocs is consistent between the two replicas.
> I think that in this case, in SolrCloud mode, the core should recover by
> itself rather than needing a restart to restore the SolrCore update function.
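
To make the requested behavior concrete, here is a minimal sketch of "reopen
the writer instead of staying closed until restart". It does not use Lucene or
Solr classes: DocWriter, FlakyWriter, SelfHealingWriter, and RecoveryDemo are
hypothetical names that only model the steps above, and the sketch says
nothing about how Solr itself handles or should handle this.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

final class RecoveryDemo {
    /** Runs the scenario and returns the indexed docs plus the reopen count. */
    static String run() {
        AtomicBoolean fault = new AtomicBoolean(false);
        List<String> index = new ArrayList<>();
        SelfHealingWriter writer = new SelfHealingWriter(() -> new FlakyWriter(fault, index));
        try {
            writer.add("doc1");        // healthy update
            fault.set(true);           // inject the fault
            try {
                writer.add("doc2");    // fails and closes the underlying writer
            } catch (IOException expected) {
                // the failed update is not retried in this sketch
            }
            fault.set(false);          // clear the fault, no restart
            writer.add("doc3");        // holder sees the closed writer and reopens
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return index + " reopens=" + writer.reopens();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}

/** Hypothetical stand-in for an index writer; not a Lucene or Solr type. */
interface DocWriter {
    void add(String doc) throws IOException;
    boolean isOpen();
}

/** Test double that closes itself for good the first time an add fails. */
final class FlakyWriter implements DocWriter {
    private final AtomicBoolean fault;
    private final List<String> index;
    private boolean open = true;

    FlakyWriter(AtomicBoolean fault, List<String> index) {
        this.fault = fault;
        this.index = index;
    }

    @Override
    public void add(String doc) throws IOException {
        if (!open) {
            throw new IllegalStateException("this writer is closed");
        }
        if (fault.get()) {
            open = false; // simulated fatal failure: this instance never reopens
            throw new IOException("Too many open files in system");
        }
        index.add(doc);
    }

    @Override
    public boolean isOpen() {
        return open;
    }
}

/** Holder that swaps in a fresh writer once the current one has been closed. */
final class SelfHealingWriter {
    private final Supplier<DocWriter> factory;
    private DocWriter current;
    private int reopens = 0;

    SelfHealingWriter(Supplier<DocWriter> factory) {
        this.factory = factory;
        this.current = factory.get();
    }

    synchronized void add(String doc) throws IOException {
        if (!current.isOpen()) {
            current = factory.get(); // replace the dead writer instead of failing forever
            reopens++;
        }
        current.add(doc);
    }

    synchronized int reopens() {
        return reopens;
    }
}
```

Note that doc2 is simply lost here. In a replicated setup the replica would
also need to catch up on updates it missed while the writer was closed, which
this sketch does not attempt.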
> 2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 |
> [DWPT][http-nio-21101-exec-20]: now abort |
> org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 |
> [DWPT][http-nio-21101-exec-20]: done abort |
> org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 |
> [IW][http-nio-21101-exec-20]: hit exception updating document |
> org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 |
> [IW][http-nio-21101-exec-20]: hit tragic FileSystemException inside
> updateDocument |
> org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 |
> [IW][http-nio-21101-exec-20]: rollback |
> org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 |
> [IW][http-nio-21101-exec-20]: all running merges have aborted |
> org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,934 | INFO | http-nio-21101-exec-20 |
> [IW][http-nio-21101-exec-20]: rollback: done finish merges |
> org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,934 | INFO | http-nio-21101-exec-20 |
> [DW][http-nio-21101-exec-20]: abort |
> org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,939 | INFO | commitScheduler-46-thread-1 |
> [DWPT][commitScheduler-46-thread-1]: flush postings as segment _4h9
> numDocs=3798 |
> org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 |
> [DWPT][commitScheduler-46-thread-1]: now abort |
> org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 |
> [DWPT][commitScheduler-46-thread-1]: done abort |
> org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | http-nio-21101-exec-20 |
> [DW][http-nio-21101-exec-20]: done abort success=true |
> org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 |
> [DW][commitScheduler-46-thread-1]: commitScheduler-46-thread-1
> finishFullFlush success=false |
> org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | http-nio-21101-exec-20 |
> [IW][http-nio-21101-exec-20]: rollback:
> infos=_4g7(6.2.0):C59169/23684:delGen=4 _4gq(6.2.0):C67474/11636:delGen=1
> _4gg(6.2.0):C64067/15664:delGen=2 _4gr(6.2.0):C13131 _4gs(6.2.0):C966
> _4gt(6.2.0):C4543 _4gu(6.2.0):C6960 _4gv(6.2.0):C2544 |
> org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 |
> [IW][commitScheduler-46-thread-1]: hit exception during NRT reader |
> org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,967 | INFO | http-nio-21101-exec-20 |
> [col_test_shard3_replica1] webapp=/solr path=/update
> params={wt=javabin&version=2}{add=[5____5 (1552493084330164224), 24____5
> (1552493084330164225), 28____5 (1552493084331212800), 32____5
> (1552493084331212801), 44____5 (1552493084331212802), 46____5
> (1552493084331212803), 64____5 (1552493084331212804), 94____5
> (1552493084331212805), 100____5 (1552493084331212806), 119____5
> (1552493084331212807), ... (74 adds)]} 0 43 |
> org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.finish(LogUpdateProcessorFactory.java:187)
> at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2143)
> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:695)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:471)
> at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:450)
> at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:400)
> at
> org.apache.solr.servlet.SolrAuthorizationFilter.doFilter(SolrAuthorizationFilter.java:195)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at
> com.huawei.solr.security.check.SolrParaCheckFilter.doFilter(SolrParaCheckFilter.java:201)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at
> com.huawei.solr.security.audit.AuditFilter.doFilter(AuditFilter.java:145)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at
> com.huawei.solr.security.auth.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:611)
> at
> com.huawei.solr.security.auth.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:578)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at
> com.huawei.solr.security.auth.cas.HttpServletRequestWrapperFilterWrapper.doFilter(HttpServletRequestWrapperFilterWrapper.java:37)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at
> com.huawei.solr.security.auth.cas.Cas20ProxyReceivingTicketValidationFilterWrapper.doFilter(Cas20ProxyReceivingTicketValidationFilterWrapper.java:71)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at
> com.huawei.solr.security.auth.cas.Cas20AuthenticationFilterWrapper.doFilter(Cas20AuthenticationFilterWrapper.java:60)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at
> com.huawei.solr.security.auth.cas.LogoutFilter.doFilter(LogoutFilter.java:84)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at
> com.huawei.solr.monitor.MemMonitorFilter.doFilter(MemMonitorFilter.java:81)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at
> com.huawei.solr.security.auth.ServerRealmFilter.doFilter(ServerRealmFilter.java:55)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at
> com.huawei.solr.security.auth.RerouteRequestFilter.doFilter(RerouteRequestFilter.java:58)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
> at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:218)
> at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
> at
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505)
> at
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169)
> at
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
> at
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
> at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:442)
> at
> org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1083)
> at
> org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:640)
> at
> org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1756)
> at
> org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1715)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at
> org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter
> is closed
> at
> org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:740)
> at
> org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:754)
> at
> org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1558)
> at
> org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:279)
> at
> org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:211)
> at
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:166)
> ... 73 more
> Caused by: java.nio.file.FileSystemException:
> /srv/BigData/solr/solrserveradmin/col_test_shard3_replica1/data/index/_4ha.fdx:
> Too many open files in system
> at
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
> at
> sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
> at
> sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
> at
> sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
> at
> java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
> at java.nio.file.Files.newOutputStream(Files.java:216)
> at
> org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:413)
> at
> org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:409)
> at
> org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:253)
> at
> org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:44)
> at
> org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:43)
> at
> org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:108)
> at
> org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:128)
> at
> org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat.fieldsWriter(Lucene50StoredFieldsFormat.java:183)
> at
> org.apache.lucene.index.DefaultIndexingChain.initStoredFieldsWriter(DefaultIndexingChain.java:83)
> at
> org.apache.lucene.index.DefaultIndexingChain.startStoredFields(DefaultIndexingChain.java:331)
> at
> org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:368)
> at
> org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:231)
> at
> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:478)
> at
> org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1562)
> ... 76 more
>
>
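
The trace above has two "Caused by" levels, and the innermost one is the
interesting part. Its reason text ends with "in system", which on Linux is the
wording for the system-wide file table being full (ENFILE); the per-process
limit (EMFILE) is reported as plain "Too many open files". So the system-wide
limit may be worth checking as well as ulimit. Below is a minimal sketch of
walking a cause chain to its root and classifying it. RootCause and its
methods are hypothetical helpers, and the exceptions in main are synthetic
stand-ins with a made-up path, not the real Lucene types.

```java
import java.nio.file.FileSystemException;

final class RootCause {
    /** Follows getCause() links down to the innermost throwable. */
    static Throwable rootOf(Throwable t) {
        Throwable cur = t;
        while (cur.getCause() != null) {
            cur = cur.getCause();
        }
        return cur;
    }

    /** True if the root cause is a FileSystemException reporting descriptor exhaustion. */
    static boolean isFdExhaustion(Throwable t) {
        Throwable root = rootOf(t);
        if (!(root instanceof FileSystemException)) {
            return false;
        }
        String reason = ((FileSystemException) root).getReason();
        return reason != null && reason.startsWith("Too many open files");
    }

    /** True if the exhaustion message uses the system-wide (ENFILE) wording. */
    static boolean isSystemWide(Throwable t) {
        if (!isFdExhaustion(t)) {
            return false;
        }
        return ((FileSystemException) rootOf(t)).getReason().endsWith("in system");
    }

    public static void main(String[] args) {
        // Synthetic chain shaped like the trace: wrapper -> "closed" -> file system error.
        FileSystemException fse = new FileSystemException(
            "/hypothetical/index/_0.fdx", null, "Too many open files in system");
        IllegalStateException closed = new IllegalStateException("writer is closed", fse);
        RuntimeException top = new RuntimeException("update failed", closed);

        System.out.println(rootOf(top).getClass().getSimpleName());
        System.out.println("fd exhaustion: " + isFdExhaustion(top));
        System.out.println("system-wide:   " + isSystemWide(top));
    }
}
```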