[ https://issues.apache.org/jira/browse/SOLR-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14272877#comment-14272877 ]
Alexander S. commented on SOLR-6875:
------------------------------------
Now we have 4 shards, each with 2 replicas (8 nodes in total), and see the following picture:
{noformat}
Shard 1:
  Replica 1: 14 486 089
  Replica 2: 14 496 445
Shard 2:
  Replica 1: 14 496 609
  Replica 2: 14 496 609
Shard 3:
  Replica 1: 14 492 812
  Replica 2: 14 492 812
Shard 4:
  Replica 1: 14 488 755
  Replica 2: 14 488 755
{noformat}
How can this happen? We never saw anything like this before upgrading from
4.8.1 to 4.10.2. We also enabled checkIntegrityAtMerge; could that be the cause?
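
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the per-replica numbers above can be checked. It assumes each core is queried directly with distrib=false so that only the local index is counted. The shard labels and the helper names (count_url, find_mismatches) are made up for illustration, and the core URL is the one that appears in the logs quoted below. The counts are the ones listed above.

```python
from urllib.parse import urlencode


def count_url(core_url):
    """Build a count-only query URL for one core (hypothetical helper).

    distrib=false asks Solr to answer from the local core only instead of
    fanning the request out to the whole collection.
    """
    params = urlencode({"q": "*:*", "rows": 0, "distrib": "false", "wt": "json"})
    return core_url.rstrip("/") + "/select?" + params


def find_mismatches(counts):
    """Return {shard: difference} for shards whose replicas disagree."""
    mismatches = {}
    for shard, replica_counts in counts.items():
        spread = max(replica_counts) - min(replica_counts)
        if spread:
            mismatches[shard] = spread
    return mismatches


# Document counts as reported above; the shard labels are illustrative.
counts = {
    "shard1": [14486089, 14496445],
    "shard2": [14496609, 14496609],
    "shard3": [14492812, 14492812],
    "shard4": [14488755, 14488755],
}

print(count_url("http://10.128.209.232:8081/solr/crm-prod/"))
print(find_mismatches(counts))
```

With these numbers only the first shard is reported, and its two replicas differ by 10 356 documents.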
> No data integrity between replicas
> ----------------------------------
>
> Key: SOLR-6875
> URL: https://issues.apache.org/jira/browse/SOLR-6875
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.10.2
> Environment: One replica is @ Linux solr1.devops.wegohealth.com
> 3.8.0-29-generic #42~precise1-Ubuntu SMP Wed Aug 14 16:19:23 UTC 2013 x86_64
> x86_64 x86_64 GNU/Linux
> Another replica is @ Linux solr2.devops.wegohealth.com 3.16.0-23-generic
> #30-Ubuntu SMP Thu Oct 16 13:17:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> Solr is running with the following options:
> * -Xms12G
> * -Xmx16G
> * -XX:+UseConcMarkSweepGC
> * -XX:+UseLargePages
> * -XX:+CMSParallelRemarkEnabled
> * -XX:+ParallelRefProcEnabled
> * -XX:+UseLargePages
> * -XX:+AggressiveOpts
> * -XX:CMSInitiatingOccupancyFraction=75
> Reporter: Alexander S.
>
> Setup: SolrCloud with 2 shards, each with 2 replicas, 4 nodes in total.
> Indexing is stopped. One replica of a shard (Solr1) shows 45 574 039 docs,
> and the other (Solr1.1) shows 45 574 038 docs.
> Solr1 is the leader. These errors appeared in its logs:
> {code}
> ERROR - 2014-12-20 09:54:38.783; org.apache.solr.update.StreamingSolrServers$1; error
> java.net.SocketException: Connection reset
> at java.net.SocketInputStream.read(SocketInputStream.java:196)
> at java.net.SocketInputStream.read(SocketInputStream.java:122)
> at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160)
> at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84)
> at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273)
> at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
> at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
> at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260)
> at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
> at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
> at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
> at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271)
> at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
> at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:682)
> at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:486)
> at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
> at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
> at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
> at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
> at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:233)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> WARN - 2014-12-20 09:54:38.787; org.apache.solr.update.processor.DistributedUpdateProcessor; Error sending update
> java.net.SocketException: Connection reset
> at java.net.SocketInputStream.read(SocketInputStream.java:196)
> at java.net.SocketInputStream.read(SocketInputStream.java:122)
> at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:160)
> at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:84)
> at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:273)
> at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
> at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
> at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260)
> at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
> at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251)
> at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197)
> at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271)
> at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
> at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:682)
> at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:486)
> at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
> at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
> at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
> at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
> at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:233)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> WARN - 2014-12-20 09:54:38.813; org.apache.solr.cloud.ZkController; Leader is publishing core=crm-prod coreNodeName =10.128.209.232:8081_solr_crm-prod state=down on behalf of un-reachable replica http://10.128.209.232:8081/solr/crm-prod/; forcePublishState? false
> ERROR - 2014-12-20 09:54:38.818; org.apache.solr.update.processor.DistributedUpdateProcessor; Setting up to try to start recovery on replica http://10.128.209.232:8081/solr/crm-prod/ after: java.net.SocketException: Connection reset
> {code}
> On Solr1.1:
> {code}
> WARN - 2014-12-20 09:54:38.854; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=crm-prod coreNodeName=10.128.209.232:8081_solr_crm-prod
> {code}
> Index optimization was running at that time.
> This was not a system crash: the server stayed up and was running smoothly
> with plenty of spare resources (CPU, available RAM and a very fast SSD RAID).
> So whatever happened, Solr should have recovered properly, as MySQL does,
> for example.