Hello community,

I am using Accumulo 1.7. Yesterday the following log messages appeared shortly before one of the tservers went down. Could you give me some hints about their meaning, so that I can better understand what was happening?

2015-12-07 14:11:02,414 [util.MetadataTableUtil] ERROR: null ConstraintViolationException(violationSummaries:[TConstraintViolationSummary(constrainClass:org.apache.accumulo.server.constraints.MetadataConstraints, violationCode:7, violationDescription:Lock not held in zookeeper by writer, numberOfViolatingMutations:1)])
2015-12-07 14:11:02,415 [log.TabletServerLogger] ERROR: Unexpected error writing to log, retrying attempt 2 java.lang.RuntimeException: ConstraintViolationException(violationSummaries:[TConstraintViolationSummary(constrainClass:org.apache.accumulo.server.constraints.MetadataConstraints, violationCode:7, violationDescription:Lock not held in zookeeper by writer, numberOfViolatingMutations:1)])
2015-12-07 14:11:02,403 [impl.Writer] ERROR: error sending update to server2.cluster.org:9997: ConstraintViolationException(violationSummaries:[TConstraintViolationSummary(constrainClass:org.apache.accumulo.server.constraints.MetadataConstraints, violationCode:7, violationDescription:Lock not held in zookeeper by writer, numberOfViolatingMutations:1)])
2015-12-07 14:11:02,463 [zookeeper.DistributedWorkQueue] INFO : Got unexpected zookeeper event: None for /accumulo/8a7f6781-ae6e-44bc-a717-5b8cbd28d647/recovery
2015-12-07 14:11:02,461 [util.MetadataTableUtil] ERROR: null ConstraintViolationException(violationSummaries:[TConstraintViolationSummary(constrainClass:org.apache.accumulo.server.constraints.MetadataConstraints, violationCode:7, violationDescription:Lock not held in zookeeper by writer, numberOfViolatingMutations:1)])
2015-12-07 14:11:02,476 [tserver.TabletServer] ERROR: Lost tablet server lock (reason = SESSION_EXPIRED), exiting.
2015-12-07 14:11:02,476 [server.GarbageCollectionLogger] WARN : GC pause checker not called in a timely fashion. Expected every 30.0 seconds but was 128.9 seconds since last check

I assume there was too much load on the server, so it had problems communicating with ZooKeeper. Am I right?

Best regards,
Martin Grimmer
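
P.S. One rough way to check this assumption is to compare the gap reported by the GC pause checker with the ZooKeeper session timeout. The sketch below is only an illustration: the helper names are hypothetical, and the 30-second timeout is an assumed value that should be replaced with whatever `instance.zookeeper.timeout` is set to on the cluster. The reported gap is the time between two checks, so it is an upper bound on the stall rather than an exact measurement.

```python
import re

# Assumption: ZooKeeper session timeout in seconds. Replace with the value
# of instance.zookeeper.timeout configured on the cluster being examined.
ASSUMED_ZK_SESSION_TIMEOUT_S = 30.0

# Matches the tail of the GarbageCollectionLogger warning quoted above.
PAUSE_RE = re.compile(
    r"Expected every (?P<expected>[\d.]+) seconds "
    r"but was (?P<actual>[\d.]+) seconds since last check"
)


def parse_pause(line):
    """Return (expected_s, actual_s) from a pause-checker line, or None."""
    match = PAUSE_RE.search(line)
    if match is None:
        return None
    return float(match.group("expected")), float(match.group("actual"))


def pause_exceeds_timeout(line, timeout_s=ASSUMED_ZK_SESSION_TIMEOUT_S):
    """True if the reported gap is longer than the session timeout."""
    parsed = parse_pause(line)
    if parsed is None:
        return False
    _, actual_s = parsed
    return actual_s > timeout_s


line = (
    "2015-12-07 14:11:02,476 [server.GarbageCollectionLogger] WARN : "
    "GC pause checker not called in a timely fashion. Expected every "
    "30.0 seconds but was 128.9 seconds since last check"
)
print(parse_pause(line))
print(pause_exceeds_timeout(line))
```

If the gap is longer than the session timeout, the process may have been stalled long enough to miss its ZooKeeper heartbeats, which would be consistent with the SESSION_EXPIRED message.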
