On Wed, Oct 24, 2012 at 02:44:45PM +0000, Xiaohan wrote:
> And the logs in NameNode, we found the difference from each of the times:
>
> 2012-10-20 20:05:43,249 INFO
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: **** Number of syncs: 1347
> SyncTimes(ms): 14138 3677
>
> 2012-10-22 18:34:42,223 INFO
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: **** Number of syncs: 51
> SyncTimes(ms): 34553 312
>
> We inspect that it is the problem of Bookkeeper. Anyone ever encounter that
> or any clue for that? Thanks very much.
> The environment is strictly controlled, and the logs can only be copied by
> hand. So the logs are not so detailed.

How many bookies are you using? Are any of the bookies displaying disk
errors? What does iostat say on the bookies and on the namenode?

It does look like the editlog is the culprit here. However, it's not clear
that it's BK. If BK is the shared edits, it should be second in the list of
journals. From the sync times, the second journal seems to be performing
fine: 312 ms in total across 51 syncs in the slow sample, against 34553 ms
for the first journal.

-Ivan
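
P.S. To put rough numbers on the sync times, here is a minimal sketch that
divides each SyncTimes(ms) value by the number of syncs. It assumes the
SyncTimes(ms) values are cumulative totals per journal, listed in the order
the journals are configured. The helper name avg_sync_ms is made up for
illustration, and the two sample lines are the log fragments quoted above
joined onto one line each.

```python
import re

# Matches the statistics fragment quoted above: the sync count followed by
# one cumulative sync time (ms) per journal.
STATS_RE = re.compile(
    r"Number of syncs:\s*(\d+)\s+SyncTimes\(ms\):\s*((?:\d+\s*)+)"
)


def avg_sync_ms(log_line):
    """Return the average time per sync (ms) for each journal, in listed order.

    Assumption: each SyncTimes(ms) value is a cumulative total for one journal.
    """
    match = STATS_RE.search(log_line)
    if match is None:
        raise ValueError("no sync statistics found in line")
    syncs = int(match.group(1))
    totals = [int(t) for t in match.group(2).split()]
    if syncs == 0:
        return [0.0 for _ in totals]
    return [total / syncs for total in totals]


samples = [
    "2012-10-20 20:05:43,249 INFO "
    "org.apache.hadoop.hdfs.server.namenode.FSEditLog: **** "
    "Number of syncs: 1347 SyncTimes(ms): 14138 3677",
    "2012-10-22 18:34:42,223 INFO "
    "org.apache.hadoop.hdfs.server.namenode.FSEditLog: **** "
    "Number of syncs: 51 SyncTimes(ms): 34553 312",
]

for line in samples:
    # Prints one rounded average per journal for each sample line.
    print([round(avg, 1) for avg in avg_sync_ms(line)])
```

Under that assumption, the first journal goes from about 10.5 ms per sync
to about 677.5 ms per sync, while the second stays in single digits (about
2.7 ms and 6.1 ms). That is why the second journal does not look like the
slow one.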