A replica waiting to be recovered does not provision read

2015-04-21 Thread lei liu
One block has three replicas. I restarted all DataNodes, so all three replicas are in the RWR state. When a DFS client reads the file, it throws an IOException. The visible length of an RWR-state replica is -1; why is it not BytesOnDisk? Thanks, LiuLei

java.net.SocketTimeoutException: read(2) error: Resource temporarily unavailable

2014-07-04 Thread lei liu
I use hbase-0.94 and hadoop-2.2, and I see the exception below: 2014-07-04 12:43:49,700 WARN org.apache.hadoop.hdfs.DFSClient: failed to connect to DomainSocket(fd=322,path=/home/hadoop/hadoop-current/cdh4-dn-socket/dn_socket) java.net.SocketTimeoutException: read(2) error: Resource temporarily unavai

Re: hedged read bug

2014-06-09 Thread lei liu
e the infinite loop that you > reported > > with the test case that you gave. > > > > However, if you're still seeing a bug on your side, then I recommend > filing > > a new jira issue with a full description. We can continue > troubleshooting > > there. > >

Re: hedged read bug

2014-06-08 Thread lei liu
let us know? Thank you! > > Chris Nauroth > Hortonworks > http://hortonworks.com/ > > > > On Thu, Jun 5, 2014 at 8:34 PM, lei liu wrote: > > > I use hadoop2.4. > > > > When I use "hedged read", If there is only one live datanode, the reading

hedged read bug

2014-06-05 Thread lei liu
I use hadoop2.4. When I use "hedged read" and there is only one live datanode, if reading from that datanode throws TimeoutException and ChecksumException, the client waits forever. Example test case below: @Test public void testException() throws IOException, InterruptedException, Exec
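
For illustration only: the hang described above is the kind of problem a bounded wait avoids. The following is a minimal sketch in plain Java of a hedged read that gives up once every attempt has failed or an overall deadline has passed. It is not DFSClient code and does not reproduce the truncated test case; `HedgedReadSketch`, `hedgedRead`, and all of its parameters are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class HedgedReadSketch {
    /**
     * Starts the first attempt, starts one more attempt whenever hedgeDelayMs
     * passes without a result (or the running attempts have all failed), and
     * returns the first successful result. Throws IOException instead of
     * waiting forever when all attempts failed or the deadline passed.
     */
    static <T> T hedgedRead(List<Callable<T>> attempts, long hedgeDelayMs,
                            long overallTimeoutMs, ExecutorService pool)
            throws IOException, InterruptedException {
        if (attempts.isEmpty()) {
            throw new IllegalArgumentException("at least one attempt is required");
        }
        CompletionService<T> completed = new ExecutorCompletionService<>(pool);
        List<Future<T>> started = new ArrayList<>();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(overallTimeoutMs);
        int next = 0;      // index of the next attempt to start
        int finished = 0;  // number of attempts that have completed (all failures so far)
        Throwable last = null;
        try {
            started.add(completed.submit(attempts.get(next++)));
            while (finished < started.size() || next < attempts.size()) {
                long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMs <= 0) {
                    break; // overall deadline reached
                }
                // Wait only for the hedge delay while further attempts are available.
                long waitMs = next < attempts.size()
                        ? Math.min(hedgeDelayMs, remainingMs) : remainingMs;
                Future<T> done = completed.poll(waitMs, TimeUnit.MILLISECONDS);
                if (done == null) {
                    if (next < attempts.size()) {
                        started.add(completed.submit(attempts.get(next++))); // hedge
                    }
                    continue;
                }
                finished++;
                try {
                    return done.get(); // first success wins
                } catch (ExecutionException e) {
                    last = e.getCause();
                    // Nothing is running any more: start the next attempt right away.
                    if (finished == started.size() && next < attempts.size()) {
                        started.add(completed.submit(attempts.get(next++)));
                    }
                }
            }
            throw new IOException("all read attempts failed or timed out", last);
        } finally {
            for (Future<T> f : started) {
                f.cancel(true); // stop attempts that are still running
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            // A single "datanode" whose read always fails, as in the report above.
            Callable<String> failing = () -> {
                throw new IOException("simulated checksum error");
            };
            try {
                hedgedRead(List.of(failing), 50, 1000, pool);
            } catch (IOException e) {
                System.out.println("gave up instead of hanging: " + e.getCause().getMessage());
            }
        } finally {
            pool.shutdownNow();
        }
    }
}
```

The loop exits as soon as no attempt is running and none is left to start, so a single failing source produces an exception rather than an unbounded wait.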

deadNodes in DFSInputStream

2013-12-31 Thread lei liu
I use HBase-0.94 and CDH-4.3.1. When a RegionServer reads data from the local datanode and the local datanode is dead, the local datanode is added to deadNodes and the RegionServer reads data from a remote datanode. But when the local datanode becomes live again, the RegionServer still reads data from the remote datanode, which reduces

ByteBuffer-based read API for pread

2013-12-31 Thread lei liu
There is a ByteBuffer read API for sequential reads in CDH4.3.1, for example: public synchronized int read(final ByteBuffer buf) throws IOException. But there is no ByteBuffer read API for pread. Why is a ByteBuffer read API for pread not supported in CDH4.3.1? Thanks, LiuLei

Re: Metrics2 code

2013-11-21 Thread lei liu
; value. There aren't any time-based rolling metrics to my knowledge besides > MutableQuantiles. > > Best, > Andrew > > > On Wed, Nov 20, 2013 at 7:34 PM, lei liu wrote: > > > I use cdh-4.3.1 version. I am reading the code about metrics2. > > >

Metrics2 code

2013-11-20 Thread lei liu
I use cdh-4.3.1. I am reading the metrics2 code. There are COUNTER and GAUGE metric types in metrics v2. What is the difference between the two? DataNodeMetrics has a @Metric MutableCounterLong bytesWritten attribute, which is used to report bytes written per second on D
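
For illustration only: the usual distinction is that a counter increases monotonically (a running total, from which throughput can be calculated), while a gauge is an arbitrarily varying value that can go up or down. The sketch below shows that distinction in plain Java and how a per-second rate could be derived from two samples of a counter. It does not use the metrics2 classes; `CounterGaugeSketch`, `Counter`, `Gauge`, and `ratePerSecond` are hypothetical names for this example.

```java
import java.util.concurrent.atomic.AtomicLong;

final class CounterGaugeSketch {
    /** A counter only ever increases, e.g. total bytes written since start. */
    static final class Counter {
        private final AtomicLong value = new AtomicLong();

        void incr(long delta) {
            if (delta < 0) {
                throw new IllegalArgumentException("a counter cannot decrease");
            }
            value.addAndGet(delta);
        }

        long value() {
            return value.get();
        }
    }

    /** A gauge holds a current level that may rise or fall, e.g. open connections. */
    static final class Gauge {
        private final AtomicLong value = new AtomicLong();

        void set(long newValue) {
            value.set(newValue);
        }

        void add(long delta) { // delta may be negative
            value.addAndGet(delta);
        }

        long value() {
            return value.get();
        }
    }

    /** Derives a per-second rate from two samples of a counter taken elapsedMs apart. */
    static double ratePerSecond(long previous, long current, long elapsedMs) {
        if (elapsedMs <= 0) {
            throw new IllegalArgumentException("elapsedMs must be positive");
        }
        return (current - previous) * 1000.0 / elapsedMs;
    }

    public static void main(String[] args) {
        Counter bytesWritten = new Counter();
        long before = bytesWritten.value();
        bytesWritten.incr(4096);
        bytesWritten.incr(4096);
        long after = bytesWritten.value();
        // Pretend the two samples were taken two seconds apart.
        System.out.println("bytes/s: " + ratePerSecond(before, after, 2000));

        Gauge openConnections = new Gauge();
        openConnections.add(3);
        openConnections.add(-1);
        System.out.println("open connections: " + openConnections.value());
    }
}
```

Under this reading, a counter such as bytesWritten holds a cumulative total, and the per-second figure comes from whoever samples it at an interval.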

Re: block access token

2013-11-19 Thread lei liu
Hi Luke, can I set "dfs.block.access.token.lifetime" to two minutes? 2013/11/18 lei liu > Thanks Luke for your reply. > > The lifetime of the block access token is ten hours; should we > change it to two minutes? I think the shorter the lifetime of the token, tok

Re: block access token

2013-11-18 Thread lei liu
/DN > shared secrets are rolled periodically. As long as you cannot steal block > token easily (besides using zero-day bugs), there is really no security > hole here (by design). If you know of a way to steal block tokens without > root access, let us know. > > > On Tue, Nov 12,

block access token

2013-11-12 Thread lei liu
When a client reads a block from a DataNode, the block access token is used for authorization on the DataNode. But if the block access token is stolen by an impostor, the impostor can read the block. I think this is a security hole. I think we can use the replay cache mechanism in Kerberos to resolve the que

Re: Decommission DataNode

2013-10-29 Thread lei liu
Should the datanode be shut down when it is decommissioned? If this is a bug, I can fix it. 2013/10/29 lei liu > Hi Steve, thanks for your reply. > > > The same question applies to hadoop-2.2.0. > > > In hadoop-2.2.0, there is the code below in DatanodeManager.han

Re: Decommission DataNode

2013-10-28 Thread lei liu
will be killed by the NameNode, but in hadoop-2.2.0, when the datanode is decommissioned, the datanode is still alive. 2013/10/28 Steve Loughran > sounds like a question for the cloudera support forums > > > On 28 October 2013 08:59, lei liu wrote: > > > In CDH3u5, when the Dat

Decommission DataNode

2013-10-28 Thread lei liu
In CDH3u5, when a DataNode is decommissioned, the DataNode process is shut down by the NameNode. But in CDH4.3.1, when a DataNode is decommissioned, the DataNode process is not shut down by the NameNode. When the datanode is decommissioned, why is the datanode not automatically shut down

Datanode fencing mechanism

2013-10-28 Thread lei liu
In the https://issues.apache.org/jira/browse/HDFS-1972 jira, the following case is described: Scenario 3: DN restarts during split brain period (this scenario illustrates why I think we need to persistently record the promise about who is active) - block has 2 replicas, user asks to reduce to 1 - NN1 a

Re: hsync is too slower than hflush

2013-08-26 Thread lei liu
Hi all, the DataNode writes files sequentially, so I think the disk seek time should be very small. Why is the disk seek time 10ms? I think that is too long. Can we tune the Linux system configuration to reduce the disk seek time? 2013/8/26 haosdent > haha, thank you very much, I get it now. > > -- >

HDFS pread performance

2013-04-17 Thread lei liu
I tested HDFS pread performance. The average pread time is about 10ms, but the maximum reaches 200ms; about one percent of preads take 200ms, which causes my application to time out. I found that the max time of the RemoteBlockReader.readChunk method can also reach 100ms. The RemoteBlockR