Hi, guys.

Our cluster is now moving to secure mode. We have found many differences from 
non-secure mode; one of them is how the DataNode is started. I am not sure how 
it works, so I am sending this email to ask.
Secure mode requires a jsvc-like tool to start the DataNode, because that lets 
the DataNode listen on a port below 1024 while running as a non-root user.
I searched Google for the reason for using a port below 1024 and only found 
that Cloudera's CDH doc describes it: "DataNode must be below 1024, 
because this provides part of the security mechanism to make it impossible for 
a user to run a map task which impersonates a DataNode."
I tried configuring the DataNode's HTTP port as 2004 (the suggested value is 
1004), which is above 1024, and then started it in secure mode. As expected, 
the DataNode failed to start. But I found that the failure occurs because the 
DataNode itself checks the port number and throws an exception. Since a user 
running a map task might impersonate the DataNode, that user could also modify 
the DataNode code to skip this check. The modified code would then still 
impersonate the DataNode on a port above 1024, which a non-root user, and 
therefore an application in a map task, can use.
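To make the concern concrete, here is a minimal sketch of a check that lives 
only inside the process being checked. It is not the actual Hadoop source; the 
class and method names are hypothetical and only illustrate the idea.

```java
public class PortCheckSketch {
    // On Unix-like systems, ports below 1024 are "privileged":
    // normally only root may bind them.
    public static final int PRIVILEGED_PORT_LIMIT = 1024;

    public static boolean isPrivileged(int port) {
        return port > 0 && port < PRIVILEGED_PORT_LIMIT;
    }

    // Hypothetical stand-in for a startup check done by the DataNode itself.
    public static void checkSecurePort(int port) {
        if (!isPrivileged(port)) {
            throw new IllegalStateException(
                "Cannot start secure DataNode on unprivileged port " + port);
        }
    }

    public static void main(String[] args) {
        checkSecurePort(1004); // passes: 1004 is below 1024
        try {
            checkSecurePort(2004); // rejected: 2004 is not privileged
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Anyone who can rebuild the class can remove the call to checkSecurePort, so a 
check like this protects only against misconfiguration, not against a 
modified DataNode.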
So I assumed that the NN would also perform the check. I deleted the check 
code in the DataNode, configured the HTTP port as 2004, and started the 
DataNode in secure mode. The DataNode started successfully and the NN accepted 
it. Data was also written to the DataNode. Everything worked as if it were a 
normal DataNode.

Is this a defect, or have I missed something? Either way, please let me 
know. Thank you.
