This question is not directly related to Hive, but: I configured 3 datanodes on my Linux machine. In my configuration, I set the replication factor to 1.
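For reference, a replication factor of 1 is normally set through the `dfs.replication` property. The fragment below is only a sketch of such a setting, assuming it lives in `hdfs-site.xml` (older Hadoop releases use `hadoop-site.xml`); it is not a copy of the actual configuration file from this setup.

```xml
<!-- Sketch only: the file name is an assumption (hdfs-site.xml, or
     hadoop-site.xml on older releases); the value 1 is the replication
     factor described in this message. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```

Note that this property is a default the client applies when a file is created, so it needs to be visible to whichever client writes the file.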
I then uploaded a file to HDFS and found (I checked from the browser) that the file has 3 copies, one on each datanode. Isn't it right that I should see the file on only 1 datanode, as 1 replica?

Thanks