Should we assume the non-Hadoop system has no way to get onto the network of the Hadoop cluster and its clients? If it can reach it (or can be granted temporary access), you can simply perform the write directly from that machine.
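For that direct write, a minimal sketch, assuming a Hadoop client plus the cluster's config files are installed on the source machine once it is on the network; the namenode address and all paths below are placeholders:

    # write straight from the source machine into HDFS, no intermediate copy
    hadoop fs -put /data/dump.db hdfs://namenode:8020/user/ted/dump.db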
If not, a few options:

Run a micro FTP server pointed at that file, then do a 'hadoop fs -cp
ftp://location hdfs://location', since FTPFileSystem is present in Hadoop
(first sketch below the quoted message). Or, if the file is NFS-mounted (or
similar) on a client machine, file:/// will work, as will copyFromLocal/put
(second sketch below). Essentially you're bringing the file in remotely but
performing the copy via the CLI.

Alternatively, you can copy it over in chunks, either keeping the
destination file's writer open, if possible, or appending (depending on
what version of Hadoop you're using); see the third sketch below.

On Thu, Jul 5, 2012 at 11:54 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> Hi,
> One of the customers wants to transfer a dump file (size ~ 2TB) from
> outside the hadoop cluster onto hdfs.
> The size exceeds the free space on the CLI machine.
>
> I want to poll best practice in this scenario.
>
> Thanks

--
Harsh J
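To make the FTP route concrete, a rough sketch, assuming pyftpdlib is
available on the source machine (any small FTP server would do); the
hosts, port and paths are placeholders:

    # on the non-hadoop source machine: serve the dump's directory
    # read-only over anonymous FTP on port 2121
    python -m pyftpdlib -d /data -p 2121

    # on any machine with a Hadoop client: copy straight into HDFS
    # via FTPFileSystem
    hadoop fs -cp ftp://anonymous:anon@source-host:2121/dump.db \
        hdfs://namenode:8020/user/ted/dump.db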
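For the NFS route, a sketch assuming the dump's filesystem is mounted at
/mnt/dump on a machine with a Hadoop client; mount point and paths are
placeholders:

    # copy via the mounted path; file:/// with -cp and copyFromLocal/put
    # are equivalent ways to reach the same local file
    hadoop fs -cp file:///mnt/dump/dump.db hdfs://namenode:8020/user/ted/dump.db
    # or:
    hadoop fs -copyFromLocal /mnt/dump/dump.db /user/ted/dump.db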
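For the chunked copy, one way to avoid ever landing the 2TB file on the CLI
machine's disk is to stream it; a sketch assuming SSH access from a Hadoop
client to the source machine, with placeholder hosts and paths. Note that
-appendToFile requires a release with append support; on older versions
you'd keep a single writer open from a small client program instead:

    # stream the whole file in one go; it never touches local disk
    ssh user@source-host 'cat /data/dump.db' | hadoop fs -put - /user/ted/dump.db

    # or in restartable 10GB chunks: -put the first, append the rest
    ssh user@source-host 'dd if=/data/dump.db bs=64M count=160 skip=0' |
        hadoop fs -put - /user/ted/dump.db
    ssh user@source-host 'dd if=/data/dump.db bs=64M count=160 skip=160' |
        hadoop fs -appendToFile - /user/ted/dump.db
    # ...increment skip by 160 per chunk until EOF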