Should we assume the non-Hadoop system has no way to get onto the
network of the Hadoop cluster and its clients? If it can reach them,
you could grant it temporary access and do the write directly from
that host; a rough sketch of that is right below.
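For instance, assuming the Hadoop client and configs can be installed
on the source host, and with made-up paths/hostnames:

    # run on the source host itself, once it can reach the NameNode and DataNodes
    hadoop fs -put /data/dump.bin hdfs://namenode:8020/user/ted/dump.bin

If it cannot reach the cluster at all: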

Run a micro FTP server pointed at that file, and then do a 'hadoop fs
-cp ftp://location hdfs://location', since an FTPFileSystem is present
in Hadoop. Or, if the source is NFS-mounted (or similar) onto a machine
that can reach the cluster, file:/// will work (or copyFromLocal/put).
Essentially you're reading the file remotely but performing the copy
via the CLI, so nothing has to be staged on the local disk first.
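For example, with hypothetical hostnames and paths (and any small FTP
daemon serving the dump read-only):

    # read straight off the remote FTP server, write into HDFS
    hadoop fs -cp ftp://user:pass@source-host/dumps/dump.bin /user/ted/dump.bin

    # or, if the dump is NFS-mounted onto a machine with cluster access
    hadoop fs -cp file:///mnt/source/dump.bin /user/ted/dump.bin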

Or you can copy it over in chunks, either keeping the destination
file's writer open, if possible, or appending each chunk (depending on
what version of Hadoop you're using).
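A rough sketch of the chunked route, with hypothetical chunk names,
assuming a Hadoop release whose shell has -appendToFile (on older
releases you'd do the same through the FileSystem append API in Java):

    # fetch one manageable chunk at a time onto the CLI box, push it, delete it, repeat
    hadoop fs -put chunk-000 /user/ted/dump.bin            # first chunk creates the file
    hadoop fs -appendToFile chunk-001 /user/ted/dump.bin   # later chunks get appended
    rm chunk-001                                           # free local space before fetching the next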

On Thu, Jul 5, 2012 at 11:54 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> Hi,
> One of the customers wants to transfer a dump file (size ~ 2 TB) from
> outside the Hadoop cluster onto HDFS.
> The size exceeds the free space on the CLI machine.
>
> I want to poll for best practices in this scenario.
>
> Thanks



-- 
Harsh J
