Hey Benoit,

Sorry for the slow response on this.

The attachment seems not to have come through. Would you mind filing
an HDFS JIRA and attaching the reproducer case there?

Thanks
-Todd

On Wed, Sep 5, 2012 at 12:28 AM, Benoit Perroud <ben...@noisette.ch> wrote:
> Hi All,
>
> I am seeing some memory retention while copying data into HDFS when an
> IOException is thrown.
>
> My use case is the following: I have multiple threads sharing a
> FileSystem object, all uploading files. At some point the quota is
> exceeded in one thread and I get a DSQuotaExceededException (a subclass
> of IOException). In both the regular case and when such an exception is
> thrown, I close the DFSOutputStream.
> But for a DFSOutputStream that encountered an IOException, the last
> Packet is kept in memory until the FileSystem is closed, which I
> usually don't do very often.
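>
> In outline, each uploader thread does roughly the following (a
> simplified sketch, not the actual code; the path and buffer are
> illustrative):
>
>     import java.io.IOException;
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.fs.FSDataOutputStream;
>     import org.apache.hadoop.fs.FileSystem;
>     import org.apache.hadoop.fs.Path;
>
>     // Shared by all uploader threads:
>     final FileSystem fs = FileSystem.get(new Configuration());
>
>     // Body of each uploader thread:
>     byte[] buffer = new byte[64 * 1024];
>     FSDataOutputStream out = null;
>     try {
>         out = fs.create(new Path("/uploads/file-" + Thread.currentThread().getId()));
>         out.write(buffer); // throws DSQuotaExceededException once the quota is exceeded
>     } finally {
>         if (out != null) {
>             try {
>                 out.close(); // closed in both the regular and the failure case
>             } catch (IOException ignored) {
>                 // close() can rethrow the pending failure
>             }
>         }
>     }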
>
> So my questions:
>
> - Is this the expected behavior, and do I need to deal with it myself?
> - Is there a way to properly close a DFSOutputStream (and free all
> the retained memory) without closing the FileSystem?
> - Is it recommended to share one FileSystem object across several threads?
>
> Attached is a simple test reproducing the behavior: a MiniDFSCluster is
> launched and a tiny quota is set so that the IOException is thrown.
> Random content is generated and uploaded to HDFS. The FileSystem is not
> closed, so memory grows until an OOM is thrown (don't blame me
> for the @Test(expected = OutOfMemoryError.class) :)). Tested on Hadoop
> 1.0.2.
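>
> The attached test has the full details; in outline it does roughly
> this (an approximation, the quota values and paths are illustrative):
>
>     Configuration conf = new Configuration();
>     MiniDFSCluster cluster = new MiniDFSCluster(conf, 1, true, null);
>     DistributedFileSystem fs = (DistributedFileSystem) cluster.getFileSystem();
>
>     Path dir = new Path("/quota");
>     fs.mkdirs(dir);
>     fs.setQuota(dir, 100, 1024); // tiny disk-space quota so writes fail
>
>     byte[] data = new byte[8 * 1024];
>     new Random().nextBytes(data);
>
>     while (true) { // retained Packets pile up until OutOfMemoryError
>         FSDataOutputStream out = null;
>         try {
>             out = fs.create(new Path(dir, "file-" + System.nanoTime()));
>             out.write(data);
>         } catch (IOException expected) {
>             // the DSQuotaExceededException lands here
>         } finally {
>             if (out != null) {
>                 try { out.close(); } catch (IOException ignored) {}
>             }
>         }
>         // fs.close() is never called, so the retained memory is never freed
>     }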
>
> Thanks in advance for your answers, pointers and advice.
>
> Benoit.



-- 
Todd Lipcon
Software Engineer, Cloudera
