Hi Benjamin,

SAMZA-968 <https://issues.apache.org/jira/browse/SAMZA-968> is already
assigned to you.

Thanks,
Jagadish

On Thu, Jun 16, 2016 at 10:51 AM, Benjamin Smith <
ben.sm...@ranksoftwareinc.com> wrote:

> Sure, looks like a straightforward enough change.
>
>
> I've created: https://issues.apache.org/jira/browse/SAMZA-968
>
>
> I don't see any way to assign it to myself, though.
>
> ________________________________
> From: Yi Pan <nickpa...@gmail.com>
> Sent: Thursday, June 16, 2016 1:02:59 PM
> To: dev@samza.apache.org
> Subject: Re: Bug in SequenceFileHdfsFileWriter
>
> Hi, Benjamin,
>
> Thanks a lot for reporting this! It makes sense from reading the posts.
> Could you open a JIRA? Would you be interested in assigning it to yourself
> and contributing the fix?
>
> Thanks a lot again!
>
> -Yi
>
> On Thu, Jun 16, 2016 at 9:52 AM, Benjamin Smith <
> ben.sm...@ranksoftwareinc.com> wrote:
>
> >
> > Hello,
> >
> > I am working on a project where we are integrating Samza and Hive. As part
> > of this project, we ran into an issue where sequence files written from
> > Samza were taking a long time (hours) to completely sync with HDFS.
> >
> > After some Googling and digging into the code, it appears that the issue
> > is here:
> >
> >
> https://github.com/apache/samza/blob/master/samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/writer/SequenceFileHdfsWriter.scala#L111
> >
> > Writer.stream(dfs.create(path)) implies that the caller of
> > dfs.create(path) is responsible for closing the created stream explicitly.
> > This doesn't happen, and the SequenceFileHdfsWriter call to close will only
> > flush the stream.
> >
> > I believe the correct line should be:
> >
> > Writer.file(path)
> >
> > Or, SequenceFileHdfsWriter should explicitly track and close the stream.
> >
> > Thanks!
> >
> > Ben
> >
> > Reference material:
> >
> >
> http://stackoverflow.com/questions/27916872/why-the-sequencefile-is-truncated
> >
> >
> https://apache.googlesource.com/hadoop-common/+/HADOOP-6685/src/java/org/apache/hadoop/io/SequenceFile.java#1238
> >
> >
>
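
For reference, the ownership issue described in the quoted report can be
modeled in a few lines of plain Java. This is a simplified sketch, not Hadoop
or Samza code: TrackingStream, SketchWriter, owning and borrowing are
hypothetical names standing in for the stream returned by dfs.create(path) and
for a writer built with Writer.file(path) versus Writer.stream(...). It
assumes, as the report describes, that a writer handed an existing stream only
flushes it on close.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Simplified model of stream ownership; all names here are hypothetical. */
class StreamOwnershipSketch {

    /** Stand-in for the stream returned by dfs.create(path); records whether it was closed. */
    static final class TrackingStream extends ByteArrayOutputStream {
        private boolean closed = false;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Stand-in for a writer that either owns its stream or borrows one from the caller. */
    static final class SketchWriter implements AutoCloseable {
        private final OutputStream out;
        private final boolean ownsStream;

        private SketchWriter(OutputStream out, boolean ownsStream) {
            this.out = out;
            this.ownsStream = ownsStream;
        }

        /** Models the Writer.file(path) case: the writer is responsible for the stream. */
        static SketchWriter owning(OutputStream out) {
            return new SketchWriter(out, true);
        }

        /** Models the Writer.stream(...) case: the caller remains responsible for the stream. */
        static SketchWriter borrowing(OutputStream out) {
            return new SketchWriter(out, false);
        }

        void append(String record) throws IOException {
            out.write(record.getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public void close() throws IOException {
            if (ownsStream) {
                out.close();   // the writer owns the stream, so it closes it
            } else {
                out.flush();   // a borrowed stream is only flushed
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Borrowed stream: closing the writer leaves the stream open.
        TrackingStream borrowed = new TrackingStream();
        SketchWriter borrowingWriter = SketchWriter.borrowing(borrowed);
        borrowingWriter.append("record");
        borrowingWriter.close();
        System.out.println("borrowed, after writer.close(): closed=" + borrowed.isClosed());

        // The caller has to close the borrowed stream explicitly.
        borrowed.close();
        System.out.println("borrowed, after stream.close(): closed=" + borrowed.isClosed());

        // Owned stream: closing the writer also closes the stream.
        TrackingStream owned = new TrackingStream();
        SketchWriter owningWriter = SketchWriter.owning(owned);
        owningWriter.append("record");
        owningWriter.close();
        System.out.println("owned, after writer.close(): closed=" + owned.isClosed());
    }
}
```

In this model the borrowed stream is still open after the writer is closed,
which matches the situation in the report. Either of the two fixes suggested
there (letting the writer own the stream, or tracking and closing the stream
explicitly) leaves the stream closed.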



-- 
Jagadish V,
Graduate Student,
Department of Computer Science,
Stanford University
