Okay, nice to hear it works out!

On Mon, Oct 5, 2015 at 1:50 PM, Pieter Hameete <phame...@gmail.com> wrote:

> Hi Stephan,
>
> it was not the SimpleInputProjection, because that is a stateless object.
> The boolean endReached was not reset upon opening a new file however, so
> for each consecutive file no records were parsed.
>
> Thanks a lot for your help!
>
> - Pieter
>
> 2015-10-05 12:50 GMT+02:00 Stephan Ewen <se...@apache.org>:
>
>> If you have more files than task slots, then some tasks will get multiple
>> files. That means that open() and close() are called multiple times on the
>> input format.
>>
>> Make sure that your input format tolerates that and does not get confused
>> with lingering state (maybe create a new SimpleInputProjection as well).
>>
>> On Mon, Oct 5, 2015 at 12:41 PM, Pieter Hameete <phame...@gmail.com>
>> wrote:
>>
>>> Hi Stephan,
>>>
>>> it concerns the DataSet API.
>>>
>>> The program I'm running can be found at
>>> https://github.com/PHameete/dawn-flink/blob/development/src/main/scala/wis/dawnflink/performance/xmark/XMarkQuery11.scala
>>> The custom input format is at
>>> https://github.com/PHameete/dawn-flink/blob/development/src/main/scala/wis/dawnflink/parsing/xml/XML2DawnInputFormat.java
>>>
>>> Cheers!
>>>
>>> 2015-10-05 12:38 GMT+02:00 Stephan Ewen <se...@apache.org>:
>>>
>>>> I assume this concerns the streaming API?
>>>>
>>>> Can you share your program and/or the custom input format code?
>>>>
>>>> On Mon, Oct 5, 2015 at 12:33 PM, Pieter Hameete <phame...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello Flinkers!
>>>>>
>>>>> I ran into some strange behavior when reading from a folder of input
>>>>> files.
>>>>>
>>>>> When the number of input files in the folder exceeds the number of
>>>>> task slots I noticed that the size of my datasets varies with each run. It
>>>>> seems as if the transformations don't wait for all input files to be read.
>>>>>
>>>>> When I have equal or more task slots than there are files, there are
>>>>> no problems.
>>>>>
>>>>> I'm using a custom input format. Could there be a problem with my
>>>>> custom input format, and if so what could I be forgetting?
>>>>>
>>>>> Kind regards and thank you for your time!
>>>>>
>>>>> Pieter
>>>>>
>>>>
>>>>
>>>
>>
>
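
The fix described in this thread (resetting endReached whenever a new file is opened) can be sketched roughly as follows. This is only an illustration, not the code from XML2DawnInputFormat: the class ReusableReader and its readAll helper are hypothetical, the "files" are plain in-memory lists, and nothing here uses Flink itself. It only mirrors the open() / reachedEnd() / nextRecord() / close() lifecycle that an input format goes through when one task gets several files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a custom input format. It mimics the
// open()/reachedEnd()/nextRecord()/close() lifecycle without using Flink.
class ReusableReader {
    private Iterator<String> lines;
    private boolean endReached;

    // Called once per file. The same instance may be opened several times
    // when there are more files than task slots.
    void open(List<String> file) {
        lines = file.iterator();
        endReached = false; // reset per-file state; omitting this skips later files
    }

    boolean reachedEnd() {
        return endReached;
    }

    // Returns the next record, or null once the current file is exhausted.
    String nextRecord() {
        if (lines.hasNext()) {
            return lines.next();
        }
        endReached = true;
        return null;
    }

    void close() {
        lines = null;
    }

    // Reads all files with a single reader instance, the way one task slot
    // would when it is assigned more than one file.
    static List<String> readAll(List<List<String>> files) {
        ReusableReader reader = new ReusableReader();
        List<String> records = new ArrayList<>();
        for (List<String> file : files) {
            reader.open(file);
            while (!reader.reachedEnd()) {
                String record = reader.nextRecord();
                if (record != null) {
                    records.add(record);
                }
            }
            reader.close();
        }
        return records;
    }

    public static void main(String[] args) {
        List<List<String>> files = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"));
        System.out.println(readAll(files));
    }
}
```

If the reset line in open() is removed, the reader in this sketch returns records only from the first file, because reachedEnd() stays true for every file opened afterwards. That matches the symptom reported at the start of the thread.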
