Thanks a lot!
2015-08-10 12:20 GMT+02:00 Flavio Pompermaier :
Done through https://issues.apache.org/jira/browse/FLINK-2503
Thanks again,
Flavio
On Mon, Aug 10, 2015 at 12:11 PM, Fabian Hueske wrote:
Congrats that you got your InputFormat working!
It is true, there can be a few inconsistencies in the Formats derived from
FileInputFormat.
It would be great if you could open JIRAs for these issues. Otherwise, they
might get lost on the mailing list.
Thanks, Fabian
2015-08-10 12:02 GMT+02:00 Fla
Hi Fabian,
thanks to your help I finally managed to successfully generate a DataSet
from my folder but I think that there are some inconsistencies in the
hierarchy of InputFormats.
The *BinaryOutputFormat*/*TypeSerializerInputFormat* should somehow inherit
the behaviour of the FileInputFormat (so r
You need to do something like this:
public class YourInputFormat extends FileInputFormat {
    private boolean objectRead;

    @Override
    public FileInputSplit[] createInputSplits(int minNumSplits) {
        // Create one FileInputSplit for each file you want to read.
        // Check FileInputForma
Sorry Fabian but I don't understand what I should do :(
Could you provide me a simple snippet of code to achieve this?
On Fri, Aug 7, 2015 at 1:30 PM, Fabian Hueske wrote:
Enumeration of nested files is a feature of the FileInputFormat.
If you implement your own IF based on FileInputFormat as I suggested
before, you can use that feature.
2015-08-07 12:29 GMT+02:00 Flavio Pompermaier :
I have a directory containing a list of files, each one containing a
kryo-serialized object.
With JSON-serialized objects I don't have that problem (but there I use
env.readTextFile(path).withParameters(parameters)
where parameters has the ENUMERATE_NESTED_FILES_FLAG set to true).
On Fri, Aug 7, 2
I don't know your use case.
The InputFormat interface is very flexible. Directories can be recursively
read. A file can contain one or more objects. You can also make a smarter
IF and put multiple (small) files into one split...
It is up to your use case what you need to implement.
2015-08-07 12
Should this be the case even when just recursively reading an entire
directory containing one object per file?
On Fri, Aug 7, 2015 at 12:04 PM, Fabian Hueske wrote:
You could implement your own InputFormat based on FileInputFormat and
override the createInputSplits method to just create a single split per
file.
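
The "single split per file" idea can be sketched without any Flink dependency. The following is only an illustration in plain Java: WholeFileSplit and OneSplitPerFile are hypothetical names, not Flink classes, and a real implementation would return FileInputSplit objects from createInputSplits instead. It walks a directory recursively and emits exactly one split per regular file, so a small file is never cut into byte ranges.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for a file split: it always covers one whole file.
final class WholeFileSplit {
    final int splitNumber;
    final Path file;
    final long length;

    WholeFileSplit(int splitNumber, Path file, long length) {
        this.splitNumber = splitNumber;
        this.file = file;
        this.length = length;
    }
}

final class OneSplitPerFile {
    // Walks the directory tree (nested folders included) and creates
    // exactly one split per regular file, regardless of file size.
    static List<WholeFileSplit> createSplits(Path root) throws IOException {
        List<Path> files;
        try (Stream<Path> paths = Files.walk(root)) {
            files = paths.filter(Files::isRegularFile)
                         .sorted()
                         .collect(Collectors.toList());
        }
        List<WholeFileSplit> splits = new ArrayList<>();
        for (Path file : files) {
            splits.add(new WholeFileSplit(splits.size(), file, Files.size(file)));
        }
        return splits;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (WholeFileSplit s : createSplits(root)) {
            System.out.println(s.splitNumber + " " + s.file + " (" + s.length + " bytes)");
        }
    }
}
```

With one split per file, the number of splits follows the number of files rather than the requested minimum, so no reader is handed an empty range or a range that starts in the middle of an object.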
2015-08-07 12:02 GMT+02:00 Flavio Pompermaier :
So what should I do?
On Fri, Aug 7, 2015 at 12:01 PM, Fabian Hueske wrote:
Ah, I checked the code.
The BinaryInputFormat expects metadata which is written by the
BinaryOutputFormat.
So you cannot use the BinaryInputFormat to read a file which does not
provide the metadata.
2015-08-07 11:53 GMT+02:00 Flavio Pompermaier :
The file containing the serialized object is 7 bytes
On Fri, Aug 7, 2015 at 11:49 AM, Fabian Hueske wrote:
This might be an issue with the blockSize parameter of the
BinaryInputFormat.
How large is the file with the single object?
2015-08-07 11:37 GMT+02:00 Flavio Pompermaier :
I also tried with
DataSet ds = env.createInput(inputFormat).setParallelism(1);
but I get the same error :(
Moreover, in this example I put exactly one object per file so it should be
able to deserialize it, right?
On Fri, Aug 7, 2015 at 11:33 AM, Fabian Hueske wrote:
If you create your file by just sequentially writing all objects to the
file using Kryo, you can only read it with a parallelism of 1.
Writing binary files in a way that they can be read in parallel is a bit
tricky (and not specific to Flink).
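
To illustrate why a sequentially written binary file can only be read from the beginning, here is a minimal sketch in plain Java. It does not use Kryo and is not the layout written by Flink's BinaryOutputFormat; LengthPrefixedRecords is a hypothetical class that frames each record with a 4-byte length. The point is that a record boundary is only known after walking every preceding length header, so a reader dropped at an arbitrary byte offset (as with several input splits over one file) has no way to find where the next object starts.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical framing for illustration: [int length][payload] repeated.
final class LengthPrefixedRecords {
    // Writes all records back to back, each preceded by its length.
    static byte[] write(List<byte[]> records) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (byte[] record : records) {
                out.writeInt(record.length);
                out.write(record);
            }
        }
        return bytes.toByteArray();
    }

    // Reads the records in order, starting at offset 0. Each boundary is
    // derived from the previous header, so the read is inherently sequential.
    static List<byte[]> readAll(byte[] data) throws IOException {
        List<byte[]> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                byte[] record = new byte[in.readInt()];
                in.readFully(record);
                records.add(record);
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        List<byte[]> records = Arrays.asList(
                "first".getBytes(StandardCharsets.UTF_8),
                "second".getBytes(StandardCharsets.UTF_8));
        byte[] data = write(records);
        System.out.println("records read back: " + readAll(data).size());
    }
}
```

Formats meant for parallel reads typically add extra structure (for example fixed-size blocks or sync markers) so that a reader can locate a record start from any split offset; that is the kind of metadata a plain sequential dump lacks.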
2015-08-07 11:28 GMT+02:00 Flavio Pompermaier :
Hi to all,
I'm trying to read a file serialized with Kryo but I get this exception
(due to the fact that createInputSplits creates 8 input splits, of which
just one is not empty):
Caused by: java.io.IOException: Invalid argument
at sun.nio.ch.FileChannelImpl.position0(Native Method)
at sun.nio.c