Re: How best to deal with wide, structured tuples?

2015-11-09 Thread Johann Kovacs
o contain the Schema. But this could have performance implications. > > We should definitely add support for creating a DataSet[Row] directly from > a CSV input, since otherwise you have to go through tuples, which does > not work > with dynamic schemas and if you have more than a cer

Re: How best to deal with wide, structured tuples?

2015-11-02 Thread Johann Kovacs
ever, similar to the first >> approach, you need to know the schema of the data in advance (before the >> program is executed). >> >> In my opinion the first approach is the better one, but as you said it is >> more effort to implement and might not work depending on what inf

How best to deal with wide, structured tuples?

2015-10-28 Thread Johann Kovacs
Hi all, I am currently evaluating a use case where I have to deal with wide (i.e. about 50-60 columns, definitely more than the 25 supported by the Tuple types), structured data from CSV files, with a potentially dynamically (at runtime) generated (or automatically inferred from the