It does generalize types, but apparently only over the intersection of the 
columns. There might be a way to get the union of the columns as well using 
HiveQL. Types generalize upward, with string being the "most general".
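
For reference, here is a minimal sketch of the unionAll approach from earlier 
in the thread. It assumes a spark-shell session on Spark 1.1, where sc is 
already defined. The names sqx, fileA and fileB come from Daniel's snippet; 
a, b and combined are only illustrative. I have not checked what schema 
results when the inputs differ, so the printSchema call is there to inspect it.

```scala
import org.apache.spark.sql.{SQLContext, SchemaRDD}

// Assumption: running inside spark-shell, so `sc` (a SparkContext) already exists.
val sqx = new SQLContext(sc)

// Load both Parquet tables as SchemaRDDs ("fileA" and "fileB" are placeholder paths).
val a: SchemaRDD = sqx.parquetFile("fileA")
val b: SchemaRDD = sqx.parquetFile("fileB")

// unionAll is defined on SchemaRDD and returns a SchemaRDD,
// whereas sc.union(a, b) returns a plain RDD[Row] and drops the schema.
val combined: SchemaRDD = a.unionAll(b)

// Print the schema of the result to see which columns and types it ended up with.
combined.printSchema()
```

Since the result is still a SchemaRDD, it can be registered as a table or 
written back out as Parquet like any other SchemaRDD.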

Matei

> On Nov 1, 2014, at 6:22 PM, Daniel Mahler <dmah...@gmail.com> wrote:
> 
> Thanks Matei. What does unionAll do if the input RDD schemas are not 100% 
> compatible? Does it take the union of the columns and generalize the types?
> 
> thanks
> Daniel
> 
> On Sat, Nov 1, 2014 at 6:08 PM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
> Try unionAll, which is a special method on SchemaRDDs that keeps the schema 
> on the results.
> 
> Matei
> 
> > On Nov 1, 2014, at 3:57 PM, Daniel Mahler <dmah...@gmail.com> wrote:
> >
> > I would like to combine 2 parquet tables I have created.
> > I tried:
> >
> >       sc.union(sqx.parquetFile("fileA"), sqx.parquetFile("fileB"))
> >
> > but that just returns RDD[Row].
> > How do I combine them to get a SchemaRDD[Row]?
> >
> > thanks
> > Daniel
> 
> 
