Re: question related partitions of the DataFrame

2015-07-14 Thread Eugene Morozov
Gil, I’d say that a DataFrame is the result of transforming any other RDD. Your input RDD might contain strings and numbers, but as a result of the transformation you end up with an RDD that contains GenericRowWithSchema, which is what a DataFrame actually is. So, I’d say that DataFrame is just sort

Re: question related partitions of the DataFrame

2015-07-14 Thread Gil Vernik
I see that the most recent code doesn't have RDDApi anymore, but I would still like to understand the logic of DataFrame partitions. Does a DataFrame have its own partitions and is it a sort of RDD by itself, or does it depend on the partitions of the underlying RDD that was used to load the data? For exampl