Hi Matthes,

You may find the following blog post relevant:
http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/
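In case a concrete example helps, here is a rough, untested sketch of how you
might nest your rows by Col3 using Spark 1.1's SchemaRDD API and write them
out as Parquet. The case class names (Entry, Record) and the output path are
made up for illustration; the point is that a case class containing a Seq of
another case class is inferred as a repeated group, which Parquet then stores
with definition/repetition levels.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical nested layout: one record per Col3 value, with the
// remaining columns collected as a repeated group under it.
case class Entry(col1: Int, col4: Int, col2: Long)
case class Record(col3: Long, entries: Seq[Entry])

object NestedParquetSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("nested-parquet"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.createSchemaRDD // implicit RDD[Product] -> SchemaRDD

    // (Col1, Col2, Col3, Col4) rows from your example
    val rows = Seq(
      (14, 1234L, 1422L, 3),
      (14, 3212L, 1542L, 2),
      (14, 8910L, 1422L, 8),
      (15, 1234L, 1542L, 9),
      (15, 8897L, 1422L, 13))

    // Group by Col3 and nest the other columns under it
    val nested = sc.parallelize(rows)
      .groupBy(_._3)
      .map { case (col3, group) =>
        Record(col3, group.map(r => Entry(r._1, r._4, r._2)).toSeq)
      }

    // Spark SQL infers "entries" as an array of structs; Parquet encodes
    // that nested column with definition and repetition levels.
    nested.saveAsParquetFile("/tmp/nested.parquet")
  }
}

You can then load it back with sqlContext.parquetFile("/tmp/nested.parquet")
and query it as a table.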
Hope that helps,
-Jey

On Thu, Sep 25, 2014 at 5:05 PM, matthes <mdiekst...@sensenetworks.com> wrote:
> Hi again!
>
> At the moment I am trying to use Parquet, and I want to keep the data in
> memory in an efficient way so that requests against the data are as fast
> as possible.
> I read that Parquet is able to encode nested columns; it uses the Dremel
> encoding with definition and repetition levels.
> Is it currently possible to use this in Spark, or is it not implemented
> yet? If it is, I'm not sure how to do it. I saw some examples that nest
> arrays or case classes inside other case classes, but I don't think that
> is the right way. The other thing I saw in this context was SchemaRDDs.
>
> Input:
>
> Col1 | Col2 | Col3 | Col4
> int  | long | long | int
> -------------------------
> 14   | 1234 | 1422 | 3
> 14   | 3212 | 1542 | 2
> 14   | 8910 | 1422 | 8
> 15   | 1234 | 1542 | 9
> 15   | 8897 | 1422 | 13
>
> Desired Parquet format:
>
> Col3 | Col1 | Col4 | Col2
> long | int  | int  | long
> -------------------------
> 1422 | 14   | 3    | 1234
>  "   |  "   | 8    | 8910
>  "   | 15   | 13   | 8897
> 1542 | 14   | 2    | 3212
>  "   | 15   | 9    | 1234
>
> It would be awesome if somebody could give me a good hint on how I can do
> that, or maybe show a better way.
>
> Best,
> Matthes
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Is-it-possible-to-use-Parquet-with-Dremel-encoding-tp15186.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
---------------------------------------------------------------------