Hi Matthes,

You may find the following blog post relevant:
http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/
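
In case a concrete starting point helps, here's a minimal sketch of the
case-class route you mentioned. It assumes a Spark 1.1-era setup, and the
Record/Entry/Grouped names and the "grouped.parquet" path are made up for
illustration; nested Parquet support in Spark SQL was still new at the
time, so treat this as a sketch to adapt rather than a definitive recipe.
The idea is to group the flat rows by Col3 and nest the remaining columns
as a repeated group; when the resulting SchemaRDD is written out, Parquet
encodes the repeated fields with the Dremel-style definition/repetition
levels you read about:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Flat input row, matching the input table below.
case class Record(col1: Int, col2: Long, col3: Long, col4: Int)
// One repeated entry under a col3 key.
case class Entry(col1: Int, col4: Int, col2: Long)
// Nested row: col3 plus a repeated group of entries.
case class Grouped(col3: Long, entries: Seq[Entry])

object NestedParquetSketch {
  def main(args: Array[String]) {
    val sc = new SparkContext(new SparkConf().setAppName("nested-parquet"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.createSchemaRDD  // implicit RDD[case class] -> SchemaRDD

    val input = sc.parallelize(Seq(
      Record(14, 1234L, 1422L, 3),
      Record(14, 3212L, 1542L, 2),
      Record(14, 8910L, 1422L, 8),
      Record(15, 1234L, 1542L, 9),
      Record(15, 8897L, 1422L, 13)))

    // Group by col3 and nest the rest; the Seq[Entry] becomes a
    // repeated group in the inferred Parquet schema.
    val nested = input.groupBy(_.col3).map { case (col3, rows) =>
      Grouped(col3, rows.toSeq.sortBy(_.col1)
        .map(r => Entry(r.col1, r.col4, r.col2)))
    }

    // Write as Parquet, then read it back and query the nested structure.
    nested.saveAsParquetFile("grouped.parquet")
    val loaded = sqlContext.parquetFile("grouped.parquet")
    loaded.registerTempTable("grouped")
    sqlContext.sql("SELECT col3, entries FROM grouped")
      .collect().foreach(println)

    sc.stop()
  }
}

For the "keep it in memory" part, whether queries are fast then depends on
caching the loaded SchemaRDD (e.g. loaded.cache()) rather than on the
on-disk encoding itself.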

Hope that helps,
-Jey

On Thu, Sep 25, 2014 at 5:05 PM, matthes <mdiekst...@sensenetworks.com> wrote:
> Hi again!
>
> At the moment I'm trying to use Parquet, and I want to keep the data in
> memory in an efficient layout so that requests against the data are as
> fast as possible.
> I read that Parquet is able to encode nested columns; it uses the Dremel
> encoding with definition and repetition levels.
> Is it currently possible to use this from Spark, or is it not implemented
> yet? If it is possible, I'm not sure how to do it. I saw some examples
> that put arrays or case classes inside other case classes, but I don't
> think that is the right way. The other thing I saw mentioned in this
> context was SchemaRDDs.
>
> Input:
>
> Col1 | Col2 | Col3 | Col4
> int  | long | long | int
> -----+------+------+-----
> 14   | 1234 | 1422 |  3
> 14   | 3212 | 1542 |  2
> 14   | 8910 | 1422 |  8
> 15   | 1234 | 1542 |  9
> 15   | 8897 | 1422 | 13
>
> Want this Parquet format (" = same value as the row above):
>
> Col3 | Col1 | Col4 | Col2
> long | int  | int  | long
> -----+------+------+-----
> 1422 | 14   |  3   | 1234
> "    | "    |  8   | 8910
> "    | 15   | 13   | 8897
> 1542 | 14   |  2   | 3212
> "    | 15   |  9   | 1234
>
> It would be awesome if somebody could give me a good hint on how I can do
> that, or suggest a better way.
>
> Best,
> Matthes
>
>
>
>
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/Is-it-possible-to-use-Parquet-with-Dremel-encoding-tp15186.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
