Re: JSON to SQL

Andrés Ivaldi Thu, 28 Jan 2016 15:14:49 -0800

Thans for the tip, I've realize about that end I've ended using explode as
you said.


This is my attempt

 var res=(df.explode("rows","r") {
        l: WrappedArray[ArrayBuffer[String]] => l.toList}).select("r")
        .map { m => m.getList[Row](0) }

 var u = res.map { m => Row.fromSeq(m.toSeq) }

var df1 = df.sqlContext.createDataFrame(u, getScheme(df)  )

It woks ok, but throws an invalid cast to Integer if the scheme have some
IntegerType, looks like a spark-csv bug, but I can solved anyway

Thanks for the help.


On Thu, Jan 28, 2016 at 7:43 PM, Mohammed Guller <[email protected]>
wrote:

> You don’t need Hive for that. The DataFrame class has a method  named
> explode, which provides the same functionality.
>
>
>
> Here is an example from the Spark API documentation:
>
> df.explode("words", "word"){words: String => words.split(" ")}
>
>
>
> The first argument to the explode method  is the name of the input column
> and the second argument is the name of the output column.
>
>
>
> Mohammed
>
> Author: Big Data Analytics with Spark
> <http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>
>
>
>
> *From:* Andrés Ivaldi [mailto:[email protected]]
> *Sent:* Wednesday, January 27, 2016 7:17 PM
> *To:* Cheng, Hao
> *Cc:* Sahil Sareen; Al Pivonka; user
>
> *Subject:* Re: JSON to SQL
>
>
>
> I'm using DataFrames reading the JSON exactly as you say, and I can get
> the scheme from there. Reading the documentation, I realized that is
> possible to create Dynamically a Structure, so applying some
> transformations to the dataFrame plus the new structure I'll be able to
> save the JSON on my DBRM.
>
>
>
> For the flatten approach, you mentioned LateralView, do I need Hive DB for
> that? or just the Spark Hive Context? I saw some examples and that is
> exactly what I'm needing. Can you explain it a little bit more?
>
>
>
> Thanks
>
>
>
> On Wed, Jan 27, 2016 at 10:29 PM, Cheng, Hao <[email protected]> wrote:
>
> Have you ever try the DataFrame API like:
> sqlContext.read.json("/path/to/file.json"); the Spark SQL will auto infer
> the type/schema for you.
>
>
>
> And lateral view will help on the flatten issues,
>
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView,
> as well as the “a.b[0].c” format of expression.
>
>
>
>
>
> *From:* Andrés Ivaldi [mailto:[email protected]]
> *Sent:* Thursday, January 28, 2016 3:39 AM
> *To:* Sahil Sareen
> *Cc:* Al Pivonka; user
> *Subject:* Re: JSON to SQL
>
>
>
> I'm really brand new with Scala, but if I'm defining a case class then is
> becouse I know how is the json's structure is previously?
>
> If I'm able to define dinamicaly a case class from the JSON structure then
> even with spark I will be able to extract the data
>
>
>
> On Wed, Jan 27, 2016 at 4:01 PM, Sahil Sareen <[email protected]> wrote:
>
> Isn't this just about defining a case class and using
> parse(json).extract[CaseClassName]  using Jackson?
>
> -Sahil
>
>
>
> On Wed, Jan 27, 2016 at 11:08 PM, Andrés Ivaldi <[email protected]>
> wrote:
>
> We dont have Domain Objects, its a service like a pipeline, data is read
> from source and they are saved it in relational Database
>
> I can read the structure from DataFrames, and do some transformations, I
> would prefer to do it with Spark to be consistent with the process
>
>
>
> On Wed, Jan 27, 2016 at 12:25 PM, Al Pivonka <[email protected]> wrote:
>
> Are you using an Relational Database?
>
> If so why not use a nojs DB ? then pull from it to your relational?
>
>
>
> Or utilize a library that understands Json structure like Jackson to
> obtain the data from the Json structure the persist the Domain Objects ?
>
>
>
> On Wed, Jan 27, 2016 at 9:45 AM, Andrés Ivaldi <[email protected]> wrote:
>
> Sure,
>
> The Job is like an etl, but without interface, so I decide the rules of
> how the JSON will be saved into a SQL Table.
>
>
>
> I need to Flatten the hierarchies where is possible in case of list
> flatten also, nested objects Won't be processed by now
>
> {"a":1,"b":[2,3],"c"="Field", "d":[4,5,6,7,8] }
> {"a":11,"b":[22,33],"c"="Field1", "d":[44,55,66,77,88] }
> {"a":111,"b":[222,333],"c"="Field2", "d":[44,55,666,777,888] }
>
> I would like something like this on my SQL table
>
> a        b              c             d
>
> 1        2,3            Field         4,5,6,7,8
>
> 11       22,33          Field1        44,55,66,77,88
>
> 111      222,333        Field2        444,555,,666,777,888
>
> Right now this is what i need
>
> I will later add more intelligence, like detection of list or nested
> objects and create relations in other tables.
>
>
>
>
>
>
>
> On Wed, Jan 27, 2016 at 11:25 AM, Al Pivonka <[email protected]> wrote:
>
> More detail is needed.
>
> Can you provide some context to the use-case ?
>
>
>
> On Wed, Jan 27, 2016 at 8:33 AM, Andrés Ivaldi <[email protected]> wrote:
>
> Hello, I'm trying to Save a JSON filo into SQL table.
>
> If i try to do this directly the IlligalArgumentException is raised, I
> suppose this is beacouse JSON have a hierarchical structure, is that
> correct?
>
> If that is the problem, how can I flatten the JSON structure? The JSON
> structure to be processed would be unknow, so I need to do it
> programatically
>
> regards
>
> --
>
> Ing. Ivaldi Andres
>
>
>
>
>
> --
>
> Those who say it can't be done, are usually interrupted by those doing it.
>
>
>
> --
>
> Ing. Ivaldi Andres
>
>
>
>
>
> --
>
> Those who say it can't be done, are usually interrupted by those doing it.
>
>
>
> --
>
> Ing. Ivaldi Andres
>
>
>
>
>
>
> --
>
> Ing. Ivaldi Andres
>
>
>
>
>
> --
>
> Ing. Ivaldi Andres
>



-- 
Ing. Ivaldi Andres

Re: JSON to SQL

Reply via email to