Re: Using Spark to analyze complex JSON

2014-05-25 Thread Michael Armbrust
On Sat, May 24, 2014 at 11:47 PM, Mayur Rustagi wrote:
> Is the in-memory columnar store planned as part of SparkSQL?
This has already been ported from Shark, and is used when you run cacheTable.
> Also will both HiveQL & SQLParser be kept updated?
Yeah, we need to figure out exactly wha
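For readers unfamiliar with the term, here is a rough sketch of what a "columnar" layout means. It is plain Python, not Spark's implementation: the sample rows and the `to_columns` helper are hypothetical, and the sketch assumes every row has the same keys.

```python
# Hypothetical sample rows; nothing here comes from Spark or Shark.
rows = [
    {"name": "Nick", "score": 3},
    {"name": "Alice", "score": 7},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one field now reads a single list
# instead of touching every full row.
total_score = sum(columns["score"])
```

Keeping the values of one column together is what makes scans and aggregates over a single field cheap, which is the general idea behind caching a table in columnar form.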

Re: Using Spark to analyze complex JSON

2014-05-24 Thread Mayur Rustagi
Hi Michael,
Is the in-memory columnar store planned as part of SparkSQL?
Also will both HiveQL & SQLParser be kept updated?
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi
On Sun, May 25, 2014 at 2:44 AM, Michael Armbrus

Re: Using Spark to analyze complex JSON

2014-05-24 Thread Michael Armbrust
> But going back to your presented pattern, I have a question. Say your data does have a fixed structure, but some of the JSON values are lists. How would you map that to a SchemaRDD? (I didn’t notice any list values in the CandyCrush example.) Take the likes field from my original example:
>
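One way to picture the question about list values is a schema in which a JSON list becomes an array type. The sketch below is plain Python and is not how Spark SQL maps JSON to a SchemaRDD: the `infer_type` helper and its type notation are hypothetical, and only the `likes` values and the name "Nick" are taken from the thread.

```python
import json


def infer_type(value):
    """Describe a parsed JSON value with a made-up type notation."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if value is None:
        return "null"
    if isinstance(value, list):
        element_types = sorted({infer_type(item) for item in value})
        if len(element_types) == 1:
            return "array<%s>" % element_types[0]
        # Empty or mixed lists are left unresolved in this sketch.
        return "array<unknown>"
    if isinstance(value, dict):
        fields = ", ".join(
            "%s: %s" % (key, infer_type(item)) for key, item in sorted(value.items())
        )
        return "struct<%s>" % fields
    raise TypeError("unsupported JSON value: %r" % (value,))


record = json.loads('{"name": "Nick", "likes": ["ice cream", "dogs", "Vanilla Ice"]}')
schema = infer_type(record)
```

Under these assumptions the `likes` field comes out as `array<string>` alongside `name: string`, so a fixed structure with list values can still be described by a single schema.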

Re: Using Spark to analyze complex JSON

2014-05-23 Thread Nicholas Chammas
ble
>>> > WHERE name = "Nick";
>>> >
>>> > Of course, this is just a hand-wavy suggestion of how I’d like to be able to query JSON (particularly that last example) using SQL. I’m interested in seeing what y’all come up with.
>>> >
>>> > A large part of what my team does is make it easy for analysts to explore and query JSON data using SQL. We have a fairly complex home-grown process to do that and are looking to replace it with something more out of the box. So if you’d like more input on how users might use this feature, I’d be glad to chime in.
>>> >
>>> > Nick
>>> >
>>> > On Wed, May 21, 2014 at 11:21 AM, Michael Armbrust <mich...@databricks.com> wrote:
>>> >>
>>> >> You can already extract fields from json data using Hive UDFs. We have an intern working on better native support this summer. We will be sure to post updates once there is a working prototype.
>>> >>
>>> >> Michael
>>> >>
>>> >> On Tue, May 20, 2014 at 6:46 PM, Nick Chammas <nicholas.cham...@gmail.com> wrote:
>>> >>>
>>> >>> The Apache Drill home page has an interesting heading: "Liberate Nested Data".
>>> >>>
>>> >>> Is there any current or planned functionality in Spark SQL or Shark to enable SQL-like querying of complex JSON?
>>> >>>
>>> >>> Nick
>>> >>>
>>> >>> View this message in context: Using Spark to analyze complex JSON
>>> >>> Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Using Spark to analyze complex JSON

2014-05-22 Thread Michael Cutler
es": ["ice cream", "dogs", "Vanilla Ice"]
>>>> > }
>>>> >
>>>> > It would be SUPER COOL if we could query that table in a way that is as natural as follows:
>>>> >
>>>> > SELECT DISTINCT name
>>>> > FROM json_table;
>>>> >
>>>> > SELECT MAX(location.x)
>>>> > FROM json_table;
>>>> >
>>>> > SELECT likes[2] -- Ice Ice Baby
>>>> > FROM json_table
>>>> > WHERE name = "Nick";
>>>> >
>>>> > Of course, this is just a hand-wavy suggestion of how I’d like to be able to query JSON (particularly that last example) using SQL. I’m interested in seeing what y’all come up with.
>>>> >
>>>> > A large part of what my team does is make it easy for analysts to explore and query JSON data using SQL. We have a fairly complex home-grown process to do that and are looking to replace it with something more out of the box. So if you’d like more input on how users might use this feature, I’d be glad to chime in.
>>>> >
>>>> > Nick
>>>> >
>>>> > On Wed, May 21, 2014 at 11:21 AM, Michael Armbrust <mich...@databricks.com> wrote:
>>>> >>
>>>> >> You can already extract fields from json data using Hive UDFs. We have an intern working on better native support this summer. We will be sure to post updates once there is a working prototype.
>>>> >>
>>>> >> Michael
>>>> >>
>>>> >> On Tue, May 20, 2014 at 6:46 PM, Nick Chammas <nicholas.cham...@gmail.com> wrote:
>>>> >>>
>>>> >>> The Apache Drill home page has an interesting heading: "Liberate Nested Data".
>>>> >>>
>>>> >>> Is there any current or planned functionality in Spark SQL or Shark to enable SQL-like querying of complex JSON?
>>>> >>>
>>>> >>> Nick
>>>> >>>
>>>> >>> View this message in context: Using Spark to analyze complex JSON
>>>> >>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
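To make the three wished-for queries concrete, here is a minimal plain-Python sketch of what each one would return. It is not Spark SQL. Only the `likes` list and the name "Nick" come from the thread; the `location` values and the second record are hypothetical, and a zero-based index is assumed for `likes[2]`.

```python
import json

# Hypothetical input, one JSON document per line.
raw_lines = [
    '{"name": "Nick", "location": {"x": 241.6, "y": -22.5}, '
    '"likes": ["ice cream", "dogs", "Vanilla Ice"]}',
    '{"name": "Alice", "location": {"x": 10.0, "y": 3.5}, "likes": ["tea"]}',
]
json_table = [json.loads(line) for line in raw_lines]

# SELECT DISTINCT name FROM json_table;
distinct_names = sorted({row["name"] for row in json_table})

# SELECT MAX(location.x) FROM json_table;
max_x = max(row["location"]["x"] for row in json_table)

# SELECT likes[2] FROM json_table WHERE name = "Nick";
third_like = [row["likes"][2] for row in json_table if row["name"] == "Nick"]
```

The nested field access (`location.x`) and the list index (`likes[2]`) are the two things flat SQL lacks, which is why the thread asks for native support rather than a flattening step.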

Re: Using Spark to analyze complex JSON

2014-05-22 Thread Flavio Pompermaier
>>> >
>>> > Of course, this is just a hand-wavy suggestion of how I’d like to be able to query JSON (particularly that last example) using SQL. I’m interested in seeing what y’all come up with.
>>> >
>>> > A large part of what my team does is make it easy for analysts to explore and query JSON data using SQL. We have a fairly complex home-grown process to do that and are looking to replace it with something more out of the box. So if you’d like more input on how users might use this feature, I’d be glad to chime in.
>>> >
>>> > Nick
>>> >
>>> > On Wed, May 21, 2014 at 11:21 AM, Michael Armbrust <mich...@databricks.com> wrote:
>>> >>
>>> >> You can already extract fields from json data using Hive UDFs. We have an intern working on better native support this summer. We will be sure to post updates once there is a working prototype.
>>> >>
>>> >> Michael
>>> >>
>>> >> On Tue, May 20, 2014 at 6:46 PM, Nick Chammas <nicholas.cham...@gmail.com> wrote:
>>> >>>
>>> >>> The Apache Drill home page has an interesting heading: "Liberate Nested Data".
>>> >>>
>>> >>> Is there any current or planned functionality in Spark SQL or Shark to enable SQL-like querying of complex JSON?
>>> >>>
>>> >>> Nick
>>> >>>
>>> >>> View this message in context: Using Spark to analyze complex JSON
>>> >>> Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Using Spark to analyze complex JSON

2014-05-21 Thread Michael Cutler
>> > to do that and are looking to replace it with something more out of the box. So if you’d like more input on how users might use this feature, I’d be glad to chime in.
>> >
>> > Nick
>> >
>> > On Wed, May 21, 2014 at 11:21 AM, Michael Armbrust <mich...@databricks.com> wrote:
>> >>
>> >> You can already extract fields from json data using Hive UDFs. We have an intern working on better native support this summer. We will be sure to post updates once there is a working prototype.
>> >>
>> >> Michael
>> >>
>> >> On Tue, May 20, 2014 at 6:46 PM, Nick Chammas <nicholas.cham...@gmail.com> wrote:
>> >>>
>> >>> The Apache Drill home page has an interesting heading: "Liberate Nested Data".
>> >>>
>> >>> Is there any current or planned functionality in Spark SQL or Shark to enable SQL-like querying of complex JSON?
>> >>>
>> >>> Nick
>> >>>
>> >>> View this message in context: Using Spark to analyze complex JSON
>> >>> Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Using Spark to analyze complex JSON

2014-05-21 Thread Nicholas Chammas
a using Hive UDFs. We have an intern working on better native support this summer. We will be sure to post updates once there is a working prototype.
> >>
> >> Michael
> >>
> >> On Tue, May 20, 2014 at 6:46 PM, Nick Chammas <nicholas.cham...@gmail.com> wrote:
> >>>
> >>> The Apache Drill home page has an interesting heading: "Liberate Nested Data".
> >>>
> >>> Is there any current or planned functionality in Spark SQL or Shark to enable SQL-like querying of complex JSON?
> >>>
> >>> Nick
> >>>
> >>> View this message in context: Using Spark to analyze complex JSON
> >>> Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Using Spark to analyze complex JSON

2014-05-21 Thread Tobias Pfeiffer
>> On Tue, May 20, 2014 at 6:46 PM, Nick Chammas wrote:
>>>
>>> The Apache Drill home page has an interesting heading: "Liberate Nested Data".
>>>
>>> Is there any current or planned functionality in Spark SQL or Shark to enable SQL-like querying of complex JSON?
>>>
>>> Nick
>>>
>>> View this message in context: Using Spark to analyze complex JSON
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Using Spark to analyze complex JSON

2014-05-21 Thread Nicholas Chammas
home page has an interesting heading: "Liberate Nested Data".
>>
>> Is there any current or planned functionality in Spark SQL or Shark to enable SQL-like querying of complex JSON?
>>
>> Nick
>>
>> --

Re: Using Spark to analyze complex JSON

2014-05-21 Thread Michael Armbrust
> View this message in context: Using Spark to analyze complex JSON <http://apache-spark-user-list.1001560.n3.nabble.com/Using-Spark-to-analyze-complex-JSON-tp6146.html>
> Sent from the Apache Spark User List mailing list archive <http://apache-spark-user-list.1001560.n3.nabble.com/> at Nabble.com.

Using Spark to analyze complex JSON

2014-05-20 Thread Nick Chammas
pache-spark-user-list.1001560.n3.nabble.com/Using-Spark-to-analyze-complex-JSON-tp6146.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.