Thank you Ryan and Xiao – sharing all this info really gives very good insight!
From: Ryan Blue
Reply-To: "rb...@netflix.com"
Date: Monday, December 3, 2018 at 12:05 PM
To: "Thakrar, Jayesh"
Cc: Xiao Li, Spark Dev List
Subject: Re: DataSourceV2 community sync #3
> *To: *"Thakrar, Jayesh"
> *Cc: *Ryan Blue, "u...@spark.apache.org" <dev@spark.apache.org>
> *Subject: *Re: DataSourceV2 community sync #3
>
> Hi, Jayesh,
>
> This is a good question. Spark is a unified analytics engine for various
> data sources. We…
To: "Thakrar, Jayesh"
Cc: Ryan Blue , "u...@spark.apache.org"
Subject: Re: DataSourceV2 community sync #3
Hi, Jayesh,
This is a good question. Spark is a unified analytics engine for various data
sources. We are able to get the table schema from the underlying data sources
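To make "getting the table schema from the underlying data source" concrete, here is a minimal, self-contained Scala sketch; the Parquet path and app name are placeholders, not anything from this thread:

    import org.apache.spark.sql.SparkSession

    // Minimal illustration: Spark infers the schema directly from the data
    // source (Parquet footers here), without consulting any external catalog.
    val spark = SparkSession.builder()
      .appName("schema-from-source-sketch")
      .master("local[*]")
      .getOrCreate()

    val df = spark.read.format("parquet").load("/data/events")  // placeholder path
    df.schema.printTreeString()  // prints the inferred StructType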
> Should the Spark catalog be the common denominator of the other
> catalogs (least featured) or a super-feature catalog?
>
> *From: *Xiao Li
> *Date: *Saturday, December 1, 2018 at 10:49 PM
> *To: *Ryan Blue
> *Cc: *"u...@spark.apache.org"
> *Subject: *Re: DataSourceV2 community sync #3
Do you agree on my definition of catalog in Spark SQL?
I think we agree on what a catalog is: a service that can manage the
metadata and definitions of databases, views, tables, functions, roles, etc.
External objects accessed through our data source APIs are called “tables”.
I do not think we will…
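As a rough illustration of the "catalog as a metadata service" definition above, restricted to tables, here is a small Scala sketch; the trait and method names are assumptions made for illustration, not the interface that was actually proposed:

    import org.apache.spark.sql.types.StructType

    // Illustrative identifier and table handle; names are assumptions.
    case class Identifier(database: Option[String], table: String)

    trait Table {
      def name: String
      def schema: StructType
      def properties: Map[String, String]
    }

    // A table-focused catalog service: metadata management only, no data access.
    trait TableCatalog {
      def listTables(database: String): Seq[Identifier]
      def loadTable(ident: Identifier): Table
      def createTable(ident: Identifier,
                      schema: StructType,
                      properties: Map[String, String]): Table
      def dropTable(ident: Identifier): Boolean
    }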
To: Ryan Blue
Cc: "u...@spark.apache.org"
Subject: Re: DataSourceV2 community sync #3
Hi, Ryan,
Let us first focus on answering the most fundamental problem before discussing
various related topics. What is a catalog in Spark SQL?
My definition of catalog is based on the database catalog. Basically, the
catalog provides a service that manages the metadata/definitions of database
objects…
> I try to avoid discussing each specific topic about catalog federation
> before we decide on the framework for multi-catalog support.
I’ve tried to open discussions on this for the last 6+ months because we
need it. I understand that you’d like a comprehensive plan for supporting
more than one catalog…
Hi, Ryan,
I try to avoid discussing each specific topic about catalog federation
before we decide on the framework for multi-catalog support.
- *CatalogTableIdentifier*: The PR
https://github.com/apache/spark/pull/21978 is doing nothing but adding an
interface. In the PR, we did not discuss how…
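For readers who want a concrete picture of what an identifier-only interface could look like, the following Scala sketch shows a catalog-aware table identifier; the field names and helper method are illustrative assumptions, not the contents of PR #21978:

    // Illustrative only: an identifier that can carry an explicit catalog part.
    // None for the catalog means "resolve against the session's default catalog".
    case class CatalogTableIdentifier(
        catalog: Option[String],
        database: Option[String],
        table: String) {

      /** Render as `catalog`.`database`.`table`, omitting unset parts. */
      def quotedString: String =
        (catalog.toSeq ++ database.toSeq :+ table).map(p => s"`$p`").mkString(".")
    }

    // e.g. CatalogTableIdentifier(Some("glue"), Some("sales"), "orders").quotedString
    //      => `glue`.`sales`.`orders`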
Xiao,
I do have opinions about how multi-catalog support should work, but I don't
think we are at a point where there is consensus. That's why I've started
discussion threads and added the CatalogTableIdentifier PR instead of a
comprehensive design doc. You have opinions about how users should interact…
Hi, Ryan,
I have to emphasize that the catalog is a really important component for Spark
SQL, or for any analytics platform. Thus, a careful design is needed to ensure
it works as expected. Based on my previous discussions with many community
members, Spark SQL needs a catalog interface so that we can mount multiple…
Xiao,
For the questions in this last email about how catalogs interact and how
functions and other future features work: we discussed those last night. As
I said then, I think that the right approach is incremental. We don’t want
to design all of that in one gigantic proposal up front. To do that…
Ryan,
All the proposals I have read are only related to table metadata. A catalog
contains the metadata of databases, functions, columns, views, and so on. When
we have multiple catalogs, how do these catalogs interact with each other? How
does the global catalog work? How would a view, table, function, database, and
column…
Xiao,
Please have a look at the pull requests and documents I've posted over the
last few months.
If you still have questions about how you might plug in Glue, let me know
and I can clarify.
rb
On Thu, Nov 29, 2018 at 2:56 PM Xiao Li wrote:
> Ryan,
>
> Thanks for leading the discussion and sending out the memo!…
Ryan,
Thanks for leading the discussion and sending out the memo!
> Xiao suggested that there are restrictions for how tables and functions
> interact. Because of this, he doesn’t think that separate TableCatalog and
> FunctionCatalog APIs are feasible.
Anything is possible. It depends on how…
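One way to read "anything is possible": separate capability interfaces do not by themselves prevent a single catalog from enforcing how tables and functions interact, because one implementation can expose both. The Scala sketch below uses made-up trait and class names purely to illustrate that point; it is not either party's proposal:

    // Two narrow capability interfaces (illustrative names only).
    trait TableLookup {
      def tableExists(db: String, table: String): Boolean
    }

    trait FunctionLookup {
      def functionExists(db: String, fn: String): Boolean
    }

    // A single catalog implementation can mix in both, so any rules about how
    // tables and functions interact can still live in one place.
    class MetastoreBackedCatalog extends TableLookup with FunctionLookup {
      private val tables = Set(("sales", "orders"))
      private val functions = Set(("sales", "udf_normalize"))

      override def tableExists(db: String, table: String): Boolean =
        tables.contains((db, table))

      override def functionExists(db: String, fn: String): Boolean =
        functions.contains((db, fn))
    }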
Hi everyone,
Here are my notes from last night’s sync. Some attendees that joined during
discussion may be missing, since I made the list while we were waiting for
people to join.
If you have topic suggestions for the next sync, please start sending them
to me. Thank you!
*Attendees:*
Ryan Blue…
Based on my understanding, we are not inventing anything new here.
Basically, we are building a federated database system, especially once we
support multiple catalogs. There are many mature commercial products in the
market. For example,
https://www.ibm.com/support/knowledgecenter/en/SSEPGG_10.5
Hi Ryan,
Thanks for hosting the discussion! I think the table catalog is super
useful, but since this is the first time we allow users to extend the catalog,
it's better to write down some details, from end-user APIs to internal
management.
1. How would end-users register/unregister a catalog with the SQL API…
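On question 1, a configuration-based registration flow is one plausible answer; the sketch below uses the spark.sql.catalog.<name> property style (roughly what Spark later adopted for v2 catalogs), and the implementation class and warehouse path are placeholders:

    import org.apache.spark.sql.SparkSession

    // Register a named catalog through configuration; the class name and
    // warehouse path below are placeholders, not real artifacts.
    val spark = SparkSession.builder()
      .appName("catalog-registration-sketch")
      .master("local[*]")
      .config("spark.sql.catalog.testcat", "com.example.MyTableCatalog")
      .config("spark.sql.catalog.testcat.warehouse", "/tmp/testcat-warehouse")
      .getOrCreate()

    // Once registered, tables can be referenced with a catalog-qualified name.
    spark.sql("SELECT * FROM testcat.db.events").show()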
+1
Please add me to the Google Hangout invite.
Hi Ryan,
I would like to be added to the Google Hangout invite. Thank you.
Cheers,
Martin
On 26.11.18 23:54, Ryan Blue wrote:
Hi everyone,
I just sent out an invite for the next DSv2 community sync for
Wednesday, 28 Nov at 5PM PST.
We have a few topics left over from last time to cover.