Thanks for the feedback. I am preparing a doc and a PoC, will post soon. On Mon, Oct 14, 2019 at 3:17 AM Wenchen Fan <cloud0...@gmail.com> wrote:
> I'm fine with the view definition proposed here, but my major concern is > how to make sure table/view share the same namespace. According to the SQL > spec, if there is a view named "a", we can't create a table named "a" > anymore. > > We can add documents and ask the implementation to guarantee it, but it's > better if this can be guaranteed by the API. > > On Wed, Aug 14, 2019 at 1:46 AM John Zhuge <jzh...@apache.org> wrote: > >> Thanks for the feedback, Ryan! I can share the WIP copy of the SPIP if >> that makes sense. >> >> I can't find out a lot about view resolution and validation in SQL Spec >> Part1. Anybody with full SQL knowledge may chime in. >> >> Here are my understanding based on online manuals, docs, and other >> resources: >> >> - A view has a name in the database schema so that other queries can >> use it like a table. >> - A view's schema is frozen at the time the view is created; >> subsequent changes to underlying tables (e.g. adding a column) will not be >> reflected in the view's schema. If an underlying table is dropped or >> changed in an incompatible fashion, subsequent attempts to query the >> invalid view will fail. >> >> In Preso, view columns are used for validation only (see >> StatementAnalyzer.Visitor#isViewStale): >> >> - view column names must match the visible fields of analyzed view sql >> - the visible fields can be coerced to view column types >> >> In Spark 2.2+, view columns are also used for validation (see >> CheckAnalysis#checkAnalysis case View): >> >> - view column names must match the output fields of the view sql >> - view column types must be able to UpCast to output field types >> >> Rule EliminateView adds a Project to viewQueryColumnNames if it exists. >> >> As for `softwareVersion`, the purpose is to track which software version >> is used to create the view, in preparation for different versions of the >> same software or even different softwares, such as Presto vs Spark. >> >> >> On Tue, Aug 13, 2019 at 9:47 AM Ryan Blue <rb...@netflix.com> wrote: >> >>> Thanks for working on this, John! >>> >>> I'd like to see a more complete write-up of what you're proposing. >>> Without that, I don't think we can have a productive discussion about this. >>> >>> For example, I think you're proposing to keep the view columns to ensure >>> that the same columns are produced by the view every time, based on >>> requirements from the SQL spec. Let's start by stating what those behavior >>> requirements are, so that everyone has the context to understand why your >>> proposal includes the view columns. Similarly, I'd like to know why you're >>> proposing `softwareVersion` in the view definition. >>> >>> On Tue, Aug 13, 2019 at 8:56 AM John Zhuge <jzh...@apache.org> wrote: >>> >>>> Catalog support has been added to DSv2 along with a table catalog >>>> interface. Here I'd like to propose a view catalog interface, for the >>>> following benefit: >>>> >>>> - Abstraction for view management thus allowing different view >>>> backends >>>> - Disassociation of view definition storage from Hive Metastore >>>> >>>> A catalog plugin can be both TableCatalog and ViewCatalog. Resolve an >>>> identifier as view first then table. >>>> >>>> More details in SPIP and PR if we decide to proceed. Here is a quick >>>> glance at the API: >>>> >>>> ViewCatalog interface: >>>> >>>> - loadView >>>> - listViews >>>> - createView >>>> - deleteView >>>> >>>> View interface: >>>> >>>> - name >>>> - originalSql >>>> - defaultCatalog >>>> - defaultNamespace >>>> - viewColumns >>>> - owner >>>> - createTime >>>> - softwareVersion >>>> - options (map) >>>> >>>> ViewColumn interface: >>>> >>>> - name >>>> - type >>>> >>>> >>>> Thanks, >>>> John Zhuge >>>> >>> >>> >>> -- >>> Ryan Blue >>> Software Engineer >>> Netflix >>> >> >> >> -- >> John Zhuge >> > -- John Zhuge