That's a reasonable alternative.

On Fri, Jun 29, 2018 at 7:57 PM Julian Hyde <jh...@apache.org> wrote:
> Maybe there could be a separator char as one of the adapter’s parameters.
> People should choose a value, say ‘$’ or ‘#’, that is legal in an unquoted
> SQL identifier but does not occur in any of their index or type names.
>
> If not specified, the adapter would end up in a simple mode, say looking
> for indexes first, then looking for types, and people would need to make
> sure indexes and types have distinct names. After the transition to
> single-type indexes, people could stop using the parameter.
>
> Julian
>
> > On Jun 29, 2018, at 4:43 PM, Andrei Sereda <and...@sereda.cc> wrote:
> >
> > That's a valid point. Then the user would define a different pattern like
> > "i$index_t$type" for his cluster.
> >
> > I think we should first answer whether such scenarios should be supported
> > by Calcite (given that they're already deprecated by the vendor). If yes,
> > what should the collision strategy be? A user-defined pattern like above,
> > failure, or an auto-generated name?
> >
> > On Fri, Jun 29, 2018, 19:14 Julian Hyde <jh...@apache.org> wrote:
> >
> >>> In elastic (index/type) pair is guaranteed to be unique therefore
> >>> "${index}_${type}" will be also unique (as string). This is only necessary
> >>> when we have several types per index. Valid question is whether user
> >>> should be allowed such flexibility.
> >>
> >> Uniqueness is not my concern.
> >>
> >> Suppose there is an index called "x_y" with a type called "z", and
> >> another index called "x" with a type called "y_z". If I write "x_y_z"
> >> it's not clear how it should be broken into index/type.
> >>
> >> On Fri, Jun 29, 2018 at 3:15 PM, Andrei Sereda <and...@sereda.cc> wrote:
> >>>> Can you show how those examples affect SQL against the ES adapter and/or
> >>>> how they affect JSON models?
> >>>
> >>> The discussion is how to properly bridge the (index/type) concept from ES
> >>> into the relational world.
> >>> The proposal to use placeholders ($index / $type) affects only how the
> >>> table is named in Calcite. They're not used as SQL literals, i.e. it
> >>> affects only the configuration phase of the schema.
> >>> Pretty much we're doing string/replace to derive the table name from
> >>> ($index/$type).
> >>>
> >>>> You seem to be using '_' as a separator character. Are we sure that
> >>>> people will never use it in index or type name? Separator characters
> >>>> often cause problems.
> >>> In elastic (index/type) pair is guaranteed to be unique therefore
> >>> "${index}_${type}" will be also unique (as string). This is only necessary
> >>> when we have several types per index. Valid question is whether user
> >>> should be allowed such flexibility.
> >>>
> >>> On Fri, Jun 29, 2018 at 2:19 PM Julian Hyde <jh...@apache.org> wrote:
> >>>
> >>>> Andrei,
> >>>>
> >>>> I'm not an ES user so I don't fully understand this issue, but my two
> >>>> cents anyway...
> >>>>
> >>>> Can you show how those examples affect SQL against the ES adapter
> >>>> and/or how they affect JSON models?
> >>>>
> >>>> You seem to be using '_' as a separator character. Are we sure that
> >>>> people will never use it in index or type name? Separator characters
> >>>> often cause problems.
> >>>>
> >>>> Julian
> >>>>
> >>>> On Fri, Jun 29, 2018 at 10:58 AM, Andrei Sereda <and...@sereda.cc> wrote:
> >>>>> I agree there should be a configuration option. How about the following
> >>>>> approach?
> >>>>>
> >>>>> Expose both variables ${index} and ${type} in configuration (JSON) and the
> >>>>> user will use them to generate the table name in the Calcite schema.
> >>>>>
> >>>>> Example
> >>>>> "table_name": "${type}" // current
> >>>>> "table_name": "${index}" // new (default?)
> >>>>> "table_name": "${index}_${type}" // most generic.
> >>>>> supports multiple types per index
> >>>>>
> >>>>> On Fri, Jun 29, 2018 at 9:26 AM Michael Mior <mm...@apache.org> wrote:
> >>>>>
> >>>>>> I think it sounds like you and Andrei are in a good position to tackle
> >>>>>> this one so I'm happy to have you both work on whatever solution you
> >>>>>> think is best.
> >>>>>>
> >>>>>> --
> >>>>>> Michael Mior
> >>>>>> mm...@apache.org
> >>>>>>
> >>>>>> On Fri, Jun 29, 2018 at 04:19, Christian Beikov
> >>>>>> <christian.bei...@gmail.com> wrote:
> >>>>>>
> >>>>>>> IMO the best solution would be to make it configurable by introducing a
> >>>>>>> "table_mapping" config with values
> >>>>>>>
> >>>>>>> * type - every type in the known indices is mapped as table
> >>>>>>> * index - every known index is mapped as table
> >>>>>>>
> >>>>>>> We'd probably also need a "type_field" configuration for defining which
> >>>>>>> field to use for the type determination as one of the possible future
> >>>>>>> ways to do things is to introduce a custom field:
> >>>>>>>
> >>>>>>> https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html#_custom_type_field_2
> >>>>>>>
> >>>>>>> We already detect the ES version, so we can set a smart default for this
> >>>>>>> setting. Let's make the index config param optional.
> >>>>>>>
> >>>>>>> * When no index is given, we discover indexes, the default for
> >>>>>>>   "table_mapping" then is "index"
> >>>>>>> * When index is given, then we only discover types according to the
> >>>>>>>   "type_field" configuration and the default for "table_mapping" is
> >>>>>>>   "type"
> >>>>>>>
> >>>>>>> This would also allow discovering indexes while still using "type" as
> >>>>>>> "table_mapping".
> >>>>>>>
> >>>>>>> What do you think?
> >>>>>>>
> >>>>>>> Kind regards,
> >>>>>>> ------------------------------------------------------------------------
> >>>>>>> *Christian Beikov*
> >>>>>>> On 29.06.2018 at 02:41, Andrei Sereda wrote:
> >>>>>>>> Yes. There is an API to list all indexes / types in elastic. They can be
> >>>>>>>> automatically imported into a schema.
> >>>>>>>>
> >>>>>>>> What needs to be agreed upon is how to expose those elements in the
> >>>>>>>> Calcite schema (naming / behaviour).
> >>>>>>>>
> >>>>>>>> 1) Many (most?) setups are single type per index. The natural way to name
> >>>>>>>> them would be "elastic.$index" (elastic being the schema name). Multiple
> >>>>>>>> indexes would be under the same schema: "elastic.index1", "elastic.index2", etc.
> >>>>>>>>
> >>>>>>>> 2) What if an index has several types? Should they be exported as Calcite
> >>>>>>>> tables "elastic.$index_type1", "elastic.$index_type2"? Or (current
> >>>>>>>> behaviour) as "elastic.type1" and "elastic.type2"? Or as a subschema
> >>>>>>>> "elastic.$index.type1"?
> >>>>>>>>
> >>>>>>>> Now what if one has a combination of (1) and (2)?
> >>>>>>>> Setup (2) is already deprecated (and will be unsupported in the next
> >>>>>>>> version).
> >>>>>>>>
> >>>>>>>> On Thu, Jun 28, 2018 at 7:31 PM Christian Beikov
> >>>>>>>> <christian.bei...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Is there an API to discover indexes? If there is, I'd suggest we allow a
> >>>>>>>>> config option to make the adapter discover the possible indexes.
> >>>>>>>>> We'd still have to adapt the code a bit, but internally, the schema
> >>>>>>>>> could just keep a cache of type name to index name map and be able to
> >>>>>>>>> support both scenarios.
> >>>>>>>>>
> >>>>>>>>> Kind regards,
> >>>>>>>>> ------------------------------------------------------------------------
> >>>>>>>>> *Christian Beikov*
> >>>>>>>>> On 29.06.2018 at 00:12, Andrei Sereda wrote:
> >>>>>>>>>>> 1) What's the time horizon for the current adapter no longer working
> >>>>>>>>>>> with these changes to ES?
> >>>>>>>>>> The current adapter will keep working for a while with existing setups.
> >>>>>>>>>> The problem is nomenclature and ease of use.
> >>>>>>>>>>
> >>>>>>>>>> Their new SQL concepts mapping
> >>>>>>>>>> <https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html>
> >>>>>>>>>> drops the notion of ES type (which before was the equivalent of an RDBMS
> >>>>>>>>>> table) and uses the ES index as the new table equivalent (before, an ES
> >>>>>>>>>> index was equal to a database). Most users use elastic this way
> >>>>>>>>>> (one type, one index): index == table.
> >>>>>>>>>>
> >>>>>>>>>> Currently Calcite requires a schema per index. In RDBMS parlance, a
> >>>>>>>>>> database per table (I'd like to change that).
> >>>>>>>>>>
> >>>>>>>>>>> 2) Any guess how complicated it would be to maintain code paths for both
> >>>>>>>>>>> behaviours? I know this is probably really challenging to estimate, but I
> >>>>>>>>>>> really have no idea of the scope of these changes. Would it mean two
> >>>>>>>>>>> different ES adapters?
> >>>>>>>>>> One can have just separate Calcite schema implementations (same
> >>>>>>>>>> adapter / module):
> >>>>>>>>>> 1) LegacySchema (old). Schema can have only one index (but multiple
> >>>>>>>>>> types). Type == table in this case.
> >>>>>>>>>> 2) NewSchema (new).
> >>>>>>>>>> Single schema can have multiple indexes (type is
> >>>>>>>>>> dropped). Index == table in this case.
> >>>>>>>>>>
> >>>>>>>>>>> 3) Do we really need compatibility with the current version of the
> >>>>>>>>>>> adapter?
> >>>>>>>>>>> IMO this depends on what versions of ES we would lose support for and how
> >>>>>>>>>>> complex it would be for users of the current ES adapter to make updates
> >>>>>>>>>>> for any Calcite API changes.
> >>>>>>>>>> The issue is not in the adapter but in how the Calcite schema exposes
> >>>>>>>>>> tables. Should it expose an index as an individual table (new), or an ES
> >>>>>>>>>> type (old)?
> >>>>>>>>>>
> >>>>>>>>>> Andrei.
> >>>>>>>>>>
> >>>>>>>>>> On Thu, Jun 28, 2018 at 5:23 PM Michael Mior <mm...@apache.org> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Unfortunately I know very little about ES so I'm not in a great position
> >>>>>>>>>>> to assess the impact of these changes. I will say that legacy
> >>>>>>>>>>> compatibility is great, but maintaining two sets of logic is always a
> >>>>>>>>>>> challenge. A few follow-up questions:
> >>>>>>>>>>>
> >>>>>>>>>>> 1) What's the time horizon for the current adapter no longer working with
> >>>>>>>>>>> these changes to ES?
> >>>>>>>>>>>
> >>>>>>>>>>> 2) Any guess how complicated it would be to maintain code paths for both
> >>>>>>>>>>> behaviours? I know this is probably really challenging to estimate, but I
> >>>>>>>>>>> really have no idea of the scope of these changes. Would it mean two
> >>>>>>>>>>> different ES adapters?
> >>>>>>>>>>>
> >>>>>>>>>>> 3) Do we really need compatibility with the current version of the
> >>>>>>>>>>> adapter?
> >>>>>>>>>>> IMO this depends on what versions of ES we would lose support for and how
> >>>>>>>>>>> complex it would be for users of the current ES adapter to make updates
> >>>>>>>>>>> for any Calcite API changes.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for your continued work on the ES adapter Andrei!
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Michael Mior
> >>>>>>>>>>> mm...@apache.org
> >>>>>>>>>>>
> >>>>>>>>>>> On Thu, Jun 28, 2018 at 12:57, Andrei Sereda <and...@sereda.cc> wrote:
> >>>>>>>>>>>> Hello,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Elastic announced
> >>>>>>>>>>>> <https://www.elastic.co/guide/en/elasticsearch/reference/master/removal-of-types.html>
> >>>>>>>>>>>> that they will be deprecating mapping types in ES6 and indexes will be
> >>>>>>>>>>>> single-typed only.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Historical analogy <https://www.elastic.co/blog/index-vs-type> between
> >>>>>>>>>>>> RDBMS and elastic was that index is equivalent to a database and type
> >>>>>>>>>>>> corresponds to table in that database. In a couple of releases (ES6-8)
> >>>>>>>>>>>> this will no longer be true.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Recent SQL addition
> >>>>>>>>>>>> <https://www.elastic.co/blog/elasticsearch-6-3-0-released> to elastic
> >>>>>>>>>>>> confirms this trend
> >>>>>>>>>>>> <https://www.elastic.co/guide/en/elasticsearch/reference/current/_mapping_concepts_across_sql_and_elasticsearch.html>.
> >>>>>>>>>>>> Index is equivalent to a table and there are no more ES types.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I would like to propose to include this logic in Calcite ES adapter.
> >>>>>>>>>>>> I.e., expose each single-typed ES index as a separate table inside the
> >>>>>>>>>>>> Calcite schema. This is in contrast to the current integration, where a
> >>>>>>>>>>>> schema can only have a single index. The current approach forces you to
> >>>>>>>>>>>> create multiple schemas to query single-typed indexes (on the same ES
> >>>>>>>>>>>> cluster).
> >>>>>>>>>>>>
> >>>>>>>>>>>> Legacy compatibility can always be controlled with configuration
> >>>>>>>>>>>> parameters.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Do you agree with such changes? If yes, would you consider a PR?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>> Andrei.
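
For reference, the separator concern raised in this thread (index "x_y" with type "z" versus index "x" with type "y_z") can be reproduced with a small Java sketch. This is only an illustration under stated assumptions: the class and method names (TableNames, tableName, split) are hypothetical and are not part of the Calcite ES adapter; only the "${index}" / "${type}" placeholders and the idea of a user-chosen separator such as '$' come from the messages above.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not Calcite API: derives a table name from (index, type).
class TableNames {
  // Matches the two placeholders discussed in the thread: ${index} and ${type}.
  private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(index|type)\\}");

  /** Expands ${index} and ${type} in a single pass over the pattern. */
  static String tableName(String pattern, String index, String type) {
    Matcher m = PLACEHOLDER.matcher(pattern);
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
      String value = m.group(1).equals("index") ? index : type;
      // quoteReplacement keeps '$' or '\' inside the value from being interpreted.
      m.appendReplacement(sb, Matcher.quoteReplacement(value));
    }
    m.appendTail(sb);
    return sb.toString();
  }

  /**
   * Splits a table name back into (index, type). This is only unambiguous when
   * the separator occurs exactly once, i.e. never inside an index or type name.
   */
  static Map.Entry<String, String> split(String tableName, char separator) {
    int first = tableName.indexOf(separator);
    if (first < 0 || first != tableName.lastIndexOf(separator)) {
      throw new IllegalArgumentException(
          "expected exactly one '" + separator + "' in " + tableName);
    }
    return new SimpleImmutableEntry<>(
        tableName.substring(0, first), tableName.substring(first + 1));
  }

  public static void main(String[] args) {
    // With '_' as separator, two different (index, type) pairs give the same name.
    String a = tableName("${index}_${type}", "x_y", "z");
    String b = tableName("${index}_${type}", "x", "y_z");
    System.out.println(a + " / " + b + " collide: " + a.equals(b));

    // With a separator that does not occur in the names, the split is unambiguous.
    String c = tableName("${index}$${type}", "x_y", "z");
    Map.Entry<String, String> parts = split(c, '$');
    System.out.println(c + " -> index=" + parts.getKey() + ", type=" + parts.getValue());
  }
}
```

The first println shows why "${index}_${type}" cannot be reversed when '_' is allowed in names; the second shows the variant with a dedicated separator, which only works if users guarantee the separator never appears in their index or type names.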