Hi everyone, According to the feedback, it seems adding `isTemporary` to factory context is the preferred way to fix the issue. I'll go ahead and make the change if no one objects.
On Tue, Aug 25, 2020 at 5:36 PM Rui Li <lirui.fu...@gmail.com> wrote: > Hi, > > Thanks everyone for your inputs. > > Temporary hive table is not supported at the moment. If we want to support > it, I agree with Jingsong that the life cycle of the temporary table should > somehow be bound to the hive catalog. For example, hive catalog should be > responsible to delete the table folder when the temporary table is dropped, > or when hive catalog itself gets unregistered. Whether such a table should > be stored in metastore is another topic and open to discussion. Hive > doesn't store temporary tables in metastore, so we probably won't want to > do it either. > > I'm fine with either of the solutions proposed. I think we can add the > `isTemporary` flag to factory context, even though temporary tables > currently don't belong to a catalog. Because conceptually, whether a table > is temporary should be part of the context that a factory may be interested > in. > > On Tue, Aug 25, 2020 at 4:48 PM Jingsong Li <jingsongl...@gmail.com> > wrote: > >> Hi Jark, >> >> You raised a good point: Creating the Hive temporary table. >> AFAIK, Hive temporary tables should be stored in metastore, Hive >> metastore will maintain their life cycle. Correct me if I am wrong. >> >> So actually, if we want to support Hive temporary tables, we should >> finish one thing: >> - A temporary table should belong to Catalog. >> - Instead of current, a temporary table belongs to CatalogManager. >> >> It means, `createTemporaryTable` and `dropTemporaryTable` should be >> proxied into the Catalog. >> In this situation, actually, we don't need the "isTemporary" flag. (But >> we can have it too) >> >> Best, >> Jingsong >> >> On Tue, Aug 25, 2020 at 4:32 PM Jark Wu <imj...@gmail.com> wrote: >> >>> Hi, >>> >>> I'm wondering if we always fallback to using SPI for temporary tables, >>> then >>> how does the create Hive temporary table using Hive dialect work? >>> >>> IMO, adding an "isTemporary" to the factory context sounds reasonable to >>> me, because the factory context should describe the full content of >>> create >>> table DDL. >>> >>> Best, >>> Jark >>> >>> >>> On Tue, 25 Aug 2020 at 16:01, Jingsong Li <jingsongl...@gmail.com> >>> wrote: >>> >>> > Hi Dawid, >>> > >>> > But the temporary table does not belong to Catalog, actually >>> > Catalog doesn't know the existence of the temporary table. Let the >>> table >>> > factory of catalog to create source/sink sounds a little sudden. >>> > >>> > If we want to make temporary tables belong to Catalog, I think we need >>> to >>> > involve catalog when creating temporary tables. >>> > >>> > Best, >>> > Jingsong >>> > >>> > On Tue, Aug 25, 2020 at 3:55 PM Dawid Wysakowicz < >>> dwysakow...@apache.org> >>> > wrote: >>> > >>> > > Hi Rui, >>> > > >>> > > My take is that temporary tables should use the factory of the >>> catalog >>> > > they were registered with. >>> > > >>> > > What you are describing sounds very much like a limitation/bug in >>> Hive >>> > > catalog only. I'd be in favor of passing the *isTemporary* flag. >>> > > >>> > > Best, >>> > > >>> > > Dawid >>> > > >>> > > On 25/08/2020 09:37, Rui Li wrote: >>> > > > Hi Dev, >>> > > > >>> > > > Currently temporary generic tables cannot work with hive catalog >>> [1]. >>> > > When >>> > > > hive catalog is chosen as the current catalog, planner will use >>> > > > HiveTableFactory to create source/sink for the temporary >>> > > > table. HiveTableFactory cannot tell whether a table is temporary or >>> > not, >>> > > > and considers it as a Hive table, which leads to job failure. >>> > > > I've discussed with Jingsong offline and we believe one solution >>> is to >>> > > make >>> > > > planner avoid using catalog table factory for temporary tables. >>> But I'd >>> > > > also like to hear more opinions from others whether this is the >>> right >>> > way >>> > > > to go. I think a possible alternative is to add an *isTemporary* >>> field >>> > > > to TableSourceFactory.Context & TableSinkFactory.Context, so that >>> > > > HiveTableFactory knows how to handle such tables. What do you >>> think? >>> > > > >>> > > > [1] https://issues.apache.org/jira/browse/FLINK-18999 >>> > > > >>> > > >>> > > >>> > >>> > -- >>> > Best, Jingsong Lee >>> > >>> >> >> >> -- >> Best, Jingsong Lee >> > > > -- > Best regards! > Rui Li > -- Best regards! Rui Li