Thanks Lefty! I haven't created a JIRA but it's on my list. Still interested in any input anyone might have?
Some of you guys _must_ have some knowledge on this stuff. You're correct that the DROP stuff is documented. What I meant is that I couldn't find the relevant parts in the code. On Sat, Jul 21, 2018 at 6:15 AM, Lefty Leverenz <leftylever...@gmail.com> wrote: > Agreed, the Hive wiki is woefully incomplete. But it does mention DROP for > external tables in two sections of the DDL doc: > > - External Tables > <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL# > LanguageManualDDL-ExternalTables>: > "When dropping an EXTERNAL table, data in the table is NOT deleted from > the file system." > - Drop Table > <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL# > LanguageManualDDL-DropTable>: > "When dropping an EXTERNAL table, data in the table will NOT be deleted > from the file system." > > A wiki page comparing managed and external tables would be very helpful, > and the list is a good start. I suggest you open a JIRA issue. > > Who can help Lars with this task? > > -- Lefty > > > On Thu, Jul 19, 2018 at 3:04 PM Lars Francke <lars.fran...@gmail.com> > wrote: > > > Hi, > > > > I've been wondering if anyone can tell me what the differences and > > limitations of managed vs. external tables are these days. > > > > I've seen the docs from Hortonworks[1] that list these features as not > > supported by External tables: > > * Query cache > > * Materialized views, except in a limited way > > * Default statistics gathering > > * Compute queries using statistics > > * Automatic runtime filtering > > * File merging after insert > > > > But there are no details. For some of them I'm not even sure what they > are > > supposed to mean because I couldn't find any documentation in our Wiki :( > > > > So I'd love to create a Wiki page that describes the differences between > > managed & external tables and list relevant JIRAs but I need your help > for > > that. > > > > I scanned the code for all references to TableType and this is the list I > > found but I'm almost certainly missing something: > > > > * ARCHIVE/UNARCHIVE - DDLTask - Only works for managed tables > > * TRUNCATE - DDLSemanticAnalyzer - Only works for managed tables > > * MERGE/CONCATENATE - HiveRelOpMaterializationValidator - Only works for > > managed tables > > * Constraints - DDLSemanticAnalyzer - (NOT NULL, DEFAULT, CHECK, only > RELY > > ist allowed) > > * IMPORT - ImportSemanticAnalyzer - This has some wild restrictions I > > didn't follow for external tables > > * Query Results Caching - https://issues.apache.org/ > jira/browse/HIVE-18513 > > SemanticAnalyzer - Documentation missing for Results Cache > > > > So there's a bunch of those from the Hortonworks list missing. If anyone > > could point me to documentation or code for those that'd be great. > > > > I also didn't find the most obvious/oldest one: Data is not deleted for > > EXTERNAL tables on DROP, I'm sure I just missed it. > > > > And last but not least: I have not looked at the grammar to see what's > > already forbidden at that level. > > > > As I said: Any hints would be greatly appreciated. > > > > Thank you! > > > > Lars > > > > A side note: There are so many fantastic features in Hive but because > lots > > of them are not or under-documented most people I know just stick to the > > basics. That's a pity :( > > > > [1] < > > > > https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/ > using-hiveql/content/hive_hive_3_tables.html > > > > > >