Hi Mich,

yes, the structured field has very good selectivity. I would not achieve perfectly equally sized buckets, but I don't expect any skew problems.

Of course, moving the structured field to the top level would allow bucketing. But I would prefer not to change the schema, as many queries have already been written against the existing structure.

Best regards
Michael

> On 2015-04-17, at 19:08, Mich Talebzadeh <m...@peridale.co.uk> wrote:
>
> Hi Michael,
>
> I would be curious to know what advantage you are going to get by hashing a structured field. Has that structured field got very high selectivity so you end up with equally sized buckets (files) spread?
>
> How about the following
>
> hive> CREATE TABLE foo (id bigint, bar struct<a:string, b:string>) CLUSTERED BY (id) INTO 32 buckets;
> OK
> Time taken: 0.787 seconds
>
> HTH
>
> Mich Talebzadeh
>
> http://talebzadehmich.wordpress.com
>
> -----Original Message-----
> From: Michael Häusler [mailto:mich...@akatose.de]
> Sent: 17 April 2015 17:36
> To: user@hive.apache.org
> Subject: Table bucketing on structured fields
>
> Hi there,
>
> in Hive 0.13.0, I am trying to create a table that should be bucketed by a structured field:
>
> CREATE TABLE foo (bar struct<a:string,b:string>) CLUSTERED BY (bar.a) INTO 32 buckets;
>
> Unfortunately, I am getting an error that dots are not allowed in the buckets specification:
>
> Error occurred executing hive query: OK FAILED: ParseException line 2:17 mismatched input '.' expecting ) near 'bar' in table buckets specification
>
> Is there a workaround?
>
> Thanks a lot
> Michael
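
One possible workaround for the ParseException quoted above, sketched under assumptions that this thread does not confirm: leave the struct column exactly as it is, so queries written against bar.a keep working, and add a redundant top-level column that copies bar.a and serves only as the bucketing key. The names bar_a and foo_staging are hypothetical and used purely for illustration.

```sql
-- Sketch only: bar_a and foo_staging are hypothetical names.
-- The struct column is unchanged; bar_a duplicates bar.a because
-- CLUSTERED BY accepts only top-level column names.
CREATE TABLE foo (
  bar   struct<a:string, b:string>,
  bar_a string
)
CLUSTERED BY (bar_a) INTO 32 BUCKETS;

-- In Hive 0.13 bucketing is honored on insert only when this is set.
SET hive.enforce.bucketing = true;

-- Populate the duplicate column from the struct field at load time.
INSERT OVERWRITE TABLE foo
SELECT bar, bar.a AS bar_a
FROM foo_staging;
```

The trade-off is that the value is stored twice and every load has to keep bar_a in sync with bar.a. Existing SELECT statements that reference bar.a need no change, but any INSERT into foo has to supply the extra column.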