Michael,

Unlike with partitions, Hive does not select buckets on its own based on a WHERE clause in HQL. You need to specify which bucket the query should run on using the TABLESAMPLE clause:

SELECT * FROM T TABLESAMPLE (BUCKET m OUT OF n ON user_id) WHERE user_id = X;
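To make the choice of m concrete, here is a rough sketch in Python of how one might work out the bucket for a given user_id and build the sampled query. It is only an illustration under assumptions, not something Hive provides: it assumes user_id is an INT column, that the table was written with bucketing enforced, and that the legacy bucketing hash applies (the hash of an INT is the value itself, and the bucket index is (hash & Integer.MAX_VALUE) % n). Newer Hive versions can use a different hash, so check your version first. The helper names and the bucket count of 32 are made up for the example.

```python
INT_MAX = 0x7FFFFFFF  # Java Integer.MAX_VALUE; the mask clears the sign bit


def bucket_for_int_key(user_id: int, num_buckets: int) -> int:
    """Return the 1-based bucket number for an INT key (hypothetical helper).

    Assumption: legacy Hive bucketing, where the hash of an INT is the
    value itself and the bucket index is (hash & Integer.MAX_VALUE) % n.
    """
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    hash_code = user_id & 0xFFFFFFFF  # view the key as a 32-bit Java int
    # TABLESAMPLE numbers buckets from 1, so shift the 0-based index by one.
    return (hash_code & INT_MAX) % num_buckets + 1


def sampled_query(table: str, column: str, user_id: int, num_buckets: int) -> str:
    """Build a HiveQL query that samples only the bucket holding user_id."""
    m = bucket_for_int_key(user_id, num_buckets)
    return (
        f"SELECT * FROM {table} "
        f"TABLESAMPLE (BUCKET {m} OUT OF {num_buckets} ON {column}) "
        f"WHERE {column} = {user_id};"
    )
```

With a table clustered into 32 buckets, sampled_query("T", "user_id", 35, 32) gives BUCKET 4 OUT OF 32, since 35 % 32 = 3 and bucket numbers start at 1. Keeping the WHERE clause is still necessary, because a bucket holds many user_id values.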
https://cwiki.apache.org/Hive/languagemanual-sampling.html

Buckets are mostly used for sampling, and your query doesn't look like a sampling query. If you want to group the data into subsets and process only the relevant subset instead of the whole data set, then maybe you should consider using partitions.

Regards
Bejoy.K.S

________________________________
From: "mdefoinplatel....@orange.com" <mdefoinplatel....@orange.com>
To: "user@hive.apache.org" <user@hive.apache.org>
Sent: Friday, March 2, 2012 3:15 PM
Subject: Where clause and bucketized table

Hi folks,

I have a table T bucketized on user_id and I am surprised to see that all the buckets are read during the execution of the following query:

SELECT * from T where user_id = X

What should I do to make sure Hive will account for the bucket structure to run this query?

Cheers,
Michael