I'm not sure that is going to give you what you are expecting since the counts will probably be based on the result of the join, not on the original tables. Try it and see, but remain skeptical until you validate the results.
From: Frank Luo [mailto:j...@merkleinc.com] Sent: Thursday, December 03, 2015 10:14 AM To: user@hive.apache.org Subject: RE: how to get counts as a byproduct of a query Ryan, Thanks for your reply. Your previous response gave me some hints. I think below will scan tables just once: from table_A a join table_B b on a.X = b.X insert INTO TABLE table_C select a.X, a.Y, b.Z insert OVERWRITE TABLE count_A select count(a.X) insert OVERWRITE TABLE count_B select count(b.X) ; From: Ryan Harris [mailto:ryan.har...@zionsbancorp.com] Sent: Wednesday, December 02, 2015 4:20 PM To: user@hive.apache.org Subject: RE: how to get counts as a byproduct of a query Personally, I'd do it this way... https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics Select suba.X, suba.Y, suba.countA, subb.Z, subb.countB FROM (SELECT x, y, count(1) OVER (PARTITION BY X) as countA) suba JOIN (SELECT x, z, count(1) OVER (PARTITION BY X) as countB) subb ON (suba.X = subb.X) From: Frank Luo [mailto:j...@merkleinc.com] Sent: Wednesday, December 02, 2015 2:43 PM To: user@hive.apache.org<mailto:user@hive.apache.org> Subject: RE: how to get counts as a byproduct of a query I might not illustrate the problem well. Let’s try on a sample. Here is what I have: Table_A, column X and Y Table B, column X and Z I want to do a join on both tables on column X, like “select a.X, a.Y, b.Z From A a Join B b on A.X=B.X” In the meaning time, I want to get counts for both Table A and B. I am not able to write a query to do both. From: Jörn Franke [mailto:jornfra...@gmail.com] Sent: Wednesday, December 02, 2015 3:22 PM To: user@hive.apache.org<mailto:user@hive.apache.org> Subject: Re: how to get counts as a byproduct of a query I am not sure if I understand, but why this should not be possible using SQL in hive? On 02 Dec 2015, at 21:26, Frank Luo <j...@merkleinc.com<mailto:j...@merkleinc.com>> wrote: Didn’t get any response, so trying one more time. I cannot believe I am the only one facing the problem. From: Frank Luo Sent: Tuesday, December 01, 2015 10:40 PM To: user@hive.apache.org<mailto:user@hive.apache.org> Subject: how to get counts as a byproduct of a query Very often I need to run a query against a table(s), then collect some counts. I am wondering if there is a way to kill two birds by scanning the table once. (I don’t mind to save the counts as a separate file or something like that) For example, I got a table A and B. I need to do an inner join to get some result. In the meaning time, I need to know the table counts for both A and B. Is there a smart way to get join result as well as the counts by reading the tables once? Thanks in advance. This email and any attachments transmitted with it are intended for use by the intended recipient(s) only. If you have received this email in error, please notify the sender immediately and then delete it. If you are not the intended recipient, you must not keep, use, disclose, copy or distribute this email without the author’s prior permission. We take precautions to minimize the risk of transmitting software viruses, but we advise you to perform your own virus checks on any attachment to this message. We cannot accept liability for any loss or damage caused by software viruses. The information contained in this communication may be confidential and may be subject to the attorney-client privilege. This email and any attachments transmitted with it are intended for use by the intended recipient(s) only. If you have received this email in error, please notify the sender immediately and then delete it. If you are not the intended recipient, you must not keep, use, disclose, copy or distribute this email without the author’s prior permission. We take precautions to minimize the risk of transmitting software viruses, but we advise you to perform your own virus checks on any attachment to this message. We cannot accept liability for any loss or damage caused by software viruses. The information contained in this communication may be confidential and may be subject to the attorney-client privilege. ________________________________ THIS ELECTRONIC MESSAGE, INCLUDING ANY ACCOMPANYING DOCUMENTS, IS CONFIDENTIAL and may contain information that is privileged and exempt from disclosure under applicable law. If you are neither the intended recipient nor responsible for delivering the message to the intended recipient, please note that any dissemination, distribution, copying or the taking of any action in reliance upon the message is strictly prohibited. If you have received this communication in error, please notify the sender immediately. Thank you. This email and any attachments transmitted with it are intended for use by the intended recipient(s) only. If you have received this email in error, please notify the sender immediately and then delete it. If you are not the intended recipient, you must not keep, use, disclose, copy or distribute this email without the author’s prior permission. We take precautions to minimize the risk of transmitting software viruses, but we advise you to perform your own virus checks on any attachment to this message. We cannot accept liability for any loss or damage caused by software viruses. The information contained in this communication may be confidential and may be subject to the attorney-client privilege. ====================================================================== THIS ELECTRONIC MESSAGE, INCLUDING ANY ACCOMPANYING DOCUMENTS, IS CONFIDENTIAL and may contain information that is privileged and exempt from disclosure under applicable law. If you are neither the intended recipient nor responsible for delivering the message to the intended recipient, please note that any dissemination, distribution, copying or the taking of any action in reliance upon the message is strictly prohibited. If you have received this communication in error, please notify the sender immediately. Thank you.