[ https://issues.apache.org/jira/browse/HIVE-11034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Srini Pindi updated HIVE-11034: ------------------------------- Attachment: (was: hive_issue.zip) > Joining multiple tables producing different results with different order of > join > -------------------------------------------------------------------------------- > > Key: HIVE-11034 > URL: https://issues.apache.org/jira/browse/HIVE-11034 > Project: Hive > Issue Type: Bug > Components: Query Processor > Affects Versions: 0.13.0 > Environment: Linux 2.6.32-279.19.1.el6.x86_64 > Reporter: Srini Pindi > Priority: Critical > > Join between one main table with other tables with different join columns > returns wrong results in hive. Changing the order of the joins between main > table and other tables is producing different results. > Here are the steps to reproduce the issue: > 1. Create tables as follows: > create table p(ck string, email string); > create table a1(ck string, flag string); > create table a2(email string, flag string); > create table a3(ck string, flag string); > 2. Load data into the tables as follows: > P > ||ck||email|| > |10|e10| > |20|e20| > |30|e30| > |40|e40| > > A1 > ||ck||flag|| > |10||N| > |20||Y| > |30||Y| > |40||Y| > A2 > ||email||flag|| > |e10||Y| > |e20||N| > |e30||Y| > |e40||Y| > > A3 > ||ck||flag|| > |10||Y| > |20||Y| > |30||N| > |40||Y| > > 3. Good query: > {panel} > select p.ck > from p > left outer join a1 on p.ck = a1.ck > left outer join a3 on p.ck = a3.ck > left outer join a2 on p.email = a2.email > where a1.flag = 'Y' > and a3.flag = 'Y' > and a2.flag = 'Y' > ; > {panel} > and results are > 40 > 4. Bad query > {panel} > select p.ck > from p > left outer join a1 on p.ck = a1.ck > left outer join a2 on p.email = a2.email > left outer join a3 on p.ck = a3.ck > where a1.flag = 'Y' > and a2.flag = 'Y' > and a3.flag = 'Y' > ; > {panel} > Producing results as: > 30 > 40 -- This message was sent by Atlassian JIRA (v6.3.4#6332)