use bloom filters to improve the performance of map joins
---------------------------------------------------------

                 Key: HIVE-1721
                 URL: https://issues.apache.org/jira/browse/HIVE-1721
             Project: Hadoop Hive
          Issue Type: New Feature
          Components: Query Processor
            Reporter: Namit Jain
            Assignee: Liyin Tang


In a map-join, it is likely that few rows of the big table will have a 
matching row in the small table.
Currently, we perform a hash-map lookup for every row in the big table, which 
can be expensive.

It might be useful to try a Bloom filter built over all the join keys in the 
small table.
Each row from the big table is first checked against the Bloom filter, and the 
small table's hash table is probed only on a positive match.
Because a Bloom filter has no false negatives, a negative result means the row 
cannot join, so the hash-map lookup can be skipped.
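
The idea could be sketched roughly as follows. This is only an illustration, 
not Hive's map-join code: the class BloomMapJoinSketch, the SimpleBloomFilter 
helper, the long-typed join key, and the filter size and hash count are all 
assumptions made for the example.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

public class BloomMapJoinSketch {

    /** Minimal Bloom filter over long join keys (hypothetical helper, not a Hive class). */
    static final class SimpleBloomFilter {
        private final BitSet bits;
        private final int numBits;
        private final int numHashes;

        SimpleBloomFilter(int numBits, int numHashes) {
            this.bits = new BitSet(numBits);
            this.numBits = numBits;
            this.numHashes = numHashes;
        }

        // 64-bit finalizer-style mixer so that nearby keys spread over the bit array.
        private static long mix(long z) {
            z = (z ^ (z >>> 33)) * 0xff51afd7ed558ccdL;
            z = (z ^ (z >>> 33)) * 0xc4ceb9fe1a85ec53L;
            return z ^ (z >>> 33);
        }

        // Derive the i-th probe position from two base hashes (double hashing).
        private int position(long key, int i) {
            long h1 = mix(key);
            long h2 = mix(h1 ^ 0x9E3779B97F4A7C15L) | 1L;
            return (int) Math.floorMod(h1 + (long) i * h2, (long) numBits);
        }

        void add(long key) {
            for (int i = 0; i < numHashes; i++) {
                bits.set(position(key, i));
            }
        }

        /** False means the key is definitely absent; true means it may be present. */
        boolean mightContain(long key) {
            for (int i = 0; i < numHashes; i++) {
                if (!bits.get(position(key, i))) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Returns {matching rows, hash-map lookups actually performed}. */
    static long[] probe(long[] bigKeys, Map<Long, String> smallTable,
                        SimpleBloomFilter filter) {
        long matches = 0;
        long lookups = 0;
        for (long key : bigKeys) {
            if (!filter.mightContain(key)) {
                continue; // definitely not in the small table: skip the hash map
            }
            lookups++;
            if (smallTable.containsKey(key)) {
                matches++;
            }
        }
        return new long[] {matches, lookups};
    }

    public static void main(String[] args) {
        // Small table: 1000 rows; filter size and hash count chosen arbitrarily.
        Map<Long, String> small = new HashMap<>();
        SimpleBloomFilter filter = new SimpleBloomFilter(1 << 16, 3);
        for (long k = 0; k < 1000; k++) {
            small.put(k * 7, "row" + k);
            filter.add(k * 7);
        }
        // Big table: 100000 rows, most of which have no match.
        long[] big = new long[100000];
        for (int i = 0; i < big.length; i++) {
            big[i] = i;
        }
        long[] result = probe(big, small, filter);
        System.out.println("matches=" + result[0]
            + " hashLookups=" + result[1] + " bigRows=" + big.length);
    }
}
```

Whether this pays off presumably depends on the fraction of big-table rows that 
match and on the filter's false-positive rate: when most rows do match, the 
filter check is just extra work on top of the hash-map lookup.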
