Hi everyone,

I had to kill some queries that were taking forever, and it turns out
they were doing Cartesian products (missing ON clause on a JOIN).
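
To illustrate the kind of mistake I mean, here is a minimal sketch. The
table and column names are hypothetical and this is not the actual query;
it only shows the shape of the problem, assuming a Hive setup that accepts
a JOIN without an ON clause.

```sql
-- Hypothetical tables a and b, for illustration only.
-- Missing ON clause: every row of a is paired with every row of b.
SELECT a.id, a.name, b.score
FROM a JOIN b;

-- What was intended: the same join with a condition.
SELECT a.id, a.name, b.score
FROM a JOIN b ON (a.id = b.a_id);
```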

I wonder how I could see that in the EXPLAIN output (which I still find
a bit cryptic). Specifically, the stage that it was stuck in was this:

  Stage: Stage-7
    Map Reduce
      Alias -> Map Operator Tree:
        $INTNAME
            Reduce Output Operator
              sort order:
              tag: 1
              value expressions:
                    expr: _col1
                    type: int
        $INTNAME1
            Reduce Output Operator
              sort order:
              tag: 0
              value expressions:
                    expr: _col0
                    type: bigint
                    expr: _col1
                    type: string
      Reduce Operator Tree:
        Join Operator
          condition map:
               Inner Join 0 to 1
          condition expressions:
            0 {VALUE._col0} {VALUE._col1}
            1 {VALUE._col1}
          handleSkewJoin: false
          outputColumnNames: _col0, _col1, _col3
          File Output Operator
            compressed: true
            GlobalTableId: 0
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

Is there anything in there that should have alerted me?

I found out by looking at the query, but I wonder if the query plan (if
I could read it) would have given me that information.

Thanks a lot

David Morel
