Hello, I am running the following query:

    insert overwrite table selection_hourly_clicks partition (date_hour = PARTNAME)
    select sel_sid, count(*) cc
    from (
        select split(parse_url(iv.referrer_url,'PATH'), '_')[1] sel_sid
        from item_raw iv
        where iv.date_hour='PARTNAME'
          AND iv.referrer_url is not null
          AND substring(parse_url(iv.referrer_url,'PATH'),0,8)=='/mypath/'
    ) s
    group by sel_sid

If the referrer URL path looks like /mypath/blabla_10, I get 10, which is the sel_sid, and then I aggregate the number of clicks per sel_sid per hour.

All is fine, and the query runs. For some partitions it finds nothing, which is also fine. But when I look at HDFS, I see files like:

    SEQ"org.apache.hadoop.io.BytesWritableorg.apache.hadoop.io.Text���������&�u"�͇���<�
    SEQ"org.apache.hadoop.io.BytesWritableorg.apache.hadoop.io.Text��������h�:��j'P�*/

Those are for partitions that do not have a count, i.e. the query does not return anything. When it does return something, it writes a file like:

    SEQ"org.apache.hadoop.io.BytesWritableorg.apache.hadoop.io.Text������i�0+9? ������� �������1515

Everything works, but this behavior is inconsistent with my other group by queries, which don't write any file if the group by does not produce any result.

Is there something wrong with my query?

Best regards,
-c.b.