hive 0.14 on some platform return some not NULL value as NULL
I use the hive 1.1.0 CLI on computer A (Linux). The result is:

87FQEZT1UEDXJHJQPFFX7G7ET8S2DVPM  2357378283356  9150119100048       7326356    NULL
87FQEZT1UEDXJHJQPFFX7G7ET8S2DVPM  2357378283356  121501191035580028  7326356    NULL
UBDTK8D9XUZ9GRZU8NZNXDEG73D4PCZG  2362223711289  161501191549050061  14837289   NULL
Y49EY895ACABHS95DRQEE8DVFEB8JSE1  2360853052224  111501191426280023  115883224  NULL

But when I use the hive 0.14 CLI in my test environment, the result is correct.

I use hive 0.10 on computer B (Linux). The result is:

87FQEZT1UEDXJHJQPFFX7G7ET8S2DVPM  2357378283356  9150119100048       7326356    2015-01-19 10:44:44
87FQEZT1UEDXJHJQPFFX7G7ET8S2DVPM  2357378283356  121501191035580028  7326356    2015-01-19 10:35:58
UBDTK8D9XUZ9GRZU8NZNXDEG73D4PCZG  2362223711289  161501191549050061  14837289   2015-01-19 15:49:05
Y49EY895ACABHS95DRQEE8DVFEB8JSE1  2360853052224  111501191426280023  115883224  2015-01-19 14:26:28

Why? I attach my log. In the log I also found:

2015-04-01 09:55:38,409 WARN [main] org.apache.hadoop.hive.serde2.lazy.LazyStruct: Extra bytes detected at the end of the row! Ignoring similar problems.

r7raul1...@163.com
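One situation in which a text-row reader can print a warning like the one above, while the trailing column comes back as NULL, is when the reader is given fewer columns than the stored row contains. The sketch below is only a simplified model in Python, not Hive's LazySimpleSerDe code. The function name parse_row, the three-column schema and the sample values are hypothetical, and the \x01 separator is assumed because the table declares no field delimiter.

```python
def parse_row(line, column_names, separator="\x01"):
    """Simplified model of reading one delimited text row against a schema.

    Not Hive's implementation. It mimics two behaviours only:
    fields beyond the schema are dropped with a warning, and columns
    that have no matching field are returned as None (printed as NULL).
    """
    fields = line.split(separator)
    warnings = []
    if len(fields) > len(column_names):
        warnings.append("Extra bytes detected at the end of the row!")
    row = {}
    for i, name in enumerate(column_names):
        row[name] = fields[i] if i < len(fields) else None
    return row, warnings


# Hypothetical three-field row; the real table has many more columns.
line = "\x01".join(["S1", "7326356", "2015-01-19 10:44:44"])

full_schema = ["sessn_id", "end_user_id", "cart_track_time"]
short_schema = ["sessn_id", "end_user_id"]  # assumption: reader lost the last column

full_row, full_warn = parse_row(line, full_schema)
short_row, short_warn = parse_row(line, short_schema)

print(full_row.get("cart_track_time"))   # value stored in the file
print(short_row.get("cart_track_time"))  # None, shown as NULL by a query
print(short_warn)
```

With the short schema the last field is still in the file, but nothing maps it to cart_track_time, so a lookup of that column yields nothing. Whether this is what happens on computer A has to be checked against the column list the reader actually receives.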
Re: Re: hive 0.14 on some platform return some not NULL value as NULL
DDL is

CREATE TABLE dw.fct_traffic_navpage_path_detl(
  date_id string,
  chanl_id bigint,
  sessn_id string,
  gu_id string,
  prov_id string,
  city_id string,
  landing_page_type_id string,
  landing_track_time string,
  landing_url string,
  nav_refer_tracker_id string,
  nav_refer_page_type_id string,
  nav_refer_page_value string,
  nav_refer_link_position string,
  nav_tracker_id string,
  nav_page_categ_id string,
  nav_page_type_id string,
  nav_page_value string,
  nav_srce_type string,
  internal_keyword string,
  internal_result_sum string,
  pltfm_id int,
  app_vers string,
  nav_link_position string,
  nav_button_position string,
  nav_track_time string,
  nav_next_tracker_id string,
  sessn_last_time string,
  sessn_pv int,
  detl_tracker_id string,
  detl_page_type_id string,
  detl_page_value string,
  detl_pm_id bigint,
  detl_link_position string,
  detl_position_track_id string,
  cart_tracker_id string,
  cart_page_type_id string,
  cart_page_value string,
  cart_link_postion string,
  cart_button_position string,
  cart_position_track_id string,
  cart_prod_id bigint,
  ordr_tracker_id string,
  ordr_page_type_id string,
  ordr_code string,
  updt_time string,
  cart_pm_id bigint,
  brand_code string,
  categ_type int,
  os string,
  end_user_id string,
  add_cart_flag string,
  navgation_page_flag int,
  nav_page_url string,
  detl_button_position string,
  manul_flag int,
  manul_track_date string,
  nav_refer_tpa string,
  nav_refer_tpa_id string,
  nav_refer_tpc string,
  nav_refer_tpi string,
  nav_refer_tcs string,
  nav_refer_tcsa string,
  nav_refer_tcdt string,
  nav_refer_tcd string,
  nav_refer_tci string,
  nav_refer_postn_type string,
  nav_tpa_id string,
  nav_tpa string,
  nav_tpc string,
  nav_tpi string,
  nav_tcs string,
  nav_tcsa string,
  nav_tcdt string,
  nav_tcd string,
  nav_tci string,
  nav_postn_type string,
  detl_tpa_id string,
  detl_tpa string,
  detl_tpc string,
  detl_tpi string,
  detl_tcs string,
  detl_tcsa string,
  detl_tcdt string,
  detl_tcd string,
  detl_tci string,
  detl_postn_type string,
  cart_tpa_id string,
  cart_tpa string,
  cart_tpc string,
  cart_tpi string,
  cart_tcs string,
  cart_tcsa string,
  cart_tcdt string,
  cart_tcd string,
  cart_tci string,
  cart_postn_type string,
  sessn_chanl_id bigint,
  gu_sec_flg bigint,
  detl_refer_page_type_id string,
  detl_refer_page_value string,
  detl_event_id string,
  nav_refer_intrn_reslt_sum string,
  nav_intrn_reslt_sum string,
  nav_refer_intrn_kw string,
  nav_intrn_kw string,
  detl_track_time string,
  cart_track_time string)
PARTITIONED BY (
  ds string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  '/user/hive/dw/fct_traffic_navpage_path_detl'
TBLPROPERTIES (
  'numPartitions'='265',
  'numFiles'='26677',
  'last_modified_by'='bi_etl',
  'last_modified_time'='1423633028',
  'transient_lastDdlTime'='1427870517',
  'numRows'='0',
  'totalSize'='8268127466928',
  'rawDataSize'='0')

My query is:

SELECT a1.sessn_id,
       a1.ordr_code,
       a1.cart_tracker_id,
       a1.end_user_id,
       a1.cart_track_time
FROM dw.fct_traffic_navpage_path_detl a1
WHERE a1.ds = '2015-01-19'
  AND a1.cart_tracker_id > 0
  AND (a1.cart_button_position IS NULL OR length(a1.cart_button_position) = 0)
  AND a1.sessn_id IN ('Y49EY895ACABHS95DRQEE8DVFEB8JSE1',
                      'UBDTK8D9XUZ9GRZU8NZNXDEG73D4PCZG',
                      '87FQEZT1UEDXJHJQPFFX7G7ET8S2DVPM')

I attach my sample data.

r7raul1...@163.com

From: Thejas Nair
Date: 2015-04-02 15:28
To: dev
Subject: Re: hive 0.14 on some platform return some not NULL value as NULL

Can you give more details:
- the query you are running
- schema of the table
- serialization format of the table, sample records if possible.
Re: Re: hive 0.14 on some platform return some not NULL value as NULL
In my test environment I use hive 0.14 and hive 1.1.0, and the result is OK. But in the production environment the result is not correct.

r7raul1...@163.com

From: Thejas Nair
Date: 2015-04-02 16:41
To: r7raul1...@163.com
CC: dev
Subject: Re: Re: hive 0.14 on some platform return some not NULL value as NULL

I am unable to reproduce this issue using the sample data. For this query, using 1.1.0, I get the following result:

87FQEZT1UEDXJHJQPFFX7G7ET8S2DVPM  2357378283356  9150119100048  7326356  2015-01-19 10:44:44  2015-01-19
Re: Re: hive 0.14 on some platform return some not NULL value as NULL
I downloaded the full data from HDFS and loaded it into my table in my test environment. Everything is OK there.

My production environment is:
hadoop 2.3.0-cdh 5.0.2
REDHAT 5.8
java version "1.6.0_35"

r7raul1...@163.com
Re: Re: hive 0.14 on some platform return some not NULL value as NULL
Sorry, I checked: my production JDK is java version "1.7.0_45", not java version "1.6.0_35".

r7raul1...@163.com
Re: Re: Is it necessary to update beelinepositive q.out files?
, string, string] separator=[[B@116265c3] nullstring=\N lastColumnTakesRest=false

You can see that hive 0.14 lost some column info. Why?
BTW, my metastore database schema is still hive 0.10; it has not been updated to hive 0.14.

r7raul1...@163.com

From: Thejas Nair
Date: 2015-04-02 15:29
To: dev
Subject: Re: Is it necessary to update beelinepositive q.out files?

Beeline tests have been disabled for a while and I believe the q.out files are already outdated. You don't have to update them.

On Wed, Apr 1, 2015 at 12:39 PM, Alexander Pivovarov wrote:
> Hello Everyone
>
> I'm working on fixing groupby3_map.q query
> https://issues.apache.org/jira/browse/HIVE-10168
>
> I already updated clientpositive and clientpositive/spark q.out files
>
> Is it necessary to update beelinepositive q.out file?
>
> Most of the beelinepositive q.out files were updated 2 years ago
> https://github.com/apache/hive/tree/trunk/ql/src/test/results/beelinepositive
>
> Looks like beelinepositive q.out files are not used during Jenkins build.
>
> if we still need to update them can you tell me the command which can
> generate beelinepositive q.out file
>
> Thank you
> Alex
Re: Re: hive 0.14 on some platform return some not NULL value as NULL
, string, string, string] separator=[[B@116265c3] nullstring=\N lastColumnTakesRest=false

You can see that hive 0.14 lost some column info. Why?
BTW, my metastore database schema is still hive 0.10; it has not been updated to hive 0.14.

r7raul1...@163.com
Re: Re: hive 0.14 on some platform return some not NULL value as NULL
I use hive 0.14 to use hive 0.10 metastroe server .The problem fixed. Now hive 0.14 return correct result. r7raul1...@163.com From: r7raul1...@163.com Date: 2015-04-07 10:34 To: dev CC: thejas.nair Subject: Re: Re: hive 0.14 on some platform return some not NULL value as NULL I found difference form log: In hive 0.14 DEBUG lazy.LazySimpleSerDe: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe initialized with: columnNames=[date_id, chanl_id, sessn_id, gu_id, prov_id, city_id, landing_page_type_id, landing_track_time, landing_url, nav_refer_tracker_id, nav_refer_page_type_id, nav_refer_page_value, nav_refer_link_position, nav_tracker_id, nav_page_categ_id, nav_page_type_id, nav_page_value, nav_srce_type, internal_keyword, internal_result_sum, pltfm_id, app_vers, nav_link_position, nav_button_position, nav_track_time, nav_next_tracker_id, sessn_last_time, sessn_pv, detl_tracker_id, detl_page_type_id, detl_page_value, detl_pm_id, detl_link_position, detl_position_track_id, cart_tracker_id, cart_page_type_id, cart_page_value, cart_link_postion, cart_button_position, cart_position_track_id, cart_prod_id, ordr_tracker_id, ordr_page_type_id, ordr_code, updt_time, cart_pm_id, brand_code, categ_type, os, end_user_id, add_cart_flag, navgation_page_flag, nav_page_url, detl_button_position, manul_flag, manul_track_date, nav_refer_tpa, nav_refer_tpa_id, nav_refer_tpc, nav_refer_tpi, nav_refer_tcs, nav_refer_tcsa, nav_refer_tcdt, nav_refer_tcd, nav_refer_tci, nav_refer_postn_type, nav_tpa_id, nav_tpa, nav_tpc, nav_tpi, nav_tcs, nav_tcsa, nav_tcdt, nav_tcd, nav_tci, nav_postn_type, detl_tpa_id, detl_tpa, detl_tpc, detl_tpi, detl_tcs, detl_tcsa, detl_tcdt, detl_tcd, detl_tci, detl_postn_type, cart_tpa_id, cart_tpa, cart_tpc, cart_tpi, cart_tcs, cart_tcsa, cart_tcdt, cart_tcd, cart_tci, cart_postn_type] columnTypes=[string, bigint, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, int, 
string, string, string, string, string, string, int, string, string, string, bigint, string, string, string, string, string, string, string, string, bigint, string, string, string, string, bigint, string, int, string, string, string, int, string, string, int, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string] separator=[[B@e50bca4] nullstring=\N lastColumnTakesRest=false In hive 0.10 DEBUG lazy.LazySimpleSerDe: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe initialized with: columnNames=[date_id, chanl_id, sessn_id, gu_id, prov_id, city_id, landing_page_type_id, landing_track_time, landing_url, nav_refer_tracker_id, nav_refer_page_type_id, nav_refer_page_value, nav_refer_link_position, nav_tracker_id, nav_page_categ_id, nav_page_type_id, nav_page_value, nav_srce_type, internal_keyword, internal_result_sum, pltfm_id, app_vers, nav_link_position, nav_button_position, nav_track_time, nav_next_tracker_id, sessn_last_time, sessn_pv, detl_tracker_id, detl_page_type_id, detl_page_value, detl_pm_id, detl_link_position, detl_position_track_id, cart_tracker_id, cart_page_type_id, cart_page_value, cart_link_postion, cart_button_position, cart_position_track_id, cart_prod_id, ordr_tracker_id, ordr_page_type_id, ordr_code, updt_time, cart_pm_id, brand_code, categ_type, os, end_user_id, add_cart_flag, navgation_page_flag, nav_page_url, detl_button_position, manul_flag, manul_track_date, nav_refer_tpa, nav_refer_tpa_id, nav_refer_tpc, nav_refer_tpi, nav_refer_tcs, nav_refer_tcsa, nav_refer_tcdt, nav_refer_tcd, nav_refer_tci, nav_refer_postn_type, nav_tpa_id, nav_tpa, nav_tpc, nav_tpi, nav_tcs, nav_tcsa, nav_tcdt, nav_tcd, nav_tci, nav_postn_type, detl_tpa_id, detl_tpa, detl_tpc, detl_tpi, detl_tcs, 
detl_tcsa, detl_tcdt, detl_tcd, detl_tci, detl_postn_type, cart_tpa_id, cart_tpa, cart_tpc, cart_tpi, cart_tcs, cart_tcsa, cart_tcdt, cart_tcd, cart_tci, cart_postn_type, sessn_chanl_id, gu_sec_flg, detl_refer_page_type_id, detl_refer_page_value, detl_event_id, nav_refer_intrn_reslt_sum, nav_intrn_reslt_sum, nav_refer_intrn_kw, nav_intrn_kw, detl_track_time, cart_track_time] columnTypes=[string, bigint, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, string, int, string, string, string, string, string, string, int, string, string, string, bigint, string, string, string, string, string, string, string, string, bigint, string, string, string, string, bigint, string, int, string, string, string, int, string, string, int, string, string, string, string, string, string, string, string, string, string, string, string, string, string
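In the two DEBUG lines quoted above, the hive 0.14 column list ends at cart_postn_type, while the hive 0.10 list continues through cart_track_time. Comparing such lists by eye is error-prone, so a small script can do it. The following is a minimal sketch in Python. The helper names column_names and missing_columns are hypothetical, and the two log strings are shortened stand-ins for the real DEBUG lines.

```python
import re


def column_names(log_line):
    """Extract the names listed in 'columnNames=[...]' from a SerDe DEBUG line."""
    match = re.search(r"columnNames=\[([^\]]*)\]", log_line)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


def missing_columns(reference_line, other_line):
    """Names present in reference_line but absent from other_line, in order."""
    other = set(column_names(other_line))
    return [name for name in column_names(reference_line) if name not in other]


# Shortened, hypothetical log lines; paste the real DEBUG lines here instead.
log_010 = ("initialized with: columnNames=[date_id, cart_postn_type, "
           "sessn_chanl_id, cart_track_time] "
           "columnTypes=[string, string, bigint, string]")
log_014 = ("initialized with: columnNames=[date_id, cart_postn_type] "
           "columnTypes=[string, string]")

print(missing_columns(log_010, log_014))
```

Run against the full log lines, the output would list every column that one client received and the other did not, which is the information needed to compare what the two metastore setups return for this table.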
hive on tez optimize MRR to MR?
The query

select userid, count(*) from u_data group by userid order by userid

produces an MRR (map-reduce-reduce) plan. When the result of userid, count(*) is small enough for a single reducer to process, can this query plan be optimized to MR?

r7raul1...@163.com
hive on tez not convert map join to broadcast join
In MR the query plan is:

Map Join Operator
  condition map:
       Left Outer Join0 to 1
  keys:
    0 ordr_code (type: string), cart_prod_id (type: bigint)
    1 parnt_ordr_code (type: string), comb_prod_id (type: bigint)
  outputColumnNames: _col1, _col2, _col3, _col5, _col10, _col11, _col15, _col16,

But in Tez:

Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)      No broadcast edge
Reducer 3 <- Map 5 (SIMPLE_EDGE), Reducer 2 (SIMPLE_EDGE)

Merge Join Operator
  condition map:
       Left Outer Join0 to 1
  keys:
    0 ordr_code (type: string), cart_prod_id (type: bigint)
    1 parnt_ordr_code (type: string), comb_prod_id (type: bigint)

r7raul1...@163.com