If Spark supported TableScanDesc.FILTER_EXPR_CONF_STR the way Hive does,

we could write SQL like this:

select ydb_sex from ydb_example_shu where ydbpartion='20151110' limit 10
select ydb_sex from ydb_example_shu where ydbpartion='20151110' and 
(ydb_sex='??' or ydb_province='????' or ydb_day>='20151217') limit 10
select count(*) from ydb_example_shu where ydbpartion='20151110' and 
(ydb_sex='??' or ydb_province='????' or ydb_day>='20151217') limit 10


If TableScanDesc.FILTER_EXPR_CONF_STR is not supported the way it is in Hive, we have to write SQL like this instead:

set ya100.spark.filter.ydb_example_shu=ydbpartion='20151110';
select ydb_sex from ydb_example_shu  limit 10

set ya100.spark.filter.ydb_example_shu=ydbpartion='20151110' and (ydb_sex='??' 
or ydb_province='????' or ydb_day>='20151217');
select ydb_sex from ydb_example_shu  limit 10

set ya100.spark.filter.ydb_example_shu=ydbpartion='20151110' and (ydb_sex='??' 
or ydb_province='????' or ydb_day>='20151217');
select count(*) from ydb_example_shu limit 10

set ya100.spark.filter.ydb_example_shu=ydbpartion='20151110' and (ydb_sex in 
('??','??','????','????'));
select ydb_sex,ydb_province from ydb_example_shu   limit 10

set ya100.spark.filter.ydb_example_shu=ydbpartion='20151110';
select count(*) from ydb_example_shu   limit 10
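To make this second style work, the handler itself has to read the session property. Below is a minimal sketch of that lookup on the input-format side; the class name FilterLookupSketch, the readFilter method, and the fallback behaviour are illustrative only, not the actual Ya100 code.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.ql.plan.TableScanDesc;

public class FilterLookupSketch {
    // Prefix used by the "set ya100.spark.filter.<table>=..." workaround above.
    private static final String FILTER_PREFIX = "ya100.spark.filter.";

    // Prefer the filter Hive serializes into the job conf; if it is absent
    // (as on Spark SQL), fall back to the manually set session property.
    // Note the two paths carry different formats: a serialized expression
    // versus a plain filter string, so each needs its own parsing.
    public static String readFilter(Configuration conf, String tableName) {
        String serialized = conf.get(TableScanDesc.FILTER_EXPR_CONF_STR);
        if (serialized != null) {
            return serialized;
        }
        String manual = conf.get(FILTER_PREFIX + tableName);
        return manual == null ? "" : manual;
    }
}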



------------------ Original Message ------------------
From: <muyann...@qq.com>
Date: Thursday, January 28, 2016, 8:28
To: <muyann...@qq.com>; "user"<u...@spark.apache.org>; "dev"<dev@spark.apache.org>
Subject: Re: Why Spark-sql miss TableScanDesc.FILTER_EXPR_CONF_STR params when I move Hive table to Spark?



We always use SQL like the queries below.

select count(*) from ydb_example_shu where ydbpartion='20151110' and 
(ydb_sex='' or ydb_province='LIAONING' or ydb_day>='20151217') limit 10

Spark doesn't push predicates down through TableScanDesc.FILTER_EXPR_CONF_STR, which means every query becomes a full scan and cannot use the index (similar to what the HBase storage handler relies on).
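If I understand the Hive side correctly, that property only gets populated when the storage handler takes part in predicate pushdown by implementing HiveStoragePredicateHandler. A minimal sketch of that Hive-side hook is below; pushing the whole predicate and leaving no residual is just for illustration.

import org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler;
import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;
import org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc;
import org.apache.hadoop.hive.serde2.Deserializer;
import org.apache.hadoop.mapred.JobConf;

// Sketch: Hive calls decomposePredicate() at planning time; the pushed
// part of the predicate is what later shows up serialized under
// TableScanDesc.FILTER_EXPR_CONF_STR in the job conf.
public class Ya100PredicateHandlerSketch implements HiveStoragePredicateHandler {
    @Override
    public DecomposedPredicate decomposePredicate(JobConf jobConf,
                                                  Deserializer deserializer,
                                                  ExprNodeDesc predicate) {
        DecomposedPredicate decomposed = new DecomposedPredicate();
        // For illustration, claim the whole predicate can be evaluated by the index.
        decomposed.pushedPredicate = (ExprNodeGenericFuncDesc) predicate;
        decomposed.residualPredicate = null; // nothing left for Hive to re-check
        return decomposed;
    }
}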








------------------ Original Message ------------------
From: <muyann...@qq.com>
Date: Thursday, January 28, 2016, 7:27
To: "user"<u...@spark.apache.org>; "dev"<dev@spark.apache.org>
Subject: Why Spark-sql miss TableScanDesc.FILTER_EXPR_CONF_STR params when I move Hive table to Spark?



Dear Spark community,
I am testing a custom StorageHandler on Spark SQL, but I find that TableScanDesc.FILTER_EXPR_CONF_STR is missing, and I need it. Is there anywhere I can find it?
I really want to get the filter information from Spark SQL so that I can pre-filter through my index.
So where is TableScanDesc.FILTER_EXPR_CONF_STR (hive.io.filter.expr.serialized)? Is it missing, or has it been replaced by another mechanism? Thanks, everybody.


For example, I built a custom StorageHandler the same way Hive does:

create table xxx(...)
STORED BY 'cn.net.ycloud.ydb.handle.Ya100StorageHandler' 
TBLPROPERTIES(
"ya100.handler.master"="101.200.130.48:8080",
"ya100.handler.table.name"="ydb_example_shu",
"ya100.handler.columns.mapping"="phonenum,usernick,ydb_sex,ydb_province,ydb_grade,ydb_age,ydb_blood,ydb_zhiye,ydb_earn,ydb_prefer,ydb_consume,ydb_day,content,ydbpartion,ya100_pipe"
)

In the Ya100StorageHandler code,
I want to use TableScanDesc.FILTER_EXPR_CONF_STR like this:

  String filterExprSerialized = conf.get(TableScanDesc.FILTER_EXPR_CONF_STR);
  if (filterExprSerialized == null) {
      // No filter was pushed down into the job conf.
      return "";
      // throw new IOException("cannot find a filter condition in your SQL; "
      //     + "you must at least specify a condition on ydbpartion");
  } else {
      LOG.info(filterExprSerialized);
      // Deserialize the pushed-down expression and translate it into a
      // filter my index can evaluate.
      ExprNodeGenericFuncDesc filterExpr =
          Utilities.deserializeExpression(filterExprSerialized);
      LOG.info(filterExpr);
      try {
          return Ya100Utils.parserFilter(filterExpr, info);
      } catch (Throwable e) {
          throw new IOException(e);
      }
  }
