I have a Hive transactional table 'data_http' in ORC format, containing
around 100,000,000 rows.

When I execute the query:

select * from data_http
where res_url like '%mts.ru%'

it completes in 10 seconds.

But executing the query

select * from data_http
where res_url like '%mts_ru%'


takes more than 30 minutes.

Why does the '_' wildcard decrease performance?
