>
> ... saw a simple sample() function while browsing the documentation ...
I grepped an export of the Hive wiki for 'sample(' and 'sample (' but only
found tablesample in these three docs:
- https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-Sampling
- https://cwiki.
If the distinct values themselves need to be sampled, a subquery is
inevitable. Something like:
select x from (select distinct key as x from src) a where rand() > 0.9 limit 10;
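
Note that the rand() filter keeps each distinct key with probability of
roughly 0.1 and then takes the first 10 survivors, so it can return fewer
than 10 rows when there are not many distinct keys. A rough sketch of an
alternative that returns N rows whenever at least N distinct values exist,
assuming the same src table and key column as above, and assuming a global
sort of the distinct values is affordable:

```sql
-- Shuffle the distinct keys, then keep the first 10.
-- ORDER BY is a total ordering in Hive, so this step goes through a
-- single reducer; that may be slow when there are many distinct keys.
select x
from (select distinct key as x from src) a
order by rand()
limit 10;
```

This still needs the one subquery for the distinct values, but the sampling
itself happens in the outer query, so no extra TABLESAMPLE layer is needed.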
2014-02-12 6:07 GMT+09:00 Oliver Keyes:
> Hey all
>
> So, what I'm looking to do is get N randomly-sampled distinct values from
> a colum
Hey all
So, what I'm looking to do is get N randomly-sampled distinct values from a
column in a table. I'm kind of flummoxed by how to do this without using
TABLESAMPLE, which would require me to add Yet Another Subquery (it'd be
'select these values, from this sample, from these distinct values')