Thank you very much!
It worked with one small correction:
SELECT coord.latitude, coord.longitude FROM
(SELECT parseCoordinates(latitude, longitude) as coord FROM foo) bar;
Going to test this some more now.
Cheers,
Lars
PS: Sorry for sending this to the old list
Hi Itai,
I did not think about sampling users instead of sampling records, but it
makes much more sense indeed.
As it happens, my ID is also hexadecimal, so I did exactly what you
suggested. In the results my error is less than 1% compared to the observed
values!
Many thanks!!
Radek
On 14 Janua
We had a similar challenge and we dealt with it by sampling based on
the user id.
We have a unique id in a random hexadecimal format, for instance
A12890900.
The query runs only on users whose id ends with 00.
At the end we multiply by 256 and get a pretty close num
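
For readers following the thread, here is a minimal sketch of why the factor is 256: two trailing hexadecimal digits can take 16 * 16 = 256 values, so if the ids are uniformly random, roughly 1 in 256 users ends with "00". The function name and the simulated ids are hypothetical and only illustrate the arithmetic; this is not the query used in the thread.

```python
import random


def estimate_distinct_users(user_ids, suffix="00"):
    """Estimate the distinct-user count from the users whose id ends with `suffix`.

    Assumes ids are uniformly random hexadecimal strings, so each suffix of
    length n selects about 1 / 16**n of all users.
    """
    # Count distinct users only within the sampled slice of the id space.
    sampled = {uid for uid in user_ids if uid.endswith(suffix)}
    # Scale back up: 16 ** 2 == 256 for a two-digit hex suffix.
    return len(sampled) * 16 ** len(suffix)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: random 9-digit hex ids, with many repeated records.
    users = [f"{rng.getrandbits(36):09X}" for _ in range(1_000_000)]
    records = users + rng.choices(users, k=500_000)
    print("exact:   ", len(set(records)))
    print("estimate:", estimate_distinct_users(records))
```

Because the sample is taken over users rather than records, repeated records from the same user do not distort the estimate. The accuracy depends on the ids being evenly spread over the suffix values.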
Hi,
I am working on some large-scale unique-user analysis (think hundreds of
millions of records per day). Since the number of records per month runs
into many billions, I am hoping that there may be some alternative to running
"SELECT DISTINCT user_unique_id..." such as sampling data or perhaps
d