Re: Returning multiple values from an UDF

2011-01-14 Thread Lars Francke
Thank you very much! It worked with one small correction: SELECT coord.latitude, coord.longitude FROM (SELECT parseCoordinates(latitude, longitude) as coord FROM foo) bar; Going to test this some more now. Cheers, Lars PS: Sorry for sending this to the old list

Re: Unique users analysis

2011-01-14 Thread Radek Maciaszek
Hi Itai, I did not think about sampling users instead of sampling records, but it makes much more sense indeed. As it happens, my ID is also hexadecimal, so I did exactly what you suggested. In the results, my error is less than 1% compared to the observed values! Many thanks!! Radek On 14 Janua

Re: Unique users analysis

2011-01-14 Thread Itai Hochman
We had a similar challenge and dealt with it by sampling based on the user id. We have a unique id in a random hexadecimal format, for instance A12890900. The query runs only on users whose id ends with 00. At the end we multiply by 256 and get a pretty close num
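A minimal sketch of the idea described above, assuming the ids are uniformly random in their last two hexadecimal digits: keeping only ids that end in 00 retains roughly 1 in 16^2 = 256 users, so the distinct count of the sample is scaled by 256. The function name `estimate_distinct_users` and the synthetic data are hypothetical, made up for illustration; the original query ran in Hive, not Python.

```python
import random


def estimate_distinct_users(user_ids, suffix="00"):
    """Estimate the number of distinct ids by sampling on an id suffix.

    Assumption: the trailing hex digits of the ids are uniformly random,
    so a suffix of n hex digits keeps about 1 in 16**n users.
    """
    # Sample users (not records): every record of a kept user is kept.
    sampled = {uid for uid in user_ids if uid.endswith(suffix)}
    scale = 16 ** len(suffix)  # 256 for a two-digit suffix
    return len(sampled) * scale


# Synthetic data for illustration only: 9-digit uppercase hex ids.
rng = random.Random(42)
users = [f"{rng.getrandbits(36):09X}" for _ in range(200_000)]
records = [rng.choice(users) for _ in range(500_000)]

exact = len(set(records))
estimate = estimate_distinct_users(records)
relative_error = abs(estimate - exact) / exact
print(f"exact={exact} estimate={estimate} relative_error={relative_error:.2%}")
```

Because the filter depends only on the id, a user is either entirely in or entirely out of the sample, which is why this behaves better for distinct counts than sampling individual records. The error shrinks as the number of sampled users grows, so much larger datasets should see a tighter estimate than this small simulation.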

Unique users analysis

2011-01-14 Thread Radek Maciaszek
Hi, I am working on some large-scale unique users analysis (think hundreds of millions of records per day). Since the number of records per month runs into many billions, I am hoping that there may be some alternative to running "SELECT DISTINCT user_unique_id...", such as sampling data or perhaps d