Hi, Another option is using an external fastbit index, which will give you more than count(*) and work excellently over complex ranges. https://sdm.lbl.gov/fastbit/
On Wed, Feb 19, 2014 at 5:25 AM, Sergei Golubchik <[email protected]> wrote: > Hi, Tomek! > > On Jan 05, Tomek R wrote: > > Hi, > > I have realised that whenever I am doing any query and want to count > > number of rows:SELECT COUNT(*) FROM atable WHERE <complicated condition>; > > On large tables (>10e6 rows), when the count is >100e3, it takes over > > a second... even when all the indexes are set up, etc... and with > > large counts, I have not been able to go below ~<1 s. > > So I would like to implement APPROXIMATE_COUNT(*) function, which > > checks, if, for example every 10th, 100th or 1000th row is present in > > my final output... and based on this statistics, return the > > approximate number of rows that given query produces. I am using > > MyISAM engine and would like to do the implementation on it, where the > > total number of rows is automatically stored. > > Obviously, it would be useful for anybody dealing with these huge > > aggregate websites, where users do not really need to know if their > > keyword search returned 1234567890 results or just 1.2e9 results... > > Please, can you advise me how to start this project? I.e. whether it's > > easy to do in SQL or whether I would have to start looking somewhere > > in the source code of MariaDB? I have plenty of time, programming/SQL > > experience, but no experience with maria/mysql source code. > > See the new server variable in MariaDB-10.0 - > optimizer_use_condition_selectivity - and in particular, when happens > when it's set to 5. The manual documents it as > > 5 Additionally use selectivity of certain non-range predicates > calculated on record sample. > > It means that to estimate the selectivity of the WHERE condition for a > specific table, exactly, SELECT ... FROM atable WHERE <complicated > condition>, > optimizer reads few first rows of the table (as set in > optimizer_selectivity_sampling_limit) and checks how many of them pass > the WHERE condition. > > This is very close to what you want, but currently it only works for > LIKE predicates. > > If you'd extend this to work - optionally - for all WHERE conditions, > you'll be able to see your "approximate count" value in the EXPLAIN > output. > > Regards, > Sergei > > > _______________________________________________ > Mailing list: https://launchpad.net/~maria-discuss > Post to : [email protected] > Unsubscribe : https://launchpad.net/~maria-discuss > More help : https://help.launchpad.net/ListHelp >
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : [email protected] Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp

