Hi, Vicențiu! On Dec 11, Vicențiu Ciorbaru wrote: > Hi! > > Here is my proposal on extending the storage engine API to provide a > functionality for retrieving random rows from tables (those that have > indexes). The storage engines for which I plan to implement this are: > MyISAM, Aria, Innodb. Possibly RocksDB, TokuDB. ... > Maybe I am complicating it too much and a simple random_sample() > function is sufficient, kind of how ha_records_in_range does it.
Yeah. My first thought was something like index_random(). That is, index_init(), index_random(), index_random(), ..., index_end(). index_init() does whatever it always needs to do to prepare using an index. and index_random() gets one random row. Not like records_in_range, that takes the index number and does _init/_end internally, because records_in_range is typically done only once or just a few times, while index_random() is more like index_next(), will be repeated continuously. But then I was thinking, why do you need to specify an index at all? Shouldn't it be just "get me a random row"? Index or whatever - that's engine implementation detail. For example, MyISAM with a fixed-size rows can just read from lseek(floor((file_size/row_size)*rand())*row_size). Sergey had a good point about being able to repeat exactly the same sampling by specifying the seed. This should be solved if the engine will get its random numbers from the thd_rnd service. Regards, Sergei _______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp