hi guys one doubt... using btree have a O(log N), what about using tokudb?
2014-02-12 12:45 GMT-02:00 Sergei Golubchik <s...@mariadb.org>: > Hi, Pierre! > > Okay, there were quite a few replies already here. > I'll chime in, but only to confirm that there's no easy solution, > unfortunately :( > > On Feb 10, Pierre wrote: > > Hello, > > > > Mariadb is getting slower and slower when I'm inserting a massive > > amout of data. I'm trying to insert 166 507 066 rows (12go) using load > > data infile '' into an empty table. I splitted my file in 13 parts of > > the same size because it was too long to insert in one shot. When I > > inserted more than 100M rows, it starts to be extremely slow. > > > > Here are the time for each load data infile : > > > > LOAD DATA INFILE '/tmp/hash/xaa' : 2min 49sec > > LOAD DATA INFILE '/tmp/hash/xab' : 2min 49sec > > LOAD DATA INFILE '/tmp/hash/xac' : 3min 48sec > > LOAD DATA INFILE '/tmp/hash/xad' : 3min 48sec > > LOAD DATA INFILE '/tmp/hash/xae' : 3min 49sec > > LOAD DATA INFILE '/tmp/hash/xaf' : 5min 59sec > > LOAD DATA INFILE '/tmp/hash/xag' : 10min 50sec > > LOAD DATA INFILE '/tmp/hash/xah' : 20min 7sec > > LOAD DATA INFILE '/tmp/hash/xai' : 47min 7sec > > LOAD DATA INFILE '/tmp/hash/xaj' : 4 hours 1 min 9.34 sec ( ouch !) > > LOAD DATA INFILE '/tmp/hash/xak' : still running... > > > > Can someone explain me what is happening internaly and why it is so slow > ? > > Yes. That's because indexes are implemented as B-trees, and B-tree has a > cost of insertion of O(log N). As you insert more data, insertions > become slower. Furthermore, B-tree organizes data in pages. MariaDB > tries to keep all pages in memory, but as soon as index becomes larger > than keycache size, every key insertion means at least one disk read and > one write (assuming your key values are distributed normally and > randomly). > > > Do you think if I just create an index (not a unique index) it will be > > faster ? > > Not really. Inserting values into a B-tree one-by-one will always be > slow when your B-tree becomes big enough. > > But you don't need to insert values into a B-tree one by one. MyISAM > has a much faster (like, two orders of magnitude) way of building > b-trees by creating a tree all at once. > > This is used when you insert data into the empty table. Or when you add > an index to the existing table. Or when you enable indexes. That is: > > 1. LOAD DATA INFILE 'everything, not in chunks' > (a convenient way would be to load from a pipe, and cat all your > chunks into it). > 2. ALTER TABLE ... ADD KEY > 3. ALTER TABLE DISABLE KEYS. insert the data. ALTER TABLE ENABLE KEYS > > But unfortunately, all this only works for non-unique keys, when MariaDB > doesn't need to expect a unique constraint error and won't need to > delete rows when creating indexes. For unique indexes values will be > inserted into a b-tree one by one. > > Now, there's another trick you can use. Create a table. Then disable all > indexes - not with ALTER TABLE ENABLE KEYS (which only disables > non-unique indexes) - but from a command-line with myisamchk: > > $ myisamchk --keys-used=0 /path/to/table.MYI > > then insert your data. And REPAIR TABLE or REPAIR TABLE ... QUICK. > This will rebuild all indexes, the fast way. > The second variant (with QUICK) will not copy your (presumably, huge) > data - it will only rebuild indexes. But it cannot remove duplicates > either. > > And also it's worth noting that MariaDB-10.0 rebuilds both UNIQUE > and non-UNIQUE indexes the fast way when you do ALTER TABLE ... ADD KEY. > > Regards, > Sergei > > > _______________________________________________ > Mailing list: https://launchpad.net/~maria-discuss > Post to : maria-discuss@lists.launchpad.net > Unsubscribe : https://launchpad.net/~maria-discuss > More help : https://help.launchpad.net/ListHelp > -- Roberto Spadim SPAEmpresarial Eng. Automação e Controle
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp