What data type are the constraints over? A unique index over a raw md5 is much better than one on a varchar(255) or even varchar(32)
Sent from my iPhone > On Feb 12, 2014, at 12:49 AM, Pierre <pi...@hotmail.it> wrote: > > Ok I understand, but this doesn't resolve the initialization problem, I'm > sure there is a bug, or something which can be improved a lot. I can't use > the UNIQUE constraint when I have to much key which doesn't fit in RAM. > > Because even If have memcache/bloom filter in front, I still need to create > my initial dataset with the unique constraint,and even if my initial dataset > is unique, I can't insert them in my table because of the bug explained > before. > > If mysql can do a sort so quickly (create index), it should be trivial to > reuse this sorted data, eliminate the duplicate and create the UNIQUE index. > > > Le 12/02/2014 09:24, Justin Swanhart a écrit : > > Hi, > > > > This is not a bug, but how b tree indexes work. For them to be efficient > > they must fit in ram. > > There are buffering mechanisms that can be used for secondary indexes in > > some cases, because the > > write can be done without a read, but ONLY when the index is not unique. > > It if it unique, then the > > index dive is necessary and a btree traversal can take multiple random IO > > to get to the leaf node. > > > > Faster IO can help, so you can look into flash storage. > > > > Consider fronting your lookups with memcache (a hash index) or a bloom > > filter (there are many > > implementations on the net) to reduce lookups. > > > > --Justin > > Le 12/02/2014 09:04, Pierre a écrit : >> Using this technique I have the same Issue. It's now running for severals >> hours, I'm at a 40% and >> looking at show full processlist, it's getting slower and slower. It will >> never finish. >> >> I think there is a bug here. >> >> Firstly, regardly the memory usage, It doesn't correctly use the buffer I >> did set, I think it's only >> using the key_buffer_size. myisam_sort_buffer_size or >> bulk_insert_buffer_size are not used on this >> task. >> >> So what's happening ? When the RAM is full, mariadb is doing random access >> on the disk to sort and >> filter Go of data ! That's why my cpu was only used a few percent, the >> process was in i/o wait most >> of the time. >> >> So what I am saying here is : mariadb can't crate UNIQUE keys/index if the >> rows doesn't fit in RAM. >> >> However if I try to create a standard index (ie non unique), it works well >> and it's done in less >> than an hour. >> >> ALTER IGNORE TABLE mytable ADD INDEX (c1), ADD INDEX(c2); >> Query OK, 349086532 rows affected (44 min 25.21 sec) >> Records: 349086532 Duplicates: 0 Warnings: 0 >> >> In this second usage case, Maria is doing a good work by using the >> myisam_sort_buffer_size. I think >> it's doing something like an external/merge sort, spliting the rows in part >> that fit in RAM, sorting >> them, merging them and creating index. >> >> It was 100% cpu most of the time, when It was not it was because mysql was >> loading the rows in RAM >> from hard disk (and not doing a random access on the hard disk like in >> create unique index). So why >> UNIQUE index is not behaving the same way ? >> >> It's easy to reproduce the bug, just create a binary file of 2 or 3 x size >> of RAM, then load data >> infile and try to create a UNIQUE index on it. It will never end. >> >> >Am 10.02.2014 13:45, schrieb Pierre: >> >> Mariadb is getting slower and slower when I'm inserting a massive amout >> >> of data. I'm trying to >> insert 166 507 066 >> >> rows (12go) using load data infile '' into an empty table. I splitted my >> >> file in 13 parts of the >> same size because >> >> it was too long to insert in one shot. When I inserted more than 100M >> >> rows, it starts to be >> extremely slow. >> > >> >he reason are the permamently updated keys >> >in general for large inserts on a new table >> > >> >UNIQUE KEY `p1` (`p1`), >> >UNIQUE KEY `p2` (`p2`), >> > >> >* remove keyes >> >* insert data >> >* add kyes >> > >> >https://dev.mysql.com/doc/refman/5.5/en/insert-speed.html >> >> _______________________________________________ >> Mailing list: https://launchpad.net/~maria-discuss >> Post to : maria-discuss@lists.launchpad.net >> Unsubscribe : https://launchpad.net/~maria-discuss >> More help : https://help.launchpad.net/ListHelp > > _______________________________________________ > Mailing list: https://launchpad.net/~maria-discuss > Post to : maria-discuss@lists.launchpad.net > Unsubscribe : https://launchpad.net/~maria-discuss > More help : https://help.launchpad.net/ListHelp _______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : maria-discuss@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp