What data type are the constraints over?  A unique index over a raw md5 is
much better than one on a varchar(255) or even a varchar(32).
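
For example (a minimal sketch; the table and column names are just
placeholders), storing the digest as raw 16-byte binary roughly halves the
key size compared with hex text:

CREATE TABLE lookup (
  digest BINARY(16) NOT NULL,   -- raw md5: 16 bytes instead of 32+ hex characters
  UNIQUE KEY (digest)
) ENGINE=MyISAM;

-- insert the raw digest rather than the hex string
INSERT INTO lookup (digest) VALUES (UNHEX(MD5('some value')));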

Sent from my iPhone

> On Feb 12, 2014, at 12:49 AM, Pierre <pi...@hotmail.it> wrote:
> 
> OK, I understand, but this doesn't solve the initialization problem; I'm
> sure there is a bug, or at least something that can be improved a lot. I
> can't use the UNIQUE constraint when I have too many keys to fit in RAM.
> 
> Even if I have memcache or a bloom filter in front, I still need to create
> my initial dataset with the unique constraint, and even if my initial
> dataset is already unique, I can't insert the rows into my table because of
> the bug explained before.
> 
> If MySQL can do a sort so quickly (when creating a plain index), it should
> be trivial to reuse this sorted data, eliminate the duplicates and create
> the UNIQUE index.
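
One workaround along those lines (an untested sketch; mytable and c1 are
placeholders, and whether the final ALTER then takes the fast sort path would
still need to be verified) is to deduplicate into a copy first, so the UNIQUE
index is built over data that is already unique:

-- build a deduplicated copy using a plain sort/group
-- (relies on the permissive default sql_mode to pick values for the
--  non-grouped columns)
CREATE TABLE mytable_dedup LIKE mytable;
INSERT INTO mytable_dedup SELECT * FROM mytable GROUP BY c1;

-- the copy has no duplicate c1 values, so the index only has to be built,
-- not used to reject rows
ALTER TABLE mytable_dedup ADD UNIQUE KEY (c1);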
> 
> 
> > On 12/02/2014 09:24, Justin Swanhart wrote:
> > Hi,
> >
> > This is not a bug, but how B-tree indexes work.  For them to be efficient
> > they must fit in RAM.  There are buffering mechanisms that can be used for
> > secondary indexes in some cases, because the write can be done without a
> > read, but ONLY when the index is not unique.  If it is unique, then the
> > index dive is necessary, and a B-tree traversal can take multiple random
> > IOs to get to the leaf node.
> >
> > Faster IO can help, so you can look into flash storage.
> >
> > Consider fronting your lookups with memcache (a hash index) or a bloom 
> > filter (there are many
> > implementations on the net) to reduce lookups.
> >
> > --Justin
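
As a rough in-database stand-in for that idea (only a sketch; a real memcache
or bloom filter would sit outside the server, and hot_keys is a made-up
name), a MEMORY table can act as the hash-index front:

-- MEMORY tables use HASH indexes by default, so lookups avoid B-tree dives
CREATE TABLE hot_keys (
  k BINARY(16) NOT NULL,
  PRIMARY KEY (k) USING HASH
) ENGINE=MEMORY;

-- check the hash table first; only hit the big on-disk table on a miss
SELECT 1 FROM hot_keys WHERE k = UNHEX(MD5('candidate'));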
> 
> On 12/02/2014 09:04, Pierre wrote:
>> Using this technique I have the same issue. It has now been running for
>> several hours, I'm at about 40%, and looking at SHOW FULL PROCESSLIST it's
>> getting slower and slower. It will never finish.
>> 
>> I think there is a bug here.
>> 
>> Firstly, regarding memory usage, it doesn't correctly use the buffers I
>> set; I think it's only using key_buffer_size. myisam_sort_buffer_size and
>> bulk_insert_buffer_size are not used for this task.
>> 
>> So what's happening? When the RAM is full, MariaDB does random accesses on
>> the disk to sort and filter gigabytes of data! That's why my CPU was only
>> a few percent used: the process was in I/O wait most of the time.
>> 
>> So what I am saying here is: MariaDB can't create UNIQUE keys/indexes if
>> the rows don't fit in RAM.
>> 
>> However, if I try to create a standard index (i.e. non-unique), it works
>> well and is done in less than an hour.
>> 
>> ALTER IGNORE TABLE mytable ADD INDEX (c1), ADD INDEX(c2);
>> Query OK, 349086532 rows affected (44 min 25.21 sec)
>> Records: 349086532  Duplicates: 0  Warnings: 0
>> 
>> In this second case, MariaDB does a good job by using
>> myisam_sort_buffer_size. I think it's doing something like an external
>> merge sort: splitting the rows into parts that fit in RAM, sorting them,
>> merging them and creating the index.
>> 
>> It was at 100% CPU most of the time; when it was not, it was because MySQL
>> was loading the rows into RAM from the hard disk (not doing random accesses
>> on the hard disk like during the unique index creation). So why doesn't the
>> UNIQUE index behave the same way?
>> 
>> It's easy to reproduce the bug: just create a data file 2 or 3 times the
>> size of RAM, LOAD DATA INFILE it, and try to create a UNIQUE index on it.
>> It will never finish.
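
A repro along those lines might look like this (a sketch only; the file name,
table and column are placeholders, and the data file has to be a few times
larger than RAM):

CREATE TABLE repro (c1 BIGINT NOT NULL) ENGINE=MyISAM;

-- file is 2-3x the size of RAM
LOAD DATA INFILE '/tmp/rows.tsv' INTO TABLE repro;

-- the non-unique index finishes; the unique one crawls once the keys no
-- longer fit in RAM
ALTER TABLE repro ADD INDEX (c1);
ALTER TABLE repro ADD UNIQUE KEY (c1);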
>> 
>> > On 10.02.2014 13:45, Pierre wrote:
>> >> MariaDB is getting slower and slower when I'm inserting a massive
>> >> amount of data. I'm trying to insert 166 507 066 rows (12 GB) using
>> >> LOAD DATA INFILE '' into an empty table. I split my file into 13 parts
>> >> of the same size because it took too long to insert in one shot. After
>> >> I had inserted more than 100M rows, it became extremely slow.
>> >
>> > The reason is the permanently updated keys.
>> > In general, for large inserts into a new table with keys like
>> >
>> >UNIQUE KEY `p1` (`p1`),
>> >UNIQUE KEY `p2` (`p2`),
>> >
>> > * remove keys
>> > * insert data
>> > * add keys
>> >
>> >https://dev.mysql.com/doc/refman/5.5/en/insert-speed.html
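
In SQL, that sequence could look roughly like this (a sketch; mytable, p1, p2
and the file path are placeholders; for UNIQUE keys the indexes have to be
dropped and re-created, since ALTER TABLE ... DISABLE KEYS only affects
non-unique MyISAM indexes):

-- remove keys
ALTER TABLE mytable DROP KEY p1, DROP KEY p2;

-- insert data
LOAD DATA INFILE '/path/to/data.tsv' INTO TABLE mytable;

-- add keys back
ALTER TABLE mytable ADD UNIQUE KEY p1 (p1), ADD UNIQUE KEY p2 (p2);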
>> 
> 

_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to     : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp
