Given zero optimization in the original post, innodb_flush_log_at_trx_commit = 
2  is almost the right fix, orders of magnitude faster on hard disk.
Add simple batching to this, and it is possible to add another order of 
magnitude, or 2.
Write into several tables instead of 1, and that will improve the throughput by 
5-10% more or something 😊

I doubt anything there is a need for anything fancy, TokuDB , Redis, what not.


From: Guillaume Lefranc
Sent: Wednesday, 9 October 2019 15:24
To: rhys.campb...@swisscom.com
Cc: Mailing-List mariadb
Subject: Re: [Maria-discuss] Performance tuning sought for MariaDB

I doubt that TokuDB will help with 100MB of data. It's been designed with big 
datasets in mind and generally underperforms with small data.
The question is more how fast you can read and write to the table. At this 
point even an in-memory database could be considered and maybe MariaDB isn't 
even the right choice for that (Redis?)

Le mer. 9 oct. 2019 à 09:15, <rhys.campb...@swisscom.com> a écrit :
Might be worth having a look at TokuDB 
https://mariadb.com/kb/en/library/tokudb/ 
 
From: Maria-discuss 
<maria-discuss-bounces+rhys.campbell=swisscom....@lists.launchpad.net> On 
Behalf Of JCA
Sent: 07 October 2019 21:03
To: Kenneth Penza <kpe...@gmail.com>
Cc: Mailing-List mariadb <maria-discuss@lists.launchpad.net>
Subject: Re: [Maria-discuss] Performance tuning sought for MariaDB
 
 
 
On Mon, Oct 7, 2019 at 12:12 PM Kenneth Penza <kpe...@gmail.com> wrote:
Opening many connections will cause issues, especially if each thread is 
running a sub-optimal query. Try to decrease the number of connections, one-way 
can be using message queues.

Tuning the database I/O is important, ensure you are using a separate mount 
point for /var/lib/mysql and mount it with noatime. Resize the 
innodb_buffer_pool_size to ensure that it can hold your workload. Subsequently, 
perform some tests with innodb_flush_method = "O_DIRECT".

>From your description, data S is the key which in turn is composed of T and A. 
>Can the data be stripped by the application and T and A stored separately? 
>Maybe use persistent virtual columns and index those instead.
 
       I don't think I understand. It is not that S is composed of T and A; 
what I am reading from the FIFO is two items: S and A. T is derived from S and 
A, together with a timestamp.
 
Recall that in InnoDB the primary key is a clustered index thus the table is 
written to disk sorted by column S. Inserts and updates may require moving rows 
around slowing the SQL statements (high insert/update concurrency will worsen 
the situation). If column S is the primary key and is large, all other indexes 
will use the primary key as the prefix, causing indexes to be large increasing 
the load on the system I/O.
 
      There is something to what you are saying, for I have a high 
insert/update concurrency. I will definitely try to understand to details of 
what you are suggesting. 
 
 
 
On Mon, Oct 7, 2019 at 7:13 PM JCA <1.41...@gmail.com> wrote:
Thanks for your feedback. Please see my comments  interspersed below.
 
On Mon, Oct 7, 2019 at 10:38 AM Guillaume Lefranc <guilla...@adishatz.net> 
wrote:
Hello,
 
thread-handling=pool-of-threads 
max_connections = 1000
table_open_cache = 800
query_cache_type = 0
innodb_buffer_pool_size = 512M
innodb_buffer_pool_instances = 10
innodb_adaptive_hash_index_partitions = 20
innodb_lock_wait_timeout = 5000
No need to use buffer pool instances with only 512M of buffer. you said you 
have 24GB of RAM - why not increase the buffer size? how big is your table on 
the disk right now? If you want the best performance it must be hold in the 
buffer. 
 
It does not seem to be that big - about 101 MB. I used the following command:
 
 SELECT table_name AS `Table`, round(((data_length + index_length) / 1024 / 
1024), 2) `Size (MB)` FROM information_schema.TABLES WHERE table_schema = 
"<my-db-name>";
 
I had to look it up.
 
 
With this, my application can keep up with the FIFO writer, but - depending on 
the circumstances - my database can't. As I am writing this, there are over 
1300 threads connected to my database; any command that I issue at the mysql 
CLI takes over one minute to return. I am keeping track on how long each thread 
takes to complete, and that is of the order of hundreds of seconds - sometimes 
thousands. Each thread is itself simple, in that it just issues a couple of 
simple MariaDB commands.  Currently my table consists of 1.6 million entries, 
and growing - on this basis, I expect that things will get only worse. Each 
entry,however, will never require more than a couple of hundred bytes of 
storage. The operations that can be undertaken on entries are insertion, 
deletion and modification, the latter being straightforward - like e.g. 
incrementing a counter or replacing a short string.
 
You are not considering many factors which is
a) limiting the number of concurrent threads - with 1300 threads you are 
creating concurrency races and locking issues. Try limiting your threads to a 
factor of the # of cores
 
   I will. Notice, though, that most of the time most of the cores are idle 
anyway.
 
b), you're writing to a disk system. The number of CPU cores won't matter if 
you saturate the disk. You say nothing about the disk, if it's SSD, HDD etc. 
Note that HDD's are severely handicapped when it comes to concurrent IO 
operations.
 
   It is an HDD. I am sure it will be a factor in making things slower.
 
c) given the factor above you could maybe try relaxing commit to disk if 
integrity of the data is not of utmost importance, for example by adding 
"innodb_flush_log_at_trx_commit = 2" to your config.
 
     Thanks - I'll try that.
 
My system has 24 GB of  RAM and 12 cores. Occasionally all the cores are fully 
busy with MariaDB activity, but most of the time barely one or two are.
 
d) CPU in a DB system will mostly be used for queries (index scans) and some 
for reindexing but you're only doing writes, so unless you have dozens of 
indexes
e) Table design - is your table design efficient ? how many rows per sec are 
you writing? maybe you could also benefit from hash partitioning or clever data 
types.
 
I am a newbie when it comes to interacting with MariaDB - please,  bear with 
me. I know I must use a single database and a single table. I also know - 
because of the nature of the data that are being written to the FIFO - that the 
 probability for two different threads to be operating on the same entry in the 
table at the same time is negligible - i.e. for all practical purposes, that 
will not happen.
What I need is advice on how to configure my instance of MariaDB to perform 
optimally in the scenario above. In particular, I would like for it to make 
better use of all the cores available - in essence, to parallelize the database 
operations as much as possible.
 
f) Congratulations, you have managed to contradict yourself in two sentences. 
If your workload isn't parallel by design you will not make use of the 
available cores. Also, your workload sounds IO-bound to me - there's a strong 
change the disk is the bottleneck.
g) "I know I must use a single database and a single table." How do you know 
this if you are a newbie? No offense meant but nost of monolithic design is not 
the best to leverage performance....
 
     The data that I have consist of a string S, two string attributes T and A, 
and an integer attribute D.  String S is what I have to use as a key in order 
to insert/delete/modify entries. Can this be advantageously be stored across 
several tables?  The S strings have nothing in common, beyond the attributes 
that I mentioned. As for the parallelism, the essence is what I pointed out - 
namely, that no two threads will work on the same entry (as accessed by S) at 
the same time.  They can of course be adding new entries, or deleting existing 
ones, at the same time, with the proviso that the insertion operations will 
always be for different values of S, and analogously for the deletion 
operations. I am indeed open to suggestions about a better design.
 
-GL
_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to     : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp
_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to     : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp

_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to     : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp

Reply via email to