Opening many connections will cause issues, especially if each thread is
running a sub-optimal query. Try to decrease the number of connections -
one way is to put a message queue in front of the database so that a
small, fixed pool of workers performs the writes (a config sketch for
capping connections follows below).
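As a sketch of the server-side cap (the numbers here are illustrative
only - thread_pool_size defaults to the number of CPUs, and the right
max_connections depends on how far you can shrink the client side):

    [mysqld]
    thread_handling         = pool-of-threads
    thread_pool_size        = 12    # roughly the number of cores
    thread_pool_max_threads = 500   # upper bound on pool threads
    max_connections         = 200   # well below the current 1000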
Tuning the database I/O is important: make sure /var/lib/mysql is on a
separate mount point, and mount it with noatime. Resize
innodb_buffer_pool_size so that it can hold your working set, and then
run some tests with innodb_flush_method = O_DIRECT (a sketch of both
follows below).

From your description, column S is the key, and it in turn is composed
of T and A. Can the data be split by the application so that T and A are
stored separately? Alternatively, maybe use persistent virtual columns
and index those instead (see the example below).

Recall that in InnoDB the primary key is a clustered index, so the table
is stored on disk sorted by column S. Inserts and updates may require
moving rows around, slowing the SQL statements (high insert/update
concurrency will worsen the situation). And if column S is the primary
key and is large, every secondary index carries a copy of the primary
key, so the indexes themselves become large and increase the load on the
system I/O (a surrogate-key sketch follows as well).
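A rough sketch of those I/O settings (the device name and buffer size
are placeholders; size the pool to your working set, and benchmark
O_DIRECT before adopting it):

    # /etc/fstab - hypothetical device for the data directory
    /dev/sdb1  /var/lib/mysql  ext4  defaults,noatime  0  2

    # my.cnf - illustrative values
    [mysqld]
    innodb_buffer_pool_size = 8G
    innodb_flush_method     = O_DIRECT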
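A minimal sketch of the virtual-column idea. It assumes, purely for
illustration, that S is built as 'T:A' with a colon separator - the
expressions must match however your application actually composes S, and
my_table and the column names are hypothetical:

    ALTER TABLE my_table
      ADD COLUMN t_part VARCHAR(64) AS (SUBSTRING_INDEX(s, ':', 1)) PERSISTENT,
      ADD COLUMN a_part VARCHAR(64) AS (SUBSTRING_INDEX(s, ':', -1)) PERSISTENT,
      ADD INDEX idx_t_part (t_part),
      ADD INDEX idx_a_part (a_part);

PERSISTENT columns are computed once on write, so lookups and index
maintenance don't re-evaluate the expression.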
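And, for the clustered-index point, one common layout is a small
surrogate primary key with a unique index on S. This is only a sketch of
the S/T/A/D model described further down in the thread; the types and
lengths are guesses (the thread only says an entry needs at most a
couple of hundred bytes):

    CREATE TABLE my_table (
      id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
      s  VARCHAR(200) NOT NULL,
      t  VARCHAR(100),
      a  VARCHAR(100),
      d  INT,
      PRIMARY KEY (id),    -- small clustered key; new rows append in id order
      UNIQUE KEY uk_s (s)  -- lookups and uniqueness still go through S
    ) ENGINE=InnoDB;

With this layout every secondary index carries the short id rather than
the whole of S.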
On Mon, Oct 7, 2019 at 7:13 PM JCA <1.41...@gmail.com> wrote:
> Thanks for your feedback. Please see my comments interspersed below.
>
> On Mon, Oct 7, 2019 at 10:38 AM Guillaume Lefranc
> <guilla...@adishatz.net> wrote:
>
>> Hello,
>>
>>> thread-handling=pool-of-threads
>>> max_connections = 1000
>>> table_open_cache = 800
>>> query_cache_type = 0
>>> innodb_buffer_pool_size = 512M
>>> innodb_buffer_pool_instances = 10
>>> innodb_adaptive_hash_index_partitions = 20
>>> innodb_lock_wait_timeout = 5000
>>>
>> No need to use buffer pool instances with only 512M of buffer. You
>> said you have 24GB of RAM - why not increase the buffer size? How big
>> is your table on the disk right now? If you want the best performance
>> it must be held in the buffer.
>>
> It does not seem to be that big - about 101 MB. I used the following
> command:
>
> SELECT table_name AS `Table`, round(((data_length + index_length) /
> 1024 / 1024), 2) `Size (MB)` FROM information_schema.TABLES WHERE
> table_schema = "<my-db-name>";
>
> I had to look it up.
>
>>> With this, my application can keep up with the FIFO writer, but -
>>> depending on the circumstances - my database can't. As I am writing
>>> this, there are over 1300 threads connected to my database; any
>>> command that I issue at the mysql CLI takes over one minute to
>>> return. I am keeping track of how long each thread takes to
>>> complete, and that is of the order of hundreds of seconds -
>>> sometimes thousands. Each thread is itself simple, in that it just
>>> issues a couple of simple MariaDB commands. Currently my table
>>> consists of 1.6 million entries, and growing - on this basis, I
>>> expect that things will only get worse. Each entry, however, will
>>> never require more than a couple of hundred bytes of storage. The
>>> operations that can be undertaken on entries are insertion, deletion
>>> and modification, the latter being straightforward - like e.g.
>>> incrementing a counter or replacing a short string.
>>>
>> You are not considering many factors:
>> a) limiting the number of concurrent threads - with 1300 threads you
>> are creating concurrency races and locking issues. Try limiting your
>> threads to a factor of the # of cores.
>>
> I will. Notice, though, that most of the time most of the cores are
> idle anyway.
>
>> b) you're writing to a disk system. The number of CPU cores won't
>> matter if you saturate the disk. You say nothing about the disk -
>> whether it's SSD, HDD, etc. Note that HDDs are severely handicapped
>> when it comes to concurrent IO operations.
>>
> It is an HDD. I am sure it will be a factor in making things slower.
>
>> c) given the factor above, you could maybe try relaxing commit to
>> disk if integrity of the data is not of utmost importance, for
>> example by adding "innodb_flush_log_at_trx_commit = 2" to your
>> config.
>>
> Thanks - I'll try that.
>
>>> My system has 24 GB of RAM and 12 cores. Occasionally all the cores
>>> are fully busy with MariaDB activity, but most of the time barely
>>> one or two are.
>>>
>> d) CPU in a DB system will mostly be used for queries (index scans)
>> and some for reindexing, but you're only doing writes, so unless you
>> have dozens of indexes...
>> e) Table design - is your table design efficient? How many rows per
>> second are you writing? Maybe you could also benefit from hash
>> partitioning or clever data types.
>>
>>> I am a newbie when it comes to interacting with MariaDB - please,
>>> bear with me. I know I must use a single database and a single
>>> table. I also know - because of the nature of the data that are
>>> being written to the FIFO - that the probability for two different
>>> threads to be operating on the same entry in the table at the same
>>> time is negligible - i.e. for all practical purposes, that will not
>>> happen.
>>> What I need is advice on how to configure my instance of MariaDB to
>>> perform optimally in the scenario above. In particular, I would like
>>> for it to make better use of all the cores available - in essence,
>>> to parallelize the database operations as much as possible.
>>>
>> f) Congratulations, you have managed to contradict yourself in two
>> sentences. If your workload isn't parallel by design you will not
>> make use of the available cores. Also, your workload sounds IO-bound
>> to me - there's a strong chance the disk is the bottleneck.
>> g) "I know I must use a single database and a single table." How do
>> you know this if you are a newbie? No offense meant, but most of the
>> time a monolithic design is not the best way to leverage
>> performance...
>>
> The data that I have consist of a string S, two string attributes T
> and A, and an integer attribute D. String S is what I have to use as a
> key in order to insert/delete/modify entries. Can this advantageously
> be stored across several tables? The S strings have nothing in common,
> beyond the attributes that I mentioned. As for the parallelism, the
> essence is what I pointed out - namely, that no two threads will work
> on the same entry (as accessed by S) at the same time. They can of
> course be adding new entries, or deleting existing ones, at the same
> time, with the proviso that the insertion operations will always be
> for different values of S, and analogously for the deletion
> operations. I am indeed open to suggestions about a better design.
>
>> -GL
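PS: if you experiment with the hash partitioning Guillaume mentions
under e), a minimal sketch, assuming S stays the primary key (KEY
partitioning requires the partitioning column in every unique key) and
with an illustrative partition count:

    ALTER TABLE my_table
      PARTITION BY KEY (s) PARTITIONS 8;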