On Mon, Oct 7, 2019 at 12:12 PM Kenneth Penza <kpe...@gmail.com> wrote:
> Opening many connections will cause issues, especially if each thread is
> running a sub-optimal query. Try to decrease the number of connections;
> one way can be using message queues.
>
> Tuning the database I/O is important: ensure you are using a separate
> mount point for /var/lib/mysql and mount it with noatime. Resize
> innodb_buffer_pool_size to ensure that it can hold your workload.
> Subsequently, perform some tests with innodb_flush_method = "O_DIRECT".
>
> From your description, data S is the key, which in turn is composed of T
> and A. Can the data be stripped by the application and T and A stored
> separately? Maybe use persistent virtual columns and index those instead.

I don't think I understand. It is not that S is composed of T and A; what
I am reading from the FIFO is two items: S and A. T is derived from S and
A, together with a timestamp.

> Recall that in InnoDB the primary key is a clustered index, thus the table
> is written to disk sorted by column S. Inserts and updates may require
> moving rows around, slowing the SQL statements (high insert/update
> concurrency will worsen the situation). If column S is the primary key and
> is large, all other indexes will use the primary key as the prefix, causing
> indexes to be large and increasing the load on the system I/O.

There is something to what you are saying, for I have a high insert/update
concurrency. I will definitely try to understand the details of what you
are suggesting.

> On Mon, Oct 7, 2019 at 7:13 PM JCA <1.41...@gmail.com> wrote:
>
>> Thanks for your feedback. Please see my comments interspersed below.
>>
>> On Mon, Oct 7, 2019 at 10:38 AM Guillaume Lefranc <guilla...@adishatz.net> wrote:
>>
>>> Hello,
>>>
>>>> thread-handling=pool-of-threads
>>>> max_connections = 1000
>>>> table_open_cache = 800
>>>> query_cache_type = 0
>>>> innodb_buffer_pool_size = 512M
>>>> innodb_buffer_pool_instances = 10
>>>> innodb_adaptive_hash_index_partitions = 20
>>>> innodb_lock_wait_timeout = 5000
>>>
>>> No need to use buffer pool instances with only 512M of buffer. You said
>>> you have 24GB of RAM - why not increase the buffer size? How big is your
>>> table on the disk right now? If you want the best performance it must be
>>> held in the buffer.
>>
>> It does not seem to be that big - about 101 MB. I used the following
>> command:
>>
>> SELECT table_name AS `Table`, round(((data_length + index_length) / 1024
>> / 1024), 2) `Size (MB)` FROM information_schema.TABLES WHERE table_schema =
>> "<my-db-name>";
>>
>> I had to look it up.
>>
>>>> With this, my application can keep up with the FIFO writer, but -
>>>> depending on the circumstances - my database can't. As I am writing this,
>>>> there are over 1300 threads connected to my database; any command that I
>>>> issue at the mysql CLI takes over one minute to return. I am keeping track
>>>> of how long each thread takes to complete, and that is of the order of
>>>> hundreds of seconds - sometimes thousands. Each thread is itself simple, in
>>>> that it just issues a couple of simple MariaDB commands. Currently my
>>>> table consists of 1.6 million entries, and growing - on this basis, I
>>>> expect that things will get only worse. Each entry, however, will never
>>>> require more than a couple of hundred bytes of storage. The operations that
>>>> can be undertaken on entries are insertion, deletion and modification, the
>>>> latter being straightforward - like e.g. incrementing a counter or
>>>> replacing a short string.
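
(Adding a bit of concreteness here for this thread: each worker issues
statements roughly of this shape - the table and column names are made up
for illustration, not my real schema:

    -- add a new entry, or bump its counter if the key is already there
    INSERT INTO entries (s, t, a, d)
    VALUES ('some-key', 'derived-t', 'attr-a', 1)
    ON DUPLICATE KEY UPDATE d = d + 1;

    -- modify an existing entry, e.g. replace a short string
    UPDATE entries SET t = 'new-t' WHERE s = 'some-key';

    -- remove an entry
    DELETE FROM entries WHERE s = 'some-key';

with S as the key, and each thread running only one or two such statements
over its own connection.)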
>>>
>>> You are not considering many factors:
>>> a) limiting the number of concurrent threads - with 1300 threads you are
>>> creating concurrency races and locking issues. Try limiting your threads to
>>> a factor of the # of cores.
>>
>> I will. Notice, though, that most of the time most of the cores are
>> idle anyway.
>>
>>> b) you're writing to a disk system. The number of CPU cores won't
>>> matter if you saturate the disk. You say nothing about the disk, if it's
>>> SSD, HDD etc. Note that HDDs are severely handicapped when it comes to
>>> concurrent IO operations.
>>
>> It is an HDD. I am sure it will be a factor in making things slower.
>>
>>> c) given the factor above you could maybe try relaxing commit to disk if
>>> integrity of the data is not of utmost importance, for example by adding
>>> "innodb_flush_log_at_trx_commit = 2" to your config.
>>
>> Thanks - I'll try that.
>>
>>>> My system has 24 GB of RAM and 12 cores. Occasionally all the cores are
>>>> fully busy with MariaDB activity, but most of the time barely one or two
>>>> are.
>>>
>>> d) CPU in a DB system will mostly be used for queries (index scans) and
>>> some for reindexing, but you're only doing writes, so unless you have dozens
>>> of indexes the CPU should not be your bottleneck.
>>> e) Table design - is your table design efficient? How many rows per sec
>>> are you writing? Maybe you could also benefit from hash partitioning or
>>> clever data types.
>>>
>>>> I am a newbie when it comes to interacting with MariaDB - please, bear
>>>> with me. I know I must use a single database and a single table. I also
>>>> know - because of the nature of the data that are being written to the FIFO
>>>> - that the probability for two different threads to be operating on the
>>>> same entry in the table at the same time is negligible - i.e. for all
>>>> practical purposes, that will not happen.
>>>> What I need is advice on how to configure my instance of MariaDB to
>>>> perform optimally in the scenario above. In particular, I would like for it
>>>> to make better use of all the cores available - in essence, to parallelize
>>>> the database operations as much as possible.
>>>
>>> f) Congratulations, you have managed to contradict yourself in two
>>> sentences.
>>> If your workload isn't parallel by design you will not make use of the
>>> available cores. Also, your workload sounds IO-bound to me - there's a
>>> strong chance the disk is the bottleneck.
>>> g) "I know I must use a single database and a single table." How do you
>>> know this if you are a newbie? No offense meant, but most of the time a
>>> monolithic design is not the best for performance...
>>
>> The data that I have consist of a string S, two string attributes T
>> and A, and an integer attribute D. String S is what I have to use as a key
>> in order to insert/delete/modify entries. Can this advantageously be
>> stored across several tables? The S strings have nothing in common, beyond
>> the attributes that I mentioned. As for the parallelism, the essence is
>> what I pointed out - namely, that no two threads will work on the same
>> entry (as accessed by S) at the same time. They can of course be adding
>> new entries, or deleting existing ones, at the same time, with the proviso
>> that the insertion operations will always be for different values of S, and
>> analogously for the deletion operations. I am indeed open to suggestions
>> about a better design.
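
Coming back to Kenneth's point about the clustered primary key: if I
understand it correctly, the suggestion would look roughly like this
(names and sizes invented for illustration; I have not tried this yet):

    CREATE TABLE entries (
        id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,  -- small surrogate clustered key
        s  VARCHAR(200) NOT NULL,                    -- the long string key
        t  VARCHAR(100),                             -- derived from S, A and a timestamp
        a  VARCHAR(100),
        d  INT NOT NULL DEFAULT 0,                   -- the integer attribute / counter
        PRIMARY KEY (id),
        UNIQUE KEY uk_s (s)                          -- lookups by S still go through an index
    ) ENGINE=InnoDB;

Rows would then be clustered by the short id instead of by S, new rows
would be appended at the end of the clustered index rather than inserted at
positions dictated by S, and any other secondary index would reference rows
by the short id instead of by the full S value. Is that the kind of thing
you meant?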
>>> -GL
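
P.S. For the record, this is the configuration change set I plan to test
next, pulling together the suggestions from this thread (the values are my
own guesses for a 24 GB box, not recommendations from anyone here):

    [mysqld]
    innodb_buffer_pool_size        = 8G        # table is ~101 MB today, leaving room to grow
    innodb_flush_method            = O_DIRECT  # to be benchmarked, per Kenneth
    innodb_flush_log_at_trx_commit = 2         # relaxed durability, per Guillaume

plus putting /var/lib/mysql on its own mount point with noatime, and
capping the application at a small multiple of the 12 cores (a few dozen
worker threads) instead of the 1300 threads I have now.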
_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to     : maria-discuss@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp