On Wed, May 12, 2010 at 10:41:35AM +0200, Kevin Krammer wrote:
> Since you are writing a bit down that you think it is caused by
> kres-migrator, where did you get it from (here it seems to be part of
> the kdepim-runtime package).

Yes, kres-migrator is part of kdepim-runtime. I do have that package
installed, as it seems to be indirectly depended upon by kde-minimal.
What I don't have installed is kdepim itself and its application
dependencies (kaddressbook, kalarm, kmail, knode, knotes, kontact,
korganizer, etc.). Maybe the kdepim-runtime dependency itself is a
bug--can't say.

> kres-migrator is called when an application accesses the KResource
> framework, e.g. some app accessing the old addressbook API.
> Not using KDEPIM apps does not necessarily mean none of your other
> applications access PIM data.

Looks like the culprit here is libkabc. There's a "Default Addressbook"
created by the library, which is presumably empty. I'm not sure what's
loading libkabc in the first place. I do know that I didn't even have
kabc database files (.kde/share/apps/kabc/std.vcf*) until upgrading to
4.4. Maybe it's an explicit part of the migration? Or I suppose one of
the panel widgets I'm using might depend on it now, but I don't believe
that's the case.

> Anyone know what kind of data is stored in these logs?

Looked into this a bit. The InnoDB documentation itself is a little
lacking in describing its particular architecture, but there's an InnoDB
tuning tutorial [1] that's rather helpful.

These files serve as InnoDB's REDO logs. They serve two purposes. First,
committed transactions are written to the REDO logs sequentially, so
that table updates (with possible random seeks) can be done in a
write-back mode "at leisure." Second, REDO logs serve as a durability
measure. Each time the database is restarted, the REDO logs are replayed
to ensure that recent transactions have been properly committed--say, if
the database was killed with "kill -9" or there's other table
corruption. They may also be used in recovery: if table corruption is
found and old tables can be reloaded from backup, then the REDO logs can
be replayed to bring the tables up to date.
You can also forward REDO logs to standby (fail-over) servers to ensure
their database tables are up to date.

The REDO logs themselves contain row updates from insert/update
statements. So for a given row length, the REDO logs contain roughly the
last LOG_SIZE/ROW_LENGTH transactions. They're not used in selects or
other non-mutating accesses.

REDO log size is not an issue of correctness. A small log size might
result in decreased performance by forcing a burst of inserts/updates to
be committed to the tables before a transaction can complete. A larger
log size may also be of benefit in data recovery if database corruption
is found, provided a recent enough table backup is maintained so that
the REDO log still contains all non-backed-up transactions.

Let's try to quantify this a bit. I'm not exactly sure what kind of
database workloads Akonadi is targeting, but for PIM applications we're
looking at managing (1) contacts, (2) calendar entries, (3) "TODO"
tasks, (4) notes-to-self, etc. It seems to me that each of these things
results in:

 - Table row length on the order of 1 kB.
 - Total number of rows < 10,000 (how many people do you know?)
 - Largely read-only data sets that grow over a period of years.
 - A working set (actively updated rows) < 1,000 per day. Probably
   < 100.

Thus, I would conclude that tables rarely grow larger than 10 MB
(1 kB * 10,000). The volume of inserts/updates per table shouldn't
exceed 100 kB-1 MB per day (100-1,000 rows at 1 kB each). We're also
unlikely to see bursty updates any time information is manually
provided, since it has to be typed in. Bursty updates would happen on
device synchronization, at which point you might see 100 kB-1 MB of
table updates in a few-second window.

This means that we should be able to record 1-10 days of update history
with 1 MB transaction logs. And that's under a heavy PIM load.

We also know that InnoDB's "leisure write" pace is 64 pages per second.
Each page is 16 kB.
If we're pessimistic, and say that row updates are randomly distributed
enough that there's only a single updated row per page, then 100 kB-1 MB
of table updates touches 100-1,000 pages, and it would take roughly
1.6 s-15.6 s to "leisurely flush" them.

So it really only makes sense to increase the REDO log beyond 1 MB for
performance purposes if we expect 1,000 random-row updates (1 MB) to
occur more frequently than once every 15 seconds. That doesn't really
seem plausible with these kinds of workloads. In the event it _does_
happen, then it just takes a little longer to finish the sync.

So that's my argument for 1 MB REDO logs. Let's look at the other
defaults for a bit:

innodb_log_buffer_size=1M -- Should be large enough to assemble a
transaction in memory. 1 MB fits 1,000 single-row transactions of 1 kB
each. Fine.

innodb_buffer_pool_size=80M -- Page cache size, including cached read
pages. Seems a fairly large and wasteful use of RAM if we expect the
tables themselves to grow no larger than 10 MB. 8 MB might be a more
reasonable default, and in the event that a database does grow large,
the Linux buffer cache should eliminate most of the disk fetches (unless
the tables are opened O_DIRECT, I'm not sure about that).

innodb_file_per_table=1 -- Each table (basically, application) uses its
own DB file. Implicitly configured is each table having a minimum size
of 10 MB, and growing in 8 MB increments.

Cutting to the chase: with the current configuration, each per-user
MySQL instance will use 80 MB of RAM for a database buffer pool, and
will create 10 MB files per table (per application), but we expect
tables never to grow beyond that size. Overall this seems kind of
wasteful for this type of workload. These are also defaults that are
meant for the lower end of centralized DB workload scenarios, not
per-user PIM storage.

If InnoDB is to be used long term, we could really benefit from
constructing a sample workload and (asking a DBA to help us in) tuning
the initial parameters appropriately.
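
For anyone who wants to check the arithmetic above, here is a rough
sketch in Python. The constants are only the assumed figures from this
mail (1 kB rows, at most 10,000 rows, 100-1,000 updated rows per day, a
1 MB REDO log, 64 pages flushed per second), and the function names are
made up for illustration. Nothing here measures a real Akonadi database.

```python
# Back-of-the-envelope check of the estimates in this mail.
# Every input is an assumed figure from the text, not a measurement.

KB = 1000        # decimal units, matching "1 kB * 10,000 = 10 MB"
MB = 1000 * KB

ROW_BYTES = 1 * KB          # assumed row length
MAX_ROWS = 10_000           # assumed upper bound on rows per table
REDO_LOG_BYTES = 1 * MB     # proposed REDO log size
FLUSH_PAGES_PER_SEC = 64    # "leisure write" pace quoted in the mail


def table_bytes(rows=MAX_ROWS, row_bytes=ROW_BYTES):
    """Upper bound on table size for the assumed workload."""
    return rows * row_bytes


def days_of_history(updated_rows_per_day, log_bytes=REDO_LOG_BYTES,
                    row_bytes=ROW_BYTES):
    """Days of row updates that fit in the log (LOG_SIZE/ROW_LENGTH rows)."""
    return log_bytes / (updated_rows_per_day * row_bytes)


def flush_seconds(updated_rows, pages_per_sec=FLUSH_PAGES_PER_SEC):
    """Pessimistic flush time: one updated row per page."""
    return updated_rows / pages_per_sec


if __name__ == "__main__":
    print("max table size: %d MB" % (table_bytes() // MB))
    for rows in (100, 1_000):
        print("%5d rows/day: %4.1f days of history, %5.2f s to flush"
              % (rows, days_of_history(rows), flush_seconds(rows)))
```

Changing REDO_LOG_BYTES or the rows-per-day figures shows how sensitive
the 1 MB conclusion is to those assumptions.
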
But if we're going to go with things as is, I've already made the case
for why a 1 MB REDO log should be sufficient. I would actually claim
now, though, that 5 MB REDO logs wouldn't be unreasonable in this
context either. Turns out that 5 MB REDO logs are the InnoDB default,
and that would mean that the REDO logs (two files by default, so 10 MB)
would occupy the same amount of disk space as an empty table. Since
we're paying a 10 MB penalty for each application to use Akonadi in the
first place, another 10 MB for the logs isn't extremely egregious.

The part that bothers me is that the Akonadi folks are basically aware
of the situation, and feel justified in claiming [2] that 100+ MB of
disk is reasonable. Frankly, if you ask even an arm-chair DBA whether
using InnoDB with these parameters is appropriate for per-user PIM
management, they'll look at you like you're crazy--which is, from what I
can tell, the underlying reason for so much of the dislike of KDE 4.4.

I can't imagine that SQLite was really _so bad_ a target for low-usage
PIM workloads that the Akonadi folks couldn't have just written a plugin
for it some time ago and filed some bugs. After all, Firefox uses it
rather extensively; it seems like it would've been a perfect fit. But
that's another story, and we just have to make do with what we have
right now.

[1] http://mysqldump.azundris.com/archives/78-Configuring-InnoDB-An-InnoDB-tutorial.html
[2] http://techbase.kde.org/Projects/PIM/Akonadi#Akonadi_needs_too_much_space_in_my_home_directory.21