This version is improved, and clearly a lot of work has gone into it.
But I still see problems with it affecting the rest of the system.

I deselected "Enable Indexing" and "Enable Watching" in the Indexing
Preferences (first tab, General), but it still indexed, even after
rebooting.  Because of this, I had to uninstall tracker completely, to
get my system usable with the old version.

I decided to give this new version a try, as it's clearly had a lot of
work done.  So I installed it, and this is what I found.  I'm using
tracker-0.6.2-0ubuntu3:

1. I see it still insists on indexing even though the "Enable Indexing"
preference is deselected.  This is either a bug or a misleading UI.  It
means if I don't want indexing, even temporarily, I either have to
"killall -9 trackerd" or uninstall the package.  It would be much better
to be able to disable indexing, and re-enable it when I don't mind the
impact.

2. The first time it ran (the new version), my laptop slowed to a crawl
after about 5-10 minutes.  I thought this might be the old disk I/O
problems, but it turned out to be excessive VM usage.  trackerd was
using 900MB virtual memory, of which 704MB was RSS.  I have 1GB RAM
total, and a Gnome desktop plus Firefox needs about half of that, and so
everything ran very slowly.  I think needing 704MB RSS is excessive for
an indexer, and probably indicates a bug.  I got out of this by killing
trackerd (-9).

3. The next time it ran was after a power cycle.  This time, for a couple
of hours it stayed quite small (10MB RSS), nice.  But it was using 100%
CPU, and hardly any I/O according to the Disk Usage monitor.  A quick
strace shows it doing repeated SQLite commits (and creating and
unlinking a temporary log file with each commit).  But crucially: it's
doing this and no other system calls.  This means it's doing a lot of
commits, but not indexing anything.  It's not reading my filesystem at
all.  Also, presumably it should never use 100% CPU for a sustained long
time, if it's on the maximum throttle setting (which it is).

4. Despite the low I/O activity according to Disk Usage monitor, it's
actually I/O bound.  What's happening is that every small SQLite write
creates a log file in /tmp, writes to that, calls fsync, writes the main
file, calls fsync on that, then unlinks the log file.  Perhaps it isn't
obvious: those sequential fsyncs on the main database will be causing
tracker to run a lot more slowly than usual, and they also force the
disk head to remain close to the filesystem logging area (fsync only has
to commit the log, nothing else).  I noticed with the earlier version
that _this_ is sometimes the cause of "kills disk I/O", not the reading
for indexing, not the inotify watching, but the continuous rapid rate of
fsync calls on the database.  The solution to this is to aggregate many
db writes into single transactions, to reduce the fsync rate safely.
For an indexing application like this, you can use a timer to decide
when enough writes have been gathered and a commit should be done, so
that fsyncs are rate limited by time.
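
To make the difference concrete, here is a minimal Python sketch using
the standard sqlite3 module.  It is not tracker's actual C code; the
table name "words" and the helpers insert_words() and demo() are made up
for illustration.  It writes the same rows either with one commit per
row, which is what the strace suggests trackerd is doing, or inside a
single transaction, and reports how many commits each approach issued.
With SQLite's default settings on an on-disk database, each commit is
where the journal and the main file get synced.

```python
import os
import sqlite3
import tempfile


def insert_words(db_path, words, commit_every_row):
    """Insert words into a hypothetical 'words' table; return the commit count."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS words (word TEXT)")
    conn.commit()
    commits = 0
    for word in words:
        # sqlite3 opens a transaction implicitly before the first INSERT.
        conn.execute("INSERT INTO words (word) VALUES (?)", (word,))
        if commit_every_row:
            conn.commit()  # one journal + sync cycle per row
            commits += 1
    if not commit_every_row:
        conn.commit()  # a single journal + sync cycle for the whole batch
        commits += 1
    conn.close()
    return commits


def demo(n=200):
    """Run both strategies on fresh temporary databases."""
    words = ["word%d" % i for i in range(n)]
    with tempfile.TemporaryDirectory() as tmp:
        per_row = insert_words(os.path.join(tmp, "a.db"), words, True)
        batched = insert_words(os.path.join(tmp, "b.db"), words, False)
    return per_row, batched
```

Both databases end up with the same rows; only the number of commit
cycles differs (n versus 1).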

So, I still find that I cannot use tracker on my laptop for now.  But I
have these suggestions, which might make it possible to use in future:

1. Fix the occasional massive memory usage.  I suspect this is a bug you
would want to fix anyway because tracker is advertised as a small, low
footprint program.

2. Fix the Indexing Preferences so that deselecting "Enable Indexing"
actually does disable indexing, until you turn it on again.  I could
understand if the preference didn't have any effect until trackerd is
restarted (although that would not be ideal), but this doesn't turn off
indexing even after a reboot, which makes no sense.

3. Fix the state where it's spending 100% CPU doing lots of small writes
to the database with fsync commits, without apparently doing any
filesystem indexing.  Is this caused by the SQLite incremental BLOB
writes, perhaps?  Perhaps this would be fixed by the next item:

4. Don't do a full commit after every write to the database.  Aggregate
them in transactions, so that a disk commit (fsync) happens at a limited
rate.  Ideally limit the rate using a timer, plus a limit on the amount
of uncommitted data.  This will make a big difference to disk I/O for
other applications in some circumstances because of interactions with
disk seeks, even when it looks like there's very little I/O caused by
trackerd in statistics.  But even better: it will probably make trackerd
much faster at writing to the database, and use much less CPU and less
power, all of which can only be good.
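
A rough sketch of what suggestion 4 could look like, again in Python
with the standard sqlite3 module rather than tracker's real code.  The
class BatchedWriter, its parameters max_delay and max_pending, and the
injectable clock are all my own assumptions for illustration.  Writes
join one open transaction, and a commit happens only when the oldest
uncommitted write is too old or too many writes are pending.

```python
import sqlite3
import time


class BatchedWriter:
    """Group writes into transactions; commit on a time or size limit.

    Hypothetical helper for illustration, not part of tracker or SQLite.
    """

    def __init__(self, conn, max_delay=5.0, max_pending=1000,
                 clock=time.monotonic):
        self.conn = conn
        self.max_delay = max_delay      # max age (seconds) of the oldest uncommitted write
        self.max_pending = max_pending  # max number of uncommitted writes
        self.clock = clock
        self.pending = 0
        self.commits = 0
        self.first_write = None         # time of the oldest uncommitted write

    def write(self, sql, params=()):
        if self.pending == 0:
            self.first_write = self.clock()
        self.conn.execute(sql, params)  # joins the currently open transaction
        self.pending += 1
        self.maybe_commit()

    def maybe_commit(self):
        """Commit if either limit is reached; a timer should call this too."""
        if self.pending == 0:
            return
        too_old = self.clock() - self.first_write >= self.max_delay
        too_big = self.pending >= self.max_pending
        if too_old or too_big:
            self.flush()

    def flush(self):
        """Commit whatever is pending, for example at shutdown."""
        if self.pending:
            self.conn.commit()
            self.commits += 1
            self.pending = 0
            self.first_write = None
```

In this sketch the time limit is only checked when write() or
maybe_commit() runs, so a real daemon would also call maybe_commit()
from a periodic timer.  The trade-off is that a crash can lose up to
max_delay seconds of uncommitted writes, which seems acceptable for an
indexer that can simply re-scan the affected files.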

Finally, here's a couple of suggestions which aren't showstoppers for
me, but may be useful:

5. I noticed that trackerd says "Tracker version 0.6.1" but the package
installed is 0.6.2-0ubuntu3.

6. The SQLite log file is created in /tmp, which is volatile: it's empty
after a reboot.  This seems to defeat the purpose of a log file, which
is to be able to recover the structure of the database file, and
committed data, after a system crash.  If the log file is not there
after a crash and reboot, then the database file may have a corrupt
structure.  Or does SQLite not need the log file to ensure the database
structure after a crash?  In which case, why is it created? ;-)

Thanks for all your work so far.  It's great to see improvements have
been made in response to earlier feedback, and you've obviously put a
lot of work in.

I will always disagree, though, that "power users can just disable
tracker", especially when the UI doesn't actually do that.  Besides,
power users, and people with lots of documents and text, are surely the
people who would find tracker most useful!  Much better would be if it
worked well for everyone, and it looks like that may be the case
eventually :-)


** Attachment added: "Trace of trackerd in 100% CPU loop writing many small 
writes + fsync() to SQLite database"
   http://launchpadlibrarian.net/9196506/tracker-strace.txt

-- 
[gutsy] trackerd kills disk io
https://bugs.launchpad.net/bugs/131983
You received this bug notification because you are a member of Ubuntu
Bugs, which is the bug contact for Ubuntu.

