On Thursday, 24 October 2013 13:21:49 CEST, Mike Cardwell wrote:
Ordinarily I wouldn't have brought this to your attention, but
seeing as one of Trojitas main features is it's performance, I
thought you would want to know.

Thanks a lot for this report -- it's definitely a bug. I can reproduce it pretty easily, even though profiling this with valgrind was rather painful (it took an hour).

It looks that most of the time is actually spent in the code which turns a plaintext mail into HTML -- turning URLs in to clickable links, generating nice collapsible quotes, etc. The WebKit itself still takes two seconds here on a ~4.9MB of Lorem Ipsum, but it has to run twice due to the hackish way how the formatting is implemented.

If someone is willing to look at this and spend some time writing patches, here is what I would recommend:

- Rework the way how the Composer::Util::plainTextToHtml is called. Right now, it happens from the slot connected to the WebKit's pageLoadFinished signal, so the Webkit itself will process the contents twice.

- A good change would be to move this transformation to the QNetworkAccessManager -- the SimplePartWidget can be fed with a special URL like trojita-pretty-format://..., similar to the existing trojita-imap://msg/...; the QNetworkReply subclass used under the hood shall wait for the data to become available, call the transformation and return the result using the usual QNetworkReply semantics.

- As an added bonus, this formatting reply subclass could easily offload the actual call to Composer::Util::plainTextToHtml to an extra thread. To make this possible, the function shall be changed to not use static local variables. The number of parallel instances can be controlled by using a per-QNAM thread pool.

Please let me know if people are interested in this and/or are willing to do the work. I won't have time to get this working anytime soon :(.

As a stopgap measure, I wrote a patch which just disables this funcitonality on >100kB plaintexts, see https://git.reviewboard.kde.org/r/113421/ or the big-plaintext-workaround branch on github.

Some extra comments follow:

- I have not analyzed whether the actual plaintext -> HTML thing could be made faster. Perhaps our regexps are wrong and waste time. Note that some of the worst offenders have been improved since March in commit 880c7a4e73c647d37d7704ea4d8974a51c873848.

- The Qt5's QRegularExpression uses libpcre's engine for matching; benchmarks claim speedups in any cases, the range is rather wide (30% - 600% and these do not look pathological). Trojita shall be ported to use that on Qt5.

- The valgrind profiling showed that our resizing takes its toll; we're dynamically manipulating the QWebView so that it works "right" when embedded into another widget without scrollbars. However, timing on a real CPU via QElapsedTimer showed this to be a non-issue.

- If it takes WebKit two seconds to render a page, blocking the GUI in the process, something is very obviously fishy here. This is on Qt4 using the qtwebkit23 package. I have no idea how the stock qtwebkit from Qt 4.8 behaves. I'm definitely *not* blaming WebKit for the current case, though -- our code is the biggest problem at this point.

Thanks again for your input. As you said, performance is one of the goals of Trojita, and this shall definitely be solved.

Cheers,
Jan

--
Trojitá, a fast Qt IMAP e-mail client -- http://trojita.flaska.net/

Reply via email to