GNOME’s performance-list mailing list is very high signal at the moment, though the archives seem to have lost emails for a few days.
These test results and suggested workarounds and fixes are worth a gazillion viciously uninformed osnews or IRC discussions. Here’s a summary of what I’ve noticed recently. I can’t find the exact archive URLs for the stuff that I remember seeing, so this is probably full of errors. Please correct them in the comments so I can update it:
- Workarounds in GTK+ to use the older GDK instead of Cairo, both to test whether Cairo is responsible and as a stopgap if there are Cairo problems to be fixed. At least one of these changes seems to have a noticeable effect, at least on non-FPU hardware.
- Text drawing optimizations: There might have been a slowdown in text rendering (maybe in FreeType?), and this might even be more of the cause of the GTK+ 2.6->2.8 slowdown than the use of Cairo.
- Cairo optimizations: I’m pretty sure that I’ve seen some talk of patches optimizing small but significant parts of Cairo, but I can’t find them now.
- GTK+ subsetting. Matthias has some patches for embedded environments that cut the GTK+ code size down by about 15%, by omitting unnecessary API, saving a couple of hundred K. (gtkmm has had this for a few weeks now, offering significant code size and runtime savings.)
- Nokia’s 770 Internet Tablet versus floating point calculations: The 770 doesn’t have an FPU (floating point unit), so any use of floats leads to a big slowdown, even if it’s just passing a float through the API. This is unlikely to be responsible for a big slowdown seen on the desktop, but when they fix this on the 770 then they can start looking more at other optimisations that will help the desktop too. Note that the OLPC does have an FPU.
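To make the FPU point concrete, here is a minimal sketch of 16.16 fixed-point arithmetic in C, the usual kind of integer-only replacement for floats on hardware without an FPU. The type and function names (`fixed16_t`, `fixed_mul`, and so on) are made up for illustration. This is not taken from GTK+, Cairo, or any 770 patch, just a sketch of the general technique.

```c
#include <stdint.h>

/* 16.16 fixed point: the upper 16 bits hold the integer part and the
   lower 16 bits hold the fraction. All names here are hypothetical. */
typedef int32_t fixed16_t;

#define FIXED_ONE (1 << 16)   /* the value 1.0 in 16.16 format */

/* Convert a small integer to fixed point (overflows for |n| >= 32768). */
static fixed16_t fixed_from_int(int n)
{
    return (fixed16_t)(n * FIXED_ONE);
}

/* Multiply two fixed-point values. The 64-bit intermediate keeps the
   product from overflowing before it is scaled back down. Right-shifting
   a negative value is implementation-defined in C; this sketch assumes
   the common arithmetic shift. */
static fixed16_t fixed_mul(fixed16_t a, fixed16_t b)
{
    return (fixed16_t)(((int64_t)a * b) >> 16);
}

/* Divide two fixed-point values; b must not be zero. */
static fixed16_t fixed_div(fixed16_t a, fixed16_t b)
{
    return (fixed16_t)(((int64_t)a * FIXED_ONE) / b);
}

/* Round to the nearest integer, e.g. to get a pixel coordinate. */
static int fixed_to_int_round(fixed16_t f)
{
    return (int)((f + FIXED_ONE / 2) >> 16);
}
```

For example, scaling a 200 pixel width by 1.5 would be `fixed_mul(fixed_from_int(200), FIXED_ONE + FIXED_ONE / 2)`, which rounds back to 300 without a single floating point instruction. The catch is that a public API taking `double` parameters still forces float handling at the boundary, which is the "just passing a float through the API" cost mentioned above.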
As far as I can tell, the jury is still out on whether Cairo is the change that made GTK+ 2.8 slower than 2.6, and it’s still not entirely clear whether 2.8 is indeed slower than 2.6 on regular desktops. Note also that Cairo hasn’t had much optimization work until now so there should be lots of easy optimizations. But performance-list is figuring all this out.
You can thank OpenedHand, Carl Worth, Matthias Clasen, and the others (not me) for getting this done.
As usual, these all seem like small parts of the code that have a large impact and that can be relatively easily fixed. I continue to believe that this is almost always the only meaningful form of optimization. Optimization by design is rarely possible, other than by trying to avoid well-known problems and by generally leaving room for optimizations in future. Almost all projects whose primary aim is to be fast or small end up being unusable, unmaintainable, or incomplete, though exceptions to this rule should be possible.
Right on, brother.