Fishpool

Tag - Linux

Wednesday 21 May 2008

Stop distracting wireless LED blinking

While there's lots to like about my current laptop, one thing that had been quite annoying was the wireless indicator LED. For a long time, it didn't function at all, because iwl4965 didn't contain support for driving it. Recently (effectively starting with Fedora 9 for me), that support came in. Now the annoyance is no longer that you can't tell whether wireless is enabled, but that the LED is blinking all the time, which is a distraction.

I liked the behavior of my old laptop's ipw2200 much better: blink while searching/associating with a network, and then stay on constantly. Happily, Erich Schubert just pointed out how to fix the iwl4965 blinking behavior. That script is (I think) for Debian/Ubuntu, and a slightly different kind is needed on Fedora. I'm not sure this is the best way to go about it, but at least it works for me: put the following in /etc/NetworkManager/dispatcher.d/iwl-no-blink

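#!/bin/sh
# NetworkManager calls dispatcher scripts with the interface name as $1
# and the action (up, down, ...) as $2; $0 is only the script's own path.
if [ "$1" = "wlan0" ]; then
    # Detach the trigger from every matching LED so it stops blinking
    for dir in /sys/class/leds/iwl-phy*X; do
        echo none > "$dir/trigger"
    done
fi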
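To check whether the script had any effect, a sketch like the following can print the active trigger for each LED. It assumes the usual sysfs convention that the trigger file lists all available triggers with the active one in square brackets. The function name led_triggers and its optional directory argument are made up for illustration.

```shell
# led_triggers is a hypothetical helper, not part of any package.
# It prints "<led name>: <active trigger>" for every iwl LED found under
# the given directory (default: /sys/class/leds).
led_triggers() {
    leds_dir="${1:-/sys/class/leds}"
    for dir in "$leds_dir"/iwl-phy*; do
        # Skip the unexpanded glob when no LED matches
        [ -r "$dir/trigger" ] || continue
        # The active trigger is the entry shown in [brackets]
        active=$(sed -n 's/.*\[\([^]]*\)].*/\1/p' "$dir/trigger")
        echo "${dir##*/}: $active"
    done
}
```

Running led_triggers with no argument after the dispatcher script has fired should report none for the LEDs the script touched, if the assumption about the file format holds on your kernel.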

Monday 31 March 2008

Optimizing Linux for random I/O on hardware RAID

There's a relatively little-known feature of Linux I/O scheduling that has a pretty significant effect in large-scale database deployments, at least with MySQL, and a recent article on MySQL Performance Blog prompted me to write about it. This may have an effect on other databases and random I/O systems as well, but we've definitely seen it with MySQL 5.0 on the RHEL 4 platform. I have not studied this on RHEL 5, and since the I/O subsystem and the Completely Fair Queuing scheduler that is the default on RHEL kernels have received further tuning since, I cannot say whether the effect still exists.

Though I've heard YouTube discovered these same things, I have not yet seen a simple explanation of why this is so - so I'll take a shot at explaining it.

In short, a deployment with a RAID controller or external storage system visible to the operating system as a single block device will not reach its maximum performance under RHEL default settings. It can easily be coaxed about 20% higher on average random I/O (and significantly higher in spot benchmarks) with a single kernel parameter (elevator=noop), or with equivalent runtime tuning via /sys/block/*/queue/scheduler in RHEL 5, where you can also set this on a per-device basis.
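As a rough sketch of the runtime tuning, the two helpers below read and change the scheduler for one device through sysfs. The function names and the optional sysfs root argument are my own, for illustration only. The sketch assumes the scheduler file lists the available schedulers with the active one in square brackets.

```shell
# Hypothetical helpers; the device name and sysfs root are parameters.

# Print the scheduler currently active for a block device.
active_scheduler() {
    dev="$1"
    sysfs_block="${2:-/sys/block}"
    # The file looks like "noop anticipatory deadline [cfq]"
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$sysfs_block/$dev/queue/scheduler"
}

# Select a scheduler for a block device (needs root on a real system).
set_scheduler() {
    dev="$1"
    sched="$2"
    sysfs_block="${3:-/sys/block}"
    echo "$sched" > "$sysfs_block/$dev/queue/scheduler"
}
```

For example, assuming the RAID volume shows up as sda, "set_scheduler sda noop" followed by "active_scheduler sda" should print noop. A change made this way does not survive a reboot, unlike the elevator=noop kernel parameter.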

We first saw this in 2005 on a quad-CPU server with a RAID controller connected to 10 SCSI disks. At that time, we found that configuring the RAID to expose five RAID-1 pairs which we then striped to a single volume using LVM increased performance despite making the OS and CPU do more work on I/O. The difference in performance was about 20%.

Our most recent proof of the same effect was a quad-CPU server connected to a NetApp storage system over FC. Since it was not convenient to expose multiple volumes from the NetApp and stripe them together, we searched for other solutions. Prompted by a presentation by the YouTube engineers, we looked at the I/O scheduling options and found that a simple way to improve performance was to turn off I/O reordering by the kernel. Again, the overall difference between the settings was about 20%, though at times much greater.

The lesson is simple: reordering I/O requests multiple times provides no benefits, and reordering them too early will in fact be detrimental. Explaining why that is so is a bit involved, and is based on a few assumptions we have not bothered to verify, since the empirical results have supported our conclusions and got us where we wanted to be.

In order to keep the explanation simple, I will describe it conceptually on a very small scale. When reading this, please take this into account and understand that to measure the effect we have seen in practice, the size of the solution should be increased from what I am describing.

First, consider the case of direct-attached storage exposed to the Linux kernel as independent devices. In this configuration, the kernel maintains a per-device I/O queue, and the CFQ scheduler will reorder I/O requests to each device separately in order to maintain fair per-process balancing, low latency and high throughput. This is the configuration in which CFQ does a great job of maximizing performance, and it works fairly well with any number of spindles. As the application (a database in this case) fires random I/O, each of the spindles executes requests independently and serves them as soon as they are issued. In other words, the system is good at keeping each of the I/O queues "hot". The sustained top I/O rate is roughly linear in the number of spindles: with 15k rpm drives at about 250 ops per second each, that is about 1000 ops per second for four drives.

Now, let's introduce a hardware RAID of some sort, in particular one which is able to further reorder operations thanks to a "big" battery-backed cache. Thanks to that cache, the RAID can commit thousands of write operations per second for fairly long periods (seconds), flushing them to disk after merging. On the other hand, the kernel now sees just one device, and has one I/O queue to it. The CFQ scheduler sits in front of this queue, reordering pending I/O requests. All is fine until the I/O pressure rises to about what a single spindle can process on a sustained basis, or about 250 requests per second on those 15k drives. However, as soon as the queue starts building up, the CFQ scheduler kicks into action, and reorders the queue from random to sorted by block number (an oversimplification, but close enough).

All is good? No, it's not. The sequential blocks on that RAID volume are not truly sequential, but reside on different spindles and could thus be processed simultaneously. To demonstrate, let's assume each device in your four-spindle array has one billion sectors, or five hundred gigs, and further, that the array is striped at 64k extents, or 7.8 million stripes across each device.

In both configurations, the striping is essentially the same. Every 128 sectors or 64k is on one device, then the next one, and so on. The difference is that with LVM in place, the kernel knows this, while with the RAID, it has no idea of the layout of the array, essentially treating it as a single spindle.

Now, those couple of thousand requests that were just issued contain sequences such as writes to sectors 10, 200, 50, 300, 1020, 600, 1500 and 700. Due to the striping, four of these can be executed simultaneously, so the optimal order to issue these, of course depending on what else might be going on, is something like 10, 200, 300, 1500, 50, 700, 1020, and 600, executed through four queues: [10, 50, 600], [200, 700], [300] and [1020, 1500]. In the LVM configuration this might be what really happens. However, the single I/O queue to the RAID device will have these sorted into ascending block order, and with enough such operations in the queue, the RAID processor no longer sees enough of the queue to efficiently re-re-order them to utilize all the spindles, so only some of them are hot at any given time. TCQ should help, but in practice it won't issue enough outstanding requests to fix the problem. In our experience the top sustained rate is not more than 1.5 times one spindle, or 300-400 requests per second, while the array should really run at over 1000 ops per second thanks to the additional persistent cache on the RAID controller.
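The mapping from sector numbers to spindles in the example above can be sketched with a little shell arithmetic. This assumes plain round-robin striping over four spindles with 128-sector stripes, which is a simplification; a real RAID layout may differ. The function and variable names are invented for the illustration.

```shell
# Assumptions taken from the example: 4 spindles, 128-sector (64k) stripes,
# stripes laid out round-robin across the spindles.
SPINDLES=4
STRIPE_SECTORS=128

# Print the index (0-3) of the spindle that holds a given sector.
spindle_for() {
    echo $(( ($1 / STRIPE_SECTORS) % SPINDLES ))
}

# Print one line per spindle listing the sectors that land on it,
# in the order the requests were passed in.
queues_for() {
    s=0
    while [ "$s" -lt "$SPINDLES" ]; do
        line="spindle $s:"
        for sector in "$@"; do
            if [ "$(spindle_for "$sector")" -eq "$s" ]; then
                line="$line $sector"
            fi
        done
        echo "$line"
        s=$((s + 1))
    done
}

queues_for 10 200 50 300 1020 600 1500 700
```

With the eight sectors from the text, this prints the same four queues: 10 50 600 on spindle 0, 200 700 on spindle 1, 300 on spindle 2, and 1020 1500 on spindle 3. A sorted single queue would hand the first three requests (10, 50, 200) to just two of the spindles.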

Bottom line: CFQ is great, but only if the kernel actually knows everything about the physical layout of the media. It also looks like some of the recently introduced tuning parameters (which I know nothing about, just noted their appearance) might help avoid the worst hit. However, ultimately it doesn't matter - if your hardware allows efficient "outsourcing" of the I/O scheduling to a large secure cache, use it, and don't bother making the kernel do the job without all the information.

I hope this explanation makes sense, and that I haven't botched any important details or made wrong assumptions. Please comment if any of this is inaccurate.

PS. A tuning guide for Oracle recommends the deadline scheduler due to latency guarantees. We have not benchmarked that against noop.

Sunday 30 September 2007

Fedora 8 is looking good

Once again, I couldn't resist the urge to stay on the bleeding edge, so I went ahead and updated my home machine from F7 to F8 test 2. Encouraged by the results, I then (again) did the unthinkable and went through the same process on my laptop, which I depend on for getting stuff done. Crazy. Well, that's the way I like to play the game. And I wasn't quite THAT crazy - I didn't upgrade everything, just the parts that I was really happy about. Besides, I've set the laptop up with a whole-system snapshot LVM backup so that I can back up a day if things start to look bad.

They haven't. Apart from a few minor glitches (such as the Rawhide NetworkManager 0.7 really not being at all ready, dealt with by using the F8t2 NM 0.6.5 instead), I really like all the improvements in the usual suspects - GNOME 2.20 is a brilliant incremental update, OpenOffice 2.3 is a slight improvement on the already-improved 2.2 (but damn, are those release notes bad or what), the Power Manager is getting really good at predicting battery life, and (drumroll, please) Evolution has regained its stability! That is major. The "it seems you forgot to include an attachment you mention in the text" feature is a neat little improvement, too, but really, not having e-d-s crash on network events (such as resume in a new WLAN) is the real satisfaction-improvement for me.

One negative about F8: it doesn't include Seahorse 1.0 (as of yet, anyway), so GNOME Keyring integration was a bit lacking. That was easy enough to fix with a rebuilt package, and after switching the old pam_keyring to gnome-keyring-pam, I now have a very good package for dealing with my hundred-and-forty-seven different daily passwords, too. Well, almost -- still can't really get rid of Revelation and some manual password management, and Epiphany doesn't yet integrate with Keyring. But it's getting there, for sure.

Friday 21 September 2007

MySQL Community vs Enterprise tension

I probably don't spend quite enough time following progress around MySQL considering how critical the product is to us. I'd like to consider it part of the infrastructure in the way I treat Red Hat Enterprise Linux, i.e. something I can trust to make good progress and follow up on a quarterly basis. Naturally we have people who watch both much more closely, but my time simply should be, and pretty much is, spent doing something else.

However, it seems MySQL really demands a bit more attention right now. Today I went and read Jeremy Cole's opinion about MySQL Community (a failure), and I have to say I agree on many of the points. MySQL simply has not yet found a model that works as well as that of Red Hat's Fedora vs Enterprise Linux - that is, really giving the Community edition to the community to direct, and using the Enterprise edition as a platform for enterprises to depend on.

I feel the fundamental problem really is quite simple: as long as MySQL maintains the community edition (both binaries AND the source tree) itself, and doesn't let the community integrate features into it on a timely basis, the model will not function, not even for their paying customers (us included). However, if they reverse this particular point from the current status quo, all of the other benefits are inevitable.

The comparison to Fedora and RHEL is rather obvious, despite the distribution vs single product differences. Fedora is a great community Linux distribution with the latest-and-greatest features integrated into it in a very timely fashion. Not even Ubuntu can really compete with Fedora in terms of features. However, what Fedora gives up to reach this is a certain amount of polish and reliability. I will happily use Fedora as a personal platform, because of the latest features, but I would not pretend to run a stable system on top of it. For that, I'd rather choose something a bit more mature, that has proven itself in the community and received further QA ahead of commercial release. This is RHEL, and this is what MySQL Enterprise should be: a version that, when it's released, I shouldn't have to hesitate to install on a new production server.

I also today learned about the Dorsal Source MySQL community release. Now this looks like something MySQL Community release probably should be like. I'll have to give it a test round and see what's up.

Update: Baron Schwartz describes a MySQL Enterprise that I would have far less trouble using than the existing one.

Wednesday 29 August 2007

Working 3D on the 965GM

I took a second (third, whatever) look at how to get 3D acceleration enabled with the TravelMate, and finally found the clue to avoiding a display lockup the moment an OpenGL application was started.

Fedora 7 will not support it as-is. You'll need at least kernel 2.6.22.1 (2.6.22.4 is now in updates) and Mesa 6.5.3. I found it easiest to install Richard Hughes' "Utopia" builds of mesa-libGL and libdrm and a rebuilt fc8 xorg-x11-drv-i810. With these three packages, DRI can now be enabled and the machine is stable. Performance isn't stellar, but it's plenty enough to enjoy compiz and a slightly blinged up desktop, which is essentially what I was looking for, anyway. Ready-made binary attached. Remember, you need to update the kernel and drm bits too with the linked stuff.

Tuesday 21 August 2007

Acer Crystal Eye and GStreamer

The Crystal Eye webcam in new Acer laptops, my TravelMate 6292 included, works with the linux-uvc driver, as I noted before. To use it in GStreamer applications, you need to have the v4l2src component, which recently moved from the gstreamer-plugins-bad collection to gstreamer-plugins-good. In Fedora 7, you must have g-p-g version 0.10.6, which was just released to updates-testing (in a few days in updates, I would expect).

If you don't want to build linux-uvc yourself (it's very easy), you may want to enable the drpixel yum repo that has it pre-built for Fedora kernels.

rpm -ivh http://download.tuxfamily.org/rpm/drpixel/fedora/7/i386/repodata/repoview/drpixel-release-0-1-2.html
yum --enablerepo=updates-testing --enablerepo=drpixel install gstreamer-plugins-good kmod-uvc

To test it, run:

gst-launch v4l2src queue-size=2 !  ffmpegcolorspace ! ximagesink

Wednesday 1 August 2007

Sound on Acer TravelMate 6292 under Linux

I know I said I'd wait until the end of my vacation to tinker with audio on this laptop, but I couldn't help it -- I wanted to watch DVDs, and movies without sound aren't all that great an experience. So, I had to dig in and see what the solution is.

Not all that easy, it turns out. Fedora 7's latest update kernel still has no support for the Realtek ALC268 sound codec, despite supporting a number of other codecs in Santa Rosa-based laptops. The latest development version of ALSA does have support for a couple of laptops with the 268 chip, but not the TM 6292. Another patch does exist that gets closer, and I made a version on top of that one that provides rudimentary support.

That is, the speakers work now, and so does the headphone jack. However, plugging in the headphones doesn't mute the speakers, and there is only one volume control for both of them. Actually, there are three (called Headphone, PCM, and Front), but only two of them do anything, and they do the same thing (control the volume of both speakers and headphones). Microphone input doesn't work at all. However, all those details are way beyond what I want to know about audio hardware control, and I'm satisfied enough to simply get some sound out of the machine for now. Some other enterprising soul may fill in the blanks.

Patch filed at ALSA's bug tracker. If you're using the 2.6.22.1-33.fc7 kernel (the latest update Fedora 7 kernel as of this moment), you can download a replacement snd-hda-intel.ko kernel module that should enable sound for this machine. Install with

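# Remove the stock module and put the replacement where depmod will find it
rm /lib/modules/2.6.22.1-33.fc7/kernel/sound/pci/hda/snd-hda-intel.ko
cp snd-hda-intel.ko /lib/modules/2.6.22.1-33.fc7/extra/
depmod -ae
# Stop anything that holds the sound devices open, then reload the driver
kill $(lsof -t /dev/snd/*)
modprobe -r snd-hda-intel
modprobe snd-hda-intel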
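Since the replacement module only fits that exact kernel build, it may be worth checking the running kernel before touching anything. The guard below is just a sketch; the function name kernel_matches and the MODULE_KERNEL variable are my own.

```shell
# Hypothetical guard: compare the running kernel with the version the
# replacement module was built for.
MODULE_KERNEL="2.6.22.1-33.fc7"

# Succeeds only when the running kernel release equals the argument.
kernel_matches() {
    [ "$(uname -r)" = "$1" ]
}

if kernel_matches "$MODULE_KERNEL"; then
    echo "kernel matches, the module should load"
else
    echo "running $(uname -r), but the module was built for $MODULE_KERNEL" >&2
fi
```

If the check fails, the install commands above would put a module into a directory for a kernel that is not running, and modprobe would not pick it up.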

Wednesday 18 July 2007

Acer TravelMate 6292 and Fedora 7 Linux

As I mentioned in my previous note, my previous laptop destroyed its fan last week. Since it had started to show its age in other respects as well and was deemed not worth repairing, I got a new one yesterday -- an Acer TravelMate 6292. This is a Core 2 Duo / Santa Rosa chipset based model, with some pretty cutting-edge technology inside. I'll write down the details later when typing is easier, but for anyone who might be considering one to use with Linux: yes, it does work, quite well in fact, but a bit of tweaking is required due to its very new components.

  • Fedora 7 LiveCD didn't like to boot, possibly due to a missing driver (it didn't like my previous laptop's external Firewire CD drive either). It might be possible to work around by changing BIOS settings, but I borrowed a USB CD drive instead.

  • Otherwise, the LiveCD install experience (including resizing and moving the Windows partition out of the way) was a very smooth one. I hadn't done this before, and was positively surprised. I'm certain Microsoft hasn't made their install this smooth, and I doubt Apple has, either. Much recommended, if you're even a little bit curious.

  • Network-based update post-install no problem using a wired network. All in all, the install took about 1 hour to move Windows partition, 20 minutes to install Fedora, and 30 minutes for it to load updates afterwards (this was surprisingly slow for some reason).

  • Wireless (Intel Wireless 4965 A/G/N adapter) driver (iwlwifi) was preinstalled, but the required firmware wasn't (the package only included firmware for the previous model, 3945). No problem, just install iwlwifi-4965-ucode from ATrpms.

  • Things which worked without any effort at all: battery monitoring, CPU frequency control, temperature monitoring, wired Ethernet, Bluetooth, docking station, and many other things I take for granted. In fact, the machine was entirely functional save for the missing wireless adapter microcode straight off the LiveCD, and all that I did for it was to improve performance past the "functional" stage.

  • Display was a bit fuzzy, and 3D acceleration didn't work. This was because the preinstalled Xorg Intel driver v 2.0 includes only basic support for GMA X3100. Both problems disappear by installing a new kernel (for updated 3D/DRI driver) and Xorg 1.3.0/Intel 2.1.0 (for 2D etc), i.e. by running this command as root:
    yum --enablerepo=updates-testing update kernel\* xorg-x11-drv-i810 xorg-x11-server-Xorg
    
  • Both suspend-to-ram (S3) and hibernate-to-disk work fine, once the usb drivers are forced out of the kernel prior to suspend. Create /etc/pm/config.d/unload_modules with one line:

    SUSPEND_MODULES="ehci_hcd ohci_hcd uhci_hcd"
    
  • Update: The Crystal Eye webcam (USB ID 064e:a101) works using the linux-uvc driver, which needs to be installed from source (download, extract, make, make install). Make sure you configure each application to use V4L2 instead of the old V4L API. For example with Ekiga, choose V4L2 instead of V4L in the configuration druid or in the Video Devices Preferences.
  • Something still to do about audio, apparently common to many Santa Rosa laptops and the ALSA Intel HD Audio driver, at least ones which use a Realtek codec. Notes from Ubuntu might guide you along - me, I'll try again after my vacations. Perhaps someone else will bother to fix this one. :) Update: a modified driver now provides basic sound output.

  • Haven't tried to use the fingerprint reader (USB ID 147e:2016) yet, the biometrics libraries required look a bit overwhelming to install.

Monday 18 June 2007

Update on Fedora 7

A few weeks ago I mentioned having upgraded to Fedora 7, and linked to a couple of bugs that were bothering me. No longer; as far as I can tell, my laptop is now stabler than it has ever been, plus way more functional. It's like I got a new computer altogether ;) In particular, the crash when enabling an external screen was just fixed yesterday by Keith Packard. Thanks, Keith!

Tuesday 12 June 2007

Font rendering Mac OS X vs Windows vs Linux

A topic I've run tests on before (a long time ago) has apparently come back with Apple's Safari for Windows release. Jeff Atwood finds Mac OS X fonts wonky - well, they're certainly soft. Apple has never been too keen on strong hinting, perhaps because it messes up inter-glyph metrics in favor of contrast. Windows is wonky in its own way - ClearType has good contrast, but letter spacing is sometimes a bit annoying.

Just for kicks, here's what Fedora 7 with FreeType autohinting and subpixel rendering (equivalent of ClearType) looks like. This is just one of the modes, but the one I personally prefer:


Comparing at 200% rendering to Jeff's examples; Safari/Windows, IE7/Windows, and Firefox/Fedora

I guess it's up to everyone's preferences, but I think Fedora wins this one. Spacing isn't perfect here either (in particular "b est" looks a bit ugly), but overall, contrast is excellent and paragraph spacing is very close to what it would be without hinting.
