Monday, November 15, 2010

Pushing a graphics card into a VM, part 5

Part 1 Part 2 Part 3 Part 4 Part 5

So here's the final thumbnail summary:


  • Video card #1: ATI 5750 (the 5770 should work too and is slightly faster, but the 5750 was the one on the Xen compatibility list).
  • Video card #2: nVidia Corporation VGA G98 [GeForce 8400 GS] *PCI* card (BIOS set to use PCI as first card)
  • Intel(r) Desktop Board DX58SO -- not a great motherboard, but it was available at Fry's and was on the Xen VT-d compatibility list
  • Intel Core i7-950 processor
  • 12GB of Crucial 3x4GB DDR3 RAM
  • Hard drives: 2 Hitachi 7200 rpm SATA 2TB drives, configured as RAID1 via Linux software RAID.
  • Antec Two Hundred V2 gamer case to handle swapping OS's via the front 2 1/2" drive port.
  • Various 2 1/2" drives to hold the Linux OS's that I was experimenting with
  • OpenSUSE 11.3, *stock*.
  • Windows 7, *stock*.
By adding the PCI card, my Linux console remains my Linux console, and Xen properly starts up my Windows DomU. My configuration is now complete. I may extend my Windows LVM volume to 200G so I can install more games on it, but note that all of my personal files, ISO's, etc. live on Linux.

Note that 5.9 is what the Windows Experience Index should be for that particular hard drive combo, so this Windows system is as good as most mid-range gaming systems performance-wise. I added the paravirtualization drivers for the networking and disk controller, but they didn't improve performance any -- all they did was reduce how much CPU the dom0 qemu was expending implementing virtual disk and network controllers. Given that I have a surplus of CPU on this system (8 threads, 3.2GHz), it's in retrospect no surprise that going paravirtual gained me nothing on disk and network performance -- all it did was free up more CPU for things like, say, video encoding.
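Extending the volume later should be a two-step affair: grow the logical volume, then grow the partition from inside Windows. A sketch, using the volume names from part 3 (do this with the DomU shut down):

  • lvextend -L 200G /dev/virtgroup/win7
Then extend the C: partition via Windows Disk Management on the next boot.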

Thoughts and conclusions:

One thing that was very clear through this entire process is that I'm very much pushing beyond the state of the art here. The software and hardware configurations needed for this to work were very twiddly -- there is exactly one (1) Linux distribution (OpenSUSE 11.3) which will do it at this point in time, and there were no GUI tools for OpenSUSE 11.3 which would create a Xen virtual machine with the proper PCI devices. Furthermore, the experimental Xen 4.0.1 software on OpenSUSE is almost entirely undocumented -- or, rather, it has man pages provided with it, but the man pages document an earlier version of Xen which is significantly different from what's actually shipped with OpenSUSE 11.3.

From a general virtualization perspective, comparing Xen, KVM, and ESXi, Xen currently wins on capabilities but only by a hair, and those capabilities are almost totally undocumented -- or worse yet, don't work the way the documentation says they work. Xen's only fundamental technological advantage over KVM and ESXi right now is its ability to run paravirtualized Linux distributions without needing the VT-x and VT-d extensions -- a capability which is important for ISP's with tens of thousands of older servers without these extensions, but becoming increasingly less important as VT-x is now everywhere except in the low-end Atom processors. Comparing my Xen installation at home with my KVM installation at work, both of which I have now used extensively and pushed to their limits, I can see why Red Hat is pushing the KVM merger of hypervisor and operating system. KVM gives you significantly greater ability to monitor the overall performance of your system, whereas Xen's 'xm top' is a poor substitute for detailed system-wide monitoring. KVM is also significantly better at resource management, since the same resource manager handles everything (core hypervisor/dom0 plus VM's), and the Linux scheduler can consider everything when deciding what to schedule, rather than having the Xen hypervisor out in the background deciding which Xen domain to schedule next based upon very little information.

In short, my general conclusion is that KVM is the future of Linux virtualization. Unfortunately my experience with both KVM and Xen 4.0 is that both are somewhat immature compared to VMware's ESX and ESXi products. They are difficult to manage, their documentation is persistently out of date and often incorrect, and both have a bad tendency to crash cryptically when doing things that they're supposed to be able to do. Their core functionality works well -- I've been running Internet services on Xen domains for over five years now and for that problem domain it is bullet-proof, while at work I am developing for several different variants of Linux using KVM virtual machines on Fedora 14, as well as running a Windows VM to handle the vSphere management tools, and that's been bullet-proof too. But they decidedly are not as polished as VMware at this point. The one exception is Citrix's XenServer, which is polished but lacks the PCI passthrough capability of ESXi and thus was not useful for the projects I was considering.

My take on this, however, is that VMware's time at the head of the virtualization pack is going to be short. There isn't much more that they can add to their platform that the KVM and Xen people aren't already working on. Indeed, the graphics passthrough capability of Xen is already beyond where VMware is. At some point VMware is going to find themselves in the same position vs. open source virtualization that SGI and Sun found themselves in vs. open source POSIX. You'll note that SGI and Sun no longer exist as independent companies...


Sunday, November 14, 2010

Pushing a graphics card into a VM, part 4

Part 1 Part 2 Part 3 Part 4 Part 5

Okay, so virt-manager did pick up my new VM once I created it with xm create on a config file, but when I rebooted the system the VM was gone. So how can I fix this? Well, by taking advantage of functionality that OpenSUSE has had for auto-starting Xen virtual machines all along: just move my config file into /etc/xen/auto and it'll auto-start (and auto-shutdown, if I have the Xen tools installed) at system boot.
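For the record, that amounts to something like the following (the source path is wherever your config file currently lives; mine here is illustrative):

  • mv /etc/xen/vm/win7 /etc/xen/auto/win7
  • chkconfig xendomains on
The 'xendomains' init script, which ships with the Xen tools, is what walks /etc/xen/auto at boot and shutdown.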

Of course, that requires a config file. Rather than paste it here, I'll let you view the config file as a text file. Note that 'gfx_passthru=1' is commented out. The Xen documentation says I need it, but if I put it there, my VM doesn't start up -- it crashes into the QEMU monitor. I also ran into a timing issue: pciback grabs the console away from Linux and leaves the video card half-initialized, so when Xen then shoves the card into the VM, it locks up the system solid the moment Windows tries to write to it. My solution was even simpler -- put the older of the nVidia cards back into the system, and load the 'nouveau' driver using YaST's System > Kernel > INITRD_MODULES and System > Kernel > MODULES_LOADED_ON_BOOT functionality. This flips the console away from the ATI card early enough that it doesn't conflict with Xen giving that card to Windows. It also gives me a Linux console on the nVidia card that I can switch to by plugging a second keyboard into the front USB port on my chassis (on the controller I did *not* push into the Windows VM) and flipping my monitor to its DVI input (rather than the HDMI coming from the ATI card).
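In case that link goes stale, here's the general shape of such a config file -- a sketch, not the exact file; the PCI addresses and LVM volume are the ones used elsewhere in this series, and the paths, memory, and CPU counts are illustrative:

  name = "win7"
  builder = "hvm"
  kernel = "/usr/lib/xen/boot/hvmloader"
  device_model = "/usr/lib/xen/bin/qemu-dm"
  memory = 4096
  vcpus = 4
  disk = [ 'phy:/dev/virtgroup/win7,hda,w' ]
  vif = [ 'type=ioemu,bridge=br0' ]
  pci = [ '02:00.0', '02:00.1', '00:1a.0', '00:1a.1', '00:1a.2', '00:1a.7', '00:1b.0' ]
  boot = "c"
  # gfx_passthru=1 -- the docs say you need this, but with it set the VM crashes into the QEMU monitor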

With all of this done, I can now reboot my system and get Windows on video card 0, and Linux on video card 1. I suppose I could reverse the video cards (to give the boot video card to Linux), but unfortunately my board puts the second 16-lane PCIe slot too close to the bottom of the case, and a double-width PCIe card won't fit there. Maybe when I upgrade to one of those spiffy SuperMicro server motherboards with IPMI and such, at which point I won't need a second video card anyhow because the on-board video will suffice for Linux...

Next up in Part 5: Thoughts and conclusions.

Saturday, November 13, 2010

Pushing a graphics card into a VM, Part 3

Part 1 Part 2 Part 3 Part 4 Part 5

OpenSUSE 11.3 was a quite easy install. I haven't used SUSE since the early 'oughts, but first impressions were pretty good. OpenSUSE 11.3 is KDE-based, which is a change from the other distributions I've been using for the past few years -- Ubuntu on my server at home, Debian on my web and email server, and various Red Hat derivations at work -- and seems to be pretty well put together. The latest incarnation of YaST makes it more easily manageable from the command line over a slow network connection than the latest Ubuntu or Red Hat, which rely on GUI tools at the desktop. The biggest difference between Red Hat and SUSE was that SUSE uses a different package dependency manager, "zypper", which is roughly equivalent to Red Hat's "yum" and Debian's "apt-get" but with its own quirks. It appears to be slightly faster than "yum" and roughly the same speed as "apt-get". If you wonder why SUSE/Novell wrote "zypper": at the time they wrote it, "yum" was excruciatingly slow and utterly unusable unless you had the patience of Job. Red Hat has sped up "yum" significantly since that time, but SUSE has stuck with "zypper" nevertheless. I also set up the bridging VLAN configuration that I mention in my previous post about how to do it on Fedora. Again SUSE has slightly different syntax than Red Hat for how to do this in /etc/sysconfig/network/* (note *not* network-scripts), but again it was fairly easy to figure out via reading the ifup / ifdown scripts and consulting SUSE's documentation.
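For the curious, the SUSE flavor of that configuration looks roughly like the following -- a sketch assuming a VLAN 4000 on eth1 with a bridge on top, as in my Fedora writeup; double-check the option names against SUSE's documentation before trusting any of it:

  /etc/sysconfig/network/ifcfg-vlan4000:
    STARTMODE='auto'
    ETHERDEVICE='eth1'

  /etc/sysconfig/network/ifcfg-br1:
    STARTMODE='auto'
    BOOTPROTO='static'
    BRIDGE='yes'
    BRIDGE_PORTS='vlan4000'
    IPADDR='192.168.22.1/24'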

So anyhow, I installed the virtualization environment via YaST and rebooted into the Xen kernel it downloaded. At that point I created a "win7" LVM volume in LVM volume group "virtgroup" on my 2TB RAID array, went into the Red Hat "virt-manager" and attached to my Xen host, then told it to use that LVM volume as the "C" drive and installed Windows 7 on it. I'm using an LVM volume because at work with KVM, I find that this gives significantly better disk I/O performance in my virtual machine than pointing the virtual disk drive at a file on a filesystem. Since both Xen and KVM use QEMU to provide the virtual disk drive to the VM, I figured that the same issue would apply to Xen, and adopted the same solution that I adopted at work -- just point it at an LVM volume, already. (More on that later, maybe.)
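The LVM part is a one-liner; a sketch with my volume group name (pick your own size):

  • lvcreate -L 100G -n win7 virtgroup
virt-manager (or the config file) can then point at /dev/virtgroup/win7 as the VM's disk.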

Okay, so now I have Windows 7 64-bit installed and running. I shut it down and went to attach PCI devices to it via virt-manager and... err. No. Virt-manager wouldn't do it. Red Hat strikes again: it claims that Xen can't do PCI passthrough! So I went back to the handy Xen Wiki and started figuring out, via trial and error, how to use the "xm" command line -- where the man page for xm doesn't in any way reflect the actual behavior of the program that you see when you type 'xm help'. So here we go...

First, claw back the physical devices you're going to use via booting with them attached to pciback. So my module line for the xen.gz kernel in /boot/grub/menu.lst looks like...

module /vmlinuz- root=/dev/disk/by-id/ata-WDC_WD5000BEVT-22ZAT0_WD-WXN109SE2104-part2 resume=/dev/datagroup/swapvol splash=silent showopts pciback.hide=(02:00.0)(02:00.1)(00:1a.0)(00:1a.1)(00:1a.2)(00:1a.7)(00:1b.0)

Note that while XenSource has renamed pciback to 'xen-pciback', OpenSUSE renames it back to 'pciback' for backward compatibility with older versions of Xen. So anyhow, on my system, this hides the ATI card and its sound card component, plus the USB controller to which the mouse and keyboard are attached. I leave the other USB controller attached to Linux. I did not have any luck pushing USB devices directly to the VM; I had to push the entire controller instead. Apparently the Xen version of QEMU shipped with OpenSUSE 11.3 doesn't implement USB device passthrough (or else I simply need to read the source). Note that you want to make sure your system boots *without* the pciback.hide before you boot *with* it, because once the kernel starts booting and sees those lines, your keyboard, mouse, and video go away!
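If you're wondering where those bus addresses come from: lspci. Something like this (the grep pattern is illustrative) will show the video card and USB controllers along with their bus:device.function addresses:

  • lspci | grep -iE 'vga|audio|usb'
The address at the start of each line is what goes into pciback.hide; prefix it with the '0000:' PCI domain when you later feed it to 'xm pci-attach'.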

Okay, so now I'm booted. I ssh into the system via the network port (err, make sure that's set up before you boot with the pciback.hide too!) and go into virt-manager (via X11 displaying back to my Macbook, again make sure you have some way of displaying X11 remotely before you start this) and start up the VM. At that point I can do:

  • xm list
and see my domain running, as well as log into Windows via virt-manager. So next, I attach my devices...

  • xm pci-attach win7 0000:02:00.0
  • xm pci-attach win7 0000:02:00.1
  • xm pci-attach win7 0000:00:1a.0
  • xm pci-attach win7 0000:00:1a.1
  • xm pci-attach win7 0000:00:1a.2
  • xm pci-attach win7 0000:00:1a.7
  • xm pci-attach win7 0000:00:1b.0
Windows detects the devices, loads drivers, and prompts me to reboot to activate. So I tell Windows to reboot, and it comes back up, but nothing's showing up on my real (as opposed to virtual) video screen. Then I go into Device Manager in Windows to see what happened. The two USB devices (keyboard and mouse) show up just fine. But the ATI video card shows up with an error. I look at what Windows tells me about the video card, and Windows tells me that there is a resource conflict with another video card -- the virtual video card provided by QEMU. So I disable the QEMU video card, reboot and... SUCCESS! I now have Windows 7 on my main console with video and keyboard and mouse!

Windows Experience reports:

  • Calculations per second: 7.6
  • Memory: 7.8
  • Graphics: 7.2
  • Gaming graphics: 7.2
  • Primary hard disk: 5.9
Those are pretty good, quite sufficient for gaming, except for the disk performance, which is mediocre because we're going through the QEMU-emulated hard drive adapter rather than a paravirtualized adapter. When doing network I/O to download Civilization V via Steam I also noticed mediocre performance (and high CPU utilization on the dom0 host) for the same reason. We'll fix that later. But for playing games, we're set! Civilization V looks great on a modern video card on a 1080p monitor with a fast CPU!

Okay, so now I have a one-off boot, but I want this to come up into Windows every time my server boots. I don't want to have to muck around with a remote shell and such every time I want to play Windows games on my vastly over-powered Linux server (let's face it, a Core i7-950 with 12GB of memory is somewhat undertasked pushing out AFS shares to a couple of laptops). And that, friends, is where part 4 comes in. But we'll talk about that tomorrow.


Pushing a video card into a VM, Part 2

Part 1 Part 2 Part 3 Part 4 Part 5

The first issue I ran into was that my hardware was inadequate to the task. My old Core-2 Duo setup lacked VT-d support. So I went to the Xen compatibility list and found a motherboard which supported VT-d and upgraded my motherboard. At the same time I also upgraded my case to an Antec case that has a slot on the front for plugging in 2 1/2 inch drives. This was to make it easier to swap operating systems. Theoretically you can hot-swap, but I've not tested that and don't plan to.

Since I am inherently a lazy penguin (much like Larry Wall), the next thing I did was try the "virtualization environments". I found that XenServer was a very well designed environment for virtualizing systems in the cloud. Unfortunately it was also running a release 3.x version of Xen rather than the new 4.0 release, and did not implement PCI passthrough or USB passthrough to fully virtualized VM's natively. There were hacks you could do, but once you start doing hacks, the XenServer environment is not really a nice place to do them. So I moved on.

ProxMox VE is a somewhat oversimplified front-end to KVM and OpenVZ. It looks like a nice environment for running a web farm via web browser, but unfortunately it does not support PCI passthrough natively either. Again, you can start hacking on it, but again once you start doing that you might as well go to a non-dedicated environment.

Ubuntu 10.10 with KVM was my next bet. I *almost* got it running, but the VM wouldn't attach the graphics card. It turns out that was another issue altogether, but looking at the versions of QEMU and KVM provided, it appeared that Fedora 14 had one version newer (as you'd expect, since Fedora 14 came out almost a month later), so I went to Fedora 14 instead.

I got close -- really close -- with Fedora 14. But two different video cards -- an old nVidia 7800GT and a new nVidia GTS450 -- both ended up with error messages in the libvirtd logs saying there was an interrupt conflict that prevented attaching the PCI device. I ranted to a co-worker, "I thought MSI was supposed to solve that!" So I looked at enabling MSI on these nVidia cards and found out that... err... no. Not a good idea; even if I wanted to, the cards generally crashed things hard if you tried. So I went back to the wiki on VGA passthrough again and followed the link to the list of video cards, and... err, okay. An ATI Radeon 5750 has been reported as working with Xen's VGA passthrough.

So, I swapped that out, and tried again with Fedora 14. This time the KVM module crashed with a kernel oops.

At this point I'm thinking, otay, KVM doesn't seem to want to do this. Xen, on the other hand, has a Wiki and all documenting how to do this. So let's use Xen instead of KVM. The problem is that Xen is effectively an operating system: it relies on a special paravirtualized kernel for its "dom0" that handles the actual I/O driver work. Red Hat claims providing such a kernel would be too much work and that they won't do it until the dom0 patches are rolled into upstream by Linus. This despite the fact that Red Hat has patched their kernels to the point where Linus would barely recognize them if someone plunked the source onto his disk -- but it's that whole Not Invented Here thingy again: Red Hat invented KVM and was looking for an excuse not to ship a Xen dom0 kernel, and there you go. I looked at downloading a dom0 kernel for Fedora 14, but then... hmm. Look. OpenSUSE 11.3 *comes* with a Xen dom0 kernel. So let's just install OpenSUSE 11.3.

OpenSUSE 11.3 is what I eventually had success with. But to do that, I ended up having to fight Red Hat -- again. But more on that in Part 3.


Pushing a graphics card into a Xen VM, Part 1

Part 1 Part 2 Part 3 Part 4 Part 5

One of the eternal bummers for Linux fanboys is the paucity of games for Linux. This is, in part, because Linux is not an operating system; Linux is a toolkit for building operating systems -- and each operating system built with the Linux toolkit is different, but all of them claim to be "Linux". From a game designer's perspective there is no such thing as "Linux" -- each of the variants puts files in different places, each of the variants has a different way of configuring X11, and so forth.

And speaking of X11, that's another issue. Mark Shuttleworth got a lot of heat for saying that desktop Linux was never going to be competitive as long as it was saddled with the decades of fail that are X11, when he proposed moving Ubuntu Linux to Wayland. But the only Unix variant that has ever gotten any traction on the desktop -- Mac OS X -- did so by abandoning X11 and going to its own lighter-weight GUI library that forced a common interface upon all programs that ran on the platform (except for ported X11 programs, which were made deliberately ugly by the Mac OS X11 server that ran on top of the native UI, in order to encourage people to port them to the native UI). Linux fanboys might talk about how OpenGL over X11 isn't theoretically incapable of handling gaming demands, etc. etc., but the proof is in the pudding -- if it's so easy, why isn't anybody doing it?

So anyhow, one of the interesting things about the Intransa Video Appliance is that it looks like Windows if you sit down at the console... but behind the scenes, it's actually VMware on top of a Linux-based storage system. So why not, I wondered, just push the entire video subsystem into Windows via VT-d? I mean, it's not as if Linux user interfaces run any slower remotely displayed over VNC than they do locally, they're pretty light-weight by modern standards. So if you could push the display, keyboard, and mouse into a Windows virtual machine that was started up pretty much as soon as enough of Linux was up and going to support it, you could have a decently fast gaming machine, *and* have a good Linux development and virtualization server -- all on the same box.

So I assembled the bleeding edge of Linux -- Ubuntu 10.10, Fedora 14, Citrix XenServer 5.6.0, ProxMox VE version 1.6, and OpenSUSE 11.3 -- and set to work seeing what I could do with them...

Next up: Part 2: The distributions.


Tuesday, November 2, 2010

Microsoft in a nutshell

It's no secret that Microsoft is a company in trouble. At one time they had a significant portion of the smartphone market; now they're an also-ran with single-digit market share. Their attempt to buy consumer marketshare in the gaming console market has generated some marketshare, but also significant losses. Their Kin phone experiment lasted only two months before ignoble abandonment. The only things they have that make money right now are their core Windows and Office franchises -- the entire rest of the company is one big black hole of suck, either technologically, financially, or both. And while their market share in desktop operating systems is secure for the foreseeable future, with no viable competitor anywhere in sight (don't even mention Linux unless you want to cause gales of laughter, Linux on the desktop is a mess), Office faces a threat from OpenOffice. Plus, their very profitable Windows Server franchise, which accounts for a small percentage of their unit sales but a large percentage of their revenue, is steadily eroding as it becomes clear to almost everyone who isn't tied to Microsoft Exchange that Linux rules the world. Amazon EC2 runs on Linux, not Windows -- as does every other cloud play on the Internet. 'Nuff said.

Today something happened which epitomized this suck. I opened up an email in Microsoft Hotmail. At the top of the email, in red, was the following message: "This message looks very suspicious to our SmartScreen filters, so we've blocked attachments, pictures, and links for your safety."

The title of the email: "TechNet Subscriber News for November".
The sender of the email:

Siiiiigh... even their own spam filter thinks they suck.


Thursday, October 28, 2010

Setting up bridging and vlans in Fedora 13

In the previous half of this, I talked about how Fedora's NetworkManager interfered with complex configurations, and discussed how to disable it. Now I'll show you how to define a network that consists of:
  • Two Ethernet ports, bridged:
    • eth0 - to public network
    • eth1 - to internal network transparently bridged to public network
  • VLAN 4000 on internal network, *NOT* bridged to public network
  • Bridge to VLAN 4000 for my KVM virtual machines to connect to, *NOT* bridged to public network.
Now, to do all of this we take advantage of an interesting feature of the Linux networking stack -- bridges aren't "really" bridges. If a packet arrives at the physical eth1 hardware, it gets dispatched to either eth1.4000 or eth1 based upon the VLAN tag. Only those packets that are dispatched to eth1 actually make it onto the bridge to go to eth0. In other words, the Linux bridging code is a *logical* bridge, not a *physical* bridge -- it is not the physical ports that are connected, it is the logical Ethernet devices inside the Linux networking stack that are connected, and eth1.4000 and eth1 just happen to be connected to the same physical port but otherwise are logically distinct with the dispatching to the logical Ethernet device happening based upon the VLAN header (or lack thereof).
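The runtime equivalent of all this, using the old-school tools, is a useful way to see the structure (a sketch; the persistent configuration in network-scripts is what survives a reboot):

  • vconfig add eth1 4000
  • brctl addbr br0
  • brctl addif br0 eth0
  • brctl addif br0 eth1
  • brctl addbr br1
  • brctl addif br1 eth1.4000
  • ifconfig eth1.4000 up ; ifconfig br1 192.168.22.1 netmask 255.255.255.0 up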

So, here we go:
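(The config files themselves seem to have gone missing here; what follows is a reconstruction of their gist from the description below -- the address and device names are illustrative.)

  /etc/sysconfig/network-scripts/ifcfg-eth0:
    DEVICE=eth0
    ONBOOT=yes
    BRIDGE=br0

  /etc/sysconfig/network-scripts/ifcfg-eth1:
    DEVICE=eth1
    ONBOOT=yes
    BRIDGE=br0

  /etc/sysconfig/network-scripts/ifcfg-eth1.4000:
    DEVICE=eth1.4000
    VLAN=yes
    ONBOOT=yes
    BRIDGE=br1

  /etc/sysconfig/network-scripts/ifcfg-br0:
    DEVICE=br0
    TYPE=Bridge
    ONBOOT=yes
    BOOTPROTO=dhcp

  /etc/sysconfig/network-scripts/ifcfg-br1:
    DEVICE=br1
    TYPE=Bridge
    ONBOOT=yes
    BOOTPROTO=static
    IPADDR=192.168.22.1
    NETMASK=255.255.255.0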
And there you are. Two bridges, one of which has a single port (eth1.4000) for virtual machines to attach to, and one of which bridges eth0 and eth1 so that the machine plugged into eth1 on this cluster can also make it out to the outside world (via that bridge), while the VM's on both systems communicate with each other (via the bridge attached to eth1.4000 on the 192.168.22.x network). Internal VM cluster communications stay internal, either within br1 or on VLAN 4000, which never gets routed to the outside world (we'd need an eth0.4000 to bridge it to the outside world -- but we're not going to do that). This does introduce a single point of failure for the cluster -- the cluster controller -- but it's a manageable one: if the cluster controller dies and we need to talk to the outside world, we can simply plug the red interconnect cable into a smart switch that blocks VLAN 4000 instead of into the cluster controller.

Now, there are other things that can be done here. Perhaps br1 could be given an IP alias that migrates to another cluster controller if the current cluster controller goes down. Perhaps we have multiple gigabit Ethernet ports that we want to bond together into a high-speed bond device. There are all sorts of possibilities allowed by Red Hat's "old school" /etc/sysconfig/network-scripts system, and it won't stop you. The same, alas, cannot be said of the new "better" NetworkManager system, which simply throws up its hands in disgust if you ask it to do anything more complicated than a single network port attached to a single network.


Sunday, October 17, 2010

Opaque Linux

One of the things that is starting to annoy me is the increasing cluttering of Linux with opaque subsystems that have annoying bugs that are difficult to diagnose. Some are easy enough to work around -- udev's persistent-net rule, for example, might leave replicated virtual machines with no working network devices, but it's easy enough to simply remove the file (and fix the /etc/sysconfig/network-scripts files to remove any hardwired MAC address there too, of course, since using ovftool to push a virtual machine to ESX automatically gives it a new MAC address). But others -- like the NetworkManager subsystem that I mentioned last week -- either work or don't work for you. They're opaque black boxes that are pretty much impossible to work around, other than by completely disabling them.
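Concretely, the udev workaround for a cloned VM boils down to something like this (Fedora/RHEL-era paths; adjust the interface name to suit):

  • rm /etc/udev/rules.d/70-persistent-net.rules
  • sed -i '/^HWADDR/d' /etc/sysconfig/network-scripts/ifcfg-eth0
On the next boot udev regenerates the rule for the new MAC address, and the ifcfg file no longer pins the old one.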

I just finished rebuilding my system with the latest goodies to do virtualization -- I now have a quad-core processor with 12 gigs of memory and VT-d support. As part of that I just upgraded to Ubuntu 10.10, their latest and greatest. I've been using Fedora 13, Red Hat's latest and greatest, at work for the past month. My overall conclusion is... erm. Well. Fedora has its issues, but Ubuntu is getting to the point of utter opaqueness. For example, Ubuntu has a grand new grub system that generates elaborate boot menus. The only problem: said elaborate boot menus are *wrong* for my system, they all say (hd2) where they should say (hd0). And the system that generates these elaborate boot menus is entirely opaque... though at least I can go into the very elaborate grub.cfg file and manually edit it. Well, unless a software update has happened, at which point the elaborate boot menu generator runs again and whacks all your (hd0)'s back to (hd2)'s -- even in entries that are old. Red Hat's grubby might be old and creaky, but at least it's never done *that* kind of silliness.

Now, granted, I am running rather unusual hardware in a far different configuration from what a desktop Linux user would want. If you want a desktop Linux system, I still believe Ubuntu is the best Linux you can put onto your system; I've run Ubuntu on the desktop for years and it serves well there. I especially like Ubuntu 10.10's ability to transparently encrypt your home directory, similar to the way MacOS can -- this will resolve a lot of issues with lost laptops and stolen data, and while it's an opaque subsystem, it's necessarily so. You can also put the proprietary nVidia video drivers onto your system with a simple menu item, while with Fedora 13 you have to fight to put the proprietary driver onto the system (the GPL driver is loaded *in the initrd*, which makes it a PITA to get rid of). In short, if I were running Linux on the desktop, 10.10 would be hard to beat. But for my purposes, doing virtualization research with KVM and VT-d, I'm wiping out Ubuntu in the morning and installing Fedora 13.


Sunday, October 10, 2010

NetworkManager Sucks

Do a Google on the above search term, and you get over 20,000 results. NetworkManager is yet another attempt by the inept Linux desktop crew to make the Linux desktop look Windows-ish, and is as Fail as you might imagine anything desktop-ish from that gang. I mean, c'mon. We're still talking about an OS that cannot dynamically reconfigure its desktop when you plug an external monitor into your laptop, something that both Windows 7 and MacOS have absolutely no problem with. You expect them to get something as simple as a network manager correct? Uhm, no.

Stuck in Red Hat Enterprise Linux server-land for so many years, I didn't have to deal with NetworkManager. But now I'm doing some work with kvm/qemu which requires me to have the latest and greatest kvm/qemu, which requires me to run the bleeding edge of my Linux distribution, because mismatches between your kernel version and the userland tools can cause some weird issues -- like seemingly random kernel panics where old tools don't fill out new fields and system calls end up crashing. (This should never happen, BTW; the kernel should check all inputs and reject any call that would result in a panic, but I have first-hand proof that it's happening.) And bleeding edge on either Ubuntu or Fedora means you get to deal with NetworkManager.

By and large folks who have a single network card plus a single WiFi adapter in their Linux box and don't intend to do anything unusual will have no problems with NetworkManager. But I wanted to set up a configuration that was unusual. My Linux server has two gigabit NIC's in it. One goes out to the corporate network. I wanted the other to be a direct connection to the Ethernet port on my Macbook Pro, and then have the two be bridged so it also appears I'm directly on the corporate network. I also wanted to set up a private non-routed VLAN tied to the specific network port between my Macbook Pro and the Linux server so I could set up a private network for file sharing between the two systems -- it's still far more pleasant to do editing, email, word processing, etc. on the Mac and use the Linux box as just a big bucket of bytes. Netatalk is not perfect but works "good enough" for this particular application.

All of this is functionality that Red Hat Enterprise Linux had implemented correctly by the RHEL4 days (I know that because I backported most of the RHEL4 stuff to RHEL3 to get all of this functionality working in RHEL3 back in the old days for the Resilience firewalls). That is, we're talking about things that have worked correctly for over six years. Unless -- unless you're using a system that has its network cards managed by NetworkManager, at which point you're SOL: the system absolutely refuses to bring up bridges and VLANs while NetworkManager is enabled.

So my first inclination was to simply take an axe to NetworkManager and go back to the old way of doing things, which as I point out has been working for years. That is, "rpm -e NetworkManager". At which point I find out that half of bloody Gnome seems to have a dependency upon NetworkManager, albeit two dependencies removed, and while Gnome is evil it's the lesser of two evils (hmm, sounds like politics there, eh?). So, I settled for the simple way of disabling it:

  • # chkconfig NetworkManager off
  • # service NetworkManager stop
At that point I no longer have any networks configured other than localhost (eep!), so then I get to set up the bridging and VLAN by hand in the /etc/sysconfig/network-scripts files and bring everything up (GUI tools? Linux penguins don't need no freepin' GUI tools, we just cat - >somefile when we want to configure Linux ;-). But I'll talk about that in my next post.
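For the impatient, here's a taste of what lands in those files. Device names, the address, and the VLAN ID below are hypothetical, and this is RHEL-style ifcfg syntax -- a sketch, not my exact config:

```shell
# /etc/sysconfig/network-scripts/ifcfg-br0 -- the bridge owns the box's IP
DEVICE=br0
TYPE=Bridge
BOOTPROTO=dhcp
ONBOOT=yes

# ifcfg-eth0 (corporate side) and ifcfg-eth1 (Macbook side) look alike:
# slaved to the bridge, no IP of their own
DEVICE=eth0
BRIDGE=br0
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-eth1.100 -- tagged VLAN on the
# Macbook-facing port, carrying the private non-routed file-sharing network
DEVICE=eth1.100
VLAN=yes
IPADDR=192.168.100.1
NETMASK=255.255.255.0
ONBOOT=yes
```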


Wednesday, September 29, 2010

And the winner is...

The iPhone 4.

The Droid X is an awesome piece of hardware. But my experiments with the various Android-based phones said to me that Android is still a work in progress. Each of them had odd bits of user interface that seemed unfinished or clunky or just plain badly thought out. After my brief experience with the Google hiring process it's pretty clear why that's true -- Google's hiring process, other than for a few superstars, has a built-in bias towards young mathematical types who recently took an algorithms course in college and also has a built-in filter to get rid of those of us who've been around long enough to know what we like to do and are good at doing. Youth has its advantages, but also its disadvantages -- young arrogant mathematical types rarely give much thought to user experience.

Which reminds me of an incident at a prior job. My office-mate and I had maybe 15 years of combined experience in the industry at the time, had actually used our product in production environments before joining the company that made it, and were now working on taking it to the next level with a new GUI and a new management infrastructure around our core data engine to make it easier to use in modern network environments. Our boss had about 20 years of experience in the industry. So my boss assigns one of these arrogant young math PhD types to mock up a user interface for our project; we give him the basic architecture and workflow and tell him, "make it easy to use." He produces this mock-up and calls me in to take a look at it, and I scratch my head because I can't make heads or tails of what he's done -- it isn't oriented around the workflow of any site admin that I've ever encountered. I call in my office-mate. He can't make heads or tails of it either. We ask this brilliant young mathematical type just out of college questions about how to do various site-admin-ish kinds of things, and he takes us on a long, complicated set of procedures through a number of incomprehensible dialogs. My office-mate and I say, "This doesn't seem like an easy-to-use interface for site admins." He goes, "But it's obvious! It's simple!"

So we look at each other, think, "hmm, he seems really sure about this, maybe it's just us," and call in our boss, just telling him "You have to look at this" but not why we want him to look at it. By this time we have 35 years of experience in the room, people who've actually used the technology in question in production environments. He runs through the same thing as we did, and comes to the same conclusion. By this time the arrogant young mathematical type is in pure snit mode. How *dare* we question his impeccable user interface! I explain to him that there's 35 years of experience in this room who've actually used the technology in production environments, so if we can't make heads or tails of it our customers will be utterly lost. "Then your customers are idiots!" he shouts.

Indeed. Indeed. But they are *paying* idiots. Which is what Apple understands. The customer may not always be right, but the customer is what keeps you in business, and customers want something that doesn't require a math PhD to understand or use. And in that regard, the iPhone despite its occasional glitches is still the one to beat, and Android still has a lot of growing up to do. As for that math PhD guy? Eventually after another design disagreement months later (on internals, not GUI -- we knew not to put him anywhere near the GUI by then) he stomped off and turned in his resignation because we didn't properly respect his brilliance, and my manager's manager talked him into working in another area of the company on another product rather than resigning outright, and eventually he turned in his resignation from *that* position after his tastes in user interface proved to be equally daunting for them for the same basic reason. Oddly enough, after a few years of seasoning elsewhere in the industry to brush the edges off his arrogance to turn it into less-obnoxious self confidence, he turned out to be a decent engineer... just don't put him anywhere near a user interface, for cryin' out loud. Which Google would have done immediately, if the Android user interface is any indicator.


Tuesday, August 24, 2010

Droid or iPhone?

On the left side, weighing in at 4.3 inches, the Motorola Droid X. This massive hunka hunka burnin' cell phone love is running a 1GHz TI OMAP processor and has 8GB of built-in memory for programs and a 16GB microSD card for data, as well as an 8-megapixel camera.

On the right side, weighing in at 3.5 inches, the Apple iPhone 4. This sleek little beauty is running an Apple A4 clocked at around 800MHz, with a 5-megapixel camera whose sensor is the same size as the Droid's (i.e., fewer, but more sensitive, pixels).

So who is the winner? The iPhone 4's camera, despite fewer pixels, is definitely better than the Droid X camera. As in, ridiculously good. On the other hand, the iPhone 4 is more tightly-walled than any previous iPhone. As in, jailbreaking it to run "unauthorized" software is ridiculously difficult. Given the way AT&T and Apple cripple the thing, that's a major problem. The iPhone's screen is too teensy for my not-as-young-as-they-used-to-be eyes. On the other hand, it also has the iPod ecosphere with it, and tight integration with my Macbook Pro, and all my current software will continue working with it.

The Droid X, on the other hand, is easily "rooted". Motorola has created their own ecosphere of sorts, with a car dock and charger, home docks, etc. so that you can do pretty much the same things as with the Apple ecosphere. It doesn't by default have any integration with my Macbook Pro, but a third-party program called The Missing Sync will do most of that. The big screen is nice for older eyes. But Android itself is ugly and clunky, though serviceable.

So who's the winner? Call it a draw -- for now. Which presents a problem, because my aging iPhone 3G really is not liking iOS 4.0, everything runs really slow and clunky. Maybe I ought to flip a coin... or maybe if I wait a few more months, the horse will sing. Hmm...


Migration concluded

The penguin has now landed at a new employer. I'll update my LinkedIn profile with that information after I've had a few days to de-stress and relax... the past couple of weeks have been a wild, wild ride, reminding me a bit of the last couple of weeks before the deadline for a major new product. But having too much interest in your skills definitely beats the alternative :).

-- ELG

Monday, August 9, 2010

Action items

I had joined the company a few weeks earlier and was sitting in yet another raucous meeting. The latest attempt at a new product had failed, and the blame-casting and finger-pointing were at full tilt. Finally I sighed and put in my own two cents. "Look. I'm new here and I don't know what all has gone on, and I really don't care who's to blame for what -- blame isn't going to get anything done. What I want to know is: what do we need to do now?"

Person 1: "Well, we failed because we weren't using software engineering system X" (where X is some software engineering scheme that was popular at the time).

"Okay, so we'll use software engineering system X, I have no objection to using any particular system, as long as we use one. What's the first thing we need to do, in that system?"

Person 2: "We need to figure out what we want the product to do."

"Okay, let's do that. What is the product supposed to do?"

We discussed it for a while, then from there the meeting devolved into a list of action items, and eventually broke up with another meeting scheduled to work on the detailed functional requirements. But on the whiteboard before we left, I had already sketched out the basics of "what we want it to do", and eventually that turned into an architecture and then a product that is still being sold today, many years later.

So what's my point? Simple: Meetings must be constructive. One of the things my teacher supervisors told me, when I first entered the classroom, was to always ask myself, what do I want the students to be doing? And then communicate it. A classroom where every student knows what he's supposed to be doing at any given time is a happy classroom. Idle hands being the devil's workshop and all that. The same applies to meetings. Unless it's intended to be an informational meeting, meetings should always be about, "what do we want to do". And meetings should never be about blame-casting, finger-pointing, or any of the other negative things that waste time at meetings. No product ever got shipped because people pointed fingers at each other.

Everybody should have a takeaway from a development meeting -- "this is what I am supposed to be doing." Otherwise you're simply wasting time. So now you know why one of my favorite questions, when a meeting has gone on and on and on and is now drawing to a close but without any firm conclusion, is "what do we need to be doing? What are our action items?" We all need to know that we're on the same page and that we all know what we're supposed to be doing. That way there are no surprises, there are no excuses like "but I thought Doug was supposed to do that task!" when the meeting minutes show quite well that Doug was *not* assigned that action item, and things simply get done. Which is the point, after all: Get the product done, and out the door.


* Usual disclaimer: The above is at least slightly fictionalized to protect the innocent. If you were there, you know what really happened. If you weren't... well, you got my takeaway, anyhow.

Sunday, August 8, 2010

Architectural decisions

Let's look at two products. The first product is a small 1U rackmount firewall device with a low-power Celeron processor and 256 megabytes of memory. It can optionally be clustered into a high-availability pair so that if one module fails, the other module takes over. Storage is provided by a 120GB hard drive or a 64GB SSD. The second is a large NAS file server with a minimum configuration of 4 gigabytes of memory and a minimum hard drive configuration of 3.8 terabytes. The file system on this file server is inherently capable of propagating transactions due to its underlying design.

So: How are we going to handle failover on these two devices? That's where your architectural decisions come into play, and your architectural decisions are going to in large part influence how things are going to be done.

The first thing to influence our decisions is how much memory and CPU we have to play with. This directly influences our language choices, because the smaller and more limited the device, the lower-level we have to go in order to a) fit the software into the device, and b) get acceptable performance. So for the firewall, we chose "C". The architect of the NAS system also chose "C". As an exercise for the reader: why do you think I believe the architect of the NAS system was wrong here? In order to get acceptable performance on the small module, we chose a multi-threaded architecture where monitor threads were associated with XML entries describing what to monitor. Faults and alerts were passed through a central event queue handler, which used that same XML policy database to determine which handler module (mechanism) to execute for a given fault or alert event. Nothing was hard-wired; everything could be reconfigured simply by changing the XML. The architect of the NAS system, by contrast, had an external process sending faults and alerts to the main system manager process via a socket interface using a proprietary protocol, and the main system manager process then spawned off agent threads to perform whatever tasks were necessary -- but the main system manager process had no XML database or any other configurable way to associate mechanism with policy. Rather, policy for handling faults and alerts was hard-wired. Is hard-wiring policy into software wise, or necessary, if there is an alternative?
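As a rough sketch of that policy/mechanism split -- the XML schema, event names, and handlers below are invented for illustration, not the firewall's actual format -- the core of the dispatcher is just a table lookup driven by the XML:

```python
# Hypothetical sketch: policy lives in XML, mechanisms are a handler table,
# and a central event queue ties the two together. Nothing is hard-wired.
import queue
import xml.etree.ElementTree as ET

POLICY_XML = """
<policy>
  <event type="fan_failure" handler="throttle_cpu"/>
  <event type="link_down"   handler="failover_to_slave"/>
</policy>
"""

def load_policy(xml_text):
    """Map event type -> handler name, straight from the XML policy file."""
    root = ET.fromstring(xml_text)
    return {e.get("type"): e.get("handler") for e in root.findall("event")}

# The mechanisms. Which one runs for which event is decided by the XML,
# so behavior changes by editing policy, not by recompiling code.
HANDLERS = {
    "throttle_cpu":      lambda ev: "throttled after %s" % ev,
    "failover_to_slave": lambda ev: "failover after %s" % ev,
}

def dispatch(events, policy):
    """Central event queue handler: drain the queue, look up each event's
    mechanism in the policy table, and run it."""
    q = queue.Queue()
    for ev in events:
        q.put(ev)
    results = []
    while not q.empty():
        ev = q.get()
        results.append(HANDLERS[policy[ev]](ev))
    return results

policy = load_policy(POLICY_XML)
print(dispatch(["link_down"], policy))  # -> ['failover after link_down']
```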

The next question is, what problem are we going to solve? For the firewall system, it's simple -- we monitor various aspects of the system, and execute the appropriate mechanism specified by XML-configured policies when various events happen with the goal of maintaining service as much as possible. One possible mechanism could be to ask the slave module to take over. Tweaking policy so that this only happens when there's no possibility of recovery on the active module is decidedly a goal because there is a brief blink of service outage as the upstream and downstream switches get GARP'ed to redirect gateway traffic to a different network port, and service outages are bad. We don't have to worry about resyncing when we come back up -- we just resync from the other system at that point, if we had any unsynced firewall rules or configuration items that weren't on the other system at the point we went down, well, so what. It's no big deal to manually re-enter those rules again. And in the unlikely event that we manage to dual-head (not very likely because we have a hardwired interconnect and differential backoffs where the current master wins and does a remote power-down of the slave before the slave can do a remote power-down of the master), no data gets lost because we're a firewall. We're just passing data, we're not serving it ourselves. All that happens if we dual-head is that service is going to be problematic (to say the least!) until one of the modules gets shut down manually.

For the NAS system, it's quite a bit harder. Data integrity is a must. Dual-heading -- both systems believing they are the master -- either requires advanced transaction-merge semantics when the partition is resolved (merge semantics which are wicked hard to prove do not lead to data corruption), or must be avoided at all costs by having all systems associated with a filesystem immediately cease providing services if they've not received an "I'm going down" from the missing peer(s), have no ability to force the missing peer to shut down (via IPMI or other controllable power), and have no way of assuring (via voting or other mechanisms) that the missing peers are going down. Still, we're talking about the same basic principle, with one caveat -- dual-heading is a disaster, and it is better to serve nothing at all than to risk dual-heading.

For the NAS system, the architectural team chose not to incorporate programmable power (such as IPMI) to allow differential backoffs to assure that dual-heading couldn't happen. Rather, they chose to require a caucus device. If you could not reach the caucus device, you failed. If you reached the caucus device but there were no update ticks on the caucus device from your peer(s), you provided services. This approach is workable, but a) requires another device, and b) provides a single point of failure. If you provide *multiple* caucus devices, then you still have the potential for a single point of failure in the event of a network partition. That is because when partition happens (i.e. you start missing ticks from your peers), if you cannot reach *all* caucus devices, you cannot guarantee that the missing peers are not themselves updating the missing caucus device and thinking *you* are the down system. How did the NAS system architectural team handle that problem? Well, they didn't. They just had a single caucus device, and if anybody couldn't talk to the caucus device, they simply quit serving data in order to prevent dual-heading, and lived with the single point of failure. I have a solution that would allow multiple caucus devices while guaranteeing no dual-heading, based on voting (possibly weighted in case of a tie), but I'll leave that as an exercise to the reader.
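The single-caucus-device rule described above fits in a few lines. This is a hypothetical sketch of the decision only; the tick transport, timers, and caucus-device I/O are all omitted:

```python
def failover_decision(reached_caucus, peer_tick_age, tick_timeout):
    """Single caucus device, as described above. Returns one of:
    "stop-serving" -- we can't reach the caucus device, so for all we know
                      we're the partitioned node; better to serve nothing
                      than to risk dual-heading.
    "take-over"    -- caucus reachable, but the peer's update ticks have
                      gone stale, so we provide services ourselves.
    "steady-state" -- caucus reachable and peer still ticking; no change."""
    if not reached_caucus:
        return "stop-serving"
    if peer_tick_age > tick_timeout:
        return "take-over"
    return "steady-state"
```

Note how the single point of failure is baked right into the first branch: lose the caucus device and everybody stops serving, no matter how healthy the peers are.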

So... architectural decisions: 1) Remember your goals. 2) Make things flexible. 3) Use as high-level a language as possible on your given hardware to ensure that #2 isn't a fib -- i.e., if what you're doing is doable in a higher-level language like Java or Python, for heaven's sake don't do it in "C"! 4) Separate policy from mechanism -- okay, so this is the same as #2, but it's worth repeating. 5) Document, document, document! I don't care whether it's UML, or freehand sketches, or whatever, but your use cases and data flows through the system *must* be clear to everybody on your team at the time you do the actual design, or else you'll get garbage. 6) Have good taste.

Have good taste? What does that mean?! Well, I can't explain. It's like art. I know it when I see it. And that, unfortunately, is the rarest thing of all. I recently looked at some code that I had written when I was in college, that implemented one of the early forum boards. I was surprised and a bit astonished that even this many years later, the code was clearly well structured and showed a clean and well-conceived architecture to it. It wasn't because I had a lot of skill and experience, because, look, I was a college kid. I guess I just had good taste, a clear idea of what a well-conceived system is supposed to look like, and I don't know how that can be taught.

At which point I'm rambling, so I'm going to head off to read a book. BTW, note that the above NAS and firewall systems are, to a certain extent, hypothetical. Some details match systems I've actually worked on, some do not. If you worked with me at one of those companies, you know which is which. If you didn't, well, try not to read exact details as gospel of how a certain system works, because you'll be wrong :).


Sunday, August 1, 2010

The migration of the penguin

I have added a link to my resume in the left margin, in case someone is interested in hiring a long-time Linux guy who knows where the skeletons are buried and, if you need something Linux done, probably has already done it at least once...


Saturday, July 31, 2010

The world's most reluctant Linux advocate

In the fall of 1995, I had successfully brought to completion the project to comply with new federal and state reporting standards for school discipline for the consortium of school districts that our consulting firm served, and was busy cleaning up the master student demographics suite to properly incorporate the new discipline screens rather than have it be a stand-alone subsystem reaching into the student database. It was a hard slog -- the code was a mess. The guy who had written it, whom I had replaced, was a math guy, not a computer guy; he had no inkling of simple things like comments or code reuse, and the product life cycle was a mystery to him. His notion of code reuse was to cut and paste the same code in multiple places, and there were some significant bugs that I was cleaning up. The only good news was that the code was heavily componentized -- it was basically a hundred small programs tied together by a menu system and a common database, though some of the programs seemed to be bigger programs because they forked out to other programs to provide more screens to school secretaries. All of this was running on SCO Xenix or Unix, depending upon the school and when it had bought our software suite.

So during all this my boss calls me into his office and says, "one of our districts has asked if we're investigating Linux as a possible way to bring down costs for school districts. What do you think?" Now, one thing to remember is that my boss was a big old ex-IBM bull, a no-BS kind of guy, and we got on about like you'd expect from two people with strong opinions but mutual respect. "Linux is freeware downloaded off the Internet," I replied. "Don't we have enough trouble maintaining our own code right now without having to maintain some freeware downloaded off the Internet too?" And that was pretty much that. Still, I thought, "hmm, I have that new Windows 95 machine at home, I bet it'd run Linux." I'm a geek, and what geek wouldn't want to play with a free operating system?

So, after work I headed off to the local Barnes & Noble to grab a book about Linux. The one I bought had something called "Slackware 95" on CD in the back. I took it home that night, installed it on a partition on my home computer, and figured out how to get "X" running and... well, it worked okay. fvwm was ugly and crude and limited, and there wasn't much desktop software -- no real word processor -- but I knew LaTeX and it had LaTeX, so that was good. It drove my laser printer fine too. So the next day at work, I went ahead and installed it on our eval machine, where we'd also installed Windows 95 to see what we could do about porting our UI to it. I compiled our source tree on Linux and... hmm, it just compiles, just like it compiles on SCO Unix? And it actually ran!

So I started developing on Linux instead of on SCO Unix, mostly because it was much easier to get Emacs up and going and I prefer Emacs to 'vi' (let the flame wars begin!), not to mention that the GNU tool suite is a lot nicer than the old-school Unix tools. When I finished a module and did initial smoke testing I'd then copy the code over to SCO Unix and compile again there. From time to time I'd also go into the menu system and create a Linux version of one of the SCO Unix system administration programs that we'd accumulated over the years to allow school technology coordinators to manage the system. But I still hadn't considered actually deploying Linux at schools. While it seemed the technology held up okay -- our software actually ran faster on Linux than on SCO Unix -- the business objections were formidable. "We don't want to trust our critical student data to some hackerware downloaded off the Internet!" was the least of it.

That changed in the Spring of 1996, however, when Red Hat Software came out with their 3.0.3 version of Red Hat Linux, which they marketed as "Linux for business". It came in a box! With a manual! From a real company! Complete with a shadow businessman logo wearing a red hat marching off to do business with his briefcase! For the first time, the possibility of actually using Linux as part of our business was not ridiculous. The only thing I really didn't like about 3.0.3 was that all of the system administration tools were TCL/TK GUI scripts, but given that I'd already written a number of menu-based system administration scripts, that didn't seem a fatal objection. I switched from Slackware to Red Hat 3.0.3, and kept on developing under Linux rather than SCO Unix.

So, early June 1996 came along, and we got another school district as a customer. My boss called me into his office again. "What would it take to port our software to Linux?" he asked. "I pretty much already have it ported," I said. "Maybe two weeks to do a thorough job of testing and filling in any system administration scripts that aren't yet rewritten, and it would be ready." "We have this new customer. With our winning bid we could make more money selling Linux rather than SCO Unix. The OS isn't specified in the bid -- should we do SCO Unix or Linux?" "Well, Linux has some risks involved," I replied. "We still haven't tested it with real data. It should work, but there's no guarantee." He then said that we hadn't won the hardware bid, and there was no guarantee that the hardware would work with SCO. I then suggested a dual-OS strategy -- plan on using *either* of the operating systems, depending upon which one worked with the hardware when it came to us for us to install the OS and administrative software. Given the wholesale pricing I'd gotten from Red Hat Software, we could purchase an official copy of Red Hat Linux 3.0.3 for each school to counter the "hackerware downloaded from the Internet!" objection, and the cost was basically lost in the noise compared to the significant cost of SCO Unix.

So the hardware came in, and the tape drive was supported by Linux, while it was not supported by SCO Unix. We had two options at that point -- delay the deployment until tape drives supported by SCO Unix could be procured, or deploy with Linux. We were scheduled in two weeks to have the machines at a high school gymnasium at the school district to train the school secretaries on how to use the software. It would take at least two weeks to argue with the school district and the hardware vendor about tape drives.

"We go Linux," I told my boss. And we did. I spent the next two weeks sweating the details, making sure everything worked, using real data from a real school district (with the student ID information masked out) to validate that all functions of the software itself did what they were supposed to do, going through all the management screens to make sure they worked properly with the hardware on the systems, and so forth. On the appointed day I drove the main Linux development machine to the school district myself, and stayed on hand while the secretaries all booted their machines, just in case something broke, and... it didn't. Everything Just Worked, without a hitch, all the demos went off as planned, and my inservice training on the discipline system went on as usual, the secretaries were quite attentive, laughed at the right points (i.e., when I produced an official state discipline form with a ridiculous discipline infraction for them to punch into their computers and made an offhand humorous comment about it), and... phew!

That's always the moment of truth: when the product hits the customer's hands. You either pass or fail at that point. I'm proud to say that we passed, and became one of the first of what eventually became a thundering storm of people migrating away from proprietary Unix systems to Linux. Over the next three years we transitioned all of our schools to Linux -- it simply made things easier only having to maintain one set of administrative tools, and it wasn't as if it cost any money, we usually did it when they were upgrading old hardware so we were getting paid for that service and Linux came along for the ride, and it Just Worked. And what more can you say?

I suppose there are a couple of lessons there. First, don't dismiss Open Source software just because it's "some hackerware downloaded off the Internet." Second: don't use Open Source software just because it's Open Source if you can't make a business case for it in terms of risks vs. benefits. We couldn't make a business case for it in the fall of 1995: we simply did not have the engineering cycles to handle a transition to Linux given the state of Linux at that time, and the risks outweighed the possible benefits. By the summer of 1996, when the code base issues had been resolved, the primary objection of customers to using Linux had been resolved, and the issues of hardware compatibility and profit became key, Linux simply Made Sense. It still wasn't the safe choice. But the risks at that time were limited enough compared to the benefits to justify taking them.


Thursday, July 15, 2010

The value of an education

For some reason Americans seem to believe education is something people receive. People go to college to "receive an education". This implies that students are simply receptacles. The professor opens up their heads and drops knowledge in, then sews them back up. This worries me when I'm looking at the quality of the young people entering the computer science field today, because while they've had exposure to a lot of technology, by and large it's been as users, not as participants. The actual technology is something they don't even think about -- it's transparent, just a part of their world, not something they actually see and think about.

The problem is that education isn't something you receive. Education is something you do. I graduated from a middle-tier university. Which means nothing at all, actually, because I knew my **** when I left there -- I'd actually designed bit-slice CPUs and microcode for them, built hardware, and written programs in microprocessor assembly language *for fun*, while the guy across the street with the 4.0 GPA knew nothing except what was on the test. I mean, he'd been writing software on Unix minicomputers for four years and he didn't even know what 'nroff' was or that he was using Unix! So yeah, it's all about what use you make of the experience. I spent as much time in professors' offices talking about my latest projects as I spent studying, which hurt my GPA, but (shrug). I'm still employed in the computer field today. The guy across the street? Nope.

That is one reason why Open Source is exciting to me, and why people who have a background in the Open Source community interest me far more than people who have a 4.0 average from Big Name University. I'm looking for doers, not regurgitators. What gets software shoved over the transom isn't the ability to memorize what's going to be on tests, it's what, for lack of a better term, I call "get'r'done". The problem I see is that the technology has become so capable, so complex, so difficult to grasp, that the number of people who could learn the basics of some simple technology like a Commodore 64 then build up to writing significant Linux kernel subsystems has basically slowed to a dribble. Simple and relatively open technology like the Commodore 64 where you could grasp the entire design all by your lonesome (the programming manual came with a schematic of the computer in it!) simply doesn't exist anymore. For good reason in most cases, today's computers have far better functionality, but how are we going to get the people with the "big picture" today when there's no "little picture" like a Commodore 64 to build up from?

So anyhow: That's a problem. It's a problem I find with a lot of the younger software engineers. I've managed some very bright youngsters, but that lack of what I'll call big picture thinking hinders them greatly. They simply don't understand why a busy loop waiting for input is not acceptable unless there is no alternative and why it should have a timer to put the process to sleep between samples, or what the hardware looks like and how to program the front panel that's driven by a PIC processor. They're like ferrets, it's all "oooh, shiney!" to them, with no rhyme or reason or understanding of what's actually happening under the surface. And I have no idea at all what's going to happen when all us older farts get put out to pasture either via corporate executives calling us "too old and expensive" or simply getting too tired and retired... there just isn't enough of the young folks who have the slightest clue. Not that we were the majority even when I was 21, but at least there was a sizable number who *did* have a clue then... and you can find a lot of their names looking at the early Linux kernel patch sets. But even the Linux kernel crowd is graying today... and what happens when we're no longer around, given that the number of young people today who understand technology at the same comprehensive level we do -- or that we did at age 21 -- is essentially zero?
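The busy-loop complaint above, in miniature -- a sketch in Python rather than the C such systems would really use: instead of spinning on a socket as fast as the CPU can go, you block in the kernel until data arrives or a timeout expires.

```python
# Sketch of "don't busy-wait": select() sleeps in the kernel until the
# socket is readable or the timeout expires, leaving the CPU free.
import select
import socket

def wait_for_data(sock, timeout):
    """Returns True if data is waiting on sock, blocking up to `timeout`
    seconds. A busy loop would poll the socket millions of times a second
    to learn the same thing."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)

# Demonstration with a local socket pair.
a, b = socket.socketpair()
print(wait_for_data(b, 0.1))   # -> False, nothing sent yet
a.send(b"ping")
print(wait_for_data(b, 0.1))   # -> True, data is waiting
a.close()
b.close()
```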


Tuesday, June 29, 2010

That XKCD 619 feeling

A Linux advocate says:

The case for using Apple software or Microsoft Windows for something is so slim it tends to sound like techno-lust (sooo shiny ...) or the machinations of a madman (I HAVE TO HAVE IE!!!!!).

Ah yes, I'm getting that XKCD 619 feeling again, where Linux advocates say about usable user interfaces, "why would anybody want that?!". I've been using and developing for Linux since 1995, so I'm not exactly a newbie. I have the latest Ubuntu on my big Linux development machine (the latest Fedora is similar in my experience), and you know what the latest Ubuntu desktop with a high-end graphics card reminds me of? It's as if someone had described MacOS and Windows 7 to engineers in the old Soviet Union, and they sat down and wrote their own clunky half-a** clone based upon nothing but those descriptions. You can practically hear the clunks of heavy metal and whirring of primitive gyroscopes as you operate it. I'm sorry, but anybody who says that Linux has the usability of MacOS or Windows 7 on the desktop is drinkin' some mighty strong kool-aid.

KDE is a bloated incoherent resource-hogging mess (consider it the Windows Vista of Linux desktops), and Gnome's primitive old-school Windows 95 Meets Motif style desktop is usable compared to the competition only if you have a high-end graphics card and can enable 3D Effects and their CCCP-style Expose' and Spaces clones (I say CCCP-style because they have significant usability issues compared to the real thing). And both are limited by "X", which has significant problems dealing with the modern world and hot-pluggable monitors. As in, it doesn't do it. On the day when you can plug an external monitor into your Linux laptop and have the desktop automagically just extend onto the new monitor, with no "dead spaces" and no problems dragging and dropping things between monitors, let me know. Right now, due to Xinerama basically being abandonware, the only multiple-monitor setup that works properly is nVidia's, and only with two same-sized monitors (otherwise "dead spaces" are created that can eat your windows so you can't get at them), and only if you manually set it up using nVidia's own setup program. Wow, how competitive with MacBooks (where it Just Works) or Windows 7 (one right-click to get Display Settings, then select "extends desktop" rather than "mirrors desktop" from the Displays options) is that? Err... not!

I use Linux where it is appropriate -- my web and email server is running Linux, and I'm developing on Linux for embedded servers that run Linux. But to say there's no reason to use anything other than Linux is just koolaid-drinking ... and, uhm, for the guy who says his HTC Evo 4G proves FOSS rocks, I might point out that the EVO 4G is running a proprietary closed-source "skin" (HTC's "Sense" UI). Yeah, that's "proof" alright... but maybe not of what the original commenter claimed :).


Tuesday, June 22, 2010

The new alternative to VMware: KVM

Both the latest Ubuntu and the latest Red Hat are shipping with a new alternative to VMware Server called QEMU-KVM. I've been playing with it, and it is much faster and lighter weight than VMware Server, as well as being more flexible and easier to use.

To get started with QEMU-KVM on Ubuntu 10.04, first install the kvm and qemu-kvm packages via aptitude. Then install virt-manager. After that, System Tools->Virtual Machine Manager will bring up your virtual machine management console.

You'll see two entries when you do this:

  • localhost (QEMU Usermode) - Not Connected
  • localhost (QEMU)
Double-click on localhost (QEMU) and it'll connect to the local root virtual machine manager. You could also connect to other machines' managers -- if you want to, say, manage the virtual machines on a host in your data center -- by using File->Add Connection. Now you'll probably want to set up a data pool for use by your new virtual machines. Most of us put the virtual machines on their own partition, not on the root partition, but the default data pool is in /var/lib/libvirt/images -- which is on the root partition. Ick. Never fear: right-click on the localhost (QEMU) entry and select 'Details', then click on the 'Storage' tab when you get the details. Click "+" to add your new storage pool; once you define its location, click the green 'play' button to make it active, then hit the red delete button to get rid of the 'default' pool. You now have a new default storage pool at the location you desire.
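For what it's worth, the same pool setup can be scripted instead of clicked through. A rough sketch using libvirt's virsh tool (the pool name and target path are my own illustrative choices, and syntax may vary between libvirt versions):

```shell
# Hypothetical equivalent of the GUI steps: define a directory-backed
# storage pool, start it, and mark it to start on boot.
virsh -c qemu:///system pool-define-as vmpool dir --target /srv/vm-images
virsh -c qemu:///system pool-build vmpool       # creates the directory
virsh -c qemu:///system pool-start vmpool
virsh -c qemu:///system pool-autostart vmpool

# And retire the old default pool:
virsh -c qemu:///system pool-destroy default
virsh -c qemu:///system pool-undefine default
```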

Okay, so you have your data pool, now what about creating a virtual machine? The easiest way to do that is to use an ISO image of your favorite distribution. Just right-click on the localhost (QEMU) entry again, and select 'New'. The resulting wizard is ridiculously easy to navigate as long as you remember that it's going to create the disk image in whatever your enabled data pool is when you tell it to 'create a disk image on the computer's hard drive'.

So, after this you should be able to run the virtual machine and install your ISO on it. Remember that ctrl-alt gets you out of the QEMU console back into the regular Linux desktop environment, and you'll be fine. To open a console, just right-click and select 'open'. Or once you have a VM set up and installed, you can shut it down and start it again later from the same manager.

Okay, so what are the limits of QEMU/KVM right now? First of all, don't expect to run graphical environments via the normal console with any kind of responsiveness: it emulates a very slow/old display card which is then screen-scraped by a VNC server. KVM is mostly useful for running non-GUI setups, such as Asterisk servers or hosted virtual web servers. Secondly, some operating systems might not install at all into KVM due to driver support issues. Finally, there is no equivalent of "VMware Tools" to integrate with your host environment so you can move your mouse freely between the virtual machine terminal and the host OS. Your best bet, if you want a graphical console inside a virtual machine, is to install VNC in the virtual machine and then use VNC to view your graphical console.

But aside from those limitations, KVM appears to be working quite well. It is definitely better on Linux than VMware Server, and if you need to create a vmdk to import into VMware on some other non-Linux host, it's easy enough to just 'qemu-img convert -O vmdk VbAst32.img VbAst32.vmdk' and voila, the new virtual machine will import cleanly into VMware. And of course VPEP runs inside a KVM virtual machine just fine... :).


Wednesday, June 2, 2010

Doing an installer right: Microsoft Office 2010

So out of curiosity I downloaded and installed Microsoft Office 2010 today (don't freak out about piracy, folks -- I'm a Microsoft TechNet subscriber and this copy is a quite legit eval copy). I haven't had a chance to use the software yet, but one thing I have to say is about the installer: Microsoft did it right.

A good installer must do the following things:

  1. It must be SIMPLE. People don't want to select lots of stuff; they just want to click one button and have it happen. With the Office 2010 installer you click the 'Install' button (or 'Upgrade' button if Office 2007 is installed on your system), it prompts you for the license key, validates it right then and there, you click 'Next', accept the license, and then it just does it. It's basically four clicks (assuming you can cut-and-paste the license key from the TechNet site, of course; if you have to type it in then there are a few keystrokes too). If your geeks or marketroids insist on all sorts of additional functionality, hide it behind a little "+" sign or something where people won't get freaked out about it; users just want it to Just Work, and they don't care about all that stuff.
  2. It must handle both upgrades and fresh installs in a clean manner. So if a prior version is already installed, it should give you the option of upgrading it and keeping your configuration settings as much as possible.
  3. If possible, it should offer to import settings from a prior program, or from a competing program, much as the latest IE will import settings from an install of Firefox or Safari.
  4. It must handle aborted installs gracefully. The installer should be idempotent -- you should be able to run it regardless of what state the system got left in, and it will just Do The Right Thing. If the process fails halfway through removing the old version of the software due to something out of your control -- like the moron behind the keyboard accidentally hitting the shutdown button when he was trying for another button -- you should be able to run the installer again and have it Do The Right Thing, knowing what part of the process was last successfully finished and continuing from there, or unwinding back to the original conditions and starting from scratch again, but either way it should Just Work.
  5. Once it starts actually installing, it should just do it, not bother you anymore, until the end of the process where, if a reboot is required, it can prompt you for that.
Microsoft has accomplished all of these things with the Office 2010 installer. And you should do the same when you write yours.
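Point 4 (idempotence) is the one I see botched most often, so here's a minimal sketch of the idea -- step names and paths are invented -- using a state file so that re-running the installer skips whatever already succeeded and retries from the first unfinished step:

```shell
# Hypothetical sketch of an idempotent installer: each step is recorded
# in a state file once it succeeds, so re-running after a crash (or the
# moron behind the keyboard hitting shutdown) resumes where it left off.
STATE="${STATE:-/tmp/install_state.$$}"

step() {
    name=$1; shift
    if grep -qx "$name" "$STATE" 2>/dev/null; then
        echo "skip $name"       # finished on a previous run
        return 0
    fi
    "$@" || exit 1              # a failed step is NOT recorded, so the
                                # next run retries from right here
    echo "$name" >> "$STATE"
    echo "done $name"
}

step copy_files   true          # placeholder actions -- a real installer
step write_config true          # would copy files, write settings, etc.
```

Run it twice: the first run prints "done" for each step, the second prints "skip" for each, which is exactly the Just Works behavior described above.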

So let's state that principle one more time: End users want it to Just Work. Geeking out with oodles of settings and such might make marketroids drool with all the checkboxes they can fill in on the inevitable "competitive comparison checklist", and might make geeks drool over all the cool widgets they can play with, but for 99% of the people out there all you're doing is a) confusing them, and b) making your technical support people pull their hair out trying to deal with end users who want it to Just Work rather than have all these options to select. Especially now, with 2 terabyte hard drives selling for $130 at Fry's and most computers shipping with a minimum of 4 gigabytes of RAM, it doesn't make sense to do anything other than install the whole tamale in the default place. For 99.9% of your users, that's going to be all they want. For the other 0.1% of your users, put that little "+" if you want... just put it somewhere out of the way so someone has to *want* to click on it. And realize that in reality, nobody cares other than a few fellow geeks.

Thinking like an end user. That's what it takes to make a program that Just Works. That's something I've had to pound into my team's head over and over and over again over the years: think like an end user, not like a geek... and Microsoft, at least, appears to have finally learned that lesson in at least this one instance. At which point I must congratulate them, because it's *hard* for geeks to think like end users, but in this one instance, at least, they managed it.


Thursday, May 27, 2010

Configuring Compiz to emulate Spaces and Expose

If you have an OpenGL-capable video card and driver for Linux, like the GeForce 7900 GS in my big box, you can run Compiz on Ubuntu 10.04 and emulate Spaces and Expose'. So here's how to do it in Gnome (I do not recommend KDE on Ubuntu 10.04 due to some serious bugs I found):
  1. Install the latest proprietary driver via System->Administration->Hardware Drivers. Without this installed, my GeForce 7900 simply would not do 3D, and Compiz wouldn't run.
  2. Install the Compiz settings manager:
    • # apt-get install compizconfig-settings-manager simple-ccsm
  3. Identify the X11 mouse buttons you wish to use. Sorry, I identified those via trial and error (though running xev from a terminal and clicking in its window will print the number of each button as you press it). On my Logitech Anywhere MX mouse, here is the map from mouse action to X11 mouse button:
    • Button 1: Left click
    • Button 2: Right click
    • Button 3: The 'menu' button
    • Button 4: scroll wheel forward
    • Button 5: scroll wheel back
    • Button 6: scroll wheel left
    • Button 7: scroll wheel right
    • Button 8: backmost-arrow button (on left side of mouse)
    • Button 9: forwardmost-arrow button (on left side of mouse)
  4. Select Preferences->Appearance and pick the theme you want, then click the Visual Effects tab. Select 'Normal' unless you want the wobble-on-window-move effect (which I hate because it makes it hard to accurately place the window). It should do some work, then ask you if you want to keep it. Say yes :). Then exit out of that.
  5. Select Preferences->CompizConfig Settings Manager
  6. Under Desktop, choose "Expo". This is half of the Spaces look-alike, though it doesn't *quite* work like Spaces. Under Expo Key, set it to whatever key you wish to use to enable Expo -- either Windows-E (the usual setting; note that the Windows key is called 'Super' in this UI because the Compiz folks apparently hate Windows ;), or a function key of your choice. Note that I recommend using the 'Super' prefix for that function key, because otherwise you end up conflicting with applications, since normal keyboards don't have an 'fn' key like Mac keyboards do for reaching the regular key codes of function keys that the GUI has assigned functions to.
  7. Expo won't work with four workspaces all in a row, so right-click on the four workspaces in a row at the bottom right of your screen, select 'Preferences' from the resulting pop-up menu, and set it to a 2x2 grid.
  8. Now you need to set your arrow keys left/right/up/down to move you between the workspaces. Click 'Back' in the Compiz Settings UI, and select 'Desktop Wall'. Click the 'Bindings' tab. Expand the 'Move within wall' collection, and set the Move Left/Right/Up/Down keyboard shortcuts. I suggest Super-left, Super-right, Super-up, Super-down.
  9. Okay, click Back, and now let's set up our Expose'-lookalike. This is under 'Windows' and is called 'Scale'.
  10. The question is, which one of these do you want to use? "Initiate Window Picker" allows "Expose" on all windows on the current workspace. Unfortunately, if you have a multi-monitor system, it puts all those windows onto the monitor where your mouse is currently residing. This gets very cluttered if you have two large monitors (I'm running a pair of 1050p monitors). In that case, I suggest using 'Initiate Window Picker for Windows on Current Output', which does it just on the monitor your mouse pointer is hovering over. I selected Mouse Button 9, the forward-arrow button on the left side of my mouse, to do this.
The end result: Something that approximates what Expose' and Spaces would have looked like if implemented by someone in the Communist-era USSR who'd heard approximate descriptions of how they worked, but had never actually seen them. You can practically hear the clunks of heavy metal and whirring of primitive gyroscopes as you activate them. The Expose-clone doesn't work well with multiple displays because of its bad habit of trying to collect all the windows onto the current display, thus requiring the 'on Current Output' kludge to avoid getting window overload. The Spaces clone requires more mouse clicks to move windows between "Spaces" (it doesn't simply exit to the workspace you just moved the window to, it stays activated until you double-click on a workspace), and doesn't have a handy icon to get to it quickly in case you're using a laptop that lacks a 9-button mouse (!). And like virtually all things dealing with Linux user interfaces, it takes a jillion-step process to get it configured and set up, with oodles of trial and error to figure out which X11 "mouse button" corresponds to which actual button on a mouse.

In short, Linux programmers still haven't figured out that users just want things to work. I've had to whack my own teams on the hands a few times when they brought back a design prototype that had oodles of screens, buttons and widgets to tweak -- "No, I want one input box here, one submit button there, that's all the user cares about, he just wants to do the job, he doesn't want to adjust all the internal stuff you're exposing here." Complexity is the enemy of user interface usability and consistency -- something which geeks seem not to understand; I've had to shoot down too-complex user interfaces repeatedly over the past ten years. Sadly, there is nobody to do this for Linux (well, except what Nokia did for Maemo, which works really well at giving a consistent user interface to all Maemo apps, but Maemo is pretty specific to its particular environment) -- and thus Linux continues being an incoherent mess for the average end-user.

Still, for my purposes, it works fine at keeping my workflow working while I run a bunch of KVM virtual machines with VNC viewers into them. So I'm a bit less grumpy today. But I sure wish there was a benevolent dictator for the Linux user interface the way there is for the Linux kernel itself... it's frustrating, the technology is there, but nobody has actually turned it into a coherent whole, and the distribution vendors seem either overwhelmed by the situation or just don't care. Oh well, back to work...


Wednesday, May 26, 2010

Attack of the Linux Penguins

Recently I've been using Ubuntu 10.04 Lucid Lynx as my desktop development system. This is the first time in a long time that I've used Linux on the desktop -- for the past three years I've been using MacOS on the desktop, sharing the source code tree via NFS to a Linux VMware client to do compiles and testing. Since what I've been principally doing during this time is appliance and distribution design, the fact that VMware is not the speediest of environments did not make much difference.

However, I needed more virtual machines to look at potential client configurations than I could easily fit on a Macbook Pro, even the very well endowed Macbook Pro that I own (which has 8 gigabytes of memory), so clearly I needed to throw more hardware at the problem (heh! Typical software engineer answer!). So I installed 10.04 on a well-endowed desktop/server system complete with RAID5 array and 8GB of RAM, and set out to configure clients using the KVM (Kernel Virtual Machine) system that comes with Ubuntu 10.04.

So, the Good:

  1. KVM rocks. The performance I get out of KVM is far superior to anything I've ever experienced with VMware even on faster hardware. I am currently running a load on my system that would have both cores maxed out under VMware due to VMware's per-VM overhead. Under KVM, I'm at around 30% on both cores.
  2. The Virtual Machine Manager that ships with Ubuntu for managing KVM is far easier to use than the latest editions of VMware Server's management UI, but not as good as the VMware Fusion management UI on MacOS, in that it has no equivalent, as far as I can tell, of the VMware Tools module that lets the mouse pointer move transparently between the KVM virtual screen window and the host OS screen.
  3. Gwibber is a really cool Twitter/Facebook/etc. client, easily as good as anything on MacOS. I especially like its multi-pane capabilities for viewing multiple streams in parallel.
  4. Performance in general rocks. It feels snappy and things happen quickly, even though it's running software RAID5 so you'd expect significant overhead from the disk driver. But there isn't -- Linux's cache-block elevator does very well at optimizing RAID access.
Now for the bad:
  1. The user interface in general is reminiscent of mid-1990's Windows. It looks and feels dated and obsolete, and lacks many of the modern navigational aids of Windows 7 or MacOS 10.5.
  2. While I got multi-screen support working for two displays attached to my nVidia 7900 video card (which has two outputs), the dated UI gives no easy way to navigate between the various applications open on the two screens. The window boxes clustered at the bottom of the leftmost screen are not easily accessible from the rightmost screen, and there's no equivalent of Apple's Expose' or Windows 7's similar application-picker function that can be bound to a mouse button so selecting an application is a mouse button away.
  3. Neither the KDE nor Gnome filesystem browser allowed providing a user name/password to a CIFS server whose shares you wished to list. As a result, I could not use the CIFS browser to browse to shares on my Mac or on our Windows servers here in the office, neither of which will provide you with a share list unless you authenticate first. On the Mac, this Just Works -- the initial attempt to get the share list fails, but then you click the "Connect As" button, put in the user name and password, and voila.
  4. In general, integration with Apple and Windows networks was pathetic. I was reduced to doing manual mounts via the CLI mount.cifs command, which I should never have to do on a supposedly modern desktop operating system.
And finally, the ugly:
  1. KDE on Ubuntu 10.04 is atrocious. There is an important directory service under KDE that sucks up gigabytes of RAM. I had to switch back to the unsatisfactory, but at least functional and efficient, Gnome UI to get work done. KDE used to be fast and clean, but it has become as bloated and dysfunctional as Windows.
  2. Multi-screen support on 10.04 has taken several steps backwards. Window managers crashed when I used the standard X11 Xinerama extension. So I used the proprietary TwinView nVidia driver for my nVidia 7900 video card (a high end card from two generations back, somewhat obsolete today but I use it for Linux because its support is mature) and now multi-screen support is back... but only for the displays attached to this single video card.
My conclusion: Ubuntu 10.04 is thus far the closest that Linux has gotten to a usable desktop operating system. It is a state of the art desktop environment -- if you have been locked in a server room with pizza slid under the door for the past 15 years and still think Windows 95 is the be-all and end-all of user interface design. Both Windows 7 and MacOS 10.5/10.6 put it to shame on all measures of appearance and usability.

I can appreciate the effort that has been put into Ubuntu 10.04, and the difficulties involved in trying to turn a mass of random software from miscellaneous strangers into a coherent operating environment. But that does not change the fact that Linux on the desktop remains stuck in a time warp, fighting the battles of two decades previous in an era where time has moved on. It is especially sad that KDE today is no more usable than KDE 2.0 was ten years ago -- indeed, is *less* usable because of the bloat that has been put on top of what was a clean, fast, simple, and well-integrated user interface. KDE has become a 1959 Cadillac, with fins the size of an aircraft tailfin and a thousand pounds of chrome weighing it down as it staggers down the highway like a bloated whale. Windows 7 suffers from some similar user interface bloat, but it has an excuse -- it's Windows. KDE has no such excuse.

So for the meantime, here's what I say: If you want a coherent, usable user environment, buy a Mac. If you want a fast server environment, use Linux. And if you want something that's neither as coherent as a Mac nor as good a server as Linux, then run Windows 7, which gives you a mediocre implementation of both worlds. As for me, I'm going to stick with my Mac on the desktop, and continue using Linux on my server. That gives me the best user environment *AND* the best server environment... but not, alas, in the same box. So it goes.


Monday, April 5, 2010

SaaS and the Dot-com Set

One of the hilarious things that has come about with cloud computing and the advent of large scale SaaS is that I'm seeing the same kinds of arguments I saw back in the dot-com days, that this is a fundamentally different business model that doesn't obey the same rules as traditional software development. Which, of course, is utter nonsense. The method of delivery to customers has changed, but customers remain customers.

My response to all this:

1. Our primary requirement is to meet the needs of the customer. Some customers have legal requirements which preclude SaaS in the sense of SaaS in the cloud, but still wish to have the advantages of SaaS. Think doctors, schools, etc. -- it is actually illegal for them to host patient / student data outside of their own facilities on shared servers. But if we can give them the benefits of SaaS inside their facility using the same software that we have deployed into the cloud -- i.e., *not* a separate version of the software -- then we've fulfilled their needs without any (zero) additional development overhead. And BTW, just to counter one of Marko's points, anybody who structures their sales commissions to reward selling private rather than SaaS in the cloud is an idiot and deserves to fail; cloud is generally *much* less support on our part, it just doesn't meet the needs of certain customers.

2. I have not encountered any customers who want rapid updates of critical applications, ever. My boss once ordered me to deploy a new update to a school scheduling program in the middle of a school year. I pointed out that a) school secretaries and counsellors were currently doing mid-term schedule changes, b) school secretaries and counsellors had not been trained on the changes, which were significant (I had re-written the entire scheduling system from scratch going back to first principles, because the old system was incapable of handling some of the new scheduling paradigms that had come out, such as multi-shift scheduling and quarter-system scheduling), and thus c) it would be a fiasco. He said the old system was broken, so deploy the new one anyhow. I did. And got to say "I told you so" to my boss when it turned into the fiasco I'd predicted. My point: Users are fond of *saying* they want the latest, greatest features, but what they actually want is to get their job done. Paying attention to what users say, as vs. what they actually want, can be a huge mistake costing you a lot of money in additional support costs and losing a lot of customer goodwill. Not only were our support lines clogged solid for a week, my boss had to eat a lot of crow at the next user group meeting to get some of that lost goodwill back.

3. I am on Twitter. Yes, customers will Tweet stuff. 140 characters doesn't exactly get you in-depth commentary though. If you let Twitter guide your product development, what you get is a product designed by tweets, which is indistinguishable from a product designed by twits. I have only met one customer in my entire life who actually knew what he wanted (a school discipline coordinator who said, "I want a computerized version of these three state-mandated forms, and reports that fulfill the requirements of these four legally-required forms that I must submit at the end of the school year"). The rest have some vague idea, but you must get with them and engage them in a lengthy discussion complete with design proposals that include sample screen displays of what the application might look like. For one clustered storage product I actually spent more time talking to potential customers, writing proposals, and getting feedback than I spent implementing the actual product. Needless to say, that is *not* a process that occurs in 140 characters.

4. Anybody who goes into a business where there is an entrenched incumbent expecting to compete on features is an idiot in the first place. The incumbent has basically infinite resources at his disposal compared to you and is capable of implementing far more features than any newcomer. He will simply steal any features you innovate in order to stay out ahead. In the old days incumbents like IBM weren't capable of innovating rapidly. But this is the Internet era, and the successful giants have become much more nimble. If there's a feature you have that the incumbent doesn't have, expect him to have it soon. The way to win today is to change the game -- to do something so novel, so different in a fundamental way, that the incumbent could not match you without re-writing his entire product from scratch and ditching his entire current customer base. In short, competing based on features is a fool's game in today's era unless you're the incumbent. The way to win is to change the paradigm, not attempt to compete on features within an existing one.

5. Yes, selling "private SaaS" means we basically end up having to support multiple versions. But that is true regardless, unless we want to force customers into a death march to new versions. Some customers are comfortable with that; the majority, however, arrive at a version they like and just want to stick with it, much as the majority of Windows users are still using Windows XP, or the majority of Linux users are using Red Hat Enterprise Linux 5 (basically a three-year-old version of Linux) rather than the latest and greatest Fedora or Ubuntu. They'll accept security fixes, but that's it -- if you attempt to death march them, they'll go to a competitor who won't.

I've been dealing with satisfying customer requirements for around 15 years now, and my actual experience of those 15 years is that customers are ornery folks where what they say and what they actually want are two different things, and your job as an architect and designer is to try to suss out what they actually want, which is often *not* the same as what they say. Young engineers tend to take customers at face value, then not understand why the customer rejects the result as not meeting his needs. I get called conservative sometimes because I call for slowing down release cycles, maintaining backward compatibility wherever feasible, extensive trials with customers prior to product release, etc., but my experience is that this is what customers want -- as vs. what they say they want, which is something else entirely.

For the record - my phone is an iPhone, and the only paper maps in my car are detailed 1:24000 USGS topographical maps not available on current GPS units with reasonable screen sizes. Just sayin' ;).


Monday, March 29, 2010

Cryptography engineering

A new post at the VPEP blog.

I'm currently reading Bruce Schneier's new book Cryptography Engineering (actually the second edition of Practical Cryptography), and the above was just me riffing on some thoughts I had while reading the first chapter.


Monday, March 22, 2010

About work...

One thing you'll notice, reading this blog, is that I haven't blogged about anything happening at work. There's a reason for that: It is, in general, a bad idea. If an employer believes a post puts the company in a bad light or simply decides that you have leaked proprietary information without permission, it's a great way to get fired -- dozens of bloggers have been fired over the past decade for posting about things that happened at work.

So anyhow, who I work for is no secret -- you can click on my LinkedIn profile and see -- but now I will be blogging about things I'm doing at work on my employer's own group blog. My first posts are up. You might recognize one of them as a revised version of one of the posts on this blog, except now I can say what I could only hint at then :).