Seibel: And what kinds of things do you print out?
Thompson: Whatever I need; whatever is dragging along. Invariants. But mostly I just print while I’m developing it. That’s how I debug it. I don’t write programs from scratch. I take a program and modify it. Even a big program, I’ll say “main, left, right, print, hello.” And well, “hello” isn’t what I wanted out from this program. What’s the first thing I want out, and I’ll write that and debug that part. I’ll run a program 20 times an hour that I’m developing, building up to it.
Seibel: You print out invariants; do you also use asserts that check invariants?
Thompson: Rarely. I convince myself it’s correct and then either comment out the prints or throw them away.
Seibel: So why is it easier for you to print that an invariant is true rather than just using assert to check it automatically?
Thompson: Because when you print you actually see what it is as opposed to it being a particular value, and you print a bunch of stuff that aren’t invariants. It’s just the way that I do it. I’m not proposing it as a paradigm. It’s just what I’ve always done.
Seibel: When we talked about how you design software, you described a bottom-up process. Do you build those bottom-up pieces in isolation?
Thompson: Sometimes I do.
Seibel: And do you write test scaffolds for testing your low-level functions?
Thompson: Yeah, very often I do that. It really depends on the program I’m working on. If the program is a translator from A to B, I’ll have a whole bunch of As lying around and the corresponding Bs. And I’ll regress it by running in all the As and comparing it to all the Bs. A compiler or a translator or a regular-expression search. Something like that. But there are other kinds of programs that aren’t like that. I’ve never been into testing much, and those kinds of programs I’m kind of at a loss. I’ll throw in some checks, but very often they don’t last in the program or around the program because they’re too hard to maintain with the program. Mostly just regression tests.
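The A-to-B regression approach described above can be illustrated with a minimal sketch. It is not Thompson's own tooling: the names `run_regression`, `to_upper`, and `CASES` are hypothetical, and the uppercase "translator" is only a stand-in for a real compiler or translator. Each case pairs a saved input (an "A") with its saved expected output (a "B").

```python
from typing import Callable, Dict, List, Tuple


def run_regression(
    translate: Callable[[str], str],
    cases: Dict[str, Tuple[str, str]],
) -> List[str]:
    """Run every saved input through translate and return the names of
    the cases whose output no longer matches the saved expected output."""
    failures = []
    for name, (source, expected) in sorted(cases.items()):
        actual = translate(source)
        if actual != expected:
            failures.append(name)
    return failures


def to_upper(text: str) -> str:
    # Hypothetical stand-in for the program under test.
    return text.upper()


# Hypothetical saved cases: name -> (input A, expected output B).
CASES = {
    "greeting": ("hello", "HELLO"),
    "mixed": ("Unix v6", "UNIX V6"),
    "stale": ("abc", "abc"),  # deliberately wrong expectation, to show a failure
}

if __name__ == "__main__":
    for name in run_regression(to_upper, CASES):
        print("regression failed:", name)
```

In practice the inputs and expected outputs would more likely live in files beside the program, so a new case is added by saving one more pair.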
Seibel: By things that are harder to test, you mean things like device drivers or networking protocols?
Thompson: Well, they’re run all the time when you’re actually running an operating system.
Seibel: So you figure you’ll shake the bugs out that way?
Thompson: Oh, absolutely. I mean, what’s better as a test of an operating system than people beating on it?
Seibel: Another phase of programming is optimization. Some people optimize things from the very beginning. Others like to write it one way and then worry about optimizing it later. What’s your approach?
Thompson: I’ll do it as simply as possible the first time and very often that suffices for all time. To build a very complex algorithm for something that’s never run is just stupid. It’s just a waste of time. It’s a bug generator. And it makes it impossible to maintain because you’ve got to have 50 pages of math to tell the next guy what you’re actually doing.
Ninety-nine percent of the time something simple and brute-force will work fine. If you really are building a tool that is used a lot and it has some sort of minor quadratic behavior sometimes you have to go in and beat on it. But typically not. The simpler the better.
Seibel: Some people just like bumming code down to a jewel-like perfection, for its own sake.
Thompson: Well, I do too, but part of that is sacrificing the algorithm for the code. I mean, typically a complex algorithm requires complex code. And I’d much rather have a simple algorithm and simple code than some big horror. And if there’s one thing that characterizes my code, it’s that it’s simple, choppy, and little. Nothing fancy. Anybody can read it.
Seibel: Are there still tasks which, for performance reasons, people still have to get down to hand-tuned assembly code?
Thompson: It’s rare. It’s extremely rare unless you can really get an order of magnitude and you can’t. If you can really work hard and get some little piece of a big program to run twice as fast, then you could have gotten the whole program to run twice as fast if you had just waited a year or two. If you’re writing a compiler—certainly 99 percent of the code you produce is going to be run once or twice. But some of it’s going to be in an operating system that’s run 24 hours a day. And some of it’s going to be in the inner, inner loop of that operating system. So maybe 0.1 percent of the optimization you put into a compiler here is going to have any effect on your users. But it can have profound effect, so there maybe you want to do it.
Seibel: But that would be a result of generating better code in the compiler rather than writing the compiler itself in assembly.
Thompson: Oh, yes, yes.
Seibel: And presumably part of the reason writing programs directly in assembly is less important these days is because compilers have gotten better.
Thompson: No. I think it’s mostly because the machines have gotten a lot better. Compilers stink. You look at the code coming out of GCC and it’s awful. It’s really not good. And it’s slow; oh, man. I mean, the compiler itself is over 20 passes. It’s just monstrously slow, but computers have gotten 1,000 times faster since GCC came out. So it may seem like it’s getting faster because it’s not getting slower as much as computers are getting faster underneath it.
Seibel: On a somewhat related note, what about garbage collection? With Java, GC has finally made it into the mainstream. As Dennis Ritchie once said, C is actively hostile to garbage collection. Is it good that folks are moving toward garbage-collected languages—is it a technology that deserves to finally be in mainstream use?
Thompson: I don’t know. I’m schizophrenic on the subject. If you’re writing an operating system or a C compiler or something that’s used by lots and lots of people, I think garbage collection is a mistake, almost. It’s a cheat for you where you can do it by hand and do it better—much better. What you’re doing is you’re sloughing your task, your job, making it slower for your users. So I think it’s a mistake in an operating system. It almost just doesn’t fit in an operating system. But if you are writing a hack program to do a job, get an answer and then throw the program away, it’s beautiful. It takes a layer of stuff you don’t want to think about, at a cost you can afford, because computers are so fast, and it’s nothing but a win-win-win position. So I’m really schizophrenic on this subject.
Part of the problem is there are different garbage-collection algorithms and they have different properties—massively different properties. So you’re writing some really general-purpose thing like an operating system—if you’re going to write it in a language that garbage-collects underneath, you don’t even have the choice of the algorithm for the operating systems. Suppose that you just can’t stand big real-time gaps and you have a garbage collector that runs up to some threshold and then does mark and sweep. You’re screwed before you start.
So if you’re doing some general-purpose task that you don’t know who your real users are, you just can’t do that. Plus, garbage collection fights cache coherency massively. And there’s no garbage-collection algorithm that is right for all machines. There are machines where you can speed it up by a factor of five or more by messing around with the cache. They should be tied to the machine much more than they are. Usually they treat them as separate algorithms that have nothing to do with machines, but the cache coherency is very important for garbage-collection algorithms.
Seibel: Do you think of yourself as a scientist, an engineer, an artist, a craftsman, or something else?