You need to know your language well. Read the reference. Read the books that tell you both the mechanics of language and the whole enterprise of debugging and testing: Code Complete or some equivalent of that. But I think there are a lot of different paths. I don’t want to say you have to read one set of books.
Seibel: Though your job now doesn’t entail a lot of programming you still write programs for the essays on your web site. When you’re writing these little programs, how do you approach it?
Norvig: I think one of the most important things is being able to keep everything in your head at once. If you can do that you have a much better chance of being successful. That makes a small program easier. For a bigger program, you need extra tools to be able to handle that.
It’s also important to know what you’re doing. When I wrote my Sudoku solver, some bloggers commented on that. They said, “Look at the contrast: here’s Norvig’s Sudoku thing and then there’s this other guy, whose name I’ve forgotten, one of these test-driven design gurus. He starts off and he says, ‘Well, I’m going to do Sudoku and I’m going to have this class and first thing I’m going to do is write a bunch of tests.’ But then he never got anywhere. He had five different blog posts and in each one he wrote a little bit more and wrote lots of tests but he never got anything working because he didn’t know how to solve the problem.”
I actually knew, from AI, that, well, there’s this field of constraint propagation; I know how that works. There’s this field of recursive search; I know how that works. And I could see, right from the start, you put these two together, and you could solve this Sudoku thing. He didn’t know that so he was sort of blundering in the dark even though all his code “worked” because he had all these test cases.
Then bloggers were arguing back and forth about what this means. I don’t think it means much of anything; I think test-driven design is great. I do that a lot more than I used to do. But you can test all you want and if you don’t know how to approach the problem, you’re not going to get a solution.
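To make the combination described above concrete, here is a minimal Python sketch of constraint propagation plus recursive search applied to Sudoku. It is an illustration written for this passage, not the solver discussed in the interview: the function names (`assign`, `eliminate`, `parse_grid`, `search`, `solve`), the grid encoding (an 81-character string with `0` for blanks), and the choice to propagate only the "a solved square removes its digit from its peers" rule are all assumptions.

```python
# Sketch only: constraint propagation + recursive (backtracking) search.
# All names and the grid encoding are illustrative assumptions.
ROWS, COLS, DIGITS = "ABCDEFGHI", "123456789", "123456789"
SQUARES = [r + c for r in ROWS for c in COLS]
UNITS = (
    [[r + c for c in COLS] for r in ROWS]        # nine rows
    + [[r + c for r in ROWS] for c in COLS]      # nine columns
    + [[r + c for r in rs for c in cs]           # nine 3x3 boxes
       for rs in ("ABC", "DEF", "GHI") for cs in ("123", "456", "789")]
)
# Every square has 20 peers: the other squares sharing a row, column, or box.
PEERS = {s: set(sq for u in UNITS if s in u for sq in u) - {s} for s in SQUARES}


def eliminate(candidates, square, digit):
    """Remove digit from square's candidates; return None on contradiction."""
    if digit not in candidates[square]:
        return candidates                        # already eliminated
    candidates[square] = candidates[square].replace(digit, "")
    remaining = candidates[square]
    if not remaining:
        return None                              # no digit left for this square
    if len(remaining) == 1:
        # Propagation: a solved square removes its digit from all its peers.
        for peer in PEERS[square]:
            if eliminate(candidates, peer, remaining) is None:
                return None
    return candidates


def assign(candidates, square, digit):
    """Fix square to digit by eliminating every other candidate."""
    for other in candidates[square].replace(digit, ""):
        if eliminate(candidates, square, other) is None:
            return None
    return candidates


def parse_grid(grid):
    """Build the candidate map from an 81-character string ('0' or '.' = blank)."""
    candidates = {s: DIGITS for s in SQUARES}
    for square, ch in zip(SQUARES, grid):
        if ch in DIGITS and assign(candidates, square, ch) is None:
            return None
    return candidates


def search(candidates):
    """Recursive search: guess a digit, propagate, backtrack on contradiction."""
    if candidates is None:
        return None
    unsolved = [s for s in SQUARES if len(candidates[s]) > 1]
    if not unsolved:
        return candidates                        # every square has one digit
    square = min(unsolved, key=lambda s: len(candidates[s]))
    for digit in candidates[square]:
        # Copy so that a failed guess leaves this level's state untouched.
        result = search(assign(candidates.copy(), square, digit))
        if result is not None:
            return result
    return None


def solve(grid):
    return search(parse_grid(grid))
```

Calling `solve(grid)` returns a dictionary mapping each square name (such as `"A1"`) to its digit, or `None` if the puzzle has no solution. Propagation prunes the candidates after every guess, so the search has far fewer branches to try than plain trial and error.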
Seibel: So then the question is, how should he have known that? Should he have gone and gotten a PhD and specialized in artificial intelligence? You can’t know every algorithm. These days you have Google, but finding the right approach to a problem is a little different than finding a web framework.
Norvig: How do you know what you don’t know?
Seibel: Exactly.
Norvig: So I guess it’s two parts. One is to recognize that maybe there is a known solution to this. You could say, “Well, nobody could possibly know how to do this, so just exploring randomly is as good as everything else.” That’s one possibility. The other possibility is, “Well, probably somebody does know how to do this. I just don’t know what the words are for it, so I have to discover those.” I guess that’s partly just intuition and saying, “It seems like the kind of thing that should be in the body of knowledge from AI.” And then you have to figure out, how do I find it? And probably he could’ve done a search on Sudoku and found it that way. Maybe he thought that was cheating. I don’t know.
Seibel: So let’s say that is cheating; say you were the first person ever to try and solve Sudoku. The techniques that you ended up using would still have been out there waiting to be applied.
Norvig: Let’s say I wanted to solve some problem in biology. I wouldn’t know what the best algorithms were for doing gene sequencing or whatever. But I’d have a pretty good idea that there were such algorithms. Then I could start looking around. At another level, some of these things are pretty fundamental: if you don’t know what dynamic programming is, then you’re at a severe disadvantage. It’s going to come up time and time again. If you don’t know this idea of search in general, that you can make a choice and backtrack when you don’t need it. These are all ideas from the ’60s. It was only a few years into programming that people discovered these things. It seems like that’s the type of thing that everyone should know. Some things that were discovered last year, not everybody should know.
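For readers who have not met the two fundamentals named above, the following Python sketch shows one small instance of each. The problems chosen (edit distance for dynamic programming, N-queens for backtracking search) and the function names are illustrative assumptions; neither is mentioned in the interview.

```python
from functools import lru_cache


def edit_distance(a: str, b: str) -> int:
    """Dynamic programming: each subproblem (i, j) is solved once and cached."""

    @lru_cache(maxsize=None)
    def dist(i: int, j: int) -> int:
        if i == len(a):
            return len(b) - j            # insert the rest of b
        if j == len(b):
            return len(a) - i            # delete the rest of a
        if a[i] == b[j]:
            return dist(i + 1, j + 1)    # characters match, no cost
        return 1 + min(
            dist(i + 1, j),              # delete a[i]
            dist(i, j + 1),              # insert b[j]
            dist(i + 1, j + 1),          # substitute a[i] with b[j]
        )

    return dist(0, 0)


def place_queens(n: int, cols: tuple = ()):
    """Backtracking search: make a choice, undo it if it leads nowhere.

    cols[r] is the column of the queen already placed in row r.
    Returns one full placement as a tuple, or None if none exists.
    """
    row = len(cols)
    if row == n:
        return cols
    for col in range(n):
        # The choice is safe if no earlier queen shares a column or diagonal.
        if all(col != c and abs(col - c) != row - r for r, c in enumerate(cols)):
            result = place_queens(n, cols + (col,))
            if result is not None:
                return result
        # Otherwise fall through and try the next column (the backtrack step).
    return None
```

Here `edit_distance("kitten", "sitting")` evaluates to 3, and `place_queens(4)` returns `(1, 3, 0, 2)`, while `place_queens(3)` returns `None` because no placement exists. In the first function the cache is what turns an exponential recursion into a polynomial one; in the second, returning `None` is what sends the caller on to its next choice.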
Seibel: So should programmers go back and read all the old papers?
Norvig: No, because there are lots of false starts and lots of mergers where two different fields develop completely different technology and terminology, and then they discover they were really doing the same thing. I think you’d rather have a story from the modern point of view rather than have to follow all the steps. But you should have them. I don’t know what the best books are for that since I picked it up the hard way, piecemeal.
Seibel: So back to designing software. What about when you’re working on bigger programs, where you’re not going to be able to just remember how all the code fits together? Then how do you design it?
Norvig: I think you want to have good documentation at the level of overall system design. What’s the thing supposed to do and how’s it going to do it? Documentation for every method is usually more tedious than it needs to be. Most of the time it just duplicates what you could read from the name of the function and the parameters. But the overall design of what’s going to do what, that’s really important to lay out first. It’s got to be something that everybody understands and it’s also got to be the right choice. One of the most important things for having a successful project is having people that have enough experience that they build the right thing. And barring that, if it’s something that you haven’t built before, that you don’t know how to do, then the next best thing you can do is to be flexible enough that if you build the wrong thing you can adjust.
Seibel: How much do you think you can sit down and figure out how something ought to work, assuming it’s not something that you’ve built before? Do you need to start writing code in order to really understand what the problem is?
Norvig: One way to think about it is going backwards. You want to get to an end state where you have something that’s good and for some problems there’s roughly one thing that’s good. For other problems there are roughly millions and you can go in lots of different directions and they’d all be roughly the same. So I think it’s different depending on which type of those types of problems you have.
Then you want to think about what are the difficult choices vs. what are the easy ones. What’s going to come back to really screw you if you make the wrong architectural choice: if you hit built-in limitations or if you’re just building the wrong thing. At Google I think we run up against all these types of problems. There’s constantly a scaling problem. If you look at where we are today and say, we’ll build something that can handle ten times more than that, in a couple years you’ll have exceeded that and you have to throw it out and start all over again. But you want to at least make the right choice for the operating conditions that you’ve chosen: you’ll work for a billion up to ten billion web pages or something. So what does that mean in terms of how you distribute it over multiple machines? What kind of traffic are you going to have going back and forth? You have to have a convincing story at that level. Some of that you can do with calculations on the back of the envelope, some of that you can do with simulations, and some of that you have to predict the future.
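A back-of-the-envelope calculation of the kind described above might look like the following Python sketch. Only the range of one billion to ten billion pages comes from the interview; the page size, per-machine storage, replication factor, and growth rate are invented placeholders, and the function names are hypothetical.

```python
import math

# Every constant below is a made-up assumption for illustration only.
BYTES_PER_PAGE = 100_000           # assumed average stored size of one page
BYTES_PER_MACHINE = 4 * 10**12     # assumed usable storage per machine (4 TB)
REPLICAS = 3                       # assumed number of copies kept of each page


def machines_needed(pages: int) -> int:
    """Machines required to hold every replica of every page (rounded up)."""
    total_bytes = pages * BYTES_PER_PAGE * REPLICAS
    return (total_bytes + BYTES_PER_MACHINE - 1) // BYTES_PER_MACHINE


def years_until_outgrown(headroom: float, annual_growth: float) -> float:
    """Years before load grows by `headroom`, at a fixed yearly growth factor."""
    return math.log(headroom) / math.log(annual_growth)


for pages in (10**9, 10**10):      # the one-to-ten-billion range from the text
    print(f"{pages:>14,} pages -> {machines_needed(pages):,} machines")

# Assumes load doubles every year, which is a guess, not a measured figure.
print(f"10x headroom lasts about {years_until_outgrown(10, 2.0):.1f} years")
```

Under these placeholder numbers the estimate is 75 machines at one billion pages and 750 at ten billion, and a tenfold margin is used up in a little over three years if load doubles annually. The point of the exercise is the shape of the reasoning rather than the specific figures, which would need to be replaced with measured values.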
Seibel: It seems for that kind of question you’ll be far more likely to answer correctly with either back-of-the-envelope calculations or simulation than writing code.
Norvig: Yeah, I think that’s right. Those are the kind of things where the calculations are probably a better approach. And then there are these issues of some vendor says they’re going to have a switch coming out next year that will handle ten times as much traffic; do you design to that? Do you believe them? Or do you design to what you have today? There are a lot of trade-offs there.