But what makes big software manageable is having some global invariants orbig-picture statements about what it’s supposed to do and what things aresupposed to be true. So, to take GHC as another example, having thisinvariant that each of these intermediate programs should be well typed.That can be checked, actually, at runtime if you like. That’s quite a powerfulinvariant about what’s going on. So I’m not sure it’s really necessarily to dowith size.
Certainly interconnectedness makes big programs eventually crumble undertheir own weight. Sometimes one of the luxuries that you get from workingin research is that you can sometimes take a chunk of code and simplyrewrite it in the light of your improved insights into what you were trying toachieve and how you might try to achieve it. We talked about this businessof refactoring GHC’s back end. If I was working in a more commercialsetting, I might not be able to afford to do that. But I’m hoping that it willmake GHC more maintainable and understandable in the long term.
Is there an upper bound on the size? I don’t know. I rather suspect that aslong as we can go on building good abstractions we can keep buildingbridges across the Atlantic. We have software that works—not perfectly,but surprisingly well considering how big it is.
Seibel: So the question is, can you build an edifice that’s that large, andworks, and is also beautiful.
Peyton Jones: It’s hard for it to maintain its beauty. Bits are often beautifulor at least acceptably non-ugly when they’re first built. In the face ofprotracted life—maintenance—it’s quite difficult to maintain that. That’s theworst thing about long-lived programs… that they gradually become ugly.So there’s no moment at which they become disfigured but neverthelessafter a while they just become crappy.
Seibel: And is the only way out to say, “OK, this thing has lived longenough, start over”?
Peyton Jones: I think that eventually you have to reengineer chunks of it.And if you can afford to reengineer a bit as you go along, then—if you don’tdo anything for ten years then the result may just be so daunting that youthink, “I just have to throw it away and start again.” If you can afford toreengineer a bit as you go along, like the human cells regeneratingthemselves—that’s what I hope is happening to GHC.
The most depressing thing about life as a programmer, I think, is if you’refaced with a chunk of code that either someone else wrote or, worse still,you wrote yourself but you no longer dare to modify. That’s depressing.
Peter Norvig
Peter Norvig is a broad thinker and a hacker at heart. He once wrote a programto find in Google’s search logs series of three consecutive searches by the sameuser that, when put together, made a haiku (one of the most memorable: “javaECC / java elliptical curve / playboy faq”).
On his web site Norvig has links to the usual stuff: books and papers he’s written,slides from talks he’s given, and various bits of his code. But there are also links toitems he’s had published in McSweeney’s Quarterly Concern, his witty recountingof writing a program to generate the world’s longest palindromic sentence, and his“Gettysburg Powerpoint Presentation,” a send-up of Microsoft’s PowerPointsoftware, which has been cited by Edward Tufte and which appears on the firstpage of results if you Google “PowerPoint.”
He is now the Director of Research at Google, after having been the Director ofSearch Quality. Prior to that he had been the head of the Computational SciencesDivision at NASA Ames Research Center and before that, an early employee at thelate-’90s Internet startup Junglee. He won the NASA Exceptional AchievementAward in 2001 and is a Fellow of the American Association for ArtificialIntelligence and the Association for Computing Machinery.
Between Google, NASA, and Junglee, Norvig has experience with both the“hacker” and “engineer” approaches to building software and talks in thisinterview about the advantages and disadvantages of each. As a former computerscience professor and now an insider at one of the biggest industrial softwareshops in the world, he also has an interesting perspective on the relation betweenacademic computer science and industrial practice.
Other topics in our conversation included how programming has changed in recentyears, why no design technique can make up for not knowing what you’re doing,and why NASA might be better off with less-reliable but cheaper software.
Seibel: When did you start programming?
Norvig: In high school. We had a PDP-8, I think it was, at my high school,and there was a class I took—we started in BASIC programming and I wentfrom there.
Seibel: What year would that have been?
Norvig: I graduated high school in ’74, so it must have been ’72 or ’73. Iremember a couple of things, going back to those early days. I rememberthe teacher trying to teach shuffling of a deck of cards. Her algorithm was,use a random-number generator to pick two locations and then swap themand keep a bit vector that said, these were swapped, and keep going untilthey’re all swapped. I remember my reaction being, “That’s stupid. That’sgotta be the stupidest thing in the world. It could take forever becausethere could be one pair that you never happen to choose.” I didn’t knowenough then to say it’s n squared when it could have been order n. But Iknew that that was just wrong. Then I was able to come up with, I think, theKnuth algorithm of swapping from 0 to 52 and then from 0 to 51 and soon—an order n algorithm. And I remember the teacher defending herapproach. That helped me think, “Well, maybe I have an aptitude for thisprogramming stuff.” It also helped me say, “Maybe teachers don’t reallyknow everything.”
Seibel: Was it as soon as she described it that you just said, “Wow, this iswrong”? Or did you play with it for a while and then say, “Gosh it seemslike we’re doing a lot of work here”?
Norvig: I think I noticed right away. It’s hard to know what I was reallythinking back then but I think right away I noticed there’s a finite possibilitythat this might not terminate. I’m not sure I knew as much about theexpected runtime.
I also remember finding my father’s back issues of Scientific American in theattic and going through them. There was this article by ChristopherStrachey on software engineering in which he said that people are going touse these higher-order languages. And he had invented this language thatthere was never a compiler for—it was a paper language. And he said, “I’mgoing to write a checkers program using this language.” I remember readingthat—it was the first nontrivial program I had ever read because in schoolwe were just learning how to shuffle and so on. I read it again recently andthe first thing I noticed was that there’s a bug in it. And it was great becauseyou figure this is Christopher Strachey, he should know what he’s doing,and it’s Scientific American, they’ve got editors and so on—they shouldprobably get those bugs out. But in the prose it says there’s a function makemovewhich takes a board position and returns a move and then you look inthe code and there’s make-move and it takes a board position and an extraparameter. Apparently they wrote the prose first and then they wrote theimplementation. And they found out you can’t search infinitely deep so theyadded an extra parameter which was depth of search and you recurse downto a certain level and then you stop. They had added that in afterwards andhadn’t gone back and fixed the documentation.