Tuesday, December 06, 2005

Computer Science vs. Computer Practice

So you're an accomplished computer scientist, are ya? We'll see about that.

The difference between Computer Science and Computer Practice

What is a magician but a practicing theorist? -- Obi-Wan Kenobi

As we all know, there are a lot of ranty articles running around the net about US IT jobs going overseas. Of course it's a valid concern, and a lot of the posts are based heavily in frustration (which can sometimes undermine facts). One point I see raised again and again is the claim that programmers overseas produce "crap code" or "have no idea how to design quality code". Or, the old fallback: they "don't understand true computer science principles!"


Partly by force, hmm.. actually, completely by force, I'm still a professional graduate student at Syracuse University. I don't live there, but I travel there every few weeks. I meet with members of my Ph.D. committee who, as you can imagine, are an interesting bunch of smart professors. They see my dissertation (which I've worked on for several years now), read it for the first time, and pull some obscure embedded issue out of the depths of the idea. I'll never stop being amazed at their ability to do that. It's like someone else telling you your 10-year-old kid has a third arm - I never noticed that before!

Then again, that's their job. They are scientists - they can wrap their heads around new, complex concepts very quickly and immediately assess a problem from many angles. Despite their brilliance, some of them still don't use email. That's kind of the running joke, isn't it? The brilliant computer scientist who doesn't use email - or hardly uses computers at all, for that matter. That this is a joke at all reflects a clear misunderstanding of the line between computer science and computer practice.

A lot of programmers seem to credit themselves pretty quickly with understanding computer science. Surely some do. In some respects it's like the guy who has skied all his life and then at middle age takes a skiing lesson. He does it to join a friend or just for fun. Often enough he learns plenty. If nothing else, he learns other people's viewpoints on things he figured out for himself. In fact, a good ski instructor is hopefully teaching from a source that has combined knowledge from the world's best skiers over many years. Chances are, whatever you figure out on your own won't match the pooled knowledge of the hundreds of people who figured it out before you. You're bound to pick something up.

The real truth is that knowing a lot about computer science isn't seriously required for much of today's programming needs. Personally, I couldn't sleep if I knew an O(n^2) algorithm sat where a clean O(n log n) one would fit - but that's my now-inborn nature. Making a webpage simply doesn't require much knowledge of complexity theory, algorithms, or data structures. Now, before you go posting at the bottom of this article that this is poop - I think you're forgetting something. Surely there are applications that would benefit from running the best algorithms and the best quality code, but by far most of them will run just fine with "good enough" implementations too. That's the rub, really.
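To put some illustrative, back-of-the-envelope numbers on that O(n^2) versus O(n log n) gap, here's a minimal Java sketch (the class and the sample sizes are mine, not from any real codebase) comparing rough operation counts:

```java
public class GrowthGap {
    public static void main(String[] args) {
        // Rough "operation counts" for n^2 vs. n*log2(n) at a few sizes.
        // At small n the difference is academic; at large n it is not.
        for (int n : new int[]{1_000, 1_000_000}) {
            double quadratic = (double) n * n;
            double linearithmic = n * (Math.log(n) / Math.log(2));
            System.out.printf("n=%,d: n^2 is about %.0fx more work than n*log2(n)%n",
                    n, quadratic / linearithmic);
        }
    }
}
```

For a webpage shuffling a few hundred records, that gap is invisible; at millions of records, it's the difference between milliseconds and hours.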

Donald Knuth (a pre-eminent computer scientist/mathematician, for you newbs) famously wrote something to the effect of "Beware of bugs in the above code; I have only proved it correct, not tried it." Sort of a funny joke - to prove something is correct but not be sure it will actually work. If you've ever actually done any program proving (probably in some computer science curriculum), you know it can be a tedious mathematical exercise. The only case I know of where commercial software was actually proven correct was the fly-by-wire software on the Boeing 757 aircraft (this is from memory; if anyone needs to correct me, please do so). That is, they didn't just write the code and test it a bit to see if it works - they mathematically proved that the software would behave as prescribed. Sounds good to me, since I'm on a 757 often enough.

There's a pretty interesting dividing line between science and practice: scientists prove, practitioners test. With all the hoopla in our industry about "test-first", it's pretty plainly a practical world. Testing is fine, but it's empirical and can miss details. It's rare that you can test all possible inputs to an application. Maybe we should all be more like computer scientists. Maybe we can abandon this "test-first" mentality and adopt a "prove-last" one. I'd rather have proven software than tested software. You can still test if you like (as Knuth mentions). Then again, for a practical world it's obvious that despite the fact that empirical testing of software is almost never exhaustive, clearly it's "good enough".
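Here's a concrete, well-known Java illustration of how testing can miss what a proof would catch: a few spot checks of Math.abs pass happily, yet one 32-bit input - which a proof over all inputs would force you to confront - breaks the "obvious" property that the result is non-negative:

```java
public class ProveVsTest {
    public static void main(String[] args) {
        // Spot-check testing: these pass, so Math.abs "works".
        System.out.println(Math.abs(-5));  // 5
        System.out.println(Math.abs(42));  // 42

        // A proof over ALL inputs would flag the one value with no
        // positive two's-complement counterpart:
        System.out.println(Math.abs(Integer.MIN_VALUE)); // -2147483648, still negative!
    }
}
```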

Hopefully at this point, we can stop calling things like servlets and web services "computer science". I wear both hats: some days I'm a groveling academic student, and other days I'm a practitioner providing solutions for a client. The most distinct difference between science and practice shows up when I write academic versus business prose. As my professor says about my dissertation, "you must be able to defend every sentence". I've learned quickly (although I still make mistakes sometimes when I context-switch) what and what not to say in an academic paper. Here are some examples:

* Never say anything is "optimal" unless you have a concrete proof. We're not talking about amazing facts here; we're talking about obvious stuff like "the shortest possible method is an empty method". And darn it, your thingie better be optimal.

* Assume your reader is an accomplished computer scientist, but if you use a word that isn't normal English, it had better be defined first. Again, I'm not talking about "non-polynomial running time"; I'm talking about "parameter list".

* Empirical evidence is cute. But you'd better have a lot of it to make any claims from it. Even then, you can only say that what you have is of worth "in practice". Unit tests won't get you anywhere with professors.

* For any algorithm that has any value in it - you'd best know what's going to happen if that value were replaced by infinity.

Me: So the class has several methods and the algorithm examines each one.
Professor: How long does that take? (he means algorithmically, not wall time)
Me: It's linear - runs in O(n).
Professor: What if the class had BILLIONS AND BILLIONS of methods!??? HMM?
Me: Uh.. it can't. Java wouldn't support that. I think you can only have like 256 methods or 32767 or something.
Professor: And what if it had an unbounded INFINITE number of methods? What would happen THEN? hmm?
Me: It wouldn't because even if every Java programmer started typing in methods in some grand coordinated method-creation effort and every computer ran some method-creating program to help make more, we'd still never get an infinite number of methods. Not to mention no compiler or VM would accept such a class. Is this relevant?
Professor: And what if each of those methods had an INFINITE number of instructions and ran for an INFINITE amount of time? How long would your little algorithm take THEN!?
Me: hrf.. um.. it's linear.. runs in O(n). Which I suppose, if everything was INFINITE, would take a really, really long fricken time. But you and me and your dog too would be DEAD by then, and we wouldn't CARE anymore, and my damn tombstone would say "DIED WAITING FOR PHD DEFENSE" and yours would be like "DR. BOB PAINGIVER - I SUCKED". Besides, most of the classes in question only have like 3 methods or so.
Professor: oh.
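For what it's worth, the algorithm in that exchange looks something like this hypothetical sketch (my stand-in code, not the actual dissertation work): visiting every declared method of a class is linear in the number of methods, and the JVM class file format caps that number at 65535 anyway - so infinity really is off the table.

```java
import java.lang.reflect.Method;

public class MethodScan {
    // O(n) in the number of declared methods n. In practice n is tiny,
    // and the class file format bounds it at 65535 regardless.
    static int countMethods(Class<?> cls) {
        int count = 0;
        for (Method m : cls.getDeclaredMethods()) {
            count++; // one unit of "work" per method examined
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(String.class.getName() + " has "
                + countMethods(String.class) + " declared methods");
    }
}
```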

Unlike biochemistry, computers (and hence some computer science) are actually fun to learn. Few people become accomplished biochemists in their spare time, but plenty of folks gain useful computer skills just by toying around. It's a pretty thin argument to say that programmers with no formal computer science can't make good computer programs. Of course they can. Languages like Java and C# have built-in sandboxes to protect us from ourselves. Good algorithms are buried away in collections libraries. How many programmers really know how a binary tree compares to a hashtable for searching? Does it really matter in commercial software? You might say yes, but the truth is most software in existence today is written far from algorithmically optimally. In other words, it's "good enough". That may stink, but it's also the truth. Then again, it may NOT stink - what's the return on investment of pristine computer science in business applications?
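As a small illustration of those buried-away algorithms, Java's own collections hide two completely different search strategies behind the same Map interface - a red-black tree in TreeMap (O(log n) lookups) and a hash table in HashMap (expected O(1)) - and for typical business-sized data, either one is "good enough":

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class LookupDemo {
    public static void main(String[] args) {
        Map<Integer, String> tree = new TreeMap<>(); // red-black tree: O(log n) get
        Map<Integer, String> hash = new HashMap<>(); // hash table: expected O(1) get
        for (int i = 0; i < 10_000; i++) {
            tree.put(i, "value-" + i);
            hash.put(i, "value-" + i);
        }
        // Same answer either way; only the work done underneath differs.
        System.out.println(tree.get(1234).equals(hash.get(1234))); // prints "true"
    }
}
```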

The joke about the computer scientist not using email is not very interesting to an actual computer scientist. It's like asking a petroleum engineer if he knows how to drive an oil truck. Most programmers today are practitioners, because that's what businesses pay for. If overseas programmers "suck" - don't worry, they'll get better. And it probably won't mean anyone learns a whole lot more about "computer science".


originalfnerd said...

The way I heard it, programming jobs are going overseas partly because of the number of people in, e.g., India who do have Comp Sci degrees. Meanwhile Comp Sci enrollments in the U.S. are falling. Merry Christmas!

Anonymous said...

The "joke" about the computer scientist not using email is not really much of a joke - see, for example, Dijkstra's life story (http://en.wikipedia.org/wiki/Edsger_Dijkstra). One of the greatest computer scientists.


Hîthwen Fëadür said...

originalfnerd, and because in some countries overseas, people who do have CS degrees are underpaid :p