Thursday, March 19, 2020

Tech Book Face Off: Facts And Fallacies Of Software Engineering Vs. Programming Pearls 2

Since I've been hitting the tech books pretty hard for a while now, for this Tech Book Face Off I wanted to take a bit of a breather and do a couple of relatively easy reads. These books have been on my to-read list for some time, so I decided to finally check them out. The first one, Facts and Fallacies of Software Engineering by Robert L. Glass, is a book in a similar vein to The Pragmatic Programmer in that it relates various tidbits of advice on the craft of software engineering. As for Programming Pearls 2 by Jon Bentley, this book surprised me. I thought it would be somewhat similar to Facts and Fallacies, just more focused on instructive programming examples than on the software engineering field at large, but it turned out to be quite a bit different, as we'll see in this review.


Facts and Fallacies of Software Engineering

Robert L. Glass is an odd duck. His writing is at the same time strongly opinionated and calmly easygoing. He adamantly argues for his observations, presented as 55 facts with 10 fallacies thrown in at the end, but his discussion of each one is quite conversational. He admits that not everyone will agree with his facts, even though he provides evidence and sources for them. (Well, he does for a majority of them, at least.) I found this book a quick, enjoyable read, and it was worthwhile even if I didn't always agree with his propositions because they always made me think. I'd much rather read a book that I sometimes disagree with, as long as it challenges me, than a book that's poorly written and says everything I want to hear.

I'm not going to relate every fact and fallacy from the book here, since that would make this review nearly as long as the book, but they do cover the gamut of software engineering. I'll describe them in broad strokes and discuss a few that I found especially interesting.

The first chapter deals with software management related things. The facts are broken up into smaller sections of people (the software engineers), tools and techniques, estimation, reuse, and complexity. A full 22 facts are covered in this chapter, including one about how tools and techniques are over-hyped that I found particularly thought-provoking. Even back when this book was written in 2002, tools were being promoted as a panacea for software development problems while they were simultaneously showing diminishing returns, and Glass was having none of it:
Time was, way back when, that new software engineering ideas were really breakthroughs. High-order programming languages. Automated tools like debuggers. General-purpose operating systems. That was then (the 1950s). This is now. The era of breakthrough techniques, the things that Fred Brooks (1987) referred to as silver bullets, is long since over. Oh, we may have fourth-generation languages ("programming without programmers") and CASE tools ("the automation of programming") and object orientation ("the best way to build software") and Extreme Programming ("the future of the field") and whatever the breakthrough du jour is. But, in spite of the blather surrounding their announcement and advocacy, those things are simply not that dramatically helpful in our ability to build software.
What's most interesting is that 17 years later, the hype machine hasn't stopped, and it shows no signs of slowing down. I wouldn't say that's surprising, given that software engineering is such a massive sector, and people continue to try to make money off of it however they can, but it shows how relevant this book still is. Advances in software engineering ideas do still happen, but they are still incremental. It isn't that incremental is bad, but it is all that we should expect now. Productivity free lunches aren't likely to come about anymore until the next breakthrough technology happens, and that may not be software but something else entirely.

Chapter 2 is about the software life cycle with sections on requirements, design, coding, error removal, testing, reviews and inspections, and maintenance. Maintenance in particular is a fascinating subject in software because it ends up taking the majority of the time and cost of any given project, mostly without our realizing it. It's also quite difficult and no one wants to do it because the documentation sucks. There's a reason for that:
To solve those problems, software people have invented the notion of maintenance documentation—documentation that describes how a program works and why it works that way. Often such documentation starts with the original software design document and builds on that. But here we run into another software phenomenon. Although everyone accepts the need for maintenance documentation, its creation is usually the first piece of baggage thrown overboard when a software project gets in cost or schedule trouble. As a result, the number of software systems with adequate maintenance documentation is nearly nil.
Here we can see both Glass' conversational writing style and the reasonable way of thinking that he shows in most of his facts and fallacies. It's hard to argue with this one, especially because I think most of us have been in similar situations.

The next chapter is about software quality, including sections on quality, reliability, and efficiency. Like the other chapters, these facts are mostly obvious to anyone who has been working in the field for more than a few years, but it's always good to refresh your memory on these ideas. The last chapter of facts is just a single fact on research: many researchers advocate rather than investigate. I can't say whether or not this is true from my own experience or reading, but I'm not too concerned about it.

Starting with chapter 5, the next three chapters deal with the 10 fallacies. Chapter 5 loops back around to management with similar sections to the first chapter. The fallacies include things like, "you can manage quality into a software product," and "software needs more methodologies" that are hard to argue with. You can't, and it doesn't. The next chapter mirrors chapter 2 with some fallacies on the software life cycle. I thought he was unfair in his discussion of the fallacy, "given enough eyeballs, all bugs are shallow" by saying:
This is probably just wordplay. But it is patently obvious that some bugs are more shallow than others and that that depth does not change, no matter how many people are seeking them. The only reason for mentioning this particular reason here is that too many people treat all bugs as if their consequences were all alike, and we have already seen earlier in this book that the severity of a bug is extremely important to what we should be doing about it. Pretending that turning scads of debuggers loose will somehow reduce the impact of our bugs is misleading at best.
This seems like a deliberate misinterpretation of the idea of the quote. It's not saying that more eyeballs will make bugs less critical. It's saying that any given bug is easier to find if more people are looking for it because the more people that look for the bug, the more likely that a person with the right expertise will see it and be able to fix it quickly. Or it will be more likely that someone looking for the bug will randomly happen to look in the right place and spot its signature quickly. That said, I think there are still issues with depending on this approach for open source software development, just for other reasons. Most open source projects don't have the luxury of having hundreds or thousands of programmers working on them. Frankly, most projects are lucky to have more than one programmer working on them, so the responsibility of finding and fixing bugs still falls on the programmer writing the code. Even if the project has high visibility, pull requests need to be of high quality, or more bugs will be introduced over time than will be fixed and the whole project will degrade.

I also took issue with the fallacy in the last chapter: "you teach people how to program by showing them how to write programs." Don't get me wrong, I do think this statement is false, but for different reasons than Glass gave. He thinks it's more important to teach budding programmers how to read code and that academia isn't doing that:
I know of no academic institution, or even any textbook, that takes a reading-before-writing approach. In fact, the standard curricula for the various computing fields—computer science, software engineering, information systems—all include courses in writing programs and none in reading them. 
I disagree on three counts. First, let's get the easy one out of the way. Programming is about more than writing or reading code. Teaching programming involves teaching critical thinking skills, problem solving, systems thinking, user psychology, and so much more! Second, and more directly related to his reasoning, the exact same approach is used in mathematics. Schools teach students how to write math solutions well before teaching them how to read math solutions. Most people will never get to the point of reading and understanding mathematical proofs, but everyone starts with solving basic arithmetic problems. That's how we start in programming, too—by writing basic programs.

Third, I would argue that universities and textbooks are, in fact, teaching students to read programs at the same time as writing them. The examples in the books are all there to read, and the student must read and understand them in order to write their own functioning programs. The first example programs may be short and simple, but I would hardly expect someone brand-new to programming to read hundreds of lines of code without first being taught the syntax. Learning to read is unique in that the student is being taught a skill that is required in order to learn most other skills, so of course we learn to read before we learn to write, but not much earlier. We start learning both in kindergarten, right? Besides, most people don't know how to read effectively anyway, even though they technically learn to read first, so I don't see why a pure reading-before-writing approach to programming would necessarily be better than what is currently being done.

As you can see, some topics in this book are quite controversial, and that's why I really enjoyed it. No one who reads this book will agree with everything, and that's okay. It's meant to raise a debate, and Glass does a great job of presenting a strong case that you can take a stance against if you disagree. It got me thinking over ideas that I've held on to for a long time, and we all need that from time to time. It's also a fairly short book, so there should be no excuse to not read through it. I highly recommend it.

Programming Pearls 2

In the introduction I said this book surprised me, and what I meant by that was that it was not at all the book that I expected. From the title alone, I was expecting a book similar to Facts and Fallacies of Software Engineering in that it would relate a number of experiences and advice from Jon Bentley's career about how to do software development more effectively. I suppose that's what this book is, in a way, but it's more of a combination of an informal algorithms book and a practice set of programming problems.

It's not nearly as thorough or rigorous as Introduction to Algorithms (CLRS) or Algorithms by Sedgewick, but it gives a passable review of most of the fundamental algorithms for sorting, searching, and storing data. The first chapter starts off with an interesting little algorithm that I had never seen before on sorting a list of numbers with a restricted range using an array of bits. In this case it was telephone numbers, and it was accompanied by a story about how he was helping a new colleague with a problem, but the algorithm itself could be useful in plenty of situations. 
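To give a feel for the bit-array trick, here is a minimal sketch in Python. It is my own illustration rather than Bentley's code, and the function name `bitmap_sort` and the `max_value` parameter are made up for this example. It assumes the inputs are distinct non-negative integers below a known limit, which is roughly the situation with a file of phone numbers.

```python
def bitmap_sort(numbers, max_value):
    """Sort distinct non-negative integers below max_value with one bit per possible value.

    Hypothetical helper for illustration; duplicates collapse to a single entry.
    """
    bits = bytearray((max_value + 7) // 8)  # every bit starts out as 0
    for n in numbers:
        if not 0 <= n < max_value:
            raise ValueError(f"{n} is outside the range [0, {max_value})")
        bits[n >> 3] |= 1 << (n & 7)  # mark n as present
    # Walking the bits from low to high yields the values already in order.
    return [i for i in range(max_value) if bits[i >> 3] & (1 << (i & 7))]


print(bitmap_sort([5, 1, 9, 3], 10))
```

The appeal is that the work grows with the number of inputs plus the size of the range, and the memory cost is one bit per possible value instead of a full integer per input.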

The next chapter continued the thread from the first chapter with a few more problems that could be neatly solved with novel algorithms, like finding all possible anagrams or swapping unequal halves of an array. The third chapter covered ways to make programs much more efficient by correctly structuring the data that was being manipulated. The classic example is using an array instead of a set of numbered variables, but Bentley gave other examples as well, like creating a simple template language for form letters. 
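For the array-swapping problem, one well-known approach is to do it with three in-place reversals. The sketch below is my own Python rendering of that idea, not code from the book, and the names `reverse` and `swap_halves` are just placeholders I picked.

```python
def reverse(a, lo, hi):
    """Reverse a[lo..hi] in place (both ends inclusive)."""
    while lo < hi:
        a[lo], a[hi] = a[hi], a[lo]
        lo += 1
        hi -= 1


def swap_halves(a, k):
    """Move the first k items of list a to the end, in place, using three reversals."""
    n = len(a)
    if n == 0:
        return a
    k %= n
    reverse(a, 0, k - 1)   # reverse the left part
    reverse(a, k, n - 1)   # reverse the right part
    reverse(a, 0, n - 1)   # reverse the whole thing to put both parts in order
    return a


print("".join(swap_halves(list("abcdefgh"), 3)))  # prints "defghabc"
```

It needs no extra array, and each element is moved at most twice, which is what makes it such a neat little solution.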

Chapter 4 was all about program correctness, using the binary search algorithm as a conduit for discussing the pitfalls of complexity and how to formally verify a program (or at least a function, since formal verification just doesn't scale). The last chapter in this first part of the book quickly covers testing, debugging, and performance timing. That completed the preliminaries, which is what the first part of the book was about, and throughout these chapters Bentley had short, direct advice about how good programming was about balance:
Good programmers are a little bit lazy: they sit back and wait for an insight rather than rushing forward with their first idea. That must, of course, be balanced with the initiative to code at the proper time. The real skill, though, is knowing the proper time. That judgment comes only with the experience of solving problems and reflecting on their solutions.
I think some later writers in the field took this idea of the lazy programmer to the extreme, but I like this nicely moderated perspective more. He had a similarly measured view about performance optimizations:
Some programmers pay too much attention to efficiency; by worrying too soon about little "optimizations" they create ruthlessly clever programs that are insidiously difficult to maintain. Others pay too little attention; they end up with beautifully structured programs that are utterly inefficient and therefore useless. Good programmers keep efficiency in context.
There are no universal answers in programming, so there shouldn't be any universal advice, either. These comments make the point that you can't turn off your brain when programming. You have to constantly consider everything that would have an effect on the problem at hand in order to come to a more optimal solution.
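Going back to the binary search from chapter 4 for a moment, here is a small Python sketch of the kind of function that chapter reasons about, with the loop invariant written out as a comment. This is my own version, not the book's code, and it assumes the input list is already sorted in ascending order.

```python
def binary_search(items, target):
    """Return an index of target in the sorted list items, or -1 if it is absent."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        # Invariant: if target is anywhere in items, it is within items[lo..hi].
        mid = lo + (hi - lo) // 2
        if items[mid] < target:
            lo = mid + 1   # target can only be to the right of mid
        elif items[mid] > target:
            hi = mid - 1   # target can only be to the left of mid
        else:
            return mid
    return -1              # the range is empty, so target is not present


evens = [0, 2, 4, 6, 8]
print(binary_search(evens, 6), binary_search(evens, 5))  # prints "3 -1"
```

Every branch either returns or shrinks the range while keeping the invariant true, and that is the whole argument for why the function is correct and terminates. It's short, but it's surprisingly easy to get one of those plus-or-minus ones wrong.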

The second part of the book is all about performance. Chapter 6 kicked things off with a look at the n-body problem in physics for simulating the forces that bodies exert on one another. Making the simulation fast was not only about developing a good algorithm, but also using the right data structure, tuning the code for the machine it was running on, and optimizing performance critical loops in assembly. A recurring theme in the book was that there's no silver bullet, and this case study exemplified that with its multifaceted optimization process.

The rest of the chapters in this section expanded on the ideas brought up in chapter 6. Chapter 7 talks about how to estimate with back of the envelope calculations. Chapter 8 discusses various algorithm design techniques including the all important divide-and-conquer approach. Chapter 9 delves into when and how you should do code tuning to get the biggest benefit. Chapter 10 looks at how performance can be improved by reducing space, both of the program code and the data that it operates on. Every chapter had succinct little examples and highly condensed code to show the essence of optimized programs, along with interludes of advice like this:
[E]very now and then, thinking hard about compact programs can be profitable. Sometimes the thought gives new insight that makes the program simpler. Reducing space often has desirable side-effects on run time: smaller programs are faster to load and fit more easily into a cache, and less data to manipulate usually means less time to manipulate it. The time required to transmit data across a network is usually directly proportional to the size of the data. Even with cheap memories, space can be critical. Tiny machines (such as those found in toys and household appliances) still have tiny memories.
It's good to remember that not all programming is done on massively powerful processors, even today with a supercomputer on every desk and in every pocket. We have just as many small processors with limited memory and a tight power budget doing plenty of complex calculation tasks.

That brings us to the last section of the book on five interesting programming problems. The first one is on sorting with quicksort. The second one is about how to draw a random sample from a larger collection. The last three are on searching, heaps, and strings. The last problem dealing with strings was a great way to end because it was actually about how to teach a program to generate English words and sentences using example material. It was a chapter on machine learning written before it was cool.
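To make the text-generation idea a bit more concrete, here is a toy word-level sketch in Python. I'm assuming a simple Markov-chain style approach where each word is picked at random from the words that followed the previous word in the example material. The function names `build_chain` and `generate` are hypothetical, and this is a simplification for illustration, not a reproduction of the book's program.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it somewhere in text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, max_words, rng=None):
    """Walk the chain from start, choosing each next word at random."""
    if rng is None:
        rng = random.Random()
    word = start
    output = [word]
    # Stop at max_words, or earlier if the current word has no known follower.
    while len(output) < max_words and chain.get(word):
        word = rng.choice(chain[word])
        output.append(word)
    return " ".join(output)


sample = "the cat sat on the mat and the cat ran to the door"
print(generate(build_chain(sample), "the", 12))
```

Because followers are stored with repeats, a word that often follows another in the sample is proportionally more likely to be chosen. Keying on the previous two or three words instead of one makes the output sound more like the source.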

On the whole, this book was a fairly enjoyable, quick read. Bentley is clear and to the point, even to the point of being abrupt. As an algorithms book, it's not up to the level of the more formal algorithms books, and it wouldn't be very useful as a first book on that material. It is a good summary of the basic sorting and searching algorithms with some interesting unique algorithms thrown in to spice things up, making it a decent review for the more experienced programmer. If you're looking for an easy second or third book on algorithms to peruse, it's worth picking up.


So it turned out that these two books are not directly comparable, but hey, it happens. They were both enjoyable reads. I would say Facts and Fallacies of Software Engineering was somewhat more enjoyable than Programming Pearls 2 with Glass' way of challenging your assumptions and long-held beliefs, but both books are worth a look in their own right. Programming Pearls 2 does a good job of reviewing the field of algorithms with a few novel ideas thrown into the mix. It all depends on what you're interested in at the moment: high-level software engineering issues or algorithmic programming problems.
