2021-02-23

TDD practice project

Among the programming videos I've watched in the last few months there was an entire multi-day workshop given by Robert "Uncle Bob" Martin. He's an entertaining speaker. Of course, most of the presentation covered the topics he's been writing about for the last couple of decades, but the part that really caught my attention was his demonstration of the coding process for test-driven design.1 I'd like to take the idea for a test drive.

Now, I've been conscientiously adding testing to my personal process for years, and I've long included bug-report-as-test in my toolbox, but I've never used the "write a failing test first" scheme that Uncle Bob demonstrated. I completely buy his claims that (a) you have to practice to learn how to do it and (b) you can't learn it on the job. I need an exercise.

Probably there are sample problems out there, but I've thought of one of my own: I'm going to write a bracket nesting validator. Moreover, I'm going to do it in stages as if I were refining an agile project. In all cases the program will accept an arbitrary number of files to process.2 The stages I intend are:3

  1. Work on a fixed set of bracket pairs ((), [], and {}) and report the filename of each file that has an error. The program will be silent by default on correct files. A report-on-correct switch is optional.
  2. Expand the error report to include the line and column number at which the error was detected.
  3. Expand the error report to include the class of error (close doesn't match open, close with no open, reach end-of-file with one or more open brackets). Bad match reports should also include the line and column number of the non-matching open bracket; end-of-file reports should include the line and column numbers of the unmatched open bracket.
  4. Allow the user to specify arbitrary character pairs to be treated as open/close pairs (including canceling or overriding the default pairs). Attempting to specify a single character more than once is an error, and a suitable diagnostic must be produced.
  5. Allow a single character (such as ' or ") to serve as both open and close for a pair; note that these pairs cannot nest without an intervening scope. That is, 'a'b'c' has two single-level pairs, not one pair inside another, but 'a"b'c'd"e' is three levels deep.
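
To make the rule in stage 5 concrete, here is a minimal sketch of how a stack can treat a character that is both opener and closer. It is only an illustration, not the design the exercise is supposed to grow into: the function name maxNestingDepth, the hard-coded pair lists, and the choice to return a depth are all assumptions made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical helper: returns the deepest nesting level reached, or
// std::nullopt if the brackets in text do not nest correctly.
std::optional<std::size_t> maxNestingDepth(std::string_view text)
{
    const std::string_view opens = "([{";
    const std::string_view closes = ")]}";
    const std::string_view symmetric = "'\""; // same character opens and closes

    std::vector<char> expected; // the closer awaited at each open level
    std::size_t deepest = 0;

    for (const char c : text)
    {
        if (symmetric.find(c) != std::string_view::npos)
        {
            // A symmetric character closes only when it is the innermost
            // open scope; otherwise it opens a new one.
            if (!expected.empty() && expected.back() == c)
                expected.pop_back();
            else
                expected.push_back(c);
        }
        else if (const auto i = opens.find(c); i != std::string_view::npos)
        {
            expected.push_back(closes[i]);
        }
        else if (closes.find(c) != std::string_view::npos)
        {
            if (expected.empty() || expected.back() != c)
                return std::nullopt; // close with no open, or wrong close
            expected.pop_back();
        }
        deepest = std::max(deepest, expected.size());
    }

    if (!expected.empty())
        return std::nullopt; // end of input with something still open
    return deepest;
}
```

With this sketch, 'a'b'c' never gets deeper than one level, while 'a"b'c'd"e' reaches three, because the second ' arrives while a " is the innermost open scope and so has to open rather than close.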

I haven't even set up a repository yet and I'm already struggling with the conflict between my habitual way of working and the Process-with-a-capital-P I'm supposed to be exploring. On the whiteboard in my home office is a sketched-out scheme for five data structures that will collectively support at least version (4) of the spec, which I started drawing automatically almost as soon as the idea occurred to me. But I think I'm supposed to let most of that "just happen", aren't I?

Argh! This is going to be, uhm...fun?


1 I've also adopted his "Yeah! I'm a programmer!" bit as a pick-me-up for those days when even the little victories seem few and far between.

2 I'm intending to do a command-line tool, but there is nothing in here that requires it. Feel free to get that list of files from a file-picker widget and report in a GUI list of some kind.

3 The professor in me feels obliged to note that there are even more basic versions of this exercise available. Notably:

  1. Perform the matching on exactly one kind of pair. This can be done without a stack.
  2. Just count that the number of open characters equals the number of close characters without caring about order.

However, I'm not going to bother with these variants unless the "do as little as you can" process happens to pass through one of them along the way.
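
For what it's worth, the "no stack" claim for the single-pair variant comes down to a counter. This is a minimal sketch under my own assumptions (the name singlePairBalanced and the default pair are invented for the example):

```cpp
#include <string_view>

// Hypothetical helper: checks the nesting of exactly one bracket pair.
// With a single pair the stack would only ever hold copies of the same
// opener, so its size (a counter) carries all the information.
bool singlePairBalanced(std::string_view text, char open = '(', char close = ')')
{
    long depth = 0;
    for (const char c : text)
    {
        if (c == open)
            ++depth;
        else if (c == close && --depth < 0)
            return false; // close with no matching open
    }
    return depth == 0; // anything still open at the end is an error
}
```

An input like ")(" separates the two variants: the count-only version accepts it because the totals agree, while this one rejects it at the first character.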

2021-02-21

A sad end for "the party of Lincoln"

I had written a longish screed--full of invective and disdain--here. But in the end that kind of thing just isn't helpful, so let's just go with the bare facts.

I've contacted my County Clerk's office to change my party affiliation, and I shall never again vote for a candidate running as a Republican. Not one. No matter how good the candidate.

I could still vote for a person running on a platform full of the better Republican ideas and ideals, if they could earn my respect. But only if they choose a different party association.

2021-02-18

Abstraction and debugging

Working on a new feature the last couple of days, and my naive initial implementation isn't working correctly. Trying to track the problem took me into some code that we've written to interface with a legacy system we rely on. The code in question was written by a junior dev, and while it is functional it has a number of raw loops that perform simple actions. We're talking about things like constructing a container of objects used by the legacy code from a container of objects used by our core code and vice versa.

These things can also be done using the algorithms library and a lambda.

So compare two ways of doing the same thing.1 Both assume the existence of a std::vector<CoreGadget> named input. First, using a raw loop:

std::vector<LegacyWidget> output;
for (size_t i=0; i<input.size(); ++i)
{
    const CoreGadget & gadget = input[i];
    const LegacyWidget widget(gadget.data(), gadget.info());
    output.push_back(widget);
}

Second, using std::transform:

std::vector<LegacyWidget> output;
std::transform(input.begin(), input.end(), std::back_inserter(output),
    [](const CoreGadget & gadget)
    {
        const LegacyWidget widget(gadget.data(), gadget.info());
        return widget;
    } );

Now, highly respected speakers and bloggers have been encouraging the latter over the former for some time, but I've wondered if this was really a home run or just a convincing ground-ball single. Mostly that's because the std::transform version requires you to understand a lot. Iterators may be more general than indexing, but they are a thing you have to know. Similarly, lambdas may be a cleaner way of doing function pointers, but they have their own syntax and in many cases you have to understand captures.

Personally, I enjoy writing code that uses the algorithm header, and once I started doing that I quickly became comfortable and adept at reading it. But you code for your teammates (and future teammates) as much as for yourself, and my junior devs seem a little hesitant at times. So it's nice to have a quality-of-life argument in favor of the modern way of doing it, and that's what I discovered recently.

Imagine stepping your debugger through a function that performs this transformation in search of a bug. Imagine it's the third (or fourth or fifth or whatever) time through and you already know the transformation works as intended. How do you skip it?

Of course, you could just set another breakpoint beyond the transformation and continue, but with the std::transform version you could also use your debugger's "step over" feature. I'm not familiar with a debugger that has "step over loop".


1 There is also the intermediate option of using an iterator-based loop and the extra flourish of using an immediately invoked initializing lambda, but I don't think they change the trade-offs here.
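
For readers who haven't met those two variants, here is a rough sketch of each. The CoreGadget and LegacyWidget definitions are hypothetical stand-ins (the real classes aren't shown in this post), and the function names are invented for the example.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real classes; only data() and info() matter.
struct CoreGadget
{
    int dataValue;
    std::string infoText;
    int data() const { return dataValue; }
    const std::string & info() const { return infoText; }
};

struct LegacyWidget
{
    LegacyWidget(int data, std::string info)
        : dataValue(data), infoText(std::move(info)) {}
    int dataValue;
    std::string infoText;
};

// Intermediate option: still a raw loop, but over iterators instead of indices.
std::vector<LegacyWidget> convertWithIteratorLoop(const std::vector<CoreGadget> & input)
{
    std::vector<LegacyWidget> output;
    for (auto it = input.begin(); it != input.end(); ++it)
    {
        output.push_back(LegacyWidget(it->data(), it->info()));
    }
    return output;
}

// Extra flourish: the whole conversion sits inside a lambda that is called
// on the spot, so the result can be declared const.
std::vector<LegacyWidget> convertWithInitLambda(const std::vector<CoreGadget> & input)
{
    const std::vector<LegacyWidget> output = [&input]
    {
        std::vector<LegacyWidget> result;
        std::transform(input.begin(), input.end(), std::back_inserter(result),
            [](const CoreGadget & gadget)
            {
                return LegacyWidget(gadget.data(), gadget.info());
            });
        return result;
    }();
    return output;
}
```

The iterator loop is still a loop as far as the debugger is concerned, while the immediately invoked lambda is a single call expression, so each variant lands on the same side of the trade-off as its parent.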

2021-02-13

Little surprises #7: more Qt versus the preprocessor misery

Trying to be a good kid. Writing tests as I code. Testing the edge cases. "Hey, this should throw, does Qt have a test for that?" Yeah, it's QVERIFY_EXCEPTION_THROWN. Great, let's use that!

void suspiciousFunctionCallThrows()
{
    // Define badInput and otherParam;

    QVERIFY_EXCEPTION_THROWN(suspiciousFunction(badInput,otherParam), std::runtime_error);
}

It doesn't compile. Why not? Commas again, of course.

And it is conceptually easy to fix: you just create some blind wrapper that makes the offending call without taking multiple arguments.

void suspiciousFunctionCallThrows()
{
    // Define badInput and otherParam;

    std::function f = [&]{
        suspiciousFunction(badInput,otherParam);
    };

    QVERIFY_EXCEPTION_THROWN(f(), std::runtime_error);
}

Not exactly a featured stop on the "Look at our transparent tests" tour, is it?
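
The underlying mechanism can be reproduced without Qt. The sketch below uses CHECK_THROWS, a hypothetical two-argument macro I made up as a stand-in (it is not Qt's implementation), and a made-up template version of suspiciousFunction. The preprocessor splits macro arguments at every comma that isn't nested inside parentheses, and it knows nothing about angle brackets or braces, so a comma in a template argument list is enough to trip it.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a framework's "this expression must throw" macro.
// It evaluates to true only if the expression throws the given exception type.
#define CHECK_THROWS(expression, exceptiontype)        \
    [&]() -> bool {                                    \
        try { expression; }                            \
        catch (const exceptiontype &) { return true; } \
        catch (...) { return false; }                  \
        return false;                                  \
    }()

// Made-up function whose call needs a comma outside of any parentheses.
template <typename T, typename U>
void suspiciousFunction(T, U)
{
    throw std::runtime_error("bad input");
}

bool wrappedCallThrows()
{
    // This would not compile: the comma in <int, double> is not inside
    // parentheses, so the preprocessor counts three macro arguments.
    //
    //   CHECK_THROWS(suspiciousFunction<int, double>(1, 2.0), std::runtime_error);

    // The wrapper hides the comma from the preprocessor entirely.
    const auto f = [] { suspiciousFunction<int, double>(1, 2.0); };
    return CHECK_THROWS(f(), std::runtime_error);
}
```

The wrapper works because the lambda is defined outside the macro invocation, so the only text the preprocessor has to split is f() and the exception type.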

2021-02-03

Maybe type aliases are part of the happy medium.

Tension between conflicting goals is as much a part of software as it is of life in general. Today I am thinking in particular of the tension between planning ahead and generality on one hand and KISS and YAGNI on the other. Plan too little and you either end up with multiple slightly incompatible implementations of parts of your design or you metaphorically paint yourself into a corner and have to redesign at a large scale. Plan too much and you both overcomplicate and waste time writing features you never use. Somewhere in the middle is a sweet spot that you aim for.

I've been working on reining in my tendency to overplan for some years, and I'm doing a lot better these days. Except for one particular case: making things type-generic. If I'm writing a class and ask "Hmm ... what type of underlying data should this use?" and don't find an immediately obvious answer, my first reaction is to type template <typename T>.

Looked at naively, this is a good trade-off: the mental cost of writing and reading a simple template that just serves to defer the choice of underlying type is barely more than that for the untemplated code, and it increases the generality of the code. What's not to like?

But there are hidden costs: increased compile times, latent bugs,1 and the need to choose between explicit instantiation of the desired instances and header-only code. And of course, header-only code makes the compile-time issue worse. Now, I'm aware of techniques like thin templates, but using them negates my claim that the template isn't any harder to write or read than the untemplated option.

Today, while I was waiting on yet another overly long build, I had a long-overdue insight. In most cases that template class starts something like:

template <typename T>
class Thing 
{
public:
   using part_t = T;
   //...

That is, I've borrowed the habit of naming types related to my classes using type aliases (seen in the standard library and elsewhere). But here is the crucial observation: I could write Thing without templates and still use the type alias with an interim choice for the type:

class Thing 
{
public:
   using part_t = float; // Just pick something for now...
   //...

If I change my mind about the type, fixing it is a one-line edit.2 And if I find that I need multiple choices later, converting the class to a template is a straightforward refactoring step, but I don't pay the template costs until I actually make that call.
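
Here is a rough sketch of what that later refactoring step might look like. Everything beyond Thing and part_t is hypothetical: the members add and total, the name BasicThing, and the before/after namespaces (which exist only so both versions fit in one listing) are invented for the example.

```cpp
#include <vector>

namespace before
{
    // No template; every use of the underlying type goes through the alias.
    class Thing
    {
    public:
        using part_t = float; // Just pick something for now...

        void add(part_t part) { parts_.push_back(part); }

        part_t total() const
        {
            part_t sum{};
            for (const part_t part : parts_) sum += part;
            return sum;
        }

    private:
        std::vector<part_t> parts_;
    };
}

namespace after
{
    // The class body is unchanged apart from the template header and the
    // right-hand side of the alias.
    template <typename T>
    class BasicThing
    {
    public:
        using part_t = T;

        void add(part_t part) { parts_.push_back(part); }

        part_t total() const
        {
            part_t sum{};
            for (const part_t part : parts_) sum += part;
            return sum;
        }

    private:
        std::vector<part_t> parts_;
    };

    // Existing client code keeps compiling against the old name.
    using Thing = BasicThing<float>;
}
```

Because the body only ever mentions part_t, the conversion touches two lines of the class plus one new alias, and callers that wrote Thing::part_t don't change at all.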


1 Even latent compile-time bugs. I've had to fix compile-time bugs in code that's been in the repository for months because we finally instantiated it with a type where the bug surfaced.

2 As long as I am consistent in using the alias, but I'm trying to do that for readability anyway.