2020-05-29

Worlds in conflict

The world of adults has fragile things of significant monetary or emotional value. It has collections of small related objects that need to stay together. It has complex systems that need to be operated correctly lest they break. The world of a couple who've been together but childless for more than twenty years has a lot of these things tucked into nooks and crannies all over the place.

The world of toddlers has none of these things. Instead it has fascinating percussion instruments and whatsits that can be swung around. It has piles and full containers that are fun to scatter across the floor and abandon in search of the next entertainment. It has buttons and dials and levers that cry out to be fully exercised as they come to attention. And of course it has lots of climbing challenges and various mobile objects that can be used to surmount them.

Which means that the adult world does not have enough secure high-places for all the things that need keeping out of the way of an agile toddler.

In short: we're losing.

2020-05-20

Why I still write low-level tests

I've seen a few places on-line where people recommend against writing "unit tests". [1 2 3 4]

At first blush that seems like a strange recommendation, as writing tests is a pillar of modern software engineering, isn't it? But when you read on you find that they have various well-thought-out—if rather more specific—recommendations.1

To take the TDD argument for instance: you are working to meet a set of predefined tests that tell you if your program is working. In other words, the tests which are driving the design are (by their nature) acceptance tests. Such tests (should!) specify the user's view of what the program does but not the low-level implementation details. Testing the implementation details locks you into a particular implementation, and you don't want to do that.

And people have other good reasons for their advice, too. Traditional low-level tests give you very little coverage per test, don't test the interactions of components, and don't mimic user behavior. Put into less inflammatory language, the recommendations are more like
  • Don't lock yourself into a particular implementation with your tests.
  • Write tests with non-trivial coverage.
  • Write tests that mirror actual use patterns.
All of which effectively push against testing low-level code units.


Why I'm not going to stop writing low-level tests


I am employed as a scientific programmer, which means that the codes I work on tend to have fiddly requirements for reproducibility and precision in floating point calculations, tend to execute moderately complex decision making,2 and are often pretty deeply layered. None of that is ideal, but the domain drives codes in that direction.

So, a common enough scenario in my end of the business is
  1. Write a set of utility routines handling a domain specific computation.
  2. Start using them to solve parts of your main problem. End up with several different bits of code that depend on the utility code and seem to work.
  3. Sometime later, make a change in one of the upper layers and a bug emerges.
  4. Waste a lot of time looking at the new code without luck.
  5. Eventually trace the issue to a poorly handled corner or edge case in the utility code.
  6. Bang head on desk.
Much of this pain can be avoided by knowing that you can reason about the high level code because you know that the low-level code does what it says on the can. Which you accomplish by having a pretty exhaustive set of tests for the utility code.


Re-framing


As above, I'm going to dig into the arguments related to TDD as an example.

In the framing of some TDD proponents that would be testing an "implementation detail" and a bad thing. I assert that the problem isn't testing low-level details, it's treating the tests of low-level details as having the same immutable character as the acceptance tests (that is, treating them as enforcing a particular implementation instead of understanding that they test a particular implementation and will be replaced if you change the implementation).

We can frame this in a couple of ways
  1. We can maintain a categorical separation between acceptance tests that state what we have to accomplish and implementation tests which provide assurance that the-way-we-are-doing-it-right-now is working. The former are non-negotiable and the latter are fungible and exist only as long as the related implementation exists.
  2. We can conceive of each sub-module as its own TDD project with its own tests, but be prepared to swap underlying modules if we want to (which means swapping the tests along the way, because the tests go with the module).
Fundamentally both points of view accomplish the same thing. Which framing is most appropriate depends a bit on the structure of your code and how your build system works.3

Either way I end up with two kinds of tests.

And when I look at the other arguments I end up with the same conclusions:
I still need my low-level tests, I just need some other tests as well.

And we have names for them. I've already used "acceptance tests", which are typically end-to-end or nearly, generally have relatively large coverage per test, and often mimic actual use patterns. "Integration tests" check that components play well together and test a lot more code in one go than traditional "unit" tests. "Regression tests" help you audit changes in your user-visible output, tend to have high coverage, and are generally driven by use cases.

In any case "don't write low-level tests" is the wrong lesson. The lesson you should take here is that tests come in multiple kinds with multiple purposes. Some tests are for the clients, some are for the developers, and some are for both. You need to figure out which are appropriate for your project and include all that qualify.


1 The authors can be forgiven for phrasing the title in so jarring a way. We all hate click-bait but we do want to draw the audience in.

2 Really complex algorithms are usually limited to a few places and almost always exist in libraries rather than written bespoke for each project.

3 I suppose that if your Process-with-a-capital-P is Test-Driven Design you want the build system to enforce the distinction for you so that you can't just wipe away an acceptance test thinking it belongs to an implementation detail.

2020-05-15

Paradoxical success

Many parts of the US are starting to significantly relax public health restrictions related to the virus. To judge from the infection-rate data one finds batted about the web, some of them are being more responsible than others.

My particular corner of the world is in a weird place. The state has been largely successful in slowing the spread of the virus. Excepting one locality there haven't been any high-intensity outbreaks and the hospitals have not been overwhelmed.

Yeah! Go us! Whoo! Hoo!

But with something as contagious as this a large fraction of the population is going to get it eventually, so slowing the spread means it hangs around longer. Our hospitalization rate is actually still rising, and we're not ready for large scale relaxation of the restrictions. The Governor's existing order has been tweaked and extended.

My employer has established rules for returning to work at the office (though they would be physically challenging at the actual facilities we have in this state), but we are allowed and encouraged to continue with full-time telework.

2020-05-10

On the sleep habits of celestial beings

Judging from my daughter's behavior I conclude that her shoulder-devil is a night-owl and the angelic counterpart is an early riser.

2020-05-09

Choosing a new build system

I've started a couple of little software projects which might evolve into something worth putting out in public. Indeed, if I'm committed to open source approaches I should probably put them out there fairly early on. But even with the bare bones, just-preparing-the-ground state of these projects I'd like potential users and contributors to have a smooth path to seeing what little functionality is actually done. Which means having a well designed and at least somewhat cross-platform build.

Now, I'm a Unix guy. I can write a simple makefile starting from a bare editor without breaking a sweat. I've maintained or extended the make-based build on projects with several hundred-thousand lines of code and many hundreds of source files targeting Unix, Windows, and MacOS. I respect make. But...I've maintained or extended the make-based build on projects with several hundred-thousand lines of code and many hundreds of source files targeting Unix, Windows, and MacOS. So I know how hard it is to get a make-based build to scale up without hiccups and what a pain it is to get it to do cross-platform smoothly and reliably. There ought to be something better.

[Image: xkcd "Standards". Hat-tip to Randall Munroe.]
I've started looking into what options are available. If you've asked this question any time in the last fifteen years or so you know exactly what I've found.

Lots and lots of contenders, lots and lots of reviews and comparisons, and lots and lots of opinions. But nothing like a consensus. A great many of the contenders are stagnant or abandoned, while others never seem to have evolved beyond some niche or another. I suppose we can conclude that a lot of people think this issue is a pain point and that the problem is harder than it looks. Certainly building has a lot of inherent complexity, and cross-platform building is more complex still.

I'm tempted by meson (and I really appreciate Evan Martin's philosophical position about separating a fast DAG-walking component from an intelligent decision-making component), but I'm far from confident that it is the best choice. I'm also tempted by tup simply because I'm impressed by the improved asymptotic performance of the underlying process1 but I think the syntax is a little wonky.

Any thoughts?

Aside: If any one of my handful of occasional readers thinks that cmake is the obvious choice, I'll warn you that you're facing an uphill battle to convince me.2 I've had the dubious pleasure of getting to tweak the build of projects supported by that thing and I like it rather less than qmake (which I don't like much and don't consider a contender for projects not involving Qt).




1 There is an aspect of the linked paper that I really don't like. A couple of times the authors criticize Recursive Make Considered Harmful for insisting that you have to have the full DAG for a correct build, and they claim that they've proved that this isn't true. Except that they do have and use the full graph (remember that they have to cache the graph). They've found a way to avoid walking the whole thing (by using the change list to select the affected sub-trees), but they have to have it to start with or they have to build it before they begin. This is basically a problem with language and the way they form their claims, but it is also basically wrong. When writing a paper you certainly want to promote your work but you shouldn't over-claim, which is exactly what they've done there.

2 I've read many claims that the syntax of cmake 3 is so much better than that of earlier editions that people who dismissed the previous version should really look again. I find that scary because I've only ever worked with the "new" syntax and it's more than enough to put me off. Can you get Stockholm syndrome from working with difficult software tools?

2020-05-08

I didn't know that I could do that!

So, I'm watching some videos on c++. Call it continuing education.

Now, I've never formally studied c++. It wasn't part of my education, and I learned it "on the job". I have tried to read and learn as I encountered things I didn't understand, and I like to think I know a lot of things about the language. I mean, I even got a gold tag badge for it on Stack Overflow. Surely I'm not completely ignorant. Right?

One of the talks I'm watching is highlighting some parts of the Core Guidelines. The presenter brings up a case that I actually faced recently.
  1. You have a class with a const accessor method that returns the result of a computation (i.e. it is not accessing a feature of the current implementation but a feature of the model that can be deduced from stored elements of the implementation).
  2. That deduction is expensive enough that you want to cache the result.
  3. You run into a conflict between the need to update the value of the cache and the desire to have the accessor marked const because it isn't changing any feature of the underlying model.
I'm happy to say that I didn't do the very bad thing (casting away the const), but I did lose the const marking on the accessor because I didn't know about mutable.

Crap.

Learn something every day.

2020-05-06

Human exception handlers

I've done a pretty good job of not contributing to StackExchange over the last few months, but I haven't been able to break myself of the habit of looking in. At first I could claim to be holding out hope that I would see something on meta.stackexchange that would let me change my mind, but at this point it's pretty clear that I just don't want to give it all up yet.

So I see things that, in the past, would have required me to do something and I have to sit on my hands instead. (Aside: they still haven't pulled my diamond. It's been more than three months!)

Today I see a situation that requires, in Jeff's words, exception handling: there is a user on a site who is posting dozens of pretty bad answers to old questions. By "pretty bad" I mean a range of things including "more wrong than right" through "pointless and not adding anything" to "possibly a useful insight but so badly phrased as to be more confusing than helpful". And they are putting up several of these an hour (I guess they have some time at home...). It's not being very productive on the rep front but in the past two days they've managed to net more than one hundred rep this way (averaging a bit over two points per post).

It's not something an ordinary user can do much about, because any attempt to do so would be targeted voting or a sustained pattern of negative comments (which is to say "not nice"). It needs a moderator to step in and put the brakes on it. Stack Exchange moderators can use the contact-user and suspension tools for on-going poor quality contributions, and this seems like an exemplar for the need.

A lot of things that call for moderator action are like this: a user exhibits a behavior that could normally be treated by other mechanisms, but with unusual intensity or in a manner that reduces the effectiveness of the usual controls.

2020-05-04

Color me suspicious...

My employer strives to maintain a pretty high level of IT security. Our issued computing devices have encryption at rest and two-factor sign-in. The networks in our offices pipe everything through a VPN to corporate, and we're not to connect the laptops to public networks except when we enable the VPN, which requires two-factor sign-in. Accessing corporate mail through the web interface requires two-factor identification. And so on.

It's probably not perfect,1 but there are clearly some professionals thinking hard about this issue back at the big office, which is reassuring.2

They've recently added the on-line time reporting system to the two-factor menagerie.3

But here's the thing: when I sign into the OS on my laptop, the VPN, and the webmail, I enter the code off the two-factor fob first and my password second. This new system is doing it the other way 'round. I don't consider myself expert enough on these issues to have an opinion about which way is better, but I'm pretty sure that doing it both ways can't be right.



1 As a working postulate I think "It's never going to be perfect" is a good way to understand IT security.

2 Actually, the bit where I don't have root on my development machine is pretty &^%$ annoying. But IT isn't being unreasonable for its own sake on this: I can have root on a virtual machine and develop there as long as the virtual machine is behind the firewall on the "real" one.

3 And at a good time, too. It's been getting a lot more access from outside the office lately.