2020-11-20

Success! Kinda.

So, my efforts to flatpak clang-format have succeeded. If you are willing to interpret 750MB for one program as success.

But hey, the first attempt generated a 17GB package (and took three hours to build on a fast four-core machine, compared to 15 minutes for the current iteration), so this looks pretty good by comparison. But I'm still remembering that my first $N$ Linux boxes had smaller disks than that. Including the one I analyzed my dissertation data on.

Anyway, this being something I did for work I can't share the actual build manifest with you, but I can offer some pieces of advice:

  • Get the code from the git repository rather than downloading tarballs. You can do this in the source section of the manifest.
  • Don't use buildsystem: cmake (or cmake-ninja), which will just build everything; instead use buildsystem: simple and issue the build commands yourself. This lets you run ninja clang-format and thereby get the build system to pick out the minimal set of targets for you.
  • To figure out exactly what was needed by way of installation I put some finds among the build commands and used flatpak-builder -v so that I could see all the output. With the build down to a quarter of an hour, several passes through the build process to get it right are cumbersome but tolerable. Remember that you are installing libraries as well as the executable.
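
Since the real manifest can't be shared, here is a rough sketch of what the advice above might look like when put together. Everything specific in it is an assumption made for illustration: the app-id is a placeholder, the runtime version and LLVM tag are simply ones that were current around the time of writing, and the CMake options are a plausible minimum rather than the ones actually used. It installs only the executable; the libraries mentioned in the last bullet would need their own install lines, found with the find-and-inspect approach described above.

```yaml
# Hypothetical manifest sketch, not the one used at work.
app-id: org.example.ClangFormat        # placeholder id (assumption)
runtime: org.freedesktop.Platform
runtime-version: '20.08'               # assumption: any runtime whose SDK ships cmake and ninja
sdk: org.freedesktop.Sdk
command: clang-format
modules:
  - name: clang-format
    buildsystem: simple                # issue the build commands by hand
    sources:
      - type: git                      # fetch from git rather than a tarball
        url: https://github.com/llvm/llvm-project.git
        tag: llvmorg-11.0.0            # assumption: pin whichever single version the team agrees on
    build-commands:
      # Configure LLVM with only the clang project enabled.
      - cmake -S llvm -B build -G Ninja -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS=clang
      # Build just the one target; ninja works out the minimal dependency set.
      - ninja -C build clang-format
      # Temporary aid for working out what needs installing.
      - find build/bin build/lib -maxdepth 1
      - install -Dm755 build/bin/clang-format /app/bin/clang-format
```

Running flatpak-builder -v against something like this shows the find output along with everything else, which is how the install list gets refined over several passes.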

I imagine that some size could be saved by a fully static build, but this will do, so it's no longer worth my time.


Followup: A simple "static" build of the program occupies between 2.3 and 6.8 MB on the platforms I care about. I think this may be the way to go. I put scare quotes in there because gcc doesn't like to statically link a number of libraries (glibc for one), so these things are not completely static and accordingly not cross-platform. But the build is less than fifteen minutes on my fast test machine, so we can afford to build it as needed while setting up a new repository.

2020-11-19

Likea?

Should we be dynamically or statically linking?

That question used to come up for public debate and reconsideration in computing circles every once in a while, and was generally answered by the fact that resource constraints simply let you do more with your hardware when dynamic linking is the norm. Of course, that decision doesn't come free. Library updates introducing uncaught bugs to a host of dependent programs is a thing. So is DLL Hell and its close cousin all-or-nothing-upgradeitis.

And though Moore's Law may be dead in the form that we knew and loved it decades ago, the real cost of all kinds of computing resources has come way, way down. So sometimes we have disk space and even working set (well, at least in RAM; not so much in cache) to burn.

Enter flatpak. Not just a program, but also its whole dependency chain as a separate resource!

OK. It solves a problem for software providers ("I have to figure out how many different packaging systems and get this past how many different policy review committees to make sure my product is available to all my potential users!?!") and the flip side of the problem for users ("I want to try Newthing version 2.0, but my distribution is still stuck on 1.7!"). Granted.

But the cumulative effect of using it for everything is or would be vastly wasteful. Thankfully I don't hear of very many people trying to use it for everything. Or am I just ignorant?

Anyway, my personal policy runs along the lines of "Use your distribution's single, integrated, resource-sharing environment for as much as possible, but there is this thing you can do in a pinch."

That said, I mentioned recently that I'm preparing to enforce the use of clang-format on my project at work. Only you have to pick a single version of the program or else chaos ensues; we have a diverse set of build environments, and they don't share a common version available from the official (or semi-official extended) repositories.

So I am actually trying to build a flatpak to provide clang-format1 so that my team has an easy way to install a compliant tool.

Wish me luck.


1 For future reference: left to its own devices llvm builds everything by default to the tune of almost thirty gigabytes (and takes several hours on a fast, quad-core machine to do it, too). Must tell it to build less. Probably need to do the same with clang, too.

2020-11-18

Covid getting close

Despite the notable success of my part of the country during the early months of the pandemic, we're in the thick of it now. Some combination of luck running out and people getting careless through exhaustion with the continuous state of low-level emergency, I guess.

My training as a physical scientist left me with an irresistible urge to put things into perspective in a particular numeric manner. I note that at this point the numbers in the news correspond to approximately 3% of the US population having been diagnosed with Covid19 at some point this year and a little less than one-tenth of a percent having died from it.

So it is very easy to not know anyone who has died from this thing and not particularly hard to not know anyone who has gotten it. Especially if you live in an area that had not been hard hit until recently.

On the other hand, this means we can expect Covid19 to be the third largest cause of death among Americans in 2020, outstripping all causes of violent death.

Anyway, with the virus surging in our community we've been affected in two ways this week. First, my daughter's preschool (which had re-opened in mid-summer) has closed again after two cases in a month (but apparently no transmission in the building). They won't be re-opening until after the new year. Second, two members of the team who care for our honorary Grandma are quarantining after a relative of theirs was diagnosed with the virus. The contact tracers have not called us so *we* may be in the clear, but there are three adults and four kids in our immediate circle who are at risk. Scary times.

We're trying to tighten up our practices again, and we had also started wearing masks in the common areas of the house last week (though neither the toddler nor Grandma pays much mind to that). The medicos have learned a lot about how to treat this thing in the last eight months, but our hospitals are running just about at capacity right now, which is not promising.

2020-11-13

The virtues of not being clever

I encountered a discussion of FizzBuzz (as a test of basic programming competence rather than in other contexts) out on the internet the other day. In particular the discussion covered the annoying fact that the naive ways of implementing the thing end up carrying a slight degree of inefficiency: it's hard not to have at least one "redundant" condition evaluation.

And I had a brainstorm in passing: a slightly less naive implementation that avoids that pitfall. I'm positive I'm not the first person to have thought of this, but I don't recall seeing it written down anywhere before. So here is the decision code in C++:

#include <cstdint>
#include <string>

std::string FizzBuzz(uintmax_t n)
{
    switch (n % 15)
    {
    case 0:
        return "FizzBuzz";
    case 3:
    case 6:
    case 9:
    case 12:
        return "Fizz";
    case 5:
    case 10:
        return "Buzz";
    default:
        return std::to_string(n);
    }
}

It has only one test and the switch is likely to be efficient. So, it's great, right?

Well, no. I think that it suffers relative to the naive approach in terms of clarity and conveyance of intent, and I would generally avoid this kind of cleverness in production code unless there was a big gain from using it.

You see, the switch is horribly opaque. To understand the task by reading the code one has to puzzle over the selection of cases, while naive implementations have the conditions written out explicitly.1 If performance was known to be an issue, I might look at the switch-based implementation, but until then I'd stick with one of the simpler but clearer ones.
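
For comparison, here is a minimal sketch of the sort of naive implementation I have in mind. The name NaiveFizzBuzz is made up for this illustration, and this is only one of several common ways to write it. The conditions state the rules directly, at the price of testing divisibility by three (and sometimes five) a second time:

```cpp
#include <cstdint>
#include <string>

// Hypothetical name, chosen only to keep it distinct from the switch-based FizzBuzz.
std::string NaiveFizzBuzz(std::uintmax_t n)
{
    // Divisible by both 3 and 5.
    if (n % 3 == 0 && n % 5 == 0)
        return "FizzBuzz";
    // Divisible by 3 only; this repeats the n % 3 test made above.
    if (n % 3 == 0)
        return "Fizz";
    // Divisible by 5 only; n % 5 may already have been evaluated above.
    if (n % 5 == 0)
        return "Buzz";
    // Everything else is reported as the number itself.
    return std::to_string(n);
}
```

A reader can recover the rules of the game from this version without knowing which residues modulo 15 are multiples of three or five.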

Well, look at that!

While looking up links for this post I ran into a blog post that exhibits a switch-based solution, albeit in Ruby, which means the switch has different semantics.2


1 Here I am assuming that if ( i % N == 0 ) will be read as "if i is divisible by N", which I suppose is unusual outside of programming circles. Failing that, I suppose the implementor could write a predicate isDivisibleBy(uintmax_t numerator, uintmax_t denominator).

2 I believe Ruby's switch (its case/when construct) is syntactic sugar for a chain of if statements, while the C++ construct's (often annoying) limitations allow it to be implemented under the hood in a variety of ways that may include jump tables.