2023-12-29

The limits of "Fix it when you touch it."

My main projects at work have been on minimal-spend for a few months which means I've been shifted to some feature adds for our biggest product. This thing goes back to the mid-eighties and is coded in C (updated to ANSI syntax, at least), Fortran (updated to f90, at least), C++ (with the standard containers, at least, but lots of it predates the "modern" era), and python (recently ported to python3, at least). So, yeah, it has all the issues you'd expect in a legacy codebase. Some of them in spades.1

We're basically a contract shop, so we don't do "Let's fix this entire module because it's grotty enough to be a pain", because who would pay for that? On the other hand, we really would like to have nice code, so we have a "You can fix issues with the bits you touch." policy.

Not complaining about that. My last feature add actually removed net lines of code because I replaced some really wordy, low-level stuff with calls to newer library features and factored some shared behavior into utility code to reduce repetition. So the policy makes me a happy (and perhaps even productive) programmer.

But it has its limits. The short version is that some legacy issues span a lot of code and you can't fix them locally.

Case Study

The bit of code I'm working on right now has a peculiar feature: in several places I find a std::vector<SomeStruct> paired with a count variable.2 They appear in an effectively-global3 state object and they're passed to multiple different routines as pairs. This is very much not what you'd expect in code originally written in C++.

Sometime in the past this was almost certainly a dynamic array coded in plain ol' C. And not even a struct darray {unsigned count; SomeStruct * data;}; one paired with some management functions, but a bare, manage-it-yourself-you-wimp pairing of a count and a pointer.4 But why wasn't the count discarded when transitioning to std::vector?

Finding out requires a lot of tedious, close reading of code where the pairs are used. And, of course, seeing the old C code behind the current C++.

There are several places where the vector is resized (meaning multiple extra entries are added to the end in one fell swoop) to some "bigger than we need" value. Those extra entries are filled with default data, which the code then overwrites one at a time with freshly calculated "actual" data. This is (a) a performance optimization insofar as it prevents the possibility of multiple re-sizes and copies that exists if you add entries incrementally, (b) a fairly faithful transliteration of what would have been done with the dynamic arrays in C, and (c) the wrong way to perform the trick with std::vector.5

When this was originally done in C, the "count" variable would have tracked how many entries had "good" data while the allocated size would have been known because you knew what the maximum expected size was. But the container doesn't know that you want to do that and its size() method will always return the number of objects it has (including the default-valued ones). So the manual count was still needed in places, and they kept the (effectively) global version because it was easier.
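A minimal sketch of the two approaches, assuming a hypothetical stand-in for SomeStruct and made-up helper names (fill_legacy, fill_idiomatic); the real code is considerably more tangled than this. The first function mimics the transliterated C pattern, where size() reports the padded length and a manual count has to travel with the vector. The second uses reserve and emplace_back, so the vector's own size() is the count.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real element type.
struct SomeStruct {
    explicit SomeStruct(double v = 0.0) : value(v) {}
    double value;
};

// Legacy pattern: resize to "bigger than we need", overwrite entries,
// and track the number of good entries in a separate count.
// Precondition (assumed): n_actual <= max_expected.
std::size_t fill_legacy(std::vector<SomeStruct>& items,
                        std::size_t max_expected, std::size_t n_actual) {
    assert(n_actual <= max_expected);
    items.resize(max_expected);  // default-constructs every entry
    std::size_t count = 0;
    for (std::size_t i = 0; i < n_actual; ++i) {
        items[count++].value = static_cast<double>(i);  // "actual" data
    }
    return count;  // the caller has to carry this around with the vector
}

// Same allocation behavior, but size() stays meaningful:
// reserve makes room without creating any entries.
void fill_idiomatic(std::vector<SomeStruct>& items,
                    std::size_t max_expected, std::size_t n_actual) {
    items.reserve(max_expected);
    for (std::size_t i = 0; i < n_actual; ++i) {
        items.emplace_back(static_cast<double>(i));
    }
}
```

After fill_legacy(v, 10, 3) the vector reports a size of 10 and only the returned count says that 3 entries are real; after fill_idiomatic(v, 10, 3) the vector's size() is 3 and no separate count is needed.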

Result: getting rid of the extraneous count variable means fixing half a dozen routines elsewhere in the project and I end up touching scores of lines in a dozen files. That's not "fix what you are touching anyway".

Takeaway

Some legacy maintenance is too big for purely local fixes.

Today was actually the second one of these I've looked at in the last couple of months. I was able to fix the first one mostly with global search-and-replace and only touched five files; I felt that was "local" enough for the payoff in terms of making the code more comprehensible. So I was optimistic on this one, too, but it quickly grew out of hand. Nonetheless, I may be finished and if the regression tests are clean I'm going to commit it.


1 But, honestly, I worked with worse in my physicist days.

2 For those who don't do C++, the standard vector container is an extensible array-like data structure. It maintains its own count.

3 Possibly a subject for another day, and another example of something you can't re-factor in the small.

4 In the Bad Ol' Days, a significant number of programmers would begrudge the cycles lost to function calls for that kind of thing when it was "easy" to do it inline at each site. Of course, they could have used a macro DSL for the purpose, but those are tricky and (even then) had a mixed reputation.

5 It's wrong for two reasons in general. The less important one here is that it default constructs the new values which can take cycles (in a C dynamic array using realloc means you just get whatever garbage was in the memory occupied by the new spaces so you don't pay for that). The more important issue here is that vector has reserve which unlike resize just makes sure you'll have room for the new stuff if you want to use it, meaning you can then emplace_back for the best of both worlds.

2023-10-29

You. Have. Got. To. Be. F*<#!^&. Kidding. Me.

There must be things in the communication world that make me angrier than a computerized voice prompt system that makes "typing" noises at you while it works, but right now I can't think of one.

2023-10-16

Didn't expect it to be quite that hard

I'm getting my first non-trivial experience with JavaScript. It seemed like a good idea at the time.

Backing up to explain, I wanted to write a talk for my colleagues for some time. On C++ templates. Another thing that seemed like a good idea at the time.

Anyway, there is some interest around the office in a brown-bag lunch series, so I've actually gotten serious about writing it. And I want to own it, so I'm doing it on my own time and resources, but that means I don't have a copy of PowerPoint. Not that I'd want to use PowerPoint, anyway, of course. Now in my academic days using $\LaTeX$1 would have been a no-brainer, but I've been watching conference talks given in various HTML/CSS/JavaScript engines recently and it seemed like an ideal time to try something like that out. Yep, seemed like a good idea again.

Enter reveal.js, but also a headache brought on by perfectionism.

You see, it's to be a talk about a feature of a programming language and that means code samples sprinkled throughout. No problem, reveal has support for that. Only it would be nice if the code samples were syntactically correct.2 So, I'd like to...

  1. include the text of external files in my presentation
  2. wrap that text in whatever html is needed to get reveal to mark it up as syntax highlighted code blocks
  3. be able to compile the code in a separate tool to check its correctness

Well, number three is entirely on me and it's easy. But both of the others gave me some trouble, though it turns out I didn't need to know how to do the second item at all.3

It was the inclusion problem (item 1) where I got the surprise. You see, html doesn't have an inclusion primitive. You have to use JavaScript if you want to do that on the client side.4 Fine. Apparently that was messy in the past, but modern JavaScript has fetch to get the contents of an external asset and lots of DOM niftiness to stick it where you want it with various kinds of preprocessing (as html, as script, as text, and maybe some others).

So, in the first place where I want a code sample, I write a little script that runs code = fetch("myfile.h") and then inserts it in a useful place. Reload. Nothing appears. Force reload. Still nothing. Open the inspector and find the tag. The tag is empty. What? Check paths. Force reload. Still empty. Try with "./myfile.h". And with "/home/noswampcoolers/projects/template_thing/myfile.h".

At this point I took some time off to curse many things. Including my own ignorance, but mostly other people and tools. Debugging by rant. Or something.

Finally notice the tabs in the developer tools. Particularly the one named "Console" and the red decoration thingy. Oh, there's an error; something about a "same origin policy" not letting me do what I want. Even though the data file resides not only on the same filesystem but in the same directory.

More ranting, then hit the web for some education.

Turns out the same origin policy says what external data can be loaded without a security negotiation. But it's not terribly well named because while it does put limits on the origin it also imposes limits on the protocol.5 Short version: I'm going to need a web server running on my machine to fetch stuff from the local filesystem.

The first solution I found for just serving a named directory was python3 -m http.server --directory "$PWD" 8080 > server.log 2>&1 &, but I think I could probably do the same thing with some npm invocation which would be more elegant in the sense of using the same tool for both halves of the task. It's also not worth my time right now. Python it is.

So at this point I could load some external code into my presentation.6 But it wasn't getting highlighted the way it did if I wrote it right in the html file. Argh!

Maybe it has to do with the order various scripts run in. Check the web and try moving my bit around. Lots of places. No dice.

More reading suggested that I needed to integrate my code with the framework by writing a plugin. And just for once in this saga I was smart enough to find at least three existing plugins before starting down that rabbit hole. No idea if the one I picked is the best, but it seems to work.

All in all, an enlightening but rather disheartening epic to achieve what sounded like a little thing. I'm sure there is a deep story in how it got to be that way, starting with "first to market is more important than getting it right" and moving on through various pitfalls of the standards process. I'd love to read about it if someone else would research it.


1 For a long time I had a custom set of extensions to slides, but eventually I graduated to beamer.

2 Semantically as well, of course. For those fragments that have semantics. But that's a lot less amenable to automated validation.

3 The issue I had with the second item is that I wanted to show C++ template code, and that means things surrounded by angle brackets: < and >, which are, of course, the tag syntax of html. Before I solved item one, I learned a trick for that. You can include the text as the contents of a script tag marked with the attribute type="text/template".

4 Lots of choices on the server side, but let's not go there.

5 I was too focused on solving the problem in front of me to read into the details, but I suppose this has to do with preventing externally supplied code from reading your locally stored data and sending it back to the mother-ship. Which would make sense, but it still leads to a minor absurdity.

6 Somewhere along the way I'd also taken the time to run down a DRY solution to including many snippets and adapted it to produce the same "pre-code-script" nesting I had learned for including code directly in the file. Shout out to Mustafa even though I didn't end up using the code. Plus, I enjoyed recognizing the immediately-invoked lambda from its use as an initializing idiom in C++.

2023-09-14

Really, Qt?

Qt Creator is polluting my nice clear build results window with a complaint that my build directory is not on the same level of the filesystem as my source directory. It's a new complaint even though I've been using Qt Creator for ages, because I recently had to reorganize my work a little. I have, for a long time, used a layout whereby the source, build, and distribution directories for each project are sub-directories of $HOME/source, $HOME/build, and $HOME/dist, which actually meets Qt's strange requirement.1 Alas, IT issued a new edict concerning the centralized backup system and on Windows I've been forced to move my source hierarchy to $HOME/Documents/source. Which broke Qt's little brain.

The policy is, presumably, intended to enforce a preference for out-of-source builds, which is a good idea (every build should be out of source). But it is also way, way too restrictive a way to enforce that behavior.


1 This lets me share the source directory across the Windows host, several WSL Linux installations, and the few VMs I still run, while each platform maintains its own build and distribution locations.

2023-08-29

Why isn't there a slider version of QInputDialog?

I need a quick-n-dirty UI element in my Qt-based application for selecting a number. A number with a well defined lower limit (certainly not less than one, but honestly even 5 is kinda silly) and a fuzzier but still very finite upper limit (somewhere between 60 and 100, I'd say).

Now, there is QInputDialog::getInt which is what I'm going to end up using, but ... it's a natural use-case for a slider and there seems to be no similar convenience class supplying that particular functionality. Grrrrrrr!

2023-08-28

The invention of fanfic

I've been reading The Hobbit to the child for bedtime story recently. Bit of a slog at times because of (a) her falling asleep in the most surprising places in the text, (b) frequent stops to discuss new vocabulary and previously unknown customs, and (c) lengthy interruptions to hear how she thinks the story should proceed. That latter bit came to a head tonight as we neared the end of Chapter 8. She believes that the story would be better if we dispensed with the spiders and added some unicorns and has decided to make up her own, improved version.

2023-08-05

What if it's untestable?

I've been working on evolving some code down in my project's foundation to enable user-visible improvements. My strategy involves the C++ standard library facility variant, which allows the user to store a single value at any given time, drawn from one of several different types. For those of you with the proper background this is an improved union.1 It also provides some basic infrastructure for the situation when it does not hold a value at all, and I've been diligently covering all the bases as I go along.

Now I want to test some aspects of the work I've done so far.

Testing the "variant is in a bad, bad place" code means making a variant in a bad, bad place. Only it turns out that is not at all easy.2 There isn't a facility to make one. Not brand new and made to order, nor from an existing one by any usual means. cppreference says

A variant may become valueless in the following situations:
  • (guaranteed) an exception is thrown during the initialization of the contained value during move assignment
  • (optionally) an exception is thrown during the initialization of the contained value during copy assignment
  • (optionally) an exception is thrown when initializing the contained value during a type-changing assignment
  • (optionally) an exception is thrown when initializing the contained value during a type-changing emplace
and it includes a sample of code to do the job, and a link to an online platform to compile and run it and see that it does the job. Now that code uses a custom type so I can't use it directly, but when I try to ape it using one of the types in my variant ... it doesn't work. I'm able to write code that will hit all of the last three conditions, but apparently my compiler is too smart for that.
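For reference, here is a minimal sketch of the "guaranteed" case from the list above: a type-changing move assignment where the move constructor throws. It is not the cppreference sample, and it still depends on a custom alternative type, so it would only help in a test build where such a type can be added to the variant. ThrowsOnMove, Var, and make_valueless are hypothetical names made up for the illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>
#include <variant>

// Hypothetical alternative whose move constructor always throws.
struct ThrowsOnMove {
    ThrowsOnMove() = default;
    ThrowsOnMove(const ThrowsOnMove&) = default;
    ThrowsOnMove(ThrowsOnMove&&) { throw std::runtime_error("move failed"); }
    ThrowsOnMove& operator=(const ThrowsOnMove&) = default;
    ThrowsOnMove& operator=(ThrowsOnMove&&) = default;
};

using Var = std::variant<int, ThrowsOnMove>;

// Tries to push `target` into the valueless state by move-assigning a
// variant that holds the other alternative. Assumes `target` currently
// holds an int, so the assignment has to change the contained type.
bool make_valueless(Var& target) {
    Var source{std::in_place_type<ThrowsOnMove>};  // built in place, no move
    assert(source.index() == 1);
    try {
        target = std::move(source);  // old value destroyed, then the move throws
    } catch (const std::runtime_error&) {
        // expected: the variant is left without a value
    }
    return target.valueless_by_exception();
}
```

Calling make_valueless on a Var holding an int should leave it reporting valueless_by_exception() and an index() of std::variant_npos, which is the state the handling code needs to see.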

Well, given how hard it is to access that state, maybe testing the code that handles it shouldn't be a high priority, but I'm annoyed.


1 Nor is it anything like a new idea, but it wasn't made part of the C++ standard library until the 2017 standard. Thankfully, my project has moved that far forward in time.

2 Of course this is a good thing. Except that it isn't quite impossible, so there is all that infrastructure there for handling the special case. Argh!

2023-07-31

Vicarious Endings

Shortly after we moved into our present dwelling we started sending my then two year old daughter to a local preschool. That was nearly four years ago and today was her very last day there. I've managed not to think too hard about it for the last few weeks, but driving over to pick her up today it all hit me like a ton of bittersweet bricks. We're talking Time Stands Still playing in a loop in my head and flashbacks to my own transitions through the years.

Yikes!

The funniest part of it is that we've been scaffolding this change for her for more than a year. I just somehow forgot to scaffold it for myself.

2023-07-24

Experimental

The other day I happened upon Herb Sutter's closing talk from CppCon 2022, about cppfront, his experimental attempt to provide a new syntax and set of defaults for C++.

Intriguing, I thought. If only I had a not very important project to try that out on. Well, that and some time, of course.

I've also been reading—but not working—Crafting Interpreters by Robert Nystrom.1 Up to a point not working it was going just fine because this is not my first exposure to the art of compilers and I could see what he was doing because I understood why.2 But last week, around section 25.1, I started to feel that I was losing the thread in pretty significant ways.

A marriage made in ... well, some other plane of existence, I'm sure.

I give you cpp2lox. Or at least the raw beginnings of it.

Observation: It's easy to forget how many nice things a good development environment does for you until you have to do without. Right now I have no auto-formatting and no syntax highlighting for cpp2, much less IDE support behaviors like completions, on-the-fly static analysis, etc. And it shows in the condition of the code. Well, you've been warned.

Observation the second: It's also surprisingly hard to notice violations of expectations in the experimental syntax that I'd have noticed pretty quickly in the official syntax. Trying to populate a std::vector of a user defined type was throwing some very unclear and mysterious errors that kept me confused for far longer than it should have given that the problem was simply that I hadn't defined a copy constructor for the type.


1 I highly recommend the book, which is available for free online (and you should read part of it that way to get a sense of the book's mixture of technical clarity and humor), but I recommend buying it if your financial circumstances allow. It goes through two complete builds, line-by-line. The first is implemented in Java, and parses the source to an abstract-syntax-tree then walks the tree to interpret the source. The second one is done in plain ol' C and eschews the AST in favor of generating a bytecode for a custom stack machine, which it implements to run the bytecode. I'm not done with the second half yet, and still have the closure implementation and the garbage collector ahead of me. The interesting thing to me is that despite the title the book gives you all the skills to build a source-to-metal compiler if you are so inclined.

2 Mind you, my main prior exposure was working through Jack Crenshaw's series of articles Let's Build a Compiler, which does a single-pass, immediate-code-generation, recursive-descent job on a Pascal-like language targeted at m68k assembly.

2023-07-17

Vicarious success

Smiles and selfies after the committee introduces the new doctor.

I was thrilled to be invited to attend a former student's dissertation defense last weekend. Even better, another colleague from those heady days showed up as well, so it was reunion and memories all 'round as we celebrated. I'd like to say that the talk itself was enlightening, but honestly it was magnetohydrodynamics so I was hopelessly lost from the first moment. But at least she got a really nice seminar room for the talk.

Seeing someone develop from a bright but untrained newb into a competent young professional is one of the best parts of teaching at the college level. Getting to peek in on their further success is icing on the cake. I know I've been grinning a lot when the subject comes up and can only hope I've managed to avoid being insufferable.

Even better, I know of another former student also moving through the pipeline. You know who you are. And I am definitely angling for an invitation to that defense as well.

2023-07-04

More registration bullshit

I ordered something online last month (hardly an uncommon occurrence). The vendor shipped it and gave me a tracking link which I've been checking periodically. As of July 22nd, the tracking website lists its status as

Shipment Received, Package Acceptance Pending [name of my town]
and it claims it will be delivered by the very next 7pm. Every time I look it's coming by the next 7pm. For nearly two weeks, now.

Now, I have to say that "Acceptance Pending" is mildly worrying,1 but the ongoing lack of progress is a bigger deal. A few days ago (when the status hadn't changed2 for eight days) I ran out of patience and decided to get a human being involved. I don't think that's unreasonable, do you?

Anyway, you might expect there to be some kind of link or button explicitly for escalating a case to a person. There isn't. Presumably that's a behavior engineering thing: if you make it too easy to escalate, people will avail themselves of the option enough to cost you real money. Disappointing but not terribly surprising.

So try drilling down both from the tracking page and from the home page.

The tracking pop-up does offer to let me log into the system so that I can see my "full shipment progress",3 but I have to give them lots of personal data to do that. Why? Why is knowing my name and where I'm receiving the package not enough, huh?

The "Contact Us" part of the home page has a variety of toll-free numbers to use to get put in a queue to talk to one of their call centers, which I suppose is what I'm going to have to do.


1 Did they lose the package between pulling it off the truck and sorting it for local delivery? Is the box more damaged than they're comfortable with? Something else?

2 Except for updating the projected delivery date every day to keep it at the next 7pm, of course.

3 That offer is on the "Shipment Progress" tab of the tracking pop-up. I guess it's really a Partial Shipping Progress tab. Or something.

2023-06-21

IDE feature sweetener

My most common IDE knows how to extract a selected block of text as a function or method.1 Which is great even when it misses a couple of parameters and almost magic when it works exactly as intended. But I've encountered a problem in the current bit of legacy code in front of me: I need to extract a loop body that is more than six hundred lines long. Selecting the thing is enough of a pain as to present a barrier to getting anything done.

I could really use a feature to "select current loop body" or an extract variant like "extract loop body".

Ah! Ha! Found it!

In Qt Creator, move the cursor inside the block, then use the menu item "Edit:Advanced:Select block down".


1 Well, usually. For some reason Qt Creator occasionally just doesn't offer that option in the re-factor sub-menu. No idea what controls the availability.

2023-04-20

Yep. That was kinda dumb of me.

Consider the canonical Meyers' Singleton implementation in C++ (2011 or later standard):

    class Singleton {
        Singleton();
        ~Singleton();
    public:
        Singleton(const Singleton &) = delete;
        Singleton & operator=(const Singleton &) = delete;
        static Singleton & instance() { static Singleton s; return s; }
    };

It has all the properties you might want of a Singleton:1 things like guaranteed single initialization and thread safety. Plus, it's easier to remember than any of the other functional variants I've seen.

But there is a subtle trap in the fact that I haven't shown you my implementation of the c'tor.2 One that I just tripped on in a big way. What happens if you were to call instance() in the constructor?

Now, it is unlikely that anyone will write Singleton::Singleton() { instance(); } and overlook the silliness of it for long after the program hangs during testing. However, writing Singleton::Singleton() { supportRoutine(); } where supportRoutine() calls helperWidget() which calls instance() is a lot easier to miss for a few frustrating hours. Trust me.

This is about the object lifetime model in modern C++. And honestly I didn't know enough about the hoary details to figure out what the standard requires when I realized what I had done, but MSVC 2019 appears to enter an infinite recursion. Luckily the internet is full of people who have more (or at least different) expertise than me, so I was able to get some help. The first source that my google-fu found was a blog post by mbedded.ninja which includes an explicit call-out to the standard3 telling us this is undefined behavior.

So, yeah. Don't do that.

Late addition: on reflection, the reason I got into that pickle in the first place was that I was using the Singleton both to manage an OS resource and as a repository for utility code related to the resource. Having them in the same context made it too tempting to trigger some of the utility code from the c'tor and invoke the recursion. Re-writing the code to better follow the Single Responsibility Principle made the loop impossible in the first place and made me re-think when to invoke the utility code.
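A minimal sketch of that kind of split, and not the actual code: ResourceOwner, its integer handle, and describe_resource are all hypothetical names invented for the illustration. The Singleton only owns the resource, and the utility code is a free function that takes the handle as a parameter, so nothing reachable from the c'tor can call instance().

```cpp
#include <cassert>
#include <string>

// Hypothetical Singleton that does one thing: own an OS resource.
class ResourceOwner {
    ResourceOwner() : handle_(42) {}  // stand-in for acquiring the resource
    ~ResourceOwner() = default;       // stand-in for releasing it
    int handle_;
public:
    ResourceOwner(const ResourceOwner &) = delete;
    ResourceOwner & operator=(const ResourceOwner &) = delete;
    static ResourceOwner & instance() { static ResourceOwner r; return r; }
    int handle() const { return handle_; }
};

// Utility code lives outside the Singleton and never calls instance();
// the caller passes in the handle it wants described.
std::string describe_resource(int handle) {
    return "resource #" + std::to_string(handle);
}
```

A caller would write describe_resource(ResourceOwner::instance().handle()), which keeps the decision about when to touch the Singleton at the call site rather than buried in helper code.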


1 Assuming, of course, that you want a Singleton; more on that in another post.

2 I've left the d'tor unspecified as well for a philosophical reason: you should always decide on their implementation as a pair. I don't want to commit to = default; for the d'tor until I know what I'm doing in the c'tor. In the instance that brought the subject up I was grabbing OS resources in the c'tor and so needed a non-trivial d'tor that would nicely give them back. Cue "but technically's" about the destruction order fiasco. I said I would discuss the (often ill-)advisability of Singletons in another post.

3 Section 6.7 paragraph 4 which says in part:

If control re-enters the declaration recursively while the variable is being initialized, the behavior is undefined.

2023-03-31

How Tabs versus Spaces affects the display of code and the tool chain

I know, why don't I just reintroduce emacs versus vi, right? But bear with me, I'm going somewhere with this.

I'm getting to do another round of "generate a format to fit an existing code base". At least this time I'm not (entirely) responsible for starting the project without having this stuff in place: we received the starting code from upstream without auto-formatting support. Anyway, when I ran whatstyle over the existing code it told me we're using tabs. Which came as a total surprise to me. But the project is being built natively in Visual Studio and that thing seems to default to tabs. After cursing for a few minutes I calmed down enough to examine why I have a strong preference for spaces and decided that I didn't, in point of fact, have a reason. Just a habit. Which sent me off to the web to do some reading.

My reading showed me several things in a broad scope:

  • Spaces are the dominant choice in most programming communities (even python, which seems strange to me).
  • Tab proponents are passionate.
  • Lots of space people seem to be interested in giving the author control of the code presentation.
  • Tab people seem to be interested in giving the viewer control, but there is also a streak of pedantic focus on meaning ("tabs are for indentation, spaces for alignment" is an idea I saw several places).

I also came across a really interesting argument: tabs are an accessibility issue. That is to say that folks with perceptual differences may be better served by controlling the representation. This might be people with severe focus issues wanting to increase the visual indentation or a braille display using one explicit tab per indent to reduce wasted space. Not a point to be blithely ignored.1

So let's assume, arguendo, that I am convinced. I'm going to start using tabs in all my new code bases. What does that mean?

Implications of using tabs

  1. Line-length limits: Many sets of coding guidelines include line length limits. These may be rigid, modestly flexible (you can overrun by up to $N$ spaces to prevent other formatting ugliness), or purely advisory. In any case, they are supposed to provide a limit on how much horizontal space is needed to display the code in its entirety. Only now different coders are starting different distances in on the same code. I don't see a simple solution to this issue that doesn't require going to the "third way" mentioned below, though I will concede that it is less bad than the implications that follow.
  2. Alignment of broken lines: If you have line-length limits you may have to break long statements or expressions across more than one line. Typically the "extra" line(s) are displayed indented relative to the initial line, often taking their alignment from operators on the first line. If that alignment indentation is also achieved using tabs, then when a viewer changes the tab-width they will mess up the aligned formatting. The fix for this is mentioned above: you use tabs at the start of the line to indicate indentation (and only indentation) and then do alignment beyond the indentation with spaces.2
  3. Avoid mixed cases at all costs: With the exception of the post-indentation alignment spacing mentioned above, any mixed case is a nightmare in which almost no one sees anything reasonable. Automatic tooling should be provided to prevent either spaces at the start of a line or tabs after a space.3
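The check described in the last item is simple enough to sketch. This is only an illustration under my own assumptions: the names WhitespaceIssue and check_line are made up, and the function inspects only the leading whitespace of a single line, taking "spaces at the start of a line" literally (so tabs followed by alignment spaces pass).

```cpp
#include <cassert>
#include <string_view>

// Hypothetical result type for the check.
enum class WhitespaceIssue { None, LeadingSpace, TabAfterSpace };

// Inspects the leading whitespace of one line.
// Allowed: zero or more tabs, then zero or more spaces (alignment).
// Rejected: a space as the very first character, or a tab after a space.
WhitespaceIssue check_line(std::string_view line) {
    if (!line.empty() && line.front() == ' ') {
        return WhitespaceIssue::LeadingSpace;
    }
    bool seen_space = false;
    for (char c : line) {
        if (c == ' ') {
            seen_space = true;
        } else if (c == '\t') {
            if (seen_space) {
                return WhitespaceIssue::TabAfterSpace;
            }
        } else {
            break;  // first non-whitespace character: stop looking
        }
    }
    return WhitespaceIssue::None;
}
```

A real tool would run this over every line of a file in a commit hook or a CI step; that wiring is left out here.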

A third way

Honestly, most of the problems identified above are caused by mixing levels of control: the viewer is given control of the indentation but not of other aspects of the presentation. I'm not the first one to notice that, and not the first to suggest that the optimal solution is to give the viewer complete (or almost complete) control of the presentation. Editors should simply autoformat the incoming code to match the viewer's preferences.

This isn't without its own issues, of course:

  1. Communicating about position in the code: At least some of the time, coders discuss position in a file using line numbers, which will break if two programmers are looking at the code in different views. You can, of course, use references to named entities such as method definitions for many things, but that isn't always fine enough. Perhaps the tooling can provide a notion of addressable units (statements?) and the editor can display them in the view. Or you can display the "as stored" line number. But you need something.
  2. Controlling churn on the repository: You don't want formatting changes to generate activity in the repository, which means you can't let programmers check in code formatted to their own preferences. You need to enforce a rigidly defined formatting for the purposes of storage.

Given the power of tools like clang-format, the stored-format/viewed-format part of the idea is entirely feasible, but I'm not aware of a tool that supports the addressability requirement.

What will I do

I really think I need to talk to my team on this one, but we have tabs in the repository and a tool that can do something reasonable with them. We may be stuck with them.


1 The accessibility argument is the viewer control argument, only with the weight of "we need to be fair to people having a hard time" behind it.

2 And how does this interact with the accessibility argument?

3 Emacs provides something like this in Makefile mode.

2023-03-29

A Nudge too far

There is a fine line between being a clever, benevolent technocrat or entrepreneur on one hand and a controlling, supercilious asshole on the other.

Title in reference to the book, of course.

It's hard to define the line (or more likely, several lines) that you can cross to get from one to the other because, well, people. But I can tell you one way to get to the wrong side of the line: slide from making the choice you prefer easy into actively standing between me and the choice I want. That's asshole territory plain and simple.

DOS?

My calendar for tomorrow appears to have been the victim of a denial of service attack against my having a life (or even a little breathing room). How the heck did all that stuff pile up?

2023-03-22

Don't save!

I do a lot of programming at work in Qt Creator which is a perfectly acceptable IDE,1 and up until recently we used qmake to manage the build for basically the same reason we use Qt Creator: because these are Qt projects and the native tools understand them.

But several things have happened recently2 that collectively have made us move to cmake for most projects. And that means we've bumped into a quirk of the IDE: there is a big time-cost to changing the CMakeLists.txt files. Not, I hasten to add, because it's any slower than anything else at running cmake. It does that just fine. But after it re-configures the build, it re-scans the project to locate files that should be listed in the project tree-view. For more than thirty seconds in the case of my main project. And for some reason it does this as a synchronous operation, preventing you from taking any other action in the IDE.

Now, most of the time this is a non-issue: I'm working on the code not the build, so it rarely comes up and is a minor annoyance.

But if you are working on the build (as I was this afternoon), it is critical to overrule that "hit the save key-combo every time you pause to think" reflex that makes so much sense at other times.


1 I mean, it can't refactor template code very well, but you can at least learn how it fails and know how to fix up the results.

2 Qt has deprecated qmake with Qt6, we've started using some of our code (including an underlying library) with third party projects that use cmake, and we have found that cmake is slightly better at parallelizing biggish builds resulting in slightly but noticeably faster edit-compile-test cycles.

2023-02-09

Unintentional perfection

We were following1 a luxury car down the road this afternoon and it kept suddenly slowing to a crawl for no apparent reason before resuming normal driving for a few blissful minutes. My spouse wondered aloud if the issue came with driving an "XFinity"?

It's just as well she was driving because I couldn't control my laughter for minutes afterward. That outfit of jokers provides what I have to laugh to call our internet "service" and it behaves just like that car.

She swears it wasn't premeditated, but it was beautiful.


1 At a respectful distance. This didn't seem to be some kind of "crazy Ivan" stunt.

2023-02-06

A Modest Proposal

I believe proper consideration should be given to the benefits of creating an affirmative defense to charges of assault and battery in the case that the assaultee1 is a telemarketer.

Now, I'm aware that some people will maintain that only the most hard-up and vulnerable of people will actually take some jobs which is a valuable point. Perhaps the defense could be limited to people of authority in such anti-social enterprises. Say, starting with shift-managers and working up from there certainly to include the technologists who support and enable the whole industry.


1 Note: not "victim", they are very much at fault in the whole interaction.

2023-02-03

Now what?

Challenge of the day:

Search the web for a git workflow that is suitable for cases where you need to take a sub-module through a non-trivial evolution (that is something you'll be working on for a while, making multiple commits on and want to share with your colleagues as you go).

Go ahead. I'll wait. Probably for quite a long time.

You see, the web is teeming with tutorials and workflow articles on submodules, but they rarely get farther than setting up a repository and/or checking one out. At that point the article is already too long for a blog post: your readers are bored and have deleted the browser tab displaying your hard-won expertise.

I'm thinking of trying something like

  1. Create a pair of custom branches with the same name in the parent and child repositories.
  2. In the parent, edit .gitmodules to point at the branch in the child.
  3. In the child do the first step of your evolution.
  4. Push both and inform your colleagues.
  5. Continue working in the child; push and inform your colleagues when appropriate.
  6. Once your colleagues have approved the work, merge your branch in the child to whatever the main development branch is.
  7. Edit .gitmodules in the parent to point to the merged code.
  8. Push both and inform your colleagues.

Idle curiosity

What was the ratio of torches to pitchforks when the mob came to make Linus pay for git submodule?

2023-01-30

Developing independence

The child is five.

Recently she's started closing her door when she goes in her room, and sometimes issues the instruction "leave me alone". All of which I expected, though I didn't really know when to expect it.

The bit where she locked her door behind her is what surprised me.

Can you use fixed-point?

I do scientific computing. Mostly in C++, which offers a host of places to have problems, but that isn't what I want to talk about today. Instead I want to talk about language-independent issues with math.

From a science or engineering point of view, the formulae that we look up and equations we write down and manipulate all assume we're working with real or complex values (or at least with integers). Notably all those fields are infinite and we represent a working sub-set of them with not-very-big sets of bits. And that's where the trouble sets in.

Now, many languages provide types that offer a "floating-point" representation of some of the reals. Think about a binary version of scientific notation: $1.xyz \times 2^{abc}$. In modern times this stuff is actually well standardized with most hardware implementing IEEE 754.

A not-at-all exhaustive list of the common problems for floating-point representations includes

  • Easy to write down fractions like $\frac{1}{3}$ don't have exact representations in floating point because the format is finite.
  • Worse, even fractions like $\frac{1}{5}$ that have finite representations in decimal notation don't have one in binary notation and so are also inexactly represented.
  • The associative and distributive rules for basic operations like addition and multiplication are lost in some circumstances: the way you group a calculation can change the answer.
  • It takes special care to ensure that you can accurately round-trip an in-memory value through a textual representation.
  • As a result of the above, it is very easy to write down an expression that has an equal sign in the middle on paper, but when you compute the two sides in code and compare them with == it returns false.
  • As a result of library differences in IO routines and some functions, even if you get it right on one machine/compiler combination it can break if ported to a different machine/compiler combination, even if they both implement the same standard!

As a result of these and other details, floating-point math is notoriously hard to use correctly. The more so if you worry about unreasonable inputs (as you must).
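A few of the items above are easy to see for yourself. What follows is a minimal sketch, assuming IEEE 754 double precision and no value-changing optimizations such as -ffast-math. The function names are mine, invented for illustration, and the relative tolerance in nearly_equal is a choice you have to make for your own problem, not a universal fix.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Looks like an identity on paper, but 0.1, 0.2 and 0.3 are all inexactly
// represented in binary, so the two sides differ in the last bit.
inline bool naive_sum_matches() { return 0.1 + 0.2 == 0.3; }

// The same three numbers added with different grouping.
inline double sum_left(double a, double b, double c) { return (a + b) + c; }
inline double sum_right(double a, double b, double c) { return a + (b + c); }

// One common workaround (hypothetical helper): compare with a relative
// tolerance. Choosing rel_tol requires knowing something about your domain.
inline bool nearly_equal(double a, double b, double rel_tol) {
    assert(rel_tol >= 0.0);
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= rel_tol * scale;
}
```

With doubles, sum_left(1e16, -1e16, 1.0) keeps the 1.0 while sum_right(1e16, -1e16, 1.0) loses it entirely, because 1.0 is smaller than the spacing between representable values near 1e16.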

We use floating-point math anyway because it supports values over a huge range of magnitude (for the number of bits used in the representation) and often has a fast, hardware-supported implementation. Still, sometimes, if you know the use domain well enough you can select a more limited range of necessary values and use fixed-point math to avoid some of the problems with floating-point.

Recently at work we dealt with "it's not comparing right" problems for angles on the sphere by coding azimuth and elevation in terms of an integer number of arc-minutes which provided more than sufficient precision for our needs, gets along nicely with the domain practice of describing angles in degrees, means that each value fits into a 16-bit field, and can be reliably round-tripped through a customer-specified text format.
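To make that concrete, here is a rough sketch of the idea. It is not our production code: the ArcMinutes type, the helper names, and the "degrees:minutes" text format are all invented for illustration (the real customer format is different), and the parsing does almost no error checking.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical type: an angle stored as a whole number of arc-minutes.
// A full circle is 21600 arc-minutes, which fits in a signed 16-bit field.
struct ArcMinutes {
    std::int16_t value;
    friend bool operator==(ArcMinutes a, ArcMinutes b) { return a.value == b.value; }
};

// Round once, at the boundary, to the nearest arc-minute.
// Assumes the input is within roughly +/-546 degrees so the result fits.
inline ArcMinutes from_degrees(double degrees) {
    return ArcMinutes{static_cast<std::int16_t>(std::lround(degrees * 60.0))};
}

// Write as "[-]D:MM" (an invented format standing in for the customer's).
inline std::string to_text(ArcMinutes a) {
    const int total = a.value;
    const int mag = total < 0 ? -total : total;
    char buf[16];
    std::snprintf(buf, sizeof buf, "%s%d:%02d", total < 0 ? "-" : "", mag / 60, mag % 60);
    return buf;
}

// Read the same format back. Integer in, integer out: nothing is lost.
inline ArcMinutes from_text(const std::string& s) {
    const bool negative = !s.empty() && s[0] == '-';
    int deg = 0;
    int min = 0;
    const int parsed = std::sscanf(s.c_str() + (negative ? 1 : 0), "%d:%d", &deg, &min);
    assert(parsed == 2);  // sketch only: real code should report bad input
    (void)parsed;
    const int total = deg * 60 + min;
    return ArcMinutes{static_cast<std::int16_t>(negative ? -total : total)};
}
```

Once the values are integers, == does what you expect: from_degrees(0.1 + 0.2) and from_degrees(0.3) compare equal because both round to 18 arc-minutes, and from_text(to_text(x)) gives back x for every representable x.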

Alas, many languages (most that are promoted for scientific computing) don't have built-in types or library support for fixed point so it isn't always practical: you have to ask how you will implement any special functions you need before you make that choice.

But it is worth asking right at the start.

2023-01-26

Unanticipated milestone

It's like a coming of age ritual, except not as fun.

There will come a time when your doctor describes your situation in reference to "a [wo]man of your age".

2023-01-07

Uhm...

I got a Kindle Scribe for Christmas. Yeah, now I have an e-reader I can actually read papers on! Cool.

Of course, this just means I've been sucked into Amazon's ecosystem. You can only use a few formats on it, but that includes PDF. Except...

  • Annotation support on most PDFs is limited.
  • To get full annotation support you have to submit the file through "Send to Kindle"1 where I suppose it is pre-processed in some way and then made available for downloading to your machine.

Fine. I'll pretend I don't think they're spying on my content, and I'll even try to believe that they are spying responsibly. But honestly they already know a lot about me.

But that doesn't mean that there aren't any surprises here. You see, the machine has a web browser. It's not very good, but it's there. That browser won't save PDFs directly to the machine, which is a little weird, but the big oddity is that it also has no facility to route them through Send to Kindle. To somehow manage to make matters worse, the "Send to Kindle" page has a link for a "Send to Kindle" browser extension!

Seems like a no-brainer to me. Perhaps I should send them a note, but they make that weird, too.2


1 More "Send to Kindle" breakage: it will convert ebook files to a Kindle friendly format for you, but will silently drop on the floor any files already in Kindle preferred format (that perhaps, you downloaded from a DRM-free source like Standard EBooks).

2 The device supports a "Contact Us" feature, which lets you send missives into the void, but they specifically disavow any intention to hold a conversation as a result (or even just telling you what they are doing about it). There doesn't seem to be a customer facing issue tracker where you can see that your suggestion is already in progress or follow deliberations on how to handle an issue. I'm spoiled by the culture of the open-source world.

2023-01-04

This, too, will pass

For most of my politically aware life it was possible to count on the Republicans being disciplined team-players, while you never knew when the Democrats were going to suffer a flash resurgence of the feud between the dreamy idealists, the hard-core misguided utopians, and the just trying to get something to work pragmatists.

That's not the look we've been seeing for the last decade or so, culminating in the chaos around this year's Speaker of the House election.

Of course, falling at the last hurdle on the way to his lifelong dream couldn't happen to a better guy than Mr. McCarthy, but wishes for a centrist coalition aside I worry about the alternative.