2026-02-28

Progress in the artificial "comprehension" of humor

A minor side-line in my on-going investigations of how well or poorly LLMs perform has been teasing them with jokes. Back when I started they were consistently abysmal at explaining why jokes (even very simple ones) were funny, though they could fairly consistently categorize them into wordplay, dark humor, and similar bins. Some models would confidently assure me that I wrote the jokes wrong or that they didn't make sense.1

When the models started to get a little better (able to do a passable job on the easy ones) I added a couple of relatively subtle jokes to my list and those promptly stonkered the largest models I had access to.

Until today.

Aside

My 64 GB RAM framework laptop (which I had been using to run mid-sized models locally) was stolen last fall and I haven't replaced it. I was unwilling to send my questions to the AI companies lest they train to the test; so I didn't have access to large or mid-sized models to try for a while. Then ollama started offering cloud services. I think it is much less likely that queries made through that channel are making their way back to the model vendors (ollama says they don't and I suspect the massive social cost of getting caught would deter them even if they were inclined to cheat), so I started trying the big models on their servers from time to time.

Meanwhile, back on the ranch

The big open-weights GPT (gpt-oss:120b) is a really impressive model—I've been using it for a lot of chat tasks I might previously have lobbed at ChatGPT—but it still failed at both my "trick" joke questions. In fact it gave almost the same wrong answers as phi4, gemma3 and so on. Maybe the result of an effectively common training set?

On the other hand its answers to the technical questions I ask models were much more like those the big (300B+ parameters) leading edge models were giving six months ago than the ones mid-size models (30-70B parameters) were giving while I still had the framework. So I concluded that progress was being made on several fronts including coding2 and what I call "deep-search"3 but not necessarily on making connections for which its training set had few examples.

Today, after doing my regular Saturday chores, I updated my ollama and looked at the recent models. Hmmm ... I'd seen a youtube about this gfm model and they made a variety of brags that would be impressive if true. So I tried it. It aced one of my favorite, easyish-but-out-of-the-mainstream coding questions. Not screaming fast, but fast enough that running the model and looking through the output would be faster than my solving it by hand if I hadn't worked it out in advance to use as a key for the test.

I got ambitious and asked it about the jokes.

One of the prompts includes a couple of hints: it notes that the joke was current "in mid to late 2011" and asks for an explanation of the physics behind the humor. For a human not familiar with the episode that generated the joke it would probably require some web searches to answer, but I suspect most educated people would get there. Gfm-5 is the first model I've put this to that nailed it.

The other prompt is a little more blind. The model has to recognize the relationship between the scenario in the joke and a much more common, but generally not humorous, scenario in fiction. Then it has to examine the change in the joke's version and work out why people do a double take and then laugh or groan. The answer I got from the model was not great, but it was the first LLM answer to ID the underlying story fragment, ID the crucial change, and write that the change represents an unexpected or twist ending. Close enough for government work.

Wow. Just wow.


1 Why do LLMs hate Moby Pickle and Smokey the Grape, anyway? Parke Godwin may have been dead for ten years, but he still had the power to baffle GPT-4 with children's riddles.

2 Generally something that is vaguely like a common example problem but differs in significant ways. My prompt for writing a wavefront model (.obj and .mtl files) is enough like the usual example (a cube) that many models start hallucinating a cube halfway through.

3 That is, digging into a topic and giving me a top-level explainer such as you might get from an academic colleague in a different department who knows you are smart but not familiar with the domain.

2026-02-26

Modifier precedence in English

Languages often build more complex ideas by combining symbols for simpler ideas. How this works in any particular language is governed by some set of rules or another. It's pretty typical to divide the rules into (at least) two groups: some tell you what symbols can go where (grammar), and others tell you what it means when you put them there (semantics).

To clarify the difference, let's take some example rules from each family across a set of natural and synthetic languages. Some examples of syntactic rules are:

English
Prepositions are generally followed by objects or object phrases
Algebra
An equal-sign, other equivalence symbol, or inequality has an expression on each side either explicitly or implicitly
C
A declaration consists of one or more identifiers (with optional initializers) and information about their type1
Notably, these are all about the grouping (and sometimes order) of language symbols (words and punctuation) in the text. By contrast, semantic rules are about the meaning of combinations of symbols.
English
Appending "ly" to many nouns converts them into associated adjectives2
Algebra
Compound expressions are reduced by respecting grouping symbols to identify sub-expressions, followed by applying exponentiation, then applying multiplicative operations, and finally applying additive operations
C
A declaration gives the identifier(s) meaning within the program and instructs the compiler on how it (they) can be used (the type information).
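As a worked instance of the algebraic reduction rule, a small compound expression unwinds like this:

```latex
2 + 3 \cdot 4^2
  \to 2 + 3 \cdot (4^2)   % no grouping symbols present; exponentiation binds first
  \to 2 + 3 \cdot 16      % then multiplicative operations
  \to 2 + 48              % finally additive operations
  \to 50
```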

The fun part of this is that none of it is forced on us. Both sets of rules are devised by people for people reasons. In "natural" languages this comes about slowly and often organically for reasons that I certainly don't understand. Talk to a linguist. In programming languages some person or small group of people sat down and consciously decided them (though after the first couple of decades there came to be some broad consensus understanding to build upon).3

An advantage of someone making a deliberate decision shows up when you have complicated rules. The originator can write down an authoritative description of the method and that's that. For instance, the c-declaration int (*normalized_comp)(unsigned, const char *, const char *) may be pretty complex,4 but by looking up the procedure in The C Programming Language, the standard document, or some website, we can know with certainty that "normalized_comp" is a pointer to a function taking three arguments (one unsigned integer, and two pointers to const characters) and returning an integer value.5
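To make the layers concrete, here's a minimal sketch; the function `my_comp` and its body are my own invention, just to have something that matches the declared signature:

```cpp
#include <cstring>

// A function whose type matches the declaration discussed above:
// it takes (unsigned, const char *, const char *) and returns int.
int my_comp(unsigned n, const char *a, const char *b)
{
    // Hypothetical body: compare at most n characters, a la strncmp.
    return std::strncmp(a, b, n);
}

// The declaration from the text: normalized_comp is a pointer to a
// function taking those three arguments and returning an int.
int (*normalized_comp)(unsigned, const char *, const char *) = my_comp;
```

Calling normalized_comp(3, "abc", "abd") then dispatches through the pointer exactly as if my_comp had been named directly.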

My beef today is about the rules for expressing frequency in English. In particular, we can use the "ly" suffix formation discussed above to modify a time-period into a frequency. Monthly. Daily.

Fine.

But we also have access to some prefixes that modify the number of things: "bi" and "semi" for two and one-half are common in this use.

Alas, there is no authoritative author's document to tell us if "biweekly" should be interpreted as "twice weekly" or as "every two weeks". I'm fairly sure it's the former, but...


1 I'm going to ignore the wrinkle in which multiple identifiers in a single declaration can have different types when some of them are pointers. Those of you who need to know, know. And for the rest of you it doesn't add anything to the discussion.

2 I'm also going to largely ignore the irregularities of English. There are other ways to make adjectives but, again, it doesn't add anything to the discussion.

3 The algebraic symbols and order of operations are an intermediate case. They came to be through an organic process of push-n-pull in a community, but it was a small community and the candidates were generally formed deliberately by one or a few participants. Fun stuff.

4 This kind of thing is hard enough that a typical course in c includes a bunch of exercises in how to read these things, but in case you fall out of practice there is a tool (cdecl) just to help you out.

5 Experienced c-programmers will likely intuit still another layer of meaning: guessing that the pointer arguments are probably meant to point to character buffers (that is, strings) rather than single characters, and that the return value probably takes on values in the range -1 to +1 a la strcmp. The name of that layer is "idiom", and as in natural languages it is required for real fluency. Our hypothetical experienced programmer might also have a guess about the initial argument (unsigned, so probably a size, so probably the max number of characters to compare...), but that is not so well established in the idiom.

2026-02-02

Hey, ya wanna help?

Here's the thing about smart phones: you cannot reasonably prop them between your shoulder and your ear. Not only will you get an instant muscle cramp (and probably scoliosis within minutes if you persisted), but the thing won't actually stay there. In that respect they really, deeply suck.

But whatever. Price you pay for the benefits of the form factor. Or whatever.

That said, this has a consequence: if someone calls and (a) you don't feel you can skip it, (b) you still need to have both hands for something (anything) other than the phone, and (c) you don't currently have your buds in then you must, in short order:

  1. answer the call
  2. switch to speaker
  3. prop the phone somewhere

Presumably the people who write the UI for these things have this experience, too.

But recently with my phone, when I tap to answer, the UI goes through some flashy, battery-draining, nonsensical animation which results in the hang-up control landing right where the change-the-audio-button was a moment before. I have no words.

Random musings

If you found yourself in the same room as whatever self-satisfied twit is responsible for foisting "liquid glass" on us and asked the two nearest other iPhone users if they wanted to help administer a swirly, what do you think the odds would be?

I put them over 2/3, personally.

2026-01-31

Baby steps in declarative style

I had occasion, recently, to look at some code one of my younger colleagues had produced. The code works. It's clear. Future maintenance will be straight-forward if a little tedious. It provides a very useful and non-trivial feature. But it is really—extravagantly, even—wordy. The engineer who wrote it knew that: he left a comment to the effect that it was ugly but he didn't know a better way to accomplish the task.

I do know a better way, so this post is intended to put the solution out there for those who need it.

Case study

The feature is simple:

Detect what columns of a CSV file contain specific data by searching the header for pre-defined strings, and subsequently look up that specific data for use while processing the remaining rows.
With which we can ingest the same data from files written by different producers that disagree on what extra columns should be present and what order any of them should be in. At least the header fields are consistent across all the writers.

To do this we need some storage that will hold the indices (or indicate that columns are not present, so we must support a not-applicable value), and we need some code that walks the list of header fields examining each string and setting the appropriate index (or skipping with a diagnostic if the header is unrecognized). Some of the indices are required to be set, but we check that after we've set all the ones we find.

A naive approach would be to (a) define a named variable for each header index we want to capture, and (b) use a big if-else if chain (with the terminal else handling the unknown header case) for the "set the appropriate indices" bit.1 That is something like this

//...
    int statusIdx = -1; // -1 for "not found"
    int thingIdx = -1;
    std::vector<std::string> headerFields; // Setup somehow
    for (size_t i = 0; i < headerFields.size(); ++i)
    {
        const std::string &headerText = headerFields[i];
        //...
        else if (headerText == "Status")
        {
            statusIdx = static_cast<int>(i);
        }
        else if (headerText == "Thing")
        {
            thingIdx = static_cast<int>(i);
        }
        //...
    }
As I said above this works, but it's not ideal. Let's start by talking about what is probably the smallest issue: it's long, with that big chain of conditionals living inside the loop over fields (possibly inside two loops if we've built the "iterate over lines" behavior into the same function). And because each index has its own named variable, running "extract-function" on the code would result in an unreasonable argument list for the new routine.2

The bigger problem is that the association of the string names with the related indices only exists in the form of the comparison statements in that big chain of conditionals (that is, far from the variable declarations), so you end up scrolling around to understand what's going on. Good naming can help here, but...

To simplify our life we make the connection between desired header string and the index storage explicit. In the simplest form that looks like

std::map<std::string, int> indices {
    //...
    {"Status", -1},
    {"Thing", -1},
    //...
};
though we could elaborate this in various ways. Having done that, our index setting code becomes a loop3
for (size_t i=0; i<headerFields.size(); ++i)
{
    const auto it = indices.find(headerFields[i]);
    if (it == indices.end()) 
    {
        LOG_INFO("Skipping unknown CSV header field " + headerFields[i]);
        continue;
    }
    it->second = static_cast<int>(i);
}
which is easily extracted to a named function and new supported fields can be added by extending the indices map.

This is much better.

Mind you, it still isn't great and depending on how long the rest of the file processing loop is we may want to elaborate on the system a little. We only use the indices to look up fields in subsequent lines, so rather than writing dataField[indices.at("Thing")] every time, we might want to provide a lookup-field-in-line routine that hides the icky syntax. But that's just gravy at this point.
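One hypothetical shape for that lookup helper (the name `fieldByName`, the parameter names, and the exception choice are all mine, not from the original code):

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Sketch of a lookup-field-in-line routine: given one parsed CSV line and
// the index map built from the header, fetch the field for a named column.
std::string fieldByName(const std::vector<std::string> &dataFields,
                        const std::map<std::string, int> &indices,
                        const std::string &key)
{
    const int idx = indices.at(key); // throws std::out_of_range for unknown keys
    if (idx < 0)
        throw std::runtime_error("Column not present in this file: " + key);
    return dataFields.at(static_cast<std::size_t>(idx));
}
```

With that in hand the processing loop reads fieldByName(dataFields, indices, "Thing") instead of indexing by hand, and the "required column missing" complaint lands in exactly one place.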

Elaborations

Extending this is easy, and up to a point is a good idea. Though you can overdo it. Bring your common sense.

In the case study above the first thing I would note is that our reader requires certain columns to be present or the data can't be used. So I want to associate that info with the rest of the data. And my typical approach to that would look something like this:

struct columnData 
{
    int index;
    const bool required;
};

std::map<std::string, columnData> indices = {
    //...
    {"Status", {-1, false}},
    {"Thing", {-1, true}},
    //...
};

int columnIndex(const std::string & key, const std::map<std::string, columnData> &headerData);
bool columnRequired(const std::string & key, const std::map<std::string, columnData> &headerData);
if I wanted to use free functions. Or, if I preferred a class, it might be
class HeaderInfo
{
    struct columnData
    {
        int index;
        const bool required;
    };

    std::map<std::string, columnData> indices;

public:
    HeaderInfo(std::initializer_list<std::map<std::string, columnData>::value_type> l) : indices(l) {}

    bool has(const std::string &key) const;
    int index(const std::string &key) const;
    bool required(const std::string &key) const;
};
with the initialization data provided in the read-CSV routine by way of that initializer list constructor.

Critically I don't use two separate data stores that can get out of sync. All the information about our header handling is collected in one place where it can be edited without confusion if needed. In practice many of my data stores in this form go a long time between edits.

Nomenclature

I waffled a bit on the title for this post because I'm uncertain what other people would want to call this approach. There are several obvious candidates out there: "Data-driven design", "data oriented design", and "declarative".

In my mind, the first two are used for the idea of paying attention to access patterns and cache behavior in the way you lay data out in memory. They're a group of optimization techniques, which is not what we're about here.

I went with "declarative" in the title because I think it gets much closer to my intent here (the programmer says what they want and counts on the infrastructure to make it happen), but I have reservations in the sense that C++ has very little declarative nature: we have to engineer the relationship between declaration and behavior each and every time. In this case, that's pretty easy,4 but it can add up to a lot of code if you push the idea hard.

A tooling issue

In this case we want the indices to reset every time we call the "read a csv file" routine, so the data map is scoped to that routine, but I often use some variant of this pattern for static data, with the map placed in global storage. Which valgrind's memcheck tool doesn't appreciate.

You see, various containers in the C++ standard library (definitely including std::map) store some of their data outside the class proper. With any static data store you make this way, those allocations are created on the heap at runtime (though before main is invoked) and they persist until the program ends. But "dynamic allocation still extant at the time the program ends" is one of the things that memcheck detects as a memory leak.

In light of my on-going goal of (not to say obsession with) seeing completely clean reports from compilation, static analysis, leak checkers and so on, this is an annoyance.

You can work around this. Rather than std::map use an array (either c-style or the C++ container) of structured data, and provide your own search abstractions. Rather than std::string use string literals. And so on.5 Or just live with it; by keeping a list of "known okay" reports you can minimize the time wasted on this kind of false positive.
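A minimal sketch of that workaround, reusing the header table from the case study (the struct, the names, and the two-entry table are mine):

```cpp
#include <array>
#include <cstring>

// Static data built only from aggregates and string literals: nothing here
// allocates on the heap at startup, so memcheck has nothing to report.
struct ColumnEntry
{
    const char *name;
    bool required;
};

constexpr std::array<ColumnEntry, 2> kColumns{{
    {"Status", false},
    {"Thing", true},
}};

// Hand-rolled search abstraction standing in for std::map::find; a linear
// scan is perfectly adequate for a handful of entries.
const ColumnEntry *findColumn(const char *name)
{
    for (const ColumnEntry &entry : kColumns)
        if (std::strcmp(entry.name, name) == 0)
            return &entry;
    return nullptr;
}
```

The price is that the nice brace-initializer syntax and the map's lookup machinery are traded for code you maintain yourself, which is why "just live with it" is a defensible alternative.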


1 In a language with match or a more flexible switch-case construct you could use that instead of the if-else if chain, but C++ is not our friend on that front.

2 The "hard to refactor" bit could be solved by collecting the indices in a single named object struct indices { /*...*/ int statusIndex; int thingIndex; /*...*/}; but we're going to end up doing better than that.

3 Written here in a "modern" but pre-C++20 form because (a) I'm still using C++17 at work so for me maps still don't have contains and (b) because if you did use contains you'd have to search the map twice in the (common!) case where the field is known.

4 When I tried a full implementation of the elaborated class (just because I wanted to get it right) it came to about 40 lines of code.

5 Or try living in the present: newer standards are supporting constexpr for much more of the standard library, including strings and containers. However that works under the hood, the allocations are not runtime heap entries, so memcheck should be happy.

2026-01-30

The limits of manipulations for "their own good"

It's OK, because Duggee has his gas-lighting badge!

It's an endless question for parents about their children, isn't it? How much pressure and distortion can I, legitimately, use to teach them things,1 to buy a little space, and so on? Some, I suppose, but it must be an ever moving target as the kiddo grows, develops, and just gets better at seeing through our BS. We try to keep in mind that the kiddo must one day stride forth to meet the world with her own skills, opinions, and point of view. We'd like that to go pretty well, so the scaffolding must be dismantled and some kind of model of good-person-in-a-hard-world needs to be offered.

I'll just get right on that.2

But, wait! There's more! My wife and I are smack in the middle of the sandwich, so it also applies to interactions with our elders. And the answer to that, too, will be an evolving thing. Right now it's just one set, but there is every reason to suspect the others will need support sooner or later. So that's a whole different take on the same kind of questions.


1 In my prior, professional life, it even had a name: "lies to children".

2 I wonder who in their right mind would sign off on our being parents in the first place?

2025-12-27

Having your "Smart" and your "Open", too

This is the third of a group of posts on "Smart" appliances. You don't strictly need the first and second entries to read this one, but they're meant to be building up a common context. In this episode I talk about a mechanism that would relieve most of my worries about these devices while allowing manufacturers to maintain control of their trade secrets and the presentation of their interfaces.


My complaints in the first post aren't intrinsic. Instead they are complaints about a particular implementation of the smart model where the physical device requires a specific (rather than generic) piece of paired software; in that implementation our intellectual property regime around software puts the manufacturer in control of your ability to have that software, and consequently puts the manufacturer in control of your ability to use a physical device that you bought. That is totally unacceptable.

Our strategic goals are

  • Possession of the physical device grants access to its functionality1 because the requisite software is generic enough to be re-implemented on any suitable platform.
  • The manufacturer gets to keep their trade secrets to themselves.2

That's an interesting pair, because the first requires that the trade secrets be physically embodied in the machine (otherwise you can't guarantee that the functionality moves with the physical object). But that seems to be in tension with the second requirement, because how can the manufacturer control trade secrets if they're trading freely around the economy?

The thing is that the control software (which is already on board) also encodes the trade secrets. Anyone with the right tools and knowledge can already extract the firmware, decompile it, and suss out the meaning of the resulting code. It's just that that's a pretty hard trick (those boards don't need debug ports and I'd be unsurprised to find that many don't have them) and consequently time consuming and expensive. Then, as always, the legal regime puts up further barriers to someone trying to compete using that approach.

And we deal with systems that have those properties all the time. I'm describing a client-server architecture. The appliance is the server, your phone or tablet is the client, and the manufacturer can hide as much detail as they want server-side. By using open protocols for interchange and open standards to present the interface, consumers get an extortion-free way to talk to the appliance, and in return manufacturers get reduced software development costs and all-platform3 functionality for free. It's that easy.

My hot-take, off-the-cuff, proposal for the whole thing:
HTML5.

Literally put a little web server inside every appliance. They already have non-trivial computers and at least one of WiFi or Bluetooth, so this is not a stretch. In the worst case users can use a plain web-browser to access it.4 With HTML5, manufacturers can control every aspect of their interface (I mean their pages can just be a canvas if they're that hung up on controlling the appearance), control how much functionality is pushed to the client-side, and so on.

What's not to like?5


1 All of it. If you own the thing, you own it. This is non-negotiable.

2 Within reason. There is no way to guarantee this absolutely, and (importantly) never has been. Anyone with the tools and expertise has always been able to reverse engineer a product, and it was only ever the high cost of reverse engineering and IP law that prevented them from taking that route to develop a competing product. We're looking for a regime where the level of difficulty in obtaining and profiting from the trade secrets remains similarly high.

3 And not just iOS and Android, either. Everything, everywhere, all at once. As it were.

4 I imagine the ecosystem will rapidly spawn a genre of specialized appliance control apps with an HTML renderer at the core, and featuring convenience functions for organization and access, but a plain web-browser provides a fallback position.

5 Okay, so there is the addressing problem for devices that use WiFi. Some kind of discovery mechanism will be needed, and I don't know—off the top of my head—what the options for that are. But it's not like this is a new problem: printers and scanners, especially, already handle this in a variety of ways. Similarly, Bluetooth has a pairing problem to solve, but that's an issue for existing Bluetooth devices as well.

2025-12-20

Why "Smart" appliances might make sense

This is the second of three related articles. If you haven't seen it already, perhaps you should read the first installment.


Some devices have very simple interfaces: they're either running or not, so they need a switch. But then, maybe there is a thing you make stronger or weaker, so you add a dial or a slider. And maybe it can run a couple of ways, so you add a selector dial. Maybe the user needs some progress feedback, which means some kind of clock.1 And so on.

In my youth, a typical washing machine, dryer, or dish washer had a handful of controls supporting as many as a dozen modes of operation, and that felt like progress compared to the kit my parents grew up with. Yeah! Living in the future!

This is the control panel from the dryer at Casa NoSwampCoolers. There is a power button and a separate start button (because power means the controls are active and start means the machine is running), a mode-select dial with fourteen options, four categories of adjustable parameters (some of which only apply to some modes), six additional Boolean settings (again, applicability varies), and a time-display-and-control group. I'm not sure if that reaches a thousand front-panel accessible combinations but it is certainly hundreds. And there are extended features only available with an app.2

Okay, so we've established that (at least some) modern appliances require complex interfaces. But we've also established that you can build the interface into the device. So, how does that justify connecting it to your carry-around-computer-thingy?

Well, we're in engineering trade-offs land. We get to compare costs, convenience, maintainability, utility, and user preferences between alternatives. And how we rate some of those depends on how the machine works. If you have to physically load and unload a device for it to be useful (as in clothes washers, clothes dryers, and dish washers) then remote start is less helpful than remote access to your car's defroster.

Settable defaults are nice
In principle, to know that my dryer is doing what I want I have to memorize the desired state of approximately a dozen controls, and check each of them on the control panel. In practice I memorize the smaller number of things it takes to get from "just booted" to the state I want: switch the mode to "Speed dry", set the temperature to low, increase the time, then go. Wouldn't it be nice if I could set the default state or name several default states? If there is a win here it's for "smart"
Control panels cost money and represent additional points of failure
Controls are a pain. They have to be robust, when the options are discrete they should be unambiguously in one state or another, they should give clear feedback, and in the physical realm all of that takes engineering. The cheapest switches and dials generally lack something. I'm not in this business, but over the years I've talked to people who are. Controls often represent a surprising portion of the cost. To be sure, digital interfaces can fail too. If the magic smoke gets out of the computer on the device, you're done with either kind of controls.3 If your phone smokes, you need a different interface device. At least partly in favor of "smart"
You're right there anyway
As mentioned above, some devices require your presence. For those I really want an on-device control set for the primary features. Because otherwise they require the presence of you and your phone or tablet. But I'm pretty happy with our machine that presents the big easy stuff on the panel and leaves the fiddly choices for a computerized interface. Definitely favors some physical controls on machines you have to attend in person
Translation
Did you notice that the controls on my washer are labeled in English? That's built into the physical medium, so a French-speaking, would-be user can't just change it. Now the company could (and presumably does) provide that part in multiple language variants, but that becomes a logistical headache and thus a cost, and you can't easily switch back and forth. Adding translation to a software interface is not free, but (with some support from your OS or framework) it can be relatively painless, and users can change the language on demand. That's cool. A reason for a pixelated UI, wherever the display is mounted
Maintainability
Once a physical control panel is engineered, manufactured, and sent into the wild changes are hard and expensive. In principle, software changes are easier, though if the software in question resides on the machine, you'll need to provide a firmware-update facility of some kind.4 If there is a win here it's for "smart"
"Just give it a proper display" is a good idea, but you already have a display...
There is nothing magical about using your phone or your tablet that enables digital UI. You could put a display (probably a touch display) on the device and then have a programmed interface. Yeah! But ... how many pixels are you willing to pay for? And on every appliance in your house? What about physically small devices like my espresso machine? And those questions lead to the idea of sharing a single interface device. Your potential customers almost all have a phone or tablet already,5 why not just borrow that display? Favors "smart"
Human stuff
This is hard for me. I'm not trained in UI/UX. I haven't done any surveys. Basically this whole essay has been me ranting about how I feel. But people sometimes care about appearance, and sometimes don't like to learn new things. And so on. I'm not going to guess how this breaks for smart versus not smart.

Anyway, the next article presents what I think might be a viable approach that everyone can live with.


1 In some instances the dial that the user used to set the run-time was a mechanical clock that would wind back down, also serving as the time-remaining indicator. Parsimonious design, that.

2 This is the sort of situation that prompted the third Smart Appliance rule. That dryer is very useful even without the app. Indeed, we've never used the app. Mrs. NoSwampCoolers tells me she has used the app for the matching washer (with a very similar control panel) once in the fourish years we've owned these things. Once.

3 Yeah. I'm just assuming there is a computer in there. My sense of it is that no one is designing these machines with end-to-end analog circuits anymore. I once did a few hours of analog diagnostics on a Kenmore model 40, but I'd expect to find an MCU behind any reasonably modern control panel.

4 My car's entertainment center has a SD card slot hidden behind a pop-up panel.

5 If they don't, we're talking an additional up-front cost for them, but if they can afford the appliance ... well, Walmart's website lists tablets starting from under US$50 as I'm writing this.