Languages often build more complex ideas by combining symbols for simpler ideas. How this works in any particular language is governed by some set of rules or another. It's pretty typical to divide the rules into (at least) two groups: some tell you what symbols can go where (grammar, or syntax), and others tell you what it means when you put them there (semantics).
To clarify the difference, let's look at some example rules from each family across a set of natural and synthetic languages. Some examples of syntactic rules are:
- English
  - Prepositions are generally followed by objects or object phrases
- Algebra
  - An equals sign, other equivalence symbol, or inequality has an expression on each side, either explicitly or implicitly
- C
  - A declaration consists of one or more identifiers (with optional initializers) and information about their type [1]
And some examples of semantic rules are:
- English
  - Appending "ly" to many nouns converts them into associated adjectives [2]
- Algebra
  - Compound expressions are reduced by respecting grouping symbols to identify sub-expressions, followed by applying exponentiation, then applying multiplicative operations, and finally applying additive operations
- C
  - A declaration gives the identifier(s) meaning within the program and instructs the compiler on how it (they) can be used (the type information).
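For the C and algebra rules above, here is a minimal sketch of where the two families show up in practice. The function name precedence_demo is made up for this illustration, and the commented-out line is there only as an example of something the grammar rejects.

```c
#include <assert.h>

/* Hypothetical function; it exists only to hold the examples. */
int precedence_demo(void)
{
    /* Syntax: one declaration, two identifiers, each with an initializer.
       Semantics: both names now denote int objects in this scope. */
    int grouped = (2 + 3) * 4, ungrouped = 2 + 3 * 4;

    /* Semantics: multiplication is applied before addition,
       unless grouping symbols say otherwise. */
    assert(grouped == 20);
    assert(ungrouped == 14);

    /* Syntax error if uncommented: the '=' has no expression on its
       right-hand side, so the compiler rejects the line before meaning
       ever comes into it.

       int broken = ;
    */

    return grouped - ungrouped;
}
```

Note that the two initializers use exactly the same symbols apart from the parentheses; it is the semantic rule about grouping that makes them come out different.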
The fun part of this is that none of it is forced on us. Both sets of rules are devised by people for people reasons. In "natural" languages this comes about slowly and often organically, for reasons that I certainly don't understand. Talk to a linguist. In programming languages some person or small group of people sat down and consciously decided them (though after the first couple of decades there came to be some broad consensus understanding to build upon). [3]
An advantage of someone making a deliberate decision shows up when you have complicated rules. The originator can write down an authoritative description of the method and that's that. For instance, the C declaration int (*normalized_comp)(unsigned, const char *, const char *) may be pretty complex, [4] but by looking up the procedure in The C Programming Language, the standard document, or some website, we can know with certainty that "normalized_comp" is a pointer to a function taking three arguments (one unsigned integer and two pointers to const characters) and returning an integer value. [5]
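To see that reading of the declaration in action, here is a small sketch. Only the declaration of normalized_comp comes from the text; compare_prefix and prefixes_match are hypothetical names invented for this example, and the choice to wrap strncmp and squash its result to -1, 0, or +1 is an assumption about what such a function might do.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical comparison function (not from the original post): compares
   at most max_chars characters of a and b, then squashes strncmp's result
   to exactly -1, 0, or +1. */
static int compare_prefix(unsigned max_chars, const char *a, const char *b)
{
    int raw = strncmp(a, b, max_chars);
    return (raw > 0) - (raw < 0);
}

/* The declaration from the text: a pointer to a function taking
   (unsigned, const char *, const char *) and returning int.
   Here it is also initialized to point at compare_prefix. */
int (*normalized_comp)(unsigned, const char *, const char *) = compare_prefix;

/* Hypothetical helper: calls through the pointer and reports whether the
   first n characters of the two strings are equal. */
int prefixes_match(unsigned n, const char *a, const char *b)
{
    assert(normalized_comp != NULL);
    return normalized_comp(n, a, b) == 0;
}
```

The compiler accepts the initialization only because compare_prefix has exactly the argument and return types the declaration spells out, which is the "how it can be used" part of the semantics.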
My beef today is about the rules for expressing frequency in English. In particular, we can use the "ly" suffix formation discussed above to modify a time period into a frequency. Monthly. Daily.
Fine.
But we also have access to some prefixes that modify the number of things: "bi" and "semi", for two and one-half respectively, are common in this use.
Alas, there is no authoritative author's document to tell us if "biweekly" should be interpreted as "twice weekly" or as "every two weeks". I'm fairly sure it's the former, but...
[1] I'm going to ignore the wrinkle in which multiple identifiers in a single declaration can have different types when some of them are pointers. Those of you who need to know, know. And for the rest of you it doesn't add anything to the discussion.
[2] I'm also going to largely ignore the irregularities of English. There are other ways to make adjectives but, again, it doesn't add anything to the discussion.
[3] The algebraic symbols and order of operations are an intermediate case. They came to be through an organic process of push-and-pull in a community, but it was a small community and the candidates were generally formed deliberately by one or a few participants. Fun stuff.
[4] This kind of thing is hard enough that a typical course in C includes a bunch of exercises in how to read these things, but in case you fall out of practice there is a tool (cdecl) just to help you out.
[5] Experienced C programmers will likely intuit still another layer of meaning, guessing that the pointer arguments are probably meant to point to character buffers (that is, strings) rather than single characters, and that the return value is probably negative, zero, or positive (often normalized to -1, 0, or +1) à la strcmp. The name of that layer is "idiom", and as in natural languages, you need it to be really fluent. Our hypothetical experienced programmer might also have a guess about the initial argument (unsigned, so probably a size, so probably the max number of characters to compare...), but that is not so well established in the idiom.