A type-checking square dance puzzle

Here’s a square dance puzzle I tweeted about the other day: how many permutations of [Initially, Finally, Stable, Echo, Tandem, Mix] are valid square dance calls?

To explain the puzzle, here’s some terminology:

Calls are certain square dance instructions, like Mix or Hinge.
Concepts are functions that take a call and return another call, like Stable or Tandem.
Metaconcepts are functions that take a concept and return another concept, like Initially, Finally, or Echo.

Functions are written before what they’re applied to, like how, in math, we write $f(x)$ to mean the function $f$ applied to an input $x$ . For example:

Echo is a metaconcept, and Tandem is a concept. Applying Echo to Tandem gives Echo Tandem, which is also a concept.
Echo Tandem is a concept, and Mix is a call. Applying gives Echo Tandem Mix, which is also a call.
Finally is a metaconcept, and Stable is a concept, so Finally Stable is a concept.
Initially is a metaconcept, and Finally Stable is a concept, so Initially Finally Stable is a concept.
Initially Finally Stable is a concept, and Echo Tandem Mix is a call, so Initially Finally Stable Echo Tandem Mix is a call.

Metaconcepts can only be applied to concepts, and concepts can only be applied to calls. There’s only one way to read Initially Finally Stable Echo Tandem Mix that follows these rules, and it’s ((Initially (Finally Stable)) ((Echo Tandem) Mix)).

A list of words is a valid call if it can be formed using these rules. Mix is a valid call, and so is Tandem Mix, but Mix Tandem isn’t. So the question is: of the $6! = 720$ permutations of [Initially, Finally, Stable, Echo, Tandem, Mix], which ones are valid calls?

(Note that the rest of this post isn’t about square dancing at all. If you want more square dancing content, check out my post on the non-compositionality of square dance calls.)

Writing a parser

We can solve this problem by writing a small parser. In our case, a parser is a program that takes a list of words and checks whether it’s a valid call. Specifically, we’ll write a recursive descent parser.

We’ll write three functions, for parsing metaconcepts, concepts, and calls. Each function takes a list of words, tries to parse its thing, and then return the rest of the words. If it can’t do that, it raises an exception. For example, for metaconcepts:

def parse_metaconcept(words):
    metaconcept, *rest = words
    assert metaconcept in METACONCEPTS
    return metaconcept, rest

This takes the list of words, then checks if the first word is a metaconcept. If so, it returns what it parsed, and the rest of the words. Else, it’ll raise an exception:

>>> parse_metaconcept(["finally", "stable", "mix"])
("finally", ["stable", "mix"])
>>> parse_metaconcept(["mix"])
Exception

Concepts can be formed in two ways. Either it’s a concept word, or it’s a metaconcept followed by a concept:

def parse_concept(words):
    try:
        concept, *rest = words
        assert concept in CONCEPTS
        return concept, rest
    except Exception:
        metaconcept, rest = parse_metaconcept(words)
        concept, rest = parse_concept(rest)
        return [metaconcept, concept], rest

This function correctly parses things like Tandem or Finally Stable as concepts, and raises an exception for things that aren’t, like Mix or Initially Echo:

>>> parse_concept(["finally", "stable", "mix"])
(["finally", "stable"], ["mix"])
>>> parse_concept(["mix"])
Exception

The function for parsing calls is similar:

def parse_call(words):
    try:
        call, *rest = words
        assert call in CALLS
        return call, rest
    except Exception:
        concept, rest = parse_concept(words)
        call, rest = parse_call(rest)
        return [concept, call], rest

And it works!

>>> parse_call(["finally", "stable", "mix"])
([["finally", "stable"], "mix"], [])

Putting it together, we have

from itertools import permutations

METACONCEPTS = ["initially", "finally", "echo"]
CONCEPTS = ["stable", "tandem"]
CALLS = ["mix"]

def parse_metaconcept(words):
    ...

def parse_concept(words):
    ...

def parse_call(words):
    ...

for words in permutations(
    METACONCEPTS
    + CONCEPTS
    + CALLS
):
    try:
        call, rest = parse_call(words)
        assert not rest
        print(call)
    except Exception:
        pass

Running this program prints this, with quotes omitted for clarity:

[[initially, [finally, stable]], [[echo, tandem], mix]]
[[initially, [finally, [echo, stable]]], [tandem, mix]]
[[initially, [finally, [echo, tandem]]], [stable, mix]]

followed by 45 more lines. There’s 48 valid permutations, the answer to our puzzle.

Polish notation and arity

Note this neat fact: we didn’t need any parentheses to parse Initially Finally Stable Echo Tandem Mix as ((Initially (Finally Stable)) ((Echo Tandem) Mix)). Compare this to arithmetic, where we need parentheses to know that $(1 + 2) \times 3$ is different from $1 + (2 \times 3)$ .

At first, I thought this followed from the fact that Polish notation doesn’t need parentheses if each operator has fixed arity. For context, in Polish notation, or prefix notation, instead of putting operators between, we put them in front. Rather than writing $1 + 2$ , we write $+\,1\,2$ . Then $(1 + 2) \times 3$ would be $\times(+ 1\,2)\,3$ , and $1 + (2 \times 3)$ would be $+\,1\,(\times \,2\,3)$ .

In this case, we can omit parentheses, and it’d still be clear what we mean. Say we’re reading $+\,1 \times 2\,3$ . We see the $+\,1 \times$ , and we can tell that the $\times$ isn’t the thing being added to $1$ , because it’s not a number. Instead, it’s going to produce a number that’ll be added to $1$ . We keep reading and see $\dots \times 2\,3$ , which are two numbers, so we know they’ll end up as a group.

Does the same argument apply to square dance words? In our arithmetic world, the operators $+$ and $\times$ are binary; they’re functions that take two things. In our square dance world, concepts and metaconcepts are unary; they’re functions that take one thing. The number of things a function takes as input is called its arity.

We have unary functions in arithmetic too, like $-$ and $\sqrt{}$ . A sequence of unary functions doesn’t need parentheses; we can tell that $\sqrt{} - - 4$ means $\sqrt{-(-4)}$ . But if we applied the same logic to Initially Finally Stable Echo Tandem Mix, we’d read it as (Initially (Finally (Stable (Echo (Tandem Mix))))), which isn’t right. How is it different, then?

Assigning types

The difference is with types. In the world of arithmetic, there are only two types: numbers and operators. But with our square dance words, there are three types: calls, concepts, and metaconcepts.

For us, there’s one important fact about types: functions restrict the type of their inputs. This is how we know $\sqrt{+}$ isn’t valid, because the square root function expects a number as its input, not an operator. We can omit parentheses in Polish notation not only because of arity, but also because of types.

It’s the same story for square dance calls. We can say that:

calls have type $C$ ,
concepts have type $C \to C$ , and
metaconcepts have type $(C \to C) \to (C \to C)$ .

It’s types that let us know that Echo Tandem Mix is (Echo Tandem) Mix. Because if we read this as Echo (Tandem Mix), then the metaconcept Echo would take the call Tandem Mix as an input, which is a type error. This is valid:

\begin{align*} \underbrace{\underbrace{\underbrace{\text{Echo}}_{(C \to C) \to (C \to C)} \underbrace{\text{Tandem}}_{C \to C}}_{C \to C}\quad \underbrace{\text{Mix}}_{C}}_{C} \end{align*}

This is not:

\begin{align*} \underbrace{\underbrace{\text{Echo}}_{(C \to C) \to (C \to C)}\quad \underbrace{\underbrace{\text{Tandem}}_{C \to C}\quad \underbrace{\text{Mix}}_{C}}_{C}}_{\text{type error!}} \end{align*}

String rewriting

Writing all those $C \to C$ s is kind of a pain, so let’s introduce abbreviations:

concepts have type $K = C \to C$ ,
metaconcepts have type $M = K \to K = (C \to C) \to (C \to C)$ .

Our Echo Tandem Mix example now becomes:

\begin{align*} \underbrace{\underbrace{\underbrace{\text{Echo}}_{M}\quad \underbrace{\text{Tandem}}_{K}}_{K}\quad \underbrace{\text{Mix}}_{C}}_{C} \end{align*}

We can replace our rules for how functions and types work with two string rewriting rules. If we have $MK$ next to each other, we can replace both with $K$ , and if we have $KC$ next to each other, we can replace both with $C$ . Then Echo Tandem Mix would correspond to the string $MKC$ , which we simplify as $(MK)C \to (KC) \to C$ . The order we apply the rules matters, which corresponds to how we parenthesize it; if we tried doing $M(KC) \to MC$ , we’d get stuck.

Similarly, our Initially Finally Stable Echo Tandem Mix example becomes

\begin{align*} M(MK)(MK)C \to (MK)(KC) \to (KC) \to C. \end{align*}

Recasting our square dance words with these letters turns it to a string rewriting system. From here we can start talking about production rules and grammars. But let’s stop, and pivot to talking about math.

Doing math

Let’s recall the original problem (which, to be fair, was forever ago). How many permutations of [Initially, Finally, Stable, Echo, Tandem, Mix] are valid square dance calls? We can solve this problem by doing some math.

What word comes immediately before Mix? It can’t be a metaconcept, because a metaconcept can only be followed by a concept. There’s two possibilities, then, for the word that comes before Mix: either Stable or Tandem. Then we can observe that the four remaining words can come in any order. The whole thing would look like concept-concept-call, with each concept possibly having metaconcepts applied to it. That means there’s $2 \times 4! = 48$ possibilities.

Here’s another way to think about it. If we write calls as $C$ , concepts as $K$ , and metaconcepts as $M$ , the only thing that matters for validity is whether the sequence of letters reduces to $C$ in our string rewriting system. There’s four valid sequences: $MMMKKC$ , $MMKMKC$ , $MKMMKC$ , and $KMMMKC$ . Once we choose one of these sequences, we then have $3!$ ways to replace the $M$ s with metaconcepts, $2!$ ways to replace the $K$ s with concepts, and $1!$ way to replace the $C$ s with calls. This gives $4 \times 3! \times 2! \times 1! = 48$ possibilities.

Generalizing

The logic in our first solution generalizes to any number of concepts and metaconcepts. If we have $k$ concepts and $m$ metaconcepts, then there’s $k$ choices for the word immediately before the call, and the other $k + m - 1$ words can come in any other before that, giving $k(k + m - 1)!$ possibilities. If we wanted to be confident, we can change our parser program to check this.

What if we had more than one call? We can then ask which permutations give a sequence of calls, rather than a single call. For example, if we added the call Hinge, then the words Echo Finally Stable Hinge Initially Tandem Mix is a valid sequence of two calls: ((Echo (Finally Stable)) Hinge), then ((Initially Tandem) Mix). If we have $c$ calls, $k$ concepts, and $m$ metaconcepts, how many permutations of these words give a valid sequence of calls?

It’s not clear to me how to modify the solutions to generalize to many calls. Well, if I think about it for longer than a few seconds, I might figure it out. But the nice thing about the parser solution is that generalizing it to many calls isn’t that hard:

for words in permutations(
    METACONCEPTS
    + CONCEPTS
    + CALLS
):
    calls = []
    while words:
        try:
            call, words = parse_call(words)
            calls.append(call)
        except Exception:
            break
    else:
        print(calls)

In the case where we have 2 calls, 2 concepts, and 3 metaconcepts, there are 288 valid permutations.

Guessing the formula

Now that we have a program, it’s easier for me to run the program on different values of $c$ , $k$ , and $m$ , than try to do math. I’ll take an engineer’s induction approach to find the formula instead, by getting some values and guessing the formula. For $c = 2$ , I got this data:

$k$ \ $m$	1	2	3	4
1	4	8	24	96
2	24	72	288	1440
3	144	576	2880	17280
4	960	4800	28800	201600

Running with $(k, m) = (4, 4)$ takes about a minute on my laptop, which is about the limit of my patience, so that’s how far the table goes.

Recall that $k(k + m - 1)!$ was the formula we got for $c = 1$ . In this table, we can see there’s still some $(k + m - 1)!$ factor. If we divide by that:

$k$ \ $m$	1	2	3	4
1	4	4	4	4
2	12	12	12	12
3	24	24	24	24
4	40	40	40	40

Aha! What is this 4, 12, 24, 40 sequence? Looking at OEIS, the first result is $4\binom{k + 1}{2}$ , which sounds reasonable.

Putting it together, we get a formula of $4 \binom{k + 1}{2}(k + m - 1)!$ when $c = 2$ .

We can do a similar thing for $c = 3$ . At this point, I’m pretty confident there’s a factor of $(k + m - 1)!$ , so I’m only going to take the $m = 1$ case. Then I divide by $(k + m - 1)!$ , and then I divide by the common factor. This gives me:

$k$	$m=1$	divide $(k + m - 1)!$	divide 18
1	18	18	1
2	144	72	4
3	1080	180	10
4	8640	360	20

Using OEIS, we can find that the last column is $\binom{k + 2}{3}$ . To recap the formulas we have so far:

$c$	formula
1	$1 \binom{k + 0}{1} (k + m - 1)!$
2	$4 \binom{k + 1}{2} (k + m - 1)!$
3	$18\binom{k + 2}{3} (k + m - 1)!$

where I slightly rewrote the $c = 1$ formula to make the pattern clearer. We’re almost done with guessing the formula, except we need to figure out this 1, 4, 18 sequence. Looking at OEIS again, the first result is $c \cdot c!$ . Assuming this is right, the full formula would be

\begin{align*} c \cdot c! \binom{k + c - 1}{c}(k + m - 1)!. \end{align*}

Checking with some small values for $c$ , $k$ , and $m$ against the output of our parser, this looks correct!

Working backwards

Now that we have an answer, we’re done! We can stop, as in the great tradition of Ramanujan. Or we can try to find an explanation.

From our second mathy solution, we know that if we can count the number of valid strings of $M$ s, $K$ s, and $C$ s, we can then multiply by $m!\,k!\,c!$ to get the answer. Working backwards, then, let’s see what we get when we divide our formula by this:

\begin{align*} &\phantom{=} \frac{c \cdot c! \binom{k + c - 1}{c}(k + m - 1)!}{m!\,k!\,c!} \\[1em] &= \frac{(k + c - 1)!}{(c - 1)!\,(k - 1)!}\cdot\frac{(k + m - 1)!}{m!\,k!} \\[1em] &= \frac{(k + c - 1)!}{(c - 1)!\,k!}\cdot\frac{(k + m - 1)!}{(k - 1)!\,m!} \\[1em] &= \binom{k + c - 1}{k} \binom{k + m - 1}{m}. \end{align*}

This is where my combinatorics knowledge comes in handy, because I recognize these binomial coefficients from stars and bars. We can interpret the first factor as the number of $c$ -tuples of non-negative integers whose sum is $k$ , and the second factor as the number of $k$ -tuples of non-negative integers whose sum is $m$ . Is there a bijection from these two things to valid $M$ , $K$ , and $C$ strings?

Let’s take our Echo Finally Stable Hinge Initially Tandem Mix example. This is the string $MMKCMKC$ .

For the first factor, we could correspond it to $c$ -tuples that sum to $k$ by splitting at the $C$ s, and then counting the $K$ s in each block. In this case it’d be $(1, 1)$ . As every concept must be followed by some call, splitting at the $C$ s gives us $c$ blocks, not $c + 1$ blocks. The total number of $K$ s is $k$ , so the sum of the tuple is also $k$ .
For the second factor, we could split at the $K$ s and count the number of $M$ s in each block. In this case, it’d be $(2, 1)$ . This works for a similar reason, as every metaconcept must be followed by some concept.

Thus, given a valid string, we can construct a $c$ -tuple of non-negative integers with sum $k$ , and a $k$ -tuple of non-negative integers with sum $m$ . We can also go backwards: if we had the tuples $(0, 2)$ and $(1, 2)$ , the string would be $CMKMMKC$ . This means it’s a bijection, like we wanted!

This explanation also gives us insight if we wanted to generalize by a level deeper, but I’ll leave that as an exercise.

But why?

I know what you’re thinking. Or at least, I know what I’m thinking, and that’s, why are you writing and posting this?

My list of blog ideas doesn’t get shorter. It stands to reason that, if I wanted to “maximize the impact of my blog,” I’d pick the best thing on the list, then write about that next. The trouble with this approach is that it doesn’t match my own motivation for what I want to write about, which is rather fickle. My life is eclectic potpourri; why would my blog be any different?

You could argue that maybe I could curate my blog. Keep the Actually Good writing separated from chaff like this post, which come up only to be discarded. I’ve tried doing this before, and I think separating by a vague metric like “quality” or “how many people I think will like it” leads to less total output. (As opposed to something like “how comfortable I’d be with strangers seeing this on the internet” which is easier to gauge after years of blogging.)

On why I thought about this in the first place… well, the question came up, and I was bored. It kinda happened that this topic intersected a lot of my interests, which I thought was neat. Something something interdisciplinary something something.