by CJ Quines
Conjunction is adjoint to implication
posets, limits, adjoints
In this post I’m trying to digest some things I’ve been learning from Daniel Rosiak’s Sheaf Theory Through Examples, by building up to the statement “conjunction is adjoint to implication.”
Possible-worlds semantics
Tomorrow, it will either rain, or not rain. These are two possible tomorrows.
Or rather, they’re sets of possible tomorrows. There are many possible tomorrows where it rains. In some of them, I will eat a bowl of ice cream. In others (most of them), I won’t eat any ice cream.
We can represent a proposition like “it rains” as the set R of possible tomorrows where it holds. Similarly, we can represent a proposition like “I eat a bowl of ice cream” as a set I. We’ll abuse notation and use R and I for both the propositions and the sets of possible tomorrows where they hold.
Consider the set of possible tomorrows where it both rains and I eat a bowl of ice cream. As sets, this is R ∩ I: it’s the intersection of the tomorrows where it rains, and the tomorrows where I eat a bowl of ice cream. Thus, the proposition R ∧ I translates to R ∩ I. Note how the symbols ∧ and ∩ look similar.
Similarly, what is the set of possible tomorrows where it rains or I eat a bowl of ice cream? As sets, this is R ∪ I; as propositions, this is R ∨ I.
We can also work backward, starting from a set of possible tomorrows, then thinking about which proposition it represents. Which proposition does the empty set, ∅, represent? It should be a proposition that holds in no possible tomorrows, so it’s false, which we will write as ⊥. And the opposite, the set consisting of all possible tomorrows, represents the proposition true, which we write as ⊤.
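As a sanity check, here’s a small sketch of this semantics in Python (the names and the four-world toy universe are mine, not from Rosiak): worlds are pairs recording the weather and the ice cream, and propositions are frozensets of worlds.

```python
from itertools import product

# A toy universe of possible tomorrows: each world is a pair
# (rains, ice_cream) of booleans, so there are four worlds.
WORLDS = frozenset(product([True, False], repeat=2))

R = frozenset(w for w in WORLDS if w[0])  # "it rains"
I = frozenset(w for w in WORLDS if w[1])  # "I eat a bowl of ice cream"

BOTTOM = frozenset()  # false: holds in no possible tomorrow
TOP = WORLDS          # true: holds in every possible tomorrow

conjunction = R & I   # R ∧ I is intersection
disjunction = R | I   # R ∨ I is union
```

Here `&` and `|` are Python’s set intersection and union, mirroring how ∧ lines up with ∩ and ∨ with ∪.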
Interlude: semantics
In mathematical logic, semantics is about translating logical statements into mathematical objects. When I first studied logic, the semantics I was taught was truth-value semantics: a proposition is one of the Boolean values false or true, and ∧ and ∨ are the Boolean operations on those values. The Boolean operations capture our intuitive sense of when A ∧ B should be true, which is iff A is true and B is true.
What we’ve discussed, where we interpret a proposition as the set of possible worlds in which it’s true, is called possible-worlds semantics (nLab, SEP). Here, ∧ and ∨ are intersection and union, which again capture the intuitive sense for when a possible world is in A ∧ B: iff it’s in A and in B.
Note that truth-value semantics is a special case of possible-worlds semantics, where there’s only one possible world. In particular, if there’s only one possible world, then there are two sets of possible worlds: the empty set (false, ⊥), and the set containing that possible world (true, ⊤).
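To see the collapse concretely, here’s a quick check (a sketch of mine, not from the post) that with a one-element universe, intersection and union act exactly like the Boolean “and” and “or”:

```python
# With one possible world, the only sets of worlds are ∅ (false)
# and the whole one-element universe (true).
w = "the only world"
FALSE, TRUE = frozenset(), frozenset({w})

def to_bool(p):
    """Translate a set of worlds back to a truth value."""
    return p == TRUE

# ∩ behaves as "and", and ∪ behaves as "or", on the two truth values.
for p in (FALSE, TRUE):
    for q in (FALSE, TRUE):
        assert to_bool(p & q) == (to_bool(p) and to_bool(q))
        assert to_bool(p | q) == (to_bool(p) or to_bool(q))
```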
Are there other possible semantics? In my type theory handout, I discuss another kind of semantics, called proof-theoretic semantics, which is about interpreting propositions as their proofs: a proof of A ∧ B is an ordered pair of proofs of A and B, while a proof of A ∨ B is either a proof of A, or a proof of B. (In that handout, I also use products and sums instead of conjunctions and disjunctions; the symbolic counterpart would be × and + instead of ∧ and ∨. But I’m sticking with Rosiak’s notation for consistency.)
External and internal implication
We’ve talked about conjunction (∧) and disjunction (∨). We’ll now discuss implication. In possible-worlds semantics, there are actually two reasonable ways to think about “A implies B”:
- We can say that A implies B when: every time A holds in a possible world, then B also holds in that possible world.
- Or, “A implies B” can be a proposition in itself. It’s the set of possible worlds where, if A holds, then B also holds.
The former is external implication, and the latter is internal implication. The former is a fact that, by looking at all the possible worlds, can be either true or false. The latter is a set of possible worlds, and it will always exist, though sometimes it’s empty. nLab uses the terms entailment and conditional statement.
Concretely, consider some sets of possible tomorrows:
- R: It rains tomorrow.
- I: I eat a bowl of ice cream tomorrow.
- D: I have dessert tomorrow.
Then:
- I externally implies D, because in every tomorrow where I eat a bowl of ice cream, I have dessert. As sets, this means I ⊆ D.
- However, R does not externally imply I. There are some tomorrows where it rains, but I don’t eat a bowl of ice cream.
- The set “R implies I” is the set of possible tomorrows where: if it rains, then I eat a bowl of ice cream. It includes some of the tomorrows in R: particularly, the ones which are also in I. But note that it also includes all the tomorrows in which it doesn’t rain, because that’s how logical implication works.
To introduce some notation, we’ll say that:
- A ≤ B is the statement that A externally implies B, and
- A ⇒ B is the set that represents the internal implication “A implies B”.
The set A ⇒ B is also called the relative pseudocomplement of A with respect to B. It’s the set of all possible worlds except the ones where A holds but B doesn’t. So it’s like the complement of A, taken relative to B.
Let’s check our understanding:
- What’s the relationship between the two sets R ⇒ I and R ⇒ D? A possible tomorrow is in R ⇒ I if, when it rains, I eat a bowl of ice cream. Because I eat ice cream, I have dessert. Thus, it’s also in R ⇒ D. So, every world in R ⇒ I is also in R ⇒ D. Thus, (R ⇒ I) ≤ (R ⇒ D).
- What’s the relationship between external implication and internal implication? Consider the fact that I ≤ D, or, that I externally implies D. Then what is the set I ⇒ D? It consists of all possible worlds in which, if I holds, then D holds. But because every world where I holds is also a world where D holds (because I ≤ D, or I ⊆ D), this must be every world. So I ⇒ D is the set of all possible worlds, or ⊤.
- What’s the proposition R ⇒ ⊥? It’s the set of all possible tomorrows in which, if it rains, then false. So it cannot have any possible tomorrows where it rains. But it has all the possible tomorrows where it doesn’t rain, in the same way that R ⇒ I has all the tomorrows where it doesn’t rain. Thus, R ⇒ ⊥ represents the proposition that it doesn’t rain.
- What’s the proposition (R ⇒ I) ∧ R? Recall that R ⇒ I consists of the worlds that are in both R and I, as well as all the worlds not in R. We are intersecting this with the set of worlds that are in R. So this is R ∧ I.
- What’s the proposition (R ⇒ I) ∨ R? Note that every possible world is either in R, or not in R. But if it’s not in R, it’s in R ⇒ I. So (R ⇒ I) ∨ R has to be every possible world, or ⊤.
Posets, meets, and joins
The notation we’ve been using, with ∧ and ∨ and ≤, is meant to emphasize that we’re working in a poset. We use posets as a bridge to understanding categories because they’re a special case of categories. There’s an nLab page that says “if you’re having trouble understanding an item in [category theory], then first you should understand the item in [order theory].”
Recall that a poset (Wikipedia, nLab) is a set with a reflexive, transitive, and antisymmetric binary relation ≤. The sets of possible worlds form a poset, with the ⊆ we’ve been using above. Depending on your definition of possible worlds, this could be a special case of the fact that, for any set X, its power set is a poset under ⊆.
Through the rest of the post I’ll be implicitly referring to Hasse diagrams of posets. For concreteness, you can imagine this Hasse diagram of the poset of subgroups of the dihedral group of order eight. When I refer to relative directions, like “above” and “below”, or “highest” and “lowest”, I’m referring to an imaginary Hasse diagram that I’m not going to include in this post because I’m too lazy to make diagrams. It’s maybe helpful pedagogically to force you to recreate the diagrams I’m thinking of, I dunno. Anyway.
The poset thinking encourages us to think of sets of possible worlds via only ≤, that is, without referring to individual worlds. So, using only ≤, can we still define ∧ and ∨ and ⇒?
One general trick we’ll use is the notion of taking “the largest” or “the smallest” element with a specific kind of property. Indeed, R ∧ I is “the largest” set that is ≤ R and ≤ I. We’ll now talk about this in lots of detail.
Consider the following property of a set of possible worlds: “It rains in every possible tomorrow in the set.” There are several sets that satisfy this property.
- It’s true of R, the set of every possible tomorrow where it rains.
- It’s also vacuously true for ∅, the empty set.
- It’s also true of the set R ∧ I, the set of possible tomorrows where it both rains and I eat a bowl of ice cream.
- Consider the set of possible tomorrows where it rains, I eat a bowl of ice cream, and I go to work. It’s also true of this set.
Of all the possible sets that satisfy this property, there happens to be one of them which is the largest: R itself. It’s “the largest” in the sense that, if X is any set that satisfies this property, then X ≤ R. Indeed, R ≤ R, and ∅ ≤ R, and (R ∧ I) ≤ R, and even the last set, where I also go to work, is ≤ R.
In the Hasse diagram, I imagine the sets that satisfy this property as a shaded region of the Hasse diagram. The largest one, R, is on top, and everything in the shaded region has an arrow pointing to R. Further, nothing outside of the shaded region has an arrow pointing to R.
Now consider the property “In every possible tomorrow in the set, it rains, and I eat a bowl of ice cream.” Again, several sets satisfy this property, like ∅, or the set where it rains, I eat ice cream, and I go to work. But the largest set that satisfies this property is R ∧ I. Conversely, every set that satisfies this property is ≤ R ∧ I. And every set that doesn’t satisfy this property isn’t ≤ R ∧ I.
Both of these properties can be restated in the same form:
- “It rains in every possible tomorrow in the set” can be written as “the set is ≤ R.”
- “In every possible tomorrow in the set, it rains, and I eat a bowl of ice cream” can be written as “the set is ≤ R and ≤ I.”
In general, the pattern is: we have some set of elements in the poset ({R} in the first one, {R, I} in the second one), and we’re considering all elements that are ≤ each of them. In these cases, there happens to be a largest such element (R in the first one, R ∧ I in the second one). We call that element the meet (Wikipedia, nLab) of the set.
Note that the meet doesn’t need to exist. Consider the poset on four elements whose Hasse diagram has two elements on top and two on the bottom, with each bottom element below each top element. The set of elements that’s ≤ the upper two elements is the lower two elements. But neither of them is larger than the other, so there’s no “largest” one. So the upper two elements in this poset don’t have a meet. But, when it does exist, it’s unique, so we can say the meet. (If there were two, each is ≤ the other, so by antisymmetry they’re equal.)
In the poset of sets of possible worlds, the meet always exists. We can then define A ∧ B as the meet of the set {A, B}. Similarly, we can define ∨ using the dual notion, the join of a set, which is the smallest element that each of them is ≤.
As a special case, note that the meet of an empty set is ⊤, and the join of an empty set is ⊥.
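In the power set lattice, the meet and join of any family can be computed directly as intersection and union. A minimal sketch (names mine), where the empty cases fall out of the starting accumulator:

```python
from functools import reduce
from itertools import product

WORLDS = frozenset(product([True, False], repeat=2))  # (rains, ice_cream)
R = frozenset(w for w in WORLDS if w[0])
I = frozenset(w for w in WORLDS if w[1])

def meet(family):
    """Largest set that is ≤ every member: intersect, starting from ⊤."""
    return reduce(lambda a, b: a & b, family, WORLDS)

def join(family):
    """Smallest set that every member is ≤: union, starting from ⊥."""
    return reduce(lambda a, b: a | b, family, frozenset())

assert meet([R, I]) == R & I     # the meet of {R, I} is R ∧ I
assert meet([]) == WORLDS        # the meet of the empty set is ⊤
assert join([]) == frozenset()   # the join of the empty set is ⊥
```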
Interlude: categories and limits
I’ve mentioned that a poset is a special case of a category. One way it’s special is antisymmetry: in a poset, if A ≤ B and B ≤ A then A = B. If you drop this condition, you get a preorder (nLab, Wikipedia). The thing you lose in a preorder is uniqueness, but things are still unique up to isomorphism.
Generalizing from a preorder to a category is trickier, though. In a preorder, we say A ≤ B, and we don’t have to worry about how A ≤ B. In other words, if A ≤ B, there’s only one way for A to be ≤ B. In a category, we generalize ≤, allowing multiple ways for something to be related to another. We then write f : A → B, and call these multiple ways morphisms.
In a poset, transitivity says that if A ≤ B and B ≤ C, then A ≤ C. The categorical version of this is composition, which says that if f : A → B is a morphism, and g : B → C is another morphism, then you get another morphism g ∘ f : A → C.
To go the other way around, we can start from the definition of a category, and define a thin category, where two morphisms with the same source and target are equal. Then we can define a skeletal category, where two isomorphic objects are equal. A poset is a thin skeletal category.
To me, what feels like the main difference when working with posets and categories is that, in a poset, all diagrams commute. Because there’s only one way to get from one point to another, everything commutes automatically. Thus, to generalize from posets to categories, you have to worry a bit more about making things commute.
As an example, consider how to generalize meets. In a poset, we take meets of sets of elements, and we don’t really care about the ≤s between them. In a category, if we take the “meet” of a set of elements, ignoring any morphisms between them, we generalize meets to products.
To be specific: we have a set of elements, we take all the elements that are ≤ every element in that set, and we see if there’s a “largest” such element, one that everything else is ≤. This is the product (nLab, Wikipedia), so named because in the category of sets, this is the cartesian product.
Back to a poset. Suppose, instead of just thinking about sets of elements, we considered diagrams of elements, where we also care about the ≤s between them. Well, this kinda doesn’t matter when we’re taking their meet—whatever element we take, it’ll respect those ≤s. Like, suppose we had three elements A, B, C satisfying A ≤ B and B ≤ C. And suppose there’s some element X that’s ≤ each of them. Then X ≤ A, by definition, and because A ≤ B ≤ C, then by transitivity we have X ≤ C. And similarly, combining X ≤ B and B ≤ C we get X ≤ C. And by definition (of X), we also have X ≤ C. Which, well, in the poset case, we ended up with X ≤ C thrice, and that’s fine.
But what about in a category? Consider the diagram with three elements A, B, C, with morphisms g : A → B and h : B → C. Then suppose there’s some element X, with morphisms to each of them. There’s some morphism f : X → A, by definition, and because g and h exist, then by composition there’s h ∘ g ∘ f : X → C. And similarly, by definition there’s some p : X → B, and combining p and h we get h ∘ p : X → C. And by definition (of X), we also have some q : X → C. And now, we have three morphisms that go from X to C, namely, h ∘ g ∘ f, h ∘ p, and q. And these aren’t necessarily the same.
If we require them to be the same—that is, you require the whole diagram to commute—we get the notion of the limit (nLab, Wikipedia) of a diagram. In a poset, the limit of any diagram happens to be the same as the product of its elements; they’re both the meet. But because things don’t automatically commute in a category, the limit of a diagram isn’t necessarily the same as the product of its elements.
Dually, the generalizations of joins are coproducts and colimits.
Lattices and adjoints
A lattice (Wikipedia, nLab) is a poset with all finite meets and joins. By taking the meets and joins of empty sets, we get that a lattice must have ⊥ and ⊤, which are smaller and larger than every element. Our poset of the sets of possible worlds is in fact a lattice.
Given a lattice, we now move to the task of defining ⇒. How do we characterize A ⇒ B in terms of what we have? We’ll end up doing something similar to what we did for meets and joins, where we take “the largest” set with a particular kind of property.
Let’s recall our running example, where R represents the proposition “it rains tomorrow” and I represents the proposition “I eat a bowl of ice cream tomorrow.”
The property we’ll consider is: “In every possible tomorrow in the set, if it rains, then I eat a bowl of ice cream.” Let’s consider some sets that satisfy this property:
- It’s vacuously true of ∅, the empty set.
- It’s also true of R ∧ I, the possible tomorrows where it both rains and I eat a bowl of ice cream.
- It’s also true of the set of possible tomorrows where it doesn’t rain, because that’s how logical implication works.
The key property here is its relationship with “what happens when it rains”, or what happens when you ∧ the set with R. In particular, these are all rewrites of each other:
- “For each possible tomorrow in the set where it rains, I eat a bowl of ice cream.”
- “The tomorrows in the set that are in R are also in I.”
- “When you ∧ the set with R, the result is ≤ I.”
We can then define R ⇒ I to be (provided it exists) the largest such set: the largest set so that, when you ∧ it with R, the result is ≤ I. This means that a set X satisfies this property iff X ≤ (R ⇒ I). We can thus write

R ∧ X ≤ I
─────────
X ≤ (R ⇒ I)

where the bar means “if and only if”. In general, lattices don’t always have an ⇒, but the power set lattice does, and so does the poset of sets of possible worlds.
This general shape is called an adjunction. Given two functors (≤-preserving functions) F and G, both from the poset to itself, we say that F is adjoint to G if:

F(X) ≤ Y
─────────
X ≤ G(Y)
(Here, I’m using the category theory words functor and adjoint. In usual order theory, functors are called monotone maps (Wikipedia, nLab), and adjoint pairs are called a Galois connection (nLab, Wikipedia). If the word functor scares you, you can replace it with function and you’ll be fine.)
Note that adjoints are asymmetric: we call F the left adjoint and G the right adjoint. In this particular case, F is the functor R ∧ − and G is the functor R ⇒ −. So conjunction by R is adjoint to implication from R. Or, in snappier phrasing, conjunction is adjoint to implication.
Provided that a right adjoint to R ∧ − exists, it must be unique. This is for similar reasons to the above: if we want to compute R ⇒ I, we take all X such that R ∧ X ≤ I, and then take the largest. (“…the largest set so that, when you ∧ it with R, the result is ≤ I.”) Similarly, left adjoints are unique.
We can think of adjoints as “the closest thing to inverses”. Consider that the functor R ∧ − takes a set X and returns R ∧ X. If an inverse functor existed, it would take the set R ∧ X and return the original set X. But, given just R and R ∧ X, we can’t tell what the original X is. If we imagine the Hasse diagram, and I highlight R and R ∧ X, then there’s a region of possible Xs, starting from R ∧ X, and going up. We then take the largest of these, which is R ⇒ (R ∧ X).
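On a small universe we can brute-force the adjunction, checking R ∧ X ≤ I iff X ≤ (R ⇒ I) for every X in the power set (a sketch of mine, reusing the four-world toy universe from before):

```python
from itertools import product, combinations

WORLDS = frozenset(product([True, False], repeat=2))  # (rains, ice_cream)
R = frozenset(w for w in WORLDS if w[0])
I = frozenset(w for w in WORLDS if w[1])

def impl(A, B):
    """Internal implication A ⇒ B: all worlds except those in A but not B."""
    return (WORLDS - A) | B

def subsets(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for n in range(len(s) + 1)
            for c in combinations(s, n)]

# The adjunction: R ∧ X ≤ I exactly when X ≤ (R ⇒ I), for every X.
for X in subsets(WORLDS):
    assert ((R & X) <= I) == (X <= impl(R, I))

# And R ⇒ I really is the largest X with R ∧ X ≤ I: the union of all
# the candidates equals R ⇒ I.
candidates = [X for X in subsets(WORLDS) if (R & X) <= I]
assert frozenset().union(*candidates) == impl(R, I)
```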
Now let’s consider the dual case. If conjunction is adjoint to implication, then what is adjoint to disjunction? Using the “smallest” perspective, the left adjoint must be the smallest set that satisfies the property:
- “When you ∨ the set with I, then R ≤ the result.”
- “The possible tomorrows in R are in I, or in the set.”
- “For each possible tomorrow where it rains, then I eat a bowl of ice cream, or it’s in the set.”
Many sets satisfy this property. It’s trivially true of ⊤, the set of all possible worlds. It’s also true of R, the set of tomorrows where it rains. But what’s the smallest set that satisfies this property? It’s R ∖ I, or the set of possible tomorrows where it rains, but I don’t eat a bowl of ice cream. So subtracting I is adjoint to disjunction by I. In snappier words, subtraction is adjoint to disjunction:

R ∖ I ≤ X
─────────
R ≤ I ∨ X
Like the situation with ⇒, not every lattice has a ∖, but the power set lattice does, and so does the poset of sets of possible worlds.
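This dual adjunction can be brute-forced too. On the same four-world toy universe (again a sketch of mine), R ∖ I ≤ X holds exactly when R ≤ I ∨ X:

```python
from itertools import product, combinations

WORLDS = frozenset(product([True, False], repeat=2))  # (rains, ice_cream)
R = frozenset(w for w in WORLDS if w[0])
I = frozenset(w for w in WORLDS if w[1])

def subsets(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for n in range(len(s) + 1)
            for c in combinations(s, n)]

# The dual adjunction: R ∖ I ≤ X exactly when R ≤ I ∨ X, for every X.
for X in subsets(WORLDS):
    assert ((R - I) <= X) == (R <= (I | X))

# And R ∖ I really is the smallest such X: it's the intersection of
# all the candidates.
candidates = [X for X in subsets(WORLDS) if R <= (I | X)]
smallest = WORLDS
for X in candidates:
    smallest &= X
assert smallest == R - I
```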