by CJ Quines • on
Declarative programming as mindset
hint: it’s not about map/filter/reduce
The usual take on “declarative” programming is that it’s about saying what you want to get done, not how; or about using reduce rather than for loops; or about immutability or isolating side effects. But that’s not what I think of when I talk about doing “declarative” programming. Here’s some examples.
Here’s two approaches to running a long chain of functions, each depending on the results of the previous. You can call the functions in sequence, collecting the results and passing them from one function to the next. Or you can make each function responsible for its own result; if it doesn’t have any inputs, it asks for them from the previous function. The second approach feels more “declarative” to me than the first one.
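As a rough sketch of the difference, here’s a made-up three-step chain in Python. The names load, clean, and report are invented for illustration; only the shape matters. In the first approach a driver calls the steps in order and passes results along. In the second, each step asks the previous one for its input when it wasn’t given any.

```python
# Hypothetical three-step chain; the step names are illustrative only.

def load():
    return [3, 1, 2]

def clean(raw=None):
    # Pull: if no input was given, ask the previous step for it.
    if raw is None:
        raw = load()
    return sorted(raw)

def report(cleaned=None):
    if cleaned is None:
        cleaned = clean()
    return ", ".join(str(x) for x in cleaned)

# First approach: the caller sequences the steps and passes results along.
def run_in_sequence():
    raw = load()
    cleaned = clean(raw)
    return report(cleaned)

# Second approach: ask for the final result; each step pulls what it needs.
def run_by_pulling():
    return report()
```

Both give the same answer; what changes is who is responsible for knowing the order of the steps.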
Here’s two approaches to designing an API to list the files in a file system. You can make the server remember your current directory, and you can ask the API to go up one directory, or to go down a subdirectory. Or you can remember the current directory yourself, and when you ask the API you have to give both your current directory and where you want to go. The second approach feels more “declarative” to me than the first one.
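Here’s a toy version of both API designs, assuming an in-memory file tree. TREE, StatefulApi, and list_files are hypothetical names, not any real API. The first design keeps the current directory on the server; the second takes it as an argument every time.

```python
# A toy in-memory file tree; every name here is hypothetical.
TREE = {
    "/": ["docs", "src"],
    "/docs": ["notes.txt"],
    "/src": ["main.py"],
}

def join(directory, name):
    return "/" + name if directory == "/" else directory + "/" + name

def parent(directory):
    head = directory.rsplit("/", 1)[0]
    return head or "/"

class StatefulApi:
    """First approach: the server remembers the current directory."""
    def __init__(self):
        self.cwd = "/"
    def down(self, name):
        self.cwd = join(self.cwd, name)
    def up(self):
        self.cwd = parent(self.cwd)
    def list(self):
        return TREE[self.cwd]

def list_files(cwd, go_to):
    """Second approach: the caller supplies its directory and the move."""
    target = parent(cwd) if go_to == ".." else join(cwd, go_to)
    return target, TREE[target]
```

With list_files, the same request always gives the same answer, no matter what requests came before it.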
Here’s two approaches to data fetching. You can fetch data in the parent component, and pass it down to child components through props, context, or a global store (like Redux or Zustand). Or you can fetch data in the child component itself, perhaps with a data fetching library (like Query or SWR), and trust that the network requests won’t be repeated or something. Again, the second approach feels more “declarative” to me than the first one.
What makes the second approaches more “declarative” doesn’t have to do with how they’re implemented. It doesn’t matter which language or library you’re using; the second approaches are still more “declarative”. That’s why I’ve been writing “declarative” in quotes: it’s different from how others might use the word. I’ll drop the quotes from now on, with the understanding that declarative refers to approach, not implementation.
I’m not sure exactly how to define what declarative means for me. I think it has to do with two principles: laziness and statelessness.
Laziness
Let’s say I have a class that stores a list and can compute its sum:
class List:
    items = []
    def set_items(self, items):
        self.items = items
    def get_sum(self):
        return sum(self.items)
This is great. For the sake of example, let’s say sum was a more expensive function, and we measured and found out that calling get_sum was too expensive. I could change it to:
class List:
    items = []
    sum = 0
    def set_items(self, items):
        self.items = items
        self.sum = sum(self.items)
    def get_sum(self):
        return self.sum
Let’s say we have to compute the square of the sum of the list as well (which, for the sake of example, also happens to be expensive). We can write something like:
class List:
    items = []
    sum = 0
    square = 0
    def set_items(self, items):
        self.items = items
        self.sum = sum(self.items)
        self.square = self.sum ** 2
    def get_sum(self):
        return self.sum
    def get_square(self):
        return self.square
I’d call this a push approach to computing the sum and squared sum. When we update the list, we have to remember to push the changes to the computed values. The responsibility for keeping sum updated rests with set_items, and not get_sum.
There’s some disadvantages with a push approach:
- If we never call get_sum, then the work we did computing sum was useless. This becomes a problem if we don’t always expect to call get_sum.
- Whenever we have something that depends on items, we need to compute that value whenever we change items. This becomes a problem the more things depend on items. We’ve seen this when we added square.
- The knowledge for how get_sum works lives in set_items, and not in get_sum, where it’d be more natural to be. This becomes a problem when future maintainers are trying to understand how get_sum works: they have to read set_items as well.
The opposite of a push approach would be a pull approach:
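class List:
    items = []
    summed_items = []  # snapshot of the items that self.sum was computed from
    sum = 0
    squared_sum = 0    # the sum that self.square was computed from
    square = 0
    def set_items(self, items):
        self.items = items
    def get_sum(self):
        # Recompute only if items changed since the last computation.
        if self.items != self.summed_items:
            self.summed_items = self.items[:]
            self.sum = sum(self.summed_items)
        return self.sum
    def get_square(self):
        # Recompute only if the sum changed since the last computation.
        if self.get_sum() != self.squared_sum:
            self.squared_sum = self.get_sum()
            self.square = self.squared_sum ** 2
        return self.square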
Now get_sum is memoized. This addresses the three disadvantages above, at the cost of using more memory (by keeping summed_items) and being a bit more complicated (it has to check that sum is updated).
Here’s a different way to do a pull approach:
class Versioned:
    def __init__(self):
        self.version = 0
        self.value = None

class List:
    def __init__(self):
        # Each List gets its own cells; as class attributes they would be
        # shared (and mutated) across every instance.
        self.items = Versioned()
        self.sum = Versioned()
        self.square = Versioned()
    def set_items(self, items):
        self.items.version += 1
        self.items.value = items
    def get_sum(self):
        if self.sum.version < self.items.version:
            self.sum.version = self.items.version
            self.sum.value = sum(self.items.value)
        return self.sum.value
    def get_square(self):
        if self.square.version < self.items.version:
            self.square.version = self.items.version
            # Store into .value; assigning to self.square would replace the cell.
            self.square.value = self.get_sum() ** 2
        return self.square.value
Here, we use versions to keep sum updated. Each value has a version, and that’s how we can tell whether something’s outdated.
There’s some repeated logic in both cases. That’s something that a declarative language or a functional library can help with.
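To give a feel for what that help might look like, here’s a minimal sketch that pulls the version-checking logic into two small helper classes. Cell and Computed are names I’m making up for this sketch; they aren’t from any particular library, and real libraries handle much more (multiple dependencies, cleanup, and so on).

```python
class Cell:
    """An input value that bumps its version every time it is set."""
    def __init__(self, value):
        self.value = value
        self.version = 0
    def set(self, value):
        self.value = value
        self.version += 1
    def get(self):
        return self.value

class Computed:
    """A value derived from one source, recomputed only when the source changed."""
    def __init__(self, source, fn):
        self.source = source
        self.fn = fn
        self.value = None
        self.version = 0
        self.seen = None  # source version the cached value was computed from
    def get(self):
        # Asking the source first lets a Computed source refresh itself.
        source_value = self.source.get()
        if self.seen != self.source.version:
            self.value = self.fn(source_value)
            self.seen = self.source.version
            self.version += 1
        return self.value

items = Cell([])
total = Computed(items, sum)
square = Computed(total, lambda s: s ** 2)
```

Calling items.set(...) does no work beyond storing the list. The sum and the square are only computed when someone calls total.get() or square.get(), and only if something upstream has changed since the last call.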
To me, making data more pull than push feels like an important part of writing declarative programs. As much as possible, a piece of code should not be responsible for what other code does. The word I think most closely describes this is laziness.
Statelessness
Our List class carries three things. It carries:
- state, which is data (items);
- computations, which are ways to derive state (get_sum, get_square); and
- effects, which are ways to change state (set_items), or interact with the world.
A class couples state, effects, and computations together. I think of something that carries state or depends on an effect as stateful. Something that only does computations or produces effects is stateless. Part of declarative programming is making things stateless when possible. We can extract get_sum and get_square out of the List class into a function:
def sum_square(items):
    summed = sum(items)
    squared = summed ** 2
    return (summed, squared)
This is a function that doesn’t have any state of its own. Let’s compare this to:
calls = 0

def discount_sum(items):
    global calls
    calls += 1
    summed = sum(items)
    # Every 10th call gets a discount!
    if calls % 10 == 0:
        summed = 0.9 * summed
    return summed
This function is not stateless, because it depends on calls. While it also depends on items, that’s not part of the function’s state; it’s someone else’s state that gets passed to the function.
Stateless functions are nice because their dependencies are explicit. They depend only on their inputs to produce their output and perform whatever effects they have. Because the dependencies are explicit, we can memoize a stateless function by comparing inputs. If the inputs to sum_square are the same, then the output will be the same too.
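One way to sketch this in Python is with functools.lru_cache from the standard library, which keys its cache on the arguments. The one thing I’m assuming here that wasn’t in the original function is that items gets passed as a tuple, because lru_cache needs hashable arguments.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def sum_square(items):
    # items must be hashable (a tuple, not a list) to serve as a cache key.
    summed = sum(items)
    squared = summed ** 2
    return (summed, squared)

first = sum_square((1, 2, 3))   # computed
second = sum_square((1, 2, 3))  # same inputs, served from the cache
```

Nothing about sum_square itself had to change to make this safe; the cache only needs to look at the inputs.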
With a stateful function, we don’t have this guarantee. I can call discount_sum with the same inputs, at two different times, and get two different outputs, because calls is different. To memoize a stateful function, I also have to keep track of its state, and compare both inputs and state.
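Here’s a sketch of what comparing both inputs and state could look like. The call count becomes an explicit argument, under the made-up name call_number, so the cache key covers it too. This is one possible illustration, not the only way to do it, and it again assumes items is a tuple so it can be hashed.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def discount_sum_at(call_number, items):
    # call_number stands in for the global calls counter; passing it in
    # makes the state part of the cache key alongside items.
    summed = sum(items)
    # Every 10th call gets a discount!
    if call_number % 10 == 0:
        summed = 0.9 * summed
    return summed

ninth = discount_sum_at(9, (100,))
tenth = discount_sum_at(10, (100,))
tenth_again = discount_sum_at(10, (100,))  # same inputs and state: cache hit
```

The cache is now correct, but it only hits when both the items and the call number match, which shows how much harder state makes memoizing.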
Here’s a more subtle example of a stateful function:
from datetime import datetime

def discount_sum(items):
    summed = sum(items)
    # Holiday discount!
    if datetime.now().month == 12:
        summed = 0.9 * summed
    return summed
While it may not look like it depends on state, it depends on an effect, datetime.now. Depending on the result of an effect is still depending on state that’s not passed into the function. Again, if called with the same input, at different times, it can give different results.
We can make discount_sum stateless by making the dependency part of the inputs:
def discount_sum(date, items):
    summed = sum(items)
    # Holiday discount!
    if date.month == 12:
        summed = 0.9 * summed
    return summed
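A quick usage sketch of the stateless version. The two dates are arbitrary ones I picked for illustration. With the date passed in, the same call gives the same result whenever it runs, and the effect moves out to whoever calls the function.

```python
import datetime

def discount_sum(date, items):
    summed = sum(items)
    # Holiday discount!
    if date.month == 12:
        summed = 0.9 * summed
    return summed

# Arbitrary example dates, chosen only to hit both branches.
in_december = discount_sum(datetime.date(2024, 12, 25), [10, 20])
in_july = discount_sum(datetime.date(2024, 7, 4), [10, 20])

# The caller decides where "now" comes from.
right_now = discount_sum(datetime.datetime.now(), [10, 20])
```

This also makes the December branch easy to check in July, which the datetime.now version doesn’t allow without changing the clock.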
I’m making a subtle distinction between a stateless function and a pure function. A pure function is a stateless function that doesn’t have any effects. Here’s a function that’s stateless but not pure:
def print_sum(items):
    summed = sum(items)
    print(summed)
    return summed
Pure functions are nice for all the reasons that stateless functions are nice. There’s other nice things about pure functions that we don’t get with stateless ones, but in my mind, not that much more.