Infinite Ascent.

by CJ Quineson

Declarative programming as mindset

hint: it’s not about map/filter/reduce

The usual take on “declarative” programming is that it’s about saying what you want to get done, not how; or about using reduce rather than for loops; or about immutability or isolating side effects. But that’s not what I think of when I talk about doing “declarative” programming. Here are some examples.

Here’s two approaches to running a long chain of functions, each depending on the results of the previous. You can call the functions in sequence, collecting the results and passing them from one function to the next. Or you can make each function responsible for its own result; if it doesn’t have any inputs, it asks for them from the previous function. The second approach feels more “declarative” to me than the first one.
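
To make that concrete, here’s a rough sketch of both approaches. The three steps (parse, total, report) and the Pipeline class are made up for illustration; they’re not from any real library.

```python
from functools import cached_property

# Hypothetical three-step chain; the step names are illustrative only.
def parse(text):
    return [int(word) for word in text.split()]

def total(numbers):
    return sum(numbers)

def report(value):
    return f"total = {value}"

# First approach: the caller runs each step and threads the results through.
def run_in_sequence(text):
    numbers = parse(text)
    value = total(numbers)
    return report(value)

# Second approach: each step is responsible for its own result,
# and asks the previous step for its input when it needs it.
class Pipeline:
    def __init__(self, text):
        self.text = text

    @cached_property
    def numbers(self):
        return parse(self.text)

    @cached_property
    def total(self):
        return total(self.numbers)

    @cached_property
    def report(self):
        return report(self.total)
```

Asking for Pipeline("1 2 3").report is enough to run the whole chain; nothing outside the class has to know the order of the steps.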

Here’s two approaches to designing an API to list the files in a file system. You can make the server remember your current directory, and you can ask the API to go up one directory, or to go down a subdirectory. Or you can remember the current directory yourself, and when you ask the API you have to give both your current directory and where you want to go. The second approach feels more “declarative” to me than the first one.
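
Here’s a small sketch of the two designs, assuming a toy in-memory file system. TREE, StatefulApi, and list_files are names I made up for the example.

```python
# Hypothetical in-memory "file system": directory path -> entries.
TREE = {
    "/": ["docs", "src"],
    "/docs": ["notes.txt"],
    "/src": ["main.py"],
}

class StatefulApi:
    """First approach: the server remembers the current directory."""

    def __init__(self):
        self.cwd = "/"

    def go_down(self, name):
        self.cwd = self.cwd.rstrip("/") + "/" + name
        return TREE[self.cwd]

    def go_up(self):
        self.cwd = self.cwd.rsplit("/", 1)[0] or "/"
        return TREE[self.cwd]

def list_files(cwd, target):
    """Second approach: the caller sends its directory with every request."""
    if target == "..":
        path = cwd.rsplit("/", 1)[0] or "/"
    else:
        path = cwd.rstrip("/") + "/" + target
    # Return the new path too, so the caller can remember it.
    return path, TREE[path]
```

With list_files, the same request always gives the same answer, no matter what was asked before it.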

Here’s two approaches to data fetching. You can fetch data in the parent component, and pass it down to child components through props, context, or a global store (like Redux or Zustand). Or you can fetch data in the child component itself, perhaps with a data fetching library (like TanStack Query or SWR), and trust the library not to repeat the network requests. Again, the second approach feels more “declarative” to me than the first one.

What makes the second approaches more “declarative” doesn’t have to do with how they’re implemented. It doesn’t matter which language or library you’re using; the second approaches are still more “declarative”. That’s why I’ve been writing “declarative” in quotes: it’s different from how others might use the word. I’ll drop the quotes from now on, with the understanding that declarative refers to approach, not implementation.

I’m not sure exactly how to define what declarative means for me. I think it has to do with two principles: laziness and statelessness.

Laziness

Let’s say I have a class that stores a list and can compute its sum:

class List:
    items = []

    def set_items(self, items):
        self.items = items

    def get_sum(self):
        return sum(self.items)

This is great. For the sake of example, let’s say sum was a more expensive function, and we measured and found out that calling get_sum was too expensive. I could change it to:

class List:
    items = []
    sum = 0

    def set_items(self, items):
        self.items = items
        self.sum = sum(self.items)

    def get_sum(self):
        return self.sum

Let’s say we have to compute the square of the sum of the list as well (which, for sake of example, also happens to be expensive). We can write something like:

class List:
    items = []
    sum = 0
    square = 0

    def set_items(self, items):
        self.items = items
        self.sum = sum(self.items)
        self.square = self.sum ** 2

    def get_sum(self):
        return self.sum

    def get_square(self):
        return self.square

I’d call this a push approach to computing the sum and squared sum. When we update the list, we have to remember to push the changes to the computed values. The responsibility for keeping sum updated rests with set_items, and not get_sum.

There’s some disadvantages with a push approach:

  • If we never call get_sum, then the work we did computing sum was useless. This becomes a problem if we don’t always expect to call get_sum.
  • Whenever we have something that depends on items, we need to compute that value whenever we change items. This becomes a problem the more things depend on items. We’ve seen this when we added square.
  • The knowledge for how get_sum works lives in set_items, and not in get_sum, where it’d be more natural to be. This becomes a problem when future maintainers are trying to understand how get_sum works: they have to read set_items as well.

The opposite of a push approach would be a pull approach:

class List:
    items = []
    summed_items = []
    sum = 0
    squared_sum = 0
    square = 0

    def set_items(self, items):
        self.items = items

    def get_sum(self):
        if self.items != self.summed_items:
            self.summed_items = self.items[:]
            self.sum = sum(self.summed_items)
        return self.sum

    def get_square(self):
        if self.get_sum() != self.squared_sum:
            self.squared_sum = self.get_sum()
            self.square = self.squared_sum ** 2
        return self.square

Now get_sum is memoized. This addresses the three disadvantages above, at the cost of using more memory (by keeping summed_items) and being a bit more complicated (it has to check that sum is updated).

Here’s a different way to do a pull approach:

class Versioned:
    version = 0
    value = None


class List:
    items = Versioned()
    sum = Versioned()
    square = Versioned()

    def set_items(self, items):
        self.items.version += 1
        self.items.value = items

    def get_sum(self):
        if self.sum.version < self.items.version:
            self.sum.version = self.items.version
            self.sum.value = sum(self.items.value)
        return self.sum.value

    def get_square(self):
        if self.square.version < self.items.version:
            self.square.version = self.items.version
            self.square.value = self.get_sum() ** 2
        return self.square.value

Here, we use versions to keep sum updated. Each value has a version, and that’s how we can tell whether something’s outdated.

There’s some repeated logic in both cases. That’s something that a declarative language or a functional library can help with.
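
As a rough sketch of what that help can look like, here’s the pull approach again using functools.lru_cache from Python’s standard library. The helper names cached_sum and cached_square are mine, and storing items as a tuple is an assumption I’m making because lru_cache needs hashable arguments.

```python
from functools import lru_cache

@lru_cache()
def cached_sum(items):
    # Only runs when this exact tuple of items hasn't been seen recently.
    return sum(items)

@lru_cache()
def cached_square(total):
    return total ** 2

class List:
    def __init__(self):
        self.items = ()

    def set_items(self, items):
        # Store a tuple so the items can be used as a cache key.
        self.items = tuple(items)

    def get_sum(self):
        return cached_sum(self.items)

    def get_square(self):
        return cached_square(self.get_sum())
```

The check for whether the cached value is still valid now lives in the library, not in get_sum or get_square. The cache is shared by every List, which is fine here because the result depends only on the items.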

To me, making data more pull than push feels like an important part of writing declarative programs. As much as possible, a piece of code should not be responsible for what other code does. The word I think most closely describing this is laziness.

Statelessness

Our List class carries three things. It carries:

  • state, which is data (items);
  • computations, which are ways to derive state (get_sum, get_square); and
  • effects, which are ways to change state (set_items), or interact with the world.

A class couples state, effects, and computations together. I think of something that carries state or depends on an effect as stateful. Something that only does computations or produces effects is stateless. Part of declarative programming is making things stateless when possible. We can extract get_sum and get_square from the List class into a function:

def sum_square(items):
    summed = sum(items)
    squared = summed ** 2
    return (summed, squared)

This is a function that doesn’t have any state of its own. Let’s compare this to:

calls = 0

def discount_sum(items):
    global calls
    calls += 1
    summed = sum(items)
    # Every 10th call gets a discount!
    if calls % 10 == 0:
        summed = 0.9 * summed
    return summed

This function is not stateless, because it depends on calls. While it also depends on items, that’s not part of the function’s state; it’s someone else’s state that gets passed to the function.

Stateless functions are nice because their dependencies are explicit. They depend only on their inputs to produce their output and do whatever other effects. Because the dependencies are explicit, we can memoize a stateless function by comparing inputs. If the inputs to sum_square are the same, then the output will be the same too.

With a stateful function, we don’t have this guarantee. I can call discount_sum with the same inputs, at two different times, and get two different outputs, because calls is different. To memoize a stateful function, I also have to keep track of its state, and compare both inputs and state.
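
Here’s a small sketch of the difference. The memoize helper is hypothetical and deliberately naive: it compares inputs only. That’s enough for sum_square, but it quietly breaks the counter-based discount_sum.

```python
def memoize(func):
    """Cache results keyed only on the (hashable) positional arguments."""
    cache = {}

    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper

def sum_square(items):
    summed = sum(items)
    return (summed, summed ** 2)

calls = 0

def discount_sum(items):
    global calls
    calls += 1
    summed = sum(items)
    # Every 10th call gets a discount!
    if calls % 10 == 0:
        summed = 0.9 * summed
    return summed

fast_sum_square = memoize(sum_square)
fast_discount_sum = memoize(discount_sum)

items = (1, 2, 3)

# Without memoization, the tenth call is discounted.
plain = [discount_sum(items) for _ in range(10)]

# With input-only memoization, discount_sum runs once and the
# cached result is reused, so the discount never shows up.
cached = [fast_discount_sum(items) for _ in range(10)]
```

The memoized sum_square behaves exactly like the original. The memoized discount_sum doesn’t, because calls isn’t part of the cache key.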

Here’s a more subtle example of a stateful function:

from datetime import datetime

def discount_sum(items):
    summed = sum(items)
    # Holiday discount!
    if datetime.now().month == 12:
        summed = 0.9 * summed
    return summed

While it may not look like it depends on state, it depends on an effect, datetime.now. Depending on the result of an effect is still depending on state that’s not passed into the function. Again, if called with the same input, at different times, it can give different results.

We can make discount_sum stateless by making the dependency part of the inputs:

def discount_sum(date, items):
    summed = sum(items)
    # Holiday discount!
    if date.month == 12:
        summed = 0.9 * summed
    return summed
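
As a usage sketch, the caller now decides where the date comes from. The item values and the fixed dates below are made up for the example.

```python
from datetime import datetime

def discount_sum(date, items):
    summed = sum(items)
    # Holiday discount!
    if date.month == 12:
        summed = 0.9 * summed
    return summed

# The effect (reading the clock) happens once, at the edge of the program.
print(discount_sum(datetime.now(), [10, 20, 30]))

# With fixed dates, the same inputs always give the same output.
july = discount_sum(datetime(2024, 7, 1), [10, 20, 30])
december = discount_sum(datetime(2024, 12, 25), [10, 20, 30])
```

Here july is the full sum, and december is the discounted one, whenever the code happens to run.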

I’m making a subtle distinction between a stateless function and a pure function. A pure function is a stateless function that doesn’t have any effects. Here’s a function that’s stateless but not pure:

def print_sum(items):
    summed = sum(items)
    print(summed)
    return summed

Pure functions are nice for all the reasons that stateless functions are nice. There’s other nice things about pure functions that we don’t get with stateless ones, but in my mind, not that much more.
