ICS 33 Fall 2024
Exercise Set 4 Solutions
Problem 1
One straightforward implementation that uses the techniques we learned in lecture would look like this.
def all_substrings(original):
    return _SubstringIterator(original)


class _SubstringIterator:
    def __init__(self, original):
        self._original = original
        self._length = 1
        self._start = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._start + self._length > len(self._original):
            self._start = 0
            self._length += 1

        if self._length > len(self._original):
            raise StopIteration

        next_substring = self._original[self._start : (self._start + self._length)]
        self._start += 1
        return next_substring
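The same sequence can also be sketched as a generator function, a technique covered in Problem 2. This is a hedged alternative, not part of the original solution; the name all_substrings_gen is ours, and it assumes the same ordering as the class above: shorter substrings first, left to right within each length.

```python
def all_substrings_gen(original):
    # A generator-based sketch producing the same order as the
    # _SubstringIterator class above: all substrings of length 1,
    # then length 2, and so on.
    for length in range(1, len(original) + 1):
        for start in range(len(original) - length + 1):
            yield original[start : start + length]

# Iterating over 'abc' produces: 'a', 'b', 'c', 'ab', 'bc', 'abc'
print(list(all_substrings_gen('abc')))
```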
Problem 2
generate_range
When written as a generator function, ranges end up being pretty straightforward indeed. There is about as much code here to shuffle the parameters' values around as there is to solve the problem afterward.
def generate_range(start, stop = None, step = None, /):
    if stop is None:
        stop = start
        start = 0

    if step is None:
        step = 1

    if step == 0:
        raise ValueError('step cannot be zero')

    value = start

    while True:
        if step < 0 and value <= stop:
            break
        elif step > 0 and value >= stop:
            break

        yield value
        value += step
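One quick sanity check is that this should agree with the built-in range for the same argument patterns. A sketch of that check, repeating the definition above so it can run on its own:

```python
def generate_range(start, stop = None, step = None, /):
    # Same definition as above, repeated so this check is self-contained.
    if stop is None:
        stop = start
        start = 0
    if step is None:
        step = 1
    if step == 0:
        raise ValueError('step cannot be zero')
    value = start
    while True:
        if step < 0 and value <= stop:
            break
        elif step > 0 and value >= stop:
            break
        yield value
        value += step

# Spot-check a few argument patterns against the built-in range.
for args in [(5,), (2, 9), (0, 10, 3), (10, 0, -2)]:
    assert list(generate_range(*args)) == list(range(*args))
```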
no_fizz_without_buzz
The problem calls for an infinite sequence, since there is an infinite collection of integers that meet the requirements. Consequently, we'd need to write an infinite generator function — one that never returns — to solve it.
def no_fizz_without_buzz(start):
    value = start

    while True:
        div_by_three = value % 3 == 0
        div_by_five = value % 5 == 0

        if div_by_three == div_by_five:
            yield value

        value += 1
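Because the generator is infinite, itertools.islice is a convenient way to look at a finite prefix of its output. A sketch, repeating the definition so it runs standalone:

```python
import itertools

def no_fizz_without_buzz(start):
    # Same definition as above, repeated so this demonstration is self-contained.
    value = start
    while True:
        div_by_three = value % 3 == 0
        div_by_five = value % 5 == 0
        if div_by_three == div_by_five:
            yield value
        value += 1

# Take the first nine values starting from 1.  Multiples of exactly one of
# 3 and 5 (such as 3, 5, 6, 9, 10, and 12) are skipped.
print(list(itertools.islice(no_fizz_without_buzz(1), 9)))
# → [1, 2, 4, 7, 8, 11, 13, 14, 15]
```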
It's worth noting that this problem's name comes from a famously described (if not frequently asked) problem in technical interviews, used as a way to gauge whether a candidate possesses the most basic of programming skills. In the usual wording, the problem boils down to taking a number and printing either Fizz, Buzz, FizzBuzz, or nothing, depending on whether the number is divisible by 3, divisible by 5, divisible by both 3 and 5, or divisible by neither of them. Anecdotal evidence suggests that many candidates, even those claiming prior programming experience, are unable to solve it.
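For comparison, the usual interview formulation might be sketched like this. The function name and the choice to return a string rather than print one are ours, not part of any standard statement of the problem.

```python
def classic_fizzbuzz(n):
    # Returns what would be printed for n in the classic problem:
    # 'Fizz' for multiples of 3, 'Buzz' for multiples of 5,
    # 'FizzBuzz' for multiples of both, and '' otherwise.
    result = ''
    if n % 3 == 0:
        result += 'Fizz'
    if n % 5 == 0:
        result += 'Buzz'
    return result

print(classic_fizzbuzz(9))    # Fizz
print(classic_fizzbuzz(10))   # Buzz
print(classic_fizzbuzz(15))   # FizzBuzz
```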
cartesian_product
This problem technically allows a large number of arguments, but the practical reality of what we're building here makes a large number of arguments infeasible. For example, if we passed 1,000 two-element tuples as arguments to cartesian_product, we'd wind up with 2^1,000 elements in our result, which is a number ridiculous enough that we can reasonably conclude that the number of arguments will need to be small.
Given that, a recursive approach is probably the most straightforward one here; if we recurse on the arguments, we don't have a practical concern about stack overflow before we've recursed through all of them. So, we could rely on the rough idea that the Cartesian product of n iterables consists of every element of the first iterable combined with every tuple in the Cartesian product of the other n - 1 of them, and fashion our solution around that. There are multiple ways to express it, but this is certainly one way to do it.
def cartesian_product(*iterables):
    def cartesian_product_from(values, index):
        if index >= len(iterables):
            yield values
        else:
            for value in iterables[index]:
                yield from cartesian_product_from((*values, value), index + 1)

    if len(iterables) > 0:
        yield from cartesian_product_from(tuple(), 0)
An iterative approach is likely more difficult to write, even if it feels easier to get started on one. If we start by thinking about there being exactly two iterables, we might conclude that we could simply nest a couple of loops and be done with it. In this case, though, since we don't know how many iterables we have, loop-nesting is not an option — we can't know ahead of time how deeply to nest the loops! — so our best bet would be handling the iteration by hand, though the mechanics of that are a bit tricky, for at least a couple of reasons.
- Calling next on an iterator advances it, so we can't simply ask every iterator "What's your current element?" at every step; we'll need to handle that ourselves.

All of these things can be done, but they would be difficult to do correctly.
Problem 3
Iterators are a superset of generators — or, thought differently, generators are a handy shorthand for writing iterators that can be expressed as functions that return a sequence of outputs. One way to demonstrate this is to show how a generator function can be turned into an iterator class. Let's consider the solution to no_fizz_without_buzz above.
def no_fizz_without_buzz(start):
    value = start

    while True:
        div_by_three = value % 3 == 0
        div_by_five = value % 5 == 0

        if div_by_three == div_by_five:
            yield value

        value += 1
To convert this to an iterator class, we'd need a few things.

- An __init__ method that initializes our iterator, but that doesn't do any of the work in no_fizz_without_buzz's body.
- An __iter__ method that returns self, consistent with how iterators generally work in Python (i.e., iterators are also iterable).
- Since local variables wouldn't survive between calls to a __next__ method, we'd need to store them in attributes of the iterator object instead, so they'd outlive each call to __next__.
- We'd need to restructure any loops containing a yield statement, because an iterator would not be able to execute them in their entirety; we'd have to be able to "pick up where we left off," while also continuing to loop when a yield statement isn't reached.

Combining these ideas together, we might end up with a class that's something like this.
class NoFizzWithoutBuzz:
    def __init__(self, start):
        self._is_first = True
        self._start = start
        self._value = None
        self._div_by_three = None
        self._div_by_five = None

    def __iter__(self):
        return self

    def __next__(self):
        if self._is_first:
            # On the first call, start from the top of the generator function.
            self._is_first = False
            self._value = self._start
        else:
            # On subsequent calls, pick up where we left off, finishing off the
            # previous loop iteration.
            self._value += 1

        while True:
            self._div_by_three = self._value % 3 == 0
            self._div_by_five = self._value % 5 == 0

            if self._div_by_three == self._div_by_five:
                return self._value

            # We'll only get here if we skip a value (i.e., it doesn't belong in the
            # sequence), but that's fine; we'd still want to run another loop
            # iteration in that case.
            self._value += 1
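A quick way to convince ourselves the translation is faithful is to compare a prefix of both sequences. This self-contained sketch repeats both definitions in a compacted form (the comments and the separate div_by attributes are dropped here purely for brevity).

```python
import itertools

def no_fizz_without_buzz(start):
    # The generator version from Problem 2, compacted.
    value = start
    while True:
        if (value % 3 == 0) == (value % 5 == 0):
            yield value
        value += 1

class NoFizzWithoutBuzz:
    # The iterator class above, compacted for comparison.
    def __init__(self, start):
        self._is_first = True
        self._start = start
        self._value = None

    def __iter__(self):
        return self

    def __next__(self):
        if self._is_first:
            self._is_first = False
            self._value = self._start
        else:
            self._value += 1
        while True:
            if (self._value % 3 == 0) == (self._value % 5 == 0):
                return self._value
            self._value += 1

# The two should agree on any prefix of the (infinite) sequence.
gen_prefix = list(itertools.islice(no_fizz_without_buzz(1), 20))
cls_prefix = list(itertools.islice(NoFizzWithoutBuzz(1), 20))
assert gen_prefix == cls_prefix
```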
There are complexities that we haven't considered here, two of which are worth noting.

- There can be multiple yield statements in the same generator function, which means we'd need to remember where we came from, so that a subsequent call to __next__ would know where to work from. This might best be generalized by changing the boolean value we're storing in self._is_first into something else, like an integer value that chooses one of the "places we can start from" in the generator function.
- A generator function can contain a yield from statement, which we'd probably turn into the manual creation of an iterator using iter and manual iteration using next on it.

Still, this certainly gives us an idea of how we might translate a generator function into an iterator class, which gives us some insight about how similar these two techniques are, and about why generator functions can be so useful: Iterators are open-ended enough to do things that generator functions can't, but generator functions handle the problems for which they're suited with dramatically less ceremony.
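To make that last point concrete, here's a hedged sketch of how a yield from might be written out by hand with iter and next; the function name is ours, and an iterator class would additionally have to store the intermediate iterator in an attribute rather than a local variable.

```python
def yield_from_by_hand(iterable):
    # Roughly equivalent to a function whose body is 'yield from iterable',
    # written with explicit iter and next calls, the way an iterator class
    # would have to manage it.
    it = iter(iterable)
    while True:
        try:
            value = next(it)
        except StopIteration:
            break
        yield value

print(list(yield_from_by_hand([1, 2, 3])))   # [1, 2, 3]
```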
Problem 4
Whether it's something simple like extracting a character from a string, or something more complex like accessing the value associated with a dictionary key or slicing a list, when we use indexing on a Python data structure, there are three things we expect to be true.
It's this last expectation that would cause a problem when we index or slice an iterator. We might reasonably expect behavior like this.
>>> numbers = [1, 3, 5, 7, 9, 11]
>>> i = iter(numbers)
>>> i[4]
9 # This doesn't work in Python, but let's imagine it did.
>>> i[3]
7 # If it did, wouldn't we expect this afterward?
The problem, though, is the way that iterators work in Python. They're meant for one-time use; once we've advanced them forward, there's no going back. You could imagine list iterators working differently, but iterators don't only allow us to iterate lists. We might instead be iterating things like the results yielded from generator functions, which couldn't be "rewound" later without the iterator being required to store all of them.
Rather than require that, iterators are kept simple: You can use them to iterate a sequence of values once. If the sequence is still around (e.g., if it's in a list already), you can create a new iterator and perform another iteration, but the original iterator is spent once it's been through the entire sequence. That's incompatible with reasonable assumptions about how indexing works in Python, so it's best that indexing be illegal on iterators.
With a function that looks quite different — itertools.islice(x, 2, 5) instead of x[2:5] — we have a visual cue that what we're doing is different, so the fact that it's built on a different set of expectations is not such a problem.
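To see that difference in expectations in action, using the numbers list from the example above:

```python
import itertools

numbers = [1, 3, 5, 7, 9, 11]
i = iter(numbers)

# islice consumes the iterator as it goes, unlike indexing a list.
print(list(itertools.islice(i, 2, 5)))   # [5, 7, 9]

# The iterator has been advanced past those elements; a second slice
# picks up where the first left off rather than rewinding.
print(list(itertools.islice(i, 1)))      # [11]
```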