ICS 33 Fall 2024
Exercise Set 4 Solutions


Problem 1

One straightforward implementation that uses the techniques we learned in lecture would look like this.


def all_substrings(original):
    return _SubstringIterator(original)


class _SubstringIterator:
    def __init__(self, original):
        self._original = original
        self._length = 1
        self._start = 0


    def __iter__(self):
        return self


    def __next__(self):
        if self._start + self._length > len(self._original):
            self._start = 0
            self._length += 1

        if self._length > len(self._original):
            raise StopIteration

        next_substring = self._original[self._start : (self._start + self._length)]
        self._start += 1
        return next_substring


Problem 2

generate_range

When written as a generator function, ranges end up being pretty straightforward indeed. There is about as much code here to shuffle the parameters' values around than to solve the problem afterward.


def generate_range(start, stop = None, step = None, /):
    if stop is None:
        stop = start
        start = 0

    if step is None:
        step = 1

    if step == 0:
        raise ValueError('step cannot be zero')

    value = start

    while True:
        if step < 0 and value <= stop:
            break
        elif step > 0 and value >= stop:
            break

        yield value
        value += step

no_fizz_without_buzz

The problem calls for an infinite sequence, since there is an infinite collection of integers that meet the requirements. Consequently, we'd need to write an infinite generator function — one that never returns — to solve it.


def no_fizz_without_buzz(start):
    value = start

    while True:
        div_by_three = value % 3 == 0
        div_by_five = value % 5 == 0

        if div_by_three == div_by_five:
            yield value

        value += 1

It's worth noting that this problem's name comes from a famously described — if not frequently asked — problem in technical interviews, as a way to gauge whether a candidate possesses the most basic of programming skills. In the usual writing, the problem boils down to taking a number and printing either Fizz, Buzz, FizzBuzz, or nothing, depending on whether the number is divisible by 3, divisible by 5, divisible by both 3 and 5, or divisible by neither of them. Anecdotal evidence suggests that many candidates, even those claiming prior programming experience, are unable to solve it.

cartesian_product

This problem technically allows a large number of arguments, but the practical reality of what we're building here reasonably guarantees that a large number of arguments is impractical. For example, if we passed 1,000 two-element tuples as arguments to cartesian_product, we'd wind up with 21,000 elements in our result, which is a number ridiculous enough that we can reasonably conclude that the number of arguments will need to be small.

Given that, a recursive approach is probably the most straightforward one here; if we recurse on the arguments, we don't have a practical concern about stack overflow before we've recursed through all of them. So, we could rely on the rough idea that the Cartesian product of n iterables consists of every element of the first iterable combined with every tuple in the Cartesian product of the other n - 1 of them, and fashion our solution around that. There are multiple ways to express it, but this is certainly one way to do it.


def cartesian_product(*iterables):
    def cartesian_product_from(values, index):
        if index >= len(iterables):
            yield values
        else:
            for value in iterables[index]:
                yield from cartesian_product_from((*values, value), index + 1)

    if len(iterables) > 0:
        yield from cartesian_product_from(tuple(), 0)

An iterative approach is likely more difficult to write, even if it feels easier to get started on one. If we start by thinking about there being exactly two iterables, then we might conclude that we could simply nest a couple of loops and be done with it. In this case, though, since we don't know how many iterables we are, loop-nesting is not an option for us — we can't know ahead of time how deeply to nest the loops! — so our best bet would be handling the iteration by hand, though the mechanics of that are a bit tricky, for at least a couple of reasons.

All of these things can be done, but would be difficult to do correctly.


Problem 3

Iterators are a superset of generators — or, thought differently, generators are a handy shorthand for writing iterators that can be expressed as functions that return a sequence of outputs. One way to demonstrate this is to show how a generator function can be turned into an iterator class. Let's consider the solution to no_fizz_without_buzz above.


def no_fizz_without_buzz(start):
    value = start

    while True:
        div_by_three = value % 3 == 0
        div_by_five = value % 5 == 0

        if div_by_three == div_by_five:
            yield value

        value += 1

To convert this to an iterator class, we'd need a few things.

Combining these ideas together, we might end up with a class that's something like this.


class NoFizzWithoutBuzz:
    def __init__(self, start):
        self._is_first = True
        self._start = start
        self._value = None
        self._div_by_three = None
        self._div_by_five = None


    def __iter__(self):
        return self


    def __next__(self):
        if self._is_first:
            # On the first call, start from the top of the generator function.
            self._is_first = False

            self._value = self._start
        else:
            # On subsequent calls, pick up where we left off, finishing off the
            # previous loop iteration.
            self._value += 1

        while True:
            self._div_by_three = self._value % 3 == 0
            self._div_by_five = self._value % 5 == 0
    
            if self._div_by_three == self._div_by_five:
                return self._value

            # We'll only get here if we skip a value (i.e., it doesn't belong in the
            # sequence), but that's fine; we'd still want to run another loop
            # iteration in that case.
            self._value += 1

There are complexities that we haven't considered here, two of which are worth noting.

Still, this certainly gives us an idea of how we might translate a generator function into an iterator class, which gives us some insight about how similar these two techniques are, but why generator functions can be so useful: Iterators are open-ended enough to do things that generator functions can't, but generator functions handle the problems for which they're suited with dramatically less ceremony.


Problem 4

Whether it's something simple like extracting a character from a string, or something more complex like accessing the value associated with a dictionary key or slicing a list, when we use indexing on a Python data structure, there are three things we expect to be true.

It's this last expectation that would cause a problem when we index or slice an iterator. We might reasonably expect behavior like this.


>>> numbers = [1, 3, 5, 7, 9, 11]
>>> i = iter(numbers)
>>> i[4]
    9           # This doesn't work in Python, but let's imagine it did.
>>> i[3]
    7           # If it did, wouldn't we expect this afterward?

The problem, though, is the way that iterators work in Python. They're meant for one-time use iteration; once we've advanced them forward, there's no going back. You could imagine list iterators working differently, but iterators don't only allow us to iterate lists. We might instead be iterating things like the results yielded from generator functions, which couldn't be "rewound" later without the iterator being required to store all of them.

Rather than require that, iterators are kept simple: You can use them to iterate a sequence of values once. If the sequence is still around (e.g., if it's in a list already), you can create a new iterator and perform another iteration, but the original iterator is spent once it's been through the entire sequence. That's incompatible with reasonable assumptions about how indexing works in Python, so it's best that indexing be illegal on iterators.

With a different function that looks quite different — itertools.islice(x, 2, 5) instead of x[2:5] — we have a visual cue that what we're doing is different, so the fact that it is built on a different set of expectations is not such a problem.