ICS 33 Fall 2024
Exercise Set 4 Solutions
Problem 1
One straightforward implementation that uses the techniques we learned in lecture would look like this.
def all_substrings(original):
    return _SubstringIterator(original)


class _SubstringIterator:
    def __init__(self, original):
        self._original = original
        self._length = 1
        self._start = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._start + self._length > len(self._original):
            self._start = 0
            self._length += 1

        if self._length > len(self._original):
            raise StopIteration

        next_substring = self._original[self._start : (self._start + self._length)]
        self._start += 1
        return next_substring
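The same sequence can also be sketched as a generator function, a technique covered in Problem 2. This is a hedged alternative, not part of the original solution; the name all_substrings_gen is ours, and it assumes the same ordering as the class above: shorter substrings first, left to right within each length.

```python
def all_substrings_gen(original):
    # A generator-based sketch producing the same order as the
    # _SubstringIterator class above: all substrings of length 1,
    # then length 2, and so on.
    for length in range(1, len(original) + 1):
        for start in range(len(original) - length + 1):
            yield original[start : start + length]

# Iterating over 'abc' produces: 'a', 'b', 'c', 'ab', 'bc', 'abc'
print(list(all_substrings_gen('abc')))
```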
Problem 2
generate_range
When written as a generator function, ranges end up being pretty straightforward indeed. There is about as much code here to shuffle the parameters' values around as there is to solve the problem afterward.
def generate_range(start, stop = None, step = None, /):
    if stop is None:
        stop = start
        start = 0

    if step is None:
        step = 1

    if step == 0:
        raise ValueError('step cannot be zero')

    value = start

    while True:
        if step < 0 and value <= stop:
            break
        elif step > 0 and value >= stop:
            break

        yield value
        value += step
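One quick sanity check is that this should agree with the built-in range for the same argument patterns. A sketch of that check, repeating the definition above so it can run on its own:

```python
def generate_range(start, stop = None, step = None, /):
    # Same definition as above, repeated so this check is self-contained.
    if stop is None:
        stop = start
        start = 0
    if step is None:
        step = 1
    if step == 0:
        raise ValueError('step cannot be zero')
    value = start
    while True:
        if step < 0 and value <= stop:
            break
        elif step > 0 and value >= stop:
            break
        yield value
        value += step

# Spot-check a few argument patterns against the built-in range.
for args in [(5,), (2, 9), (0, 10, 3), (10, 0, -2)]:
    assert list(generate_range(*args)) == list(range(*args))
```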
no_fizz_without_buzz
The problem calls for an infinite sequence, since there is an infinite collection of integers that meet the requirements. Consequently, we'd need to write an infinite generator function — one that never returns — to solve it.
def no_fizz_without_buzz(start):
    value = start

    while True:
        div_by_three = value % 3 == 0
        div_by_five = value % 5 == 0

        if div_by_three == div_by_five:
            yield value

        value += 1
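Because the generator is infinite, itertools.islice is a convenient way to look at a finite prefix of its output. A sketch, repeating the definition so it runs standalone:

```python
import itertools

def no_fizz_without_buzz(start):
    # Same definition as above, repeated so this demonstration is self-contained.
    value = start
    while True:
        div_by_three = value % 3 == 0
        div_by_five = value % 5 == 0
        if div_by_three == div_by_five:
            yield value
        value += 1

# Take the first nine values starting from 1.  Multiples of exactly one of
# 3 and 5 (such as 3, 5, 6, 9, 10, and 12) are skipped.
print(list(itertools.islice(no_fizz_without_buzz(1), 9)))
# → [1, 2, 4, 7, 8, 11, 13, 14, 15]
```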
It's worth noting that this problem's name comes from a famously described (if not frequently asked) problem in technical interviews, used as a way to gauge whether a candidate possesses the most basic of programming skills. In the usual wording, the problem boils down to taking a number and printing either Fizz, Buzz, FizzBuzz, or nothing, depending on whether the number is divisible by 3, divisible by 5, divisible by both 3 and 5, or divisible by neither of them. Anecdotal evidence suggests that many candidates, even those claiming prior programming experience, are unable to solve it.
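For comparison, the usual interview formulation might be sketched like this. The function name and the choice to return a string rather than print one are ours, not part of any standard statement of the problem.

```python
def classic_fizzbuzz(n):
    # Returns what would be printed for n in the classic problem:
    # 'Fizz' for multiples of 3, 'Buzz' for multiples of 5,
    # 'FizzBuzz' for multiples of both, and '' otherwise.
    result = ''
    if n % 3 == 0:
        result += 'Fizz'
    if n % 5 == 0:
        result += 'Buzz'
    return result

print(classic_fizzbuzz(9))    # Fizz
print(classic_fizzbuzz(10))   # Buzz
print(classic_fizzbuzz(15))   # FizzBuzz
```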
cartesian_product
This problem technically allows a large number of arguments, but the practical reality of what we're building here makes a large number of arguments infeasible. For example, if we passed 1,000 two-element tuples as arguments to cartesian_product, we'd wind up with 2^1,000 elements in our result, which is a number ridiculous enough that we can reasonably conclude that the number of arguments will need to be small.
Given that, a recursive approach is probably the most straightforward one here; if we recurse on the arguments, we don't have a practical concern about stack overflow before we've recursed through all of them. So, we could rely on the rough idea that the Cartesian product of n iterables consists of every element of the first iterable combined with every tuple in the Cartesian product of the other n - 1 of them, and fashion our solution around that. There are multiple ways to express it, but this is certainly one way to do it.
def cartesian_product(*iterables):
    def cartesian_product_from(values, index):
        if index >= len(iterables):
            yield values
        else:
            for value in iterables[index]:
                yield from cartesian_product_from((*values, value), index + 1)

    if len(iterables) > 0:
        yield from cartesian_product_from(tuple(), 0)
An iterative approach is likely more difficult to write, even if it feels easier to get started on one. If we start by thinking about there being exactly two iterables, we might conclude that we could simply nest a couple of loops and be done with it. In this case, though, since we don't know how many iterables we have, loop-nesting is not an option — we can't know ahead of time how deeply to nest the loops! — so our best bet would be handling the iteration by hand, though the mechanics of that are a bit tricky, for at least a couple of reasons.
- Calling next on an iterator advances it, so we can't simply ask every iterator "What's your current element?" at every step; we'll need to handle that ourselves.

All of these things can be done, but they would be difficult to do correctly.
Problem 3
Iterators are a superset of generators — or, thought differently, generators are a handy shorthand for writing iterators that can be expressed as functions that return a sequence of outputs. One way to demonstrate this is to show how a generator function can be turned into an iterator class. Let's consider the solution to no_fizz_without_buzz above.
def no_fizz_without_buzz(start):
    value = start

    while True:
        div_by_three = value % 3 == 0
        div_by_five = value % 5 == 0

        if div_by_three == div_by_five:
            yield value

        value += 1
To convert this to an iterator class, we'd need a few things.

- An __init__ method that initializes our iterator, but that doesn't do any of the work in no_fizz_without_buzz's body.
- An __iter__ method that returns self, consistent with how iterators generally work in Python (i.e., iterators are also iterable).
- Since local variables wouldn't survive between calls to a __next__ method, we'd need to store them in attributes of the iterator object instead, so they'd outlive each call to __next__.
- We'd need to restructure any loops containing a yield statement, because an iterator would not be able to execute them in their entirety; we'd have to be able to "pick up where we left off," while also continuing to loop when a yield statement isn't reached.

Combining these ideas together, we might end up with a class that's something like this.
class NoFizzWithoutBuzz:
    def __init__(self, start):
        self._is_first = True
        self._start = start
        self._value = None
        self._div_by_three = None
        self._div_by_five = None

    def __iter__(self):
        return self

    def __next__(self):
        if self._is_first:
            # On the first call, start from the top of the generator function.
            self._is_first = False
            self._value = self._start
        else:
            # On subsequent calls, pick up where we left off, finishing off the
            # previous loop iteration.
            self._value += 1

        while True:
            self._div_by_three = self._value % 3 == 0
            self._div_by_five = self._value % 5 == 0

            if self._div_by_three == self._div_by_five:
                return self._value

            # We'll only get here if we skip a value (i.e., it doesn't belong in the
            # sequence), but that's fine; we'd still want to run another loop
            # iteration in that case.
            self._value += 1
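A quick way to convince ourselves the translation is faithful is to compare a prefix of both sequences. This self-contained sketch repeats both definitions in a compacted form (the comments and the separate div_by attributes are dropped here purely for brevity).

```python
import itertools

def no_fizz_without_buzz(start):
    # The generator version from Problem 2, compacted.
    value = start
    while True:
        if (value % 3 == 0) == (value % 5 == 0):
            yield value
        value += 1

class NoFizzWithoutBuzz:
    # The iterator class above, compacted for comparison.
    def __init__(self, start):
        self._is_first = True
        self._start = start
        self._value = None

    def __iter__(self):
        return self

    def __next__(self):
        if self._is_first:
            self._is_first = False
            self._value = self._start
        else:
            self._value += 1
        while True:
            if (self._value % 3 == 0) == (self._value % 5 == 0):
                return self._value
            self._value += 1

# The two should agree on any prefix of the (infinite) sequence.
gen_prefix = list(itertools.islice(no_fizz_without_buzz(1), 20))
cls_prefix = list(itertools.islice(NoFizzWithoutBuzz(1), 20))
assert gen_prefix == cls_prefix
```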
There are complexities that we haven't considered here, two of which are worth noting.

- There can be multiple yield statements in the same generator function, which means we'd need to remember where we came from, so that a subsequent call to __next__ would know where to work from. This might best be generalized by changing the boolean value we're storing in self._is_first into something else, like an integer value that chooses one of the "places we can start from" in the generator function.
- A generator function can contain a yield from statement, which we'd probably turn into the manual creation of an iterator using iter and manual iteration using next on it.

Still, this certainly gives us an idea of how we might translate a generator function into an iterator class, which gives us some insight about how similar these two techniques are, and about why generator functions can be so useful: Iterators are open-ended enough to do things that generator functions can't, but generator functions handle the problems for which they're suited with dramatically less ceremony.
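To make that last point concrete, here's a hedged sketch of how a yield from might be written out by hand with iter and next; the function name is ours, and an iterator class would additionally have to store the intermediate iterator in an attribute rather than a local variable.

```python
def yield_from_by_hand(iterable):
    # Roughly equivalent to a function whose body is 'yield from iterable',
    # written with explicit iter and next calls, the way an iterator class
    # would have to manage it.
    it = iter(iterable)
    while True:
        try:
            value = next(it)
        except StopIteration:
            break
        yield value

print(list(yield_from_by_hand([1, 2, 3])))   # [1, 2, 3]
```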
Problem 4
Whether it's something simple like extracting a character from a string, or something more complex like accessing the value associated with a dictionary key or slicing a list, when we use indexing on a Python data structure, there are three things we expect to be true.
It's this last expectation that would cause a problem when we index or slice an iterator. We might reasonably expect behavior like this.
>>> numbers = [1, 3, 5, 7, 9, 11]
>>> i = iter(numbers)
>>> i[4]
9 # This doesn't work in Python, but let's imagine it did.
>>> i[3]
7 # If it did, wouldn't we expect this afterward?
The problem, though, is the way that iterators work in Python. They're meant for one-time use; once we've advanced them forward, there's no going back. You could imagine list iterators working differently, but iterators don't only allow us to iterate lists. We might instead be iterating things like the results yielded from generator functions, which couldn't be "rewound" later without the iterator being required to store all of them.
Rather than require that, iterators are kept simple: You can use them to iterate a sequence of values once. If the sequence is still around (e.g., if it's in a list already), you can create a new iterator and perform another iteration, but the original iterator is spent once it's been through the entire sequence. That's incompatible with reasonable assumptions about how indexing works in Python, so it's best that indexing be illegal on iterators.
With a function that looks quite different — itertools.islice(x, 2, 5) instead of x[2:5] — we have a visual cue that what we're doing is different, so the fact that it's built on a different set of expectations is not such a problem.
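To see that difference in expectations in action, using the numbers list from the example above:

```python
import itertools

numbers = [1, 3, 5, 7, 9, 11]
i = iter(numbers)

# islice consumes the iterator as it goes, unlike indexing a list.
print(list(itertools.islice(i, 2, 5)))   # [5, 7, 9]

# The iterator has been advanced past those elements; a second slice
# picks up where the first left off rather than rewinding.
print(list(itertools.islice(i, 1)))      # [11]
```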