ICS 33 Fall 2024
Project 1: Calling All Stations
Due date and time: Wednesday, October 16, 11:59pm
Git repository: https://ics.uci.edu/~thornton/ics33/ProjectGuide/Project1/Project1.git
Introduction
When I first started learning to write programs in the 1980s, software systems were, by and large, written to run on a single computer. It would be necessary for a program to communicate with its computer's main memory, and perhaps with peripheral devices like disk drives, but usually not any farther than that. Communication between computers was much less common, with techniques supporting it much less well-known. While computer networks did exist in those days, they were substantially less common and ran at a tiny fraction of the speed of the networks in place today, which meant that even well-heeled people and organizations couldn't use them in the ways we take for granted today; the bandwidth simply wasn't there at any price.
Today, we have ubiquitous high-speed connectivity, allowing distributed systems to be a common part of the real-world software landscape. In a distributed system, programs run on many machines and collaborate to solve a problem. The programs on each machine run separately, which means that they can make simultaneous progress, but this is a double-edged sword.
In other words, in terms of raw work, the effect of running on many computers is multiplicative; the problem is that the raw work of one computer will only be useful if it's coordinated with the raw work being done by the others. Just like when multiple people collaborate on a task, if there's no agreement about who will do what, you can end up with duplicate effort, with two people working on the same task. If there are dependencies between tasks — one task can't be started until another is finished — then coordination will be needed, so that things will be done in the right order, and so that the output from the first can be made available as input to the second. So, an important part of any distributed system is coordinating these kinds of things, which requires careful design before the system runs, as well as computation and communication bandwidth for the coordination while it runs.
In this project, we'll scratch the surface of the ideas underlying collaboration in a distributed system, by building and testing a simulation of how information propagates between the devices that make up such a system. Our simulation will focus on the problem of communicating alerts, which is to say that we're concerned with one device noticing a problem and then notifying other devices about it. We're not concerned about what the alerts mean, though, nor what should be done about them; we're just interested in the communication aspect of the problem.
Along the way, you'll continue to build on your skills from prerequisite coursework by making the appropriate design decisions (e.g., determining which of Python's built-in data structures might best help you solve the various parts of the problem, deciding how best to organize your program into modules, classes, functions, and so on). Additionally, you'll continue to develop your abilities to test your programs in an automated fashion by writing unit tests, employing a few techniques that we've learned more recently to widen the reach of what can be verified by your automated tests.
A simulation of a distributed system
A simulation is a program that executes a simplified model of some reality, usually with the goal of allowing us to explore some aspect of that reality without having to actually experience it. There are lots of reasons why we might want a simulation. We can use one for the purposes of entertainment, like how we might use a flight simulator to allow us to experience the feeling of flight without purchasing an airplane and obtaining a pilot's license; we might instead use one to explore the impact of a potential change to a business process without having to make the change, so we can see ahead of time whether the change is likely to be profitable. (Of course, what we learn from a simulation is only useful if our model matches the reality of what we're simulating — if our model mischaracterizes our customers' actual reaction to the change in our business process, say, then our simulation will mislead us into making poor decisions — but that's a deep topic for other courses.)
In our case, we'd like to explore one aspect of how distributed systems behave without having to obtain multiple machines, write coordinated programs to run on them, and connect them via a computer network. So, rather than setting all of that up, we'll write a program that boils our problem down to its essence, with simulated devices pretending to communicate with each other, according to a set of rules that will be a simplification of reality, but that will allow our program to have a result that we hope will be indicative of reality. (Since we won't ultimately be using that result for anything beyond this project, there's less pressure on us to define our simulation model accurately; it's more important that we have a definitive result, and that we learn something about the nature of how simulations are built and tested.)
The simulation model
Our simulation model is based around a few concepts, which we'll need to understand before we can proceed into the details of how to implement and test our simulation.
So, in general, there's a straightforward effect that we can expect from these rules. When a device raises an alert, it begins propagating among the devices, continuing to propagate — even amongst devices that have already been made aware of it — until all of the devices have been notified of its cancellation. Because communication takes time, there will naturally be a delay between the alert being raised and all devices being aware of it, just as some devices will continue propagating canceled alerts until they've been made aware of the cancellation. (We see the same kinds of effects in any kind of distributed organization, whether it's made up of computers or people; some members will be acting on outdated information at least some of the time.)
As an example, we might have four devices, which we'll call Device 1, Device 2, Device 3, and Device 4, for the purposes of discussion. Let's suppose that the propagation sets are configured as follows.
Now, suppose Device 1 raises an alert at time 0 (i.e., immediately as the simulation begins). What happens next is a cascade of communication amongst devices, which we can simulate by hand to understand how our simulation model is meant to work.
There are a couple of things worth noting about this example.
Tightening up our simulation model
Because we want our program to have a deterministic outcome — given a particular input, we always want the same events to occur at the same times — we'll need to carefully consider what we should do about the problem of things being scheduled to happen at the same time. There are three scenarios to consider.
There's a simple rule we can employ to keep the last of these scenarios from ever being an issue for us, though: When a cancellation arrives at a device at time n, it only affects the propagation of subsequent alerts or cancellations arriving at that device from time n + 1 onwards. So, we'll employ that rule, and otherwise allow multiple events scheduled simultaneously to be processed in any order.
Getting started
Near the top of this project write-up, you'll find a link to a Git repository that provides the starting point for this project. Using the instructions you followed in Project 0 (in the section titled Starting a new project), create a new PyCharm project using that Git repository; you'll do your work within that PyCharm project.
You won't find that there's much provided code: Just a skeletal project1.py
file, an empty test_project1.py
file in which you could write unit tests for the code in project1.py
, one sample input and one sample output file, and a .gitignore
file to help you curate what does (and doesn't) end up committed into your Git repository. Requirements about how you organize your solution will follow a little later in this write-up.
The program
Your program will begin by reading one line of input from the Python shell (e.g., via a call to the built-in input
function), specifying the path to an input file that describes the work that your program will be doing. Do not print a prompt or anything else; read this input before writing any output.
If the specified file doesn't exist, your program will then print one line of output FILE NOT FOUND
and then terminate. Otherwise, your program will have a simulation to run, with everything driven by what's in that input file, so let's take a look at what you can expect to be in it.
The input
Your program's input file consists of lines of text. Each file will fit one of six characteristics, described below.
LENGTH
, followed by a space, followed by a positive integer value. This line establishes the length of the simulation, specified in milliseconds. So, for example, the line LENGTH 600000
would mean that the simulation will run for 600,000 simulation milliseconds, numbered 0 through 599,999, with the simulation ending (and nothing else happening) at time 600,000.DEVICE
, followed by a space, followed by a non-negative integer value. These lines establish the existence of one device in our simulation; the integer is the device ID that uniquely identifies the device.PROPAGATE
, followed by three non-negative integer values, separated by spaces. These lines describe a rule for propagating an alert or cancellation from one device to another. The three integers specify, as follows, three things:
PROPAGATE 1 2 100
indicates that device 1 will propagate any alert or cancellation to device 2, and that device 2 will receive it 100 milliseconds later.ALERT
, followed by three values, separated by spaces.
ALERT 1 OhNo 5000
indicates that device 1 will raise an alert with the text OhNo
at simulation time 5000.CANCEL
, followed by three values, separated by spaces.
CANCEL 1 OhNo 6000
indicates that device 1 will cancel an alert with the text OhNo
at simulation time 6000.#
character, in which case it can be followed by zero or more characters of text with no restrictions. These lines are used as comments in the input file — similar to comments in a Python program — and can be ignored by your program when detected.There is no restriction about the order in which you'll find the lines of the file. Any of those kinds of lines can appear anywhere in the file.
However, you can assume that the input will be formatted according to those rules; we won't be testing your program with input files that don't meet those requirements, so your program can do anything (including crash) if given such an input file.
There are also a few other assumptions you can safely make about the input (i.e., we won't test scenarios where these assumptions aren't true, so we don't care how — or if — your program handles them).
The output
As your simulation progresses, your program will produce output that describes messages as they're sent and received, printing that output to the standard output. There are four interesting events in our simulation, each of which warrants a line of output.
@100: #1 RECEIVED ALERT FROM #2: OhNo
means that, at simulation time 100, device 1 received the alert with description OhNo
from device 2.@50: #2 SENT ALERT TO #1: OhNo
means that, at simulation time 50, device 2 sent the alert OhNo
to device 1.@200: #1 RECEIVED CANCELLATION FROM #2: OhNo
means that, at time 200, device 1 received a cancellation of the alert with description OhNo
from device 2.@150: #2 SENT CANCELLATION TO #1: OhNo
means that, at time 150, device 2 sent a cancellation of the alert with description OhNo
to device 1.@600: END
means that, at time 600, the simulation ended.Your program's output must exactly match this output specification, character-for-character, with a newline on the end of every line (including the last one). Spelling, capitalization, and spacing are all relevant.
When two events are scheduled to occur at the same time, the order in which your program prints the lines of output describing them is irrelevant. Earlier events must be printed before later ones, but simultaneously occurring events can appear in the output in any order.
How the simulation ends
Your simulation will continue running for its scheduled duration, ending at the scheduled time, regardless of whether there are still messages propagating.
If the simulation is scheduled to end at a particular time, nothing else happens within that same simulation millisecond — nor after that time — even if scheduled. For example, if the simulation length is described in the input as LENGTH 600000
, that means there are 600,000 simulation milliseconds, numbered from 0 through 599,999. The only thing that happens at time 600,000 is the ending of the simulation.
Sample input and output
After you create a PyCharm project from the provided Git bundle, you'll find a directory named sample
that contains two files, sample_input.txt
and sample_output.txt
, which are a sample input and output, respectively. The sample_input.txt
file represents the example scenario described in the section titled The simulation model above. If you run your program using sample_input.txt
as the input file, its output should exactly match (on a character-for-character basis) what's in sample_output.txt
.
We will not be providing additional samples; beyond this, testing is your responsibility, as it usually is when you develop programs.
The meaning of time in your simulation
It's worth noting that your goal here is building a simulation, which is defined generally as a way of exploring a possible reality without actually having to experience that reality. What we learn from a simulation might inform our real-world decisions — if, for example, a simulation tells us that an idea is a good one, and we believe our simulation's model reasonably reflects reality, then we might be inclined to follow through with our idea.
In our case, our goal is not to experience the passage of time in our simulation; it's to measure the possible passage of time in the world that we're simulating, while obtaining a result as quickly as possible. For that reason, when an event is scheduled to take place at a certain time in our simulation, that has no bearing on when it happens in real time. A simulation of eight hours of real-world time shouldn't take eight hours to run; it should run as quickly as our program can determine the outcome. So, don't fall into the trap of trying to figure out how to pause your program for a certain number of milliseconds, or schedule functions to be called at a certain time; that's not the goal here.
There's one more thing to consider in your design, as well. Since we're measuring time at the fine grain of one millisecond, but since nothing will happen during most of the milliseconds in our simulation, you'll want to be sure that your simulation "skips ahead" whenever possible. In other words, if the current time is 5000 and there are no events scheduled to occur until time 6000, your simulation should immediately proceed to time 6000, rather than looping 1000 times — each time concluding that nothing needs to happen — in order to get there. This dramatically improves your program's ability to run sparse simulations over long stretches of simulation time (i.e., long simulations in which relatively little happens).
Design, organization, and testing requirements
Building on your prior background in designing and implementing Python programs (e.g., from ICS 32 or ICS H32), we'll leave many of the details of how to design your program unspecified; there are lots of good ways to approach the problem, so rather than require everyone to do exactly the same thing, we'll allow each of you to discover your own solution.
That said, we do have some requirements that need to be met, and that will inform at least some of your design choices. We will not be willing or able to pre-grade your work, and we won't generally be answering quesitons such as "If I make this decision choice, how will that affect my grade?" The requirements are listed below and they — along with what's specified in the section titled Limitations below and the general grading guidelines given on the Project Guide page — are definitive.
Your main module
You must have a Python module named project1.py
, which provides a way to execute your program in whole; executing project1.py
executes the program. Since you expect this module to be executed in this way, it would naturally need to have an if __name__ == '__main__':
statement at the end of it, for reasons described in prior coursework. Note that the provided Git repository will already contain this file (and the if __name__ == '__main__':
statement.
Aside from the requirement that there be a project1.py
file that is your "main" module, you can change any of the provided code in any way you'd like to meet this project's requirements.
Modules other than the main module
This is a project that is large enough that it will benefit from being divided into separate modules, each focusing on one kind of functionality, as opposed to jamming all of it into a single file or, worse yet, a single function. As you no doubt learned in prerequisite coursework, when you divide a large program into smaller pieces that are relatively isolated from one another, your ability to work on a large-sized problem without losing control of its complexity dramatically increases. So, we'll expect to see your program divided into multiple modules, with each one focused on one kind of problem. We don't have specific requirements around exactly where you draw the line, but we do have the requirement that, in general, you "keep separate things separate."
If deciding what functionality belongs in which modules is territory that feels unfamiliar to you, the Modules notes from my ICS H32 course are a good primer on some of the ways you might think about that problem. Pay special attention to how terms like high cohesion and low coupling are defined there.
Working and testing incrementally
As you did in Project 0, you are required to do your work incrementally, to test it incrementally (i.e., as you write new functions, you'll be implementing unit tests for them), and to commit your work periodically into a Git repository, which you will be bundling and submitting to us.
As in Project 0, we don't have a specific requirement about how many commits you make, or how big a "feature" is, but your general goal is to commit when you've reached stable ground — a new feature is working, and you've tested it (ideally with unit tests). We'll expect to see a history of these kinds of incremental commits.
Test coverage
Your goal in this project is to reach the highest percentage of test coverage you can. Our goal here can't as easily be 100% as it was in Project 0 — mainly because there are aspects of what you're doing here that may not be testable with techniques that you've learned to date — but you'll nonetheless need to test whatever can be tested. Challenge yourself to cover everything that you can cover using the techniques that you've learned to date.
To that end, every function that lacks full coverage (i.e., for which at least one line or at least branch is left uncovered) will require a comment above it, briefly explaining why there is missing coverage. These comments won't ever need to be more than a sentence or two, but the objective is to reflect on why you weren't able to write unit tests that covered it. Doing that may actually help you to realize why you actually can test at least some of the code; by splitting a complex function into separate, smaller pieces, you can often take a function for which you know no technique to test it, and isolate the untestable portion of it in its own separate function or module. (You might also find that splitting up complexity makes an untractably large number of tests turn into a much smaller number of them instead. Automated testing is as much a design problem as it is a testing problem, and there's no better way to learn that lesson than to set a high test coverage goal and challenge yourself to meet it.)
We will be neither willing nor able to negotiate individually with you about explicitly what must be tested and what can be safely left untested, as this will depend heavily on the design decisions you're making. However, it's safe to say that there are relatively few things we're doing here for which no testing techniques exist, and we will be aware, while grading your work, of what aspects of this problem are likelier to be untestable (using techniques taught in this course or in prerequisite coursework) than others.
Sanity-checking your output
We are providing a tool that you can use to sanity check whether you've followed the basic requirements above. It will only give you a "passing" result in these circmustances.
project1.py
), spelled and capitalized correctly.It should be noted that there are many additional tests you'll be want to perform, and that there are many additional tests that we'll be using when we grade your project. The way to understand the sanity checker's output is to think of it this way: Just because the sanity checker says your program passes doesn't mean it's close to perfect, but if you cannot get the sanity checker to report that your program passes, it surely will not pass all of our automated tests (and may well fail all of them).
You'll find the sanity checker in your project directory, in a file named project1_sanitycheck.py
. Run that program like you would any other, and it will report a result.
Limitations
You can use the Python standard library where appropriate in this project, and you can certainly depend on the code that we've provided in the Git bundle, but you will otherwise not be able to use code written by anyone else other than you. Notably, this includes third-party libraries (i.e., those that are not part of Python's standard library), which are strictly off-limits in this course except where specifically allowed. Colloquially, if we have to install something other than Python, Git, and PyCharm in order for your program to work, it's considered off-limits, unless specific permission is given in a particular project. This project offers no such permission.
Preparing your submission
When you're ready to submit your work, run the provided prepare_submission.py
script, as you did in Project 0, which will create a Git bundle from the Git repository in your project directory; that Git bundle will be your submission.
Verifying your bundle before submission
If you're feeling unsure of whether your bundle is complete and correct, you can verify it by creating a new PyCharm project from it, as you did in Project 0. (You'll want to create this project in a different directory from your project directory, so it's separate and isolated.) Afterward, you should see the files in their final form, and the Git tab in PyCharm should show your entire commit history. If so, you're in business; go ahead and submit your work.
Deliverables
Submit your project1.bundle
file (and no others) to Canvas. There are a few rules to be aware of.
main
branch, except to the extent that we'll examine prior commits when evaluating your overall process. We will not negotiate about which commit will be graded, or, for example, grade multiple of your commits and "take the highest score."Can I submit after the deadline?
Yes, it is possible, subject to the late work policy for this course, which is described in the section titled Late work at this link. Beyond the late work deadline described there, we will no longer accept submissions.
What do I do if Canvas adjusts my filename?
Canvas will sometimes modify your filenames when you submit them (e.g., by adding a numbering scheme like -1 or a long sequence of hexadecimal digits to its name). In general, this is fine; as long as the file you submitted has the correct name prior to submission, we'll be able to obtain it with that same name, even if Canvas adjusts it.