ICS 32 Winter 2022
Notes and Examples: Protocols
Background
Protocols
When you write a program that will store data in a file and read it back again later, you have to decide on a file format, which specifies, in detail, what the data will look like once it's stored in the file. Sometimes, it's as simple as just storing text, but if you want to do anything with the data other than display it exactly as it's stored, there's a good chance you'll need to consider a way to organize it within the file. For common problems like storing images or videos, there are existing file formats, such as the JPEG format for image files, but you can define your own, too, if a pre-existing format isn't appropriate for your particular use.
When you write programs that communicate with one another using sockets, you have a similar problem. The program on each side of the connection will be sending data to the other. Without an agreement about what that data will look like, the data sent from one program won't make sense to the other one. So, when programs communicate via sockets, you will always need them to agree on a protocol, which specifies what each program will send and what it will expect to receive. As with file formats, there are well-known protocols already defined for specific purposes (like HTTP, which describes how data is transferred over the web, or SMTP, which is used to send email), but you can also define your own protocol if you need something specific for your particular use. What's important is that both programs implement the same protocol, and that each program knows its role in that protocol.
The Polling protocol
In lecture, we wrote a client program that interacted with a Polling server, a program I built that lets users answer multiple-choice questions while tracking how many times each choice has been picked. The server keeps track, as time goes on, of what the questions are and how each user has answered them; other programs can interact with it by connecting to it via a socket and then sending and receiving text in a predefined format called the Polling protocol. The Polling protocol, like other protocols, governs what each program (a Polling client and the Polling server) is required to send and what it can expect to receive in return.
The Polling protocol is what is sometimes known as a request-reply protocol. Lines of text are sent back and forth between the client and server, with each interaction being driven by the client sending a request and the server sending a corresponding reply. Every line is terminated with a newline sequence made up of two characters, a carriage return followed by a line feed, which is written in Python as the string '\r\n'. Without that sequence, the receiving program won't know that the sender has sent a complete line of text.
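To make the line-termination rule concrete, here is a minimal sketch of how a program might send and receive complete lines over a socket. The function names (open_streams, write_line, read_line) are hypothetical and are not necessarily the ones used in the lecture code; the sketch also assumes the text is encoded as UTF-8, which the notes above do not specify.

```python
import socket

NEWLINE = '\r\n'  # carriage return followed by line feed


def open_streams(sock: socket.socket):
    # Wrap a connected socket in a pair of text streams (hypothetical helper).
    # Reading with newline='\r\n' means only that exact sequence ends a line.
    # Writing with newline='' turns off newline translation, so the text we
    # write is sent exactly as-is on every operating system.
    stream_in = sock.makefile('r', encoding='utf-8', newline=NEWLINE)
    stream_out = sock.makefile('w', encoding='utf-8', newline='')
    return stream_in, stream_out


def write_line(stream_out, line: str) -> None:
    # Send one line, adding the newline sequence the receiver is waiting for.
    stream_out.write(line + NEWLINE)
    # Without flush(), the text may sit in a local buffer and never be sent.
    stream_out.flush()


def read_line(stream_in) -> str:
    # Read one complete line and return it without its newline sequence.
    line = stream_in.readline()
    if not line.endswith(NEWLINE):
        # readline() returns a partial (or empty) line when the other side
        # closes the connection before finishing a line.
        raise ConnectionError('connection closed before a full line arrived')
    return line[:-len(NEWLINE)]
```

A caller would connect a socket, pass it to open_streams, and then alternate write_line (a request) with read_line (the reply), mirroring the request-reply pattern described above.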
Using the Polling protocol, the interactions between a Polling client and the Polling server are expected to work as described below.
An example session follows:
Client | Server |
initiates a connection | |
| accepts the connection |
POLLING_HELLO boo | |
| HELLO |
POLLING_QUESTIONS | |
| QUESTION_COUNT 1 |
| QUESTION 1 Who is your favorite Pekingese? |
POLLING_CHOICES 1 | |
| CHOICE_COUNT 1 |
| CHOICE 1 Boo |
POLLING_VOTE 1 1 | |
| VOTED |
POLLING_VOTE 1 1 | |
| ALREADY_VOTED |
POLLING_GOODBYE | |
| GOODBYE |
closes the connection | |
| closes the connection |
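The multi-line replies in the session above (a count line followed by that many numbered lines) could be interpreted along the following lines. This is only a sketch: the line layout is inferred from the example session rather than from a full specification, and the function names are hypothetical.

```python
def parse_count(line: str, keyword: str) -> int:
    # Interpret a line such as 'QUESTION_COUNT 1' and return the count.
    words = line.split()
    if len(words) != 2 or words[0] != keyword:
        raise ValueError(f'expected a {keyword} line, got {line!r}')
    return int(words[1])


def parse_numbered_text(line: str, keyword: str) -> tuple[int, str]:
    # Interpret a line such as 'QUESTION 1 Who is ...?' as (1, 'Who is ...?').
    # Splitting at most twice keeps the spaces inside the text intact.
    parts = line.split(' ', 2)
    if len(parts) != 3 or parts[0] != keyword:
        raise ValueError(f'expected a {keyword} line, got {line!r}')
    return int(parts[1]), parts[2]


def read_numbered_block(lines, count_keyword: str, item_keyword: str) -> list:
    # 'lines' is any iterator that yields reply lines with the newline
    # sequence already removed. The first line says how many items follow.
    count = parse_count(next(lines), count_keyword)
    return [parse_numbered_text(next(lines), item_keyword) for _ in range(count)]


# Replaying the server's reply to POLLING_QUESTIONS from the session above:
replies = iter(['QUESTION_COUNT 1', 'QUESTION 1 Who is your favorite Pekingese?'])
questions = read_numbered_block(replies, 'QUESTION_COUNT', 'QUESTION')
```

Because the count arrives first, the client knows exactly how many more lines to read before sending its next request, so it never waits on a line the server will not send.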
What we wanted to build
The protocol described above is not intended for human use, any more than the HTTP protocol — which governs how web browsers download web pages and other data — is intended for people. A web browser has the knowledge of the HTTP protocol embedded within it; behind the scenes, when you visit a web page, a conversation between your web browser and a web server commences, with HTTP defining what that conversation will look like. But the conversation itself is invisible to users of a web browser; someone using a browser simply sees some kind of progress indication and, ultimately, the web page.
Similarly, we might like to build a Polling client, whose job is to provide a user with the ability to use the Polling service without having to know the details of hosts, ports, sockets, and protocols, so they can simply look at a list of questions and vote on them.
Taking the opportunity to think about design
As programs get larger, we're best off separating them into modules that contain related subsets of functionality. When writing this program, we quickly find that there's a natural separation between the part of the program that implements the protocol (i.e., the part that communicates with the Polling server) and the program's user interface. Isolating each of these into its own module makes each of those modules simpler, and also provides other benefits (e.g., keeping a larger, complex program organized; allowing us to put more than one "outer shell" around the protocol code if, for example, we wanted to also write a graphical user interface).
So this program is probably best written with two modules, which we'll call polling (the protocol implementation) and polling_ui (the user interface).
Similarly, the functions in each module are broken into progressively smaller functions, with meaningful names and well-named parameters. This code, in my view, is a good illustration of why we should want to do that: there's a fair amount of complexity here that's worth isolating, so we can think about one thing at a time instead of everything at once.
Finally, within each module, we were fastidious about separating the public functions (i.e., the ones we expect would be needed by code in other modules) from the private ones (i.e., the ones that are only useful within that module). This separation provides at least two benefits: making it easier to understand how to use a module, by limiting how many functions a user of that module needs to know about; and leaving open the possibility that certain aspects of how a module is implemented might change without affecting the code that calls it. As long as other modules use only the public parts of our module, we can feel free to change the private ones without having a negative effect on the others.
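In Python, the usual way to mark a function as private to its module is to begin its name with an underscore. The sketch below shows how a protocol module might be laid out with one public function and a few private helpers. It is a hypothetical excerpt, not the actual polling module: the Connection type, the ProtocolError exception, and every function name here are assumptions made for illustration.

```python
from collections import namedtuple

# Assumption: a connection is modeled as a pair of text streams.
Connection = namedtuple('Connection', ['input', 'output'])


class ProtocolError(Exception):
    """Raised when the server sends something the client did not expect."""


# Public: code in other modules (such as a user interface) would call this.
def hello(connection: Connection, username: str) -> None:
    # Introduce the user to the server and check for the HELLO reply.
    _write_line(connection, f'POLLING_HELLO {username}')
    _expect_line(connection, 'HELLO')


# Private: the leading underscore signals that these are internal details,
# free to change as long as the public functions keep behaving the same way.
def _write_line(connection: Connection, line: str) -> None:
    connection.output.write(line + '\r\n')
    connection.output.flush()


def _read_line(connection: Connection) -> str:
    # Strip the trailing carriage return and line feed characters.
    return connection.input.readline().rstrip('\r\n')


def _expect_line(connection: Connection, expected: str) -> None:
    line = _read_line(connection)
    if line != expected:
        raise ProtocolError(f'expected {expected!r}, got {line!r}')
```

A user interface module would then call only hello (and other public functions like it), never the underscore-prefixed helpers, so the way lines are read and written could change later without touching the user interface.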
A word of warning
It should be noted that this isn't code that you're going to be able to copy and paste, in whole, into your Project #2 solution, as the protocol you're implementing in the project (and your program's interaction with it) is different from this one in some important ways. But there are ideas and techniques here that translate to your work on the project. The trick is being sure that you understand what's being done in this example, and why, before you attempt to use these ideas in your own programs; part of what's important is understanding which parts of this example fit the problem you're solving in the project and which don't.
The code
Below are links to a complete Polling client, along with a module that implements the Polling protocol. Because we tend to write these examples in lecture in a free-form way, there may be a few minor differences between what's here and what we wrote in lecture, but the code is not meaningfully different from what we wrote.
Remember to download these files and place them in the same directory. It's important that they're in the same directory, so that Python will be able to find them when one imports the other. (There are fancier things that we can do, but we generally keep all of the modules that comprise our programs in a single directory until they get much, much larger.)
Trying out the example client
A Polling server like the one we connected to during lecture is now running on the same machine where the Connect Four server for Project #2 is running. (See a previously-sent email for an indication of where that is.) The Polling server is listening on port 5501.
Note that you'll need to make one update to the code — setting the value of the POLLING_HOST constant defined in polling_ui.py — before you can successfully run it. As you experiment with it, you might also want to set the _SHOW_DEBUG_TRACE constant defined in polling.py to True, which will display every line of text sent to the server and received in response.