We are by nature both highly social and creative animals, and as a result, we are always finding new ways to communicate. It did not take long after computers were first connected together for it to be recognized that those interconnections provided the means to link together people as well. The desire to use computers to create an online community led to the creation of Usenet more than two decades ago.
Usenet started out as an informal network of UNIX computers. Usenet communication consists of four basic steps. A message is first composed and then posted to the originator’s local server. The third step is propagation, where the message is transmitted from its original server to others on the Usenet system. The last step in the process is article retrieval, where other members of the newsgroup access and read the article. The Network News Transfer Protocol (NNTP) is the technology used for moving Usenet articles from one host to the next.
NNTP is the protocol used to implement message communication in modern Usenet. It serves two primary purposes: to propagate messages between NNTP servers and to permit NNTP clients to post and read articles. It is a stand-alone protocol, but it shares many characteristics with email’s Simple Mail Transfer Protocol (SMTP), including its basic operation, command set, and reply format.
NNTP Overview and General Operation
As we know, Usenet started out as an informal network of UNIX computers using dial-up UUCP connections to transmit messages between servers. This arrangement arose out of necessity, and it worked fairly well, though it had a number of problems. Once the Internet became widely used in the 1980s, it provided the ideal opportunity for a more efficient means of distributing Usenet articles. NNTP was developed as a special TCP/IP protocol for sending these messages. Now NNTP carries billions of copies of Usenet messages from computer to computer every day.
Usenet began as a logical inter-network of cooperating hosts that contacted each other directly. In the early Usenet, a user would post a message to a local server, where it would stay until that server either contacted or was contacted by another server. The message would then be transferred to the new server, where it would stay until the second server contacted the third one, and so on. This transport mechanism was functional but seriously flawed in a number of ways.
Servers were not continually connected to each other; they could communicate only by making a telephone call using an analog modem. Thus, messages would often sit for hours before they could be propagated. Modems in those days were also very slow by today’s standards, so it took a long time to copy a message from one server to another. Worst of all, unless two sites were in the same city, these phone calls were long distance, making them quite expensive.
Why was this system used despite all of these problems? The answer is simply because there was no alternative. In the late 1970s and early 1980s, there was no Internet as we know it, and no other physical infrastructure existed to link Usenet sites together. It was either use UUCP over telephone lines or nothing.
That all changed as the fledgling ARPAnet grew into the modern Internet. As the Internet expanded, more and more sites connected to it, including many sites that were participating in Usenet. Once both sites in an exchange were on the Internet, it was an easy decision to use the Internet to send Usenet articles, rather than relying on slow, expensive phone calls. Over time, more and more Usenet sites joined the Internet, and it became clear that just as email had moved from UUCP to the TCP/IP Internet, the future of Usenet was on the Internet as well.
The shifting of Usenet from UUCP connections to TCP/IP inter-networking meant that some rethinking was required as to how Usenet articles were moved from server to server. On the Internet, Usenet was just one of many applications, and the transfer of messages had to be structured using TCP or the User Datagram Protocol (UDP). Thus, like other applications, Usenet required an application-level protocol to describe how to carry Usenet traffic over TCP/IP. Just as Usenet had borrowed its message format from email’s RFC 822, it made sense to model its message delivery protocol on the one used by email: SMTP. The result was the creation of NNTP, published as RFC 977 in February 1986.
The general operation of NNTP is indeed very similar to that of SMTP. NNTP uses TCP, with servers listening on well-known TCP port 119 for incoming connections, either from client hosts or other NNTP servers. As in SMTP, when two servers communicate using NNTP, the one that initiates the connection plays the role of client for that exchange.
After a connection is established, communication takes the form of commands sent by the client to the server and replies returned from the server to the client device. NNTP commands are sent as plain ASCII text, just like those used by SMTP, the File Transfer Protocol (FTP), the Hypertext Transfer Protocol (HTTP), and other protocols. NNTP responses take the form of three-digit reply codes as well as descriptive text, again just like SMTP (which, in turn, borrowed this concept from FTP).
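The reply format described above can be illustrated with a minimal sketch that splits one reply line into its three-digit code and descriptive text. The names `Reply`, `parse_reply`, and `reply_class` are hypothetical helpers invented for this example, and the sample reply string is illustrative only. The meaning of the first digit is paraphrased from RFC 977.

```python
from typing import NamedTuple

# Meaning of the first digit of a reply code, paraphrased from RFC 977.
REPLY_CLASSES = {
    1: "informative message",
    2: "command ok",
    3: "command ok so far, send the rest",
    4: "command correct but could not be performed",
    5: "command unimplemented, incorrect, or serious error",
}


class Reply(NamedTuple):
    code: int   # three-digit reply code
    text: str   # descriptive text that follows the code


def parse_reply(line: str) -> Reply:
    """Split one reply line into its numeric code and descriptive text."""
    line = line.rstrip("\r\n")              # replies end with CR LF
    code_part, _, text = line.partition(" ")
    if len(code_part) != 3 or not (code_part.isascii() and code_part.isdigit()):
        raise ValueError(f"not a three-digit reply code: {line!r}")
    return Reply(int(code_part), text)


def reply_class(reply: Reply) -> str:
    """Describe the reply based on the first digit of its code."""
    return REPLY_CLASSES.get(reply.code // 100, "unknown")


if __name__ == "__main__":
    greeting = parse_reply("200 server ready - posting allowed\r\n")
    print(greeting.code, "|", greeting.text, "|", reply_class(greeting))
```

Because the numeric code carries the meaning, a client in this style can act on the number alone and treat the text as a message for humans.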
NNTP was designed to be a comprehensive vehicle for transporting Usenet messages. It is most often considered as a delivery protocol for moving Usenet articles from one server to another, but it is also used for connections from client hosts to Usenet servers for posting and reading messages. Thus, the NNTP command set is quite extensive and includes commands to handle communications between servers and between clients and servers. For message propagation, a set of commands allows a server to request new articles from another server or to send new articles to another server. For message posting and access, commands allow a client to request lists of new newsgroups and messages, and to retrieve messages for display to a user.
The commands defined in RFC 977 were the only official ones for over a decade. However, even as early as the late 1980s, implementers of NNTP server and client software were adding new commands and features to make NNTP both more efficient and useful to users. These NNTP extensions were eventually documented in RFC 2980, published in 2000.
NNTP is used for all of the transfer steps in the modern Usenet communication process. However, NNTP is most often associated with the process of Usenet article propagation. This is arguably the most important function of NNTP: providing an efficient means of moving large volumes of Usenet articles from one server to another. It is thus a sensible place to start looking at the protocol.
NNTP Interserver Communication Process: News Article Propagation
The Usenet Server Structure
In theory, all that is required of the Usenet structure is that each site is connected to at least one other site in some form. The logical network could be amorphous and without any formal structure at all, as long as every site could form a path through some sequence of intermediate servers to each other one. However, the modern Usenet is very large, with thousands of servers and gigabytes of articles being posted every day. This calls for a more organized structure for distributing news.
For this reason, the modern Usenet logical network is structured loosely in a hierarchy. A few large Internet service providers (ISPs) and big companies with high-speed Internet connections and large servers are considered to be at the top of the hierarchy, in what is sometimes called the Usenet backbone. Smaller organizations connect to the servers run by these large organizations; these organizations are considered to be downstream from the backbone groups. In turn, still smaller organizations may connect further downstream from the ones connected to the large organizations.
This hierarchical structure means that most Usenet servers maintain a direct connection only to their upstream neighbor and to any downstream sites to which they provide service. A server is said to receive a news feed from its upstream connection since that is the place from which it will receive most of its news articles. It then provides a news feed to all the servers downstream from it.
As an example, suppose Company A runs a large Usenet server called Large news that is connected to the backbone. Downstream from this server is the NNTP server Medium news. That server provides service to the server named Small news. If a user posts an article to Medium news, it will be placed on that server immediately. Medium news will send the article downstream, to Small news, so that it can be read by that server’s users. Medium news will also, at some point, send the article upstream to Large news. From Large news, the message will be distributed to other backbone sites, which will pass the message down to their own downstream sites. In this way, all sites eventually get a copy of the message, even though Medium news needs to connect directly to only two other servers.
The term used to describe how news is propagated with NNTP is flooding. This is because of the way that a message begins in one server and floods outward from it, eventually reaching the backbone sites, and then going down all the downstream “rivers” to reach every site on Usenet.
Even though I described the logical Usenet network as a hierarchy, it is not a strict hierarchy. For redundancy, many NNTP servers maintain connections to multiple other servers to ensure that news propagates quickly. The transmission of articles can be controlled by looking at message IDs to avoid duplication of messages that may be received simultaneously by one server from more than one neighbor.
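A small simulation can make flooding and duplicate suppression concrete. This is only a sketch under stated assumptions: the server names, the extra `other_backbone` site, the redundant link, and the `flood` function are all hypothetical, and a server here naively offers the article to every neighbor, including the one it came from. Each server remembers the message IDs it already holds and refuses a second copy.

```python
from collections import deque


def flood(neighbors, origin, message_id):
    """Flood one article outward from origin.

    Returns the order in which servers accepted the article and the
    number of offers refused because the message ID was already known.
    """
    seen = {name: set() for name in neighbors}   # message IDs held per server
    seen[origin].add(message_id)
    order = [origin]
    refused = 0
    queue = deque([origin])
    while queue:
        server = queue.popleft()
        for peer in neighbors[server]:
            if message_id in seen[peer]:
                refused += 1                     # duplicate: peer already has it
                continue
            seen[peer].add(message_id)
            order.append(peer)
            queue.append(peer)                   # peer floods onward in turn
    return order, refused


# Hypothetical topology: the chain from the example above plus one
# redundant link between "medium" and a second backbone site.
NEIGHBORS = {
    "large": ["medium", "other_backbone"],
    "medium": ["large", "small", "other_backbone"],
    "small": ["medium"],
    "other_backbone": ["large", "medium"],
}

if __name__ == "__main__":
    order, refused = flood(NEIGHBORS, "medium", "<1234@medium.example>")
    print(order, refused)
```

Every server ends up with exactly one copy, and the redundant link costs only a few refused offers rather than duplicate articles.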
Basic NNTP Propagation Methods
Now let’s look at how messages are actually propagated between servers using NNTP. There are two techniques by which this can be done:
- In the push model, as soon as a server receives a new message, it immediately tells its upstream and downstream neighbors about the message and asks them if they want a copy of it.
- In the pull model, servers do not offer new articles to their neighbors. The neighboring servers must ask for a list of new messages if they want to see what has arrived since the last connection was established, and then request that the new messages be sent to them.
Both techniques have advantages and disadvantages, but pushing is the model most commonly used today.
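The difference between the two models can be sketched with a toy in-memory server. This is not an NNTP implementation: the `Server` class and its methods are hypothetical, articles are plain strings, and a simple arrival counter stands in for the date and time a real server would use. In RFC 977 the push model roughly corresponds to the IHAVE command and the pull model to NEWNEWS followed by ARTICLE.

```python
class Server:
    """Toy news server holding articles keyed by message ID."""

    def __init__(self, name):
        self.name = name
        self.articles = {}   # message ID -> (arrival number, body)
        self.arrivals = 0    # counter standing in for a timestamp

    def wants(self, message_id):
        """A server wants only articles it does not already hold."""
        return message_id not in self.articles

    def store(self, message_id, body):
        if not self.wants(message_id):
            return False
        self.arrivals += 1
        self.articles[message_id] = (self.arrivals, body)
        return True

    def push_to(self, peer, message_id):
        """Push model: offer the article, send it only if the peer wants it."""
        if peer.wants(message_id):
            return peer.store(message_id, self.articles[message_id][1])
        return False

    def list_new(self, since):
        """List message IDs that arrived after arrival number `since`."""
        return [mid for mid, (n, _) in self.articles.items() if n > since]

    def pull_from(self, peer, since):
        """Pull model: ask the peer what is new, then fetch what is missing."""
        fetched = []
        for mid in peer.list_new(since):
            if self.wants(mid):
                self.store(mid, peer.articles[mid][1])
                fetched.append(mid)
        return fetched


if __name__ == "__main__":
    a, b, c = Server("a"), Server("b"), Server("c")
    a.store("<1@a.example>", "first article")
    a.store("<2@a.example>", "second article")
    print(a.push_to(b, "<1@a.example>"))   # b accepts the offer
    print(c.pull_from(a, 0))               # c asks a for everything new
```

In the push case the sending server drives the exchange as soon as it has something new; in the pull case the receiving server decides when to ask, so articles wait until the next request.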