
Here we discuss the basic concepts behind the operation of a Usenet news system.
Xref: news.starcomsoftware.com starcom.tech.misc:211 starcom.tech.security:452
Newsgroups: starcom.tech.misc,starcom.tech.security
Path: news.starcomsoftware.com!purva!shuvam
From: Shuvam <shuvam@starcomsoftware.com>
Subject: "You just throw up your hands and reboot" (fwd)
Content-Type: TEXT/PLAIN; charset=US-ASCII
Distribution: starcom
Organization: Starcom Software Pvt Ltd, India
Message-ID: <Pine.LNX.4.31.0107022153490.30462-100000@starcomsoftware.com>
Mime-Version: 1.0
Date: Mon, 2 Jul 2001 16:27:57 GMT

Interesting quote, and interesting article. Incidentally, comp.risks
may be an interesting newsgroup to follow. We must be receiving the
feed for this group on our server, since we receive all groups under
comp.*, unless specifically cancelled. Check it out sometime.

comp.risks tracks risks in the use of computer technology, including
issues in protecting ourselves from failures of such stuff.

Shuvam

> Date: Thu, 14 Jun 2001 08:11:00 -0400
> From: "Chris Norloff" <cnorloff@norloff.com>
> Subject: NYSE: "Throw up your hands and reboot"
>
> When the New York Stock Exchange computer systems crashed for 85
> minutes (8 Jun 2001), Andrew Brooks, chief of equity trading at
> Baltimore mutual fund giant T. Rowe Price, was quoted as saying "Hey,
> we're all subject to the vagaries of technology. It happens on your
> own PC at home. You just throw up your hands and reboot."
>
> http://www.washingtonpost.com/ac3/ContentServer?articleid=A42885-2001Jun8&pagename=article
>
> Chris Norloff
>
>
> This is from --
>
> From: risko@csl.sri.com (RISKS List Owner)
> Newsgroups: comp.risks
> Subject: Risks Digest 21.48
> Date: Mon, 18 Jun 2001 19:14:57 +0000 (UTC)
> Organization: University of California, Berkeley
>
> RISKS-LIST: Risks-Forum Digest Monday 19 June 2001
> Volume 21 : Issue 48
>
> FORUM ON RISKS TO THE PUBLIC IN COMPUTERS AND RELATED SYSTEMS (comp.risks)
> ACM Committee on Computers and Public Policy,
> Peter G. Neumann, moderator
>
> This issue is archived at <URL:http://catless.ncl.ac.uk/Risks/21.48.html>
> and by anonymous ftp at ftp.sri.com, cd risks .
>
In essence, all queued feeds work in the following way. When the sending server receives an article, it processes it for inclusion into its local repository, and also checks all its outgoing feed definitions to see whether the article needs to be queued for any of the feeds. If so, it is added to the queue file of each such outgoing feed. The precise details of the queue file vary with the software implementation, but the basic process remains the same. A queue file is a list of queued articles; it does not contain the article contents. A typical queue file is an ASCII text file with one line per article, giving the path to a copy of the article in the local spool area.
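The queueing step described above can be sketched as follows. This is a minimal illustration, not the code of C-News or INN: the feed names, the `FEEDS` table and the shell-style group patterns are all hypothetical, and real servers read their feed definitions from their own configuration files.

```python
import fnmatch
import os

# Hypothetical feed definitions: feed name -> newsgroup patterns it carries.
FEEDS = {
    "downstream-a": ["comp.*"],
    "downstream-b": ["comp.risks", "starcom.*"],
}

def feeds_for(newsgroups, feeds=FEEDS):
    """Return the names of feeds whose patterns match any of the article's groups."""
    return sorted(
        name
        for name, patterns in feeds.items()
        if any(fnmatch.fnmatchcase(g, p) for g in newsgroups for p in patterns)
    )

def enqueue(spool_path, newsgroups, queue_dir, feeds=FEEDS):
    """Append the article's spool path (not its contents) to each matching queue file."""
    matched = feeds_for(newsgroups, feeds)
    for name in matched:
        with open(os.path.join(queue_dir, name), "a", encoding="ascii") as q:
            q.write(spool_path + "\n")
    return matched
```

Each queue file ends up as a plain text list with one spool path per line, which is all the later transmit step needs.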
Later, a separate process picks up each queue file and creates one or more batches for each outgoing feed. A batch is a large file containing multiple Usenet news articles. Once the batches are created, various transport mechanisms can be used to move the files from sending server to receiving server. You can even use scripted FTP. You only need to ensure that the batch is picked up from the upstream server and somehow copied into a designated incoming batch directory in the downstream server.
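As a rough sketch of the batching step, the function below walks a queue file and concatenates the queued articles into one batch. It assumes the conventional rnews batch layout, in which each article is preceded by a `#! rnews <byte count>` line; the text above does not spell out the format, so treat that detail as an assumption. The `read_article` callable is a hypothetical stand-in for reading from the spool area.

```python
def build_batch(queue_lines, read_article):
    """Concatenate queued articles into one batch.

    queue_lines  -- iterable of spool paths, one per queued article
    read_article -- callable returning the article bytes for a path,
                    or None if the article is no longer in the spool
    """
    parts = []
    for line in queue_lines:
        path = line.strip()
        if not path:
            continue  # ignore blank lines in the queue file
        article = read_article(path)
        if article is None:
            continue  # expired or cancelled since it was queued
        # Assumed rnews framing: a header line giving the article size in bytes.
        parts.append(b"#! rnews %d\n" % len(article))
        parts.append(article)
    return b"".join(parts)
```

The resulting bytes can be written to a file and handed to whatever transport is in use; compression, if any, would be applied to the finished batch.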
UUCP has traditionally been the mechanism of choice for batch movement, because it predates the Internet and the wide availability of fast packet-switched data networks. Today, with TCP/IP everywhere, UUCP once again emerges as the most logical choice for batch movement, because it too has moved with the times: it can work over TCP.
NNTP is the de facto mechanism of choice for moving queued newsfeeds between carrier-class Usenet servers on the Internet and, unfortunately, for a lot of other Usenet servers as well. Why we find this choice unfortunate is discussed in Section 12.1 below. In NNTP feeds, however, the intermediate step of building batches out of queue files can be eliminated --- this is both its strength and its weakness.
In the case of queued NNTP feeds, articles get added to queue files as described above. An NNTP transmit process periodically wakes up, picks up a queue file, and makes an NNTP connection to the downstream server. It then begins a processing loop where, for each queued article, it uses the NNTP IHAVE command to inform the downstream server of the article's message ID. The downstream server checks its local repository to see whether it already has the message. If not, it responds with a SENDME response (NNTP response code 335), asking for the article. The transmitting server then pumps out the article contents in plaintext form. When all articles in the queue have been processed in this way, the sending server closes the connection. If the NNTP connection breaks midway for any reason, the sending server truncates the queue file and retains only those articles which are yet to be transmitted, thus minimising repeat transmissions.
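The transmit loop and the queue truncation on failure can be sketched without any networking. The `Downstream` class below is a hypothetical in-memory stand-in for the receiving server, and its `fail_after` parameter exists only to simulate a dropped link; a real feed would exchange IHAVE commands and responses over a TCP connection.

```python
class Downstream:
    """Stand-in for the receiving server; not a real NNTP client or server."""

    def __init__(self, known_ids, fail_after=None):
        self.known = set(known_ids)
        self.received = {}
        self.fail_after = fail_after  # simulate a dropped link after N offers

    def ihave(self, message_id):
        """Return True if the article is wanted, False if it is already held."""
        if self.fail_after is not None:
            if self.fail_after == 0:
                raise ConnectionError("link dropped")
            self.fail_after -= 1
        return message_id not in self.known

    def take(self, message_id, body):
        """Accept the article contents after a positive answer to ihave()."""
        self.known.add(message_id)
        self.received[message_id] = body


def transmit(queue, articles, downstream):
    """Offer each queued message ID in turn; return the entries still to be sent."""
    for i, message_id in enumerate(queue):
        try:
            if downstream.ihave(message_id):
                downstream.take(message_id, articles[message_id])
        except ConnectionError:
            return queue[i:]  # keep only what has not been processed yet
    return []
```

Whatever `transmit` returns is what the sender would write back as the truncated queue file, so the next run resumes where this one stopped.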
A queued NNTP feed works with the sending server making an NNTP connection to the receiving server. This implies that the receiving server must have an IP address which is known to the sending server or can be looked up in the DNS. If the receiving server connects to the Internet periodically using a dialup connection and works with a dynamically assigned IP address, this can get tricky. UUCP feeds suffer no such problems because the sending server for the newsfeed can be the UUCP server, i.e. passive. The receiving server for the feed can be the UUCP master, i.e. the active party. So the receiving server can then initiate the UUCP connection and connect to the sending server. Thus, if even one of the two parties has a static IP address, UUCP queued feeds can work fine.
Thus, NNTP feeds can be sent out a little faster than the batched transmission processes used for UUCP and other older methods, because no batches need to be constructed. However, NNTP is often used for newsfeeds where it is not necessary, and there it results in a colossal waste of bandwidth. Before we study the efficiency issues of NNTP versus batched feeds, we will cover another way feeds can be organised using NNTP: the pull feed.
A pull feed works by the downstream server pulling out articles one by one, just like any NNTP newsreader, using the NNTP ARTICLE command with the message ID as parameter. The interesting detail is how it gets the message IDs to begin with. For this, it uses an NNTP command, specially designed for pull feeds, called NEWNEWS. This command takes a hierarchy and a date, for example:
NEWNEWS comp 15081997
This command is sent by the downstream server over NNTP to the upstream server, and in effect asks the upstream server to list all news articles in the comp hierarchy which are newer than 15 August 1997. The upstream server responds with an (often huge) list of message IDs, one per line, ending with a period on a line by itself.
The pulling server then compares each newly received message ID with its own article database and makes a (possibly shorter) list of all articles which it does not have, thus eliminating duplicate fetches. That done, it begins fetching articles one by one, using the NNTP ARTICLE command as mentioned above.
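The bookkeeping on the pulling side can be sketched as two small functions: one reads the message ID list up to the terminating period, the other drops IDs the local server already holds. The function names and the in-memory `history` set are hypothetical, and the input is assumed to be the lines following the server's initial status line. Note also that the NEWNEWS example above is schematic: on the wire, RFC 977 expects the date as YYMMDD followed by a time as HHMMSS.

```python
def parse_id_list(response_lines):
    """Collect message IDs from a multi-line response, stopping at the lone '.'."""
    ids = []
    for line in response_lines:
        line = line.rstrip("\r\n")
        if line == ".":
            break  # a period on a line by itself ends the list
        if line:
            ids.append(line)
    return ids


def ids_to_fetch(offered, history):
    """Keep only IDs absent from the local history, preserving order, no repeats."""
    seen = set(history)
    wanted = []
    for message_id in offered:
        if message_id not in seen:
            seen.add(message_id)
            wanted.append(message_id)
    return wanted
```

Each ID left in the result would then be requested with one ARTICLE command.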
In addition, there is another NNTP command, NEWGROUPS, which allows the NNTP client --- i.e. the downstream server in this case --- to ask its upstream server which newsgroups have been created since a given date. This allows the downstream server to add the new groups to its active file.
The NEWNEWS-based approach is usually one of the most inefficient methods of pulling out a large Usenet feed. By inefficiency, here we refer to the CPU load and RAM utilisation on the upstream server, not to bandwidth usage. This inefficiency arises because most Usenet news servers do not keep their article databases indexed by hierarchy and date; C-News certainly does not. This means that a NEWNEWS command issued to an upstream server will put that server into a sequential search of its article database, to see which articles fit into the hierarchy given and are newer than the given date.
If pull feeds were to become the most common way of sending out articles, then all upstream servers would badly need an efficient way of sorting their article databases to allow each NEWNEWS command to rapidly generate its list of matching articles. A slow upstream server today might take minutes to begin responding to a NEWNEWS command, and the downstream server may time out and close its NNTP connection in the meanwhile. We have often seen this happen, until we adjusted the timeouts.
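To make the cost difference concrete, here is a toy comparison of a sequential search with a lookup in a database sorted by hierarchy and date. The record layout `(arrival_time, newsgroup, message_id)` and all function names are hypothetical; no real news server is claimed to store its history this way, and crossposted articles are ignored for simplicity.

```python
import bisect


def newnews_scan(history, hierarchy, since):
    """Sequential search: every record is examined for every request."""
    prefix = hierarchy + "."
    return [mid for t, group, mid in history
            if t >= since and group.startswith(prefix)]


def build_index(history):
    """Group records by top-level hierarchy, each list sorted by arrival time."""
    index = {}
    for t, group, mid in sorted(history):
        index.setdefault(group.split(".", 1)[0], []).append((t, mid))
    return index


def newnews_indexed(index, hierarchy, since):
    """Binary-search the starting point, then read only the matching tail."""
    records = index.get(hierarchy, [])
    start = bisect.bisect_left(records, (since, ""))
    return [mid for _, mid in records[start:]]
```

The scan touches every record on each request, while the indexed version does a binary search and then reads only the answer; the price is keeping the index sorted as articles arrive.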
There are basic efficiency issues of bandwidth utilisation involved in NNTP for news feeds, which are applicable for both queued and pull feeds. But the problem with NEWNEWS is unique to pull feeds, and relates to server loads, not bandwidth wastage.
Xref: news.starcomsoftware.com control:814217
Path: news.starcomsoftware.com!linux594.dn.net!news.dn.hoopoo.com!
feed-out.newsfeeds.com!newsfeeds.com!feed.newsfeeds.com!
newsfeeds.com!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!
newsfeed.icl.net!newsfeed.skycache.com!Cidera!newsfeed.gamma.ru!
Gamma.RU!carrier.kiev.ua!goblin.nadrabank.kiev.ua!not-for-mail
From: tale@uunet.uu.net (David C Lawrence)
Newsgroups: news.groups,humanities.hipcrime
Subject: cmsg newgroup humanities.hipcrime
Control: newgroup humanities.hipcrime
Date: Sun, 18 Feb 2001 11:50:28 GMT
Organization: The Cabal
Lines: 20
Approved: tale@uunet.uu.net
Message-ID: <3afWYZTIR.G5YOC2@uunet.uu.net>
NNTP-Posting-Host: 203.145.147.67
X-Trace: goblin.nadrabank.kiev.ua 982528840 21455 203.145.147.67
(18 Feb 2001 20:40:40 GMT)
X-Complaints-To: usenet@nadrabank.kiev.ua
NNTP-Posting-Date: 18 Feb 2001 20:40:40 GMT
X-No-Archive: Yes
humanities.hipcrime is an unmoderated newsgroup which passed its
vote for creation by 326:10 as reported in news.announce.newgroups
on 18 Feb 2001.
For your newsgroups file:
humanities.hipcrime HipCrime for Humanity - you committed one now!
Anyone can create a newsgroup in the alt, biz, comp, earth,
humanities, misc, news, meow, rec, sci, soc, talk, us, or
any other Usenet hierarchy. New newsgroup proposals may be
optionally discussed in news.groups. Please be sure that your
/usr/lib/news/control.ctl is configured correctly:
## NEWGROUP MESSAGES
## honor them all and log in ${LOG}/newgroup.log
newgroup:*:alt.*|biz.*|comp.*|earth.*|humanities.*|misc.*|news.*|\
meaw.*|rec.*|sci.*|soc.*|talk.*|us.*:doit=newgroup
## RMGROUP MESSAGES
## drop them all and don't log
rmgroup:*:*:drop
Meow!
David C Lawrence
checkgroups: control message to check whether the list of newsgroups in your active file is correct as per a master list of newsgroups sent in the control message
newgroup: control message to create a new newsgroup
rmgroup: control message to delete a newsgroup and all articles in it
sendsys: control message to cause an email response to be sent to the author with the sys file of your server in it. This results in a response storm of emails from all the Usenet servers in the world to the author. These responses allow the sender of the control message to analyse all the sys files of the world's Usenet servers and create the directed graph of Usenet newsfeeds. Why someone would want to do this is hard to guess, but the result is surely an awesome picture of one facet of networked human civilisation, like looking at a giant world map.
Incidentally, there is no invasion of privacy here, because your server's sys file is supposed to be public information, if you take feeds from the public Usenet.
version: control message which results in your Usenet software sending an email to the author of the message, containing the type and version of the Usenet news software you are using. This too is not an invasion of privacy, because this information is supposed to be public knowledge.
cancel: the most frequently occurring type of control message. A cancel message specifies the message ID of an article, and results in the cancellation (deletion) of that article. If you post an article and regret it a moment later, your Usenet newsreader software usually allows you to "cancel" it by generating a cancel message.
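Recognising a control message and its type can be sketched from the sample shown earlier, which carries both a Control header and a Subject beginning with "cmsg". The function below is a hypothetical helper that takes the headers as a plain dictionary; it only identifies the type and arguments and takes no action.

```python
def parse_control(headers):
    """Return (type, arguments) for a control message, or None for an ordinary article."""
    command = headers.get("Control")
    if command is None:
        # Fall back to the "cmsg" convention seen in the Subject of the sample.
        subject = headers.get("Subject", "")
        if subject.startswith("cmsg "):
            command = subject[len("cmsg "):]
    if not command:
        return None
    words = command.split()
    if not words:
        return None
    return words[0].lower(), words[1:]
```

For the sample message this yields the type `newgroup` with the single argument `humanities.hipcrime`; whether the server then acts on it is a separate authorisation question.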
The Usenet news software maintains a pseudo-newsgroup called control, where it files all control messages it receives. If you have an incoming newsfeed from the public Usenet, your server's control group will usually be full of thousands of cancel messages from trigger-happy fingers all over the world. Usenet news server software like C-News allows you to filter the incoming feed based on newsgroups, and will discard articles for groups your server does not subscribe to. But since all servers have to receive and process control messages, your server will accept all these cancel messages, though many of them may apply to articles which are not part of your highly-pruned subset of groups. C'est la vie.
Remember to set expiry for the control group to one day or even shorter, so that the junk can be cleaned out as rapidly as possible, just like the junk newsgroup.
The beauty of the control message architecture is that it integrates seamlessly into the newsfeed mechanism for automatic control of the network of servers. No separate channel of connection is needed for the control actions. And article replication automatically propagates control messages along with human-readable articles, thus guaranteeing reach across heterogeneous network technologies.
What your Usenet server does on receiving a control message is governed by an authorisation file: $NEWSCTL/controlperms in the case of C-News and control.ctl in the case of INN, for instance. The security measures implemented by this file are further enhanced by the pgpcontrol package with its pgpverify script. Using pgpverify, your server can check that all control messages (except for article cancellation messages) are digitally signed by a trusted party using PGP public key cryptography. Our integrated Usenet news software distribution includes integration with pgpverify.
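The kind of decision an authorisation file encodes can be sketched with the four-field line format that appears in the sample control message earlier (`type:from:newsgroups:action`). This is an illustration only and not INN's or C-News's actual parser: it assumes that the last matching rule wins, approximates the pattern syntax with Python's `fnmatch`, and does not handle backslash-continued lines. Since the From address of a control message can be forged, matching on it alone is weak, which is why signature checking with pgpverify matters.

```python
import fnmatch


def parse_rules(text):
    """Parse 'type:from:newsgroups:action' lines, skipping comments and blanks."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        kind, sender, groups, action = line.split(":", 3)
        rules.append((kind, sender, groups.split("|"), action))
    return rules


def decide(rules, kind, sender, group, default="drop"):
    """Return the action of the last rule matching this control message (assumed policy)."""
    action = default
    for r_kind, r_sender, r_groups, r_action in rules:
        if (r_kind == kind
                and fnmatch.fnmatchcase(sender, r_sender)
                and any(fnmatch.fnmatchcase(group, g) for g in r_groups)):
            action = r_action
    return action
```

A cautious rule set would start with a blanket `drop` for each message type and then add narrow exceptions for senders and hierarchies you trust.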