Stale HOWTOs (the case of Modem-HOWTO)
by David S. Lawyer, Mar. 7, 2001
Out-of-date (stale) documentation is a major problem for Linux. This is also a problem in the Linux Documentation Project (LDP). One well-known reason for stale documents is that document authors sometimes don't revise their documents frequently enough. But even if they are revised frequently, people searching for information may not find up-to-date versions.
Here's why. Even though the Linux Documentation Project (LDP) has the most recent versions of its documents on over 200 mirror sites, several hundred other sites also carry LDP documents. Unfortunately, most of these have stale documentation. Why don't people just go to the mirror sites and avoid the other sites? The reason is that many people search for information about Linux using one of the many search engines available on the Internet. More likely than not, such a search engine will find out-of-date Linux documents. While the LDP sites have a search engine for searching the LDP site, it's often advantageous to search the entire Web since there are many other documents available besides just LDP's. But doing so is likely to find stale documentation.
Suppose one finds an LDP HOWTO by using a search engine. Can't they just look at the date of the document and also click on a link to a mirror site that will have the latest document? Unfortunately, this isn't too easy to do. What people usually find with a search engine is not the entire document, but only a chapter of a document. The HTML documents are usually split up into chapters so that they will download fast.
The individual chapters don't contain version or date information (perhaps they should). While there may be a chapter in the document that contains a link to the latest version, it's not likely to be in the chapter that one finds with a search engine. To find such a link (if it exists) requires first clicking on the "contents" link to get to the table-of-contents page. Then one might browse the contents to try to find a link to another chapter which itself might contain a link to the most recent version. It's not simple, sure or fast, so few readers are likely to do this.
I did a quick survey to find out which versions of Modem-HOWTO were on the Internet. Here are the results. (The last column is the number of sites on the Web per Google on Mar. 2, 2001.)
The situation is not quite as dire as shown above since in some cases Google doesn't have the latest info: the site has been updated but Google doesn't know about it, or the site may be dead. But a spot check indicated that roughly 80% of them still exist as listed. The sites that were supposed to have v0.12 frequently had the latest version.
For a small minority of cases there's double counting since some sites have HOWTOs in more than one format. Also, a small minority of sites have stale HOWTOs in a directory named "archives", "old", etc. This is OK since they are being correctly classified.
In another respect the situation is even worse than described above, since the Modem-HOWTO was a fork from the Serial-HOWTO. Over 200 old versions of Serial-HOWTO (prior to the first version of Modem-HOWTO) are still on the Internet. They all contain quite obsolete information about modems.
Here are some details on how I did the search. I searched using google.com with the search terms: Modem-HOWTO "modulation details" v0.xx, where xx = 00, 01, 02, etc. The phrase "modulation details" is from the table of contents, so it always selects the HTML table-of-contents file (for split HTML HOWTOs). This is needed since v0.xx sometimes also appears in chapter 1, where it is used so that readers can click on a link to LDP to see if they have the latest version. If "modulation details" were omitted there would be double counting. Also, "modulation details" removes hits on lists/catalogs of HOWTOs. There are still more details on how I did it, but they're not of general interest and are thus omitted.
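The search strings described above are regular enough to be generated mechanically. Here is a minimal Python sketch of that step. The function name build_queries and the upper version bound are assumptions made for illustration (the survey only says "00, 01, 02, etc."), and the sketch only builds the strings; it does not contact any search engine.

```python
def build_queries(max_minor, howto="Modem-HOWTO", phrase="modulation details"):
    """Return one search string per version v0.00 .. v0.<max_minor>.

    max_minor is a hypothetical upper bound chosen by the caller;
    the original survey does not state where the version list ends.
    """
    # The quoted phrase restricts hits to the table-of-contents page,
    # which avoids counting the same site once per chapter.
    return [f'{howto} "{phrase}" v0.{minor:02d}' for minor in range(max_minor + 1)]


if __name__ == "__main__":
    for query in build_queries(2):
        print(query)
```

Each string would then be pasted into the search engine by hand and the reported number of hits recorded for that version.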
Thus there are a lot of out-of-date versions of LDP docs (and other documentation) on the Internet. One way to try to lessen this problem would be to put some requirement into the license so that when a document becomes outdated it must be clearly labeled as such. Such labeling needs to be seen before one clicks on the document. But how can this be assured? What might help would be to add a suffix to the name of the document to indicate that it's outdated.
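As a rough illustration of the suffix idea, a site maintainer could rename a superseded file with a small helper like the one below. This is only a sketch: the function name mark_outdated and the suffix text "-OUTDATED" are assumptions for this example, not an LDP convention.

```python
from pathlib import PurePosixPath


def mark_outdated(filename, suffix="-OUTDATED"):
    """Return filename with suffix inserted before the extension.

    The suffix text is a hypothetical choice; any clearly visible
    marker would serve the same purpose.
    """
    path = PurePosixPath(filename)
    # Leave names that already carry the marker unchanged.
    if path.stem.endswith(suffix):
        return filename
    return str(path.with_name(path.stem + suffix + path.suffix))


if __name__ == "__main__":
    print(mark_outdated("Modem-HOWTO.html"))
```

Because the marker becomes part of the file name, it shows up in the URL that a search engine lists, so a reader can see it before clicking on the document.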