LINUX GAZETTE

February 2002, Issue 75       Published by Linux Journal

Linux Gazette Staff and The Answer Gang

Editor: Michael Orr
Technical Editor: Heather Stern
Senior Contributing Editor: Jim Dennis
Contributing Editors: Ben Okopnik, Dan Wilder, Don Marti

TWDT 1 (gzipped text file) and TWDT 2 (HTML file) are files containing the entire issue: one in text format, one in HTML. They are provided strictly as a way to save the contents as one file for later printing in the format of your choice; there is no guarantee of working links in the HTML version.
Linux Gazette[tm], http://www.linuxgazette.com/
This page maintained by the Editor of Linux Gazette.

Copyright © 1996-2002 Specialized Systems Consultants, Inc.




Comments on: Play with the lovely netcat

Fri, 11 Jan 2002 19:11:53 +0800
zhaoway

I've forwarded these comments about my Jan article in Linux Gazette: Play with the lovely netcat. Could you post it in your Mailbag? Thanks!

zw


The purpose of yes

Date: Thu, 3 Jan 2002 16:05:19 -0700 (MST)
From: Bruno Melli

Hi zhaoway,

I was enjoying your column in the latest Linux Gazette and came upon your description of /usr/bin/yes. I'm by no means a Unix historian, but from what I understand the yes command had a very basic purpose:

The original rm command didn't have a -f option. So if you did rm -r /some/dir (or rm * where the current dir had lots of files) and the permissions weren't set right, you ended up having to type a bunch of 'y's because rm asked whether you wanted to override the permissions.

Try it:

touch /tmp/haha
chmod 000 /tmp/haha
rm /tmp/haha

Imagine how annoying that becomes if you tried to rm hundreds of files at once.

The solution, if you didn't have access to the rm source (or took the basic philosophy of Unix to the extreme):

yes | rm -r /some/dir
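
A minimal sketch of the same trick on a current system, assuming GNU coreutils and a throwaway directory made with mktemp (the file names are only illustrative). Today's rm asks about write-protected files only when its standard input is a terminal, so the sketch uses rm -i to force the prompts that yes then answers:

```shell
# Scratch directory holding a few write-protected files.
scratch=$(mktemp -d)
touch "$scratch/a" "$scratch/b" "$scratch/c"
chmod 000 "$scratch"/*

# rm -i asks about every file and about the directory itself;
# yes feeds it an endless stream of "y" answers on stdin.
yes | rm -ri "$scratch" 2>/dev/null

[ -e "$scratch" ] || echo "removed"
```

The prompts go to stderr, which is discarded here; drop the 2>/dev/null to watch rm ask each question.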

bruno.


Author of Netcat

Date: Wed, 2 Jan 2002 16:21:27 -0800
From: "Golden_Eternity"

In your article "Play with the Lovely Netcat: Reinvent /usr/bin/yes" you comment on the anonymity of the author of Netcat.

I could be wrong, but I'm fairly certain that the author is Hobbit of the l0pht (currently @stake). There's a Win32 version by Chris Wysopal, as well.

http://www.atstake.com/research/tools/index.html#network_utilities


LG #74 Mailbag: Desktop support

We got two messages on this topic.


pls pass this onto Dennis Field - his email doesn't work

Date: Fri, 28 Dec 2001 20:23:50 +0000
Luke Worthy

re: Winning the Battle for the Desktop

Dude - quit your Linux laptop whining...heh - jk ;)

http://www.linux-laptop.net

and btw: try Mandrake, it has excellent PnP - they at least have a chat-style site for support, and it's all pretty good - just make sure your winmodem is supported:

http://www.linmodems.org

That's usually the most important thing.

Luke


Regarding all these comments about desktop support ---

Date: Thu, 17 Jan 2002 02:54:19 -0800
Iron

There are two major classes of desktop: home and office. The former is novices and hobbyists (who help the novices). The latter has help desks.

Linux's economics have little chance of winning over novice desktops. That's because the cost of tech support for the few is borne by everyone who buys the software. Thus, a $50 package can afford to bear a 15 minute tech support phone call, and still turn a profit.

Actually, they cannot. The retailer and distributor will take 20-50% off the top. That leaves $25. Even with low-paid support staff, a 15-minute call can't cost less than $5 unless it's a simple answer (in which case the call would have taken one minute) and all the infrastructure costs to maintain the help desk and its resources are externalized as overhead. If they sell one copy, they would not have enough profit to take the call, unless the company was tiny and had a tiny customer base (in which case the customer-service staff or other staff would double as tech-support staff, so they would have to be employed anyway).

If they sell a hundred copies (or whatever the number is), they can take that 15-minute call. If the person calls back, they will have lost all of their profit on those hundred copies. If another of those hundred customers also calls in, the company will lose money.

That's why unlimited free tech support has disappeared, why limited free tech support has long been in danger, and why so many companies have put their knowledge bases online and run product newsgroups. It's much cheaper to have support staff monitor a newsgroup two hours a day than to wait by the phone, in terms of the number of customers that will be helped during that time, because others with the same question (or who may have the same question in the future) will see the answer. Actually, that's how The Answer Gang works too....

There are exceptions. The author of MetaKit (http://www.equi4.com/metakit/index.html), a non-SQL database server, offered unlimited free technical support, although I assume it was e-mail support rather than phone support. He did it because he wanted to hear how clients were using the product and what kinds of problems they encountered: he considered that his payment because it helped him improve the product. I'm not sure whether he still offers this--the web page now points users with questions to a mailing list. But there's obviously an upper limit on the number of customers you can offer "free unlimited support" to.

Linux is complex enough that the price really needs to be higher to support all the included software.


John Kawakami

True, although this is more a responsibility of the distributions that market to newbies than a responsibility of the Linux community as a whole.

On the other hand, Linux could do okay in the corporate desktop, where in-house helpdesks keep people away from the "free" tech support you get from the vendors. (It's not free if you're paying someone to wait on tech support.) The simpler Linux apps are easier to "fix" when errant users make mistakes, and with VNC, the service can be done remotely. Plus, overall stability pays off with fewer internal support staff.

---- John Kawakami

If the in-house help desks know Linux. Often, the only people who know Linux are the IT staff who run the servers. -- Iron


Good attitude!

Tue, 1 Jan 2002 14:50:04 -0500
mike
linux-questions-only (linux-questions-only@ssc.com)

Regarding: LG 74, 2c Tips #26

I really like the attitude expressed by the whole answer gang, and a subtle rtfm after the question is answered is a good thing, I think. Before the answer it's a provocation, afterwards it becomes good advice. Happy New Year,

Mike List


Mountpoint permissions

Thu, 03 Jan 2002 21:42:34 -0500
Rick Holbert

Use chown, chgrp and chmod to change the owner, group and permissions on the mount point.

Err, no. The querent actually stated that he tried those; I'm willing to believe him (the same situation obtains when you mount a VFAT partition; the owner/perms of the mount point are irrelevant.) I don't have a Samba setup at hand right now, and it's been a while since I had to do one, but I'm pretty certain that Mike Martin's suggestion - setting the "uid/gid" parameters in the conffile - is the right thing to do. -- Ben


Sorry / Saludos

Tue, 8 Jan 2002 08:44:56 +0100
Andres Legarra

Sorry!!

I got mixed up when I clicked on the message I wanted to reply to (this awful M$ Outlook Express...). By the way, I found some things on Linux Gazette very useful.
Congratulations

You write good Spanish!!
Regards

Andres Legarra Albizu


attn: Ben Okopnik et al

Fri, 11 Jan 2002 22:33:00 -0800 (PST)
Mather Cotton

http://www.linuxgazette.com/issue63/okopnik.html

That url saved my ass. Thank you so much!

Cotton


Tux' Gender

We got two messages on this topic.


re: Lady Penguins

Date: Wed, 02 Jan 2002 04:50:22 -0500
Rachel Rawlings

That might refer to Linus' original comment that penguins are happy because they have just stuffed themselves full of herring or have been hanging out with lady penguins. We only /know/ that Tux is stuffed full of herring, but we can assume Tux hangs out with lady penguins. -- Heather

Which actually doesn't say definitively whether Tux is male. Tux could hang out with lady penguins cf. Marlene Dietrich, or be a high-class drag king. ;>

However, speaking as a dyke with a largish stuffed animal collection (one of whom is a female Peter Rabbit named Katja) my Tux is male. Other users' Tuxen may vary according to the needs of the user, much like their kernel configurations.

Interesting. I wonder if Eric Raymond's enhanced kernel configurator will have a question for which sex your kernel should be built as. -- Mike


All the Girls like him

Date: Fri, 18 Jan 2002 11:26:17 +0100
patrick.op.de.beeck

But, we couldn't publish his very cute note because it was marked confidential. Sorry folks! -- Heather


This page edited and maintained by the Editors of Linux Gazette Copyright © 2001
Published in issue 75 of Linux Gazette, February 2002
HTML script maintained by Starshine Technical Services, http://www.starshine.org/

More 2¢ Tips!


Send Linux Tips and Tricks to


pseudo-chroot

Fri, 4 Jan 2002 09:34:18 -0500
trevor

hi,

in issue 74 there's a question from Faber Fedor asking about how to set up an environment so that a user can't wander from their home directory.

i believe the person asking the question was looking for something along the lines of a restricted shell. tell the person asking the question to look at the "-r" option to bash, smrsh, and/or do a google search for "restricted shell".
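
A quick way to see what the "-r" option does, as a minimal sketch that assumes bash is installed under that name. It only checks that a restricted shell refuses cd; the variable name is illustrative:

```shell
# bash -r starts a restricted shell: cd, changing PATH, and redirecting
# output with > are among the things it refuses to do.
if bash -r -c 'cd /tmp' 2>/dev/null; then
    result="cd allowed"
else
    result="cd refused"
fi
echo "$result"
```

Note that a restricted shell limits the shell itself, not the programs it can launch, so it is a fence rather than a jail.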

best regards,
trevor


See LILO only when you need it

Fri, 04 Jan 2002 13:21:36 -0800
John R. Jones

Hello gazette,

Being a new Linux administrator, I had "hardened" my install by adding a "restricted" and a "password=<pass>" entry to my /etc/lilo.conf file to keep people just as dangerous as myself out of single-user mode.

I also rem'd out the timeout= value so my install would always boot straight into Linux.

My question for the day was "how could I boot Linux Single if I had to?" A Boot and Root set would work, but I discovered this...

After the BIOS Mem check, hold down either control key and the LILO boot "screen" is displayed! And of course, you'd need the password=<value> to use it...

Wow, Now I am scary on 2 platforms. :)

-- Thank you,

John R. Jones

3, if you count that he's an Oracle DBA. -- Heather


Active Directory...

Fri, 04 Jan 2002 01:38:57 -0600
John Lederer

OpenLDAP is the Linux equivalent of Active Directory.

Regards.
John

There's been enough small-comment interest in this, it would probably be good to see an article on the subject of setting up this sort of environment the Linux way. -- Heather


CSS2? Try XML and its kin instead

Sun, 30 Dec 2001 22:18:56 -0500
XunDog

Ok,

If the feature is unique to CSS2 then you won't replicate it with CSS1 and cross-platform browser support for either is restrictive ...

so .... I would suggest using Xml, xsl, xslt and either DTD or xsd schema formats ...

this is more completely supported ... just a little (maybe a lot) more work ... Check out the books by Benoit Marchal ...

regards
XunDog


Linux with win2000

Mon, 14 Jan 2002 13:56:00 -0500 (COT)
nadeem
answered by John Karns (The Answer Gang)

Anybody please tell me about installation of linux with win2000. I already installed linux 7 on my pc. now i want that without format my system i install win2000 on my pc.

any body pls give me any utility. don't tell me FAQ. this is boring for me. if anybody wants help me out than pls provide me utility.

If you find reading FAQs boring, I don't think you're going to like Linux too much.

Three recommendations:

For disk partition manipulation:

  1. fips or
  2. Partition Magic (there are others, but these are two I've used)

For installing and running Windows (MSW) with Linux:

  3. VMWare

It would be nice to be able to avoid MSW entirely, but since my work demands it, using VMWare allows me to run it without having to reboot and leave the Linux environment.

-- John Karns


Cable Modem Setup

Wed, 2 Jan 2002 10:26:11 +0100
Eugene Poole
answered by Yann Vernier and Mike Orr (The Answer Gang)

On January 3, 2002 I'm having an external cable modem installed. I've been looking around for some simple suggestions on what needs to be done, configuration-wise, to my Linux machine. Can you help? Naturally, the normal statement has been made - "We don't support Linux". The Linux machine that it's being connected to has a second NIC installed and I've accessed the machine via the second NIC, so that's all set up. Where do I go from there?

We can't know the next step until you have the instructions for how to connect using the cable modem. If you are using Debian GNU/Linux, a simple way to prepare for running a masquerading gateway is to install the ipmasq package, but we don't know if you need PPPOE, DHCP, or some special login methods. A useful resource may be http://www.cablemodeminfo.com/LinuxCableModem.html

Good luck! -- Yann

The extra Ethernet card should be all you need. Beyond that, just follow the Windows dialogs in the manual and see whether it's dhcp or a static IP, which nameservers to put in /etc/resolv.conf, etc.

Yann is right about setting up masquerading if you have a local network. I don't think of that as "setting up a cable modem" though. That's another step, connecting a local network to the Internet.

Be glad you have an external modem. It would be much harder to set up if it were internal, because it would probably require some proprietary DLL that isn't available for Linux. -- Mike


read a timestamp... the EASY way

Wed, 2 Jan 2002 23:19:54 -0500
Joe Smith

I was looking for a solution to extract the timestamp of a file with plain shell methods.

... (Lots of all-too-complicated suggestions followed)

What's wrong with

date -r file

Which only goes to show that we really need a friendly way to query the vast obscurity which is Unix documentation... sigh.

<Joe

<laugh> Bravo! Well done, sir!

This illustrates the point that I often make to folks just learning Unix: the tools are in there, somewhere. It's finding them that's the problem. -- Ben

......... the original querent replies .........

Indeed.

Especially when some of your man pages are out of date. In my case,

date --help

would have given the solution, while

man date

just keeps this secret. Sob.

-- Regards, Fakir
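
A minimal sketch of date -r in use, assuming GNU date and GNU touch (the -r FILE and -d options behave differently on some other systems). The scratch file and variable names are illustrative:

```shell
# Create a file with a known modification time, then read it back.
stamp_file=$(mktemp)
touch -d '2002-01-02 23:19:54' "$stamp_file"

# -r FILE prints the file's last modification time instead of "now";
# the +FORMAT argument works exactly as it does for the current time.
mtime=$(date -r "$stamp_file" '+%Y-%m-%d %H:%M:%S')
echo "$mtime"
rm -f "$stamp_file"
```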


How to manually label a tape in linux

Wed, 9 Jan 2002 10:18:29 +0530
FRANCO FERNANDES
Answered by Jay Ashworth (The Answer Gang)

I manually backup my linux server every day for that i need to put a label on my tape according to the date, I backup my server. Does anyone know how to manually label a tape in linux is there any command for doing that.

Please help
Thanks in Advance
Franco.F

Well, my approach to this is to create a directory called /tmp/TIMESTAMP, and, just before you make a backup, clear out all the files, then use

touch /tmp/TIMESTAMP/`date +%Y%m%d-%a%H%M%S`

This will give you a label for the backup which you can read without having to actually load any data.
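
Put together, the steps might look like the sketch below. It uses a scratch directory from mktemp as a stand-in for /tmp/TIMESTAMP so that it can be tried safely; the variable names are illustrative:

```shell
# Stand-in for /tmp/TIMESTAMP so the sketch does not touch a real directory.
stamp_dir=$(mktemp -d)

# Clear out any label left over from the previous run.
rm -f "$stamp_dir"/*

# Create one empty file whose name is the label, e.g. 20020109-Wed101829.
label=$(date +%Y%m%d-%a%H%M%S)
touch "$stamp_dir/$label"

ls "$stamp_dir"
```

Presumably the directory is then written at the front of the backup, so that listing the start of the archive shows the label; that ordering is an assumption here, not something spelled out above.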

Cheers, jra


Problem faced while using script to backup

Wed, 9 Jan 2002 10:26:14 +0530
FRANCO FERNANDES
answered by Dan Wilder (The Answer Gang)

I have created a automated script to backup my server for that i want my log file to display the date it backsup my server every day. My script has this line ,


echo " BACKUP OF fileserver STARTED " >> /var/log/bkuplogs/fileserver/mainlog

Is there any parameter which has to be put like %m %h %d. Any kind of help will be highly appreciated

Try

echo " BACKUP OF fileserver STARTED $(date +'%c') " >> whatever

See

man date

for other format strings. -- Dan Wilder
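
As a slightly fuller sketch of the same idea: the helper below stamps every log line, using a sortable format instead of %c. The function name log_event, the FINISHED message, and the scratch log file are all hypothetical; a real script would point logfile at /var/log/bkuplogs/fileserver/mainlog.

```shell
# Scratch file standing in for the real log.
logfile=$(mktemp)

# Append one line to the log, prefixed with the current date and time.
log_event() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> "$logfile"
}

log_event "BACKUP OF fileserver STARTED"
log_event "BACKUP OF fileserver FINISHED"
cat "$logfile"
```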


Posters for [LG 72] help wanted #7

Fri, 28 Dec 2001 11:36:12 +0100
Yann Vernier, Chris Gianakopoulos, Jim Dennis

Brian Keyse

I feel I must recommend O'Reilly's "Anatomy of a Linux System" poster. It is a large, colourful poster giving a rough overview of how things fit together and recommending (O'Reilly, of course) books.

Their address is http://www.ora.com but I didn't find the poster in their product list; it is probably promotional material which you'll have to ask them for. -- Yann

It's available as a PDF file:
ftp://ftp.oreilly.com/pub/poster/oreilly_linux_poster.pdf
-- Brian Koyse

I saw some sort of a thing like that for Linux. Is it 3 or 4 feet in diameter and it shows the ring structure of the operating system? You know..., the kernel in the middle, with the applications at the outer ring? If that's the thing, it's kinda cool. I think that I am gonna get one of those. -- Chris G.

Hmmm. The one I saw was just of the Linux kernel sources. Core memory management and scheduler in the center and VFS and core networking support forming a second tier, with filesystems and specific device drivers on the periphery. That one was a sort of a fractal star or "peacock." -- JimD

I'll have to look at the chart when I go back to work next week. There's a book entitled "The Design of the Unix Operating System" by Maurice Bach. The poster that I saw, for Linux, looks like the structure on the cover of that book.

Regards, Chris G.




(?) The Answer Gang (!)


By Jim Dennis, Ben Okopnik, Dan Wilder, Breen, Chris, and... (meet the Gang) ... the Editors of Linux Gazette... and You!
Send questions (or interesting answers) to the Answer Gang for possible publication


Contents:

¶: Greetings From Heather Stern
(?)How does one examine a core file

(¶) Greetings from Heather Stern

... you stand there waiting for Heather to look up from her keyboard...

Oh! Hi everybody! It's certainly been an active month here with The Answer Gang. We had almost 700 slices of Gazette related mail come past my inbox. The longest thread (not pubbed this month, look forward to it next time) was over 50 messages long. Fewer than 20 people got no answer whatsoever (not counting the occasional spammer), and the top reason for not getting a post answered appeared to be simply a lack of interest in that message. Crazy attachments are down a LOT since our sysadmin improved the filters. Ben did a bit more cleanup on the TAG FAQ and Knowledgebase and we have a new posting guidelines page which I hope you find easy to read.

In the land of Linux I'm pleased to note that the 2.4 series kernel is resembling stable since 2.4.17 is over a month old now. A lot of work is being done in 2.5.

Flu struck my area and melted my mind back down to a mere single CPU when I'm used to being an SMP system. Bleh! And before you ask ... yes, I'm feeling better. Lots of liquids, chicken soup, all that.

It appears as though Ghostscript is my evil nemesis of the month. I haven't had time to finish compiling support for that new color printer of mine. In a moment of foolishness I upgraded my Dad-in-law's box and the next few days were completely nuts since kword and gs refused to agree on what fonts to print, or even to get the metrics right so margins would work. They're happy again since I forced ghostscript to uninstall completely and then reinstall. And we still wonder what the heck happened to gnucash in Debian/Woody, though I admit, I haven't looked very hard.

Cheerfully for my mortgage I've had a lot of consulting work this month. Between 600 plus messages and all that, though, there wasn't time for me to fit the usual ten pack (this blurb and nine of the juiciest TAG threads) in under a tighter than usual deadline. Mike will be enjoying a Python conference much of this next month. I hope it counts for a well deserved vacation on his part.

I've not left you completely wanting, though. Here's a few days in the life of The Answer Gang, troubleshooting one of those day to day things that drives everybody nuts once in a while -- segfaults.

Core files are a mess. Good thing we have a dustbin around here.


(?) How does one examine a core file

From Faber Fedor

Answered By Jim Dennis, Dan Wilder, John Karns,
with side comments from Ben Okopnik and Heather Stern

I've got a problem with a RH7.1 machine and no error messages to look at, so I'm wondering how does one debug a problem like this?

Moved a machine from NY to NJ yesterday. When I left it last night, everything was running, esp. Apache. This morning, normal maintenance occurred at 4:02 AM, and when the system (syslog?) went to restart httpd, the restart failed. It's been failing ever since too!

The only http related message in /var/log/messages is


Dec 22 12:27:13 www httpd: httpd startup failed

Access and error logs for httpd are empty.

Running /usr/sbin/httpd (with and without command line parms) generates the message


Segmentation fault (core dumped)

and the requisite core file:


core: ELF 32-bit LSB core file of 'httpd' (signal 11), Intel 80386, version 1, from 'httpd'

File size and date of /usr/sbin/httpd matches my local copy.

Any ideas where to look next?

-- Regards, Faber

Jim Dennis pontificates about troubleshooting apache's startup... -- Heather

(!) [JimD] First, I would run /etc/init.d/httpd or /etc/init.d/apache, or whatever it is on your system. Run it with the "start" option.
(Actually I'd read the /etc/init.d/ start script for that service, and probably I'd manually go through it to figure out what I needed to do in order to run this particular installation of Apache correctly).

(?) Did that. That's what I meant by "it crashed at the command line with and without parameters".

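(!) [JimD] To dig further I might rename the real binary to /usr/sbin/httpd.real and replace httpd with a short "strace wrapper" script:
#!/bin/bash
# Trace the real httpd and its children, logging to /tmp/apache/strace.out
exec strace -f -o /tmp/apache/strace.out /usr/sbin/httpd.real "$@"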

(?) This definitely goes into my bag of tricks (once I decode it :-))

(!) [JimD] (be sure to mkdir /tmp/apache, and make it writable to the appropriate UID/GID --- whatever the webserver runs as).
I'd look through the strace.out file for clues. Don't leave this running in this fashion for too long. The strace.out files will get huge very quickly, and your performance will suffer a bit.
Considering that it used to work, you did a shutdown, moved the system, brought it back up, and then, presumably, CONFIGURED IT FOR A NEW NETWORK, I'd look very carefully at network masks, routes and related settings.

(!) Very close! The problem turned out to be that the name server the box was using is no longer accessible (the box is there, but dig returns "no name servers were found") and there were no backup name servers in /etc/resolv.conf (mea culpa).

I wouldn't have expected apache to segfault under those conditions, but it did.
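
A small check for that particular trap, as a sketch: count the nameserver lines in a resolv.conf-style file. The function name, the scratch file, and the address 192.0.2.53 (from a range reserved for documentation) are all illustrative:

```shell
# Count the "nameserver" lines in a resolv.conf-style file
# (defaults to /etc/resolv.conf when no file is given).
count_nameservers() {
    grep -c '^[[:space:]]*nameserver[[:space:]]' "${1:-/etc/resolv.conf}"
}

# Demonstration on a scratch file with a single name server.
sample=$(mktemp)
printf 'search example.com\nnameserver 192.0.2.53\n' > "$sample"

n=$(count_nameservers "$sample")
if [ "$n" -lt 2 ]; then
    echo "only $n name server listed: no fallback if it goes away"
fi
```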

(!) [JimD] Also, consider upgrading to RH7.2 if you can.

(?) [Faber] I just got my hands on it earlier this week so I'm still evaluating it.

(!) Red Hat's distribution has been very consistent in its release history: avoid the .0, skip the .1, and wait for the .2; that's been the rule since 4.2!

(?) [Faber] Normally, that's what I do, but we needed to upgrade to PHP4 ASAP and it was a lot easier to upgrade the whole system to 7.1 (from 6.2).

thanks again!
Regards, Faber

(!) [JimD] You're welcome.

... while Dan took a different approach, considering the core file itself. -- Heather

(!) [Dan] 0) Start by making sure there's no error in your httpd.conf by running
apachectl configtest
No doubt there's nothing there. But if there is, you are not apt to find it by examining core files, etc.
If you're an expert C developer

(?) [Faber] At one point in my life, I might have said that, but then only to impress women like Heather. ;-)

(!) [Dan] I don't expect Heather's that easily impressed. Especially by guys like me that mistype "developer".
That's ok, I fixed it. That's what editors are for, at least sometimes. I'm more impressed by how people solve problems than by whether they're an expert in everything around them. It's nice if they can solve my problems, though. -- Heather
(!) [Dan] and have the source tree to your apache handy, examining the core file might yield you something.

(!) [Faber] IOW, no, I don't want to do that. :-)

(!) [Dan] Naah, me neither. Last resort.
(!) Mostly it's pretty indirect. Segfaults are typically caused by out-of-bounds pointers or array references, references to allocated memory since freed, confusion about number or type of parameters passed to a function, and the like. The error happens earlier, when the bad pointer is parked someplace, memory is erroneously freed, etc. The fault happens later, when something is dereferenced.
I've spent many a happy and well-paid hour trying, sometimes without success, to track backwards from fault to error. And when you find the error, you may still face a long and winding road back to the defect which caused the error.
Defect ----------------> Error ----------------> Fault

(Improper code           (Something bad          (Result becomes observable
 construct)               happens)                as unexpected result)
Unless you're an expert C developer, and patient and lucky as well, it's more likely you'll find the problem by a process of elimination.
1) What's changed recently? New application? Change in httpd.conf? New module installed? Try backing out any recent changes, one by one. Restart apache after each thing you back out.
2) Is it possible there's filesystem corruption? Corrupted binaries often fail to run well. Take the machine down and run
fsck -f
on all filesystems. If you find anything amiss, determine what files were affected.
3) Reinstall apache just in case, anyway.
4) Could the machine have other hardware problems? If you have the kernel development packages installed, build the kernel eight or ten times. If you get "died with signal 11" or other abnormal termination, proceed with hardware troubleshooting procedures.
5) Figure out what area of apache is affected. Save your httpd.conf and start with a default one. Will apache start? If so, re-introduce features from the running copy of httpd.conf a few at a time until apache begins dying at startup.
Let us know how you do. Depending on where you find trouble, the gang can offer further advice. -- Dan Wilder
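
For the repeated-build test in step 4, a loop that stops at the first failure saves some babysitting. This is only a sketch: the function name is made up, and the kernel build command in the comment is an assumption about a 2.4-era source tree, not something to paste blindly.

```shell
# Run a command up to COUNT times, stopping at the first failure.
repeat_until_failure() {
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        if ! "$@"; then
            echo "run $i of $count failed" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $count runs succeeded"
}

# Hypothetical use for step 4, from the top of a kernel source tree:
#   repeat_until_failure 10 sh -c 'make clean && make bzImage'
repeat_until_failure 3 true
```

A build that dies at a different place on each run points toward hardware; one that always dies at the same file points back toward software.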

Jim has quite a bit to say about using strace -- Heather

#!/bin/bash
exec strace -f -o /tmp/apache/strace.out /usr/sbin/httpd.real "$@"
(!) [JimD] It runs a shell (bash) which then exec()s (becomes) a copy of the strace command. That strace command is told to "follow forks" (so we can trace the system calls of child processes) and writes its output to a file in our /tmp/apache directory. strace then runs (fork()s then exec()s) a copy of the "real" httpd with a set of arguments that matches those that were passed to our script.
The distinction between exec()'ing a command and invoking it in the normal way is pretty important. Normal command invocation from a UNIX shell involves a fork() (creating a clone process which is a subshell) and then an exec*() by that shell to transform that subprocess into one which is running the target command.
Meanwhile the parent shell process normally does a wait*() on the child. In other words, it sits there, blocked until the child exits, or until a signal is received.
When we use the shell exec command, it prevents the fork() (no subprocess is created). The "text" (executable binary code) of the process that was running a copy of your shell (/bin/bash in our case) is overwritten by the "text" of the new program; all of the heap and stack segments (memory blocks) of the old process are freed and/or cleared, and the only traces of the old memory image that remain available are the contents of the process' environment. In other words, the exec command is a wrapper around one of the exec*() system calls (there are several different versions of the exec*() call, which differ in the format of their arguments and in the preservation/inheritance versus creation of environments).
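
The difference is easy to see by printing process IDs. A minimal sketch, assuming bash; the variable names are illustrative:

```shell
# With exec: the shell is replaced in place, so both lines show the same PID.
with_exec=$(bash -c 'echo $$; exec bash -c "echo \$\$"')

# Without exec: the shell forks a child and waits for it, so the PIDs differ.
# (The trailing "true" is there because some bash versions may otherwise
# turn the last command of a -c string into an exec on their own.)
without_exec=$(bash -c 'echo $$; bash -c "echo \$\$"; true')

printf 'with exec:\n%s\n' "$with_exec"
printf 'without exec:\n%s\n' "$without_exec"
```

The first pair of numbers is identical; the second pair is parent and child.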
Actually, on Linux execve() is the one real system call in the family; libc/glibc provides all of the other variations as wrappers around it. (It is fork() that is implemented on top of the clone() system call.) The three "variables" on these exec variations are:
format of the command argument list:
either C varargs, like printf() and friends (execl*), or a pointer to a NULL-terminated array of NUL-terminated strings (execv*).
 
environment handling:
whether the process keeps its current environment or overwrites it. The execle() and execve() versions take an extra parameter pointing at a NULL-terminated array of NUL-terminated strings.
 
path searching:
The first argument of the execvp() and execlp() functions can be a simple command basename, while all other variations require a qualified path. The "p" versions will search the PATH as a shell would.
It appears that you can either search the PATH or create a new environment, but not both, and you can use a simple execl() or execv() to do neither. The exec(3) manual page, in the library functions section of your online docs, has even more details.
When I'm teaching shell scripting I spend a considerable amount of time clarifying this worm's eye view of how UNIX and the shell handle fork()s and exec*()s. I draw diagrams representing the memory space and environment of a process, and another of a child process (connected by dotted lines labeled "fork()"). Then I crosshatch most of the memory space --- leaving the environment section, and label that exec*().
When I do this, people understand how the environment really works. The "export" shell command moves a shell variable and its value from the local heap "out" to the environment region of memory. Once they really understand that, they won't get too confused when a child process sets a shell variable, exports it, and then their original process can't see the new value. ("export" is more of a memory management operator than an inter-process communications mechanism; at best it is a one-way IPC, copying from parent to children).
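
The one-way nature of export can be shown in a few lines. A minimal sketch, assuming bash; the lg_ variable names are made up for the demonstration:

```shell
# A plain shell variable lives only in this shell's own memory.
lg_local="not exported"
# export copies the variable into the environment that children inherit.
lg_shared="exported"
export lg_shared

# The child sees only the exported variable...
child_view=$(bash -c 'echo "${lg_local:-<unset>} / $lg_shared"')
echo "child sees: $child_view"

# ...and whatever the child exports never travels back to the parent.
bash -c 'lg_shared="changed by child"; export lg_shared'
echo "parent still has: $lg_shared"
```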
After that I generally have to explain about some implicit forms of sub-process creation (forking) that most people miss. In particular I remind them that pipes are an *inter-process* communications channel. So, any time you see or use a | operator in the shell, you are implicitly creating a subprocess. That's why a command like:
unset bar; echo foo | read bar; echo $bar
(!) [Ben] Oh, that's cute. I go through pretty much the same spiel - some of it admittedly cribbed from your description of this, because I liked it the first time I heard it - but the way I've been demonstrating it is with a
while read bar; do echo $bar; done < file
loop. This nails down the other end. Very cool.
(Scribbling notes in newly acquired Palm Pilot)
(!) [JimD] ... will return an empty value in most shells. The read command is executed in a subprocess which promptly exits, freeing the memory that held its copy of the bar variable/value pair. (I say most shells because ksh '93 and zsh create their subprocesses on the left hand side of their pipe operators. That's one of those subtle differences among shells. Personally I think bash and others do it wrong; the ksh/zsh semantics are superior and I hope bash 2.x or 3.x will adopt them, or offer a shopt, shell option, to select the desired semantics).
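
Here is the pipe case next to one that keeps read in the current shell, as a sketch that assumes bash with its default options (ksh and zsh would give a different first result, as noted above). The variable names piped and redirected are illustrative:

```shell
unset bar
# read runs in a subprocess on the right-hand side of the pipe, so the
# value it stores disappears when that subprocess exits.
echo foo | read bar
piped=${bar-}

# Feeding read from a here-document involves no pipe and no subprocess,
# so the variable is set in the current shell.
read bar <<EOF
foo
EOF
redirected=$bar

echo "after pipe: '$piped'  after here-document: '$redirected'"
```

Much later bash releases (4.2 and up) did gain a shopt for this, lastpipe, which runs the last element of a pipeline in the current shell when job control is off.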
The "$@" ensures that the arguments that were passed to us will be preserved in count and contents. If we used "$*" we'd be passing a single argument to our command. That single argument would contain the text of all of the original arguments, concatenated as one string, separated by spaces (or by the first character from IFS if you believe the docs). If we used $* (no soft quotes) we'd be having the current shell resplit the arguments --- they'd have the same contents, but any arguments that had previously had embedded spaces (or other IFS characters) would be separated accordingly.
The "$@" handling is the most subtle part of this script. An unquoted $@ would be be the same as an unquoted $* (as far as I can tell). It is just the "$@" that gets the special handling. ($* and "$*" aren't special cases, they are expanded and split in the normal way; "$@" is expanded and sort of "internally requoted" to preserve the $# --- argument count).
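The difference is easy to check by counting arguments. This is only a sketch; count and demo are hypothetical helper names, and the default IFS is assumed except where it is set explicitly:

```shell
#!/bin/sh
count() { echo $#; }            # report how many arguments we were given

demo() {
    quoted_at=$(count "$@")     # one word per original argument
    quoted_star=$(count "$*")   # everything joined into a single word
    bare_star=$(count $*)       # re-split on IFS: embedded spaces break words
    echo "quoted @: $quoted_at   quoted *: $quoted_star   bare *: $bare_star"
}

demo "one two" three "four five six"

# "$*" joins with the first character of IFS, here a colon.
joined=$(IFS=:; set -- a b c; echo "$*")
echo "joined: $joined"
```

With the three arguments shown, the counts come out as 3, 1 and 6, and the last line prints a:b:c, so the docs are right about IFS.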
If you were going to need to do this frequently we might write a "strace.wrapper.sh" shell script which would work a bit like this:
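 #!/bin/bash
 ## usage: strace.wrapper.sh command [args...]
 [ $# -ge 1 ] || { echo "usage: $0 command [args...]" >&2; exit 2; }
 OLDMASK=$(umask)
 umask 077
 TMPDIR="/tmp/$(basename "$1")$$"
 mkdir "$TMPDIR" || exit 1
  ## make a temporary directory or die
 umask "$OLDMASK"
 TARGETCMD="$1"
 shift
 exec strace -f -o "$TMPDIR/strace.out" "$TARGETCMD" "$@"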
In this example we call strace.wrapper.sh with an extra argument, the name of the command to be "wrapped." We then fuss a little with umask (to ensure that our process' output will have some privacy from prying eyes) and do an atomic "make a private dir or die trying". (This is the safest temp file handling that can be managed from sh, as far as I know.)
Then we restore our umask, (so we don't create a Heisenbug by challenging one of our target command's hidden assumptions about the permissions of files it creates). We then grab our target command, shift it off our argument list (which does NOT disturb the quoting of the remaining arguments) and call our strace command as before --- with variables interpolated as necessary.
Mind you I don't use this script. I don't bother since I can do it about as easily by hand. Also this script wouldn't be the best choice for CGI, inetd launched, or similar cases. In those cases we're better off renaming the original binary.
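The "make a private dir or die trying" step can be tried on its own. The sketch below assumes a POSIX-style sh; the directory name demo.$$ is invented for the illustration and is not what the wrapper uses:

```shell
#!/bin/sh
# Hypothetical scratch directory name, for illustration only.
scratch="${TMPDIR:-/tmp}/demo.$$"

old_mask=$(umask)
umask 077                       # anything we create is owner-only
if mkdir "$scratch"; then       # mkdir fails if the name already exists,
    first_try=ok                # so nobody can pre-plant the directory
else
    first_try=failed
fi
umask "$old_mask"               # give the caller's umask back

# A second mkdir of the same name must fail: this is the "or die" case.
if mkdir "$scratch" 2>/dev/null; then
    second_try=ok
else
    second_try=failed
fi

echo "first mkdir: $first_try   second mkdir: $second_try"
ls -ld "$scratch"
rmdir "$scratch"
```

Because mkdir either creates the directory or fails, there is no window between "check whether it exists" and "create it" for someone else to exploit.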

Of course we were all happy when Faber found what it was! We encouraged him to send in his bug report -- Heather

I wouldn't have expected apache to segfault under those conditions, but it did.

(!) [JimD] Report it as a bug (after upgrading to the latest stable release). Try to isolate the .conf directive(s) that are involved, if possible.
(!) [Dan] ... The error happens earlier, when the bad pointer is parked someplace, memory is erroneously freed, etc. The fault happens later, when something is dereferenced.

(?) Well, as I told Jim, the fact that it couldn't find a name server caused it to segfault. Weird; you would have thought it would have exited with a message at least.

(!) [John K] It sounds like there's a bug or some abnormality with apache's handling of a situation which it doesn't expect in normal operation. IOW, a problem with error handling. If the apache version is not the latest stable version, you might want to consider upgrading. If it is the latest, then you may want to consider reporting it to the apache developers.

...and of course we congratulated him on his success, with some extra thoughts on general troubleshooting. -- Heather

(!) [Dan] Congratulations on solving the problem.
That's what I call the "natural history approach". Examine carefully the behavior and habitat of the creature in question, and think carefully about what you've observed.
I've probably fixed a lot more bugs in my life by the natural history method, than I have by the method of examining core files, or for that matter running under a debugger or emulator.
Strace, mentioned separately in this thread, is a little harder to classify. A program that attaches itself to a running process and dumps out information about system calls, it affords a level of information about a program that may sometimes come close to what you'd see using a debugger.
Mostly it doesn't, but sometimes it provides that key observation not available by other means which allows us to finally come to grips with a bug. I'd group it with natural history tools, perhaps as an analog to a radio collar. You know where the animal's been, but maybe not why, or what it did there. -- Dan Wilder
(!) [JimD] I like to use the classic "OSI reference model" as a rough troubleshooting sequence. Keep going down the stack (from application, down through network, to the physical layer) until you isolate the problem, then proceed back upwards, correcting each problem until the application works.


This page edited and maintained by the Editors of Linux Gazette Copyright © 2001
Published in issue 75 of Linux Gazette February 2002
HTML script maintained by Heather Stern of Starshine Technical Services, http://www.starshine.org/

"Linux Gazette...making Linux just a little more fun!"


News Bytes

Selected and formatted by Michael Conry

Submitters, send your News Bytes items in PLAIN TEXT format. Other formats may be rejected without reading. You have been warned! A one- or two-paragraph summary plus URL gets you a better announcement than an entire press release.


 February 2002 Linux Journal

The February issue of Linux Journal is on newsstands now. This issue focuses on Small Office/Home Office (SOHO).

All articles through October 2001 are available for public reading at http://www.linuxjournal.com/magazine.php. Recent articles are available on-line for subscribers only at http://interactive.linuxjournal.com/.


Legislation and More Legislation


 Jon Johansen Indicted by Norwegian Authorities Regarding DeCSS

Unhappy news this month, as it emerged that Jon Johansen has been indicted by Norwegian authorities for his part in creating and distributing the DeCSS code. This comes two years after he and his father were first taken from their home in connection with the same software. The initial report is available in Norwegian, and a translation was posted in the Slashdot discussion of the story.

It appears that the case against Jon is unusual in that he is being charged under laws which are generally applied in cases involving breaking into computers and theft of electronic records or company files. Pressure from the MPAA and the US entertainment industry appears to have encouraged the Norwegian authorities to try this experimental attempt to secure a conviction.

The Electronic Frontier Foundation have extensive resources on this case. Particularly interesting are some legal arguments as to why no offence has been committed under Norwegian law and transcripts including Jon Johansen's testimony at the 2600 Magazine trial in New York under the DMCA (July 20, 2000).

A mailing list has also been set up to discuss issues concerning the case, including how to support Jon and how to protest against the indictment.

The sorry truth is that cases like this are likely to become more common in the future. Governments internationally are harmonising their intellectual property laws through measures such as the WIPO copyright treaty which will come into force in March (having recently secured its 30th signatory). The result will be that all countries might eventually enact legislation akin to the DMCA to protect the media multinationals' intellectual property and access-control technologies. Countries attempting to resist this trend will not be well received. Slashdot reported recently that Ukraine is subject to US trade sanctions for not using an "optical media licensing regime" for blank CDs and CD recorders. The best way to resist at an individual level is to make your voice heard and start lobbying and writing letters. Your local LUG could form a focus for this activity.


 Support From Washington

Congressman Rick Boucher has been receiving a lot of press lately for the position he has taken with regard to issues such as digital rights management and the DMCA. Dotcom Scoop recently reported that Congressman Boucher has written to the RIAA expressing his concern at the introduction of copy-protected compact discs. He feels that such developments "...may prevent or inhibit consumer home recording using recorders and media covered by the Audio Home Recording Act of 1992". A report on the same story in The Register, however, indicated that the copy protection measures probably are legal. It seems that though the record label cannot sue you for making a legitimate personal copy of your new CD, they are not obliged to make it easy for you! ZDNet has reported [Reuters] that Boucher is planning to introduce a bill that would eliminate the "anti-circumvention" clause of the DMCA. It is certainly encouraging to see an elected representative taking an overtly pro-consumer line on these issues.

Another elected representative who seems to understand a thing or two is Rep. Darrell Issa, a member of the US House of Representatives' Judiciary Committee. Speaking to Linux Journal's Don Marti, he indicated that the SSSCA was "dead on arrival". Though this is encouraging, it might be foolish to get too relaxed until the grave is actually occupied. Don comments that Issa also seemed well informed on other issues in this area (DMCA, etc.).

Perhaps when campaigning on issues of concern, it would be wise to be alert to good as well as bad news. Elected representatives' careers are based on achieving public support and they can be very sensitive to public opinion. It could not hurt to mail guys like Boucher and Issa to tell them if you like what they are doing.


 UCITA

LWN reported that UCITA is back again. The main issue for the free software community would be that UCITA, if it came into US law in its current form, would prohibit the distribution of software to consumers without warranty. This would mean that by distributing a free software utility, you could be held responsible by consumers for any flaws in the product (even though you have disclaimed all warranties, etc.). This story was also reported by TheRegister, who linked to an article by Richard Stallman on "Why We Must Fight UCITA".


 Legislative Links

Indianapolis' attempt to keep minors from playing violent video games in public arcades was ruled unconstitutional, at a cost of $318,000 to taxpayers.

NY Times review of the year in tech law, which makes a nice lead in to their preview of what might be to come. Both articles feature the input of various experts from the field, and both require registration.

Essay on cryptome.org by Mike Godwin on digital rights management and the battle between computer companies and entertainment companies. (Courtesy Crypto-Gram)


Linux Links

Jun Jungho mailed to announce a LG Korean translation site at http://www.whiterabbitpress.com/lg/. He and fellow volunteers have tested this site for 5 months, and would now like to inform others. "I wish that this site gives more fun & infomation to Korean Linuxers."

ASCII: American Standard Code for Information Infiltration by Tom Jennings. A very interesting, and in-depth article. Covers history of ASCII, and its various developments over almost half a century.

Courtesy crypto-gram is a link to a review [pdf] of the year in vulnerabilities. This contains a list of all the operating systems and applications with vulnerabilities.

Newsforge has a story on one person's experiences with Gentoo Linux, a distribution that requires the user to start the installation by compiling new compilers. In a similar vein, DistroWatch have a review of Sorcerer GNU Linux, which again compiles much of the system from source during install.

ZDnet asks `is Linux ready for the desktop?', while Cio.com tell us how to run a Microsoft-free shop.

Linux Journal have looked back over the problems exposed in SSH during the past year, and the solutions which have resulted.

TheRegister's Thomas Greene reported on getting superior benchmarks for Quake-3 FPS on Linux as opposed to Windows. Hardly a scientific test, but nice to see nonetheless.

Scientific American article on really bad patents. If you find those interesting, you might like to look at IBM's new patent for a toilet reservation system highlighted by Hartmut Pilch on the patents mailing list at aful.org.

What to do after a computer break-in.

The Washington Post have an interesting article by Lawrence Lessig entitled "Who's Holding Back Broadband". It appears issues of control loom large in this area, with media companies loath to take any move which might loosen their grip on the "content industry". Embracing broadband would be just such a move.

IBM has published two whitepapers on security issues relating to "Linux in Enterprise Systems" (and we are not talking about Klingons off the starboard bow). Both are PDFs, and quite large. IBM appears to be strengthening their support for Linux. Slashdot reported that IBM's new $400,000 Z-series mainframe will not be sold with z/OS, but rather with Linux.


Upcoming conferences and events

Listings courtesy Linux Journal. See LJ's Events page for the latest goings-on.


LinuxWorld Conference & Expo (IDG)
January 30 - February 1, 2002
New York, NY
http://www.linuxworldexpo.com/

The Tenth Annual Python Conference ("Python10")
February 4-7, 2002
Alexandria, Virginia
http://www.python10.com/

Australian Linux Conference
February 6-9, 2002
Brisbane, Australia
http://www.linux.org.au/conf/

Internet Appliance Workshop
February 19-21, 2002
San Jose, CA
http://www.netapplianceconf.com/

Internet World Wireless East (Penton)
February 20-22, 2002
New York, NY
http://www.internetworld.com/events/weast2002/

Intel Developer Forum (Key3Media)
February 25-28, 2002
San Francisco, CA
http://www.intel94.com/idf/index2.asp

COMDEX (Key3Media)
March 5-7, 2002
Chicago, IL
http://www.key3media.com/comdex/chicago2002/

BioIT World Conference & Expo (IDG)
March 12-14, 2002
Boston, MA
http://www.bioitworld.com/

Embedded Systems Conference (CMP)
March 12-16, 2002
San Francisco, CA
http://www.esconline.com/sf/

CeBIT (Hannover Fairs)
March 14-22, 2002
Hannover, Germany
http://www.cebit.de/

COMDEX (Key3Media)
March 19-21, 2002
Vancouver, BC
http://www.key3media.com/comdex/vancouver2002/

FOSE
March 19-21, 2002
Washington, DC
http://www.fose.com/

Game Developers Conference (CMP)
March 19-23, 2002
San Jose, CA
http://www.gdconf.com/

LinuxWorld Conference & Expo Singapore (IDG)
March 20-22, 2002
Singapore
http://www.idgexpoasia.com/

Software Solutions / eBusiness World
March 26-27, 2002
Toronto, Canada
http://www.softmatch.com/soln20.htm#ssebw

SANS 2002 (SANS Institute)
April 7-9, 2002
Orlando, FL
http://www.sans.org/newlook/home.htm

LinuxWorld Conference & Expo Malaysia (IDG)
April 9-11, 2002
Malaysia
http://www.idgexpoasia.com/

LinuxWorld Conference & Expo Dublin (IDG)
April 9-11, 2002
Dublin, Ireland


Internet World Spring (Penton)
April 22-24, 2002
Los Angeles, CA
http://www.internetworld.com/events/spring2002/

O'Reilly Emerging Technology Conference (O'Reilly)
April 22-25, 2002
Santa Clara, CA
http://conferences.oreillynet.com/etcon2002/

Software Development Conference & Expo, West (CMP)
April 22-26, 2002
San Jose, CA
http://www.sdexpo.com/

Networld + Interop (Key3Media)
May 7-9, 2002
Las Vegas, NV
http://www.key3media.com/

Strictly e-Business Solutions Expo (Cygnus Expositions)
May 8-9, 2002
Minneapolis, MN
http://www.strictlyebusiness.net/strictlyebusiness/index.po?

Embedded Systems Conference (CMP)
June 3-6, 2002
Chicago, IL
http://www.esconline.com/chicago/

USENIX Annual (USENIX)
June 9-14, 2002
Monterey, CA
http://www.usenix.org/events/usenix02/

PC Expo (CMP)
June 25-27, 2002
New York, NY
http://www.techxny.com/

O'Reilly Open Source Convention (O'Reilly)
July 22-26, 2002
San Diego, CA
http://conferences.oreilly.com/

USENIX Security Symposium (USENIX)
August 5-9, 2002
San Francisco, CA
http://www.usenix.org/events/sec02/

LinuxWorld Conference & Expo (IDG)
August 12-15, 2002
San Francisco, CA
http://www.linuxworldexpo.com

LinuxWorld Conference & Expo Australia (IDG)
August 14 - 16, 2002
Australia
http://www.idgexpoasia.com/

Communications Design Conference (CMP)
September 23-26, 2002
San Jose, California
http://www.commdesignconference.com/

Software Development Conference & Expo, East (CMP)
November 18-22, 2002
Boston, MA
http://www.sdexpo.com/


News in General


 Euro Support

As many of you have surely noticed, the euro became a real paper and coins currency on the first of January 2002. Being able to type the euro symbol is now something which will be necessary for very many computer users. The Debian Project have released the Debian Euro HOWTO by Javier Fernández-Sanguino Peña which details how to enable support for the symbol in your Linux system. Much of the advice will be of use to users of distributions other than Debian.

Long-term, the best solution may be a move towards Unicode. This is particularly the case when interoperability with Windows systems is required.


 Athlon/Duron and Linux Bug

A bug in AMD's Athlon family of processors has been reported on TheRegister, following an earlier revelation by Gentoo Linux. The issue relates to extended memory paging sizes and is a bug in the processor, not the kernel. Those using Linux 2.4 kernels and AGP may experience problems with memory corruption. The fix is to pass the option "mem=nopentium" to the kernel at boot-time (via GRUB or LILO). Gentoo have a good description of the situation on their main webpage at the moment, and an analysis of how this was neglected for so long (since September 2000!).
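With LILO the option would normally go in an append="mem=nopentium" line in /etc/lilo.conf (followed by re-running lilo), and with GRUB at the end of the kernel line. After rebooting you can check whether it took effect. The sketch below assumes /proc is mounted; has_boot_option is a hypothetical helper written for this example, not a standard tool:

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel command line contains an option.
has_boot_option() {          # usage: has_boot_option "command line" option
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# /proc/cmdline holds the options the running kernel was booted with.
cmdline=$(cat /proc/cmdline 2>/dev/null) || cmdline=""
if has_boot_option "$cmdline" "mem=nopentium"; then
    echo "mem=nopentium is active on this boot"
else
    echo "mem=nopentium is not set on this boot"
fi
```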


 Linux Adoption

TheRegister.co.uk recently reported that Korea is to convert 120K civil servants to Linux desktop use. This appears to be as much a fightback by local favourite Haansoft (producers of Hancom Linux, and HancomOffice) as a victory for Linux, but it is still good news.

In a separate development, NewsForge reports that Red Hat India is helping to introduce GNU/Linux as part of a scheme to meet the software needs of the Indian education system. The program will include not only software, but also free training to help get the scheme off the ground.

Spinning the globe again, this time to China, we see more penguins on the march. Linux Today have a report that Linux is making an impression on many in China. Apparently the Chinese Academy of Sciences have published a report highlighting the savings which could be achieved by using Linux as an alternative to Microsoft solutions. This follows a Gartner report that Microsoft recently lost out on a major IT investment in China, while indigenous firms including Red Flag Linux were favoured.


 Penguin Art

A new issue of TUX (Terminator Unit X) online comic is now available at: http://www.thelinuxreview.com/TUX/. The reports of TUX's death have been greatly exaggerated.

Also in the artistic vein, IBM have updated their Linux Cartoons page. Flash or Real Player required.


 Linux Trojan Found

qualys.com have announced that they have discovered a Linux Trojan in the wild. This follows qualys's discovery of a very similar Linux trojan last year. This story was also picked up by Newsbytes.com, and from there Slashdot got in on the act. To be infected, you must execute the trojan as root, so there is likely to be a need for some sort of social engineering in getting this one to propagate. The main risk would be if a binary in a Linux distribution became infected, since most people trust the binaries on their install media. At the very least, this is another very good reason to be very very careful what you do as root.


 DOSSIER, Documentation Source

DOSSIER is a convenient new way to get printed documentation for Free and Open Source software. Current topics include "Email", "File Systems", "Kernel", "PostgreSQL", "Python", and "Text". The demand-printed volumes may be ordered from BSDMall. The motivation and rationale for DOSSIER are covered in "DOSSIER and the Meta Project (Part 1)", in Daemon News.


Distro News


 BrlSpeak

BrlSpeak is a new mini-distribution of Linux that comes with support for braille and speech built-in. The objective is to offer an easy-to-install solution for blind persons who wish to install a Linux distribution on their computer without any assistance from a sighted person. BrlSpeak provides a built-in preconfigurer so that you can preconfigure the BrlTty Makefile before starting Linux. Compilation and automated activation of the braille device is the next step, and is performed when booting the distribution. BrlSpeak is based on Matthew Campbell's ZipSpeak mini-distribution, which is why it contains the SpeakUp screen reader for supporting speech synthesizers. BrlSpeak is available in many languages. To download it, visit the BrlSpeak Project Home Page.

Author: Osvaldo La Rosa, freely distributable, UMSDOS mini-distribution, size: 36MB, available as: zip or iso, website: en, fr, nl. Any contributions welcome!


 Debian

Debian GNU/Linux 2.2r5 has been released. This fifth revision adds security updates and some bug fixes to the stable `potato' release. A list of FTP and HTTP mirrors is available at http://www.debian.org/distrib/ftplist. Point apt (see the sources.list(5) manual page) at an up-to-date mirror and then run apt-get update; apt-get upgrade. The complete list of all accepted and rejected packages, together with rationale, is on the preparation page for this revision.

It is a good idea to keep an eye on http://security.debian.org/ or to subscribe to the debian security announce mailing list. There have been quite a few security announcements in the past month.


Debian Weekly News reported that new "Debian on CD" Web Pages have been launched. These replace the old pages on cdimage.debian.org, which "were often criticised by visitors of the website". The new pages feature improved documentation. Apart from an extended FAQ, they offer direct download links for CD images, a list of CD vendors, artwork, and info on jigdo, the new distribution scheme for downloading CD images from any normal Debian mirror.


Linux Today highlighted a report on the size of Debian 2.2, which includes more than 55,000,000 physical SLOC. The COCOMO model estimates that developing Debian 2.2 would cost close to $1.9 billion USD.


Also highlighted by Linux Today was this bugreport, which comments on vulnerability notification and the Debian Social Contract. "Over the past few months, the GNU/Linux community has slowly adopted a way of dealing with security issues which closely resembles the approach suggested by Microsoft last year: more-or-less systematic hiding of security problems from end users, at least for some time. Some Debian maintainers seem to participate in this process, and hold back security fixes, waiting for events to happen which are external and not related to the Debian project (for example, other distributors being ready to publish fixes)."


 Mandrake

Linux Planet have started a 'Month Later' addition to their Distribution Watch section. The first distro to receive this second look is Mandrake 8.1. The review discusses the process of getting settled in and smoothing out the routine bumps and curves of this distribution.


 Red Hat

The Washington Post Washtech.com site has reported that AOL Time Warner is in talks to buy Red Hat. Everything is very vague ("fluid" appears to be the official term), so it is difficult to know what the chances are of such a deal actually coming off. Andrew Orlowski of TheRegister is somewhat sceptical about the rumours. He also makes some good comments about what the wider implications of such a deal could be.


Software and Product News


 GUI Based DSSSL/XSLT DocBook Tool Released

Command Prompt is pleased to announce the release of DocPro 0.2.0. DocPro is a tool for professional technical authors who maintain a large amount of SGML/XML based documentation. DocPro will take any DocBook document and transform it into a user-defined format (PostScript, HTML, etc.).

DocPro will correctly transform multiple documents to multiple output formats. It includes the capability to arbitrarily set font sizes, margins, callout definitions, etc. via a GUI.

DocPro currently runs on x86 Linux only, though there will be a release for YellowDog Linux (PPC) and MacOS X shortly. The Deluxe version of DocPro comes with the popular DocParse tools for converting HTML to DocBook.


 Adobe GoLive 6 Integrates Zend PHP Debugger

Adobe Systems will include Zend's PHP Debugger in its new release of GoLive 6, its flagship product for Web site development. This will give GoLive developers integrated access to advanced PHP debugging for their toughest applications and dynamic Web sites using scripting languages.


 CxProtect

CxProtect is an AntiVirus Solution for Linux Mail Servers. It is a binary-based solution that uses the Command AntiVirus API. The software offers detection and disinfection of attachments being transported via the Linux Mail Server. The only change required to the existing Sendmail.cf is to register CxProtect as the MDA. Post-install configuration is done via a web browser interface.

Download available at http://www.calibretechnologies.com/downloads/CxProtect.tar.gz


 Mahogany 0.64 Released

A new release of Mahogany has been made. Mahogany is an OpenSource cross-platform mail and news client, available for X11/Unix and MS Windows platforms. It supports many of the internet protocols and standards, including POP3, IMAP4, SMTP and NNTP. Mahogany also supports MIME and many common Unix mailbox formats.

Source and binaries for a number of Linux and Unix systems, as well as binaries for Win32, are now available.


Copyright © 2002, Michael Conry and the Editors of Linux Gazette.
Copying license http://www.linuxgazette.com/copying.html
Published in Issue 75 of Linux Gazette, February 2002

"Linux Gazette...making Linux just a little more fun!"


Secure Printing with PGP

By Graham Jenkins


The Brother Internet Print Protocol

A recent article "Internet Printing - Another Way" described a printing protocol which can be used with some Brother printers. It enables users of Windows machines to send a multi-part base-64 encoded print file via email directly to a Brother print server.

The article went on to show how the functionality of the Brother print server can be implemented in a simple Perl program which periodically polls a POP3 server to check for jobs whose parts have all arrived. When such a job is detected, its parts are downloaded in sequence and decoded for printing.

A subsequent article "A Linux Client for the Brother Internet Print Protocol" showed a simple client program which can be used on Linux workstations for sending print jobs to a Brother print server. That program was implemented as a shell script which split an incoming stream into parts and placed them in a scratch directory for subsequent encoding and transmission.
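For readers who have not seen that earlier script, the split-into-a-scratch-directory idea looks roughly like this. It is a sketch written for this explanation, not the script from the earlier article; the 16 KB part size, the file names and the zero-filled sample data are all assumptions:

```shell
#!/bin/sh
# Sketch only: split an incoming stream into fixed-size parts on disk.
scratch=$(mktemp -d) || exit 1

# 40 KB of sample data stands in for a print file arriving on stdin.
head -c 40960 /dev/zero | split -b 16k - "$scratch/part."

total=$(ls "$scratch" | wc -l | tr -d ' ')
n=0
for part in "$scratch"/part.*; do
    n=$((n + 1))
    size=$(wc -c < "$part" | tr -d ' ')
    echo "part $n of $total: $size bytes"
    # A real client would base-64 encode and mail each part at this point.
done
rm -rf "$scratch"
```

Note that the total part count is known before anything is sent, because all the parts are already sitting on disk. That is exactly the convenience an on-the-fly client has to give up.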

I have since developed a Perl client program which processes the incoming stream on-the-fly and requires no temporary storage. This is, of course, a much neater way to do things. The down-side is that there is no way of ascertaining the total part-count until the last part is being processed. A slight modification to the server program was therefore required to accommodate an empty "total-parts" field on all except the final part.

A Hole Big Enough to Drive a Truck Through

The whole arrangement as outlined above has been in use at my place for several months, and has saved us a whole lot of time and trouble. However, as pointed out by one reviewer, what we really have here is a security hole big enough to drive a truck through! Anybody in the whole wide world can send celebrity pictures to your color printer, and there's not a lot you can do about it.

Somebody else asked why we go to the trouble of splitting a large job into parts without first trying to compress it. And indeed there are a great number of jobs whose size can be significantly reduced through compression.

Then there were the Windows (and other) users, who thought that everything should be written in Perl for portability. And the Standards Nazis, who thought that the job parts should be sent as 'message/partial' entities in accordance with RFC 2046.

Who's Printing Pamela Anderson Pictures?

Of all the issues outlined above, the most serious is indubitably that of client authentication. And the solution is blindingly obvious; why not use one of the Public Key Encryption mechanisms now available? What we need here is for the sender to digitally sign the entire message using his private key. Upon receipt at the server, the message can then be authenticated by application of the sender's public key. There's no need for any secret key-entry rites at the server, so the whole server operation can be automated.

A message signed in this fashion can be signed in 'clear' form; the message itself is then sent as is, with a digital signature appended to its end. If you elect not to use 'clear' signing, the message will (if usual defaults are accepted) actually be compressed and the signature will be incorporated therein. This comes pretty close to what we need!

There is a set of Perl modules (Crypt::OpenPGP) which can perform the necessary signature and verification procedures, so we can actually write the entire client and server programs in a portable form. I had some difficulty with installing these, since they require that a number of other modules be installed, and they require the 'PARI-GP' mathematics package. I elected instead to use pgp-2.6.3ia; GnuPG-v1.0.6 will also work with the programs in this article.

There are a couple of Perl modules (Crypt::PGPSimple and PGP::Sign) which can be used to call pgp-2.6.3ia and its equivalent executables, but each of them creates temporary files, and that's something I try to avoid where possible.

Appeasing the Standards Nazis

RFC 3156 ("MIME Security with OpenPGP") describes how the OpenPGP Message Format can be used to provide privacy and authentication using MIME security content types. In particular, it decrees that after signing our message by encrypting it with our private key, we should send it as a 'multipart/encrypted' message. The first part should contain an 'application/pgp-encrypted' message showing a version number in plain-text form; the second part should contain our actual PGP message.

This is a bit over-the-top, but the overhead is small, and the whole deal is easily done using the Perl MIME::Lite module, as shown in the 'SEPclientPGP.pl' program hereunder.

So how do we send a long message which needs to be broken into parts for passage through intermediate mail servers? RFC 3156 tells us we should use the MIME message/partial mechanism (RFC 2046) instead! I think what they actually mean is "as well". So our output from 'SEPclientPGP.pl' is actually fed into the 'SplitSend.pl' program (also hereunder) which extracts the message "To:" and "Subject:" lines and replicates them into each sequentially numbered 'message/partial' component that it generates.

The Client Program

Here's the client program. It's pretty much self-explanatory. A pipe to the 'SplitSend.pl' program is opened for output. If the passphrase is supplied on the command-line (dangerous, but sometimes necessary!), it is planted in an environment variable.

The multipart MIME message as previously described is then constructed, taking its second body part from a pipe fed by the PGP executable. If the executable doesn't find a suitable passphrase in the appropriate environment variable, it requests it in a terminal window.

#!/usr/local/bin/perl -w
# @(#) SEPclientPGP.pl  Secure Email Print client program. Ref: RFC 3156.
#                       Takes incoming stream and generates PGP-signed message
#                       which is piped to split-and-send program for email
#                       transmission to server. Requires 'pgp' program.
#                       Graham Jenkins, IBM GSA, Dec. 2001. [Rev'd 2001-12-30]

use strict;
use File::Basename;
use MIME::Lite;
use IO::File;
use Env qw(PGPPASS);

die "Usage: ".basename($0)." kb-per-part destination [passphrase]\n".
    " e.g.: ".basename($0)." 16 lp3\@pserv.acme.com \"A secret\" < report.ps\n".
    "       Part-size must be >= 1\n"
  if ( ($#ARGV < 1) or ($#ARGV > 2) or ($ARGV[0] < 1) );

my $fh = new IO::File "| /usr/local/bin/SplitSend.pl $ARGV[0]";
if( defined($ARGV[2]) ) {$PGPPASS=$ARGV[2]}
if( ! defined ($PGPPASS)) {$PGPPASS=""} # Plant passphrase in environment and
my $msg = MIME::Lite->new(           # create signed message.
                To      => $ARGV[1],
                Subject => 'Secure Email Print Job # '.time,
                Type    => 'multipart/encrypted');
$msg->attr  (   "content-type.protocol" => "pgp-encrypted");
$msg->attach(   Type    => 'application/pgp-encrypted',
                Encoding=> 'binary',
                Data    => "Version: 1\n");
$msg->attach(   Type    => 'application/octet-stream',
                Encoding=> 'binary',
                Path    => "/usr/local/bin/pgp -fas - |");
$msg->print($fh);                    # Pipe the signed message into a
__END__                                 # split-and-send program.

Split-and-Send

Here's the split-and-send program. The main loop at the end works just as described above - extract the destination and subject fields, accumulate lines until we are about to exceed the message-size limit supplied as a parameter, then feed what we have to an output routine.

The output routine needs to re-insert the destination and subject fields, and also insert a message-identifier, part-number and total-part-count. The total-part-count is only required on the final part. All fairly easy - except we don't know whether the current part is the final part until we look for the next part. So we get around this by using a double-buffer arrangement, where we don't actually output a buffer's contents until we have the next buffer.

Using MIME::Lite in this program is really overkill; what it does accomplish, however, is finding an appropriate mailer program on whatever platform it executes on.

#!/usr/local/bin/perl -w
# @(#) SplitSend.pl     Splits and sends an email message (Ref: RFC 1521, 2046).
#                       Graham Jenkins, IBM GSA, December 2001.

use strict;
use File::Basename;
use MIME::Lite;
use Net::Domain;
my ($Id,$j,$Dest,$Subj,$part,$InpBuf,$OutBuf,$Number,$Total);

die "Usage: ".basename($0)." kb-per-part\n".
    "       Part-size must be >= 1\n" if ( ($#ARGV != 0) or ($ARGV[0] < 1) );

$Id=(getlogin."\@".Net::Domain::hostfqdn().time) or $Id="unknown_user".time;
$Number = 0; $Total = ""; $OutBuf=""; $InpBuf=""; print STDERR "\n";

sub do_output {                         # Output subroutine.
  die basename($0)." .. destination undefined!\n" if ! defined($Dest);
  $Subj = ""                                      if ! defined($Subj);
  if ($OutBuf ne "") {                  # If output buffer contains data, 
    $Number++;                          # increment Number, and check whether
    $Total=$Number if $InpBuf eq "";    # it is the last buffer.
    print STDERR "Sending part: ", $Number,"/",$Total,"\n";
    $part = MIME::Lite->new(
              To      => $Dest,              # Construct a message containing the
              Subject => $Subj,              # output buffer contents.
              Type    => 'message/partial',
              Encoding=> '7bit',
              Data    => $OutBuf);
    $part->attr("content-type.id"     => "$Id");
    $part->attr("content-type.number" => "$Number");
    $part->attr("content-type.total"  => "$Total") if ($Number eq $Total);
    $part->send;                     # Send the message.
  }
  $OutBuf = $InpBuf;                    # Move input buffer contents to
  $InpBuf = ""                          # output buffer and exit.
}

while (<STDIN>) {                 # Main loop.
  if ( (substr($_, 0, 3) eq "To:")      && (! defined($Dest)) ) {
    $Dest = substr($_, 4, length($_) - 4); chomp $Dest; next }
  if ( (substr($_, 0, 8) eq "Subject:") && (! defined($Subj)) ) {
    $Subj = substr($_, 9, length($_) - 9); chomp $Subj; next }
  if ( (length($InpBuf . $_)) > ($ARGV[0] * 1024) ) {do_output}
  $InpBuf = $InpBuf . $_
}
foreach $j (1,2) {do_output}            # Flush both buffers and exit.
__END__
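
If you want to try the double-buffer idea on its own, here is a minimal sketch of it, separated from the mailing logic. The names 'split_parts' and '$limit' are my own inventions for illustration and do not appear in 'SplitSend.pl'; the sketch returns labelled parts instead of sending them, and the limit is in bytes rather than kilobytes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split lines into chunks of at most $limit bytes and
# label each chunk with a number; the total is filled in on the last one only.
sub split_parts {
    my ($limit, @lines) = @_;
    my @parts;                      # finished [number, total, data] records
    my ($pending, $current) = ("", "");
    my $number = 0;

    # Emit the pending buffer; it is the final part only if nothing follows.
    my $flush = sub {
        my ($is_last) = @_;
        return if $pending eq "";
        $number++;
        push @parts, [ $number, ($is_last ? $number : ""), $pending ];
        $pending = "";
    };

    for my $line (@lines) {
        if ($current ne "" && length($current . $line) > $limit) {
            $flush->(0);            # something follows, so not the last part
            $pending = $current;
            $current = "";
        }
        $current .= $line;
    }
    $flush->(0) if $current ne "";  # pending is followed by $current
    $pending = $current;
    $flush->(1);                    # whatever is left is the final part
    return @parts;
}

for my $p (split_parts(12, "alpha\n", "beta\n", "gamma\n", "delta\n")) {
    printf "part %s/%s: %d bytes\n", $p->[0], $p->[1], length $p->[2];
}
```

The point of holding one chunk back is that a chunk is only written out once we know whether another chunk follows it, which is exactly what decides whether the total-part-count has to be attached.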

The Art of Jigsaw Assembly

There is no guarantee that the segments of our print-job will arrive at the server in the same order as they left the client. We cannot be sure that there will even be the same number of segments, since message-transfer agents along the way are allowed to re-assemble message/partial entities as they see fit. So what we have at the server end is a set of jigsaw puzzles, with the pieces of each puzzle being related by a common message-identifier, and their placement within that puzzle being determined by their part-numbers.

For a full listing of 'SEPserverPGP.pl', see the attached text version. I haven't bothered to replicate all of it hereunder, since much of it is the same as the program shown in "Internet Printing - Another Way".

Basically, the program is intended for invocation via an entry in '/etc/inittab', and loops continually thereafter, with half-minute pauses between each loop. During each loop, it visits the mailboxes of one or more printer-entities on a POP3 server, and deletes any stale articles therein before tabulating the message-id's and part-numbers of the remaining articles. When it finds a full set of message/partial entities, it sucks each of them in part-number sequence from the server, and throws their contents into a pipe. The program-extract hereunder shows what happens then.

The relevant message content is deemed to begin at the "-----BEGIN.." line in the first part. For subsequent parts, it begins after the first blank line once an "id=.." line has been seen.

Once in the pipe, the composite message content passes to the PGP executable for validation/decryption, and thence to an appropriate printer. Validation output is passed to a scratch file, and then recovered from there for logging. A validation failure results in no output to the printer.

          for ($k=1;$k<=$tp{$part[0]};$k++){ # Check if we have all parts.
            goto I if ! defined($slot{$part[0]."=".$k});
          }                                     
          $fh=new IO::File
           "| /usr/local/bin/pgp -f 2>$tmp | lpr -P $user >/dev/null" or goto I;
          for ($k=1;$k<=$tp{$part[0]};$k++){ # Assemble parts into pipe. 
            $message=$pop->get($slot{$part[0]."=".$k});
            $l=0; $buffer=""; $print="N";
            while ( defined(@$message[$l]) ) {
              chomp @$message[$l];              # Part 1: start at "-----BEGIN",
              if( $k == 1 ) {                   # stop before 2nd blank line.
                if( @$message[$l]=~m/^-----BEGIN/ ) { $m=-2;  $print="Y"}
                if( $print eq "Y" ) {
                  if( @$message[$l] eq "" ) { $m++; if( $m >= 0)   {last} } 
                  $buffer=$buffer.@$message[$l]."\n"
                }
              }                                 # Part 2,3,..: skip 1 blank line
              else {                            # after "id=", then start; stop
                if( $print eq "Y" ) {           # before next blank line.
                  if( @$message[$l] eq "" )                        {last} 
                  $buffer=$buffer.@$message[$l]."\n"
                }
                if( @$message[$l]=~m/id=/ )                  {$print="R"}
                if((@$message[$l] eq "") && ($print eq "R")) {$print="Y"}
              }
              $l++;
            }
            print $fh $buffer or goto I;
          }
          $fh->close || goto I;
          open $fh, $tmp;
          while (<$fh>) { chomp; syslog('info', $_) }
          close $fh;
          for ($k=1;$k<=$tp{$part[0]};$k++){
            $pop->delete($slot{$part[0]."=".$k})
          }
          goto I;
        }
J:    } 
    }
I:}
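
The "do we have all the pieces yet?" test at the top of that extract can be looked at in isolation. What follows is only a sketch under my own assumptions: 'complete_sets' is a hypothetical helper, and the part headers are handed to it as ready-made [id, number, total] triples rather than being fetched from a POP3 server.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: given [id, number, total] triples (total may be undef
# on every part except the last), return the ids whose parts are all present.
sub complete_sets {
    my (@seen) = @_;
    my (%slot, %total);
    for my $part (@seen) {
        my ($id, $number, $total) = @$part;
        $slot{$id}{$number} = 1;
        $total{$id} = $total if defined $total;
    }
    my @done;
    for my $id (sort keys %total) {     # a set with no known total cannot be complete
        my @missing = grep { !$slot{$id}{$_} } 1 .. $total{$id};
        push @done, $id unless @missing;
    }
    return @done;
}

my @mailbox = (
    [ 'jobA', 2, 3 ], [ 'jobB', 1, undef ],   # arrival order is arbitrary
    [ 'jobA', 3, 3 ], [ 'jobA', 1, undef ],
);
print "ready: $_\n" for complete_sets(@mailbox);
```

Here 'jobA' is reported as ready because parts 1 to 3 are all present, while 'jobB' has to wait: its final part, the only one guaranteed to carry the total, has not arrived.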

Copycat Crime

In the scheme outlined above, there is nothing to prevent a determined trouble-maker replicating and replaying an entire authenticated message. To cover this possibility, you need to retain each log entry for a week or so, and to reject any incoming message having a corresponding signature and signature-date.
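
A rough sketch of such a replay check follows. It is not part of 'SEPserverPGP.pl': the helper name 'is_replay' is hypothetical, the signer and signature-date strings are assumed to have been extracted from the validation output by the caller, and the records are held in memory only, whereas a real server would need to keep them somewhere that survives a restart.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $RETAIN = 7 * 24 * 60 * 60;      # keep records for a week (assumed policy)
my %seen;                           # "signer|signature-date" => time first seen

# Hypothetical helper: true if this signer/date pair was already accepted
# within the retention period; otherwise records it and returns false.
sub is_replay {
    my ($signer, $sig_date, $now) = @_;
    $now = time unless defined $now;
    for my $key (keys %seen) {      # expire stale records first
        delete $seen{$key} if $now - $seen{$key} > $RETAIN;
    }
    my $key = "$signer|$sig_date";
    return 1 if exists $seen{$key};
    $seen{$key} = $now;
    return 0;
}

for my $attempt (1, 2) {
    my $verdict = is_replay('lp-user@acme.example', '2002/01/05 10:15 GMT')
                ? 'rejected as replay' : 'accepted';
    print "attempt $attempt: $verdict\n";
}
```

The second attempt is rejected because it carries the same signer and signature-date as the first.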

If, in addition, you wish to prevent someone from viewing the actual data travelling to your printer as it traverses the Internet, you need to change the PGP executable parameters at the client end so that the data is encrypted with the server's public key as well as signed; you will also need to feed a passphrase into the PGP executable at the server end.

GNU Privacy Guard

I have a mental image of somebody reading this and saying: "How come he's using pgp-2.6.3ia if he doesn't like un-necessary temporary files?" It's a good question, because pgp-2.6.3ia creates temporary files both during encryption and during decryption.

To get around this, or to comply with whatever laws are applicable in your country, you may wish to use GnuPG-v1.0.6 (or a later version) instead. In the client program, you will need to change the parameters with which the executable is called. And you won't be able to plant your passphrase in an environment variable.

I have attached for your interest a 'Lite' GPG client program which will execute on Windows machines with 'out-of-the-box' ActiveState Perl or IndigoPerl, and requires no extra modules.

During decryption to a pipe, the 'gpg' executable actually outputs data to the pipe until (and in some cases, after) it encounters a problem. So you will need to send your output to a scratch file - then send that scratch file to your printer if the decryption process completed satisfactorily.
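
One way of arranging that is sketched here, with the caveat that it is an illustration rather than the code from my server program. 'decrypt_then_deliver' is a hypothetical helper; the decryption command is passed in as a string, so the demonstration can use 'cat' and 'false' as stand-ins for a successful and a failed decryption, and the quoting of file names is deliberately simplistic.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: run $decrypt (a shell command reading $input_file),
# collect its output in a scratch file, and hand that file to $deliver
# only if the command exited with status zero.
sub decrypt_then_deliver {
    my ($decrypt, $input_file, $deliver) = @_;
    my ($fh, $scratch) = tempfile(UNLINK => 1);
    close $fh;
    my $status = system("$decrypt < '$input_file' > '$scratch'");
    return 0 if $status != 0;       # partial output never leaves the scratch file
    $deliver->($scratch);
    return 1;
}

my ($in, $job) = tempfile(UNLINK => 1);
print $in "pretend ciphertext\n";
close $in;
for my $cmd ('cat', 'false') {      # stand-ins for a good and a failed decryption
    my $ok = decrypt_then_deliver($cmd, $job, sub { print "would print $_[0]\n" });
    print "$cmd: ", ($ok ? "delivered" : "discarded"), "\n";
}
```

In real use, the first argument would be your 'gpg' invocation and the delivery routine would feed the scratch file to the printer.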

Graham Jenkins

Graham is a Unix Specialist at IBM Global Services, Australia. He lives in Melbourne and has built and managed many flavors of proprietary and open systems on several hardware platforms.


Copyright © 2002, Graham Jenkins.
Copying license http://www.linuxgazette.com/copying.html
Published in Issue 75 of Linux Gazette, February 2002

"Linux Gazette...making Linux just a little more fun!"


A Pioneer for a New Century -- Alan Turing, part 1

By
Originally published at System Toolbox. Reprinted with permission.


Last time, we took a look at the life and some of the achievements, and near achievements, of Charles Babbage, the Godfather of Computing. Babbage made great leaps in our understanding of what would become the field of computer science by considering, and then demonstrating, that mathematical processes could be carried out quickly, repeatedly and without error through mechanical means. This was such a simple idea, but it was ground breaking in its implications. Babbage had been frustrated by the errors that crept into the lookup tables that serious mathematicians used for their calculations. His drive to create calculating machines grew out of the desire to remove these errors from the process of creating those tables. Babbage was ahead of his time. He was a pioneer of the 19th century. If his work hadn't been rediscovered, his achievements would have been almost entirely forgotten by the time the idea of automatic calculations through machines began to take hold in the 20th century.

One of the proponents of such automatic, mechanical calculations was a mathematician at King's College, Cambridge: a young Alan Turing. It's almost a natural progression for this series to move from the cog wheel brains of Mr. Babbage to the theoretical thought machines of Alan Turing. Out of the necessity to answer one of the most critical mathematical questions of his time, Turing started down the road of what would become the fields of modern computer science and cryptography. As one of the few individuals whose achievements helped turn the tide of World War II, he is a hero. As the developer of some of the original ideas about digital computers, and for helping to settle Hilbert's final question of mathematics, he is a genius. Being human, his life was ultimately marked by complexity and, unfortunately, tragedy.

This article will focus on Alan Turing's life leading up to, and including, his invention of the "Turing Machine." Next month, we will tackle his achievements in cryptography during World War II, his ideas on the digital computer, and the controversial events that led to the tragic death of this hero, one of my heroes.

Early Signs of a Remarkable Mind

Alan Mathison Turing was born to Julius Mathison Turing, an Indian Civil Service officer, and Ethel Stoney on June 23, 1912 in Paddington, England. Alan's father was still on active commission in India and feared the risks of raising a family in the remote provinces over which he held jurisdiction. After Alan's birth, he decided to leave his family with friends in England rather than risk those uncertainties, making the trip back and forth between India and England himself.

Like Babbage (and many others in this field), Turing showed early signs of what I like to call the "personality disorder" that leads to such vocations as engineering and mathematics. Alan's natural inquisitiveness was often confused with mischief, where "planting" broken toys in hopes of resurrecting them was probably interpreted as "getting rid of the evidence." At a very early age, he is said to have taught himself to read in only three weeks, and his discovery of numbers brought about the distracting habit of stopping at every street light in order to find its serial number. At the age of seven, while on a picnic in Ullapool, Scotland, Alan had the idea of gathering wild honey for the afternoon's tea. By plotting the flight paths of the bees among the heather, he was able to find the intersection point that marked their hive and provide an unexpected treat for the family.

There's another anecdote that made an appearance in Neal Stephenson's spectacular work of fiction, The Cryptonomicon, in which Turing plays a supporting role. It seems that Alan had a bicycle that had a problem with its chain. He discovered that the chain would dislodge itself from the gears after a regular, repeatable, number of revolutions. At first, the young Alan would count the revolutions of the gears throughout his ride until it was time for the chain to be forced to derail. He would then get off his bike and re-adjust the chain. As this got to be cumbersome over longer treks, he finally rigged a mechanical device that would maintain the count and readjust the chain itself. Supposedly, it never occurred to him to just buy a new chain to solve the problem. I believe that it is more likely that the chain's issues presented a unique problem set for Turing's mind to solve. It challenged him to think in a different way. It was challenging and fun; buying a chain was not.

Getting an Education

At the age of six, Alan's mother enrolled him in a private day school, St. Michael's, in order for him to learn Latin. Thus began Alan's introduction into the system that would shape his intellectual and personal development for the next fourteen odd years. The English educational system would prove to be both a conflict and a collaboration with Turing's sensibilities. The collaboration is epitomized by his early respect for rules and their relationship to his concept of fairness. These ideas are probably best illustrated by an anecdote of his mother skipping part of The Pilgrim's Progress. Judging one section to be too theologically weighty for the youngster, she had skipped it while reading aloud in order to spare him. Alan objected and felt that the story was ruined; skipping parts, in his sensibility, was against the rules of reading.

The conflict in his relationship with the English school system was partially rooted in Alan's resolve that he was nearly always right. Personal opinions were held as closely as fact. He was one of those people who know things, rather than thinking, feeling or having an opinion about them. This type of mind set was definitely at odds with an education system built on tradition and firm in the belief that it knew what was best for its charges.

Early on, Alan was marked with the label of "genius" by the Headmistress of St. Michael's, a proclamation that would be echoed a few years later by a gypsy fortune teller. Despite such proclamations, Alan was required to follow the natural order of the English school system and, upon finishing his studies at St. Michael's, followed his brother's path to his next school, Hazelhurst and then to his first public school, Marlborough. Public school showed the ugly side of the English school system and Alan had his first troubles with bullies, proclaiming that he learned to run fast in order to "avoid the ball."

Brushes with Science

Alan was introduced to science through Edwin Tenney Brewster's Natural Wonders Every Child Should Know. Brewster's book sought to introduce topics that help children understand their place in the world: what they had in common with other living things, and how they differed from them. This discovery, and that of mathematics, would sustain Turing in a life-long love affair. The rules and discoveries of science and mathematics fit his general sensibilities of the world; it had order and could be explored with reason. Sense could be made of life if observed in the correct way. Brewster's book was probably the first to link the concepts of machine and biology in Alan's mind, explaining that the human body was a complex machine with complicated processes that carried out the duties and chores of maintaining life.

While school offered many torments, it also opened up a world of knowledge to the young Turing. He showed an early interest and ability in languages, especially French, and treated it as a code that would allow him to carry on covert communications. Also, having always had a fascination with various process-oriented activities, Alan was exposed to chemistry for the first time and fell instantly in love. Turing would go on to dabble in chemistry for the rest of his life, often co-opting family basements and guest rooms as chemistry labs. His habit of concocting various chemical solutions would later play a part in his untimely death as an adult.

Sherborne

At the age of 13, Alan was enrolled to attend the Sherborne boarding school. At the start of the school's summer term of 1926, England had just been brought to a standstill by the first day of the general strike. No buses or trains were running. Turing made something of a stir, being reported in the local newspaper, by bicycling the sixty miles from his home in Southampton to Sherborne, staying overnight in an inn at a halfway point.

Sherborne and Alan were not the best match. Sherborne, as many English schools of the time, was concerned with creating citizens and not scholars. The headmaster, at the time of Alan's enrollment, espoused the idea that school was originally created to be a miniature society. Students would learn to navigate the complexities of their later adult lives by learning to survive the power plays of their current public school life. Authority and obedience held more sway than the "free exchange of ideas" and the "opening of the mind." Not long after arriving, the already shy Turing became even more withdrawn.

Alan sought solace in his books and course work. In 1927, he was able to find the infinite series for the inverse tangent function, starting from the trigonometric formula for tan(x/2), without the aid of elementary calculus (Alan had yet to be exposed to it): tan^-1(x) = x - x^3/3 + x^5/5 - x^7/7 + ... It was a significant enough achievement to have his mathematics instructor include himself among the roster of people that had proclaimed the boy's genius. Such a proclamation didn't hold much sway with the school. While the accomplishment was extraordinary, Sherborne's headmaster, not a particular fan of science, felt Alan was wasting his time and was in danger of becoming a scientific specialist and not an educated man. This disrespect of science was not uncommon at the school. Alan's autumn form-master, a classicist who was enthralled with Latin, called scientific subjects "low cunning" and felt that the only reason the Germans lost World War I was that they placed too much faith in science and engineering and not enough in religious thought and observance.

Alan's dogged persistence in studying such low subjects finally earned him some respite. As long as he made a few concessions to the formalities of the school, he was left to his own devices. In 1928, he became enthralled with the theory of relativity and lost himself in the English translation of Einstein's Relativity: The Special and General Theory. Probably one of only a few sixteen year olds, if any, who actually understood Einstein's theories, Turing was able to fully grasp Einstein's doubts about the veracity of Galilei-Newtonian laws. He was even able to deduce Einstein's Law of Motion ("the separation between any two events in the history of a particle shall be a maximum or minimum when measured along its world line") from his readings alone (it wasn't specifically stated in the text). By 1929, Alan had begun to study quantum physics. It was a heady time as Schroedinger and others turned what was considered a "dead" science on its head. Schroedinger's quantum theory of matter was only three years old and Alan and his friend Christopher Morcom immersed themselves in these emerging discoveries. Alan was in his element.

King's College

Turing had originally planned on attending Trinity College at Cambridge. As far as he was concerned, it was the center of scientific and mathematical thought in England and he wanted to attend. After a number of failed attempts at passing his final examinations, more out of reluctance to engage with his "classical" work than anything else, he missed a scholarship to Trinity but was able to obtain one to King's, the college of his second choice.

King's College agreed with Alan. Though he was still somewhat of a social misfit, his studies and the freedom from the petty tortures of public school life allowed him to relax and find his rhythm. King's also turned out to be a good fit due to the caliber of its faculty. Turing's mathematics professor was one of the most distinguished mathematicians of his time, G.H. Hardy, who had recently left Oxford to take up the Sadleirian Chair at Cambridge. He was also among 85 other students engaged in scientific study, as compared to the one or two he had to seek out during his Sherborne days. As happens today with many high school geeks, college offered a chance for Alan to emerge from his protective shell and begin to engage the world on his own terms.

During the 20's, Cambridge had moved to establish itself as second in the world in the field of new maths. It had been able to stake this claim on the developments that its faculty and students were making in the realms of quantum theory and pure mathematics. It was widely regarded as second only to Göttingen University in Germany, a place that supported such genius as John von Neumann.

Von Neumann and Turing were to cross paths a number of times throughout their lives. In 1932, Turing read von Neumann's Mathematische Grundlagen der Quantenmechanik and was deeply affected by the text. His interest in quantum theory continued into the studying of the works of other luminaries like Schroedinger and Heisenberg. This exposure to the greats in an emerging field totally engaged the young Turing and set him to exploring the questions that their discoveries raised. It was this exposure and new found focus that put Turing on a collision course with Hilbert's Three Questions of Mathematics.

A Question of Mathematics and Turing Machines

In 1928, developments in pure mathematics seemed to be unraveling the foundations of the field. It seemed that the world was on the cusp of unlocking the very foundations of mathematics. It wouldn't be long before core axioms were nailed down and mathematics would be just a set of easily applied rules that would lead directly, inevitably to the solution of any problem. No problem would be beyond the reach of mathematics. Appropriately applied, mathematics would make the world a better place (sounds kind of like the commotion surrounding the Internet, doesn't it?).

It was during this period, in 1928, that Hilbert, already famous for his development of Hilbert spaces, posed a number of questions about the core of mathematics, whose unexpected answers would shake the field and push it into new realms of discovery and reason. Hilbert's agenda was to find a general algorithmic procedure for answering all mathematical inquiries, or at least to prove that such a procedure existed.

Three of those questions at the heart of his agenda were:

  1. Was mathematics complete? Meaning, could every assertion be proven or disproven with the rules of math?
  2. Was mathematics consistent? Meaning, could a false statement never be proven true with the rules of math?
  3. Was mathematics decidable? Meaning, were there definite steps that would prove or disprove an assertion?

While nobody, including Hilbert, had been able to offer solutions to these questions by proof in 1928, Hilbert was confident that the answer to each was yes. In his mind, there had to be a solution for every problem, if only to prove that it was unsolvable. This failed assertion, as bad as it sounds, would actually save mathematicians a lot of effort spent pursuing blind alleys. So, it was still a solution; it's a math thing.

The issue lay in proving that mathematics was complete, consistent, and decidable. At the same gathering, the young mathematician Kurt Gödel dealt a serious blow to this line of queries by showing that math must be incomplete, because there are assertions that can be stated but can be neither proved nor disproved. An assertion, encoded in the form of mathematics, that said, in effect, "this statement is unprovable" showed this disturbing (if you are into that sort of thing) property. An attempt to prove it true or untrue leads to contradiction. At least in the form of the question phrased by Hilbert, Gödel had proved that arithmetic was incomplete. There are nuances to this, of course, but it was still damaging. Gödel also showed that mathematics could not be proven both consistent and complete. However, he was not able to shake loose an answer to Hilbert's question as to the decidability of arithmetic.

Alan's professor Hardy, for one, was happy that Gödel couldn't topple Hilbert's final question. In his view, a mechanical process that could perform a solution to all mathematical problems would put every serious mathematician out of a job. Everything would have been done.

It was time for the student to instruct the teacher, at least in part. After a day of running, an activity that Alan found to nicely clear the mind, he stumbled onto the idea of a machine of simple, though improbable, design that could tackle any sort of problem put to it. The powerful machine would only understand the digits 0 and 1; the first binary computer. It would move a read/write mechanism across an infinite tape of these numbers and, based on their particular arrangement, solve various types of problems. Alan's breakthrough was that he had defined, in specific language, what a general algorithm actually was. The Turing Machine, as his construct would be called, was a thought experiment that helped codify the features of algorithms. During his exploration of the wonderful ideas that this machine inspired, Turing found that, despite the simple, general nature of his algorithm, there did exist problems that it could not solve. This discovery proved Hilbert's assertion incorrect: the answer to Hilbert's final question, the Entscheidungsproblem, was "no, mathematics is not decidable."
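
To make the tape-and-rules idea a little more concrete, here is a toy simulation. It is only a sketch under my own assumptions and not a reconstruction of anything Turing wrote: the rule table describes one small illustrative machine that adds one to a binary number, the names 'run_machine' and '%rules' are made up, and a blank symbol is used alongside 0 and 1 to mark unwritten cells.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Transition table for an illustrative machine that adds one to a binary
# number: {state}{symbol read} => [symbol to write, head movement, next state].
my %rules = (
    carry => {
        '1' => [ '0', -1, 'carry' ],   # 1 + carry = 0, carry moves left
        '0' => [ '1',  0, 'halt'  ],   # 0 + carry = 1, finished
        ' ' => [ '1',  0, 'halt'  ],   # ran off the left end: new digit
    },
);

# Hypothetical runner: the "infinite" tape is a hash keyed by cell position,
# with unwritten cells reading as blank.
sub run_machine {
    my ($input, $max_steps) = @_;
    my @cells = split //, $input;
    my %tape  = map { $_ => $cells[$_] } 0 .. $#cells;
    my ($state, $head) = ('carry', $#cells);    # start on the rightmost digit
    for (1 .. $max_steps) {
        last if $state eq 'halt';
        my $symbol = exists $tape{$head} ? $tape{$head} : ' ';
        my ($write, $move, $next) = @{ $rules{$state}{$symbol} };
        $tape{$head} = $write;
        $head  += $move;
        $state  = $next;
    }
    return join '', map { $tape{$_} } sort { $a <=> $b } keys %tape;
}

print "$_ + 1 = ", run_machine($_, 100), "\n" for qw(0 1 1011 111);
```

The step limit is there because a rule table need not ever reach its halting state, and the runner has no general way of knowing in advance whether it will.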

The young mathematician from King's College, Cambridge had bested one of the greatest mathematicians of his time at the age of 23. He gained a fair measure of acclaim for his achievement and the word "genius" began to be tossed around again. Had he done only this, he would be remembered in some history books and higher math students would get acquainted with him at some point. At any rate, a small amount of historical immortality, as obscure as it may be, would be granted in his memory. However, it was what he did next that changed the course of human history.

Next month, we will explore the workings of a Turing Machine and follow Alan into the war effort. We will see how a single man's true genius can turn the tide of war, and we will shake our heads in disbelief at a hero's humiliation and eventual death. Stay tuned.

------

© 2001 G. James Jones is a Microcomputer Network Analyst for a mid-sized public university in the midwest. He writes on topics ranging from Open Source Software to privacy to the history of technology and its social ramifications. This article originally appeared at System Toolbox (http://www.systemtoolbox.com). Please email and let me know where it is being used. This article is dedicated to the memory of Dr. Clinton Fuelling. Verbatim copying and redistribution of this entire article is permitted in any medium if this notice is preserved.


Copyright © 2002, G James Jones.
Copying license http://www.linuxgazette.com/copying.html
Published in Issue 75 of Linux Gazette, February 2002

"Linux Gazette...making Linux just a little more fun!"


Installing and using AIDE

By


Introduction

If your system has been compromised, chances are that the hacker, cracker, trojan, worm or whatever replaced system files, or installed new ones, generally backdoors or hostile code. Imagine a replaced version of the login program, which lets someone in with root access after supplying a magic password (like the ones included in most rootkits), or a trojanized ssh client, which emails server, user and password information to someone when used (something like this happened at an important site last year).

File integrity checkers can help us by keeping checksums or hashes of files, together with attributes like size, owner and permissions, in a database, and then regularly comparing the files against this information to check for changes. So if the login binary is replaced, or a /tmp/.hidden/backdoord is installed, you would be alerted.

This article will try to explain how to install and use AIDE, an open source Intrusion Detection System (IDS) of the host-based type, or file integrity checker, if you prefer. Quoting from the AIDE website...
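
The principle fits in a few lines of Perl. What follows is a toy illustration only, not how AIDE is implemented: the helper names 'fingerprint', 'snapshot' and 'changed' are hypothetical, the "database" is just a hash in memory, and SHA-256 is simply the digest I picked for the sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::SHA;
use File::Temp qw(tempfile);

# Hypothetical helper: record size, mode, owner and a SHA-256 digest of one file.
sub fingerprint {
    my ($path) = @_;
    my @st = stat($path) or return;
    open my $fh, '<', $path or return;
    binmode $fh;
    my $digest = Digest::SHA->new(256)->addfile($fh)->hexdigest;
    close $fh;
    return join ':', $st[7], sprintf('%04o', $st[2] & 07777), $st[4], $digest;
}

# Build a database (path => fingerprint) for a list of files.
sub snapshot {
    my %db;
    for my $path (@_) {
        my $fp = fingerprint($path);
        $db{$path} = defined $fp ? $fp : 'missing';
    }
    return %db;
}

# Report every path whose fingerprint differs between two snapshots.
sub changed {
    my ($old, $new) = @_;
    my %all = (%$old, %$new);
    return grep {
        my $was = exists $old->{$_} ? $old->{$_} : 'absent';
        my $now = exists $new->{$_} ? $new->{$_} : 'absent';
        $was ne $now;
    } sort keys %all;
}

my ($fh, $file) = tempfile(UNLINK => 1);
print $fh "original contents\n";
close $fh;
my %before = snapshot($file);
open $fh, '>>', $file or die "cannot append: $!";
print $fh "backdoor\n";                 # simulate tampering
close $fh;
my %after = snapshot($file);
print "changed: $_\n" for changed(\%before, \%after);
```

The sketch also shows why the database must live somewhere the attacker cannot write to: anyone able to rebuild the first snapshot after tampering would make the comparison come out clean.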

"AIDE (Advanced Intrusion Detection Environment) is a free replacement for Tripwire. It does the same things as the semi-free Tripwire and more."

The installation of the whole system will be done on a floppy disk. We'll check for changes in various files and directories, being a little paranoid. That will take more time and generate more false alarms or false positives, but I think it makes things less complicated, and, hopefully, not less secure. When you set up your own configuration, you can start with my example, and then after a couple of weeks of use you will know what should be changed. You'll mount the disk each time you're ready to do the checks. That requires more steps, but if an attacker gets in, he will not be able to (A) change our database, or (B) even notice that we check our system regularly with AIDE.

Installation

First we will make the filesystem on the floppy disk... (mine is on /dev/fd0, drive A: under DOS; if you use B: under DOS you will use /dev/fd1 here.)

root@pc2:~# 
root@pc2:~# mkfs /dev/fd0
mke2fs 1.22, 22-Jun-2001 for EXT2 FS 0.5b, 95/08/09
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
184 inodes, 1440 blocks
72 blocks (5.00%) reserved for the super user
First data block=1
1 block group
8192 blocks per group, 8192 fragments per group
184 inodes per group

Writing inode tables: done                            
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 37 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
root@pc2:~# 
Mount it, and create the aide directory...
root@pc2:~# 
root@pc2:~# mount /dev/fd0 /mnt/floppy
root@pc2:~# 
root@pc2:~# mkdir /mnt/floppy/aide
root@pc2:~# 

Now we will get the AIDE sources, compile them in a temporary directory, and install the system on the floppy disk (pay attention to the --prefix option when running configure). We strip the aide binary before doing the make install, and finally remove the temporary directory...

root@pc2:~# 
root@pc2:~# mkdir /tmp/aide
root@pc2:~# 
root@pc2:~# cd /tmp/aide
root@pc2:/tmp/aide# 
root@pc2:/tmp/aide# wget http://www.cs.tut.fi/~rammer/aide-0.7.tar.gz
--12:54:47--  http://www.cs.tut.fi/%7Erammer/aide-0.7.tar.gz
           => `aide-0.7.tar.gz'
Connecting to www.cs.tut.fi:80... connected!
HTTP request sent, awaiting response... 200 OK
Length: 219,837 [application/x-tar]

    0K .......... .......... .......... .......... .......... 23% @  34.84 KB/s
   50K .......... .......... .......... .......... .......... 46% @  50.97 KB/s
  100K .......... .......... .......... .......... .......... 69% @  65.45 KB/s
  150K .......... .......... .......... .......... .......... 93% @  46.38 KB/s
  200K .......... ....                                       100% @   7.17 MB/s

12:54:52 (50.40 KB/s) - `aide-0.7.tar.gz' saved [219837/219837]

root@pc2:/tmp/aide# 
root@pc2:/tmp/aide# tar xvfz aide-0.7.tar.gz 
aide-0.7/
aide-0.7/Makefile.in

[...]

aide-0.7/include/compare_db.h
aide-0.7/include/gnu_regex.h
root@pc2:/tmp/aide#
root@pc2:/tmp/aide# cd aide-0.7
root@pc2:/tmp/aide/aide-0.7# 
root@pc2:/tmp/aide/aide-0.7# ./configure --prefix=/mnt/floppy/aide 
creating cache ./config.cache
checking for a BSD compatible install... /usr/bin/ginstall -c

[...]

creating aide.spec
creating config.h
root@pc2:/tmp/aide/aide-0.7# 
root@pc2:/tmp/aide/aide-0.7# make
make  all-recursive
make[1]: Entering directory `/tmp/aide/aide-0.7'

[...]

make[2]: Leaving directory `/tmp/aide/aide-0.7'
make[1]: Leaving directory `/tmp/aide/aide-0.7'
root@pc2:/tmp/aide/aide-0.7# 
root@pc2:/tmp/aide/aide-0.7# strip src/aide
root@pc2:/tmp/aide/aide-0.7# 
root@pc2:/tmp/aide/aide-0.7# make install
Making install in src
make[1]: Entering directory `/tmp/aide/aide-0.7/src'

[...]

make[2]: Leaving directory `/tmp/aide/aide-0.7'
make[1]: Leaving directory `/tmp/aide/aide-0.7'
root@pc2:/tmp/aide/aide-0.7#  
root@pc2:/tmp/aide/aide-0.7# cd ..
root@pc2:/tmp/aide# cd ..
root@pc2:/tmp# rm -r aide
root@pc2:/tmp# 

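Finally we will create a very simple configuration file and generate the database. The R group used below checks for changes in permissions, inode number, number of links, user owner, group owner, size, modification time, inode change time (ctime) and md5 checksum. It is applied to various files and directories, including everything under them; as far as I understand the AIDE syntax, the leading = on /var/log restricts that entry to the directory itself, which makes sense because the logs inside it change constantly...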

root@pc2:/tmp# 
root@pc2:/tmp# cd /mnt/floppy/aide/bin/
root@pc2:/mnt/floppy/aide/bin# 
root@pc2:/mnt/floppy/aide/bin# cat aide.conf
database=file:/mnt/floppy/aide/bin/aide.db
database_out=file:/mnt/floppy/aide/bin/aide.db.new
/vmlinuz        R
/boot           R
/etc            R
/bin            R
/usr/bin        R
/usr/local/bin  R
/sbin           R
/usr/sbin       R
/usr/local/sbin R
=/var/log       R
/tmp            R
/var/tmp        R
root@pc2:/mnt/floppy/aide/bin# 
root@pc2:/mnt/floppy/aide/bin# ./aide --config=./aide.conf --init
root@pc2:/mnt/floppy/aide/bin# 
root@pc2:/mnt/floppy/aide/bin# mv aide.db.new aide.db
root@pc2:/mnt/floppy/aide/bin# 
The config file is only a working example. I use it this way, but of course you may (and probably should) change it to suit your needs. Remember that the generated database must reside on the floppy disk. Check the end of this document to download the example aide.conf. We can now umount the floppy and are ready for regular use (checks and updates).

Regular use (checks and updates)

Now that we have the floppy disk with the generated database, we can use it regularly to check for changes in the files to be audited. I will create a file in the /tmp directory to show an example of how AIDE tells us about it...

root@pc2:/# 
root@pc2:/# cat > /tmp/.hidden
hidden
root@pc2:/# 
root@pc2:/# mount /dev/fd0 /mnt/floppy/
root@pc2:/# cd /mnt/floppy/aide/bin/
root@pc2:/mnt/floppy/aide/bin# ./aide --config=./aide.conf --check
AIDE found differences between database and filesystem!!
Start timestamp: 2002-01-21 15:22:56
Summary:
Total number of files=1443,added files=1,removed files=0,changed files=1

Added files:
added:/tmp/.hidden
Changed files:
changed:/tmp
Detailed information about changes:

File: /tmp
Mtime: old = 2002-01-21 13:36:25, new = 2002-01-21 15:22:03
Ctime: old = 2002-01-21 13:36:25, new = 2002-01-21 15:22:03
root@pc2:/mnt/floppy/aide/bin# 
So here you see clearly what happened. Of course, if an existing file had been modified, you would be alerted in a similar way.

Now imagine that /tmp/.hidden is a file that you placed there yourself: you will not remove it, and you wish to stop seeing it in the reports. You can update the database like this...

root@pc2:/mnt/floppy/aide/bin# 
root@pc2:/mnt/floppy/aide/bin# ./aide --config=./aide.conf --update
AIDE found differences between database and filesystem!!
Start timestamp: 2002-01-21 15:28:58
Summary:
Total number of files=1443,added files=1,removed files=0,changed files=1

Added files:
added:/tmp/.hidden
Changed files:
changed:/tmp
Detailed information about changes:

File: /tmp
Mtime: old = 2002-01-21 13:36:25, new = 2002-01-21 15:22:03
Ctime: old = 2002-01-21 13:36:25, new = 2002-01-21 15:22:03
root@pc2:/mnt/floppy/aide/bin# 
root@pc2:/mnt/floppy/aide/bin# mv aide.db.new aide.db
root@pc2:/mnt/floppy/aide/bin# 
root@pc2:/mnt/floppy/aide/bin# ./aide --config=./aide.conf --check 
root@pc2:/mnt/floppy/aide/bin# 
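
The mount, check, unmount routine lends itself to a small wrapper. The sketch below is hypothetical: the function names and the DRY_RUN switch are made up for this illustration, and only the commands and paths already shown above come from the article. With DRY_RUN=1 (the default here) it merely prints what it would run, so it can be tried without a floppy in the drive.

```shell
#!/bin/sh
# Hypothetical wrapper around the manual steps: mount, check, unmount.
# DRY_RUN=1 only prints the commands; set DRY_RUN=0 to really run them.
DRY_RUN=${DRY_RUN:-1}
FLOPPY_DEV=/dev/fd0
MOUNT_POINT=/mnt/floppy
AIDE_DIR=$MOUNT_POINT/aide/bin

# Either execute the given command or just show it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

aide_check() {
    run mount "$FLOPPY_DEV" "$MOUNT_POINT" || return 1
    # Binary, config and database all come from the floppy, not from
    # the (possibly compromised) hard disk.
    run "$AIDE_DIR/aide" --config="$AIDE_DIR/aide.conf" --check
    # Always unmount, whatever the check reported.
    run umount "$MOUNT_POINT"
}

aide_check
```

The wrapper uses absolute paths instead of cd, so the shell's working directory does not keep the floppy busy when umount runs. Keeping such a script on the hard disk would of course hint that AIDE is in use, so it belongs on the floppy too.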

Finally... conclusion, files, links, etc.

Remember to keep all the AIDE stuff on the floppy disk, and to umount and remove it after use. Change the example configuration file to suit your needs, and try not to leave any information on the system that may reveal to an attacker that you are using AIDE. You are encouraged to read the manual pages and manual.html of AIDE; it's a very flexible program. And finally, quoting the 'General guidelines for security' section of the AIDE manual:
" Do not assume anything
Trust no-one, nothing
Nothing is secure
Security is a trade-off with usability
Paranoia is your friend ".

The example aide.conf configuration file: misc/maiorano/aide.conf.txt

Home of the AIDE project: http://www.cs.tut.fi/~rammer/aide.html
Download the AIDE tarball: http://www.cs.tut.fi/~rammer/aide-0.7.tar.gz

Home of the more famous alternative to AIDE, Tripwire: http://www.tripwire.org

Some papers and articles for further reading...

An interesting article at securityfocus.com titled 'You may already be hacked.': http://www.securityfocus.com/columnists/12

An article at linuxsecurity.com titled 'Getting Started with Tripwire (Open Source Linux Edition)': http://www.linuxsecurity.com/feature_stories/feature_story-81.html

'Network- vs. Host-based Intrusion Detection - A Guide to Intrusion Detection Technology' from ISS, interesting reading also: http://secinf.net/info/ids/nvh_ids/

A more commercial point of view from NetworkWorldFusion, 'Getting the drop on network intruders': http://www.nwfusion.com/reviews/1004trends.html

Ariel Maiorano

I'm a free-lance programmer in Argentina, working mostly on web and security development.


Copyright © 2002, Ariel Maiorano.
Copying license http://www.linuxgazette.com/copying.html
Published in Issue 75 of Linux Gazette, February 2002

"Linux Gazette...making Linux just a little more fun!"


GPL or BSD? Yes

By Mark Nielsen


  1. What is the GPL software license?
  2. What are the BSD-style licenses?
  3. Which is best for you?
  4. Which is best for me?
  5. Conclusion
  6. References

Introduction

What is the GPL software license?

The GNU General Public License is a bit lengthy (in my opinion) and tries to promote a "community" of programmers who share software freely and openly. It obfuscates the meaning of "free" and "freedom", since it really restricts the freedoms of people who don't want to openly share software that has the GPL license. Rick Holbert suggests we use the term "liberated software" instead of "free software". It still confuses me, because GNU software is not truly liberated (you can't do whatever you want with the software), but the word "liberated" is much better than the word "free". The GPL license forces people who make changes to the software to openly share those changes when they distribute it. Thus, it forces freedom on the "recipients" of the software, but not on the "programmers" who make changes to the software. It can be a little confusing, since it takes freedom away from the programmers, but strengthens the freedoms of the "recipients" of the software. In general, for people who wish to donate their software for the betterment of humanity, it seems to me the GPL license satisfies those goals because the software becomes open sourced and free for all people to use and to add to.

Sometimes, from a business perspective, you want to take software and make it proprietary so that you can create a product which has hidden value. If you close the source of software that has value, and your changes have value, then you can charge people for the software because they can't make the software themselves or it is too difficult and time-consuming for them to do so. You want to look at the BSD-style licenses in those cases.

In a different scenario, if you care more about service than about software products, then the GPL license isn't something you should fear. For example, IBM is using Linux for various servers. If you develop a business model on top of GPLed software, you don't have anything to worry about. In addition, any software you create from scratch, or that you use under a BSD-style license, you can keep closed sourced while running it on top of your GPL software. There are still plenty of ways your business can use GPL software without threatening your business. Customers don't really care about how things are done, they just want it done. A good example is the crappy software produced for the most popular desktop OS. 99% of the customers who use that horrible nasty software don't know all the garbage put into it, and most of them wouldn't care. Look at all the people who are very happy to get patches to their "most stable and reliable version" of their OS, when really the logic is backwards. Shouldn't it have been stable and reliable from the beginning? And if the current version is the most stable and reliable and it crashes and has tons of bugs, then the previous versions were garbage? I keep on trying to emphasize to people that something that is more stable and reliable than garbage is still garbage that is only slightly more reliable and stable. It doesn't mean much. In business, it really isn't the quality of the product that sells. What sells is meeting the minimum requirements for people to use it at a price that is cheap on a mass market, or getting a monopoly and brainwashing people and Congress with ads and money into believing that your software is the best, when you know it isn't. Bottom line, if you are scared of the GPL license for business reasons, you probably haven't thought through your business model hard enough.

The most popular Linux OS in the US isn't the best in the world, and lacks many features a respectable Linux OS should have, but it stays the most popular because it has market share and it improves with every version, which keeps its customers happy, even though the customers don't know how much better the software really could be.

What are the BSD-style Licenses?

The FreeBSD Copyright Information page lists a variety of licenses. In general, you have freedom to do whatever you want with the software, as long as you acknowledge where it came from. In some sense, you have more freedom to do whatever you want with this software, but when you make changes, you can "restrict" the freedom of others who receive the software you modified.

The BSD-style licenses don't have any "pass-thru" freedoms. They don't promote "freedoms" for the recipient. This can be beneficial if you wish to take software other people have developed, make a few changes, and sell the product, or just try to prevent people from understanding what you did.

When a non-programmer only understands what a piece of software does, but not how it does it, then you can sell the software to that person with good marketing skills, even if you really didn't do any of the work in creating the software. Take the most popular OS for desktops and you will get a good idea of a company which has no programming skills whatsoever, but is very good at marketing and selling garbage for software. Having the ability to include software from other developers (who knew what they were doing) without revealing the changes you make can be very powerful if you can't program worth a darn but you are good at selling. From the perspective of a business whose goal is to make money (as all businesses are supposed to do), if you can use software that falls under the BSD-style licenses, do it. You can have better control over your OS and prevent people from copying a marketable product. The top two OSes for desktops have done this.

For the record, it seems like BSD programmers are very good at what they do, so I don't want it to sound as if software that comes from BSD programmers is garbage. As far as I am concerned, as long as I can look at the source code, it isn't garbage because it can always be changed, but as soon as it does get closed sourced, then it becomes garbage, because I don't know what it is really doing. All the open-sourced stuff from the BSD-style licensed software is great.

Which is best for you?

There is one important belief you must understand: A LICENSE IS NOT BETTER OR WORSE THAN ANY OTHER LICENSE except from your point of view involving the goals you want to accomplish. A license is a foundation of how people are to behave, just like a government. From a business point of view, the US has a great government where money rules all. From a humanitarian point of view, other governments have better ideas and goals. But neither government is better or worse if it accomplishes the goals the people want. If the license does what you want it to do, then it is good for you. It might not be good for someone else, but who cares about them. Thus, ONLY IDIOTS CLAIM ONE LICENSE IS BETTER THAN ANOTHER IF THEY DON'T UNDERSTAND WHAT YOUR GOALS ARE. Once we know what your goals are with the software you are creating, then we can determine which license is best for your software. Even then, it is still an opinion open to debate.

Whenever I talk to BSD people, I usually get them to admit GPL is not a bad thing. How? As I have stated, licenses are there for people to use. Nobody is forcing you to put your software under BSD or the GPL license. Thus, IF YOU CHOOSE to put your software under the GPL, and don't mind people having full rights to the source code, why is that bad? You agreed to it, you don't mind, and you aren't looking to make profit off of it with some closed source version, and you really don't want someone to come along and make profit off of a closed source version off of something you worked really hard at doing when you didn't get a dime. GPL levels the playing field so that everybody has equal opportunity to make profit given the same software and they can't prevent anybody else from having an equal opportunity as well. Looks like good market driven competition to promote business and let the best people win. Again, you choose to put this software on an equal basis for all to use. If they don't want to share like you did, fine, they can invest the millions of dollars needed to create their own software. Nobody is preventing them.

Please license your software under more than one license. For example, Perl is licensed under the GPL and the Artistic Licenses. If you want your software to be used with other free software, you must license it for more than just GPL. GPL tends to not work very well with other free software licenses.

One criticism of the free software community, as far as GPL goes: they are "stealing" the word freedom. Question: Does a dictator have the freedom to be a dictator? Yes. Freedom has nothing to do with a "community". Freedom means you can do whatever you want whenever you want however you want. People should have the freedom not to be free. One thing irks me, although I understand from a political perspective why they are doing it: the FSF and GPL dudes tend to redefine freedom to mean what they want it to mean, but really they are only looking at a very small subset of freedom, not ALL freedom. They are interested in the freedom of people to share open sourced software in the community, but not in the freedom of an individual to do with a piece of software whatever they wish, such as making a closed source version of a GPLed piece of software. Hence, GPL doesn't really promote "freedom" in the true sense of the word freedom, but freedom for a community to use software. I don't like how they are redefining the word freedom and how some of the zealots won't even talk to you unless you use the words "free" and "freedom" in a fashion they understand. However, I suppose it is good from a political point of view, because it forces people to think about freedom and most people don't have time to think in our crazy 80 hour workweek schedules.

Now, for BSD, it is not bad either. It is meant for the programmers who like to create closed sourced programs. I understand why this is so attractive. I understand why this is important for some people, but let me raise a very important point about BSD which doesn't make sense from a philosophical point of view:

If people create software under the BSD license, somebody can take all the work created under that license and create a closed source version where they don't have to tell people what changes they made. Thus a whole group of developers can work for years creating a cool piece of software, and a single person or company can "steal" the software by creating a slightly different closed source version, promoting it as the standard, and ruining any chance of the software's real programmers ever benefiting from it. I just don't understand why so many people want to work so hard to make others millionaires. GPL prevents this. It levels the playing field for all who wish to use the software. Everybody has a fair shot.

Here is a clear case of how dangerous the BSD license is and how it helps a virus (that nasty operating system from a first-rate marketing but 2-bit programming company) spread around the world: take a look at this disaster with Kerberos. What a horrible thing to do. For myself, when an evil nasty company corrupts a piece of software and there is no legal way to force them to be cooperative with the rest of society, I will boycott all versions of that software. I cannot afford to worry about different versions that are incompatible popping up all over the place. Kerberos is ruined, and I will never use it. Why is it ruined? The corrupted version has so much influence in the world that it isn't worth my time using similar software, knowing that one day future versions can be completely closed sourced, ruining any chance of the versions I use being compatible with the closed sourced versions. The threat of not being compatible with other businesses who don't care about politics is too great for me to use this type of software. It is on my banned list if I can avoid using it (hopefully).

With all the benefits and complaints I have about GPL and BSD, which is best to use? Either, neither, both. Just understand what the licenses do, and if you don't mind the consequences, great! Even though I really don't like the BSD-style licenses for my uses, if you don't mind other people taking your software and making closed sourced versions they can sell, then the BSD-style licenses might be good for you.

Which is best for me?

Which license is best for me? The answer is yes, both. However, I only use GPL. Why? I am so grateful for all the free software, I don't really create any software that can be sold (I usually create web scripts in Python), and anything I produce for the world I would want someone to take up after I am done with it, so it makes sense for me to use GPL. I really don't ever see myself using the BSD-style licenses because I don't want the evil empire taking my software and using it to make profit without revealing what they did, which would deny other people the same shot at business. The reason why the BSD-style licenses are also good for me is that they are an option for the future. I don't use the BSD-style licenses, but I am glad they are an option.

Conclusion

Anonymous Coward made a good point:
 I was all set to write a long essay in response, but most of the readers here would probably just appreciate a summary:

The GPL license is conducive to liberating software.

The BSD license is conducive to liberating people.

With the GPL license, the software maintains more of the freedom than the programmers who work on it.

With the BSD licenses, the programmers maintain more of the freedom with what they are allowed to do with derivative code.
I prefer to think of it as the following:

In conclusion, anybody who says one license is better than another is a simple-minded troll who doesn't understand that they can only make a judgment for themselves and not others. I really want to emphasize that these people need to be sterilized so that their DNA doesn't spread and create politicians, generals, and judges who like to make decisions for people in other subjects in life. I have complete disrespect and contempt for anybody who makes a decision for other people about software licenses, and limited disrespect and contempt for anybody who lets people make their decision for them. I don't mind theories about how licenses affect society, BUT DON'T CLAIM ONE IS BETTER THAN ANOTHER, because that is an opinion based on certain values, not a fact. I will accept as fact what you think is the best license for you, but not your opinion about what you think is best for other people -- that is just an opinion and theory.

There seem to be 10 times as many BSD people who hate GPL as GPL people who hate BSD. I imagine that is because Linux is ten times more popular, but I really don't know. If FreeBSD were ten times more popular than Linux, I imagine you would have 10 times more GPL guys moaning than BSD guys. For me personally, I am unaware of BSD software on a daily basis, and so I have no reason to voice my opinions actively. I suppose I really don't like people who complain about the other licenses for two reasons:

Nuff said.

References

Thanks to Rick Holbert for suggesting how I can improve the article and for letting me know that "liberated" is a better term than "free" to use when talking about "free software" in the GNU sense.
  1. Slashdot discussion which contains a lot of good points and I think makes a good case for BSD people not to hate GPL.
  2. If this article changes, it will be available here http://www.gnujobs.com/Articles/24/nielsen.html

Mark Nielsen

Mark works as an independent consultant donating time to causes like GNUJobs.com, writing articles, writing free software, and working as a volunteer at eastmont.net.


Copyright © 2002, Mark Nielsen.
Copying license http://www.linuxgazette.com/copying.html
Published in Issue 75 of Linux Gazette, February 2002


The Foolish Things We Do With Our Computers

By Mike Orr


Drill

By

This happened a long time ago, when a 20-megabyte hard disk was a giant, both in capacity and size. My friend had a Corvus 20 meg drive that was shared among five PCs which were used to run the accounting department of a small manufacturing business. The owner of the company was extremely pleased with my friend and the efficiency of the computerized accounting group.

One day, in the middle of month-end processing, the electric motor on the Corvus burned out. Payroll and Accounts Receivable needed to be done by the end of the day, and there were no backups of the data files. Since the Corvus had never failed, my friend had not bothered making backups.

Not having anything to lose, he opened up the case and removed the burned out motor. He then took an old electric hand drill with a variable speed motor and chucked it directly to the hard disk.

I wrote a quick and dirty program that read one sector of data and displayed a message when the read was successful. He ran this program while squeezing the trigger on the drill until it reported successful reads.

Once he had the speed right, he used black tape and taped the trigger so that it would not move.

The accounting group finished their month-end processing using the drill as the hard disk motor. He continued to use the drive with the drill for several weeks, though only after carefully making backups of the data.


Zap

By

Here is a foolish story of what I did to a computer I was building.

Back before I had a lot of money to buy new hardware and such, I had to make do with the few parts that I had lying around. One was an old XT case with a working power supply. I had enough money to get a motherboard and a DX4-100 chip, but cases and power supplies were expensive back then, so I decided to use the XT case that I had. Since the XT motherboards were non-standard as far as mounting holes go and the new board wouldn't line up, I had the great brainstorm to mount the board on the anti-static bag. I thought, "Sure, it's anti-static, it'll be safe." I ran it and it would boot up, but the keyboard controller was failing. I took it back to the place where I bought it, and it was explained to me that the anti-static bag will actually conduct electricity and that I had fried the board. Luckily he was willing to refund me half of my money; I then had to shell out for another board, and a case this time. The lesson I learned was that if I am going to mount a board on anything but the pegs of the case, I had better use wood, something that is definitely an insulator.


By

Here follows the story of the geekiest use I've ever put my Palm III to.

I bought a used Sparc Classic and it came with a hard drive, RAM, keyboard and mouse. It did not come with a monitor. Since I planned for it to be a server, the lack of a monitor wasn't a big deal except that I couldn't install Linux on it (or anything else for that matter) without being able to see what I was doing.

Palm III to the rescue! I had a serial cable that plugged into the bottom of the Palmpilot connected to a gender bender, connected to a DB25 <-> DB9 cable to plug into the serial port of the Sparc. The serial port on the Sparc actually has the wiring for both /dev/ttyS0 and /dev/ttyS1, but the first serial port has the same wiring as a PC, so it worked fine. Last but not least, I unplugged the keyboard.

Now that the hardware side was figured out, I downloaded a freeware vt100 program for my Palmpilot and configured it for the proper baud rate, stop bits and such. When I turned on the Sparc, it tried to find a keyboard and failed. Then it found a vt100 terminal on the serial port and used the Pilot as a console. I installed RedHat 6.2 to my Sparc using that tiny little screen.

After the install was done, I rebooted and telnetted in from my PC. Everything worked perfectly.

[If you have a story about something foolish or ingenious you did to your computer, send it to -Iron.]

Mike Orr

Mike ("Iron") is the Editor of Linux Gazette. You can read what he has to say in the Back Page column in this issue. He has been a Linux enthusiast since 1991 and a Debian user since 1995. He is SSC's web technical coordinator, which means he gets to write a lot of Python scripts. Non-computer interests include Ska/Oi! music and the international language Esperanto. The nickname Iron was given to him in college--short for Iron Orr, hahaha.


Copyright © 2002, Mike "Iron" Orr.
Copying license http://www.linuxgazette.com/copying.html
Published in Issue 75 of Linux Gazette, February 2002


Simple Package Management With Stow

By


When running a single box with tried and true software, tracking the versions of software that you use may be a no-brainer. That is to say, you use whatever Red Hat, Debian, or Sun provided (yes, I will touch on non-Linux issues here) if you could find or build the necessary package. But wait: what if you have been running the same machine for years and you simply must have the latest Emacs? What if you are developing your own software and don't want to create an RPM or Debian package each time you pause at a version? What if you don't trust that software package written by a 14 year old in that far away country with an unstable government? In short, what if you are heeding Obi-Wan Kenobi's advice, and using the source? How do you make it easy to rip out those configuration files, man pages, binaries, and libraries that you may want to replace in the future?

Well, when you think about it a little bit, Unix has sort of provided the raw materials to do that, in the form of a symbolic link, or symlink. Symlinks are a powerful tool because they allow you to configure software so that its implementation does not necessarily connect directly to its interface (sound familiar?). I might be playing a little loose with the definitions, but that really is what is being done when, for example, postfix mimics sendmail. The implementation, that is postfix, is presenting the same interface as sendmail, which has become a de facto standard interface to the Unix mail transport agent (MTA).

In the case of symlinks, you might have /opt/bin/cat as a symlink pointing to the program /opt/bin/new_cat. So if you looked at the link, you'd know right away what version was being run, but it would still seem to be the same familiar program. In this way the actual program being run can change as a better implementation (algorithm, etc.) is developed. Yes, environment variables, as used in scripts, allow this too, but try retrofitting all the variables that point to a program after the fact. It might not be so easy. Symlinks may be the answer. For example, symlinks are typically used to ease the building of Motif from source via the lndir utility. Of course this symlinking stuff could get out of hand, and should not be abused, but you get the idea. What the folks at the GNU project did was write a little Perl script that automates that entire process of symlinking the code you are using to the interface that you want to present to the user. Note that hard links are subtly different, because there is no differentiation between the original file and the link (really a second name, since they share an inode and hence are identical). I find hard links to be of minimal use, because it becomes too easy to lose track of which filename should be deleted and which should be kept.
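
The difference is easy to see in a scratch directory. This is just an illustrative sketch: old_cat and cat_hard are invented names (new_cat is borrowed from the example above), the "programs" are one-line scripts, and readlink is assumed to be installed.

```shell
#!/bin/sh
# Symlink vs. hard link, in a scratch directory (all names are made up).
set -eu

demo=$(mktemp -d)
mkdir "$demo/bin"

# Two "implementations" of the same tool, as one-line shell scripts.
echo 'echo "cat, old version"' > "$demo/bin/old_cat"
echo 'echo "cat, new version"' > "$demo/bin/new_cat"

# The interface users see is the name "cat"; the symlink picks the
# implementation behind it.
ln -s old_cat "$demo/bin/cat"
before=$(sh "$demo/bin/cat")

# Switch implementations by repointing the link; callers keep using "cat".
ln -sf new_cat "$demo/bin/cat"
after=$(sh "$demo/bin/cat")
target=$(readlink "$demo/bin/cat")    # the link itself says what is in use

# A hard link is just a second name for the same inode.
ln "$demo/bin/new_cat" "$demo/bin/cat_hard"
inode_a=$(ls -i "$demo/bin/new_cat"  | awk '{print $1}')
inode_b=$(ls -i "$demo/bin/cat_hard" | awk '{print $1}')

echo "before: $before"
echo "after:  $after"
echo "cat -> $target"
echo "inodes: $inode_a $inode_b"
```

The symlink records which implementation is current, while the two hard-linked names report the same inode number and give no hint as to which one was the original.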

Introducing Stow

Right away I want to emphasize that stow is not a replacement for a full package management database, but it does allow one to get many of the benefits of a complex package management system from a humble Perl script. As an aside, there is a package called checkinstall that will allow source to be entered into a Slackware, RPM, or Debian package database. As an example I will go through the steps to install stow, then the steps to install a mail (MUA) replacement called nail. This is a good example because it includes multiple files, so that you can see how one might encounter inadvertent collisions with previous versions. Also, nail is a great enhancement to standard Berkeley mail, since it allows sending binary attachments on the command line, while offering the same base functionality.

Stow is so simple to install that no in-depth discussion is needed. It should work if you have Perl 5.005 or later (this version is stock on Solaris 8 AFAIK). Simply download the source from the GNU website or a local mirror, extract it to a source directory with tar xzf, and repeat the familiar ./configure, make, and make install sequence. Despite appearances, nothing is compiled, but a few things like the manual still need to get built. The make install step will place stow into the /usr/local/bin directory. This is the default location, and I chose this setting to simplify the discussion; the reasons will hopefully become apparent by the end of this article. The location of the installed stow executable is shown on the last line of the sample output below. I used the type command, but you could also use which or perhaps whereis.

Unpacking and installing stow
 
[zippy@mybox zippy]$ cd src/
[zippy@mybox src]$ gunzip -c ../stow-1.3.3.tar.gz | tar xf -
[zippy@mybox src]$ ll
total 8
drwxrwxr-x 2 zippy zippy 4096 Jan 6 06:19 stow-1.3.3
[zippy@mybox src]$ cd stow-1.3.3/
[zippy@mybox stow-1.3.3]$ ./configure
creating cache ./config.cache
checking for a BSD compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for mawk... no
checking for gawk... gawk
checking whether make sets ${MAKE}... yes
checking for a BSD compatible install... /usr/bin/install -c
checking for perl... /usr/bin/perl
updating cache ./config.cache
creating ./config.status
creating Makefile
creating stow
[zippy@mybox stow-1.3.3]$ make
make: Nothing to be done for `all'.
[zippy@mybox stow-1.3.3]$ sudo make install
make[1]: Entering directory `/home/zippy/src/stow-1.3.3'
/bin/sh ./mkinstalldirs /usr/local/bin
/usr/bin/install -c stow /usr/local/bin/stow
/bin/sh ./mkinstalldirs /usr/local/info
/usr/bin/install -c -m 644 ./stow.info /usr/local/info/stow.info
/bin/sh ./mkinstalldirs /usr/local/man/man8
/usr/bin/install -c -m 644 ./stow.8 /usr/local/man/man8/stow.8
make[1]: Leaving directory `/home/zippy/src/stow-1.3.3'
[zippy@mybox stow-1.3.3]$ type stow
stow is /usr/local/bin/stow

At this point stow is installed under /usr/local/bin. Make sure to include this directory in your $PATH.

Under The Hood

To describe stow, one first needs to understand the configure script, because the two work together: configure sets up the build and installation of the software components, and stow then manages the links to what was installed. The configure script is a marvelous convenience. It sniffs the system, checking for various prerequisite software. The results of these tests are used to generate a set of Makefiles which will build and install your software to fit your system configuration. There are many options to configure (in fact there are alternate versions of this script as well), but for our purposes the option of greatest interest is the --prefix argument. A second argument, --exec-prefix, allows some finer tuning of the actual installation process, but it will not be discussed in much detail.

So now we understand that configure builds the scripts that build the code, and that the location of the installed code may be specified via configure's --prefix command-line argument. It turns out that if you pick a single special spot to install all software built from source, stow can then cleanly automate the creation of symlinks to the installed code in such a way that the source tree is readily evident, and can be replaced and removed. For example, invoking the configure script as ./configure --prefix=/opt/stow/foo-1.2.1 will install your package under /opt/stow/foo-1.2.1.

I'm Still Confused. What is this prefix and exec-prefix stuff?

Feel free to skip this section, and come back to it later, after you have digested the rest of this article. Once you are comfortable with the notion of an actual install location being separate from the apparent location of a program, you can consider the parts of the puzzle that don't fit this ideal scenario. Imagine the case of installing software across multiple machines where everything is installed in a symlinked directory tree isolated from the apparent location (found in the $PATH or $MANPATH). Depending on your intentions, this might not be what you want. Consider the situation where an application is built for multiple architectures; for example, source code could be built for Solaris and Linux systems as follows (assuming identical cross-mounted source trees, but separate build directories):

sun$ cd sunsparc
sun$ ../foolib-1.1/configure --prefix=/usr/local \
> --exec-prefix=/usr/local/sunsparc
sun$ make
sun$ make install
Then from another xterm:
sun$ ssh pengie
pengie$ cd linux
pengie$ ../foolib-1.1/configure --prefix=/usr/local \
> --exec-prefix=/usr/local/linux
pengie$ make
pengie$ make install

The bottom line is that the developer has to decide which files are architecture dependent and which are not, and you might not agree with her. Obviously documentation, and possibly configuration files, could be considered architecture independent. Still, if you use stow, you are free to remove symlinks by "unstowing" files. Since unstowing only removes links, upgrading will not overwrite the old installation; it only breaks the links, and you can copy configuration files back by hand. Just "restow" the package and try again until you get the upgrade right. Personally, I don't use the --exec-prefix option much, preferring instead to manually link the (hopefully) few configuration files that I want to treat specially, fixing broken links after upgrading. So far I think it's been a good approach for the simple situations I've encountered.
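
If the split between the two options is still hazy, the following Python sketch may help. It models the default directory layout as I understand it for configure scripts of this vintage (newer Autoconf releases move the man and info directories under share), so treat the table as an assumption rather than a specification; a package's Makefile is free to do something else.

```python
import posixpath

# Directory variables that follow --exec-prefix (architecture-dependent files).
EXEC_PREFIX_DIRS = {"bindir": "bin", "sbindir": "sbin",
                    "libexecdir": "libexec", "libdir": "lib"}

# Directory variables that follow --prefix (architecture-independent files).
PREFIX_DIRS = {"datadir": "share", "sysconfdir": "etc",
               "includedir": "include", "infodir": "info", "mandir": "man"}

def install_dirs(prefix="/usr/local", exec_prefix=None):
    """Return the assumed default install directories for a prefix pair."""
    if exec_prefix is None:          # configure's default: exec_prefix = prefix
        exec_prefix = prefix
    dirs = {name: posixpath.join(exec_prefix, sub)
            for name, sub in EXEC_PREFIX_DIRS.items()}
    dirs.update({name: posixpath.join(prefix, sub)
                 for name, sub in PREFIX_DIRS.items()})
    return dirs

if __name__ == "__main__":
    # The Solaris build from the example above.
    for name, path in sorted(install_dirs("/usr/local", "/usr/local/sunsparc").items()):
        print("%-12s %s" % (name, path))
```

With the Solaris settings, binaries and libraries land under /usr/local/sunsparc while manpages stay under /usr/local, so the two architectures can share the documentation.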

Installing Software With Stow

When I first started using stow a few years ago, I had some frustration with it because I had already started setting up the system (an HP-UX server) without it. There were frequent collisions with info files and manpages; ironically, this happened most often with emacs. Naturally, following what is going on is easier for simple packages. The MUA software nail is about as simple as you can get, since it consists of the executable, the documentation, and the config file (which you might want to link to /etc, BTW).

Configuring for alternate locations
 
[zippy@mybox src]$ gunzip -c ../nail-9.29.tar.gz  | tar xf -
[zippy@mybox src]$ cd nail-9.29/
[zippy@mybox nail-9.29]$ ./configure --prefix=/opt/stow/nail-9.29
creating cache ./config.cache
checking for a BSD compatible install... /usr/bin/install -c
checking for iswprint... yes
...
..... lots of stuff ...
updating cache ./config.cache
creating ./config.status
creating Makefile
creating config.h
[zippy@mybox nail-9.29]$

What we are doing here is telling configure to put the files under /opt/stow/nail-9.29, with the understanding (implicit, as far as stow is concerned) that the installed package will appear to live under /opt at run time. (If you're curious, you can look at the generated Makefile to see that the prefix variable is set via the --prefix option.)

Building the source code
 
[zippy@mybox nail-9.29]$ 
[zippy@mybox nail-9.29]$ make
gcc -DHAVE_CONFIG_H -I. -I. -I. -g -O2 -c version.c
gcc -DHAVE_CONFIG_H -I. -I. -I. -g -O2 -c aux.c
... more stuff ...
gcc -DHAVE_CONFIG_H -I. -I. -I. -g -O2 -c tty.c
gcc -DHAVE_CONFIG_H -I. -I. -I. -g -O2 -c vars.c
gcc -g -O2 -o nail version.o aux.o base64.o cmd1.o cmd2.o \
cmd3.o cmdtab.o collect.o dotlock.o edit.o fio.o getname.o \
head.o v7.local.o lex.o list.o main.o mime.o names.o popen.o \
quit.o send.o sendout.o smtp.o strings.o temp.o tty.o vars.o
[zippy@mybox nail-9.29]$

Now that we have compiled everything, we can install the software.

Running the Install
 
[zippy@mybox nail-9.29]$ sudo make install
make[1]: Entering directory `/home/zippy/src/nail-9.29'
/bin/sh ./mkinstalldirs /opt/stow/nail-9.29/bin
mkdir /opt/stow
mkdir /opt/stow/nail-9.29
mkdir /opt/stow/nail-9.29/bin
/usr/bin/install -c nail /opt/stow/nail-9.29/bin/nail
/bin/sh ./mkinstalldirs /opt/stow/nail-9.29/man/man1
mkdir /opt/stow/nail-9.29/man
mkdir /opt/stow/nail-9.29/man/man1
/usr/bin/install -c -m 644 ./nail.1 /opt/stow/nail-9.29/man/man1/nail.1
test -f /etc/nail.rc || \
{ /bin/sh ./mkinstalldirs /etc; \
/usr/bin/install -c -m 644 ./nail.rc /etc/nail.rc; }
make[1]: Leaving directory `/home/zippy/src/nail-9.29'
[zippy@mybox nail-9.29]$

So it's apparent from the previous listing that the files were tucked under /opt/stow/nail-9.29 as desired (apart from /etc/nail.rc, which the Makefile installs directly into /etc). Stow then assumes that all the subdirectories of the package are to be symlinked to their corresponding locations under the target directory, which by default is the parent of the stow directory (here /opt), so that /opt/stow/nail-9.29/bin becomes /opt/bin. Similarly, /opt/stow/nail-9.29/man/man1 becomes /opt/man/man1, etc. This convention makes it very easy to keep the installed files separate from the locations where they appear. The only step left is to actually create the symlinks by running stow.

Stowing the binaries
 
[zippy@mybox nail-9.29]$ cd /opt/stow/
[zippy@mybox stow]$ sudo stow -vv nail-9.29/
Stowing package nail-9.29...
Stowing contents of nail-9.29
Stowing directory nail-9.29/bin
LINK /opt/bin to stow/nail-9.29/bin
Stowing directory nail-9.29/man
LINK /opt/man to stow/nail-9.29/man
[zippy@mybox stow]$ ls -ltr /opt
total 4
drwxr-xr-x 3 root root 4096 Jan 9 16:33 stow
lrwxrwxrwx 1 root root 18 Jan 9 16:33 man -> stow/nail-9.29/man
lrwxrwxrwx 1 root root 18 Jan 9 16:33 bin -> stow/nail-9.29/bin
[zippy@mybox stow]$ PATH=/opt/bin:$PATH type nail
nail is /opt/bin/nail

Some explanation may be in order here. I cd'd to the stow directory (/opt/stow here) and simply typed stow -vv plus the name of the subdirectory to recursively symlink. The -vv simply adds verbose output for illustrative purposes. Stow has created all the necessary links, so all that remains is to add /opt/bin to the $PATH variable, and your files are installed. To uninstall the files, simply unstow them. This removes the links to the installed binaries but does not delete any files, so it's really quite a useful safety net.
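
For the curious, the essence of what just happened can be modelled in a few lines of Python. This is a deliberately simplified sketch, not stow's actual algorithm: it only links the top-level entries of a package and refuses to continue on any conflict, whereas the real stow also descends into directories that already exist in the target. The function names stow and unstow are my own.

```python
import os
import tempfile

def stow(stow_dir, package, target_dir):
    """Link each top-level entry of the package into target_dir (relative links)."""
    package_dir = os.path.join(stow_dir, package)
    names = sorted(os.listdir(package_dir))
    for name in names:                       # check for conflicts before touching anything
        if os.path.lexists(os.path.join(target_dir, name)):
            raise FileExistsError("conflict: " + os.path.join(target_dir, name))
    created = []
    for name in names:
        link_path = os.path.join(target_dir, name)
        source = os.path.relpath(os.path.join(package_dir, name), target_dir)
        os.symlink(source, link_path)
        created.append((link_path, source))
    return created

def unstow(stow_dir, package, target_dir):
    """Remove only those links in target_dir that point into the package."""
    package_dir = os.path.realpath(os.path.join(stow_dir, package))
    removed = []
    for name in sorted(os.listdir(target_dir)):
        link_path = os.path.join(target_dir, name)
        if not os.path.islink(link_path):
            continue                         # real files and directories are never touched
        if os.path.realpath(link_path).startswith(package_dir + os.sep):
            os.unlink(link_path)
            removed.append(link_path)
    return removed

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as opt:
        stow_dir = os.path.join(opt, "stow")
        os.makedirs(os.path.join(stow_dir, "nail-9.29", "bin"))
        os.makedirs(os.path.join(stow_dir, "nail-9.29", "man", "man1"))
        for link, source in stow(stow_dir, "nail-9.29", opt):
            print("LINK", link, "to", source)
        for link in unstow(stow_dir, "nail-9.29", opt):
            print("UNLINK", link)
```

Note that unstow only ever calls unlink on symlinks, which is the safety net mentioned above: the package tree under the stow directory is left alone.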

Unstowing a directory
[zippy@mybox stow]$ pwd
/opt/stow
[zippy@mybox stow]$ ls -l
total 4
drwxr-xr-x 4 root root 4096 Jan 9 16:33 nail-9.29
[zippy@mybox stow]$ sudo stow -Dvv nail-9.29/
Unstowing in /opt
UNLINK /opt/bin
UNLINK /opt/man
[zippy@mybox stow]$

And all the installed files are neatly out of the way. Of course, to restow the files you simply repeat the earlier stow command. This may seem like a lot of extra work, but once you get in the habit of using it, and experience the convenience of being able to unlink an entire package, you'll find it's worth it. Finally, you might want to install nail yourself, and use it, possibly via an alias or shell function, as a mail replacement. But that could be an entire article in itself.

Happy hacking!

References

  1. GNU stow
    Maintained by Guillaume Morin
    http://www.gnu.org/software/stow/stow.html
    GNU stow entry on Savannah
    http://savannah.gnu.org/projects/stow
  2. Checkinstall
    by Itzo http://freshmeat.net/projects/checkinstall/
  3. Nail, a replacement for the mail MUA
    by Gunnar Ritter http://omnibus.ruf.uni-freiburg.de/~gritter/
  4. Linux Filesystem Hierarchy Standard, (FHS)
    Maintained by freestandards.org http://www.pathname.com/fhs/
  5. GNU Autoconf, Automake, and libtool
    By Gary V. Vaughan, Ben Elliston, Tom Tromey, and Ian Lance Taylor
    offers an excellent review of the concepts behind exec-prefix options to the configure script.
    http://sources.redhat.com/autobook/ ISBN 1-57870-190-2

Allan Peda

Allan has been enjoying Linux since about 1995, discovering Perl shortly thereafter. Currently he works as a programmer analyst at Rockefeller University, and does part time Linux consulting work in the NYC area. He enjoys surfing and sailing, and dreams of owning a charter boat in tranquilo Costa Rica.


Copyright © 2002, Allan Peda.
Copying license http://www.linuxgazette.com/copying.html
Published in Issue 75 of Linux Gazette, February 2002

"Linux Gazette...making Linux just a little more fun!"


Why I wrote Install Kernel (ik) and How It Works

By Justin Piszcz


ik (Install Kernel) is available at http://freshmeat.net/projects/ik and http://www.ramdown.com/war/ik.

In December 2000, after four years of using Linux, I found that compiling and installing kernels had become a waste of my time. I chose to write my own kernel installation and setup script, called Install Kernel, because no other such scripts existed at the time, and I needed something that would install the Linux kernel and automatically set up my bootloader configuration file with no user intervention. Install Kernel interfaces with the Linux operating system by moving and editing files. Without ik, most of the time spent updating a kernel goes into moving files around and setting up configuration files. The ik script has three basic parts: dependency checks, compiling the kernel and moving the files to their proper locations, and editing boot loader configuration files. Install Kernel aims to help people who are either new to installing the kernel or who want to use their time efficiently.

Every operating system has some type of kernel; the kernel is the core of the operating system. The current kernel version as of this writing is Linux 2.4.17. Most users either recompile or upgrade their kernels. One may choose to recompile his or her kernel in order to add support for a certain device attached to the computer. For instance, if one bought a Universal Serial Bus (USB) scanner, he or she would have to make the appropriate changes to the kernel configuration file, then recompile and install the new kernel. Reasons for upgrading the kernel may include a better virtual memory subsystem, or important security fixes. An example is Linux kernel version 2.4.11. This kernel was vulnerable to a symlink denial of service attack, prompting users running 2.4.11 to upgrade to 2.4.12 as soon as it became available. These are the fundamental reasons why one may want to either recompile or upgrade his or her kernel.

Install Kernel interfaces with the Linux operating system by running a series of functions, or groups of commands, that automate the compiling (or recompiling) and installation process. It consists of three groups of functions: checking dependencies, building the kernel and moving files, and editing the boot loader configuration file. Organizing the functions into these three groups makes maintaining and altering the script much easier. Install Kernel can also be considered a program, because a program performs checks and makes choices accordingly, whereas a script is usually just a file containing a list of commands with no logic. Therefore, while ik is technically a script, it can also be called a program.

Dependency checks make sure the current system configuration and settings are properly set up before proceeding with the kernel build. There are seven dependency checks: a root check, space check, link check, boot check, boot loader check, configuration check, and module check. First, the root check makes sure the user is the superuser, which means he or she can edit important system files accessible only to the root account. The space check makes sure there are at least 200 megabytes available. The kernel source these days is around 150 megabytes just for the source code, and compiling the kernel may add 50 megabytes or more. Therefore, ik checks for at least 200MB available in order to compile the kernel without running out of space. Next comes the link check: it is not required, but it is standard, to have a symbolic link /usr/src/linux pointing to /usr/src/linux-x.y.z. The fourth check makes sure the user has a /boot directory, which is where the Linux kernel files will be installed. The fifth check determines the boot loader that will be used. LILO and GRUB are the two most popular boot loaders for Linux. This check determines whether the kernel was booted from LILO or GRUB by checking which boot loader was used last. It then tells the rest of the script to edit the corresponding configuration file. The sixth check, the configuration check, makes sure the user has created a proper kernel configuration file, which is used in the process of building the Linux kernel. The final check is the module check: if modules are turned off, the script detects this and alters the installation process to install with no module support. The main idea behind the dependency checks is to make sure users cannot damage their systems if they do something wrong.
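
To give a feel for what such checks look like, here is a minimal Python sketch of the first four. It is not ik's code (ik is a shell script, and its internals are not shown in this article); the function names and the order of evaluation are my own assumptions, and only the 200 megabyte figure and the paths come from the description above.

```python
import os
import shutil

REQUIRED_MB = 200  # figure taken from the description above

def is_root(euid=None):
    """Root check: effective user id 0 means superuser."""
    if euid is None:
        euid = os.geteuid()
    return euid == 0

def has_space(path, required_mb=REQUIRED_MB):
    """Space check: enough free megabytes on the filesystem holding path."""
    if not os.path.isdir(path):
        return False
    free_mb = shutil.disk_usage(path).free // (1024 * 1024)
    return free_mb >= required_mb

def has_source_link(link="/usr/src/linux"):
    """Link check: a symlink that resolves to a directory."""
    return os.path.islink(link) and os.path.isdir(link)

def has_boot_dir(boot="/boot"):
    """Boot check: the directory the kernel will be copied into exists."""
    return os.path.isdir(boot)

def run_checks(euid=None, src_dir="/usr/src", boot="/boot", required_mb=REQUIRED_MB):
    """Return the names of the failed checks, in the order listed above."""
    checks = [
        ("root", is_root(euid)),
        ("space", has_space(src_dir, required_mb)),
        ("link", has_source_link(os.path.join(src_dir, "linux"))),
        ("boot", has_boot_dir(boot)),
    ]
    return [name for name, passed in checks if not passed]

if __name__ == "__main__":
    failed = run_checks()
    print("all checks passed" if not failed else "failed: " + ", ".join(failed))
```

Collecting every failure before stopping, rather than aborting at the first one, is a design choice of this sketch; it lets the user fix everything in one go.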

The installation process also contains seven functions. Done by hand, installing a kernel usually takes several commands. However, because of the differences that can occur in a user's configuration file, each part of the building process must be checked, and the process may need to be altered. The first function makes sure the dependencies are set up correctly for all files in the kernel source tree. The second function deletes stale object files and/or old kernel files. The third function is the kernel build function; it runs a command to build the Linux kernel. Functions four and five make and install modules if the user has specified module support in his or her kernel configuration file. The sixth function moves the Linux kernel and its System.map file to the boot partition. The last function of the build process sets up module dependencies for the new kernel if modules were defined. The installation process also includes a small error check for each part of the kernel build process. If any part of the kernel build process fails, the script aborts without modifying any boot loader configuration files. This is important, because if it did not abort, it might alter the boot loader configuration files and render the system unbootable. It is important to support every Linux configuration possible because of the wide use of this script.
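
The abort-on-first-failure behaviour is the part worth illustrating. The Python sketch below uses the conventional 2.4-era make targets as stand-ins; the article does not list the exact commands ik runs, so the step table is an assumption, and the copy and depmod steps are left out for brevity.

```python
import subprocess

# Conventional 2.4-era build commands, assumed here for illustration only.
BUILD_STEPS = [
    ("dependencies", ["make", "dep"]),
    ("clean", ["make", "clean"]),
    ("kernel", ["make", "bzImage"]),
    ("modules", ["make", "modules"]),
    ("modules_install", ["make", "modules_install"]),
]

def shell_runner(command, cwd="/usr/src/linux"):
    """Run one command in the kernel source tree and return its exit status."""
    return subprocess.run(command, cwd=cwd).returncode

def run_steps(steps, runner=shell_runner, use_modules=True):
    """Run the steps in order and stop at the first failure.

    Returns (names_of_completed_steps, name_of_failed_step_or_None).
    """
    completed = []
    for name, command in steps:
        if not use_modules and name.startswith("modules"):
            continue                 # module support is switched off
        if runner(command) != 0:
            return completed, name   # abort: nothing after this step is run
        completed.append(name)
    return completed, None
```

A caller would touch the boot loader configuration only if the second element of the result is None, which mirrors the safety rule described above. Passing the runner in as a parameter keeps the logic testable without actually building a kernel.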

The boot loader configuration and setup process is probably the most important aspect of installing a new kernel. An improper boot loader configuration may leave one with a system that does not boot, or that simply does not boot the new kernel. It is also important because some systems may have two or more boot loaders installed. There are four functions defined for this process. The first function picks up the boot loader that was identified during the dependency checks. The second function determines where the LILO or GRUB configuration files are located. Next, depending on which boot loader was found, either the LILO or the GRUB configuration file is edited automatically by sed. Sed is a stream editor, which edits a file with no user intervention. If user intervention were required, the user would have to be present between certain parts of the kernel installation. ik makes efficient use of a user's time because only one command needs to be entered to complete the entire installation and setup process.
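
As a rough illustration of unattended editing, the following Python sketch appends an image stanza to the text of a lilo.conf file. ik itself does this with sed, and the exact edits it makes are not shown in this article, so the function name, the stanza layout and the sample values are all assumptions made for the example.

```python
SAMPLE_CONF = """boot=/dev/hda
prompt
timeout=50
image=/boot/vmlinuz-2.4.16
\tlabel=linux
\tread-only
\troot=/dev/hda1
"""

def add_lilo_image(conf_text, image, label, root):
    """Append an image stanza unless one for the same kernel file already exists."""
    for line in conf_text.splitlines():
        if line.strip().replace(" ", "") == "image=" + image:
            return conf_text                  # already present: change nothing
    stanza = "image=%s\n\tlabel=%s\n\tread-only\n\troot=%s\n" % (image, label, root)
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + stanza

if __name__ == "__main__":
    print(add_lilo_image(SAMPLE_CONF, "/boot/vmlinuz-2.4.17", "linux-2.4.17", "/dev/hda1"))
```

The function works on a string and returns a string, so the old entry stays in place as a fallback and running it twice is harmless. With LILO, /sbin/lilo has to be rerun after the file changes; the sketch leaves that step out.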

Install Kernel is a useful tool for those who are new to Linux, rebuild their kernel often, or value their time. It reduces the commands for installing the kernel from about thirteen to one. Users new to Linux may find this attractive, because the entire process is automated, and if something is not correct, in most cases ik will tell the user what is wrong and how to fix it. Experienced users who do not wish to spend valuable time installing a new kernel will find it handy as well. Install Kernel is efficient in that it requires no user intervention and reduces the time spent on kernel installs, and effective in that it gives users new to Linux an easy way to upgrade their kernel.


Copyright © 2002, Justin Piszcz.
Copying license http://www.linuxgazette.com/copying.html
Published in Issue 75 of Linux Gazette, February 2002

"Linux Gazette...making Linux just a little more fun!"


Writing Documentation, Part III: DocBook/XML

By


To cite from ``DocBook -- The Definitive Guide'' (see Further Reading at the end of this section), DocBook provides a system for writing structured documents using SGML or XML. In the following, I shall focus on the XML-variant of DocBook, because the SGML-variant is being phased out.

DocBook has been developed with a slightly different mindset than the systems I discussed in the two previous articles (POD article, LaTeX/latex2html article).

  • ``Text'' in a DocBook document is better understood as ``textual data''. Along the same lines, a DocBook document is better thought of as a human-readable database.
  • DocBook, as a standard, prescribes how valid documents must be formed, not how the output produced from a DocBook document has to ``look''. I put quotes around ``look'', because DocBook documents are not restricted to being viewed on a screen, but can also be transformed into speech, for example in a car navigation system. (Imagine your SUV asking you: ``Do you want to install KDE version 3 now?'')
  • When transformed into any output format, DocBook documents are rigidly checked for conformance to a given structure. This structure is defined in so-called document type definitions, or DTDs for short.
    By changing the DTD, almost arbitrary constraints can be imposed on a DocBook document. For example, the organizing committee of a conference might adapt the DocBook DTD in such a way that all the articles in the conference's proceedings have a uniform look and all the necessary author information.

The particular features of DocBook mentioned above imply uses of DocBook documents that are not possible, at least not easily, with POD or LaTeX documents.

  • Because of their structure, DocBook documents are easily created, modified, or queried programmatically.

    For example, we load the XML::DOM module into Perl to access XML compliant documents, and Python ships with the xml.dom module, which has been designed for the same purpose.

    The World Wide Web Consortium (W3C, http://www.w3c.org) has even defined a language for XML transformations, called XSLT (see for example http://www.w3.org/TR/xslt and http://www.oasis-open.org/cover/xsl.html). XSLT stylesheets are themselves written in XML, which makes XML documents and XSL stylesheets look quite similar: loads of angle brackets.

  • Various tools transform DocBook sources into HTML, TeX, GNU Texinfo and many other -- including audio -- formats. This is again different to the source formats we looked at before, where only a single application does the transformation.

    Popular transformation tools are:

    • OpenJade (http://openjade.sourceforge.net/), which uses DSSSL (see for example http://www.jclark.com/dsssl/ and http://www.oasis-open.org/cover/dsssl.html), a Lisp-like language, to describe the transformations from XML-DocBook to HTML, TeX, and so on, and
    • Saxon (http://saxon.sourceforge.net/), which uses XSL to do the job.

    The installation of both tools, including the necessary DSSSL or XSL stylesheets, is quite tricky, so I recommend that beginners install them from .deb or .rpm packages.

Being general-purpose translators, neither tool is restricted to transforming DocBook documents. If you feed them the right style sheets, they will do other translations, too.
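
To illustrate the programmatic access mentioned in the list above, here is a minimal Python sketch that uses the standard xml.dom.minidom module to pull the id attributes and titles out of a DocBook fragment. The sample document and the helper names own_title and outline are made up for this example, and the DOCTYPE line is omitted to keep the fragment self-contained.

```python
from xml.dom import minidom

SAMPLE = """<book>
  <chapter id="chapter-introduction">
    <title>Introduction</title>
    <sect1 id="section-syntax"><title>Syntax</title><para>...</para></sect1>
  </chapter>
  <chapter id="chapter-commands">
    <title>Commands</title>
    <para>...</para>
  </chapter>
</book>"""

def own_title(element):
    """Text of the element's direct <title> child, or '' if it has none."""
    for child in element.childNodes:
        if child.nodeType == child.ELEMENT_NODE and child.tagName == "title":
            return "".join(node.data for node in child.childNodes
                           if node.nodeType == node.TEXT_NODE).strip()
    return ""

def outline(xml_text, tags=("chapter", "sect1")):
    """List (tag, id, title) for every element with one of the given tags."""
    document = minidom.parseString(xml_text)
    rows = []
    for tag in tags:
        for element in document.getElementsByTagName(tag):
            rows.append((tag, element.getAttribute("id"), own_title(element)))
    return rows

if __name__ == "__main__":
    for tag, ident, title in outline(SAMPLE):
        print("%-8s %-22s %s" % (tag, ident, title))
```

Because every chapter and section is guaranteed to carry its title as a direct child, a dozen lines suffice to build a table of contents; doing the same with a LaTeX source would mean parsing TeX.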

Syntax

The DocBook/XML syntax resembles HTML. The fundamental difference between the two is the strictness with which the syntax is enforced. Many HTML browsers are extremely forgiving about unterminated elements, and they often silently ignore unknown elements or attributes. DocBook/XML translators reject input that does not comply with the DTD, issuing detailed error messages and refusing to produce any output.

DocBook/XML is spoken in several variants, which differ in how they interpret the closing tag of an element. The most verbose dialect always closes <tag> with </tag>. Another variant allows for abbreviating the closing tag to </>, and yet another allows dropping the closing tag for empty elements altogether. I prefer writing out every end tag, a style that has proven advantageous in deeply nested structures such as nested lists. So, in this article only the form <tag> ... </tag> will appear.

Special characters are written with the ampersand-semicolon convention as they are in HTML. The most frequently used special characters are

  • Ampersand, ``&amp;''
  • Less-Than Sign, ``&lt;'' and
  • Greater-Than Sign, ``&gt;''.
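
If the text is generated by a program, these three replacements need not be done by hand. As a small sketch, Python's standard xml.sax.saxutils module provides escape and unescape for exactly this purpose; the sample string is made up.

```python
from xml.sax.saxutils import escape, unescape

RAW = "R&D: if a < b then b > a"

def to_markup(raw_text):
    """Replace &, < and > by their ampersand-semicolon forms."""
    return escape(raw_text)

def from_markup(markup_text):
    """Reverse the replacement."""
    return unescape(markup_text)

if __name__ == "__main__":
    print(to_markup(RAW))
```

The ampersand has to be replaced first, otherwise the ampersands introduced by the other two replacements would be escaped again; the library function takes care of that ordering.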

Comments are bracketed between ``<!--'' and ``-->''.

Document Structure

As already mentioned, DocBook documents must adhere to the structure that is defined in a DTD. Every document starts with selecting a particular DTD:

    <!DOCTYPE                                       (1)
     book                                           (2)
     PUBLIC "-//OASIS//DTD DocBook XML V4.1//EN"    (3)
     "/usr/share/sgml/db41xml/docbookx.dtd"         (4)
     [ ]                                            (5)
    >

where I have broken the expression (from ``<'' to ``>'') into several lines for easier analysis, and added numbers in parentheses for reference.

Part (1) tells the system that we are about to choose our DTD. Part (2) defines element book to be the root element of our document. Part (3), the public identifier, selects the DTD to use. The public identifier is the string in quotes. The system identifier, part (4), tells the translation tools where to find the DTD on the local computer system. Within the square brackets, part (5), we could place so-called entity definitions, but I do not want to go into detail on entities in this introduction, so we leave this space empty.

Now, we start the text with the root element, in our case book. Which elements go into book is defined in the DocBook DTD. These are, for example, bookinfo or chapter. For a comprehensive list of allowed elements, consult ``The Definitive Guide''. The elements allowed within bookinfo or chapter are also defined in the DocBook DTD, as are all elements. The only way to construct a valid document is to obey all the rules prescribed by the DTD.

What might look like a drag at first sight -- Rules? Rules suck! -- is the key to opening up the document to programmatic access. As the document complies with the DTD, all post-processing can rely on that very fact. Good for the programmers of the post-processors! I have to admit that the number of elements and the elements' mutual relationships are tough to pick up. However, the relations are logical: a chapter contains one or more (introductory) paragraphs and one or more Level 1 sections. No section, on the other hand, contains a chapter; that would be nonsense. Having a copy of ``The Definitive Guide'' right next to the keyboard also helps to learn DocBook. Further down, there is a short compilation of commonly used tags.
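
The flavour of such rules can be captured in a toy checker. The Python sketch below hard-codes a tiny, hand-written subset of nesting rules (the ALLOWED_CHILDREN table is my own simplification, not the DocBook DTD) and reports every element that sits where it should not. Real validation is the job of a validating parser working from the actual DTD.

```python
from xml.dom import minidom

# Toy subset of the nesting rules, written down by hand for illustration only.
ALLOWED_CHILDREN = {
    "book": {"bookinfo", "chapter"},
    "bookinfo": {"title"},
    "chapter": {"title", "para", "sect1"},
    "sect1": {"title", "para", "sect2"},
    "sect2": {"title", "para"},
    "title": set(),
    "para": set(),
}

def nesting_errors(xml_text):
    """Return one message per element that is not allowed inside its parent."""
    errors = []

    def visit(element):
        allowed = ALLOWED_CHILDREN.get(element.tagName, set())
        for child in element.childNodes:
            if child.nodeType != child.ELEMENT_NODE:
                continue             # skip text, comments and the like
            if child.tagName not in allowed:
                errors.append("<%s> not allowed inside <%s>"
                              % (child.tagName, element.tagName))
            visit(child)

    visit(minidom.parseString(xml_text).documentElement)
    return errors

GOOD = ("<book><chapter><title>T</title><para>p</para>"
        "<sect1><title>S</title><para>p</para></sect1></chapter></book>")
BAD = ("<book><chapter><title>T</title><sect1><title>S</title>"
       "<chapter><title>X</title></chapter></sect1></chapter></book>")

if __name__ == "__main__":
    print(nesting_errors(GOOD))
    print(nesting_errors(BAD))
```

The second document puts a chapter inside a section, which is precisely the nonsense case mentioned above, and the checker flags it.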

Here comes a very short, but complete DocBook document.

    <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1//EN"
                          "/usr/share/sgml/db41xml/docbookx.dtd" []>
    <book>
        <bookinfo>
            <title>XYZ (version 0.8.15) User's Manual</title>
        </bookinfo>
        <chapter id = "chapter-introduction">
            <title>Introduction</title>
            <para>
                This chapter provides a quick introduction to XYZ.
            </para>
            <sect1 id = "section-syntax">
                <title>Syntax</title>
                <para>
                    In this section we present an outline of the
                    syntax of the XYZ language.
                </para>
            </sect1>
            <sect1 id = "section-core-library">
                <title>Core Library</title>
                <para>
                    Even if no additional libraries are loaded to a
                    XYZ program, it has access to some core library
                    functions.
                </para>
            </sect1>
        </chapter>
        <chapter id = "chapter-commands">
            <title>Commands</title>
            <sect1 id = "section-interactive-commands">
                <title>Interactive Commands</title>
                <para>
                    ...
                </para>
                <sect2 id = "section-interactive-commands-argumentless">
                    <title>Argumentless Commands</title>
                    <para>
                        ...
                    </para>
                </sect2>
            </sect1>
            <sect1 id = "section-non-interactive-commands">
                <title>Non-Interactive Commands</title>
                <para>
                    ...
                </para>
                <sect2 id = "section-non-interactive-commands-argumentless">
                    <title>Argumentless Commands</title>
                    <para>
                        ...
                    </para>
                </sect2>
            </sect1>
        </chapter>
    </book>

Useful Tags

To help the aspiring DocBook writer make sense of the loads of elements the DocBook standard defines, I have compiled a bunch of useful tags that are used often.

Root Section Tags

Root section tags define the outermost element of any document.

book
<book>
  paragraphs or chapters

</book>

article
<article>
  paragraphs or level 1 sections

</article>

Sectioning Tags

Sectioning elements divide the document into logical parts like chapters, sections, paragraphs, and so on.

chapter, sect1, ..., sect6
<chapter id = "label">

title

followed by

paragraphs or level N+1 sections

</chapter>

Define a section. Commonly, chapter and section elements carry the id attribute, which allows for referencing the elements with, for example, <xref linkend = "label"></xref>.

para
<para>

paragraph text

</para>

Group several lines of text together to form a paragraph. This is the workhorse element in many documents.

programlisting
<programlisting role = "language">

program text

</programlisting>

Render a longish piece of program text -- preserving the line breaks. The program is assumed to be written in the language specified in the role attribute. Note that within programlisting all special characters retain their meaning!

This means in particular that you cannot use the control characters ``<'', ``>'', and ``&'' inside of it. There are several workarounds for this problem. Either you replace all control characters with their mnemonic equivalents (``&lt;'', ``&gt;'', and ``&amp;'' in our example), or you wrap the program code in a CDATA section, like, for example,

    <programlisting>
        <![CDATA[
            cout << "value = <" << &p << ">\n";
        ]]>
    </programlisting>

or, if the program is stored in file my-program.pl, pull in the whole file with


    <programlisting>
        <inlinemediaobject>
            <imageobject>
                <imagedata format = "linespecific"
                           fileref = "my-program.pl"></imagedata>
            </imageobject>
        </inlinemediaobject>
    </programlisting>
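
The first two workarounds are equivalent as far as the parsed text is concerned, which can be checked with a few lines of Python. This is only a sketch using the standard xml.dom.minidom and xml.sax.saxutils modules; the helper names are made up, and the C++ line is the one from the CDATA example above.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape

def listing_escaped(code):
    """Workaround 1: replace &, < and > by their mnemonic equivalents."""
    return "<programlisting>" + escape(code) + "</programlisting>"

def listing_cdata(code):
    """Workaround 2: wrap the code in a CDATA section."""
    if "]]>" in code:
        raise ValueError("code contains the CDATA terminator ]]>")
    return "<programlisting><![CDATA[" + code + "]]></programlisting>"

def listing_text(xml_text):
    """Parse the element and return the character data a translator would see."""
    root = minidom.parseString(xml_text).documentElement
    return "".join(node.data for node in root.childNodes)

CODE = 'cout << "value = <" << &p << ">\\n";'

if __name__ == "__main__":
    print(listing_escaped(CODE))
    print(listing_cdata(CODE))
    print(listing_text(listing_escaped(CODE)) == listing_text(listing_cdata(CODE)))
```

The CDATA form keeps the source readable in the XML file, at the price of one restriction: the code itself must not contain the terminator sequence, which the sketch guards against.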

List-Making Tags

Generate the three typical types of lists.

The items or definitions are typically formed by one or more paragraphs, but they are allowed to contain program listings, too. The terms usually are one or more words, not paragraphs.

  • Itemized List

    <itemizedlist>

        <listitem>

            first item

        </listitem>

        <listitem>

            second item

        </listitem>

        ...

    </itemizedlist>

  • Enumerated List

    <orderedlist>

        <listitem>

            first item

        </listitem>

        <listitem>

            second item

        </listitem>

        ...

    </orderedlist>

  • Description List

    <variablelist>

        <varlistentry>

            <term>first term</term>

            <listitem>

                 first definition

            </listitem>

        </varlistentry>

        <varlistentry>

            <term>second term</term>

            <listitem>

                 second definition

            </listitem>

        </varlistentry>

        ...

    </variablelist>

Inline Markup Tags

emphasis
<emphasis>text to be emphasized</emphasis>

Highlight a short part of the document; usually a single word.

filename
<filename>filename or directory name</filename>

Mark a word as a filename or directory name.

literal
<literal>literal something</literal>

<literal role = "classification">literal something</literal>

Mark a word as being a literal expression. Use this tag only as a last resort, if no other, more specific tag matches. To calm one's bad conscience, literal often gets decorated with a role attribute, which describes more precisely the kind of literal.

replaceable
<replaceable>placeholder name</replaceable>

Mark a meta-variable.

title
<title>title</title>

Give a name to a section or a formal element, like a table.

Cross References

Cross references refer to other parts of the same DocBook document or to other documents on the World Wide Web. Targets of the former are all elements that carry an id attribute; targets of the latter are selected with uniform resource locators (URLs).

link
<link linkend = "target">item</link>

Install a (hyper-)link to the spot identified via target within the current document.

ulink
<ulink url = "complete URL">item</ulink>

Install a hyper-link to a WWW-accessible document identified by a complete URL. A complete URL includes the protocol, for example, http://.

xref
<xref linkend = "target"></xref>

Install a (hyper-)link to the spot identified via target within the current document. A translator will add text around an xref element. For example, an xref to a section might be decorated with the text ``see section''.
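
Because link and xref only work when their linkend matches some id, it can be handy to check a document for dangling references. A validating parser reports the same problem, since the DTD declares linkend as an IDREF; the following Python sketch merely illustrates the idea. The function name dangling_links and the sample document are invented for this example.

```python
import xml.etree.ElementTree as ET


def dangling_links(docbook_text):
    """Return linkend values that match no id attribute in the document."""
    root = ET.fromstring(docbook_text)
    # Every element carrying an id attribute is a possible target.
    ids = {elem.get("id") for elem in root.iter() if elem.get("id")}
    # Collect the targets that link and xref elements point to.
    targets = [elem.get("linkend") for elem in root.iter()
               if elem.tag in ("link", "xref") and elem.get("linkend")]
    return [target for target in targets if target not in ids]


# Hypothetical sample: "lists" exists as an id, "tables" does not.
SAMPLE = """<article>
  <section id = "intro"><title>Introduction</title>
    <para>See <xref linkend = "lists"></xref> and
      <link linkend = "tables">the tables</link>.</para>
  </section>
  <section id = "lists"><title>Lists</title><para>...</para></section>
</article>"""

if __name__ == "__main__":
    print(dangling_links(SAMPLE))
```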

What I Have Left Out

Ugh, I left out tons of stuff, but only to give you a smooth, non-frightening introduction. Some great things DocBook handles that I have not discussed are

  • Tables,
  • Graphics (with automatic selection of the ``appropriate'' format), and
  • Automated index generation.

Also left out is everything related to changing the DTD or changing the style sheets.

Pros and Cons

Pros
  • DocBook is an open standard, maintained by OASIS
  • Access to text via (user-defined) programs
  • Texts carry rich markup
Cons
  • Slow transformation
  • The DocBook format is very verbose. Unless the writer uses a special editor, a lot of typing is required.

Further Reading

  • Norman Walsh and Leonard Muellner, DocBook: The Definitive Guide, O'Reilly & Associates, first edition, ISBN 1-56592-580-7. It is also available online (as second edition).
  • DocBook website
  • Norman Walsh's (chairman of the DocBook steering committee) website
  • DocBook Steering Committee

Next month: Texinfo

Christoph Spiel

Chris runs an Open Source Software consulting company in Upper Bavaria, Germany. Despite being trained as a physicist -- he holds a PhD in physics from Munich University of Technology -- his main interests revolve around numerics, heterogeneous programming environments, and software engineering.


Copyright © 2002, Christoph Spiel.
Copying license http://www.linuxgazette.com/copying.html
Published in Issue 75 of Linux Gazette, February 2002

"Linux Gazette...making Linux just a little more fun!"


The Adventures of Little Linus In GNU/Wonderland

By
Originally published at System Toolbox. Reprinted with permission.


In Which Little Linus Finds GNU/Wonderland

It was a sunny afternoon, and Linus was happily playing in his backyard. He was busy with all the things that little Linuses do on sunny days in their backyards. He was sitting in the shade of a large tree when he noticed something very out of place in a Linus's backyard. Waddling across the yard was a penguin! Every few yards, this penguin would pull out a Compaq Itsy, consult it, put it back in his pocket and say, "I'm late, I'm late, I'm late for my release date!"

Little Linus had never seen a penguin this close before. He had also never seen an Itsy. And he was rather sure that penguins shouldn't be talking or consulting Itsys. So as any curious Linus would do, he followed the penguin. No matter how quickly Linus walked, the penguin seemed to be the same distance away. The penguin didn't waddle any faster, Linus just couldn't seem to get any closer.

Suddenly, the penguin stopped at the very tree Linus had been sitting under. "Ah, here's what I was looking for... root access!" the penguin muttered. Then he popped into a small hole in one of the roots of the tree.

Linus decided to follow. He squeezed into the hole, and suddenly realized that he was falling. Everything below him was dark, so he couldn't see the bottom. He continued to fall wondering what was next. He began to look at his surroundings and noticed that there was a brick wall on one side of the hole. As he looked closer he could make out a set of eyes in the wall, falling at the same speed as he was. One of the eyes winked at him. Linus was slightly startled, but remembered his manners. "Hello, umm, Mr. Wall," Linus began cautiously, not quite sure how one should address walls.

To his surprise, a nose, mustache and mouth formed below the eyes and the entire face continued to slide down the wall at the same speed as Linus. "Hello young man! How are you this fine day?" the wall asked Linus.

"Well", Linus replied, "I'd feel much better if I knew how to stop falling."

"Ah", the wall nodded sagely, "Usually, one stops when they hit the bottom. But, as the camel says, 'there's more than one way to do it'."

Linus didn't quite understand the bit about the camel. However, he was sure that hitting the ground wasn't the best way to stop. He looked at the wall. "Ummm, I'd really rather stop in a way that didn't hurt me..."

The wall looked at him a bit then said, "Well, I suppose I can ask the camel to catch you." The face disappeared.

Linus continued to fall and realized that he hadn't looked down for a while. Indeed, it seemed that there was a light coming up from below. As he looked down, he saw the ground about thirty feet away. There, directly under him, stood a camel. Before he knew it, he had landed quite softly and safely between the camel's humps. The camel turned and smiled at him, flashing his perly white teeth.

The wall spoke again. "The camel will help you get started here. He's quite user friendly." Then the face was gone again.

Linus looked at the camel, then remembered why he was here to begin with. "I was following a penguin, but I seem to have lost his trail." The camel nodded and began walking towards a nearby wood.

In Which Linus Meets Several Strange Inhabitants of GNU/Wonderland

As they approached the wood, Linus noticed a taco walking up the road towards him. The taco appeared to be carrying several newspapers under his arm. "News for Nerds!" he was calling, "Get your News for Nerds here."

Linus stopped the camel and walked over to get a newspaper. However, before he could reach the taco, he heard a loud noise. Several thousand creatures, boys, girls, rabbits, unicorns, trolls and all other sorts of animals came rushing toward the taco. They all hit the taco at once, grabbing for the newspapers. Linus watched as wave after wave of things rushed across the poor taco. Then as suddenly as they had come, they were gone. Linus ran over to the taco, "Are you hurt?" He asked with concern.

"Not bad, at least this time no one dumped any breakfast cereal on me," the taco replied getting up and brushing himself off. [1]

Linus thought about querying further on the subject of breakfast cereal, however, he decided to skip it. After making sure the taco was OK he climbed back on the camel and set off again.

He had not traveled far when he heard a strange noise in the forest beside the path. "Perhaps it is a bear," he thought. However, before he could urge the camel to pick up the pace, a man stepped out of the woods onto the path. He was an odd-looking man, with hair that pointed anywhere except where hair usually points. Linus figured the man must have forgotten he owned a beard, since it looked like the beard had wandered off on its own quite a while ago.

"Hullo, boy!" the man waved at Linus. "I am GNUman. Who are you?"

"My name is Linus, and it's nice to meet you, Neuman." Linus got down to shake the man's hand.

"Not Neuman, it's GNUman. Say it right!" The man said loudly.

Linus looked at the man carefully, then deciding he wasn't dangerous, shook his hand and said, "It's nice to meet you GNUman."

"Well, of course I'm more than happy to meet anyone around these parts. By the way, here's the rules to my game," GNUman said solemnly, handing Linus a scroll. "The rules are, that anyone can change the rules, as long as they tell everyone what rules they changed. That way everyone can make the rules fit their needs."

Linus wasn't quite sure what GNUman was talking about. However, he politely took the scroll and promised to read it. GNUman smiled and wandered off into the woods.

After a few hours of riding around on the camel, Linus noticed party sounds emanating from a nearby clearing. The camel noticed his interest and moved in that direction.

As they broke into the clearing there was an amazing sight. A long table set with coffee, doughnuts, pizza, as well as Chinese, Indian, and Mexican food. At one end was a keg of Guinness. At the head of the table was a man with a bushy black beard, long black hair, sunglasses and a red fedora. He motioned Linus over to a chair.

"I've been waitin' a bloody long time on you," the man said with a British accent. "Do you know how hard it is to keep all this food hot?"

Linus, beginning to get used to the odd people of this land, smiled and apologized for taking so long. Of course he had no idea that he was even expected, let alone late.

"Oh, not to worry," the English fellow said in a nicer tone, "I'm sure you were busy."

They began to eat, and Linus was amazed at the energy that this special food gave him. After eating in silence for a while, he noticed that other creatures were sitting at the table enjoying the food as well. Oddly, he hadn't seen any of them sit down. Indeed, the large dog sitting next to him had appeared from nowhere. Linus had seen many canines before, but this was the first dog that he had seen with a big white beard.

The dog noticed Linus and flashed him a very big smile. He paused to wipe some white foam from his mouth and began eating again. Linus was a bit concerned that the dog may be 'mad'... Excusing himself, he got up to leave.

The Englishman at the head of the table motioned for him. "You can't leave yet," he exclaimed, "You have to do what you came for."

Linus had no idea what the man was talking about, so he waited patiently while the Englishman fiddled around in a big black box.

"Ah here it is," the man said, pulling out a single kernel of corn. "We need your blessing on this... ummm, here!"

With that the man handed Linus the piece of corn, and a crystal container filled with a yellow liquid. The bottle was labeled "Warning, contains hP2."

Linus stood there for a minute. Everyone at the table had stopped eating and was watching him closely. He opened the stopper and sprinkled some of the 'hP2' on the corn. Everyone cheered, and the kernel began to shake and jump. It bounced out of Linus' hand and fell onto the ground. It began to sprout and grow; a huge green plant came out of it and grew and grew. All the while the diners at the table were laughing and saying things like "Now that's scalability" or "Look at that, 40 feet high and still standing... How stable can you get!!"

Linus began to worry that he was expected to do something. But, before he could figure it out, the Dog that had sat next to him was again beside him.

"Well, what are you waiting for?" the dog asked. "You should already be climbing it."

"Ummm, why would I climb it?" Linus asked.

"No time for questions, I'll meet you up there," the dog replied, and promptly disappeared. The only thing left was the bushy, white beard which slowly faded.

Linus And The Cornstalk

Linus had been climbing the cornstalk for what seemed like hours when he finally found himself at the top. There before him was a giant building with a sign outside that read "Warning, Home of The RedMond Giant... all trespassers will be 'Embraced and Extended'."

Linus wasn't sure what that meant, but it didn't sound like something he wanted to have done to him. He began to look around, when he noticed fading into existence, a white bushy beard. Following the beard was the rest of the Dog, which he had seen down below.

"Hey again!" the dog said, smiling, "I see you made it."

"Yes, though I have no idea why you wanted me to climb up here. I really don't want to be embraced and extended by a giant."

"Oh, it's OK, you have GNUman's rules, don't you? They're the only magic strong enough to defeat the giant."

Linus pulled out the scroll and looked at it carefully. "It doesn't look very magical to me," he said.

The dog smiled and began walking to the castle. Linus shrugged and followed him. As they got closer, he began to hear a loud voice singing, "Biddle, Bidele, Boddle, Bandard, I smell the smell of an Open Standard. Be it old or be it new, I'll make it part of my proprietary brew!"

Linus stopped. The voice was very loud, and a voice that loud had to come from a mouth that was very big. However, the dog continued to trot toward the castle without a moment's pause, so Linus followed. Finally he reached the formidable gates of the building. "There's no way in," Linus said, relieved. "There is an awful lot of security around this place."

The dog laughed, "The only thing worse than the giant's silly rhyme is his security! Trust me, there are many, many ways to get past it."

Sure enough, with just a slight bit of poking, a whole section of the fencing fell apart, leaving a gaping hole. The dog led Linus into the compound. As they walked across the yard toward the front door, several security people rushed to the point where they had broken in. One of them, apparently the leader, stood up on a podium and began to speak loudly.

"This is only a theoretical way of breaking into the giant's compound. Anyone who is concerned about this is just being paranoid. Besides, only bad people would break into the compound, and we all know that bad people are stupid. So they wouldn't know about this hole."

As he spoke, several kiddies began knocking holes in other parts of the fence, following the example of Linus and the dog. The security people ignored them.

"Furthermore, there is very little likelihood that anyone will be able to duplicate this hole. In fact, if this fence were upgraded to version 2.000 then we wouldn't need to be concerned at all."

Immediately, all the other workers began putting up the next version of the fence. It looked bigger and stronger than the earlier fence. Linus looked at the dog. "It will be hard to get back out."

"Nonsense, I told you their security is hopeless. This new fence will likely be even worse than the first."

So Linus and the dog continued into the building, completely unnoticed by the security people. Within a few moments they were inside the building. The dog looked at Linus. "Ok, open the scroll and read the magic words of GNUman," he whispered.

Linus opened the scroll and read, "The GNU General Public License, Preamble..."

Linus read and read and read. Finally, as he reached the end of the very long magic incantation, he heard a noise. He looked up from the scroll, and saw huge cracks forming in the walls and ceiling. The building began to shake and shudder. The dog looked at Linus and said, "Let's get out of here. You've done what you came to do!"

They ran to the door and into the courtyard. Behind them they could hear the giant bellowing for his people to fix the holes and cracks, but it was too late: the home of the RedMond Giant was collapsing. Linus and the dog reached the brand-new fence and, to Linus' surprise, found that the entire fence was made of Swiss cheese. They climbed through the holes in the fence and ran for the cornstalk.

The dog began to fade. He looked around at Linus. "Thank you so much... we all thank you. Have a nice life..."

"Wait," Linus shouted, "What am I supposed to do now?"

The dog was gone again, except for the beard. "Just get to the cornstalk. That new kernel will take care of you."

Linus reached the cornstalk and began climbing down as fast as he could, but he lost his footing and before he knew it he had begun to fall. The ground was getting closer and closer, and suddenly he found himself lying on his back on the ground. He blinked his eyes and looked up at the cornstalk. He rubbed his eyes and looked again. It wasn't a cornstalk at all. It was the old tree in his back yard!

Linus got up, rubbed his eyes and walked toward the house. Once inside, he noticed a package sitting on the table. There was a card that read "To Our Dear Son". He opened the package, and to his delight there was a brand new 386 computer, just for him.

The End (or is it?)

Footnotes

[1] The author doesn't condone the abuse of any forum by trolls. This includes comments about hot grits. However, this small joke just couldn't be resisted.

D Clyde Williamson

Clyde is a network security specialist for a large corporation in the US. He writes articles on Technology, Open Source Advocacy and History (pre-1600). After writing the above article, he lives in perpetual fear of Lewis Carroll's ghost seeking revenge.


Copyright © 2002, D Clyde Williamson.
Copying license http://www.linuxgazette.com/copying.html
Published in Issue 75 of Linux Gazette, February 2002


Modest home on the web

By


Introduction

We will build a small homepage site without server-side scripts. This is suitable for people who do not run their own web servers or have no privilege to use server-side facilities. We will use JavaScript and Lex to simulate some effects of template files and ease maintenance. We will use a Makefile to automate uploading, and CSS to provide fancy formatting. We will use only standard HTML in our main content files, which gives any browser a good chance of displaying our web site properly.

The weird choice of using Lex to get a template effect is because I want to pretend that I am a guru. And gurus often use complicated or even brain-damaged tools to accomplish simple and sometimes stupid tasks. Of course, if I were a true guru, I'd rather write a similar tool myself from scratch in LISP or C. But since I am only pretending to be one, forget about it.

HTML and CSS

There is a wonderful Debian package that provides great documentation on standard CSS and HTML practice: the wdg-html-reference package. If you are serious about HTML 4 and CSS, you'd better apt-get that package and read the documents in it. They're easy to follow. Remember one thing, though: a good understanding of CSS does NOT mean that you should use every possible effect on your homepage. Good taste is more important than good technique. At the end of this article I present some example files; you could keep them handy while reading through.

I will not duplicate that excellent documentation on HTML and CSS here, and there are many more high-quality documents out on the web and in the bookstore. Even better, you could use your browser's "View Source" menu item to peek into every webpage that you'd like to learn from. I will give you one piece of advice, though: keep it simple. Keep your homepage simple unless you have a big team of webmasters and webmistresses working for you, or you have lots and lots of free time to work on your homepage.

Simple does not necessarily mean ugly; sometimes simple is considered beautiful, especially now that CSS is available to nearly everyone. So your best practice (pretending that I am an expert, heh) is to use standard HTML in your content files, and to use the HTML tags as logically as you can.

For example, you may want to use <i> to emphasize a sentence or a word. DON'T; use <em> instead. Then use CSS to provide the desired effect. That's the whole point. And don't forget to appreciate the Mozilla web browser, which is nearly the most standards-compliant one out there. (Hint: use it to test your webpages!)

Using JavaScript

Why use JavaScript? Since we are only building a modest homepage, we won't need its fancy features, not to mention those annoying pop-ups. The reason we are using JavaScript is that it can give us some template-like features, which ease the task of maintaining a big bunch of webpages. A modest homepage does NOT mean that we cannot put many files there. ;)

For example, if we want to present a navigation menu on our webpages, we would have to copy and paste our menu paragraph in HTML into every content file (as mentioned above, we do not have enough privileges to use any server-side facilities). And what if we want to change the style used for our menu? It would be a nightmare to adjust each webpage for that.

Instead we could write our menu in a JavaScript file, and include the following in each of our webpages:

<script type="text/javascript" src="header.js" charset="iso-8859-1"> </script>

When we want to add an item to our menu, we only need to change the header.js file, then voilà, every webpage is changed accordingly.

The syntax of JavaScript is very easy to learn; by reading some examples, you can get nearly the whole idea. Since we are using JavaScript to present navigation menus, we could even ease the task of generating menus by hand too. Go check out the example header.js at the end of this article.
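
To make the idea concrete, here is a small sketch of how a header.js could be generated from a list of menu entries rather than typed by hand. It is written in Python and is not the author's header.js; it assumes the menu is emitted with document.write, and the entries in MENU, the "menu" class name, and the function name render_header_js are all invented for illustration.

```python
import html
import json

# Hypothetical menu entries: (label, target page).
MENU = [("Home", "index.html"), ("Diary", "diary.html")]


def render_header_js(items):
    """Return the text of a header.js that writes a navigation menu."""
    links = " | ".join(
        '<a href="%s">%s</a>' % (html.escape(url), html.escape(label))
        for label, url in items
    )
    markup = '<p class="menu">%s</p>' % links
    # json.dumps() yields a quoted string literal that JavaScript accepts.
    return "document.write(%s);\n" % json.dumps(markup)


if __name__ == "__main__":
    print(render_header_js(MENU), end="")
```

Redirecting the output into header.js and uploading that one file updates the menu on every page that includes it.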

Using Lex

Lex is provided in Debian by the flex package, a free implementation of Lex. What lex does is scan the input file and, whenever a regular expression is matched, execute some C code. So we can use it to scan our templates and generate the HTML files. Lex could turn your dull project of maintaining a stupid personal homepage into an exciting C programming journey. Isn't it wonderful?

Lex is a scanner generator, which means we use lex to generate our scanner, then use our scanner to scan our template files and generate HTML files. How does lex generate a scanner? It does this by reading a rules file written by us. Basically, we design a set of rules and use these rules in our content files. Then we write a rules file for Lex, use lex to read our rules file and generate a scanner, and use the scanner to scan our content files to get the desired HTML files. And it's very simple! Gurus R Us!

What makes a rule? A rule is made of two parts. The first part is a regular expression (regex) similar to those you find in perl or egrep. The second part is a small piece of C code. Whenever the regex matches, the C code is executed. The following is a sample rule from our example rules file:

\"header\" {
  /* flag_lex, flag_key and current_key are set by other rules (not shown). */
  if (flag_lex == 1 && flag_key == 1 && current_key == HERE)
    {
      /* Emit the page header, filling in the values collected so far. */
      fprintf(yyout,
              "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0//EN\" "
              "\"http://www.w3.org/TR/REC-html40/strict.dtd\">"
              "<html><head><title>{zhaoway} %s</title>"
              "<link rel=\"stylesheet\" href=\"style.css\" type=\"text/css\">"
              "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">"
              "<meta name=\"description\" content=%s>"
              "<meta name=\"keywords\" content=%s></head><body>"
              "<script type=\"text/javascript\" src=\"header.js\" charset=\"iso-8859-1\">"
              "</script>\n",
              keys[TITLE], keys[DESCRIPTION], keys[KEYWORDS]
              );

      flag_key = 0;
    }
  /* Anywhere else, copy the matched text to the output unchanged. */
  else ECHO;
}

The above code means that when "header" appears in the input file and some conditions are satisfied, we replace it with a big bunch of HTML code. The corresponding example content file is as follows:

<lex title="home page" description="zhaoway's homepage." />
<lex keywords="zhaoway, personal, homepage, diary, curriculum, vitae, resume" />
<lex here="header" />
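
For readers who want to see the substitution without building a scanner, the same idea can be sketched in a few lines of Python using regular expressions. This is not the Lex rules file, only an illustration of what it does to the content file: the header template is shortened, and the names HEADER, LEX_TAG, ATTRIBUTE and expand are invented here.

```python
import re

# Hypothetical, shortened header template. Doubled braces are literal braces.
HEADER = ('<html><head><title>{{zhaoway}} {title}</title>'
          '<meta name="description" content="{description}">'
          '<meta name="keywords" content="{keywords}"></head><body>')

LEX_TAG = re.compile(r'<lex\s+(.*?)\s*/>')
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def expand(template_text):
    """Drop <lex .../> tags, remembering their values; emit the header on here="header"."""
    keys = {"title": "", "description": "", "keywords": ""}

    def replace(match):
        output = ""
        for name, value in ATTRIBUTE.findall(match.group(1)):
            if name == "here" and value == "header":
                output = HEADER.format(**keys)
            else:
                keys[name] = value  # remember the value for later use
        return output

    # Matches are handled left to right, so values are known before the header.
    return LEX_TAG.sub(replace, template_text)


if __name__ == "__main__":
    print(expand('<lex title="home page" description="zhaoway\'s homepage." />\n'
                 '<lex keywords="zhaoway, personal" />\n'
                 '<lex here="header" />\n'
                 '<p>Hello</p>'))
```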

Making the upload

When uploading, deciding which files on the server need to be updated is difficult, and that task should really be automated. So we use Make to do it. The basic idea is to touch a blank some.html.upload file whenever some.html is uploaded. When some.html is newer than some.html.upload, it needs to be uploaded to the server again. The following Makefile rule shows that:

%.upload: %
        lftp -c "open -u \"$(USER),$(PASS)\" $(SITE); put $<"
        touch $@
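
The timestamp comparison that Make performs for this rule can be spelled out explicitly. The following Python sketch is only an illustration of that logic, not a replacement for the Makefile; the function names needs_upload and mark_uploaded are invented, and the actual transfer (lftp in the rule above) is left out.

```python
import os


def needs_upload(path):
    """Return True if path is newer than its .upload stamp, or has no stamp yet."""
    stamp = path + ".upload"
    if not os.path.exists(stamp):
        return True
    return os.path.getmtime(path) > os.path.getmtime(stamp)


def mark_uploaded(path):
    """Do what 'touch some.html.upload' does after a successful upload."""
    stamp = path + ".upload"
    with open(stamp, "a"):
        pass                 # create the empty stamp file if it is missing
    os.utime(stamp, None)    # set its modification time to now
```

Make does the same check for every file named on its command line, which is why running make some.html.upload twice in a row uploads only once.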

Conclusion

Make and Lex each warrant lengthy articles of their own. They are very traditional Unix tools for C development, but they can be very useful in maintaining webpages too. We cannot explore their details here. This article is just meant to spark your imagination about these traditional Unix tools.

Example files available

You could visit my homepage to see the resulting effects. Have fun and good luck!

zhaoway

zhaoway lives in Nanjing, China. He divides his time among his beautiful girlfriend, his old Pentium computer, and pure mathematics. He wants to marry now, which means he needs money, i.e., a job. Feel free to help him come into the sweet cage of marriage by providing him a job opportunity. He would be very thankful! His curriculum vitae is at his homepage. He is also a volunteer member of the Debian GNU/Linux project.


Copyright © 2002, zhaoway.
Copying license http://www.linuxgazette.com/copying.html



The Back Page

The Back Page is short this month because Your Editor is simultaneously (A) working on another project, (B) going on vacation, and (C) preparing a talk for the Python conference, all at the same time. I'll be writing an article about the conference for Linux Journal, presenting a paper called Cheetah: the Python-Powered Template Engine, and leading a BOF (Birds of a Feather) discussion on Cheetah, a project I'm a volunteer developer for.

There will be more Esperanto announcements next month. Meanwhile, baibilu pri Linukson on a new mailing list, linux-esperanto (http://www.ssc.com/mailman/listinfo/linux-esperanto/). (If you missed the Esperanto grammar discussion from January's LG, here it is.)


Happy Linuxing!

Mike ("Iron") Orr
Editor, Linux Gazette,


Copyright © 2002, the Editors of Linux Gazette.
Copying license http://www.linuxgazette.com/copying.html
Published in Issue 75 of Linux Gazette, February 2002