by Egon Willighagen

About the author: Joined the Dutch LF team in 1999 and became second editor earlier this year. Is an informational chemistry student at the University of Nijmegen. Plays basketball and enjoys hiking.
Abstract:
This article describes how you can use DocBook to produce PDF documents. It covers the tools you need to edit DocBook articles and the tools that translate them into PDF documents. Since it only names the software you need and does not explain how to install it, this article is intended for experienced Linux users.
The first part of this article focuses on the format of DocBook documents. Once DocBook has been introduced, I will explain which tools are needed to convert these DocBook documents to PDF documents that can be viewed with Acrobat.
DocBook [1] is an SGML application developed to mark up documents, just as HTML marks up web documents. In contrast to HTML, DocBook carries no information about the layout of the document. That is why DocBook documents need to be converted to other formats before they can be viewed. The conversion is done by tools that apply a stylesheet to the DocBook document.
Later in this article I explain which stylesheet you must use for this conversion and which tool applies the stylesheet to the DocBook document. First we are going to see how documents are put together.
DocBook can mark up two kinds of documents: articles and books. Since they are in principle the same, I will use the article markup as an example. Before I give an example of a simple article document, first some basic principles about DocBook.
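DocBook is in principle an SGML application, just like HTML. But there is also an XML version of DocBook. The XML version is stricter, but easier to read and therefore easier to learn. Since XML itself is a restricted form of SGML, all SGML tools can still be used. The main differences between the SGML and XML variants are the following (and these hold for every XML application): element and attribute names are case sensitive (DocBook XML writes them in lower case), every element must be closed explicitly, empty elements are written with a trailing slash as in <imagedata fileref="some_picture.gif"/>, and attribute values must always be quoted.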
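As a small illustration of that strictness, the fragment below is a sketch of markup that satisfies the general XML well-formedness rules: lower-case names, explicit end tags, a self-closing empty element and quoted attribute values. The file name and the paragraph text are placeholders, and the fragment is meant to sit inside a section of an article rather than to be processed on its own.

```xml
<!-- Fragment only: belongs inside a <section> of an article. -->
<para>
  Element names are written in lower case, and every start tag
  has a matching end tag.
</para>
<mediaobject>
  <imageobject>
    <!-- Empty element: closed with "/>"; the attribute value is quoted. -->
    <imagedata fileref="some_picture.gif"/>
  </imageobject>
</mediaobject>
```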
Now that we have covered these important formalities, we can start writing articles in DocBook.
<?xml version="1.0"?>
<article>
  <title>Writing DocBook articles</title>
  <artheader>
    <abstract>
      <para>
        This article describes how you can use DocBook to develop PDF
        documents and will cover tools you need to edit DocBook articles
        and tools to translate them to PDF documents.
      </para>
    </abstract>
    <author>
      <firstname>Egon</firstname>
      <surname>Willighagen</surname>
    </author>
    <date></date>
  </artheader>
</article>
Not that difficult I would say. We have started an article with a title, a short abstract, a date on which it was written and the name of the author.
The next step is to add sections to the article by making use of section elements:
<?xml version="1.0"?>
<article>
  <title>Writing DocBook articles</title>
  <artheader>
    ... the article's header ...
  </artheader>
  <section>
    <title>Introduction</title>
  </section>
  ... other sections ...
</article>
We have now added an Introduction section to the article. Additional section elements can be used to give Results, Conclusion or any other section.
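To make this concrete, here is a minimal sketch of an article body with several sections. It assumes the usual DocBook rule that a section may contain further section elements, which gives subsections. The section titles other than Introduction, Results and Conclusion, and all paragraph texts, are placeholders invented for the illustration.

```xml
<article>
  <title>Writing DocBook articles</title>
  <!-- artheader left out to keep the sketch short -->
  <section>
    <title>Introduction</title>
    <para>Placeholder text for the introduction.</para>
  </section>
  <section>
    <title>Results</title>
    <para>Placeholder text for the results.</para>
    <!-- A section inside a section becomes a subsection. -->
    <section>
      <title>A first result</title>
      <para>Placeholder text for a subsection.</para>
    </section>
  </section>
  <section>
    <title>Conclusion</title>
    <para>Placeholder text for the conclusion.</para>
  </section>
</article>
```

How deep the numbering goes and how the titles look is left to the stylesheet. The document only records the nesting.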
All text is contained in para elements, comparable with HTML's p elements:
<section>
  <title>Introduction</title>
  <para>
    DocBook is an SGML application developed to mark up documents,
    just as HTML marks up web documents.
  </para>
</section>
But besides text a lot of other elements are available. The rest of this section shows how examples, lists, pictures and some other kinds of information can be inserted into the article.
Adding examples
Examples can be added with the example element, as in the following listing, which contains an example program:
<example>
  <title>Perl program that converts an XML document into a HTML page.</title>
  <programlisting>
#!/usr/bin/perl -w

use diagnostics;
use strict;
use XML::XSLT;

my $XSLTparser = XML::XSLT->new();
$XSLTparser->open_project ("file.xml", "stylesheet.xsl", "FILE", "FILE");
$XSLTparser->process_project;
$XSLTparser->print_result();
  </programlisting>
</example>
Adding lists
As in HTML, DocBook documents can contain lists. Lists are defined by the itemizedlist element, which may contain one or more listitem elements:
<itemizedlist>
  <listitem>
    <para>an item</para>
  </listitem>
  <listitem>
    <para>another item</para>
  </listitem>
  <listitem>
    <para>and again an item</para>
  </listitem>
</itemizedlist>
Lists can also be ordered. In that case, use the orderedlist element instead of the itemizedlist element. By adding a numeration attribute (e.g. <orderedlist numeration="arabic">) you can set the numbering style.
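A short sketch of an ordered list follows. It assumes the numeration values defined by the DocBook DTD (arabic, loweralpha, upperalpha, lowerroman and upperroman, written in lower case in the XML version). The list items are made up for the illustration.

```xml
<!-- Numbered i, ii, iii by a stylesheet that honours the attribute. -->
<orderedlist numeration="lowerroman">
  <listitem>
    <para>write the article</para>
  </listitem>
  <listitem>
    <para>convert it to TeX</para>
  </listitem>
  <listitem>
    <para>convert the TeX file to PDF</para>
  </listitem>
</orderedlist>
```

The item numbers never appear in the source. They are generated at conversion time, so items can be inserted or reordered without renumbering anything.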
Adding pictures
Images can be put into the article:
<mediaobject>
  <imageobject>
    <imagedata fileref="some_picture.gif" format="gif"/>
  </imageobject>
  <textobject>
    <para>
      If you were not using <productname>Lynx</productname>
      you could now see a picture.
    </para>
  </textobject>
</mediaobject>
Also note that the word Lynx is marked up. This is typical of markup languages in which layout is separated from information. The article simply states that Lynx is the name of a product. The stylesheet later determines that a productname must be shown in a specific layout, for example in italics. In the following section we will see some additional markup for words.
Markup of words
As the picture example above showed, words themselves can have markup. The table below lists some markup elements for words:
Element | Description |
---|---|
abbrev | An abbreviation, especially one followed by a period. Example: <para><abbrev>e.g.</abbrev> means for example.</para> |
acronym | An acronym. Example: <para><acronym>DSM</acronym> (chemical company) means "De StaatsMijnen" (=The State Mines).</para> |
email | Someone's email address. Example: <para>My email is <email>egon.w@linuxfocus.org</email></para> |
keyword | One of the article keywords. In the DocBook DTD keyword elements are grouped in a keywordset in the article header rather than placed in running text. Example: <keywordset><keyword>chemistry</keyword></keywordset> |
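Several of these elements can be combined in one paragraph. The fragment below is only a sketch: the sentence and the address someone@example.org are invented for the illustration.

```xml
<para>
  The <acronym>PDF</acronym> file, <abbrev>i.e.</abbrev> the printable
  version, can be viewed with <productname>Acrobat</productname>.
  Comments can be sent to <email>someone@example.org</email>.
</para>
```

Nothing in this paragraph says how an acronym or an email address should look. That decision is made by the stylesheet during conversion.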
Now that DocBook elements have been briefly introduced, it is time to move on and start making a PDF document.
Once we have a DocBook document we can convert it to several formats. Besides the obvious PDF, we could also convert the document to a website, a PostScript document, a TeX source file or an RTF (Rich Text Format) document that can be read with WordPerfect, Word, StarWriter and other word processors. In this article, however, we are only concerned with conversion to PDF.
DocBook documents can be written with any editor, such as Vi or Nedit. Even better is Emacs: Norman Walsh wrote an Emacs major mode for DocBook [3] which adds some useful features, like completing element names or inserting complete template elements.
Besides making your own test article, you can also download my version, which contains the examples given in this article.
As explained at the beginning of this article, we need both a stylesheet and a tool that uses this stylesheet to convert the DocBook article to the PDF format. The stylesheet does not actually convert DocBook directly into PDF; there is a TeX step in between. The stylesheets we use are Norman Walsh's Modular DocBook Stylesheets [4], which are written in DSSSL.
To use these DSSSL stylesheets for conversion we need a DSSSL processor. The processor I used is called Jade [5] and was developed by James Clark (he has stopped supporting this tool). It has been replaced by OpenJade [6], but I haven't used that tool yet.
On my Debian system Walsh's Modular Stylesheets for conversion to PDF are installed in /usr/lib/sgml/stylesheets/dsssl/docbook/nwalsh/print/. The stylesheet file in that directory is passed to Jade with the "-d" option, and the "-t" option tells Jade to use its TeX backend:
egonw@localhost> ls -al
total 3
-rw-r--r-- 1 egonw egonw 2887 Apr 8 22:06 docbook_article.xml
egonw@localhost> jade -t tex -d /usr/lib/sgml/stylesheets/dsssl/docbook/nwalsh/print/docbook.dsl docbook_article.xml
egonw@localhost> ls -al
total 21
-rw-r--r-- 1 egonw egonw 2887 Apr 8 22:06 docbook_article.xml
-rw-r--r-- 1 egonw egonw 17701 Apr 8 22:29 docbook_article.tex

The resulting TeX file is then converted into a PDF document with pdfjadetex:

egonw@localhost> ls -al
total 21
-rw-r--r-- 1 egonw egonw 2887 Apr 8 22:06 docbook_article.xml
-rw-r--r-- 1 egonw egonw 17701 Apr 8 22:29 docbook_article.tex
egonw@localhost> pdfjadetex docbook_article.tex
The DocBook XML language is very extensive, and so are the means of converting it into other formats. This article only gives a very short introduction. Questions can be posted on the talkback pages for this article. More information can be found in references [8] and [9]. Note that this last reference is itself written completely in DocBook!
Advanced topics that are not covered by this article but are available with DocBook are:
Webpages maintained by the LinuxFocus Editor team. © Egon Willighagen, FDL, LinuxFocus.org
2001-01-27, generated by lfparser version 2.8