.. -*- coding: utf-8 -*-


.. NOTE TO MAINTAINERS: Please add new questions to the end of their
   sections, so section/question numbers remain stable.


===========================================
 Docutils FAQ (Frequently Asked Questions)
===========================================

:Date: $Date: 2015-03-13 20:35:19 +0100 (Fr, 13. Mär 2015) $
:Revision: $Revision: 7830 $
:Web site: http://docutils.sourceforge.net/
:Copyright: This document has been placed in the public domain.

.. NOTE::

   You are viewing the FAQ for *Docutils*, a text processing system which
   can process reStructuredText. For *reStructuredText*, the markup 
   language itself, see the `reStructuredText FAQ`_. 


.. contents::
.. sectnum::


This is a work in progress.  If you are reading a local copy, the
`master copy`_ might be newer.  This document uses are relative links;
if they don't work, please use the `master copy`_.

Please feel free to ask questions and/or provide answers; send email
to the `Docutils-users`_ mailing list.  Project members should feel
free to edit the source text file directly.

.. _master copy: http://docutils.sourceforge.net/FAQ.html
.. _let us know:
.. _Docutils-users: docs/user/mailing-lists.html#docutils-users

.. Docutils:

About Docutils
==============

What is Docutils?
-----------------

Docutils_ is a system for processing plaintext documentation into
useful formats, such as HTML, XML, and LaTeX.  It supports multiple
types of input, such as standalone files (implemented), inline
documentation from Python modules and packages (under development),
`PEPs (Python Enhancement Proposals)`_ (implemented), and others as
discovered.

The Docutils distribution consists of:

* a library (the "docutils" package), which `can be used by client
  code`_;
* several `front-end tools`_ (such as ``rst2html.py``, which converts
  reStructuredText input into HTML output);
* a `test suite`_; and
* extensive documentation_.

For an overview of the Docutils project implementation, see `PEP
258`_, "Docutils Design Specification".

Docutils is implemented in Python_.

.. _Docutils: http://docutils.sourceforge.net/
.. _PEPs (Python Enhancement Proposals):
   http://www.python.org/peps/pep-0012.html
.. _can be used by client code: docs/api/publisher.html
.. _front-end tools: docs/user/tools.html
.. _test suite: docs/dev/testing.html
.. _documentation: docs/index.html
.. _PEP 258: http://www.python.org/peps/pep-0258.html
.. _Python: http://www.python.org/


Why is it called "Docutils"?
----------------------------

Docutils is short for "Python Documentation Utilities".  The name
"Docutils" was inspired by "Distutils", the Python Distribution
Utilities architected by Greg Ward, a component of Python's standard
library.

The earliest known use of the term "docutils" in a Python context was
a `fleeting reference`__ in a message by Fred Drake on 1999-12-02 in
the Python Doc-SIG mailing list.  It was suggested `as a project
name`__ on 2000-11-27 on Doc-SIG, again by Fred Drake, in response to
a question from Tony "Tibs" Ibbs: "What do we want to *call* this
thing?".  This was shortly after David Goodger first `announced
reStructuredText`__ on Doc-SIG.

Tibs used the name "Docutils" for `his effort`__ "to document what the
Python docutils package should support, with a particular emphasis on
documentation strings".  Tibs joined the current project (and its
predecessors) and graciously donated the name.

For more history of reStructuredText and the Docutils project, see `An
Introduction to reStructuredText`_.

Please note that the name is "Docutils", not "DocUtils" or "Doc-Utils"
or any other variation.  It is pronounced as in "DOCumentation
UTILitieS", with emphasis on the first syllable.

.. _An Introduction to reStructuredText: docs/ref/rst/introduction.html
__ http://mail.python.org/pipermail/doc-sig/1999-December/000878.html
__ http://mail.python.org/pipermail/doc-sig/2000-November/001252.html
__ http://mail.python.org/pipermail/doc-sig/2000-November/001239.html
__ http://homepage.ntlworld.com/tibsnjoan/docutils/STpy.html

What is Docutils' relation to reStructuredText?
-----------------------------------------------

Docutils is a software library which can process reStructuredText. 
reStructuredText is a markup language.

To see the `reStructuredText FAQ`_

Is the Docutils document model based on any existing XML models?
----------------------------------------------------------------

Not directly, no.  It borrows bits from DocBook, HTML, and others.  I
(David Goodger) have designed several document models over the years,
and have my own biases.  The Docutils document model is designed for
simplicity and extensibility, and has been influenced by the needs of
the reStructuredText markup.

Docutils Usage
==============

Is there a GUI authoring environment for Docutils?
--------------------------------------------------

DocFactory_ is under development.  It uses wxPython and looks very
promising.

.. _DocFactory:
   http://docutils.sf.net/sandbox/gschwant/docfactory/doc/
   
Can I use Docutils for Python auto-documentation?
-------------------------------------------------

Yes, in conjunction with other projects.

The Sphinx_ documentation generator includes an autodoc module.

.. _Sphinx: http://sphinx.pocoo.org/index.html

Version 2.0 of Ed Loper's `Epydoc <http://epydoc.sourceforge.net/>`_
supports reStructuredText-format docstrings for HTML output.  Docutils
0.3 or newer is required.  Development of a Docutils-specific
auto-documentation tool will continue.  Epydoc works by importing
Python modules to be documented, whereas the Docutils-specific tool,
described above, will parse modules without importing them (as with
`HappyDoc <http://happydoc.sourceforge.net/>`_, which doesn't support
reStructuredText).

The advantages of parsing over importing are security and flexibility;
the disadvantage is complexity/difficulty.

* Security: untrusted code that shouldn't be executed can be parsed;
  importing a module executes its top-level code.
* Flexibility: comments and unofficial docstrings (those not supported
  by Python syntax) can only be processed by parsing.
* Complexity/difficulty: it's a lot harder to parse and analyze a
  module than it is to ``import`` and analyze one.

For more details, please see "Docstring Extraction Rules" in `PEP
258`_, item 3 ("How").


Docutils Development
====================

What is the status of the Docutils project?
-------------------------------------------

Although useful and relatively stable, Docutils is experimental code,
with APIs and architecture subject to change.

Our highest priority is to fix bugs as they are reported.  So the
latest code from the repository_ (or the snapshots_) is almost always
the most stable (bug-free) as well as the most featureful.


What is the Docutils project release policy?
--------------------------------------------

It's "release early & often".  We also have automatically-generated
snapshots_ which always contain the latest code from the repository_.
As the project matures, we may formalize on a
stable/development-branch scheme, but we're not using anything like
that yet.

.. _repository: docs/dev/repository.html
.. _snapshots: http://docutils.sourceforge.net/#download

HTML Writer
===========

What is the status of the HTML Writer?
--------------------------------------

The HTML Writer module, ``docutils/writers/html4css1.py``, is a
proof-of-concept reference implementation.  While it is a complete
implementation, some aspects of the HTML it produces may be incompatible
with older browsers or specialized applications (such as web templating).
The sandbox has some alternative HTML writers, contributions are welcome.


What kind of HTML does it produce?
----------------------------------

It produces XHTML compatible with the `XHTML 1.0`_ specification.  A
cascading stylesheet is required for proper viewing with a modern
graphical browser.  Correct rendering of the HTML produced depends on
the CSS support of the browser.  A general-purpose stylesheet,
``html4css1.css`` is provided with the code, and is embedded by
default.  It is installed in the "writers/html4css1/" subdirectory
within the Docutils package.  Use the ``--help`` command-line option
to see the specific location on your machine.

.. _XHTML 1.0: http://www.w3.org/TR/xhtml1/


What browsers are supported?
----------------------------

No specific browser is targeted; all modern graphical browsers should
work.  Some older browsers, text-only browsers, and browsers without
full CSS support are known to produce inferior results.  Firefox,
Safari, Mozilla (version 1.0 and up), Opera, and MS Internet Explorer
(version 5.0 and up) are known to give good results.  Reports of
experiences with other browsers are welcome.


Unexpected results from tools/rst2html.py: H1, H1 instead of H1, H2.  Why?
--------------------------------------------------------------------------

Here's the question in full:

    I have this text::

        Heading 1
        =========

        All my life, I wanted to be H1.

        Heading 1.1
        -----------

        But along came H1, and so shouldn't I be H2?
        No!  I'm H1!

        Heading 1.1.1
        *************

        Yeah, imagine me, I'm stuck at H3!  No?!?

    When I run it through tools/rst2html.py, I get unexpected results
    (below).  I was expecting H1, H2, then H3; instead, I get H1, H1,
    H2::

        ...
        <html lang="en">
        <head>
        ...
        <title>Heading 1</title>
        </head>
        <body>
        <div class="document" id="heading-1">
        <h1 class="title">Heading 1</h1>                <-- first H1
        <p>All my life, I wanted to be H1.</p>
        <div class="section" id="heading-1-1">
        <h1><a name="heading-1-1">Heading 1.1</a></h1>        <-- H1
        <p>But along came H1, and so now I must be H2.</p>
        <div class="section" id="heading-1-1-1">
        <h2><a name="heading-1-1-1">Heading 1.1.1</a></h2>
        <p>Yeah, imagine me, I'm stuck at H3!</p>
        ...

    What gives?

Check the "class" attribute on the H1 tags, and you will see a
difference.  The first H1 is actually ``<h1 class="title">``; this is
the document title, and the default stylesheet renders it centered.
There can also be an ``<h2 class="subtitle">`` for the document
subtitle.

If there's only one highest-level section title at the beginning of a
document, it is treated specially, as the document title.  (Similarly, a
lone second-highest-level section title may become the document
subtitle.)  See `How can I indicate the document title?  Subtitle?`_ for
details.  Rather than use a plain H1 for the document title, we use ``<h1
class="title">`` so that we can use H1 again within the document.  Why
do we do this?  HTML only has H1-H6, so by making H1 do double duty, we
effectively reserve these tags to provide 6 levels of heading beyond the
single document title.

HTML is being used for dumb formatting for nothing but final display.
A stylesheet *is required*, and one is provided; see `What kind of
HTML does it produce?`_ above.  Of course, you're welcome to roll your
own.  The default stylesheet provides rules to format ``<h1
class="title">`` and ``<h2 class="subtitle">`` differently from
ordinary ``<h1>`` and ``<h2>``::

    h1.title {
      text-align: center }

    h2.subtitle {
      text-align: center }

If you don't want the top section heading to be interpreted as a
title at all, disable the `doctitle_xform`_ setting
(``--no-doc-title`` option).  This will interpret your document
differently from the standard settings, which might not be a good
idea.  If you don't like the reuse of the H1 in the HTML output, you
can tweak the `initial_header_level`_ setting
(``--initial-header-level`` option) -- but unless you match its value
to your specific document, you might end up with bad HTML (e.g. H3
without H2).

.. _doctitle_xform: docs/user/config.html#doctitle-xform
.. _initial_header_level: docs/user/config.html#initial-header-level

(Thanks to Mark McEahern for the question and much of the answer.)


How are lists formatted in HTML?
--------------------------------

If list formatting looks strange, first check that you understand
`list markup`__.

__ `How should I mark up lists?`_

* By default, HTML browsers indent lists relative to their context.
  This follows a long tradition in browsers (but isn't so established
  in print).  If you don't like it, you should change the stylesheet.

  This is different from how lists look in reStructuredText source.
  Extra indentation in the source indicates a blockquote, resulting in
  too much indentation in the browser.

* A list item can contain multiple paragraphs etc.  In complex cases
  list items are separated by vertical space.  By default this spacing
  is omitted in "simple" lists.  A list is simple if every item
  contains a simple paragraph and/or a "simple" nested list.  For
  example:

      * text

        * simple

          * simple
          * simple

        * simple

        text after a nested list

      * multiple

        paragraphs

  In this example the nested lists are simple (and should appear
  compacted) but the outer list is not.

  If you want all lists to have equal spacing, disable the
  `compact_lists`_ setting (``--no-compact-lists`` option).  The
  precise spacing can be controlled in the stylesheet.

  Note again that this is not exactly WYSIWYG: it partially resembles
  the rules about blank lines being optional between list items in
  reStructuredText -- but adding/removing optional blank lines does
  not affect spacing in the output!  It's a feature, not a bug: you
  write it as you like but the output is styled consistently.

  .. _compact_lists: docs/user/config.html#compact-lists


Why do enumerated lists only use numbers (no letters or roman numerals)?
------------------------------------------------------------------------

The rendering of enumerators (the numbers or letters acting as list
markers) is completely governed by the stylesheet, so either the
browser can't find the stylesheet (try enabling the
`embed_stylesheet`_ setting [``--embed-stylesheet`` option]), or the
browser can't understand it (try a recent Firefox, Mozilla, Konqueror,
Opera, Safari, or even MSIE).

.. _embed_stylesheet: docs/user/config.html#embed-stylesheet


There appear to be garbage characters in the HTML.  What's up?
--------------------------------------------------------------

What you're seeing is most probably not garbage, but the result of a
mismatch between the actual encoding of the HTML output and the
encoding your browser is expecting.  Your browser is misinterpreting
the HTML data, which is encoded text.  A discussion of text encodings
is beyond the scope of this FAQ; see one or more of these documents
for more info:

* `UTF-8 and Unicode FAQ for Unix/Linux
  <http://www.cl.cam.ac.uk/~mgk25/unicode.html>`_

* Chapters 3 and 4 of `Introduction to i18n [Internationalization]
  <http://www.debian.org/doc/manuals/intro-i18n/>`_

* `Python Unicode Tutorial
  <http://www.reportlab.com/i18n/python_unicode_tutorial.html>`_

* `Python Unicode Objects: Some Observations on Working With Non-ASCII
  Character Sets <http://effbot.org/zone/unicode-objects.htm>`_

The common case is with the default output encoding (UTF-8), when
either numbered sections are used (via the "sectnum_" directive) or
symbol-footnotes.  3 non-breaking spaces are inserted in each numbered
section title, between the generated number and the title text.  Most
footnote symbols are not available in ASCII, nor are non-breaking
spaces.  When encoded with UTF-8 and viewed with ordinary ASCII tools,
these characters will appear to be multi-character garbage.

You may have an decoding problem in your browser (or editor, etc.).
The encoding of the output is set to "utf-8", but your browswer isn't
recognizing that.  You can either try to fix your browser (enable
"UTF-8 character set", sometimes called "Unicode"), or choose a
different encoding for the HTML output.  You can also try
``--output-encoding=ascii:xmlcharrefreplace`` for HTML or XML, but not
applicable to non-XMLish outputs (if using runtime
settings/configuration files, use ``output_encoding=ascii`` and
``output_encoding_error_handler=xmlcharrefreplace``).

If you're generating document fragments, the "Content-Type" metadata
(between the HTML ``<head>`` and ``</head>`` tags) must agree with the
encoding of the rest of the document.  For UTF-8, it should be::

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Also, Docutils normally generates an XML declaration as the first line
of the output.  It must also match the document encoding.  For UTF-8::

    <?xml version="1.0" encoding="utf-8" ?>

.. _sectnum: docs/ref/rst/directives.html#sectnum


How can I retrieve the body of the HTML document?
-------------------------------------------------

(This is usually needed when using Docutils in conjunction with a
templating system.)

You can use the `docutils.core.publish_parts()`_ function, which
returns a dictionary containing an 'html_body_' entry.

.. _docutils.core.publish_parts(): docs/api/publisher.html#publish-parts
.. _html_body: docs/api/publisher.html#html-body


Why is the Docutils XHTML served as "Content-type: text/html"?
--------------------------------------------------------------

Full question:

    Docutils' HTML output looks like XHTML and is advertised as such::

      <?xml version="1.0" encoding="utf-8" ?>
      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
       "http://www.w3.org/TR/xht ml1/DTD/xhtml1-transitional.dtd">

    But this is followed by::

      <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

    Shouldn't this be "application/xhtml+xml" instead of "text/html"?

In a perfect web, the Docutils XHTML output would be 100% strict
XHTML.  But it's not a perfect web, and a major source of imperfection
is Internet Explorer.  Despite it's drawbacks, IE still represents the
majority of web browsers, and cannot be ignored.

Short answer: if we didn't serve XHTML as "text/html" (which is a
perfectly valid thing to do), it couldn't be viewed in Internet
Explorer.

Long answer: see the `"Criticisms of Internet Explorer" Wikipedia
entry <http://en.wikipedia.org/wiki/Criticisms_of_Internet_Explorer#XHTML>`__.

However, there's also `Sending XHTML as text/html Considered
Harmful`__.  What to do, what to do?  We're damned no matter what we
do.  So we'll continue to do the practical instead of the pure:
support the browsers that are actually out there, and not fight for
strict standards compliance.

__ http://hixie.ch/advocacy/xhtml

(Thanks to Martin F. Krafft, Robert Kern, Michael Foord, and Alan
G. Isaac.)


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:

.. Here's a code css to make a table colourful::

   /* Table: */
   
   th {
       background-color: #ede;
   }
   
   /* alternating colors in table rows */
   table.docutils tr:nth-child(even) {
       background-color: #F3F3FF;
   }
   table.docutils tr:nth-child(odd) {
       background-color: #FFFFEE;
   }
   
   table.docutils tr {
       border-style: solid none solid none;
       border-width: 1px 0 1px 0;
       border-color: #AAAAAA;
   }  
   

.. _reStructuredText FAQ: FAQ-reStructuredText.html
.. _How can I indicate the document title?  Subtitle?: FAQ-reStructuredText.html#How can I indicate the document title?  Subtitle?
.. _How should I mark up lists?: FAQ-reStructuredText.html##how-should-i-mark-up-lists