.. -*- coding: utf-8 -*- .. NOTE TO MAINTAINERS: Please add new questions to the end of their sections, so section/question numbers remain stable. =========================================== Docutils FAQ (Frequently Asked Questions) =========================================== :Date: $Date: 2015-03-13 20:35:19 +0100 (Fr, 13. Mär 2015) $ :Revision: $Revision: 7830 $ :Web site: http://docutils.sourceforge.net/ :Copyright: This document has been placed in the public domain. .. NOTE:: You are viewing the FAQ for *Docutils*, a text processing system which can process reStructuredText. For *reStructuredText*, the markup language itself, see the `reStructuredText FAQ`_. .. contents:: .. sectnum:: This is a work in progress. If you are reading a local copy, the `master copy`_ might be newer. This document uses are relative links; if they don't work, please use the `master copy`_. Please feel free to ask questions and/or provide answers; send email to the `Docutils-users`_ mailing list. Project members should feel free to edit the source text file directly. .. _master copy: http://docutils.sourceforge.net/FAQ.html .. _let us know: .. _Docutils-users: docs/user/mailing-lists.html#docutils-users .. Docutils: About Docutils ============== What is Docutils? ----------------- Docutils_ is a system for processing plaintext documentation into useful formats, such as HTML, XML, and LaTeX. It supports multiple types of input, such as standalone files (implemented), inline documentation from Python modules and packages (under development), `PEPs (Python Enhancement Proposals)`_ (implemented), and others as discovered. The Docutils distribution consists of: * a library (the "docutils" package), which `can be used by client code`_; * several `front-end tools`_ (such as ``rst2html.py``, which converts reStructuredText input into HTML output); * a `test suite`_; and * extensive documentation_. For an overview of the Docutils project implementation, see `PEP 258`_, "Docutils Design Specification". Docutils is implemented in Python_. .. _Docutils: http://docutils.sourceforge.net/ .. _PEPs (Python Enhancement Proposals): http://www.python.org/peps/pep-0012.html .. _can be used by client code: docs/api/publisher.html .. _front-end tools: docs/user/tools.html .. _test suite: docs/dev/testing.html .. _documentation: docs/index.html .. _PEP 258: http://www.python.org/peps/pep-0258.html .. _Python: http://www.python.org/ Why is it called "Docutils"? ---------------------------- Docutils is short for "Python Documentation Utilities". The name "Docutils" was inspired by "Distutils", the Python Distribution Utilities architected by Greg Ward, a component of Python's standard library. The earliest known use of the term "docutils" in a Python context was a `fleeting reference`__ in a message by Fred Drake on 1999-12-02 in the Python Doc-SIG mailing list. It was suggested `as a project name`__ on 2000-11-27 on Doc-SIG, again by Fred Drake, in response to a question from Tony "Tibs" Ibbs: "What do we want to *call* this thing?". This was shortly after David Goodger first `announced reStructuredText`__ on Doc-SIG. Tibs used the name "Docutils" for `his effort`__ "to document what the Python docutils package should support, with a particular emphasis on documentation strings". Tibs joined the current project (and its predecessors) and graciously donated the name. For more history of reStructuredText and the Docutils project, see `An Introduction to reStructuredText`_. Please note that the name is "Docutils", not "DocUtils" or "Doc-Utils" or any other variation. It is pronounced as in "DOCumentation UTILitieS", with emphasis on the first syllable. .. _An Introduction to reStructuredText: docs/ref/rst/introduction.html __ http://mail.python.org/pipermail/doc-sig/1999-December/000878.html __ http://mail.python.org/pipermail/doc-sig/2000-November/001252.html __ http://mail.python.org/pipermail/doc-sig/2000-November/001239.html __ http://homepage.ntlworld.com/tibsnjoan/docutils/STpy.html What is Docutils' relation to reStructuredText? ----------------------------------------------- Docutils is a software library which can process reStructuredText. reStructuredText is a markup language. To see the `reStructuredText FAQ`_ Is the Docutils document model based on any existing XML models? ---------------------------------------------------------------- Not directly, no. It borrows bits from DocBook, HTML, and others. I (David Goodger) have designed several document models over the years, and have my own biases. The Docutils document model is designed for simplicity and extensibility, and has been influenced by the needs of the reStructuredText markup. Docutils Usage ============== Is there a GUI authoring environment for Docutils? -------------------------------------------------- DocFactory_ is under development. It uses wxPython and looks very promising. .. _DocFactory: http://docutils.sf.net/sandbox/gschwant/docfactory/doc/ Can I use Docutils for Python auto-documentation? ------------------------------------------------- Yes, in conjunction with other projects. The Sphinx_ documentation generator includes an autodoc module. .. _Sphinx: http://sphinx.pocoo.org/index.html Version 2.0 of Ed Loper's `Epydoc `_ supports reStructuredText-format docstrings for HTML output. Docutils 0.3 or newer is required. Development of a Docutils-specific auto-documentation tool will continue. Epydoc works by importing Python modules to be documented, whereas the Docutils-specific tool, described above, will parse modules without importing them (as with `HappyDoc `_, which doesn't support reStructuredText). The advantages of parsing over importing are security and flexibility; the disadvantage is complexity/difficulty. * Security: untrusted code that shouldn't be executed can be parsed; importing a module executes its top-level code. * Flexibility: comments and unofficial docstrings (those not supported by Python syntax) can only be processed by parsing. * Complexity/difficulty: it's a lot harder to parse and analyze a module than it is to ``import`` and analyze one. For more details, please see "Docstring Extraction Rules" in `PEP 258`_, item 3 ("How"). Docutils Development ==================== What is the status of the Docutils project? ------------------------------------------- Although useful and relatively stable, Docutils is experimental code, with APIs and architecture subject to change. Our highest priority is to fix bugs as they are reported. So the latest code from the repository_ (or the snapshots_) is almost always the most stable (bug-free) as well as the most featureful. What is the Docutils project release policy? -------------------------------------------- It's "release early & often". We also have automatically-generated snapshots_ which always contain the latest code from the repository_. As the project matures, we may formalize on a stable/development-branch scheme, but we're not using anything like that yet. .. _repository: docs/dev/repository.html .. _snapshots: http://docutils.sourceforge.net/#download HTML Writer =========== What is the status of the HTML Writer? -------------------------------------- The HTML Writer module, ``docutils/writers/html4css1.py``, is a proof-of-concept reference implementation. While it is a complete implementation, some aspects of the HTML it produces may be incompatible with older browsers or specialized applications (such as web templating). The sandbox has some alternative HTML writers, contributions are welcome. What kind of HTML does it produce? ---------------------------------- It produces XHTML compatible with the `XHTML 1.0`_ specification. A cascading stylesheet is required for proper viewing with a modern graphical browser. Correct rendering of the HTML produced depends on the CSS support of the browser. A general-purpose stylesheet, ``html4css1.css`` is provided with the code, and is embedded by default. It is installed in the "writers/html4css1/" subdirectory within the Docutils package. Use the ``--help`` command-line option to see the specific location on your machine. .. _XHTML 1.0: http://www.w3.org/TR/xhtml1/ What browsers are supported? ---------------------------- No specific browser is targeted; all modern graphical browsers should work. Some older browsers, text-only browsers, and browsers without full CSS support are known to produce inferior results. Firefox, Safari, Mozilla (version 1.0 and up), Opera, and MS Internet Explorer (version 5.0 and up) are known to give good results. Reports of experiences with other browsers are welcome. Unexpected results from tools/rst2html.py: H1, H1 instead of H1, H2. Why? -------------------------------------------------------------------------- Here's the question in full: I have this text:: Heading 1 ========= All my life, I wanted to be H1. Heading 1.1 ----------- But along came H1, and so shouldn't I be H2? No! I'm H1! Heading 1.1.1 ************* Yeah, imagine me, I'm stuck at H3! No?!? When I run it through tools/rst2html.py, I get unexpected results (below). I was expecting H1, H2, then H3; instead, I get H1, H1, H2:: ... ... Heading 1

Heading 1

<-- first H1

All my life, I wanted to be H1.

Heading 1.1

<-- H1

But along came H1, and so now I must be H2.

Heading 1.1.1

Yeah, imagine me, I'm stuck at H3!

... What gives? Check the "class" attribute on the H1 tags, and you will see a difference. The first H1 is actually ``

``; this is the document title, and the default stylesheet renders it centered. There can also be an ``

`` for the document subtitle. If there's only one highest-level section title at the beginning of a document, it is treated specially, as the document title. (Similarly, a lone second-highest-level section title may become the document subtitle.) See `How can I indicate the document title? Subtitle?`_ for details. Rather than use a plain H1 for the document title, we use ``

`` so that we can use H1 again within the document. Why do we do this? HTML only has H1-H6, so by making H1 do double duty, we effectively reserve these tags to provide 6 levels of heading beyond the single document title. HTML is being used for dumb formatting for nothing but final display. A stylesheet *is required*, and one is provided; see `What kind of HTML does it produce?`_ above. Of course, you're welcome to roll your own. The default stylesheet provides rules to format ``

`` and ``

`` differently from ordinary ``

`` and ``

``:: h1.title { text-align: center } h2.subtitle { text-align: center } If you don't want the top section heading to be interpreted as a title at all, disable the `doctitle_xform`_ setting (``--no-doc-title`` option). This will interpret your document differently from the standard settings, which might not be a good idea. If you don't like the reuse of the H1 in the HTML output, you can tweak the `initial_header_level`_ setting (``--initial-header-level`` option) -- but unless you match its value to your specific document, you might end up with bad HTML (e.g. H3 without H2). .. _doctitle_xform: docs/user/config.html#doctitle-xform .. _initial_header_level: docs/user/config.html#initial-header-level (Thanks to Mark McEahern for the question and much of the answer.) How are lists formatted in HTML? -------------------------------- If list formatting looks strange, first check that you understand `list markup`__. __ `How should I mark up lists?`_ * By default, HTML browsers indent lists relative to their context. This follows a long tradition in browsers (but isn't so established in print). If you don't like it, you should change the stylesheet. This is different from how lists look in reStructuredText source. Extra indentation in the source indicates a blockquote, resulting in too much indentation in the browser. * A list item can contain multiple paragraphs etc. In complex cases list items are separated by vertical space. By default this spacing is omitted in "simple" lists. A list is simple if every item contains a simple paragraph and/or a "simple" nested list. For example: * text * simple * simple * simple * simple text after a nested list * multiple paragraphs In this example the nested lists are simple (and should appear compacted) but the outer list is not. If you want all lists to have equal spacing, disable the `compact_lists`_ setting (``--no-compact-lists`` option). The precise spacing can be controlled in the stylesheet. Note again that this is not exactly WYSIWYG: it partially resembles the rules about blank lines being optional between list items in reStructuredText -- but adding/removing optional blank lines does not affect spacing in the output! It's a feature, not a bug: you write it as you like but the output is styled consistently. .. _compact_lists: docs/user/config.html#compact-lists Why do enumerated lists only use numbers (no letters or roman numerals)? ------------------------------------------------------------------------ The rendering of enumerators (the numbers or letters acting as list markers) is completely governed by the stylesheet, so either the browser can't find the stylesheet (try enabling the `embed_stylesheet`_ setting [``--embed-stylesheet`` option]), or the browser can't understand it (try a recent Firefox, Mozilla, Konqueror, Opera, Safari, or even MSIE). .. _embed_stylesheet: docs/user/config.html#embed-stylesheet There appear to be garbage characters in the HTML. What's up? -------------------------------------------------------------- What you're seeing is most probably not garbage, but the result of a mismatch between the actual encoding of the HTML output and the encoding your browser is expecting. Your browser is misinterpreting the HTML data, which is encoded text. A discussion of text encodings is beyond the scope of this FAQ; see one or more of these documents for more info: * `UTF-8 and Unicode FAQ for Unix/Linux `_ * Chapters 3 and 4 of `Introduction to i18n [Internationalization] `_ * `Python Unicode Tutorial `_ * `Python Unicode Objects: Some Observations on Working With Non-ASCII Character Sets `_ The common case is with the default output encoding (UTF-8), when either numbered sections are used (via the "sectnum_" directive) or symbol-footnotes. 3 non-breaking spaces are inserted in each numbered section title, between the generated number and the title text. Most footnote symbols are not available in ASCII, nor are non-breaking spaces. When encoded with UTF-8 and viewed with ordinary ASCII tools, these characters will appear to be multi-character garbage. You may have an decoding problem in your browser (or editor, etc.). The encoding of the output is set to "utf-8", but your browswer isn't recognizing that. You can either try to fix your browser (enable "UTF-8 character set", sometimes called "Unicode"), or choose a different encoding for the HTML output. You can also try ``--output-encoding=ascii:xmlcharrefreplace`` for HTML or XML, but not applicable to non-XMLish outputs (if using runtime settings/configuration files, use ``output_encoding=ascii`` and ``output_encoding_error_handler=xmlcharrefreplace``). If you're generating document fragments, the "Content-Type" metadata (between the HTML ```` and ```` tags) must agree with the encoding of the rest of the document. For UTF-8, it should be:: Also, Docutils normally generates an XML declaration as the first line of the output. It must also match the document encoding. For UTF-8:: .. _sectnum: docs/ref/rst/directives.html#sectnum How can I retrieve the body of the HTML document? ------------------------------------------------- (This is usually needed when using Docutils in conjunction with a templating system.) You can use the `docutils.core.publish_parts()`_ function, which returns a dictionary containing an 'html_body_' entry. .. _docutils.core.publish_parts(): docs/api/publisher.html#publish-parts .. _html_body: docs/api/publisher.html#html-body Why is the Docutils XHTML served as "Content-type: text/html"? -------------------------------------------------------------- Full question: Docutils' HTML output looks like XHTML and is advertised as such:: But this is followed by:: Shouldn't this be "application/xhtml+xml" instead of "text/html"? In a perfect web, the Docutils XHTML output would be 100% strict XHTML. But it's not a perfect web, and a major source of imperfection is Internet Explorer. Despite it's drawbacks, IE still represents the majority of web browsers, and cannot be ignored. Short answer: if we didn't serve XHTML as "text/html" (which is a perfectly valid thing to do), it couldn't be viewed in Internet Explorer. Long answer: see the `"Criticisms of Internet Explorer" Wikipedia entry `__. However, there's also `Sending XHTML as text/html Considered Harmful`__. What to do, what to do? We're damned no matter what we do. So we'll continue to do the practical instead of the pure: support the browsers that are actually out there, and not fight for strict standards compliance. __ http://hixie.ch/advocacy/xhtml (Thanks to Martin F. Krafft, Robert Kern, Michael Foord, and Alan G. Isaac.) .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: .. Here's a code css to make a table colourful:: /* Table: */ th { background-color: #ede; } /* alternating colors in table rows */ table.docutils tr:nth-child(even) { background-color: #F3F3FF; } table.docutils tr:nth-child(odd) { background-color: #FFFFEE; } table.docutils tr { border-style: solid none solid none; border-width: 1px 0 1px 0; border-color: #AAAAAA; } .. _reStructuredText FAQ: FAQ-reStructuredText.html .. _How can I indicate the document title? Subtitle?: FAQ-reStructuredText.html#How can I indicate the document title? Subtitle? .. _How should I mark up lists?: FAQ-reStructuredText.html##how-should-i-mark-up-lists