| Author: | Günter Milde |
|---|---|
| Contact: | milde@users.berlios.de |
| Date: | 2009-05-06 |
| Copyright: | © 2007, 2009 G. Milde, Released without warranties or conditions of any kind under the terms of the Apache License, Version 2.0 http://www.apache.org/licenses/LICENSE-2.0 |
Abstract
Proposal to add syntax highlighting of code blocks to the capabilities of Docutils.
Syntax highlighting significantly enhances the readability of code. However, in the current version, docutils does not highlight literal blocks.
This sandbox project aims to add syntax highlighting of code blocks to the capabilities of Docutils. To find its way into the Docutils core, it should meet the requirements laid out in a mail on "Questions about writing programming manuals and scientific documents" by Docutils main developer David Goodger:
I'd be happy to include Python source colouring support, and other languages would be welcome too. A multi-language solution would be useful, of course. My issue is providing support for all output formats -- HTML and LaTeX and XML and anything in the future -- simultaneously. Just HTML isn't good enough. Until there is a generic-output solution, this will be something users will have to put together themselves.
There are already Docutils extensions providing syntax colouring, e.g.:
listings is a LaTeX package providing highly customisable and advanced syntax highlighting, though only for LaTeX (and LaTeX-derived PS or PDF).
Since Docutils 0.5, the "latex2e" writer supports syntax highlighting of literal blocks by listings with the --literal-block-env=lstlisting option. You need to provide a custom style sheet. The stylesheets repository provides two LaTeX style sheets for highlighting literal blocks with "listings".
Sphinx features automatic highlighting using the Pygments highlighter. It introduces the custom directives
| code-block: | similar to the proposal below, |
|---|---|
| sourcecode: | an alias to "code-block", and |
| highlight: | configure highlighting of "literal blocks". |
Pygments is a generic syntax highlighter written completely in Python.
On 2009-02-20, David Goodger wrote in docutils-devel:
I'd like to see the extensions implemented in Bruce and Sphinx etc. folded back into core Docutils eventually. Otherwise we'll end up with incompatible systems.
Pygments seems to be the most promising Docutils highlighter.
For printed output and PDFs via LaTeX, the listings package is a viable alternative.
Syntax highlighting can be achieved by front-end scripts combining Docutils and Pygments ("something users [will have to] put together themselves").
Points 1 and 2 lead to the code-block directive proposal.
Point 3 becomes an issue in literate programming, where a code block is the most used block markup. It is addressed in the proposal for a configurable literal block directive.
Note
This is the first draft of a reStructuredText definition, analogous to other directives in directives.txt.
| Directive Type: | "code-block" |
|---|---|
| Doctree Element: | literal_block |
| Directive Arguments: | One (language) or more (class names), optional. |
| Directive Options: | None. |
| Directive Content: | Becomes the body of the literal block. |
The "code-block" directive constructs a literal block whose content is parsed as source code, with the syntax highlighting rules for the given language applied. If the syntax rules for that language are not known to Docutils, the content is rendered like an ordinary literal block.
A bit of Python code
    .. code-block:: python

      def my_function():
          "just a test"
          print(8/2)
The directive content will be parsed and marked up as Python source code. The actual rendering depends on the style-sheet.
Remarks:
If the language argument is missing, a (configurable) default language should be used.
Additional arguments might be defined and passed to the pygments parser or the output document (as class arguments), e.g.
| number-lines: | let pygments include line-numbers |
|---|---|
The include directive should get a matching new option:
Felix Wiemann provided a proof of concept script that utilizes the pygments parser to parse a source code string and store the result in the document tree.
This concept is used in a pygments_code_block_directive (Source: pygments_code_block_directive.py) to define and register a "code-block" directive.
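The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code of pygments_code_block_directive.py: the class name `SketchCodeBlock` and the directive name `sketch-code-block` are hypothetical, and the sketch assumes that Docutils and Pygments are installed. Each classified token becomes an `<inline>` element carrying Pygments' short token name as its class; a language unknown to Pygments falls back to an ordinary literal block.

```python
from docutils import nodes
from docutils.core import publish_doctree
from docutils.parsers.rst import Directive, directives
from pygments import lex
from pygments.lexers import get_lexer_by_name
from pygments.token import STANDARD_TYPES
from pygments.util import ClassNotFound


class SketchCodeBlock(Directive):
    """Hypothetical directive: a literal block with one <inline> per token."""

    optional_arguments = 1  # the language name
    has_content = True

    def run(self):
        code = "\n".join(self.content)
        language = self.arguments[0] if self.arguments else ""
        try:
            lexer = get_lexer_by_name(language)
        except ClassNotFound:
            # Language unknown to Pygments: ordinary literal block.
            return [nodes.literal_block(code, code)]
        block = nodes.literal_block(code, "", classes=["code", language])
        for token_type, text in lex(code, lexer):
            short_name = STANDARD_TYPES.get(token_type, "")
            if short_name:
                block += nodes.inline(text, text, classes=[short_name])
            else:
                # Tokens without a short class name stay unwrapped.
                block += nodes.Text(text)
        return [block]


# "sketch-code-block" is an illustrative name, chosen to avoid clashing
# with any directive that is already registered.
directives.register_directive("sketch-code-block", SketchCodeBlock)

SOURCE = """\
.. sketch-code-block:: python

   def my_function():
       return 8 / 2
"""

doctree = publish_doctree(SOURCE)
print(doctree.pformat())
```

Printing the pseudo-XML makes it easy to check which classes end up on the `<inline>` elements before any writer is involved.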
The writers can use the class information in the <inline> elements to render the tokens. They should ignore the class information if they are unable to use it or to pass it on.
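How a writer might use or ignore the class information can be illustrated without Docutils. The sketch below is a stand-in, not writer code: the `(class, text)` pairs and both function names are hypothetical, and the class names are assumed to be Pygments-style short names. One renderer turns the classes into `<span>` elements, the other drops them and still reproduces the code unchanged.

```python
from html import escape

# Hypothetical token stream, as a code-block directive might store it:
# (class name, text); an empty class name means "no markup".
TOKENS = [
    ("k", "def"), ("", " "), ("nf", "hello"), ("p", "():"), ("", "\n    "),
    ("k", "return"), ("", " "), ("s", '"<hi>"'), ("", "\n"),
]


def render_html(tokens):
    """Use the class information: wrap classified tokens in <span>."""
    parts = []
    for css_class, text in tokens:
        text = escape(text, quote=False)
        if css_class:
            parts.append('<span class="%s">%s</span>' % (css_class, text))
        else:
            parts.append(text)
    return '<pre class="code">%s</pre>' % "".join(parts)


def render_plain(tokens):
    """Ignore the class information: emit the text unchanged."""
    return "".join(text for _css_class, text in tokens)


print(render_html(TOKENS))
print(render_plain(TOKENS))
```

The actual colours are left to a style sheet keyed on the class names, which matches the division of labour described above.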
Running the test script ../tools/test_pygments_code_block_directive.py produces example output for a set of writers.
The "html" writer works out of the box.
The conversion of myfunction.py.txt looks like myfunction.py.htm.
The "s5" and "pep" writers are not tested yet.
"xml" and "pseudoxml" work out of the box.
The conversion of myfunction.py.txt looks like myfunction.py.xml and myfunction.py.pseudoxml, respectively.
"latex2e" (SVN version) works out of the box.
A style file, e.g. pygments-docutilsroles.sty, is required to actually highlight the code in the output. (As with HTML, the pygments-produced style file will not work with docutils' output.)
Alternatively, the LaTeX writer could reconstruct the original content and pass it to a lstlisting environment.
TODO: This should be the default behaviour with --literal-block-env=lstlisting.
The LaTeX output of myfunction.py.txt looks like myfunction.py.tex, and the corresponding PDF like myfunction.py.pdf.
The sandbox project odtwriter provided syntax highlighting with Pygments but used a different syntax and implementation.
(What is the status of the odtwriter now included in the standard distribution?)
A clean and simple syntax for highlighted code blocks -- preserving the space saving feature of the "minimised" literal block marker (:: at the end of a text paragraph). This is especially desirable in documents with many code blocks like tutorials or literate programs.
The role of inline interpreted text can be customised with the "default-role" directive. This allows the use of the concise "backtick" syntax for the most often used role, e.g. in a chemical paper, one could use:
    .. default-role:: subscript

    The triple point of H\ `2`\O is at 0°C.
to produce
The triple point of H₂O is at 0°C.
This customisation is currently not possible for block markup.
Analogous to customising the default role of "interpreted text" with the "default-role" directive, the concise :: literal-block markup could be given a configurable default, e.g. a highlighted code block.
Example:
    ordinary literal block::

      some text typeset in monospace

    .. default-literal-block:: code-block python

    this is colourful Python code::

      def hello():
          print("hello world")
Along the same lines, a "default-block-quote" setting or directive could be considered to configure the role of a block quote.
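The intended behaviour of a "default-literal-block" setting can be modelled in a few lines of Python. This is a toy model under stated assumptions, not a parser: the block tuples, the function name `resolve_literal_blocks` and the type name `literal` for an ordinary literal block are all hypothetical. It only shows that the setting would apply to every following `::` block until it is changed again.

```python
def resolve_literal_blocks(blocks, initial="literal"):
    """Pair each literal block with the block type in effect.

    `blocks` is a list of (kind, value) tuples where kind is either
    "default-literal-block" (value: the new default, e.g.
    "code-block python") or "literal" (value: the block content).
    """
    current = initial
    resolved = []
    for kind, value in blocks:
        if kind == "default-literal-block":
            current = value  # stays in effect until changed again
        elif kind == "literal":
            resolved.append((current, value))
        else:
            raise ValueError("unknown block kind: %r" % (kind,))
    return resolved


document = [
    ("literal", "some text typeset in monospace"),
    ("default-literal-block", "code-block python"),
    ("literal", 'def hello():\n    print("hello world")'),
]

for block_type, content in resolve_literal_blocks(document):
    print(block_type, "->", content.splitlines()[0])
```

Keeping the default as document-level state is what lets the short `::` marker stay unchanged in documents with many code blocks.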
Attention!
The content of this section relates to an old version of the odtwriter. Things changed with the inclusion of the odtwriter into standard Docutils.
This is only kept for historical reasons.
Dave Kuhlman's odtwriter extension can add syntax highlighting to ordinary literal blocks.
The --add-syntax-highlighting command line flag activates syntax highlighting in literal blocks. By default, the "python" lexer is used.
You can change this within your reST document with the sourcecode directive:
    .. sourcecode:: off

    ordinary literal block::

      content set in teletype

    .. sourcecode:: on
    .. sourcecode:: python

    colourful Python code::

      def hello():
          print("hello world")
The "sourcecode" directive defined by the odtwriter differs fundamentally from the "code-block" directive of rst2html-pygments:
The odtwriter directive does not have content. It is a switch.
The syntax highlighting state and language/lexer set by this directive remain in effect until the next sourcecode directive is encountered in the reST document.
".. sourcecode:: <newstate>" makes highlighting active or inactive. <newstate> is either on or off.
".. sourcecode:: <lexer>" changes the lexer used for parsing literal code blocks. <lexer> should be one of the aliases listed in Pygments' languages and markup formats.
In other words, the odtwriter implements a configurable literal block directive, albeit with a slightly different syntax from the proposal above.