| Author: | David Goodger |
|---|---|
| Contact: | goodger@users.sourceforge.net |
| Revision: | $Revision$ |
| Date: | $Date: 2003/06/30$ |
| Copyright: | This document has been placed in the public domain. |
The files in this directory contain reStructuredText substitution definitions for character entity sets, from the ISO 8879 & ISO 9573-13 (combined), MathML, and HTML4 standards. They were generated by the tools/unicode2rstsubs.py program from the input file unicode.xml, which is maintained as part of the MathML 2 Recommendation XML source, available at <http://www.w3.org/Math/characters/unicode.xml>.
| Entity Set File | Description |
|---|---|
| html4-lat1.txt | HTML Latin 1 |
| html4-special.txt | HTML Special Characters |
| html4-symbol.txt | HTML Mathematical, Greek and Symbolic Characters |
| isoamsa.txt | Added Mathematical Symbols: Arrows |
| isoamsb.txt | Added Mathematical Symbols: Binary Operators |
| isoamsc.txt | Added Mathematical Symbols: Delimiters |
| isoamsn.txt | Added Mathematical Symbols: Negated Relations |
| isoamso.txt | Added Mathematical Symbols: Ordinary |
| isoamsr.txt | Added Mathematical Symbols: Relations |
| isobox.txt | Box and Line Drawing |
| isocyr1.txt | Russian Cyrillic |
| isocyr2.txt | Non-Russian Cyrillic |
| isodia.txt | Diacritical Marks |
| isogrk1.txt | Greek Letters |
| isogrk2.txt | Monotoniko Greek |
| isogrk3.txt | Greek Symbols |
| isogrk4.txt [1] | Alternative Greek Symbols |
| isolat1.txt | Added Latin 1 |
| isolat2.txt | Added Latin 2 |
| isomfrk.txt [1] | Mathematical Fraktur |
| isomopf.txt [1] | Mathematical Openface (Double-struck) |
| isomscr.txt [1] | Mathematical Script |
| isonum.txt | Numeric and Special Graphic |
| isopub.txt | Publishing |
| isotech.txt [1] | General Technical |
| mmlalias.txt | MathML aliases for entities from other sets |
| mmlextra.txt [1] | Extra names added by MathML |
| [1] | There is a *-wide.txt variant for each of these character entity set files, containing characters outside of the Unicode basic multilingual plane or BMP (wide-Unicode; code points greater than U+FFFF). Most pre-built Python distributions are "narrow" and do not support wide-Unicode characters. Python can be built with wide-Unicode support, though; consult the Python build instructions for details. |
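
Whether a given interpreter is "narrow" or "wide" can be checked at run time through `sys.maxunicode`. The following is a minimal sketch, not part of these files: the helper names `supports_wide_unicode` and `entity_files_for` are hypothetical, and the sketch assumes the `-wide.txt` variant supplements the base file rather than replacing it.

```python
import sys


def supports_wide_unicode():
    """Return True if one character can hold a code point above U+FFFF."""
    # sys.maxunicode is 0xFFFF on a "narrow" build and 0x10FFFF on a "wide"
    # build. On Python 3.3 and later it is always 0x10FFFF.
    return sys.maxunicode > 0xFFFF


def entity_files_for(base_name):
    """List the entity set files worth including for base_name.

    Hypothetical helper: assumes the wide variant is named
    '<base_name>-wide.txt' and is used alongside '<base_name>.txt'.
    """
    files = [base_name + ".txt"]
    if supports_wide_unicode():
        files.append(base_name + "-wide.txt")
    return files


print(entity_files_for("isomfrk"))
```

On a narrow build the helper returns only the base file, so a document that includes its result never references characters the interpreter cannot represent.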
These character entity sets can be used in documents using the "include" directive and substitution references. For example:

    .. include:: isonum.txt

    Copyright |copy| 2003 by John Q. Public, all rights reserved.

Individual definitions can also be copied from these entity set files and pasted into documents. This has two advantages: it removes dependencies, and it saves processing of unused characters. However, if more than a few character entities are defined, they add clutter to the document.
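
Copying individual definitions can be scripted. The sketch below is only an illustration: it assumes each definition occupies one line of the form `.. |name| unicode:: U+XXXXX .. DESCRIPTION`, and both the helper `pick_definitions` and the sample text are made up for the example rather than taken from the entity set files.

```python
import re

# Assumed line shape (one definition per line):
#   .. |name| unicode:: U+XXXXX .. DESCRIPTION
DEFINITION = re.compile(r"^\.\.\s+\|(?P<name>[^|]+)\|\s+unicode::")


def pick_definitions(entity_text, wanted):
    """Return the definition lines in entity_text whose names are in wanted."""
    wanted = set(wanted)
    picked = []
    for line in entity_text.splitlines():
        match = DEFINITION.match(line)
        if match and match.group("name").strip() in wanted:
            picked.append(line.rstrip())
    return picked


# Illustrative sample only; real entity set files are much longer.
sample = """\
.. |copy|   unicode:: U+000A9 .. COPYRIGHT SIGN
.. |reg|    unicode:: U+000AE .. REGISTERED SIGN
.. |trade|  unicode:: U+02122 .. TRADE MARK SIGN
"""

for line in pick_definitions(sample, ["copy", "trade"]):
    print(line)
```

The selected lines can be pasted at the top of a document in place of the "include" directive, so only the characters actually used are defined.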
Substitution references must be separated from the surrounding text by whitespace or punctuation. To use a character without intervening whitespace, you can use the disappearing-whitespace escape sequence, backslash-space:

    .. include:: isonum.txt

    Copyright |copy| 2003, BogusMegaCorp\ |trade|.

The "unicode" directive can be used as well; whitespace is ignored and removed, effectively squeezing together the text:

    .. |copy| unicode:: U+000A9 .. COPYRIGHT SIGN
    .. |BogusMegaCorp (TM)| unicode:: BogusMegaCorp U+2122
       .. with trademark sign

    Copyright |copy| 2003, |BogusMegaCorp (TM)|.
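
The squeezing behaviour described above can be approximated in a few lines of Python. This is a simplified emulation written for illustration, not the Docutils implementation: the function `expand_unicode_argument` is hypothetical, and it only recognizes the `U+XXXX` notation and a trailing comment introduced by `..`.

```python
import re

# Only the "U+XXXX" notation is handled in this sketch.
CODE_POINT = re.compile(r"^U\+([0-9A-Fa-f]+)$")

# A ".." surrounded by whitespace (or at either end) starts a comment.
COMMENT_START = re.compile(r"(?:^|\s)\.\.(?:\s|$)")


def expand_unicode_argument(argument):
    """Approximate the text a "unicode" directive argument stands for."""
    # Keep only the part before the comment marker.
    content = COMMENT_START.split(argument, maxsplit=1)[0]
    pieces = []
    for token in content.split():
        match = CODE_POINT.match(token)
        if match:
            # Convert the hexadecimal code point to its character.
            pieces.append(chr(int(match.group(1), 16)))
        else:
            # Plain text is passed through unchanged.
            pieces.append(token)
    # Joining without a separator drops the whitespace between tokens.
    return "".join(pieces)


print(expand_unicode_argument("U+000A9 .. COPYRIGHT SIGN"))
print(expand_unicode_argument("BogusMegaCorp U+2122 .. with trademark sign"))
```

Because the tokens are joined with no separator, the company name and the trademark sign end up adjacent, which is the effect the backslash-space escape achieves with an included entity set.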