From a6edc52d485cd4167625b030e4e65fcf0cf40c12 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michael=20Teichgr=C3=A4ber?= Date: Wed, 1 Dec 2010 22:29:50 +0100 Subject: [PATCH] add README.markdown, rm README.peg-markdown --- README.markdown | 102 +++++++++++++++++++++ README.peg-markdown | 214 -------------------------------------------- 2 files changed, 102 insertions(+), 214 deletions(-) create mode 100644 README.markdown delete mode 100644 README.peg-markdown diff --git a/README.markdown b/README.markdown new file mode 100644 index 0000000..9e1acc3 --- /dev/null +++ b/README.markdown @@ -0,0 +1,102 @@ +This is an implementation of John Gruber's [markdown][] in +[Go][]. It is a translation of [peg-markdown][], written by +John MacFarlane in C, into Go. It is using a modified version +of Andrew J Snodgrass' PEG parser [peg][] -- now supporting +LEG grammars --, which is itself based on the parser used +by peg-markdown. + +[markdown]: http://daringfireball.net/projects/markdown/ +[peg-markdown]: https://github.com/jgm/peg-markdown +[peg]: https://github.com/pointlander/peg +[Go]: http://golang.org/ + +Support for HTML output is implemented, but Groff and LaTeX +output have not been ported. The output should be identical +to that of peg-markdown. + +The Go version is around 5x slower than the original C +version. A marked speed improvement has been achieved by +converting function `preformat` from concatenating strings +to using bytes.Buffer. At other places, where this kind of +modification had been tried, performance did not improve. Also, +pre-allocating a large buffer for `element`s didn't show a +significant difference from allocating `element`s one at a time. + +## Installation + +Provided you have a recent copy of Go, and git is available, + + goinstall github.com/knieriem/markdown + +should install the package into +`$GOROOT/src/pkg/github.com/knieriem/markdown`, and build +it. 
During the build, a copy of [knieriem/peg][] will be +downloaded from github and compiled (`make peg` if done +manually). + +**NOTE:** At the moment, goinstall most likely will fail, +as it does not use the package's Makefile, but generates +its own, which is not sufficient as it does not know how +to build parser.leg.go from parser.leg. As a workaround, +after the failed goinstall, please do the following steps to +finish the installation: + + cd $GOROOT/src/pkg/github.com/knieriem/markdown + gomake install + +See doc.go for an example how to use the package. + +To create the command line program *markdown,* run + + cd $GOROOT/src/pkg/github.com/knieriem/markdown + gomake cmd + +the binary should then be available in subdirectory *cmd.* + +To run the Markdown 1.0.3 test suite, type + + make mdtest + +This will download peg-markdown, in case you have `git` +available, build cmd/markdown, and run the test suite. + +The test suite will fail on one test, for the same reason which +applies to peg-markdown, because the grammar is the same. +See the [original README][] for details. + +[original README]: https://github.com/jgm/peg-markdown/blob/master/README.markdown +[knieriem/peg]: https://github.com/knieriem/peg + +## Known issues + +Emphasis and strong markup within items of lists, as in input +like ./PHP Markdown Extra.mdtest/Emphasis.text from Michel +Fortin's [MDTest][] package, + + 1. ***test test*** + 2. ___test test___ + 3. *test **test*** + 4. **test *test*** + ... + +seem to present a problem for the LEG parser, which needs +(on my system) around four minutes to process that file. 
+ +[MDTest]: http://git.michelf.com/mdtest/ + +## Todo + +* Implement definition lists (work in progress), and perhaps tables + +* Rename element key identifiers, so that they are not public + +* Where appropriate, use more idiomatic Go code + +## Subdirectory Index + +* peg – PEG parser generator (modified) from Andrew J Snodgrass + +* peg/leg – LEG parser generator, based on PEG + +* cmd – command line program `markdown` + diff --git a/README.peg-markdown b/README.peg-markdown deleted file mode 100644 index d61cdc3..0000000 --- a/README.peg-markdown +++ /dev/null @@ -1,214 +0,0 @@ -What is this? -============= - -This is an implementation of John Gruber's [markdown][] in C. It uses a -[parsing expression grammar (PEG)][] to define the syntax. This should -allow easy modification and extension. It currently supports output in -HTML, LaTeX, or groff_mm formats, and adding new formats is relatively -easy. - -[parsing expression grammar (PEG)]: http://en.wikipedia.org/wiki/Parsing_expression_grammar -[markdown]: http://daringfireball.net/projects/markdown/ - -It is pretty fast. A 179K text file that takes 5.7 seconds for -Markdown.pl (v. 1.0.1) to parse takes less than 0.2 seconds for this -markdown. It does, however, use a lot of memory (up to 4M of heap space -while parsing the 179K file, and up to 80K for a 4K file). (Note that -the memory leaks in earlier versions of this program have now been -plugged.) - -Both a library and a standalone program are provided. - -peg-markdown is written and maintained by John MacFarlane (jgm on -github), with significant contributions by Ryan Tomayko (rtomayko). -It is released under both the GPL and the MIT license; see LICENSE for -details. - -Installing -========== - -On a linux or unix-based system -------------------------------- - -This program is written in portable ANSI C. It requires -[glib2](http://www.gtk.org/download.html). Most *nix systems will have -this installed already. The build system requires GNU make. 
- -The other required dependency, [Ian Piumarta's peg/leg PEG parser -generator](http://piumarta.com/software/peg/), is included in the source -directory. It will be built automatically. (However, it is not as portable -as peg-markdown itself, and seems to require gcc.) - -To make the 'markdown' executable: - - make - -(Or, on some systems, `gmake`.) Then, for usage instructions: - - ./markdown --help - -To run John Gruber's Markdown 1.0.3 test suite: - - make test - -The test suite will fail on one of the list tests. Here's why. -Markdown.pl encloses "item one" in the following list in `<p>` tags: - - 1. item one - * subitem - * subitem - - 2. item two - - 3. item three - -peg-markdown does not enclose "item one" in `<p>` tags unless it has a -following blank line. This is consistent with the official markdown -syntax description, and lets the author of the document choose whether -`<p>` tags are desired. - -Cross-compiling for Windows with MinGW on a linux box ------------------------------------------------------ - -Prerequisites: - -* Linux system with MinGW cross compiler For Ubuntu: - - sudo apt-get install mingw32 - -* [Windows glib-2.0 binary & development files](http://www.gtk.org/download-windows.html). - Unzip files into cross-compiler directory tree (e.g., `/usr/i586-mingw32msvc`). - -Steps: - -1. Create the markdown parser using Linux-compiled `leg` from peg-0.1.4: - - ./peg-0.1.4/leg markdown_parser.leg >markdown_parser.c - - (Note: The same thing could be accomplished by cross-compiling leg, - executing it on Windows, and copying the resulting C file to the Linux - cross-compiler host.) - -2. Run the cross compiler with include flag for the Windows glib-2.0 headers: - for example, - - /usr/bin/i586-mingw32msvc-cc -c \ - -I/usr/i586-mingw32msvc/include/glib-2.0 \ - -I/usr/i586-mingw32msvc/lib/glib-2.0/include -Wall -O3 -ansi markdown*.c - -3. Link against Windows glib-2.0 headers: for example, - - /usr/bin/i586-mingw32msvc-cc markdown*.o \ - -Wl,-L/usr/i586-mingw32msvc/lib/glib,--dy,--warn-unresolved-symbols,-lglib-2.0 \ - -o markdown.exe - -The resulting executable depends on the glib dll file, so be sure to -load the glib binary on the Windows host. - -Compiling with MinGW on Windows -------------------------------- - -These directions assume that MinGW is installed in `c:\MinGW` and glib-2.0 -is installed in the MinGW directory hierarchy (with the mingw bin directory -in the system path). - -Unzip peg-markdown in a temp directory. From the directory with the -peg-markdown source, execute: - - cd peg-0.1.4 - for %i in (*.c) do @gcc -g -Wall -O3 -DNDEBUG -c -o %~ni.o %i - gcc -o leg.exe leg.o tree.o compile.o - cd .. 
- peg-0.1.4\leg.exe markdown_parser.leg >markdown_parser.c - @for %i in (markdown*.c) do @gcc -mms-bitfields -Ic:/MinGW/include/glib-2.0 -Ic:/MinGW/lib/glib-2.0/include -c -o %~ni.o %i - gcc -O3 -Lc:/MinGW/lib/glib-2.0 -lglib-2.0 -lintl markdown.o markdown_lib.o markdown_output.o markdown_parser.o -o markdown.exe -Wl,--dy,--warn-unresolved-symbols,-lglib-2.0,-Lc:/MinGW/lib/glib-2.0,-lglib-2.0,-lintl - -(Windows instructions courtesy of Matt Wolf.) - -Extensions -========== - -peg-markdown supports extensions to standard markdown syntax. -These can be turned on using the command line flag `-x` or -`--extensions`. `-x` by itself turns on all extensions. Extensions -can also be turned on selectively, using individual command-line -options. To see the available extensions: - - ./markdown --help-extensions - -The `--smart` extension provides "smart quotes", dashes, and ellipses. - -The `--notes` extension provides a footnote syntax like that of -Pandoc or PHP Markdown Extra. - -Using the library -================= - -The library exports two functions: - - GString * markdown_to_g_string(char *text, int extensions, int output_format); - char * markdown_to_string(char *text, int extensions, int output_format); - -The only difference between these is that `markdown_to_g_string` returns a -`GString` (glib's automatically resizable string), while `markdown_to_string` -returns a regular character pointer. The memory allocated for these must be -freed by the calling program, using `g_string_free()` or `free()`. - -`text` is the markdown-formatted text to be converted. Note that tabs will -be converted to spaces, using a four-space tab stop. Character encodings are -ignored. - -`extensions` is a bit-field specifying which syntax extensions should be used. -If `extensions` is 0, no extensions will be used. If it is `0xFFFFFF`, -all extensions will be used. 
To set extensions selectively, use the -bitwise `&` operator and the following constants: - - - `EXT_SMART` turns on smart quotes, dashes, and ellipses. - - `EXT_NOTES` turns on footnote syntax. [Pandoc's footnote syntax][] is used here. - - `EXT_FILTER_HTML` filters out raw HTML (except for styles). - - `EXT_FILTER_STYLES` filters out styles in HTML. - - [Pandoc's footnote syntax]: http://johnmacfarlane.net/pandoc/README.html#footnotes - -`output_format` is either `HTML_FORMAT`, `LATEX_FORMAT`, or `GROFF_MM_FORMAT`. - -To use the library, include `markdown_lib.h`. See `markdown.c` for an example. - -Hacking -======= - -It should be pretty easy to modify the program to produce other formats -than HTML or LaTeX, and to parse syntax extensions. A quick guide: - - * `markdown_parser.leg` contains the grammar itself. - - * `markdown_output.c` contains functions for printing the `Element` - structure in various output formats. - - * To add an output format, add the format to `markdown_formats` in - `markdown_lib.h`. Then modify `print_element` in `markdown_output.c`, - and add functions `print_XXXX_string`, `print_XXXX_element`, and - `print_XXXX_element_list`. Also add an option in the main program - that selects the new format. Don't forget to add it to the list of - formats in the usage message. - - * To add syntax extensions, define them in the PEG grammar - (`markdown_parser.leg`), using existing extensions as a guide. New - inline elements will need to be added to `Inline =`; new block - elements will need to be added to `Block =`. (Note: the order - of the alternatives does matter in PEG grammars.) - - * If you need to add new types of elements, modify the `keys` - enum in `markdown_peg.h`. - - * By using `&{ }` rules one can selectively disable extensions - depending on command-line options. For example, - `&{ extension(EXT_SMART) }` succeeds only if the `EXT_SMART` bit - of the global `syntax_extensions` is set. 
Add your option to - `markdown_extensions` in `markdown_lib.h`, and add an option in - `markdown.c` to turn on your extension. - - * Note: Avoid using `[^abc]` character classes in the grammar, because - they cause problems with non-ascii input. Instead, use: `( !'a' !'b' - !'c' . )` -
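
A note on the `extensions` bit-field described in the removed README.peg-markdown: flags in a bit-field are normally combined with bitwise OR (`|`) and tested with bitwise AND (`&`), which is presumably what a predicate such as `extension(EXT_SMART)` checks. The following is a minimal Go sketch of that pattern, not the package's actual API. The flag values and the helper `hasExtension` are assumptions made for illustration; the real constants are defined in `markdown_lib.h` and may differ.

```go
package main

import "fmt"

// Hypothetical flag values, one bit per extension. The names mirror the
// EXT_* constants mentioned in the README; the values are assumptions.
const (
	ExtSmart = 1 << iota
	ExtNotes
	ExtFilterHTML
	ExtFilterStyles
)

// hasExtension (hypothetical helper) reports whether flag is set in exts.
func hasExtension(exts, flag int) bool {
	return exts&flag != 0
}

func main() {
	// Combine flags with bitwise OR to enable several extensions at once.
	exts := ExtSmart | ExtNotes

	fmt.Println(hasExtension(exts, ExtSmart))      // prints true
	fmt.Println(hasExtension(exts, ExtFilterHTML)) // prints false
}
```

Passing `0` would correspond to no extensions, and a value with all bits set to every extension, matching the behaviour the README describes for `extensions`.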