PanPG

PanPG is a parser generator. Its input is a parsing expression grammar (PEG) and its output is a parser written in JavaScript.

Grammar Reuse

PanPG uses plain PEG grammar files without embedded code. The same grammar file is equally usable by a compiler, a syntax highlighter running in real time in a text editor, a static analysis tool, etc. All of these will need to process the parsed representation in different ways; if the code is embedded in the grammar then the grammar cannot be shared. Embedded code also needlessly ties the grammar to a single implementation language, when grammars could otherwise be shared among tools that generate parsers in different languages.
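As a rough illustration of why keeping code out of the grammar helps, the sketch below feeds one parse tree to two unrelated consumers. The node shape (`rule`, `start`, `end`, `children`) and the function names are assumptions made up for this example; they are not PanPG's actual output format or API.

```javascript
// Hypothetical parse tree for "1+22"; this node shape is an assumption for
// illustration and is not PanPG's output format.
const source = '1+22';
const tree = {
  rule: 'Sum', start: 0, end: 4, children: [
    { rule: 'Number', start: 0, end: 1, children: [] },
    { rule: 'Number', start: 2, end: 4, children: [] },
  ],
};

// Consumer 1: an evaluator, as a compiler or interpreter might need.
function evaluate(node, src) {
  if (node.rule === 'Number') return Number(src.slice(node.start, node.end));
  return node.children.reduce((total, child) => total + evaluate(child, src), 0);
}

// Consumer 2: a highlighter that only cares about which spans are numbers.
function numberSpans(node, out = []) {
  if (node.rule === 'Number') out.push([node.start, node.end]);
  node.children.forEach((child) => numberSpans(child, out));
  return out;
}

console.log(evaluate(tree, source)); // 23
console.log(numberSpans(tree));      // [ [ 0, 1 ], [ 2, 4 ] ]
```

Neither consumer needs to know about the other, and neither would be possible to bolt on if the grammar already contained evaluation code for a single tool.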

PanPG uses the PEG formalism, which is inherently composable; this supports reuse of grammars and grammar components. PanPG supports grammar composition by passing grammar patches or lists of grammars to the parser generator, which then merges them.

No Separate Lexer

The PEG formalism does not necessitate a separate lexing or tokenizing stage. There is only one formalism to learn, and the grammar uses the same primitives and operators (and can support the same visualization and debugging tools) "all the way down" — from the top-level rule to the individual characters that make up each token.
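The "all the way down" idea can be sketched with a few hand-written PEG-style combinators. This is a minimal illustration only: the helper names (`chr`, `lit`, `seq`, `choice`, `star`) are hypothetical and this is not the code PanPG generates. The point is that the token-level rules (`digit`, `number`, `ws`) are built from the same operators as the top-level rule.

```javascript
// Hypothetical mini PEG combinators; not PanPG's API or generated output.
// A parser takes (input, position) and returns the new position, or -1 on failure.
const chr = (lo, hi) => (s, i) =>
  (i < s.length && s[i] >= lo && s[i] <= hi ? i + 1 : -1);
const lit = (text) => (s, i) => (s.startsWith(text, i) ? i + text.length : -1);
const seq = (...parsers) => (s, i) => {
  for (const p of parsers) {
    i = p(s, i);
    if (i < 0) return -1;
  }
  return i;
};
// Ordered choice: the first alternative that matches wins.
const choice = (...parsers) => (s, i) => {
  for (const p of parsers) {
    const j = p(s, i);
    if (j >= 0) return j;
  }
  return -1;
};
// Greedy repetition; stops on failure or when no input is consumed.
const star = (p) => (s, i) => {
  for (;;) {
    const j = p(s, i);
    if (j < 0 || j === i) return i;
    i = j;
  }
};

// "Token-level" rules, written with the same operators as everything else.
const digit = chr('0', '9');
const number = seq(digit, star(digit));
const ws = star(lit(' '));
const operator = choice(lit('+'), lit('-'));

// Top-level rule: Expr <- Number (ws ('+' / '-') ws Number)*
const expr = seq(number, star(seq(ws, operator, ws, number)));

// A rule matches only if it consumes the whole input.
const matches = (rule, input) => rule(input, 0) === input.length;

console.log(matches(expr, '12 + 3-45')); // true
console.log(matches(expr, '12 +'));      // false
```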

Lexers written using dedicated tools or by hand can be very fast, and a fair amount of analysis must be done on a PEG to derive similar performance optimizations. This work is just getting started, and currently PanPG generates slower parsers than many other parser generators available for JavaScript. The benefits of a simpler grammar may outweigh the current performance limitations for some applications, such as language prototyping.

Unicode Support

Unicode support in the JavaScript language is somewhat lacking; for example, JavaScript regexes do not transparently support the full Unicode range. PanPG does support the full Unicode range and transparently handles UTF-16, the encoding used by JavaScript strings. UTF-8 support is also planned and would be useful with non-standard JavaScript string representations, such as the buffer and byte array representations now arising in server-side JavaScript and elsewhere.
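To make the UTF-16 issue concrete: a character outside the Basic Multilingual Plane occupies two code units (a surrogate pair) in a JavaScript string, so any parser working on code units has to combine them to see the real code point. The sketch below shows that decoding step. The function name `decodeAt` is hypothetical, and this is an illustration of the general technique, not PanPG's actual implementation.

```javascript
// Illustrative only; not PanPG's actual implementation.
// Returns the code point starting at UTF-16 index i, and how many
// code units (1 or 2) it occupies in the string.
function decodeAt(s, i) {
  const hi = s.charCodeAt(i);
  if (hi >= 0xD800 && hi <= 0xDBFF && i + 1 < s.length) {
    const lo = s.charCodeAt(i + 1);
    if (lo >= 0xDC00 && lo <= 0xDFFF) {
      // Combine the high and low surrogates into one code point.
      const codePoint = (hi - 0xD800) * 0x400 + (lo - 0xDC00) + 0x10000;
      return { codePoint, units: 2 };
    }
  }
  // BMP character, or an unpaired surrogate passed through as-is.
  return { codePoint: hi, units: 1 };
}

const text = 'a\uD835\uDD38'; // 'a' followed by U+1D538

console.log(text.length);                             // 3 code units, 2 characters
console.log(decodeAt(text, 0).codePoint.toString(16)); // "61"
console.log(decodeAt(text, 1).codePoint.toString(16)); // "1d538"
```

A grammar that names a character class above U+FFFF can then be matched against the decoded code point, advancing the input position by `units` rather than always by one.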

Status

As of v0.0.10 (August 2011), PanPG is usable, but generated parsers are slower than those produced by other parser generators for JavaScript. The API for generating parsers and for using them should be relatively stable: new API methods may appear, but the ones currently present are not likely to change in ways that would break existing code. The current release aims to be stable and correct rather than fast; the next releases will focus on speed, as various optimizations are enabled and tested. Generated parsers have been tested in IE7 and current versions of Firefox, Chrome, Safari, Opera, and node.js. There is a minimal test page in the build directory.

Versions 0.0.7 through 0.0.10 are mainly bugfix releases; the details are in the changelog. Version 0.0.8 adds one new feature, a JavaScript AST. The changes from v0.0.5 to v0.0.6 are summarized in the v0.0.6 release announcement. There is also a project roadmap, posted in March 2010.

Contact

Please direct questions, bug reports, patches, etc. to inimino@inimino.org, or stop by #inimino on Freenode.

Downloads

The stable version is 0.0.10. The compiling API and utility API builds can be downloaded directly, or you can download the full source tree. The build system is browser-based and a little unusual, so if you are hacking on the source, rebuilding will take some work and require installing some extra dependencies. A separate driver for the build system, using make instead of the browser, also exists; visit the IRC channel for details. If you are using node.js you can use npm to get the latest version. The source tarball contains a package.json file and is identical to the npm package.

In addition to the published packages, you can browse the source online in the revision store, which always holds the most recent source code. There is an rvs_get shell script that can be used to download a local copy of the current tip, or you can use wget.

If you just want the two library files:

If you want to check out the source code:

There's now a vim script by gf3 which gives syntax highlighting for PanPG grammar syntax in vim:

Previous releases: