Malaga is a software package for the development and application of grammars that are used for the analysis of words and sentences of natural languages. It contains a programming language for the modelling of morphology and syntax grammars.
Malaga's grammar formalism is based on non-deterministic finite automata. The analysis states are augmented by arbitrarily complex feature structures that may consist of symbols, strings, lists and attribute-value matrices thereof. The transition to a successor state is associated with the consumption of an allomorph in the input. A rule checks the category of the current state and the category of the allomorph (which is looked up in a runtime lexicon) and constructs the category of the successor state.
This grammar formalism is called Left-Associative Grammar (LAG) and was invented by Roland Hausser.
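The state-transition scheme described above can be illustrated with a small Python sketch. It is not written in Malaga's rule language and does not use any Malaga API; the lexicon entries, the feature names ("pos", "tense", "finished") and the functions combine, is_final and analyse are all hypothetical and exist only for this illustration. Categories are modelled as plain dictionaries, and each step consumes one allomorph from the left of the remaining input.

```python
# Toy left-associative analysis. Everything here is an assumption made for
# illustration; it is not Malaga's rule language or runtime lexicon format.

# Hypothetical lexicon: allomorph surface -> list of possible categories.
LEXICON = {
    "walk": [{"pos": "verb_stem"}],
    "ed": [{"pos": "suffix", "tense": "past"}],
    "s": [{"pos": "suffix", "tense": "present", "person": 3}],
}

# Category of the initial analysis state.
INITIAL = {"pos": "start"}


def combine(state, allomorph, surface):
    """Check both categories and build the successor category.

    Returns None if no (hypothetical) rule accepts the combination.
    """
    if state["pos"] == "start" and allomorph["pos"] == "verb_stem":
        return {"pos": "verb", "base": surface, "finished": False}
    if (state["pos"] == "verb" and not state["finished"]
            and allomorph["pos"] == "suffix"):
        successor = dict(state, finished=True)
        # Copy the suffix features (everything except "pos") into the state.
        successor.update({k: v for k, v in allomorph.items() if k != "pos"})
        return successor
    return None


def is_final(state):
    """A state counts as a result only if a verb has been recognised."""
    return state["pos"] == "verb"


def analyse(word):
    """Return the categories of all accepted analyses of `word`."""
    results = []
    agenda = [(INITIAL, word)]  # pairs of (state category, remaining input)
    while agenda:
        state, rest = agenda.pop()
        if not rest:
            if is_final(state):
                results.append(state)
            continue
        # Non-determinism: follow every allomorph that matches the input
        # prefix and every category the lexicon lists for it.
        for surface, categories in LEXICON.items():
            if rest.startswith(surface):
                for category in categories:
                    successor = combine(state, category, surface)
                    if successor is not None:
                        agenda.append((successor, rest[len(surface):]))
    return results


if __name__ == "__main__":
    print(analyse("walked"))
    print(analyse("walkss"))
```

With this toy lexicon, analyse("walked") yields a single verb category with past tense, while analyse("walkss") yields no result because no rule accepts a second suffix. A word with several accepted paths would produce several results, which corresponds to an ambiguous analysis.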
Want to know what a Malaga development screen looks like? This screenshot was taken on Linux running KDE.
Malaga will probably run on your computer if it is running Unix or a Unix-like operating system. It is known to work on Debian GNU/Linux 3.0 and later, and on Mac OS X. You'll need an ISO C90-compliant C compiler and GLib, version 2.0 or later.
Malaga works with UTF-8 as its only character set. To work with Malaga, your OS must support UTF-8.
Malaga comes with an Emacs Lisp file that defines Emacs modes for editing Malaga grammars and for working interactively with Malaga. If you use Emacs for your everyday work, you'll enjoy these modes because they make working and debugging with Malaga easier.
Malaga comes with the program "malshow", which is a GUI that shows the variables and results of the analysis process. It offers a much clearer display of complex feature structures than the terminal output. "malshow" can also display the analysis tree, which presents the overall progress and the ambiguities in the analysis process. The program "malshow" uses the GUI toolkit GTK+, version 2.8 or later.
The Malaga package includes a German toy syntax as well as a simple morphological parser for English number words and some grammars for formal languages.
Please refer to the version documentation.
Malaga is distributed under the GNU General Public License. This means that you can freely use and copy Malaga, as long as you obey the license conditions. These conditions essentially ensure that Malaga remains free software.
Michael Piotrowski has developed a Perl interface, a Python interface and a Ruby interface for Malaga.
Hannu Väisänen has developed a Java interface and a C# interface for Malaga. You can download them here.
Some of these grammars are free software; others are only commercially available. Please refer to the respective rights-owner.
The grammar test pages have been set up by Cerstin Mahlow and are hosted by the Institut für Computerlinguistik der Universität Zürich. Thanks!
Björn Beutel still maintains Malaga, although he won't implement any major language extensions due to lack of time. Please send any bug reports to björn-beutel@arcor.de (replace "ö" with "oe").