DeSR Dependency Parser

Introduction

DeSR is a dependency parser for natural language sentences.

Among its notable features: deterministic single-pass parsing, handling of non-projective dependencies, fully labeled output, a choice of several learning algorithms, user-defined feature models, and support for standard formats such as CoNLL.

Technique

DeSR is a shift-reduce dependency parser that uses a variant of the approach of Yamada and Matsumoto (2003).

Dependency structures are built by scanning the input from left to right and deciding at each step whether to perform a shift or to create a dependency between two adjacent tokens.

DeSR, however, uses a different set of rules, including additional ones for handling non-projective dependencies, which allow parsing to be performed deterministically in a single pass. The algorithm also produces fully labeled dependency trees.

A classifier is used to learn and predict the proper parsing action at each step. The parser can be configured by selecting among several learning algorithms (Averaged Perceptron, Maximum Entropy, memory-based learning using TiMBL, support vector machines using libSVM), by providing user-defined feature models, and by selecting input/output formats (including the CoNLL shared task format).
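To make the scheme concrete, here is a minimal sketch, in Python, of how a classifier can drive a generic shift-reduce dependency parser. The action set, the classifier interface, and the state representation are illustrative only and do not reflect DeSR's actual rules; dependency labels are omitted for brevity.

   def parse(tokens, predict_action):
       """Sketch of a generic shift-reduce dependency parser.

       tokens: list of word forms for one sentence.
       predict_action: classifier mapping the parser state to one of
           'shift', 'left', or 'right' (an illustrative action set).
       Returns heads, where heads[i] is the index of token i's head
       and -1 marks the root.
       """
       heads = [-1] * len(tokens)
       stack = []                          # partially processed token indices
       buffer = list(range(len(tokens)))   # remaining input token indices

       while buffer or len(stack) > 1:
           action = predict_action(stack, buffer, heads)
           if action == 'shift' and buffer:
               stack.append(buffer.pop(0))     # read the next input token
           elif action == 'left' and len(stack) >= 2:
               dep = stack.pop(-2)             # token below the top becomes
               heads[dep] = stack[-1]          # a dependent of the top token
           elif action == 'right' and len(stack) >= 2:
               dep = stack.pop()               # top token becomes a dependent
               heads[dep] = stack[-1]          # of the token below it
           else:
               break                           # no valid action: stop
       return heads

In DeSR, a trained classifier plays the role of predict_action; the predicted actions also carry dependency labels, and additional rules handle non-projective structures.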

Training

Assuming that both the parser and the configuration file are in the same directory, run:

   desr -t -m modelFile trainFile

to produce a model from a training corpus in CoNLL format.
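In the CoNLL format, each token occupies one line of tab-separated columns (token ID, form, lemma, coarse POS, POS, morphological features, head ID, dependency label, plus two optional projective columns), with a blank line separating sentences. A made-up example sentence (the labels shown are illustrative; actual label sets depend on the treebank):

   1   John    john    N   NNP   _   2   SBJ    _   _
   2   loves   love    V   VBZ   _   0   ROOT   _   _
   3   Mary    mary    N   NNP   _   2   OBJ    _   _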

Be careful when using the SecondOrder option, since it may increase the model size considerably.

Parsing

To parse sentences in CoNLL format, use:

   desr -m modelFile parseFile > parsedFile

If you plan to use a downloaded model file, gunzip it first.
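For example, with a hypothetical downloaded model named modelFile.gz:

   gunzip modelFile.gz
   desr -m modelFile parseFile > parsedFile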

For a full list of options, type:

   desr -h

Classifiers

Several classifiers are available: Maximum Entropy (-aME), Averaged Perceptron (-aAP), memory-based learning (-aMBL), and SVM (-aSVM). The learning algorithm, as well as the features to be used, can also be specified in the configuration file desr.conf.
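For instance, assuming the classifier flag can be combined with the training options shown earlier, a model could be trained with the SVM classifier as follows:

   desr -t -aSVM -m modelFile trainFile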

 
Copyright © 2005-2007 G. Attardi.