| DeSR Dependency Parser |
#include <Corpus.h>


Public Member Functions | |
| Corpus (Language const &lang) | |
| Corpus (Language const &lang, CorpusFormat &format) | |
| Create from specified CorpusFormat. | |
| Corpus (Language const &lang, char const *formatFile) | |
| Read the corpus format from file formatFile. | |
| AttributeId | attributeId (const char *name) |
| virtual SentenceReader * | sentenceReader (std::istream *is) |
| virtual void | print (std::ostream &os, Sentence const &sent) const |
| Print the sentence in the standard format for the corpus. | |
Static Public Member Functions | |
| static Corpus * | create (Language const &language, char const *inputFormat) |
| Factory pattern for creating a Corpus based on the provided format. | |
| static Corpus * | create (char const *language, char const *inputFormat) |
| static CorpusFormat * | parseFormat (char const *formatFile) |
| Read the corpus format from file formatFile. | |
Public Attributes | |
| Language const & | language |
| AttributeIndex | index |
| TokenFields | tokenFields |
Static Protected Member Functions | |
| static CorpusFormat * | parseFormat (std::istream &is) |
Definition at line 98 of file Corpus.h.
| Tanl::Corpus::Corpus | ( | Language const & | lang | ) | [inline] |
| Tanl::Corpus::Corpus | ( | Language const & | lang, | |
| CorpusFormat & | format | |||
| ) | [inline] |
Create from specified CorpusFormat.
| lang | the default language for sentences in the corpus. |
| Tanl::Corpus::Corpus | ( | Language const & | lang, | |
| char const * | formatFile | |||
| ) |
Read the corpus format from file formatFile.
| lang | the default language for sentences in the corpus. |
Definition at line 41 of file Corpus.cpp.
References Tanl::CorpusFormat::index, parseFormat(), and Tanl::CorpusFormat::tokenFields.
| AttributeId Tanl::Corpus::attributeId | ( | const char * | name | ) | [inline] |
| Corpus * Tanl::Corpus::create | ( | Language const & | language, | |
| char const * | inputFormat | |||
| ) | [static] |
Factory pattern for creating a Corpus based on the provided format.
| lang | the default language for sentences in the corpus. | |
| inputFormat | either the name of a built-in format (CoNLL, conll08, DgaXML, Text, or TokenizedText) or the name of a file containing the format specification. |
Definition at line 53 of file Corpus.cpp.
References Corpus(), Tanl::CorpusFormat::name, and parseFormat().
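The dispatch described for inputFormat (a built-in name, otherwise a format file) can be illustrated with a minimal standalone sketch. It does not use the DeSR headers: SketchCorpus, isBuiltinFormat and createCorpus are hypothetical stand-ins, and the case-sensitive name match is an assumption, not documented behavior of Tanl::Corpus::create.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>

// Hypothetical stand-in for a corpus; NOT the real Tanl::Corpus.
struct SketchCorpus {
    std::string language;   // default language for sentences in the corpus
    std::string format;     // built-in format name or format file path
    bool fromFormatFile;    // true when `format` was treated as a file name
};

// Built-in format names as listed in the parameter description above.
// Assumption: names are matched case-sensitively.
inline bool isBuiltinFormat(const std::string& name) {
    static const std::set<std::string> builtins = {
        "CoNLL", "conll08", "DgaXML", "Text", "TokenizedText"};
    return builtins.count(name) != 0;
}

// Factory sketch: a known name selects a built-in format; any other value
// is assumed to be the path of a format specification file.
inline std::unique_ptr<SketchCorpus> createCorpus(const std::string& language,
                                                  const char* inputFormat) {
    assert(inputFormat != nullptr && "inputFormat must not be null");
    const std::string format(inputFormat);
    return std::unique_ptr<SketchCorpus>(
        new SketchCorpus{language, format, !isBuiltinFormat(format)});
}
```

In this sketch, createCorpus("en", "CoNLL") yields a corpus flagged as built-in, while createCorpus("it", "my.format") is flagged as coming from a format file. The real factory presumably also parses that file (it references parseFormat()), which the sketch leaves out.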
| CorpusFormat * Tanl::Corpus::parseFormat | ( | char const * | formatFile | ) | [static] |
Read the corpus format from file formatFile.
Definition at line 74 of file Corpus.cpp.
| virtual SentenceReader* Tanl::Corpus::sentenceReader | ( | std::istream * | is | ) | [virtual] |
Reimplemented in Tanl::TextCorpus and Tanl::TokenizedTextCorpus.
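Because sentenceReader() is virtual, a subclass can return a reader suited to its own input format. The following standalone sketch shows that pattern only. All types (SketchReader, BlockReader, LineReader, BaseCorpusSketch, TextCorpusSketch) and the readAll helper are hypothetical, and the two splitting rules (blank-line-separated blocks versus one sentence per line) are assumptions chosen for illustration, not the behavior of the Tanl classes.

```cpp
#include <cassert>
#include <istream>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical reader interface; NOT the real Tanl::SentenceReader.
struct SketchReader {
    virtual ~SketchReader() {}
    // Stores the next sentence in `out`; returns false at end of input.
    virtual bool next(std::string& out) = 0;
};

// Assumed default rule: a sentence is a block of lines closed by a blank line.
struct BlockReader : SketchReader {
    explicit BlockReader(std::istream* is) : is_(is) { assert(is != nullptr); }
    bool next(std::string& out) override {
        out.clear();
        std::string line;
        while (std::getline(*is_, line)) {
            if (line.empty()) {
                if (!out.empty()) return true;  // blank line closes a sentence
                continue;                       // skip leading blank lines
            }
            if (!out.empty()) out += '\n';
            out += line;
        }
        return !out.empty();  // last block may lack a trailing blank line
    }
    std::istream* is_;
};

// Assumed plain-text rule: one sentence per non-empty line.
struct LineReader : SketchReader {
    explicit LineReader(std::istream* is) : is_(is) { assert(is != nullptr); }
    bool next(std::string& out) override {
        while (std::getline(*is_, out))
            if (!out.empty()) return true;
        return false;
    }
    std::istream* is_;
};

struct BaseCorpusSketch {
    virtual ~BaseCorpusSketch() {}
    virtual std::unique_ptr<SketchReader> sentenceReader(std::istream* is) {
        return std::unique_ptr<SketchReader>(new BlockReader(is));
    }
};

// Overrides the factory method, as a text-oriented subclass might.
struct TextCorpusSketch : BaseCorpusSketch {
    std::unique_ptr<SketchReader> sentenceReader(std::istream* is) override {
        return std::unique_ptr<SketchReader>(new LineReader(is));
    }
};

// Collects every sentence the corpus-specific reader finds in `text`.
inline std::vector<std::string> readAll(BaseCorpusSketch& corpus,
                                        const std::string& text) {
    std::istringstream in(text);
    std::unique_ptr<SketchReader> reader = corpus.sentenceReader(&in);
    std::vector<std::string> sentences;
    std::string s;
    while (reader->next(s)) sentences.push_back(s);
    return sentences;
}
```

Callers hold only a base-class reference, so readAll works unchanged with either corpus type; the override decides how the stream is split.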