| DeSR Dependency Parser |
#include <RegExp.h>
Public Member Functions | |
| Pattern (std::string &expression, int cflags=0) | |
| Pattern (char const *expression, int cflags=0) | |
| Pattern & | operator= (Pattern const &other) |
| Assignment. | |
| bool | test (std::string const &str, int eflags=0) |
| Tests whether the pattern matches the given string str. | |
| bool | test (char const *str, size_t len=0, int eflags=0) |
| Tests whether the pattern matches the given string str, considering only its first len bytes. | |
| int | matchSize (std::string const &text, int eflags=0) |
| Computes the size of the match. | |
| int | match (const char *start, const char *end, MatchGroups &pos, int eflags=0) |
| Matches the text between start and end and returns the matching positions in pos, expressed as byte offsets from start. | |
| int | match (std::string const &text, MatchGroups &pos, int eflags=0) |
| Matches the string text and returns the matching positions in pos, expressed as byte offsets from the beginning of text. | |
| std::vector< std::string > | match (std::string const &str, int eflags=0) |
| std::string | replace (std::string &text, std::string &with, bool replaceAll=false) |
| Replaces the first substring matching the expression within text with the string with. | |
Static Public Member Functions | |
| static std::string | escape (std::string &str) |
| Escapes all meta characters. | |
| static const unsigned char * | setLocale (char const *locale) |
| Set the locale for use during matching. | |
Static Public Attributes | |
| static const unsigned char * | CharTables = Pattern::setLocale(setlocale(LC_CTYPE, 0)) |
| The current chartable to use for matching. | |
A pattern is compiled from a regular expression and used in matching. Regular expressions are written using the Perl 5 syntax.
A simple way to test whether a string matches a pattern is:
Pattern p("a*b");
bool b = p.test("aaab");
In order to extract the portions of the string that match, MatchGroups can be used:
Pattern p("(a*)b");
MatchGroups m(2);
string s("daaab");
int n = p.match(s, m);
n is the number of groups matched: group 0 represents the substring captured by the whole pattern.
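The group numbering and byte offsets described above can be illustrated without the Tanl library. The following is a minimal sketch that uses `std::regex` as a stand-in, not the Pattern implementation itself: `Offsets` and `matchOffsets` are hypothetical names introduced only for this illustration, and `std::regex` uses ECMAScript syntax by default rather than Perl 5, although simple patterns such as `(a*)b` behave the same way in both.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for MatchGroups: one (begin, end) byte-offset pair per group.
using Offsets = std::vector<std::pair<int, int>>;

// Searches text for re and fills pos with the offsets of each group,
// measured from the beginning of text. Returns the number of groups
// reported (group 0 is the whole match), or 0 when nothing matches.
int matchOffsets(const std::regex& re, const std::string& text, Offsets& pos) {
    pos.clear();
    std::smatch m;
    if (!std::regex_search(text, m, re))
        return 0;
    for (std::size_t i = 0; i < m.size(); ++i) {
        if (!m[i].matched) {
            pos.emplace_back(-1, -1);  // group did not participate in the match
            continue;
        }
        int begin = static_cast<int>(m.position(i));
        pos.emplace_back(begin, begin + static_cast<int>(m.length(i)));
    }
    return static_cast<int>(pos.size());
}

// Usage check mirroring the example in the text: "(a*)b" against "daaab".
void demoMatchOffsets() {
    Offsets pos;
    int n = matchOffsets(std::regex("(a*)b"), "daaab", pos);
    assert(n == 2);                                  // group 0 and group 1
    assert(pos[0].first == 1 && pos[0].second == 5); // "aaab"
    assert(pos[1].first == 1 && pos[1].second == 4); // "aaa"
}
```

In this sketch group 0 spans the whole match and group 1 spans the parenthesized part. Whether Pattern::match counts groups in exactly this way should be checked against RegExp.cpp.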
Definition at line 114 of file RegExp.h.
| Tanl::Text::RegExp::Pattern::Pattern | ( | std::string & | expression, | |
| int | cflags = 0 | |||
| ) |
| expression | the regular expression | |
| cflags | a combination of CompileFlags |
| Tanl::Text::RegExp::Pattern::Pattern | ( | char const * | expression, | |
| int | cflags = 0 | |||
| ) |
| expression | the regular expression | |
| cflags | a combination of CompileFlags |
Definition at line 42 of file RegExp.cpp.
References CharTables.
| std::vector<std::string> Tanl::Text::RegExp::Pattern::match | ( | std::string const & | str, | |
| int | eflags = 0 | |||
| ) |
| str | the text to match. | |
| eflags | any combination of EvaluateFlags |
| int Tanl::Text::RegExp::Pattern::match | ( | std::string const & | text, | |
| MatchGroups & | pos, | |||
| int | eflags = 0 | |||
| ) |
Matches the string text and returns the matching positions in pos, expressed as byte offsets from the beginning of text.
| text | the string to match. | |
| pos | the identified matching positions. | |
| eflags | any combination of EvaluateFlags |
| int Tanl::Text::RegExp::Pattern::match | ( | const char * | start, | |
| const char * | end, | |||
| MatchGroups & | pos, | |||
| int | eflags = 0 | |||
| ) |
Matches the text between start and end and returns the matching positions in pos, expressed as byte offsets from start.
| start | start of the text to match. | |
| end | end of the text to match. | |
| pos | the identified matching positions. | |
| eflags | any combination of EvaluateFlags |
Definition at line 144 of file RegExp.cpp.
References Tanl::Text::RegExp::MatchGroups::size().
Referenced by Tanl::TokenSentenceReader::MoveNext(), and Tanl::ConllXSentenceReader::MoveNext().
| int Tanl::Text::RegExp::Pattern::matchSize | ( | std::string const & | text, | |
| int | eflags = 0 | |||
| ) |
Computes the size of the match.
| text | the text to match. | |
| eflags | any combination of EvaluateFlags |
| Pattern & Tanl::Text::RegExp::Pattern::operator= | ( | Pattern const & | other | ) |
Assignment.
Hack to avoid freeing _pcre twice.
Definition at line 155 of file RegExp.h.
References _errorCode, _pcre, _pcre_extra, and subpatterns.
| std::string Tanl::Text::RegExp::Pattern::replace | ( | std::string & | text, | |
| std::string & | with, | |||
| bool | replaceAll = false | |||
| ) |
Replaces the first substring matching the expression within text with the string with.
If replaceAll is true, all occurrences are replaced.
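The difference between replacing the first match and replacing all matches can be sketched with the standard library. This is an illustration under stated assumptions, not the Pattern::replace implementation: `replaceSketch` is a hypothetical name, it returns a new string and leaves its input untouched, and `std::regex_replace` interprets `$1`-style references in the replacement, which Pattern::replace may or may not do.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Replaces the first match of re in text with `with`,
// or every match when replaceAll is true. The input is not modified.
std::string replaceSketch(const std::regex& re, const std::string& text,
                          const std::string& with, bool replaceAll = false) {
    auto flags = replaceAll ? std::regex_constants::format_default
                            : std::regex_constants::format_first_only;
    return std::regex_replace(text, re, with, flags);
}

// Usage check: only the first run of 'a' changes unless replaceAll is set.
void demoReplaceSketch() {
    std::regex re("a+");
    assert(replaceSketch(re, "caab aab", "X") == "cXb aab");
    assert(replaceSketch(re, "caab aab", "X", true) == "cXb Xb");
}
```

Note that Pattern::replace takes text by non-const reference; whether it also modifies text in place is not stated here and should be checked in the source.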
| const unsigned char * Tanl::Text::RegExp::Pattern::setLocale | ( | char const * | locale | ) | [static] |
Set the locale for use during matching.
Use "en_US.iso885915" or similar for recognizing ISO 8859-15 (Latin-9) letters.
Definition at line 30 of file RegExp.cpp.
References CharTables.
| bool Tanl::Text::RegExp::Pattern::test | ( | char const * | str, | |
| size_t | len = 0, | |||
| int | eflags = 0 | |||
| ) |
Tests whether the pattern matches the given string str, considering only its first len bytes.
| str | the string to match. | |
| len | the length of the string to match. | |
| eflags | any combination of EvaluateFlags |
Definition at line 91 of file RegExp.cpp.
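The effect of the len parameter can be sketched with `std::regex` over a bounded character range. Two assumptions are made here and are not confirmed by this page: that the default len = 0 means "use the whole NUL-terminated string", and that the test succeeds when the pattern matches anywhere in the range (search semantics). `testSketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <regex>

// Returns true when re matches somewhere within the first len bytes of str.
// Assumption: len == 0 means the whole NUL-terminated string is considered.
bool testSketch(const std::regex& re, const char* str, std::size_t len = 0) {
    assert(str != nullptr);
    if (len == 0)
        len = std::strlen(str);
    return std::regex_search(str, str + len, re);
}

// Usage check: truncating "aaab" to 3 bytes hides the final 'b'.
void demoTestSketch() {
    std::regex re("a*b");
    assert(testSketch(re, "aaab"));      // whole string: matches
    assert(!testSketch(re, "aaab", 3));  // only "aaa" is visible: no 'b'
}
```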
| bool Tanl::Text::RegExp::Pattern::test | ( | std::string const & | str, | |
| int | eflags = 0 | |||
| ) |
Tests whether the pattern matches the given string str.
| str | the string to match. | |
| eflags | any combination of EvaluateFlags |
Referenced by Parser::State::predicates(), and Parser::ParseState::transition().