| DeSR Dependency Parser |
Namespaces | |
| namespace | RegExp |
| Regular Expression matching. | |
| namespace | Unicode |
| Utilities to handle UTF-8 strings. | |
Classes | |
| class | Char |
| Representation of Unicode characters. | |
| class | Utf8Char |
| This is just a type specifier for use in CharBuffer. | |
| class | CChar |
| This is just a type specifier for use in CharBuffer. | |
| class | CharBuffer |
| A text buffer that provides a random access iterator through it. | |
| class | Encoding |
| class | HtmlTokenizer |
| Similar to StringTokenizer, except that it skips HTML tags. | |
| class | StreamTokenizer |
| class | String |
| Stores and manipulates strings of characters defined according to ISO 10646. | |
| class | StringTokenizer |
| class | Suffixes |
| List of string suffixes. | |
| struct | eqstr |
| struct | eqstrcase |
| struct | WordIndex |
| Associates an ID with each word in a set. | |
| class | WordSetBase |
| class | WordSet |
| Set of words. | |
| struct | NormEqual |
| Compare strings by normalizing to lowercase and discarding non-alphanumeric characters. | |
| struct | NormHash |
| class | NormWordSet |
Typedefs | |
| typedef unsigned short | UCS2 |
| UCS2 holds a single UTF-16 code unit. | |
| typedef int | UCS4 |
| UCS4 represents a Unicode code point. | |
Functions | |
| char | iso8859_to_ascii (char c) |
| Convert an 8-bit ISO 8859-1 (Latin 1) character to its closest 7-bit ASCII equivalent. | |
| bool | operator== (const String &s1, const String &s2) |
| bool | operator== (const String &s1, const std::string &s2) |
| bool | operator== (const String &s1, const char *s2) |
| bool | operator== (const std::string &s1, const String &s2) |
| bool | operator== (const char *s1, const String &s2) |
| bool | operator!= (const String &s1, const String &s2) |
| bool | operator< (const String &s1, const String &s2) |
| bool | operator> (const String &s1, const String &s2) |
| bool | operator<= (const String &s1, const String &s2) |
| bool | operator>= (const String &s1, const String &s2) |
| String | operator+ (const String &s1, const String &s2) |
| String | operator+ (const String &s1, String::CharType *c) |
| String | operator+ (String::CharType *c, const String &s1) |
| String | operator+ (const String &s1, String::CharType c) |
| String | operator+ (String::CharType c, const String &s1) |
| bool | strStartsWith (const char *s1, const char *init) |
| Determine whether string s1 starts with the sequence in init, disregarding case. | |
| void | itoa (register long n, register char *s) |
| Convert a long integer to a string. | |
| void | to_lower (register char *d, register char const *s) |
| Convert a string to lower case. | |
| char * | to_lower (register char *s) |
| Destructively convert a string to lower case. | |
| string & | to_lower (string &s) |
| Convert a string to lower case. | |
| void | to_upper (register char *d, register char const *s) |
| Convert a string to upper case. | |
| char * | to_upper (register char *s) |
| Destructively convert a string to upper case. | |
| string & | to_upper (string &s) |
| Convert a string to upper case. | |
| char const * | next_token (char const *&ptr, const char *sep, char esc) |
| Simple string tokenizer, with escape. | |
| char * | strstr (const char *haystack, const char *needle, size_t count) |
| Variant of strstr() that limits the search to count characters in haystack. | |
| std::string | operator+ (const std::string s, const int i) |
| std::string | operator+ (const int i, const std::string s) |
| std::string | operator+ (const std::string s, const unsigned i) |
| std::string | operator+ (const unsigned i, const std::string s) |
| void | itoa (long, char *) |
| String utilities. | |
| char | to_lower (char c) |
| char * | to_lower (char *) |
| std::string & | to_lower (std::string &) |
| char | to_upper (char c) |
| char * | to_upper (char *) |
| std::string & | to_upper (std::string &) |
| int | strncasecmp (const char *s1, const char *s2) |
| bool | strempty (const char *s) |
| Test for empty string. | |
Variables | |
| char const | iso8859_map [256] |
| char Tanl::Text::iso8859_to_ascii | ( | char | c | ) | [inline] |
Convert an 8-bit ISO 8859-1 (Latin 1) character to its closest 7-bit ASCII equivalent.
(This mostly means that accents are stripped.)
This function exists to ensure that the value of the character used to index the iso8859_map[] vector declared above is unsigned.
| c | The character to be converted. |
International Organization for Standardization. "ISO 8859-1: Information Processing -- 8-bit single-byte coded graphic character sets -- Part 1: Latin alphabet No. 1," 1987.
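The reason for the unsigned index can be illustrated with a small sketch. On platforms where plain char is signed, a Latin-1 byte such as 0xE9 is negative, so using it directly as an array subscript would read before the start of the table. The table below (demo_map) and the function demo_iso8859_to_ascii are hypothetical stand-ins with only three accented letters filled in; the contents of the real iso8859_map[] are not shown in this reference.

```cpp
#include <cassert>

// Hypothetical stand-in for iso8859_map[256]: identity for 7-bit ASCII,
// '?' for everything else, plus three accented letters folded to their
// base letter. This is an assumption for illustration only.
static const char* demo_map()
{
    static char map[256];
    static bool ready = false;
    if (!ready) {
        for (int i = 0; i < 256; ++i)
            map[i] = (i < 128) ? static_cast<char>(i) : '?';
        map[0xC0] = 'A';   // A with grave accent
        map[0xE9] = 'e';   // e with acute accent
        map[0xF1] = 'n';   // n with tilde
        ready = true;
    }
    return map;
}

// Cast to unsigned char before indexing, so that bytes >= 0x80 (negative
// where plain char is signed) select an entry in the range 0..255.
inline char demo_iso8859_to_ascii(char c)
{
    return demo_map()[static_cast<unsigned char>(c)];
}

// Usage check: ASCII passes through, an accented byte loses its accent.
inline void demo_check()
{
    assert(demo_iso8859_to_ascii('x') == 'x');
    assert(demo_iso8859_to_ascii(static_cast<char>(0xE9)) == 'e');
}
```

Wrapping the lookup in a function keeps the cast in one place, so callers cannot forget it.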
| void Tanl::Text::itoa | ( | register long | n, | register char * | s | ) |
Convert a long integer to a string.
| n | The long integer to be converted. |
| s | A pointer to the buffer that receives the resulting string. |
Definition at line 59 of file strings.cpp.
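A conversion of this kind might look like the sketch below. It is not the code in strings.cpp: the name demo_itoa, the decimal base, and the requirement that the caller provides a large enough buffer are all assumptions made for the example.

```cpp
#include <cassert>

// Hypothetical sketch of a long-to-decimal conversion.
// Assumption: s points to a buffer with room for the sign, all digits,
// and the terminating NUL.
void demo_itoa(long n, char* s)
{
    assert(s != nullptr);
    // Work on the magnitude as unsigned so the most negative long
    // does not overflow when negated.
    unsigned long u = (n < 0) ? 0UL - static_cast<unsigned long>(n)
                              : static_cast<unsigned long>(n);
    char* p = s;
    do {                                   // digits, least significant first
        *p++ = static_cast<char>('0' + u % 10);
        u /= 10;
    } while (u != 0);
    if (n < 0)
        *p++ = '-';
    *p = '\0';
    // Reverse in place so the most significant digit comes first.
    for (char *a = s, *b = p - 1; a < b; ++a, --b) {
        char t = *a;
        *a = *b;
        *b = t;
    }
}
```

A buffer of 32 characters is enough for any 64-bit long, for example `char buf[32]; demo_itoa(-42, buf);`.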
| char const * Tanl::Text::next_token | ( | char const *& | ptr, | const char * | sep, | char | esc | ) |
Simple string tokenizer, with escape.
A token is a sequence of characters delimited by characters in sep, except when preceded by esc.
| ptr | Advances ptr to the end of the token. |
| sep | Sequence of delimiting characters. |
| esc | An escape character for line continuation. |
Definition at line 223 of file strings.cpp.
References next_token().
Referenced by next_token().
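The idea of an escape-aware tokenizer can be sketched as follows. This is a hypothetical illustration, not the implementation in strings.cpp: demo_next_token returns the token through a std::string so that the example is self-contained, and dropping the escape character from the result is an assumption that the reference does not confirm.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical sketch. Skips leading separators, then collects characters
// up to the next unescaped separator. A character preceded by esc is taken
// literally, so an escaped separator stays inside the token.
// Returns false when no token is left; ptr is advanced past what was read.
bool demo_next_token(const char*& ptr, const char* sep, char esc,
                     std::string& token)
{
    assert(ptr != nullptr && sep != nullptr);
    token.clear();
    while (*ptr && std::strchr(sep, *ptr))    // skip leading separators
        ++ptr;
    if (*ptr == '\0')
        return false;
    while (*ptr && !std::strchr(sep, *ptr)) {
        if (*ptr == esc && ptr[1] != '\0')    // escaped: keep the next char
            ++ptr;
        token += *ptr++;
    }
    return true;
}
```

Called repeatedly on the input `a b\ c d` with a space as separator and a backslash as escape, this sketch yields the tokens `a`, `b c`, and `d`.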
| bool Tanl::Text::strempty | ( | const char * | s | ) | [inline] |
| string& Tanl::Text::to_lower | ( | string & | s | ) |
Convert a string to lower case.
| s | The string to be converted. |
Definition at line 121 of file strings.cpp.
| char* Tanl::Text::to_lower | ( | register char * | s | ) |
Destructively convert a string to lower case.
| s | The string to be converted. |
Definition at line 105 of file strings.cpp.
References to_lower().
| void Tanl::Text::to_lower | ( | register char * | d, | register char const * | s | ) |
Convert a string to lower case.
| d | The destination string. |
| s | The string to be converted. |
Definition at line 90 of file strings.cpp.
Referenced by strStartsWith(), and to_lower().
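The relationship between the copying and the destructive overloads can be sketched as below. The cross-reference above is consistent with the in-place form delegating to the copying form, but it does not confirm it; the names demo_to_lower, the use of std::tolower, and the returned pointer being the argument itself are assumptions for illustration.

```cpp
#include <cassert>
#include <cctype>

// Copying form: writes a lower-cased copy of s into d.
// Assumption: d has room for strlen(s) + 1 characters.
void demo_to_lower(char* d, const char* s)
{
    assert(d != nullptr && s != nullptr);
    while (*s) {
        // Go through unsigned char: passing a negative value other than
        // EOF to std::tolower is undefined behavior.
        *d++ = static_cast<char>(
            std::tolower(static_cast<unsigned char>(*s++)));
    }
    *d = '\0';
}

// Destructive form: converts in place by using the same buffer as source
// and destination, then returns its argument.
char* demo_to_lower(char* s)
{
    demo_to_lower(s, s);
    return s;
}
```

Reading and writing the same position in each step is what makes the in-place call safe here; the copying form never writes ahead of where it reads.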
| string& Tanl::Text::to_upper | ( | string & | s | ) |
Convert a string to upper case.
| s | The string to be converted. |
Definition at line 172 of file strings.cpp.
| char* Tanl::Text::to_upper | ( | register char * | s | ) |
Destructively convert a string to upper case.
| s | The string to be converted. |
Definition at line 156 of file strings.cpp.
References to_upper().
| void Tanl::Text::to_upper | ( | register char * | d, | register char const * | s | ) |
Convert a string to upper case.
| d | The destination string. |
| s | The string to be converted. |
Definition at line 141 of file strings.cpp.
Referenced by to_upper().