DeSR Dependency Parser

Tanl::Text Namespace Reference

Text handling and internationalization support.


Namespaces

namespace  RegExp
 Regular Expression matching.
namespace  Unicode
 Utilities to handle UTF-8 strings.

Classes

class  Char
 Representation of Unicode characters.
class  Utf8Char
 This is just a type specifier for use in CharBuffer.
class  CChar
 This is just a type specifier for use in CharBuffer.
class  CharBuffer
 A text buffer that provides a random access iterator through it.
class  Encoding
class  HtmlTokenizer
 Similar to StringTokenizer, except that it skips HTML tags.
class  StreamTokenizer
class  String
 String class. Stores and manipulates strings of characters defined according to ISO 10646.
class  StringTokenizer
class  Suffixes
 List of string suffixes.
struct  eqstr
struct  eqstrcase
struct  WordIndex
 Associates an ID to each word in a set.
class  WordSetBase
class  WordSet
 Set of words.
struct  NormEqual
 Compare strings by normalizing to lowercase and discarding non-alphanumeric characters.
struct  NormHash
class  NormWordSet
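
The comparison described for NormEqual can be sketched as follows. This is only an illustration under stated assumptions: the names `demo::normalize` and `demo::NormEqualSketch` are hypothetical, the sketch works on `std::string` and ASCII only, and the actual NormEqual may use different argument types and normalization rules.

```cpp
#include <cctype>
#include <string>

namespace demo {  // hypothetical namespace, not Tanl::Text

// Keep only alphanumeric characters, folded to lower case.
inline std::string normalize(const std::string& s) {
    std::string out;
    for (unsigned char c : s)
        if (std::isalnum(c))
            out += static_cast<char>(std::tolower(c));
    return out;
}

// Equality predicate over normalized forms (assumed behaviour).
struct NormEqualSketch {
    bool operator()(const std::string& a, const std::string& b) const {
        return normalize(a) == normalize(b);
    }
};

}  // namespace demo
```

A predicate of this kind is typically paired with a hash computed on the same normalized form, so that strings that compare equal also hash equal.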

Typedefs

typedef unsigned short UCS2
 UCS2 holds a single UTF-16 code unit.
typedef int UCS4
 UCS4 represents a Unicode code point.

Functions

char iso8859_to_ascii (char c)
 Convert an 8-bit ISO 8859-1 (Latin 1) character to its closest 7-bit ASCII equivalent.
bool operator== (const String &s1, const String &s2)
bool operator== (const String &s1, const std::string &s2)
bool operator== (const String &s1, const char *s2)
bool operator== (const std::string &s1, const String &s2)
bool operator== (const char *s1, const String &s2)
bool operator!= (const String &s1, const String &s2)
bool operator< (const String &s1, const String &s2)
bool operator> (const String &s1, const String &s2)
bool operator<= (const String &s1, const String &s2)
bool operator>= (const String &s1, const String &s2)
String operator+ (const String &s1, const String &s2)
String operator+ (const String &s1, String::CharType *c)
String operator+ (String::CharType *c, const String &s1)
String operator+ (const String &s1, String::CharType c)
String operator+ (String::CharType c, const String &s1)
bool strStartsWith (const char *s1, const char *init)
 Determine whether string s1 starts with the sequence in init, disregarding case.
void itoa (register long n, register char *s)
 Convert a long integer to a string.
void to_lower (register char *d, register char const *s)
 Convert a string to lower case.
char * to_lower (register char *s)
 Destructively convert a string to lower case.
string & to_lower (string &s)
 Convert a string to lower case.
void to_upper (register char *d, register char const *s)
 Convert a string to upper case.
char * to_upper (register char *s)
 Destructively convert a string to upper case.
string & to_upper (string &s)
 Convert a string to upper case.
char const * next_token (char const *&ptr, const char *sep, char esc)
 Simple string tokenizer, with escape.
char * strstr (const char *haystack, const char *needle, size_t count)
 Variant of strstr() which limits search to count characters in haystack.
std::string operator+ (const std::string s, const int i)
std::string operator+ (const int i, const std::string s)
std::string operator+ (const std::string s, const unsigned i)
std::string operator+ (const unsigned i, const std::string s)
void itoa (long, char *)
 String utilities.
char to_lower (char c)
char * to_lower (char *)
std::string & to_lower (std::string &)
char to_upper (char c)
char * to_upper (char *)
std::string & to_upper (std::string &)
int strncasecmp (const char *s1, const char *s2)
bool strempty (const char *s)
 Test for empty string.
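
The bounded strstr() variant listed above has no detailed entry on this page, so the following is a minimal sketch of one plausible reading: a match must lie entirely within the first count characters of haystack. The name `demo::strstr_n` is hypothetical, and the library's handling of edge cases may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace demo {  // hypothetical namespace, not Tanl::Text

// Return a pointer to the first occurrence of needle that lies entirely
// within the first count characters of haystack, or nullptr if there is none.
inline const char* strstr_n(const char* haystack, const char* needle,
                            std::size_t count) {
    assert(haystack != nullptr && needle != nullptr);
    const std::size_t nlen = std::strlen(needle);
    // Never look past the terminator, even if count is larger.
    std::size_t hlen = 0;
    while (hlen < count && haystack[hlen] != '\0')
        ++hlen;
    if (nlen > hlen)
        return nullptr;
    for (std::size_t i = 0; i + nlen <= hlen; ++i)
        if (std::memcmp(haystack + i, needle, nlen) == 0)
            return haystack + i;
    return nullptr;
}

}  // namespace demo
```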

Variables

char const iso8859_map [256]


Detailed Description

Text handling and internationalization support.

See also:
ICU


Function Documentation

char Tanl::Text::iso8859_to_ascii ( char  c  )  [inline]

Convert an 8-bit ISO 8859-1 (Latin 1) character to its closest 7-bit ASCII equivalent.

(This mostly means that accents are stripped.)

This function exists to ensure that the value of the character used to index the iso8859_map[] vector declared above is unsigned.

Parameters:
c The character to be converted.
Returns:
The converted character.
See also:

International Standards Organization. "ISO 8859-1: Information Processing -- 8-bit single-byte coded graphic character sets -- Part 1: Latin alphabet No. 1," 1987.

Definition at line 57 of file charmap.h.
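
The remark about unsigned indexing can be illustrated with a small sketch. It assumes a hypothetical table built by `demo::make_map` that folds only a handful of accented letters; the contents of the real iso8859_map are not shown on this page and may differ.

```cpp
#include <array>
#include <cstddef>

namespace demo {  // hypothetical namespace, not Tanl::Text

// Illustrative 256-entry table: identity everywhere, with a few accented
// Latin-1 letters folded to plain ASCII.
inline std::array<char, 256> make_map() {
    std::array<char, 256> m{};
    for (std::size_t i = 0; i < m.size(); ++i)
        m[i] = static_cast<char>(i);
    m[0xC9] = 'E';  // E with acute
    m[0xE0] = 'a';  // a with grave
    m[0xE9] = 'e';  // e with acute
    m[0xF1] = 'n';  // n with tilde
    return m;
}

inline char to_ascii(char c) {
    static const std::array<char, 256> map = make_map();
    // Plain char may be signed; converting to unsigned char first keeps the
    // index in the range 0..255 instead of letting it go negative.
    return map[static_cast<unsigned char>(c)];
}

}  // namespace demo
```

Without the conversion, a character such as 0xE9 would become a negative index on platforms where char is signed, reading outside the table.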

void Tanl::Text::itoa ( register long  n,
register char *  s 
)

Convert a long integer to a string.

Parameters:
n The long integer to be converted.
s A pointer to the string.

Definition at line 59 of file strings.cpp.
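
A conversion with this signature could look like the sketch below. It is not the code from strings.cpp: the name `demo::itoa` is hypothetical, decimal output is assumed, and the caller is assumed to supply a buffer large enough for the digits, an optional sign, and the terminator.

```cpp
#include <cassert>
#include <cstddef>

namespace demo {  // hypothetical namespace, not Tanl::Text

// Write the decimal representation of n into s, NUL-terminated.
inline void itoa(long n, char* s) {
    assert(s != nullptr);
    // Work on the unsigned magnitude so that LONG_MIN does not overflow.
    unsigned long u = static_cast<unsigned long>(n);
    if (n < 0) {
        *s++ = '-';
        u = 0UL - u;
    }
    // Produce digits least-significant first into a scratch buffer.
    char tmp[24];
    std::size_t len = 0;
    do {
        tmp[len++] = static_cast<char>('0' + u % 10);
        u /= 10;
    } while (u != 0);
    // Copy them out in the correct order and terminate.
    while (len > 0)
        *s++ = tmp[--len];
    *s = '\0';
}

}  // namespace demo
```

A 24-byte destination buffer is enough for any 64-bit long.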

char const * Tanl::Text::next_token ( char const *&  ptr,
const char *  sep,
char  esc 
)

Simple string tokenizer, with escape.

A token is a sequence of characters delimited by characters in sep, except when preceded by esc.

Parameters:
ptr Pointer into the input string; advanced to the end of the token.
sep Sequence of delimiting characters.
esc An escape character for line continuation.
Returns:
The first token from ptr.

Definition at line 223 of file strings.cpp.

References next_token().

Referenced by next_token().
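
The escape rule can be sketched as follows. This is a simplified illustration, not the implementation in strings.cpp: the hypothetical `demo::next_token` returns a `std::string` rather than `char const *`, skips leading delimiters, and treats esc followed by any character as that character taken literally.

```cpp
#include <cassert>
#include <cstring>
#include <string>

namespace demo {  // hypothetical namespace, not Tanl::Text

// Return the next token starting at ptr and advance ptr past it.
// Returns an empty string when no token remains.
inline std::string next_token(const char*& ptr, const char* sep, char esc) {
    assert(ptr != nullptr && sep != nullptr);
    // Skip leading delimiters.
    while (*ptr != '\0' && std::strchr(sep, *ptr) != nullptr)
        ++ptr;
    std::string token;
    while (*ptr != '\0') {
        if (*ptr == esc && ptr[1] != '\0') {
            // Escaped character: drop the escape, keep the next one as is.
            token += ptr[1];
            ptr += 2;
        } else if (std::strchr(sep, *ptr) != nullptr) {
            break;  // an unescaped delimiter ends the token
        } else {
            token += *ptr++;
        }
    }
    return token;
}

}  // namespace demo
```

Taking ptr by reference lets the caller invoke the function repeatedly on the same pointer until an empty token is returned.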

bool Tanl::Text::strempty ( const char *  s  )  [inline]

Test for empty string.

Returns:
true if string s is null or empty.

Definition at line 114 of file strings.h.

string& Tanl::Text::to_lower ( string &  s  ) 

Convert a string to lower case.

Parameters:
s The string to be converted.
Returns:
The modified string converted to lower-case.

Definition at line 121 of file strings.cpp.

char* Tanl::Text::to_lower ( register char *  s  ) 

Destructively convert a string to lower case.

Parameters:
s The string to be converted.
Returns:
The same string, after conversion.

Definition at line 105 of file strings.cpp.

References to_lower().

void Tanl::Text::to_lower ( register char *  d,
register char const *  s 
)

Convert a string to lower case.

Parameters:
d The destination string.
s The string to be converted.

Definition at line 90 of file strings.cpp.

Referenced by strStartsWith(), and to_lower().
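
The three to_lower overloads relate to each other roughly as in the sketch below, which assumes (based on the "References to_lower()" note) that the in-place form delegates to the copying form. The code is illustrative, lives in a hypothetical `demo` namespace, and relies on `std::tolower`, so it is locale-dependent and only meaningful for single-byte characters.

```cpp
#include <cassert>
#include <cctype>
#include <string>

namespace demo {  // hypothetical namespace, not Tanl::Text

// Copying form: write the lower-cased s into d, which must have room for
// strlen(s) + 1 bytes. d may be the same buffer as s.
inline void to_lower(char* d, const char* s) {
    assert(d != nullptr && s != nullptr);
    while (*s != '\0')
        *d++ = static_cast<char>(std::tolower(static_cast<unsigned char>(*s++)));
    *d = '\0';
}

// Destructive form: convert in place by using s as its own destination.
inline char* to_lower(char* s) {
    to_lower(s, s);
    return s;
}

// std::string form: modify the argument and return a reference to it.
inline std::string& to_lower(std::string& s) {
    for (char& c : s)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return s;
}

}  // namespace demo
```

The cast to unsigned char before calling `std::tolower` matters for the same reason as in iso8859_to_ascii: passing a negative char value is undefined behaviour. The to_upper overloads would follow the same pattern with `std::toupper`.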

string& Tanl::Text::to_upper ( string &  s  ) 

Convert a string to upper case.

Parameters:
s The string to be converted.
Returns:
The modified string converted to upper-case.

Definition at line 172 of file strings.cpp.

char* Tanl::Text::to_upper ( register char *  s  ) 

Destructively convert a string to upper case.

Parameters:
s The string to be converted.
Returns:
The same string, after conversion.

Definition at line 156 of file strings.cpp.

References to_upper().

void Tanl::Text::to_upper ( register char *  d,
register char const *  s 
)

Convert a string to upper case.

Parameters:
d The destination string.
s The string to be converted.

Definition at line 141 of file strings.cpp.

Referenced by to_upper().

 
Copyright © 2005-2007 G. Attardi. Generated on 13 Aug 2009 by doxygen 1.5.7.1.