DeSR Dependency Parser

Tanl::Text::HtmlTokenizer Class Reference

Similar to StringTokenizer, except that it skips HTML tags.

#include <HtmlTokenizer.h>

Public Member Functions

 HtmlTokenizer (istream &is, char const *delim=delimitersNL)
 Tokenize text read from the stream is into words delimited by the characters in delim.
 HtmlTokenizer (char const *s, char const *end=0, char const *delim=delimitersNL)
 Tokenize the text between s and end into words delimited by the characters in delim.
char const * next ()
char const * hasNext ()
 Tell whether there is a next token.

Static Public Attributes

static char const delimitersNL [] = " \t\n\r"
 Default delimiters: whitespace, including newlines.


Detailed Description

Similar to StringTokenizer, except that it skips HTML tags.

Definition at line 38 of file HtmlTokenizer.h.


Constructor & Destructor Documentation

Tanl::Text::HtmlTokenizer::HtmlTokenizer ( istream &  is,
char const *  delim = delimitersNL 
) [inline]

Tokenize text read from the stream is into words delimited by the characters in delim.

Parameters:
is input stream
delim string of delimiting characters

Definition at line 58 of file HtmlTokenizer.h.

Tanl::Text::HtmlTokenizer::HtmlTokenizer ( char const *  s,
char const *  end = 0,
char const *  delim = delimitersNL 
) [inline]

Tokenize the text between s and end into words delimited by the characters in delim.

Parameters:
s string beginning
end string end
delim string of delimiting characters

Definition at line 69 of file HtmlTokenizer.h.
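
The range-based constructor above can be illustrated with a small, self-contained sketch. It is not the Tanl implementation: splitRange and isDelim are hypothetical helpers, HTML tag skipping is left out, and treating end = 0 as "read up to the terminating NUL" is an assumption that this page does not confirm.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: true if c is one of the characters in delim.
// The NUL character is never treated as a delimiter.
static bool isDelim(char c, char const* delim)
{
    return c != '\0' && std::strchr(delim, c) != 0;
}

// Hypothetical helper, not part of Tanl: split the text in [s, end) into
// tokens separated by any character of delim.
// Assumption: end == 0 means "read up to the terminating NUL".
std::vector<std::string> splitRange(char const* s, char const* end = 0,
                                    char const* delim = " \t\n\r")
{
    assert(s != 0 && delim != 0);
    if (end == 0)
        end = s + std::strlen(s);
    std::vector<std::string> tokens;
    char const* p = s;
    while (p < end) {
        // Skip a run of delimiter characters.
        while (p < end && isDelim(*p, delim))
            ++p;
        // Collect characters up to the next delimiter or the end of range.
        char const* start = p;
        while (p < end && !isDelim(*p, delim))
            ++p;
        if (p > start)
            tokens.push_back(std::string(start, p));
    }
    return tokens;
}
```

Passing an explicit end restricts tokenization to a prefix of the buffer without copying it, and a custom delim string replaces the whitespace default.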


Member Function Documentation

char const * Tanl::Text::HtmlTokenizer::next (  ) 

Returns:
next token.

Definition at line 75 of file HtmlTokenizer.cpp.
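
The hasNext()/next() pair suggests a pull-style iteration. The sketch below is a hypothetical stand-in (SketchHtmlTokenizer), not the Tanl class: it assumes that hasNext() returns the pending token or 0 at the end of input, that next() returns the same token and consumes it, and that a tag is anything from '<' to the following '>'. It also treats a tag as a token boundary, which the real class may or may not do.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in, NOT the Tanl implementation: it only mirrors the
// hasNext()/next() shape documented above.
class SketchHtmlTokenizer
{
public:
    explicit SketchHtmlTokenizer(char const* s, char const* delim = " \t\n\r")
        : pos(s), delims(delim), ready(false)
    {
        assert(s != 0 && delim != 0);
    }

    // Assumption: returns the pending token, or 0 when input is exhausted.
    char const* hasNext()
    {
        if (!ready) {
            scan();
            ready = true;
        }
        return token.empty() ? 0 : token.c_str();
    }

    // Returns the pending token and consumes it, or 0 at the end of input.
    // The pointer stays valid until the next call to hasNext() or next().
    char const* next()
    {
        char const* t = hasNext();
        ready = false;
        return t;
    }

private:
    // Load the next token into `token`; leave it empty at end of input.
    void scan()
    {
        token.clear();
        for (;;) {
            // Skip delimiter characters.
            while (*pos && std::strchr(delims, *pos))
                ++pos;
            if (*pos != '<')
                break;
            // Skip a whole tag, up to and including the closing '>'.
            while (*pos && *pos != '>')
                ++pos;
            if (*pos == '>')
                ++pos;
        }
        // Collect characters until a delimiter, a tag start, or the end.
        while (*pos && *pos != '<' && !std::strchr(delims, *pos))
            token += *pos++;
    }

    char const* pos;     // current read position in the input
    char const* delims;  // delimiter characters
    std::string token;   // pending token, empty when none is left
    bool ready;          // true if `token` holds an unconsumed scan result
};
```

Under these assumptions, a typical loop reads `while (tok.hasNext()) { char const* t = tok.next(); /* use t */ }`.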


The documentation for this class was generated from the following files:
 HtmlTokenizer.h
 HtmlTokenizer.cpp
Copyright © 2005-2007 G. Attardi. Generated on 13 Aug 2009 by doxygen 1.5.7.1.