FAXPP_Tokenizer Struct Reference

The tokenizer structure.

#include <tokenizer.h>


Related Functions

(Note that these are not member functions.)

FAXPP_Tokenizer * FAXPP_create_tokenizer (FAXPP_Transcoder encode)
 Creates a tokenizer object.
void FAXPP_free_tokenizer (FAXPP_Tokenizer *tokenizer)
 Frees a tokenizer object.
FAXPP_DecodeFunction FAXPP_get_tokenizer_decode (const FAXPP_Tokenizer *tokenizer)
 Returns the current FAXPP_DecodeFunction that the tokenizer is using.
void FAXPP_set_tokenizer_decode (FAXPP_Tokenizer *tokenizer, FAXPP_DecodeFunction decode)
 Sets the FAXPP_DecodeFunction that the tokenizer uses to decode the XML document.
FAXPP_Error FAXPP_init_tokenize (FAXPP_Tokenizer *tokenizer, void *buffer, unsigned int length, unsigned int done)
 Initialize the tokenizer to tokenize the given buffer, returning strings encoded using the given encoding function.
FAXPP_Error FAXPP_tokenizer_release_buffer (FAXPP_Tokenizer *tokenizer, void **buffer_position)
 Instructs the tokenizer to release any dependencies it has on its current buffer.
FAXPP_Error FAXPP_continue_tokenize (FAXPP_Tokenizer *tokenizer, void *buffer, unsigned int length, unsigned int done)
 Provides a new buffer for the tokenizer to continue tokenizing.
FAXPP_Error FAXPP_next_token (FAXPP_Tokenizer *tokenizer)
 Reads the next token from the buffer, placing the information for it into the current token.
const FAXPP_Token * FAXPP_get_current_token (const FAXPP_Tokenizer *tokenizer)
 Returns the current token produced by the tokenizer when FAXPP_next_token() was called.
unsigned int FAXPP_get_tokenizer_nesting_level (const FAXPP_Tokenizer *tokenizer)
 Returns the current element nesting level in the XML document.
unsigned int FAXPP_get_tokenizer_error_line (const FAXPP_Tokenizer *tokenizer)
 Returns the line that the current error occurred on.
unsigned int FAXPP_get_tokenizer_error_column (const FAXPP_Tokenizer *tokenizer)
 Returns the column that the current error occurred on.


Detailed Description

The tokenizer structure.

Details of the structure are private.

See also:
tokenizer.h


Friends And Related Function Documentation

FAXPP_Tokenizer * FAXPP_create_tokenizer ( FAXPP_Transcoder  encode  )  [related]

Creates a tokenizer object.

Parameters:
encode The transcoder to use when encoding token values
Returns:
A pointer to the tokenizer object, or 0 if out of memory.

void FAXPP_free_tokenizer ( FAXPP_Tokenizer *  tokenizer  )  [related]

Frees a tokenizer object.

Parameters:
tokenizer The tokenizer to free

FAXPP_DecodeFunction FAXPP_get_tokenizer_decode ( const FAXPP_Tokenizer *  tokenizer  )  [related]

Returns the current FAXPP_DecodeFunction that the tokenizer is using.

Parameters:
tokenizer 
Returns:
The decode function

void FAXPP_set_tokenizer_decode ( FAXPP_Tokenizer *  tokenizer,
FAXPP_DecodeFunction  decode 
) [related]

Sets the FAXPP_DecodeFunction that the tokenizer uses to decode the XML document.

This will typically be called when an encoding declaration is read, to switch to the correct decode function.

Parameters:
tokenizer 
decode The decode function

FAXPP_Error FAXPP_init_tokenize ( FAXPP_Tokenizer *  tokenizer,
void *  buffer,
unsigned int  length,
unsigned int  done 
) [related]

Initialize the tokenizer to tokenize the given buffer, returning strings encoded using the given encoding function.

Parameters:
tokenizer The tokenizer to initialize
buffer A pointer to the start of the buffer to tokenize
length The length of the given buffer
done Set to non-zero if this is the last buffer from the input
Return values:
UNSUPPORTED_ENCODING If the encoding sniffing algorithm cannot recognize the encoding of the buffer
NO_ERROR 
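
One way to obtain the length and done arguments when the input comes from a file is sketched below. read_chunk is a hypothetical helper, not part of Faxpp. It uses only the C standard library and peeks one byte ahead, so that done is reported on the chunk that actually ends the input.

```c
#include <stdio.h>

/* Hypothetical helper, not part of Faxpp.
 * Reads up to `capacity` bytes from `file` into `buffer` and returns the
 * number of bytes read. Sets `*done` to 1 if no further input is available
 * after this chunk (end of file or a read error), otherwise to 0. */
size_t read_chunk(FILE *file, void *buffer, size_t capacity, unsigned int *done)
{
  size_t length = fread(buffer, 1, capacity, file);

  /* Peek at the next byte to find out whether this was the last chunk. */
  int next = getc(file);
  if (next == EOF) {
    *done = 1;
  } else {
    ungetc(next, file); /* push the byte back for the next read */
    *done = 0;
  }
  return length;
}
```

The returned length and the done flag could then be passed to FAXPP_init_tokenize() for the first chunk. Because length is an unsigned int in this API, the chunk capacity should not exceed UINT_MAX.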

FAXPP_Error FAXPP_tokenizer_release_buffer ( FAXPP_Tokenizer *  tokenizer,
void **  buffer_position 
) [related]

Instructs the tokenizer to release any dependencies it has on its current buffer.

This is typically called on receiving a PREMATURE_END_OF_BUFFER error, before using FAXPP_continue_tokenize() to provide a new buffer. In this case, the buffer data between *buffer_position and the end of the buffer must be copied into the start of the new buffer.

Parameters:
tokenizer 
[out] buffer_position Set to a pointer in the current buffer that the tokenizer has tokenized up to
Return values:
OUT_OF_MEMORY 
NO_ERROR 
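
The copy of the unconsumed tail into the new buffer can be sketched as follows. carry_over_tail is a hypothetical helper, not part of Faxpp. It assumes that position is the value written to *buffer_position, that is, a pointer inside the old buffer or one past its end.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Faxpp.
 * Copies the bytes from `position` up to the end of the old buffer into the
 * start of `new_buffer`. `position` must point into the old buffer (or one
 * past its end). Returns the number of bytes carried over, or (size_t)-1 if
 * they do not fit into `new_capacity`. */
size_t carry_over_tail(const void *old_buffer, size_t old_length,
                       const void *position,
                       void *new_buffer, size_t new_capacity)
{
  const char *end = (const char *)old_buffer + old_length;
  size_t tail = (size_t)(end - (const char *)position);

  if (tail > new_capacity) return (size_t)-1;

  /* memmove rather than memcpy, in case the caller reuses the same
   * allocation for the old and the new buffer. */
  memmove(new_buffer, position, tail);
  return tail;
}
```

New input would then be appended after the carried-over bytes, and presumably the combined length is what gets passed to FAXPP_continue_tokenize().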

FAXPP_Error FAXPP_continue_tokenize ( FAXPP_Tokenizer *  tokenizer,
void *  buffer,
unsigned int  length,
unsigned int  done 
) [related]

Provides a new buffer for the tokenizer to continue tokenizing.

FAXPP_tokenizer_release_buffer() should have been called before this, and the remaining data in the old buffer transferred to the new one.

Parameters:
tokenizer 
buffer A pointer to the start of the buffer to tokenize
length The length of the given buffer
done Set to non-zero if this is the last buffer from the input
Return values:
NO_ERROR 

FAXPP_Error FAXPP_next_token ( FAXPP_Tokenizer *  tokenizer  )  [related]

Reads the next token from the buffer, placing the information for it into the current token.

Parameters:
tokenizer 
Return values:
DOUBLE_DASH_IN_COMMENT 
PREMATURE_END_OF_BUFFER 
INVALID_START_OF_COMMENT 
INVALID_CHAR_IN_START_ELEMENT 
INVALID_CHAR_IN_ATTRIBUTE 
INVALID_CHAR_IN_END_ELEMENT 
NON_WHITESPACE_OUTSIDE_DOC_ELEMENT 
BAD_ENCODING 
UNSUPPORTED_ENCODING 
ADDITIONAL_DOCUMENT_ELEMENT 
INVALID_CHAR_IN_PI_NAME 
INVALID_PI_NAME_OF_XML 
INVALID_CHAR_IN_ELEMENT_NAME 
INVALID_CHAR_IN_ATTRIBUTE_NAME 
RESTRICTED_CHAR 
INVALID_CHAR_IN_ENTITY_REFERENCE 
INVALID_CHAR_IN_CHAR_REFERENCE 
INVALID_CHAR_IN_XML_DECL 
EXPECTING_EQUALS 
EXPECTING_WHITESPACE 
UNKNOWN_XML_VERSION 
INVALID_ENCODING_VALUE 
OUT_OF_MEMORY 
NO_ERROR 

const FAXPP_Token * FAXPP_get_current_token ( const FAXPP_Tokenizer *  tokenizer  )  [related]

Returns the current token produced by the tokenizer when FAXPP_next_token() was called.

Parameters:
tokenizer 
Returns:
The current token

unsigned int FAXPP_get_tokenizer_nesting_level ( const FAXPP_Tokenizer *  tokenizer  )  [related]

Returns the current element nesting level in the XML document.

Parameters:
tokenizer 
Returns:
The current nesting level

unsigned int FAXPP_get_tokenizer_error_line ( const FAXPP_Tokenizer *  tokenizer  )  [related]

Returns the line that the current error occurred on.

Parameters:
tokenizer 
Returns:
The line number

unsigned int FAXPP_get_tokenizer_error_column ( const FAXPP_Tokenizer *  tokenizer  )  [related]

Returns the column that the current error occurred on.

Parameters:
tokenizer 
Returns:
The column number


The documentation for this struct was generated from the following file:
tokenizer.h

Generated on Thu Mar 20 02:12:09 2008 for Faxpp by doxygen 1.5.1