AbstractTokenizer

Defines a general tokenizer.

Description

The AbstractTokenizer class defines a general tokenizer.

Fields

LastTokenType

Last token type (TokenType)

LastTokenType: int

NextTokenValue

Next token value

NextTokenValue: *Token

Scanner

Scanner that supplies the input characters

Scanner: IScanner

commentState

Comment state

commentState: ICommentState

decodeStrings

Boolean that defines the option to decode strings or not.

decodeStrings: bool

mergeWhitespaces

Boolean that defines the option to merge consecutive white spaces.

mergeWhitespaces: bool

numberState

Number state

numberState: INumberState

quoteState

Quote state

quoteState: IQuoteState

skipComments

Boolean that defines the option to skip comments.

skipComments: bool

skipEof

Boolean that defines the option to skip EOF.

skipEof: bool

skipUnknown

Boolean that defines the option to skip unknowns.

skipUnknown: bool

skipWhitespaces

Boolean that defines the option to skip white spaces.

skipWhitespaces: bool

symbolState

Symbol state

symbolState: ISymbolState

unifyNumbers

Boolean that defines the option to unify numbers.

unifyNumbers: bool

whitespaceState

White space state.

whitespaceState: IWhitespaceState

wordState

Word state.

wordState: IWordState

Methods

ClearCharacterStates

Clears all character states.

(c *AbstractTokenizer) ClearCharacterStates()

GetCharacterState

Gets the state for a given character.

(c *AbstractTokenizer) GetCharacterState(symbol rune) ITokenizerState

  • symbol: rune - character to look up
  • returns: ITokenizerState - state assigned to the character

HasNextToken

Finds out if the tokenizer has a next token.

(c *AbstractTokenizer) HasNextToken() bool

  • returns: bool - true if it has a next token, false otherwise.

NextToken

Gets the next token.

(c *AbstractTokenizer) NextToken() *Token

  • returns: *Token - next token
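
HasNextToken and NextToken are typically used together in a loop. The sketch below illustrates that pattern with a toy, self-contained tokenizer; the `token` and `toyTokenizer` types, the whitespace splitting, and the nil return at end of input are assumptions made for illustration, not the library's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// token is a hypothetical stand-in for the library's Token type.
type token struct{ value string }

// toyTokenizer holds pre-split words; it only illustrates the
// "check, then fetch" iteration contract.
type toyTokenizer struct {
	words []string
	pos   int
}

func newToyTokenizer(input string) *toyTokenizer {
	return &toyTokenizer{words: strings.Fields(input)}
}

// hasNextToken reports whether another token is available.
func (t *toyTokenizer) hasNextToken() bool { return t.pos < len(t.words) }

// nextToken returns the next token and advances.
// Returning nil at end of input is an assumption of this sketch.
func (t *toyTokenizer) nextToken() *token {
	if !t.hasNextToken() {
		return nil
	}
	tok := &token{value: t.words[t.pos]}
	t.pos++
	return tok
}

// collect drains the tokenizer with the usual loop.
func collect(input string) []string {
	t := newToyTokenizer(input)
	var out []string
	for t.hasNextToken() {
		out = append(out, t.nextToken().value)
	}
	return out
}

func main() {
	fmt.Println(collect("a + b"))
}
```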

ReadNextToken

Reads the next token.

(c *AbstractTokenizer) ReadNextToken() *Token

  • returns: *Token - next token

SetCharacterState

Sets the state for a range of characters.

(c *AbstractTokenizer) SetCharacterState(fromSymbol rune, toSymbol rune, state ITokenizerState)

  • fromSymbol: rune - first symbol
  • toSymbol: rune - last symbol
  • state: ITokenizerState - tokenizer state

TokenizeBuffer

Tokenizes a string buffer into a list of tokens.

(c *AbstractTokenizer) TokenizeBuffer(buffer string) []*Token

  • buffer: string - buffer
  • returns: []*Token - list of tokens

TokenizeBufferToStrings

Creates a list of token values.

(c *AbstractTokenizer) TokenizeBufferToStrings(buffer string) []string

  • buffer: string - buffer
  • returns: []string - list of token values
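
The "ToStrings" variants return only the token values rather than full tokens. One plausible way to think about the relationship is as a projection over the token list, sketched below; the `token` struct with `kind` and `value` fields and the `tokensToStrings` helper are hypothetical names, and the library may build the list differently.

```go
package main

import "fmt"

// token is a hypothetical stand-in for the library's Token type.
type token struct {
	kind  string
	value string
}

// tokensToStrings keeps only the value of each token, in order.
func tokensToStrings(tokens []*token) []string {
	values := make([]string, 0, len(tokens))
	for _, t := range tokens {
		values = append(values, t.value)
	}
	return values
}

func main() {
	tokens := []*token{
		{kind: "word", value: "total"},
		{kind: "symbol", value: "="},
		{kind: "number", value: "42"},
	}
	fmt.Println(tokensToStrings(tokens))
}
```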

TokenizeStream

Creates a list of tokens from a scanner.

(c *AbstractTokenizer) TokenizeStream(scanner IScanner) []*Token

  • scanner: IScanner - scanner
  • returns: []*Token - list of tokens

TokenizeStreamToStrings

Creates a list of token values.

(c *AbstractTokenizer) TokenizeStreamToStrings(scanner IScanner) []string

  • scanner: IScanner - scanner
  • returns: []string - list of token values