AbstractTokenizer

Defines a general tokenizer.

Implements: ITokenizer

Description

The AbstractTokenizer class defines a general tokenizer.

Fields

_lastTokenType

Last token type

protected TokenType _lastTokenType = TokenType.Unknown

_nextToken

Next token

protected Token _nextToken

_scanner

Scanner

protected IScanner _scanner

commentState

Comment state

private ICommentState commentState

decodeStrings

Boolean that defines the option to decode strings or not.

private boolean decodeStrings = false

mergeWhitespaces

Boolean that defines the option to merge white spaces.

private boolean mergeWhitespaces = false

numberState

Number state

public INumberState numberState

quoteState

Quote state

private IQuoteState quoteState

skipComments

Boolean that defines the option to skip comments.

private boolean skipComments = false

skipEof

Boolean that defines the option to skip EOF.

private boolean skipEof = false

skipUnknown

Boolean that defines the option to skip unknowns.

private boolean skipUnknown = false

skipWhitespaces

Boolean that defines the option to skip white spaces.

private boolean skipWhitespaces = false

symbolState

Symbol state

private ISymbolState symbolState

unifyNumbers

Boolean that defines the option to unify numbers.

private boolean unifyNumbers = false

whitespaceState

White space state

private IWhitespaceState whitespaceState

wordState

Word state

private IWordState wordState

Properties

scanner

Scanner

protected IScanner _scanner

Instance methods

clearCharacterStates

Clears all character states.

public void clearCharacterStates()

getCharacterState

Gets the state for a given character.

public ITokenizerState getCharacterState(int symbol)

hasNextToken

Finds out if the tokenizer has a next token.

public Boolean hasNextToken() throws Exception

  • returns: Boolean - true if it has a next token, false otherwise.

nextToken

Gets the next token.

public Token nextToken() throws Exception

  • returns: Token - next token
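
The relationship between hasNextToken, nextToken and the _nextToken field can be sketched with a small stand-in class. This is only an illustration under assumptions: LookaheadSketch, its whitespace-split input and the drain helper are hypothetical and not part of the library, and the sketch assumes that hasNextToken caches one token ahead, which the reference above does not confirm.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the library's AbstractTokenizer.
// Tokens are plain strings obtained by splitting the input on whitespace.
class LookaheadSketch {
    private final String[] words;
    private int position = 0;
    private String nextToken; // plays the role of _nextToken (assumed lookahead cache)

    LookaheadSketch(String input) {
        String trimmed = input.trim();
        this.words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Stand-in for readNextToken(): returns null once the input is exhausted.
    private String readNextToken() {
        return position < words.length ? words[position++] : null;
    }

    // Peeks one token ahead without consuming it.
    boolean hasNextToken() {
        if (nextToken == null) {
            nextToken = readNextToken();
        }
        return nextToken != null;
    }

    // Returns the cached token if there is one, otherwise reads a fresh one.
    String nextToken() {
        String token = nextToken != null ? nextToken : readNextToken();
        nextToken = null;
        return token;
    }

    // Typical consumption loop: check, then read, until the input ends.
    static List<String> drain(String input) {
        LookaheadSketch tokenizer = new LookaheadSketch(input);
        List<String> result = new ArrayList<>();
        while (tokenizer.hasNextToken()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }
}
```

In this sketch, calling hasNextToken several times in a row does not consume input; only nextToken advances.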

readNextToken

Reads the next token.

protected Token readNextToken() throws Exception

  • returns: Token - next token

setCharacterState

Sets the state for a range of characters, from fromSymbol to toSymbol.

public void setCharacterState(int fromSymbol, int toSymbol, ITokenizerState state) throws Exception

  • fromSymbol: int - first symbol
  • toSymbol: int - last symbol
  • state: ITokenizerState - tokenizer state
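
One plausible way to picture setCharacterState, getCharacterState and clearCharacterStates is as a lookup table indexed by character code. The sketch below is an assumption, not the library's implementation: CharacterStateTableSketch is a hypothetical class, states are represented by plain strings instead of ITokenizerState, the table is limited to 256 entries, and the range is treated as inclusive on both ends.

```java
import java.util.Arrays;

// Hypothetical stand-in, NOT the library's implementation.
// A state is just a label here; the real API uses ITokenizerState.
class CharacterStateTableSketch {
    private static final int TABLE_SIZE = 256; // assumed bound, chosen for the sketch
    private final String[] states = new String[TABLE_SIZE];

    // Assigns the same state to every symbol in [fromSymbol, toSymbol] (inclusive, by assumption).
    void setCharacterState(int fromSymbol, int toSymbol, String state) {
        if (fromSymbol < 0 || toSymbol >= TABLE_SIZE || fromSymbol > toSymbol) {
            throw new IllegalArgumentException("Invalid symbol range");
        }
        Arrays.fill(states, fromSymbol, toSymbol + 1, state);
    }

    // Looks up the state for one symbol; null means no state is registered.
    String getCharacterState(int symbol) {
        return symbol >= 0 && symbol < TABLE_SIZE ? states[symbol] : null;
    }

    // Removes every registered state.
    void clearCharacterStates() {
        Arrays.fill(states, null);
    }
}
```

With such a table, a tokenizer could register, for example, the range '0' to '9' under a number state and 'a' to 'z' under a word state, then dispatch on the first character of each token.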

tokenizeBuffer

Tokenizes a string buffer into a list of tokens.

public List<Token[]> tokenizeBuffer(String buffer) throws Exception

  • buffer: String - buffer
  • returns: List<Token[]> - list of tokens

tokenizeBufferToStrings

Creates a list of token values.

public List tokenizeBufferToStrings(String buffer) throws Exception

  • buffer: String - buffer
  • returns: List - list of token values
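
The difference between a list of tokens and a list of token values can be illustrated with a toy tokenizer. Everything in this sketch is an assumption made for illustration: TokenValuesSketch and its nested Token class are hypothetical stand-ins, the three token kinds ("Word", "Whitespace", "Symbol") are simplified labels rather than the library's TokenType values, and the splitting rules are deliberately minimal.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the library's tokenizer.
class TokenValuesSketch {
    // Simplified token: a type label plus the matched text.
    static final class Token {
        final String type;
        final String value;

        Token(String type, String value) {
            this.type = type;
            this.value = value;
        }
    }

    // Splits the buffer into runs of letters/digits, runs of whitespace,
    // and single-character symbols.
    static List<Token> tokenizeBuffer(String buffer) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < buffer.length()) {
            int start = i;
            char c = buffer.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                while (i < buffer.length() && Character.isLetterOrDigit(buffer.charAt(i))) {
                    i++;
                }
                tokens.add(new Token("Word", buffer.substring(start, i)));
            } else if (Character.isWhitespace(c)) {
                while (i < buffer.length() && Character.isWhitespace(buffer.charAt(i))) {
                    i++;
                }
                tokens.add(new Token("Whitespace", buffer.substring(start, i)));
            } else {
                i++;
                tokens.add(new Token("Symbol", buffer.substring(start, i)));
            }
        }
        return tokens;
    }

    // Keeps only the text of each token, dropping the type information.
    static List<String> tokenizeBufferToStrings(String buffer) {
        List<String> values = new ArrayList<>();
        for (Token token : tokenizeBuffer(buffer)) {
            values.add(token.value);
        }
        return values;
    }
}
```

For the input "x = 42", this sketch yields five tokens, and the string variant returns their values in the same order: "x", " ", "=", " ", "42".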

tokenizeStream

Creates a list of tokens.

public List<Token[]> tokenizeStream(IScanner scanner) throws Exception

  • scanner: IScanner - scanner
  • returns: List<Token[]> - list of tokens

tokenizeStreamToStrings

Creates a list of token values.

public List tokenizeStreamToStrings(IScanner scanner) throws Exception

  • scanner: IScanner - scanner
  • returns: List - list of token values