Abstract Class Yiisoft\Db\Sqlite\AbstractTokenizer

Inheritance Yiisoft\Db\Sqlite\AbstractTokenizer
Subclasses Yiisoft\Db\Sqlite\SqlTokenizer

Splits an SQL query into individual SQL tokens.

You can use it to obtain additional information from SQL code.

Usage example:

$tokenizer = new SqlTokenizer("SELECT * FROM {{%user}} WHERE [[id]] = 1");
$root = $tokenizer->tokenize();
$sqlTokens = $root->getChildren();

Tokens are instances of \Yiisoft\Db\Sqlite\SqlToken.

Protected Properties

Property Type Description Defined By
$length integer SQL code string length. Yiisoft\Db\Sqlite\AbstractTokenizer
$offset integer SQL code string current offset. Yiisoft\Db\Sqlite\AbstractTokenizer

Public Methods

Method Description Defined By
__construct() Yiisoft\Db\Sqlite\AbstractTokenizer
tokenize() Tokenizes and returns a code type token. Yiisoft\Db\Sqlite\AbstractTokenizer

Protected Methods

Method Description Defined By
indexAfter() Returns the index just past the first occurrence of the given string in the SQL code, searching from the specified offset. Yiisoft\Db\Sqlite\AbstractTokenizer
isComment() Returns whether there's a comment at the current offset. Yiisoft\Db\Sqlite\AbstractTokenizer
isIdentifier() Returns whether there's an identifier at the current offset. Yiisoft\Db\Sqlite\AbstractTokenizer
isKeyword() Returns whether the given string is a keyword. Yiisoft\Db\Sqlite\AbstractTokenizer
isOperator() Returns whether there's an operator at the current offset. Yiisoft\Db\Sqlite\AbstractTokenizer
isStringLiteral() Returns whether there's a string literal at the current offset. Yiisoft\Db\Sqlite\AbstractTokenizer
isWhitespace() Returns whether there's whitespace at the current offset. Yiisoft\Db\Sqlite\AbstractTokenizer
startsWithAnyLongest() Returns whether the SQL code at the current offset starts with any of the given strings, trying the longest ones first. Yiisoft\Db\Sqlite\AbstractTokenizer
substring() Returns a string of the given length starting at the specified offset. Yiisoft\Db\Sqlite\AbstractTokenizer

Property Details

$length protected property

SQL code string length.

protected integer $length = 0

$offset protected property

SQL code string current offset.

protected integer $offset = 0

Method Details

__construct() public method

public __construct ( string $sql )
$sql string

SQL code to tokenize.

public function __construct(private string $sql) {}

            
indexAfter() protected method

Returns the index just past the first occurrence of the given string in the SQL code, searching from the specified offset.

protected integer indexAfter ( string $string, integer|null $offset = null )
$string string

String to find.

$offset integer|null

SQL code offset; defaults to the current offset if null is passed.

return integer

Index after the given string, or the end-of-string index if the string is not found.

protected function indexAfter(string $string, ?int $offset = null): int
{
    if ($offset === null) {
        $offset = $this->offset;
    }
    if ($offset + mb_strlen($string, 'UTF-8') > $this->length) {
        return $this->length;
    }
    $afterIndexOf = mb_strpos($this->sql, $string, $offset, 'UTF-8');
    if ($afterIndexOf === false) {
        $afterIndexOf = $this->length;
    } else {
        $afterIndexOf += mb_strlen($string, 'UTF-8');
    }
    return $afterIndexOf;
}
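
The behavior described above can be tried outside the class with a minimal sketch. The function name `indexAfterSketch` and its explicit `$sql` argument are assumptions made for illustration; the real method reads the SQL string and the current offset from the tokenizer instance.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical standalone equivalent of indexAfter(): the SQL string and
 * the offset are passed in instead of being read from object state.
 */
function indexAfterSketch(string $sql, string $needle, int $offset = 0): int
{
    $length = mb_strlen($sql, 'UTF-8');

    // The needle cannot fit into the remaining text.
    if ($offset + mb_strlen($needle, 'UTF-8') > $length) {
        return $length;
    }

    $position = mb_strpos($sql, $needle, $offset, 'UTF-8');

    // Not found: report the end of the string; otherwise step past the needle.
    return $position === false
        ? $length
        : $position + mb_strlen($needle, 'UTF-8');
}

$sql = 'SELECT 1 /* note */ FROM t';

// Index just past the comment terminator, searching from the comment start.
$afterComment = indexAfterSketch($sql, '*/', 9); // 19

// A needle that never occurs yields the string length.
$notFound = indexAfterSketch($sql, '--'); // 26
```

A typical use is skipping to the end of a block comment: the caller passes the closing delimiter and gets back the offset at which scanning should resume.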

            
isComment() protected abstract method

Returns whether there's a comment at the current offset.

If this method returns true, it has to set the $length parameter to the length of the matched string.

protected abstract boolean isComment ( integer &$length )
$length integer

Length of the matched string.

return boolean

Whether there's a comment at the current offset.

abstract protected function isComment(int &$length): bool;

            
isIdentifier() protected abstract method

Returns whether there's an identifier at the current offset.

If this method returns true, it has to set the $length parameter to the length of the matched string. It may also set $content to a string that will be used as a token content.

protected abstract boolean isIdentifier ( integer &$length, string|null &$content )
$length integer

Length of the matched string.

$content string|null

Optional content instead of the matched string.

return boolean

Whether there's an identifier at the current offset.

abstract protected function isIdentifier(int &$length, ?string &$content): bool;

            
isKeyword() protected abstract method

Returns whether the given string is a keyword.

The method may set $content to a string that will be used as a token content.

protected abstract boolean isKeyword ( string $string, string|null &$content )
$string string

String to match.

$content string|null

Optional content instead of the matched string.

return boolean

Whether the given string is a keyword.

abstract protected function isKeyword(string $string, ?string &$content): bool;

            
isOperator() protected abstract method

Returns whether there's an operator at the current offset.

If this method returns true, it has to set the $length parameter to the length of the matched string. It may also set $content to a string that will be used as a token content.

protected abstract boolean isOperator ( integer &$length, string|null &$content )
$length integer

Length of the matched string.

$content string|null

Optional content instead of the matched string.

return boolean

Whether there's an operator at the current offset.

abstract protected function isOperator(int &$length, ?string &$content): bool;

            
isStringLiteral() protected abstract method

Returns whether there's a string literal at the current offset.

If this method returns true, it has to set the $length parameter to the length of the matched string. It may also set $content to a string that will be used as a token content.

protected abstract boolean isStringLiteral ( integer &$length, string|null &$content )
$length integer

Length of the matched string.

$content string|null

Optional content instead of the matched string.

return boolean

Whether there's a string literal at the current offset.

abstract protected function isStringLiteral(int &$length, ?string &$content): bool;

            
isWhitespace() protected abstract method

Returns whether there's whitespace at the current offset.

If this method returns true, it has to set the $length parameter to the length of the matched string.

protected abstract boolean isWhitespace ( integer &$length )
$length integer

Length of the matched string.

return boolean

Whether there's whitespace at the current offset.

abstract protected function isWhitespace(int &$length): bool;
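
The by-reference `$length` contract shared by the `is*()` methods can be illustrated with a small free function. This is only a sketch under stated assumptions: `isWhitespaceAt` is a hypothetical name, and `$sql` and `$offset` are passed explicitly to stand in for the tokenizer's internal state. It is not the implementation used by SqlTokenizer.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical matcher following the contract described above:
 * return true and write the matched length into $length, or return
 * false and leave $length untouched.
 */
function isWhitespaceAt(string $sql, int $offset, int &$length): bool
{
    // Work on the remainder so that offsets stay character-based.
    $rest = mb_substr($sql, $offset, null, 'UTF-8');

    if (preg_match('/^\s+/u', $rest, $matches) !== 1) {
        return false;
    }

    $length = mb_strlen($matches[0], 'UTF-8');

    return true;
}

$length = 0;
$found = isWhitespaceAt("a  \n b", 1, $length);
// $found is true and $length is 4 (two spaces, a newline, one space)
```

The caller then advances its offset by `$length`, which is why a matcher that returns true must always set it.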

            
startsWithAnyLongest() protected method

Returns whether the SQL code at the current offset starts with any of the given strings, trying the longest ones first.

protected boolean startsWithAnyLongest ( array $with, boolean $caseSensitive, integer &$length, string|null &$content = null )
$with array

Strings to test. The method regroups them by length internally to speed up lookups; an array that is already grouped this way is used as-is.

$caseSensitive boolean

Whether to perform a case-sensitive comparison.

$length integer

Length of the matched string.

$content string|null

Matched string (uppercased when the comparison is case-insensitive).

return boolean

Whether there is a match.

protected function startsWithAnyLongest(
    array $with,
    bool $caseSensitive,
    int &$length,
    ?string &$content = null,
): bool {
    if (empty($with)) {
        return false;
    }
    if (!is_array(reset($with))) {
        usort($with, static fn(string $string1, string $string2) => mb_strlen($string2, 'UTF-8') - mb_strlen($string1, 'UTF-8'));
        $map = [];
        foreach ($with as $string) {
            $map[mb_strlen($string, 'UTF-8')][$caseSensitive ? $string : mb_strtoupper($string, 'UTF-8')] = true;
        }
        $with = $map;
    }
    /** @psalm-var array<int, array> $with */
    foreach ($with as $testLength => $testValues) {
        $content = $this->substring($testLength, $caseSensitive);
        if (isset($testValues[$content])) {
            $length = $testLength;
            return true;
        }
    }
    return false;
}
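
As a rough illustration of the longest-match lookup, the sketch below groups candidates by length and probes the longest group first. The name `longestPrefixMatch` and the explicit `$sql` and `$offset` parameters are assumptions for this example, and it orders the groups with `krsort()` rather than the `usort()` call used in the method above.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical standalone version of a longest-prefix lookup.
 *
 * @param string[] $candidates Strings to test at $offset.
 */
function longestPrefixMatch(
    string $sql,
    int $offset,
    array $candidates,
    bool $caseSensitive,
    int &$length,
    ?string &$content = null,
): bool {
    // Group candidates by character length: length => [candidate => true].
    $byLength = [];
    foreach ($candidates as $candidate) {
        $key = $caseSensitive ? $candidate : mb_strtoupper($candidate, 'UTF-8');
        $byLength[mb_strlen($candidate, 'UTF-8')][$key] = true;
    }

    // Longest group first, so the first hit is the longest match.
    krsort($byLength);

    $total = mb_strlen($sql, 'UTF-8');
    foreach ($byLength as $testLength => $values) {
        if ($offset + $testLength > $total) {
            continue; // Not enough text left for this length.
        }

        $slice = mb_substr($sql, $offset, $testLength, 'UTF-8');
        if (!$caseSensitive) {
            $slice = mb_strtoupper($slice, 'UTF-8');
        }

        if (isset($values[$slice])) {
            $length = $testLength;
            $content = $slice;

            return true;
        }
    }

    return false;
}

$length = 0;
$content = null;
$matched = longestPrefixMatch('a <= b', 2, ['<', '<=', '<>'], true, $length, $content);
// $matched is true, $length is 2, $content is '<='
```

Probing by length means `<=` wins over `<` even though both are valid prefixes at that position, which is the property operator and keyword matching relies on.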

            
substring() protected method

Returns a string of the given length starting at the specified offset.

protected string substring ( integer $length, boolean $caseSensitive = true, integer|null $offset = null )
$length integer

String length to return.

$caseSensitive boolean

If false, the returned string is converted to uppercase.

$offset integer|null

SQL code offset; defaults to the current offset if null is passed.

return string

Result string. It is empty if the requested range extends past the end of the SQL code.

protected function substring(int $length, bool $caseSensitive = true, ?int $offset = null): string
{
    if ($offset === null) {
        $offset = $this->offset;
    }
    if ($offset + $length > $this->length) {
        return '';
    }
    $cacheKey = $offset . ',' . $length;
    if (!isset($this->substrings[$cacheKey . ',1'])) {
        $this->substrings[$cacheKey . ',1'] = mb_substr($this->sql, $offset, $length, 'UTF-8');
    }
    if (!$caseSensitive && !isset($this->substrings[$cacheKey . ',0'])) {
        $this->substrings[$cacheKey . ',0'] = mb_strtoupper($this->substrings[$cacheKey . ',1'], 'UTF-8');
    }
    return $this->substrings[$cacheKey . ',' . (int) $caseSensitive];
}
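
The observable behavior of this method, without its cache, can be approximated by the following sketch. `substringSketch` is a hypothetical name, and the SQL string is passed as an argument here, whereas the real method reads it from the instance and memoizes results per offset, length and case mode.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical cache-free equivalent of substring().
 */
function substringSketch(string $sql, int $length, bool $caseSensitive = true, int $offset = 0): string
{
    // A range that runs past the end of the SQL code yields an empty string.
    if ($offset + $length > mb_strlen($sql, 'UTF-8')) {
        return '';
    }

    $slice = mb_substr($sql, $offset, $length, 'UTF-8');

    // Case-insensitive callers receive the uppercase form.
    return $caseSensitive ? $slice : mb_strtoupper($slice, 'UTF-8');
}

$upper = substringSketch('select 1', 6, false);      // 'SELECT'
$tooLong = substringSketch('select 1', 6, true, 5);  // ''
```

Returning an empty string instead of a shortened slice lets callers compare the result against fixed-length candidates without an extra bounds check.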

            
tokenize() public method

Tokenizes and returns a code type token.

public Yiisoft\Db\Sqlite\SqlToken tokenize ( )
return Yiisoft\Db\Sqlite\SqlToken

Code type token.

throws InvalidArgumentException

If the SQL code is invalid.

public function tokenize(): SqlToken
{
    $this->length = mb_strlen($this->sql, 'UTF-8');
    $this->offset = 0;
    $this->substrings = [];
    $this->buffer = '';
    $token = (new SqlToken())->type(SqlToken::TYPE_CODE)->content($this->sql);
    $this->tokenStack = new SplStack();
    $this->tokenStack->push($token);
    $token[] = (new SqlToken())->type(SqlToken::TYPE_STATEMENT);
    $this->tokenStack->push($token[0]);
    $this->currentToken = $this->tokenStack->top();
    $length = 0;
    while (!$this->isEof()) {
        if ($this->isWhitespace($length) || $this->isComment($length)) {
            $this->addTokenFromBuffer();
            $this->advance($length);
            continue;
        }
        /** @psalm-suppress ConflictingReferenceConstraint */
        if ($this->tokenizeOperator($length) || $this->tokenizeDelimitedString($length)) {
            $this->advance($length);
            continue;
        }
        $this->buffer .= $this->substring(1);
        $this->advance(1);
    }
    $this->addTokenFromBuffer();
    if (
        $token->getHasChildren()
        && $token[-1] instanceof SqlToken
        && !$token[-1]->getHasChildren()
    ) {
        unset($token[-1]);
    }
    return $token;
}
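
The overall shape of the loop above (skip whitespace, emit operators, buffer everything else) can be seen in a toy scanner. Everything in this sketch is hypothetical: `scanSketch`, its fixed operator list and its flat string output are illustration only, and it does not handle comments, delimited strings or the nested SqlToken tree that `tokenize()` builds.

```php
<?php

declare(strict_types=1);

/**
 * Toy scanner mirroring the loop shape of tokenize().
 *
 * @return string[] Flat list of token strings.
 */
function scanSketch(string $sql): array
{
    // Two-character operators come first so they win over their prefixes.
    $operators = ['<=', '>=', '<>', '=', '<', '>', '(', ')', ',', '*'];
    $tokens = [];
    $buffer = '';
    $offset = 0;
    $total = mb_strlen($sql, 'UTF-8');

    // Emit the accumulated word, if any, as one token.
    $flush = static function () use (&$tokens, &$buffer): void {
        if ($buffer !== '') {
            $tokens[] = $buffer;
            $buffer = '';
        }
    };

    while ($offset < $total) {
        $rest = mb_substr($sql, $offset, null, 'UTF-8');

        // Whitespace ends the current word and is skipped.
        if (preg_match('/^\s+/u', $rest, $matches) === 1) {
            $flush();
            $offset += mb_strlen($matches[0], 'UTF-8');
            continue;
        }

        // An operator ends the current word and becomes a token itself.
        foreach ($operators as $operator) {
            if (str_starts_with($rest, $operator)) {
                $flush();
                $tokens[] = $operator;
                $offset += mb_strlen($operator, 'UTF-8');
                continue 2;
            }
        }

        // Anything else is collected one character at a time.
        $buffer .= mb_substr($rest, 0, 1, 'UTF-8');
        $offset++;
    }

    $flush();

    return $tokens;
}

$tokens = scanSketch('SELECT * FROM t WHERE id<=10');
// ['SELECT', '*', 'FROM', 't', 'WHERE', 'id', '<=', '10']
```

The buffer is what allows `id<=10` to split correctly: characters accumulate until a boundary (whitespace or an operator) forces them out as a single token.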