
Re2 UDF

  • Re2::Grep / Re2::Match
  • Re2::Capture
  • Re2::FindAndConsume
  • Re2::Replace
  • Re2::Count

List of functions

  • Re2::Grep(String) -> (String?) -> Bool
  • Re2::Match(String) -> (String?) -> Bool
  • Re2::Capture(String) -> (String?) -> Struct<_1:String?,foo:String?,...>
  • Re2::FindAndConsume(String) -> (String?) -> List<String>
  • Re2::Replace(String) -> (String?, String) -> String?
  • Re2::Count(String) -> (String?) -> Uint32
  • Re2::Options([CaseSensitive:Bool?,DotNl:Bool?,Literal:Bool?,LogErrors:Bool?,LongestMatch:Bool?,MaxMem:Uint64?,NeverCapture:Bool?,NeverNl:Bool?,OneLine:Bool?,PerlClasses:Bool?,PosixSyntax:Bool?,Utf8:Bool?,WordBoundary:Bool?]) -> Struct<CaseSensitive:Bool,DotNl:Bool,Literal:Bool,LogErrors:Bool,LongestMatch:Bool,MaxMem:Uint64,NeverCapture:Bool,NeverNl:Bool,OneLine:Bool,PerlClasses:Bool,PosixSyntax:Bool,Utf8:Bool,WordBoundary:Bool>

Pire imposes certain limitations to keep regular expression matching efficient, so using the Pire UDF can be too complex or even impossible for some tasks. For such cases, there is another regular expression module based on google::RE2. It offers a broader range of features (see the official documentation).

By default, UTF-8 mode is enabled automatically if the regular expression is a valid UTF-8-encoded string but not a valid ASCII string. To control the re2 library settings manually, pass the result of the Re2::Options function as the second argument to the other module functions, after the regular expression.
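
As a minimal sketch of passing options, the query below compares the same regular expression with default settings and with CaseSensitive disabled. The pattern and the input string are made up for illustration.

```yql
-- Illustrative values only: the pattern and input are not from the reference.
$options = Re2::Options(false AS CaseSensitive);
$sensitive = Re2::Match("abc");              -- default settings
$insensitive = Re2::Match("abc", $options);  -- options go after the regular expression

SELECT
  $sensitive("ABC") AS sensitive,
  $insensitive("ABC") AS insensitive;
```

With case sensitivity disabled, only the second column would be expected to be true.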

Alert

Make sure to double all backslashes in regular expressions written as quoted strings: standard string literals are treated as C-escaped strings in SQL. Alternatively, write the regular expression as a raw string, @@regexp@@: in that case, you don't need to double the backslashes.
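
To make the two spellings concrete, here is a small sketch that writes the same regular expression once as a quoted string and once as a raw string. The input value is an arbitrary example.

```yql
-- Both callables are built from the same regular expression: [ax]+\d
$escaped = Re2::Match("[ax]+\\d");  -- quoted string: the backslash is doubled
$raw = Re2::Match(@@[ax]+\d@@);     -- raw string: the backslash is written once

SELECT
  $escaped("xaa1") AS escaped,
  $raw("xaa1") AS raw;
```

Both columns should hold the same value, since only the way the literal is written differs.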

Examples

$value = "xaaxaaxaa";
$options = Re2::Options(false AS CaseSensitive);
$match = Re2::Match("[ax]+\\d");
$grep = Re2::Grep("a.*");
$capture = Re2::Capture(".*(?P<foo>xa?)(a{2,}).*");
$replace = Re2::Replace("x(a+)x");
$count = Re2::Count("a", $options);

SELECT
  $match($value) AS match,
  $grep($value) AS grep,
  $capture($value) AS capture,
  $capture($value)._1 AS capture_member,
  $replace($value, "b\\1z") AS replace,
  $count($value) AS count;

/*
- match: `false`
- grep: `true`
- capture: `(_0: 'xaaxaaxaa', _1: 'aa', foo: 'x')`
- capture_member: `"aa"`
- replace: `"baazaaxaa"`
- count: `6`
*/

Re2::Grep / Re2::Match

Apart from implementation details and regular expression syntax, these functions behave the same as their counterparts in the Pire module. All other things being equal, and unless you have specific preferences, we recommend using Pire::Grep or Pire::Match.
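
The sketch below contrasts the two functions on one input. It assumes the Pire convention also applies here, that is, Match tests the whole string while Grep looks for a matching substring. This page does not state that convention explicitly, so treat it as an assumption and check the results on your own data.

```yql
-- Assumption: Match is anchored to the whole string, Grep is not.
$value = "xaaxaaxaa";
$grep = Re2::Grep("aax");           -- pattern occurs inside the string
$match_part = Re2::Match("aax");    -- same pattern, covers only part of the string
$match_whole = Re2::Match("(xaa)+"); -- pattern that can cover the whole string

SELECT
  $grep($value) AS grep,
  $match_part($value) AS match_part,
  $match_whole($value) AS match_whole;
```

If the assumption holds, match_part would be the only false column.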

Re2::Capture

Unlike Pire::Capture, Re2::Capture supports multiple and named capturing groups.
Result type: a structure with the fields of the type String?.

  • Each field corresponds to a capturing group with the applicable name.
  • For unnamed groups, the following names are generated: _1, _2, etc.
  • The result always includes the _0 field containing the entire substring matching the regular expression.

For more information about working with structures in YQL, see the section on containers.
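
A short sketch of reading fields from the resulting structure follows. The date-like pattern and the input are hypothetical. The unnamed group is deliberately placed first so that its generated name is _1.

```yql
-- Hypothetical pattern: one unnamed group followed by two named groups.
$capture = Re2::Capture("(\\d{4})-(?P<month>\\d{2})-(?P<day>\\d{2})");
$parsed = $capture("2021-03-15");

SELECT
  $parsed._0 AS whole,     -- entire matched substring
  $parsed._1 AS year,      -- unnamed group, generated name
  $parsed.month AS month,  -- named group
  $parsed.day AS day;      -- named group
```

Every field has the type String?, so a query that uses these values further may need to handle NULL.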

Re2::FindAndConsume

Searches the passed text for all occurrences of the regular expression and returns a list of the values matched by the parenthesized part of the regular expression, one per occurrence.
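
As an illustration, the sketch below collects every run of the letter "a" from the sample value used in the examples on this page. The pattern itself is made up for this sketch.

```yql
-- The parenthesized part (a+) is what ends up in the resulting list.
$find = Re2::FindAndConsume("(a+)");

SELECT
  $find("xaaxaaxaa") AS runs;
```

The result would be expected to contain one list element per run of "a" found in the input.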

Re2::Replace

Works as follows:

  • In the input string (first argument), all the non-overlapping substrings matching the regular expression are replaced by the specified string (second argument).
  • In the replacement string, you can refer to the contents of capturing groups using back-references of the form \\1, \\2, and so on. The \\0 back-reference stands for the whole substring that matches the regular expression.
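
The two rules above can be sketched as follows. The pattern and inputs are hypothetical. The first column swaps two captured numbers, and the second wraps each whole match in brackets using \\0.

```yql
-- Hypothetical pattern with two capturing groups separated by a hyphen.
$replace = Re2::Replace("(\\d+)-(\\d+)");

SELECT
  $replace("10-20, 30-40", "\\2-\\1") AS swapped,  -- every match is rewritten
  $replace("10-20, 30-40", "[\\0]") AS wrapped;    -- \\0 is the whole match
```

Because all non-overlapping matches are replaced, both pairs in the input are affected, not just the first one.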

Re2::Count

Returns the number of non-overlapping substrings of the input string that match the regular expression.
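
To show what "non-overlapping" means in practice, here is a small sketch with made-up values: a two-character pattern applied to a string of five identical characters.

```yql
-- "aaaaa" contains four overlapping positions where "aa" fits,
-- but only two of them can be taken without overlapping.
$count = Re2::Count("aa");

SELECT
  $count("aaaaa") AS pairs;
```

The expected count here is 2, with the last "a" left unmatched.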

© 2021 Yandex.Cloud LLC