How To

This section is meant to provide a set of answers to most common questions on how to run the tabled-based query. If your answer is not present, you can always reach to us via our community slack channel.

Select a nested field

Storing nested data is quite common, therefore being able to select nested fields is a common requirement. Below is a sample of how to select nested fields:

SELECT fieldC.fieldD.fieldE
FROM table
WHERE fieldA.fieldB like '%Lenses%'

Select an array field

Tables can store data containing arrays. Here is the SQL to query for an array item:

SELECT fieldA[1], fieldB.fieldC[2].x
FROM table
WHERE fieldA[1] like '%Lenses%'

When working with arrays is good to check the array bounds. See SIZEOF function in the list of supported functions.

Limit the number of records read

To restrict the number of records read at one time, the LIMIT syntax should be used

SELECT ...
FROM table
LIMIT 1000

Limit the number of records based on data size

Sometimes, restricting the number of records returned based on how much data is returned is feasible. To achieve that the SET max.size construct is required. Here is the code to limit the records to a maximum of 100 megabytes

SET max.size = '100m';
SELECT ...
FROM table
LIMIT 1000

Set a time limit for a query

Searching for a record within a million records could take some time. To restrict the time to run the query, use SET max.query.time:

SET  max.query.time = '1h';
SELECT ...
FROM table
LIMIT 1000

Filter on table partition

A table (synonym with a Kafka topic) can be split into multiple parts called partitions. To read records from a specific partition the following query can be used:

SELECT fieldA
     , ...
FROM table
WHERE _meta.partition = 1
   OR _meta.partition = 8

Filter on text

If your field is of type string, any of the operators below can be used as part of a WHERE condition.

Operation Description
<>

Checks inequality. When you want to filter all records with field <> 'ABC'
= Checks for equality. When you want to filter all records with field == 'ABC'
< Checks for less than. When you want to filter all records with field < 'ABC'
<= Checks for less than or equal. When you want to filter all records with field <= 'ABC'
> Checks for greater than. When you want to filter all records with field > 'ABC'
>= Checks for greater than or equal. When you want to filter all records with field >= 'ABC'
like
Checks for prefix/contains/postfix. When you want to filter all records with the field containing the text:
field like '%ABC%'. To check for prefix field like 'ABC%'. To check for postfix field like '%ABC'.
Comparison is case insensitive!.

Let’s assume you have an audit topic where you store changes made to data in your system and you want to see all the changes made by people whose name contains Alex.

SELECT *
FROM audit
WHERE username LIKE '%Alex%'

Date math

A Kafka record carries with it a timestamp. It can be event time (when it was created) or received time (when it was received by the Kafka brokers to store it). You can use the SQL to filter the records timestamp metadata or use them as one of the fields returned:

-- All records since yesterday
SELECT fieldA
     , ...
FROM table
WHERE _meta.timestamp > YESTERDAY()

-- All records since 2 days ago
SELECT fieldA
     , ...
FROM table
WHERE _meta.timestamp > YESTERDAY()- "2d"

-- All records since 1h ago
SELECT fieldA
     , ...
FROM table
WHERE _meta.timestamp > now()- "1h"

-- All records since 15 minutes ago
SELECT fieldA
     , ...
FROM table
WHERE _meta.timestamp > now()- "15m"

-- All records since 10 seconds ago
SELECT fieldA
     , ...
FROM table
WHERE _meta.timestamp > now()- "10s"

-- All records since 1 week ago
SELECT fieldA
     , ...
FROM table
WHERE _meta.timestamp > now()- "1w"

-- All records since 2 months ago
SELECT fieldA
     , ...
FROM table
WHERE _meta.timestamp > now()- "2M"

-- All records since a year ago
SELECT fieldA
     , ...
FROM table
WHERE _meta.timestamp > now()- "1y"

Search for a record on a specific offset

Here is the query to use when the record offset and partition are known:

SELECT fieldA
     , ...
FROM table
WHERE _meta.partition = 2
  AND _meta.offset = 8
LIMIT 1

Filter on true/false

Here is an example of returning only those records when a field is true or false

SELECT fieldA
     , fieldB
     , ...
FROM table
WHERE fieldA = true and fieldB = false

Query for a null value

To filter your records when a field is null you can run the following

-- simple field example
SELECT ...
FROM table
WHERE fieldA IS null

-- nested field example
SELECT ...
FROM table
WHERE fieldA.subfieldB IS null

-- nested field example
SELECT ...
FROM table
WHERE fieldA.subfieldB IS null

SELECT *
FROM visitorTracking
WHERE location.country IS NOT NULL

Query for array size

Sometimes you want to find out how many items are in your array. To do so you can run:

SELECT SIZEOF(arrayFieldA)
FROM table

Filter on numbers

With the SQL code, the user can apply filters for fields of numbers. A number can be integer, short, byte, long, double, float or decimal.

Operation Description
<> Checks for not equal. When you want to filter all records with field <> 2
= Checks for equality. When you want to filter all records with field == 'ABC'
< Checks for less than. When you want to filter all records with field < 123.4
<= Checks for less than or equal. When you want to filter all records with field <= 1000
> Checks for greater than. When you want to filter all record with field > -10
>= Checks for greater than or equal. When you want to filter all records with field >= 122.122315
SELECT *
FROM topic
WHERE location.latitude < -10

Filter on Avro fields with a union of many types

The Avro payload is the standard. Sometimes the payload contains a field which is a UNION of multiple types. At runtime, the fields take the value of one of the types in question.

-- select the type
SELECT TYPEOF(fieldA) as typeofA
     , TYPEOF(fieldA.fieldB) as typeOfB
FROM table

-- filter on the type
SELECT fieldA,
      , ...
FROM table
WHERE TYPEOF(fieldC) = 'io.lenses.domain.LensesIsGreat'

The IF Function

Sometimes it is required to pick a value based on a certain condition being met. The IF functions supports this scenario and here is an example of to use it:

SELECT IF((field1 + field2)/2 > 10 AND field3 IS NULL, field3 *10, field4+field6 * field8)
FROM table