How To

This section is meant to provide a set of answers to the most common questions on how to run a Lenses SQL query. If the answer you are looking for is not present, you can always reach to us via our community slack channel.

For the SQL engine, the term table is equivalent to Kafka topic. The two are used interchangeably.

Select a nested field

Storing nested data is quite common, therefore being able to select nested fields is a common requirement. Below is a sample of how to select nested fields:

SELECT fieldC.fieldD.fieldE
FROM table
WHERE fieldA.fieldB like '%Lenses%'

Select an array field

Tables can store data containing arrays. Here is a SQL statement for querying an array item:

SELECT fieldA[1], fieldB.fieldC[2].x
FROM table
WHERE fieldA[1] like '%Lenses%'

When working with arrays is good to check the array bounds. See the SIZEOF function in the list of supported functions.

Limit the number of records read

Use LIMIT syntax should be used to restrict the number of entries returned. Here is an example:

SELECT ...
FROM table
LIMIT 1000

Limit the number of records based on data size

Sometimes, restricting the number of records returned based on how much data is returned is feasible. To achieve that the SET max.size construct is required. Here is the code to limit the records to a maximum of 100 megabytes

SET max.size = '100m';
SELECT ...
FROM table
LIMIT 1000

Set a time limit for a query

Searching for a record within a million records could take some time. To restrict the time to run the query, use SET max.query.time:

SET  max.query.time = '1h';
SELECT ...
FROM table
LIMIT 1000

Filter on table partition

A table (synonym for a Kafka topic) can be split into multiple parts called partitions. To read records from a specific partition, the following query can be used:

SELECT fieldA
     , ...
FROM table
WHERE _meta.partition = 1
   OR _meta.partition = 8

Filter on text

If your field is of type string, any of the operators below can be used as part of a WHERE condition.

Operation Description
<>

Checks inequality. When you want to filter all records with field <> 'ABC'
= Checks for equality. When you want to filter all records with field == 'ABC'
< Checks for less than. When you want to filter all records with field < 'ABC'
<= Checks for less than or equal. When you want to filter all records with field <= 'ABC'
> Checks for greater than. When you want to filter all records with field > 'ABC'
>= Checks for greater than or equal. When you want to filter all records with field >= 'ABC'
like
Checks for prefix/contains/postfix. When you want to filter all records with the field containing the text:
field like '%ABC%'. To check for prefix field like 'ABC%'. To check for postfix field like '%ABC'.
Comparison is case insensitive!.

Let us assume that you have an audit topic where you store changes made to data in your system and you want to see all the changes made by people whose name contains Alex.

SELECT *
FROM audit
WHERE username LIKE '%Alex%'

Date math

A Kafka record carries with it a timestamp. It can be event time (when it was created) or received time (when it was received by the Kafka brokers to store it). You can use SQL to filter the records timestamp metadata or use them as one of the fields returned:

-- All records since yesterday
SELECT fieldA
     , ...
FROM table
WHERE _meta.timestamp > YESTERDAY()

-- All records since 2 days ago
SELECT fieldA
     , ...
FROM table
WHERE _meta.timestamp > YESTERDAY()- "2d"

-- All records since 1h ago
SELECT fieldA
     , ...
FROM table
WHERE _meta.timestamp > now()- "1h"

-- All records since 15 minutes ago
SELECT fieldA
     , ...
FROM table
WHERE _meta.timestamp > now()- "15m"

-- All records since 10 seconds ago
SELECT fieldA
     , ...
FROM table
WHERE _meta.timestamp > now()- "10s"

-- All records since 1 week ago
SELECT fieldA
     , ...
FROM table
WHERE _meta.timestamp > now()- "1w"

-- All records since 2 months ago
SELECT fieldA
     , ...
FROM table
WHERE _meta.timestamp > now()- "2M"

-- All records since a year ago
SELECT fieldA
     , ...
FROM table
WHERE _meta.timestamp > now()- "1y"

Search for a record on a specific offset

Here is the query to use when the record offset and partition are known:

SELECT fieldA
     , ...
FROM table
WHERE _meta.partition = 2
  AND _meta.offset = 8
LIMIT 1

Filter on true/false

Here is an example of returning the records that have the value of fieldA set to true and the value of the fieldB field set to false:

SELECT fieldA
     , fieldB
     , ...
FROM table
WHERE fieldA = true and fieldB = false

Query for a null value

To filter your records when a field is null you can run the following kind of queries:

-- simple field example
SELECT ...
FROM table
WHERE fieldA IS null

-- nested field example
SELECT ...
FROM table
WHERE fieldA.subfieldB IS null

-- nested field example
SELECT ...
FROM table
WHERE fieldA.subfieldB IS null

SELECT *
FROM visitorTracking
WHERE location.country IS NOT NULL

Query for array size

Sometimes you want to find out how many items are in your array. To do so you can run:

SELECT SIZEOF(arrayFieldA)
FROM table

Filter on numbers

With the SQL code, the user can apply filters for fields of numbers. A number can be integer, short, byte, long, double, float or decimal.

Operation Description
<> Checks for not equal. When you want to filter all records with field <> 2
= Checks for equality. When you want to filter all records with field == 'ABC'
< Checks for less than. When you want to filter all records with field < 123.4
<= Checks for less than or equal. When you want to filter all records with field <= 1000
> Checks for greater than. When you want to filter all record with field > -10
>= Checks for greater than or equal. When you want to filter all records with field >= 122.122315
SELECT *
FROM topic
WHERE location.latitude < -10

Filter on Avro fields with a union of many types

The Avro payload is the standard. Sometimes the payload contains a field which is a UNION of multiple types. At runtime, the fields take the value of one of the types in question.

-- select the type
SELECT TYPEOF(fieldA) as typeofA
     , TYPEOF(fieldA.fieldB) as typeOfB
FROM table

-- filter on the type
SELECT fieldA,
      , ...
FROM table
WHERE TYPEOF(fieldC) = 'io.lenses.domain.LensesIsGreat'

The IF Function

Sometimes it is required to pick a value based on a specific condition being met. The IF function supports this scenario and here is an example of its use:

SELECT IF((field1 + field2)/2 > 10 AND field3 IS NULL, field3 *10, field4+field6 * field8)
FROM table

Insert complex key

There are scenarios where a record key is a complex type. Regardless of the storage format, JSON or Avro, the SQL engine allows the insertion of such entries:

//creates a smart_devices table where the key is a type with one field {deviceId:100}
CREATE TABLE smart_devices(_key.deviceId int, room int, temperature double) format(avro, avro);

INSERT INTO smart_devices(_key.deviceId, room, temperature) VALUES(11223, 99, 22.1);
SELECT * FROM smart_devices;