Where Clause

Querying for data and filtering out records based on various criteria is a common scenario, hence LSQL has support for complex filter expression.

String Filter

If your field is of type string, any of the operators below can be used as part of the predicate describing which records to select and which ones to drop.

Operation Description
<>

Checks inequality. When you want to filter all records with field <> 'ABC'
= Checks for equality. When you want to filter all records with field == 'ABC'
< Checks for less than. When you want to filter all records with field < 'ABC'
<= Checks for less than or equal. When you want to filter all records with field <= 'ABC'
> Checks for greater than. When you want to filter all records with field > 'ABC'
>= Checks for greater than or equal. When you want to filter all records with field >= 'ABC'
like

Checks for prefix/contains/postfix. When you want to filter all records with the field containing the text:
field like '%ABC%'. To check for prefix field like 'ABC%'. To check for postfix field like '%ABC'.
Comparison is case insensitive!.

Let’s assume you have an audit topic where you store changes made to data in your system and you want to see all the changes made by people whose name contains Alex. To achieve that the following query can be used;

SELECT *
FROM audit
WHERE username LIKE '%Alex%'

Null Check

Some of the fields in the payload can be null (or missing in case of JSON). Using the operations below a query can filter out any records where a field is null or not null:

Operation Description
is null Checks the value is null. When you want to filter all records with field is null
is not null Checks the value is not null. When you want to filter all records with field is not null
SELECT *
FROM `visitorTracking`
WHERE location.country IS NOT NULL

SELECT *
FROM `visitorTracking`
WHERE location.country IS NULL

Filter on Numbers

LSQL allows you to apply filters for fields of number type: integer, short, byte, long, double, float, decimal. The equality comparison for float and double happens within a precision range. Java uses a subset of IEEE 754 standard to represent floating point numbers. Some floating point numbers, for example, 1/3, cannot be represented exactly using float and double in Java. As a result, equality needs to consider an epsilon. LSQL uses 0.000001 for the value of the epsilon. If the two operands differ by less than the epsilon they are considered to be equal.

Operation Description
<> Checks for not equal. When you want to filter all records with field <> 2
= Checks for equality. When you want to filter all records with field == 'ABC'
< Checks for less than. When you want to filter all records with field < 123.4
<= Checks for less than or equal. When you want to filter all records with field <= 1000
> Checks for greater than. When you want to filter all record with field > -10
>= Checks for greater than or equal. When you want to filter all records with field >= 122.122315
SELECT *
FROM topic
WHERE location.latitude < -10

Filter on Partition

Sometimes a user will want to be able to look at one partition or a subset of partitions when applying a SELECT statement. Then the query needs to use the _partition keyword to instruct which ones. For example:

SELECT
    score
    , userId
    , sessionToken
FROM livescore
WHERE _partition = 1

--or

SELECT
    score
    , userId
    , sessionToken
FROM livescore
WHERE _partition in (2,3)

Filter on Timestamp

LSQL allows for filtering by timestamp to enable selection of time windows you are interested in. For allowing such filtering LSQL uses the _ts filter.

Let’s say we want to see all the messages on a topic and partition where their timestamp is greater than 1501593142000L - epoch time.

SELECT
    tradeId
    , isin
    , user
FROM `systemxtrades`
WHERE _partition = 1
AND _ts > 1501593142000

This is the same as if you were writing:

SELECT
    tradeId
    , isin
    , user
FROM `systemxtrades`
WHERE _partition = 1
AND _ts > '2017-08-01 13:12:22'

Important

LSQL understands timestamp date format following this rule: yyyy-MM-dd HH:mm:ss[.SSS]

Boolean filter

There are scenarios where checking on a boolean field value is a requirement. The SQL allows you to do so by simply specifying: is true or is false. Let’s say we have a topic where credit cards metadata is stored and we want to only pick those ones which have been blocked. Running the query below will allow you to achieve such functionality:

SELECT *
FROM credit_cards
WHERE blocked is true

IF Function

Sometimes it is required to pick a value based on a certain condition being met. The IF functions supports this scenario. Below is an example of using it:

SELECT IF((field1 + field2)/2 > 10 AND field3 IS NULL, field3 *10, field4+field6 * field8
FROM some_topic
WHERE blocked is true

Complex Filter

Applying a filter on a field sometimes is not enough. A user might want to apply algebra on a set of fields and filter based on the result, or maybe a user wants to look at a subset of a string field. LSQL allows you do do that.

SELECT *
FROM topicA
WHERE (a.d.e + b) /c > 100

LSQL allows the user to combine in the where clause fields from both key and value part. Here is an example of doing so:

SELECT *
FROM topicA
WHERE (_key.a.d.e + b) /_key.c < 100


SELECT *
FROM topicA
WHERE ((abs(fieldA) < 100 AND fieldB >= 2) OR (fieldC like '%wow'))

EXISTS Function

There are case when a field might not be present in the incoming payload. Say there is a topic tracking payments and a field source is not always present. To query such a topic for records the source is partnerA you can write:

SELECT *
FROM topicPayments
WHERE EXISTS(source) AND source = 'partnerA'

More often than not there are those topics used for sourcing events. You can imagine Avro records with field payload being an union between AccountRegistration, AccountSuspended and Payment (but the list can be larger) with the 3 types quite understandable not having the same fields names. To query for new accounts registrations you would write:

SELECT *
FROM `events-source`
WHERE EXISTS(account_holder_name) AND account_holder_name like '%Brindisi'