Not all currencies are the same, so we would like to add a specific threshold for each currency. As a first cut, we combine multiple conditions with ANDs and ORs:
SET defaults.topic.autocreate=true;INSERT INTO big_eur_usd_paymentsSELECT STREAM *FROM paymentsWHERE (amount >=5000AND currency ='EUR') OR (amount >=5500AND currency ='USD')
As an improvement, we want to capture the threshold for each currency in a single expression. We will use a CASE statement for that:
SET defaults.topic.autocreate=true;INSERT INTO big_payments_caseSELECT STREAM *FROM paymentsWHERE amount >= (CASEWHEN currency ='EUR'THEN5000WHEN currency ='USD'THEN5500WHEN currency ='GBP'THEN4500ELSE5000END)
In this section, we will find all the transactions that happened during the (UTC) night. To do that we can use one of our many date and time functions.
You will also see how to use a CAST expression to convert from one type to another.
SET defaults.topic.autocreate=true;INSERT INTO night_paymentsSELECT STREAM *FROM paymentsWHERECAST(HOUR(time) ASINT) >=0ANDCAST(HOUR(time) ASINT) <=5
Checking the output, we can see that only one transaction satisfied our predicate:
Let’s imagine that we have to build some intelligence around all the payments we process, but we do not have the capacity and the need to process all of them.
We decided then to build a reduced copy of the payments topic, with only 10% of the original records.
To do that we are going to use our RANDINT function:
SET defaults.topic.autocreate=true;INSERT INTO payments_sampleSELECT STREAM *FROM paymentsWHERECAST(ABS(RANDINT()) AS DOUBLE) /2147483647<0.01
RANDINT generates a random integer, we take its absolute value to make sure it is positive, and we then normalise the result dividing it by the max possible integer, getting an (almost) uniform sample of numbers between 0 and 1.
We have to CAST to double on the way; otherwise, the division would be between integers, and the result would always be 0.