User Defined Functions

This page describes how to write UDFs for Lenses SQL.

When the pre-defined set of functions is not enough, Lenses allows users to define their own functions.

To write such a function, add the following JVM dependency and implement the UserDefinedFunction interface:

io.lenses:lenses-sql-udf:4.0.0

Add the resulting artefact to the Lenses deployment location; Lenses will load it automatically.

Implementing a UDF

The first step in implementing a UDF is to decide how many arguments the function takes.

Lenses currently supports functions that take 1, 2, 3, or a variable number of arguments.

For each variant, Lenses provides a corresponding interface:

  • Functions taking 1 argument: UserDefinedFunction1

  • Functions taking 2 arguments: UserDefinedFunction2

  • Functions taking 3 arguments: UserDefinedFunction3

  • Functions taking a variable number of arguments: UserDefinedFunctionVarArg

Note that, currently, UDFs need to be put in package io.lenses.sql.udf, otherwise Lenses will not pick them up.

Package

Please make sure that the UDF implementation class belongs to one of the packages specified by the lenses.sql.udf.packages configuration option.

Interface

This section covers only the variable-argument variant. The other variants work in much the same way; the only difference is that their typer function takes a fixed number of arguments instead of a variable number.

public interface UserDefinedFunctionVarArg extends UserDefinedFunction {

    /**
     * The name by which this UDF is called in a query.
     */
    String name();

    /**
     * Defines a mapping from input to output types.
     *
     * @param argTypes List of argument data types.
     * @return The resulting data type.
     * @throws UdfException if one or more of the given data types are not supported by this UDF.
     */
    DataType typer(List<DataType> argTypes) throws UdfException;

    /**
     * Evaluates this UDF for the given argument.
     *
     * @param args List of arguments.
     * @return The result of the evaluation.
     * @throws UdfException if the evaluation failed for some reason.
     */
    Value evaluate(List<Value> args) throws UdfException;
}

String name()

When a query specifies a function name:

SELECT STREAM foo(bar1,bar2,bar3) FROM source;

Lenses first checks the list of pre-defined functions for a matching name. If it finds none, it checks whether a user defined function (UDF/UDAF) exists whose name() method returns the name used in the query (foo in the example above).
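
The lookup order described above can be sketched as follows. This is only an illustration, not the Lenses implementation: NamedFunction and FunctionResolver are hypothetical names, and exact, case-sensitive name matching is an assumption.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: only the name() part of a UDF matters for lookup.
interface NamedFunction {
    String name();
}

// Hypothetical resolver; it is not the Lenses implementation.
class FunctionResolver {
    private final Set<String> preDefined;
    private final Map<String, NamedFunction> udfs = new HashMap<>();

    FunctionResolver(Set<String> preDefined, List<NamedFunction> loadedUdfs) {
        this.preDefined = preDefined;
        for (NamedFunction udf : loadedUdfs) {
            // Index each UDF by the value returned from name().
            udfs.put(udf.name(), udf);
        }
    }

    // Reports where a name resolved: pre-defined functions are checked first.
    String resolve(String nameInQuery) {
        if (preDefined.contains(nameInQuery)) {
            return "pre-defined";
        }
        if (udfs.containsKey(nameInQuery)) {
            return "udf";
        }
        return "unknown";
    }
}
```

In this sketch a UDF whose name() collides with a pre-defined function is never reached, which is one reason to give UDFs distinctive names.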

DataType typer(List<DataType> argTypes) throws UdfException

To avoid defining a separate function for each accepted argument type, Lenses lets a function derive its result type from the types of its arguments.

The typer for a function that transforms its arguments from Strings into Integers could be defined as follows:

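import java.util.List;

public class Foo implements UserDefinedFunctionVarArg {
    // name() and evaluate() are omitted here for brevity.

    @Override
    public DataType typer(List<DataType> argTypes) throws UdfException {
        // Reject the call if any argument is not a String.
        for (DataType dt : argTypes) {
            if (!dt.isString()) {
                throw new UdfException("Invalid argument. Function foo only accepts String types.");
            }
        }
        // Every accepted call produces an Int.
        return new LTInt();
    }
}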
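
A typer can also return a different type depending on its inputs, which is what removes the need for one function per argument type. The sketch below illustrates the idea with a result type equal to the widest numeric argument type. ArgType and TypingException are hypothetical stand-ins for DataType and UdfException so that the code compiles without the Lenses dependency; the stand-in exception is unchecked purely for brevity.

```java
import java.util.List;

// Hypothetical stand-in for DataType, ordered from narrowest to widest numeric type.
enum ArgType { INT, LONG, DOUBLE, STRING }

// Hypothetical stand-in for UdfException.
class TypingException extends RuntimeException {
    TypingException(String message) {
        super(message);
    }
}

class WidestNumberTyper {
    // The result type is the widest numeric type found among the arguments.
    static ArgType typer(List<ArgType> argTypes) {
        if (argTypes.isEmpty()) {
            throw new TypingException("Expected at least one argument.");
        }
        ArgType widest = ArgType.INT;
        for (ArgType t : argTypes) {
            if (t == ArgType.STRING) {
                throw new TypingException("Only numeric types are accepted.");
            }
            // Enum comparison follows declaration order: INT < LONG < DOUBLE.
            if (t.compareTo(widest) > 0) {
                widest = t;
            }
        }
        return widest;
    }
}
```

With this typer, a call with (INT, INT) is typed as INT while a call with (INT, LONG) is typed as LONG, all from a single function definition.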

Value evaluate(List<Value> args) throws UdfException

Finally, the evaluate method specifies the function’s behavior.

The following example converts each argument to an integer and returns their sum.

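import java.util.List;

public class Foo implements UserDefinedFunctionVarArg {
    // name() and typer() are omitted here for brevity.

    @Override
    public Value evaluate(List<Value> args) throws UdfException {
        int result = 0;
        for (Value value : args) {
            try {
                // String.valueOf works whatever type get() returns.
                result += Integer.parseInt(String.valueOf(value.get()));
            } catch (NumberFormatException e) {
                throw new UdfException("Function foo could not parse an argument as an integer.");
            }
        }
        // Assumption: IntValue wraps an int as a Value; check the lenses-sql-udf API.
        return new IntValue(result);
    }
}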
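
To see how name, typer and evaluate fit together in one class, the sketch below implements all three for the foo function. It is self-contained only because it uses stand-ins: SketchType, SketchValue, SketchUdfException and SketchVarArgFunction are hypothetical names that mirror the shape of the interface shown earlier, not the real Lenses classes.

```java
import java.util.List;

// Stand-ins for the Lenses types so that the sketch compiles on its own.
enum SketchType { STRING, INT }

final class SketchValue {
    private final Object raw;
    SketchValue(Object raw) { this.raw = raw; }
    Object get() { return raw; }
}

class SketchUdfException extends RuntimeException {
    SketchUdfException(String message) { super(message); }
}

// Mirrors the shape of UserDefinedFunctionVarArg using the stand-in types.
interface SketchVarArgFunction {
    String name();
    SketchType typer(List<SketchType> argTypes);
    SketchValue evaluate(List<SketchValue> args);
}

class SumOfStrings implements SketchVarArgFunction {
    @Override
    public String name() {
        return "foo"; // the name used in the query
    }

    @Override
    public SketchType typer(List<SketchType> argTypes) {
        for (SketchType t : argTypes) {
            if (t != SketchType.STRING) {
                throw new SketchUdfException("Function foo only accepts String types.");
            }
        }
        return SketchType.INT;
    }

    @Override
    public SketchValue evaluate(List<SketchValue> args) {
        int result = 0;
        for (SketchValue value : args) {
            try {
                result += Integer.parseInt(String.valueOf(value.get()));
            } catch (NumberFormatException e) {
                throw new SketchUdfException("Not an integer: " + value.get());
            }
        }
        // Wrap the sum so the return type matches the declared value type.
        return new SketchValue(result);
    }
}
```

The type returned by typer (INT) and the value produced by evaluate are kept consistent on purpose: the typer describes the result, and evaluate has to honour that description.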

Special Considerations

State

User defined functions are meant to be stateless. Because of this, Lenses makes no guarantees about the behavior of instance variables, and their use is highly discouraged.
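
The sketch below illustrates why instance state is fragile. Both classes are hypothetical and use plain Java types rather than the Lenses interfaces. The stateful variant returns results that depend on which instance handles the call and how often it was called before, which cannot be relied upon when no guarantees are made about instances.

```java
// Discouraged: the result depends on how often this particular instance was called.
class StatefulCounterFunction {
    private int calls = 0; // instance state

    int evaluate(String arg) {
        calls++;
        return calls;
    }
}

// Preferred: the result depends only on the argument.
class StatelessLengthFunction {
    int evaluate(String arg) {
        return arg.length();
    }
}
```

Calling the stateful function twice on one instance yields 1 and then 2, while a fresh instance starts again at 1. The stateless function returns the same result for the same argument on any instance.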

Nullability

Nullability in Lenses is a type level concern. As such, if a function can return null values, the typing information must reflect that.

When a function can return a NullValue, its typer should return an LTOptional<T>, where T is the type of the non-null result.
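
As an illustration of keeping the declared type and the returned values in agreement, the sketch below models a function that parses text into an integer and yields null when the text is not numeric. NullableType and SafeParseInt are hypothetical stand-ins, and a Java null stands in for NullValue; the real Lenses types are not used here.

```java
// Hypothetical stand-in for a data type that records whether null is allowed.
final class NullableType {
    final String base;
    final boolean isOptional;

    private NullableType(String base, boolean isOptional) {
        this.base = base;
        this.isOptional = isOptional;
    }

    static NullableType of(String base) {
        return new NullableType(base, false);
    }

    static NullableType optionalOf(String base) {
        return new NullableType(base, true);
    }
}

class SafeParseInt {
    // The function can return null, so the declared result type is optional.
    static NullableType typer(NullableType argType) {
        return NullableType.optionalOf("Int");
    }

    // Returns null (standing in for NullValue) when the text is not an integer.
    static Integer evaluate(String text) {
        if (text == null) {
            return null;
        }
        try {
            return Integer.valueOf(text.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

If the typer declared a plain Int here, the null returned for non-numeric input would contradict the declared type.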

Testing

Testing is an important part of any development. To test your UDF, we recommend following the example tests published in the Lenses UDF Example Repository.

2024 © Lenses.io Ltd. Apache, Apache Kafka, Kafka and associated open source project names are trademarks of the Apache Software Foundation.