Provisioning management


Introduction 

The Provisioning feature lets users define and manage data connections through two input mechanisms: the API and the filesystem (FS). Both are centered on the provisioning.yaml descriptor, which represents the desired state of connections. This document outlines each method and provides guidance on its usage.

Key Point: Provisioning Behavior 

All manual adjustments to the connection settings and the product license will be overridden by whichever provisioning method is in use, whether API or FS. It’s crucial to keep your provisioning descriptor, license information, and related files up to date to avoid unexpected changes.

1. API Input Mechanism 

The API input mechanism is continuously available, offering a direct route to configure data connections and manage the product license.

  • Availability: Always ON. Cannot be disabled.

  • Data Input: Data is provisioned through multipart API requests. The multipart request differentiates between attached files and the provisioning.yaml descriptor using the part’s content type.

    • Provisioning File: For the provisioning.yaml file, use a content type of text/plain; charset=utf-8.
    • Attached Files: For attached files, such as certificates or any other required materialized files, use the content type application/octet-stream.
    • Part Naming: The name of the part in the multipart request should match the value of the property pointing to the mounted file in the provisioning.yaml descriptor. This ensures accurate mapping and referencing of files.
  • Documentation: Detailed API endpoints, multipart request formats, and examples are provided on our API Documentation Site.

  • Endpoints:

    • Connections: PUT "api/v1/state/connections"

      • Query string parameters:
        • validateOnly: Defaults to false. If set to true, the request is only validated and no changes are applied to the system.
        • validateConnectivity: Defaults to true. If set to false, connectivity to the described systems is not checked.
    • License: PUT "api/v1/state/license"

      • Use this endpoint to manage and update your product license.
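
The two query string parameters of the connections endpoint can be combined to dry-run a descriptor before applying it. The following is a minimal sketch, not an official client. The helper names connections_url and validate_connections are hypothetical; LENSES_ENV, LENSES_SESSION_TOKEN, the X-Kafka-Lenses-Token header, and the provisioning part name are taken from the curl example later in this document.

```shell
#!/usr/bin/env bash
# Build the connections endpoint URL with explicit query string parameters.
# $1 = base URL, $2 = validateOnly (true|false), $3 = validateConnectivity (true|false)
connections_url() {
  local base="${1%/}"                      # drop one trailing slash, if any
  local validate_only="${2:-false}"        # documented default
  local validate_connectivity="${3:-true}" # documented default
  printf '%s/api/v1/state/connections?validateOnly=%s&validateConnectivity=%s\n' \
    "$base" "$validate_only" "$validate_connectivity"
}

# Dry run: ask the server to validate the descriptor without applying it.
# $1 = path to provisioning.yaml
# Assumes a curl version that treats ";charset=utf-8" as part of the type.
validate_connections() {
  curl --fail --silent --show-error --request PUT \
    "$(connections_url "$LENSES_ENV" true true)" \
    --header "X-Kafka-Lenses-Token: ${LENSES_SESSION_TOKEN}" \
    --form "provisioning=@${1};type=text/plain;charset=utf-8"
}
```

Calling validate_connections resources/provisioning.yaml would then send a validate-only request. Descriptors that reference files would also need one --form part per referenced file.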

2. Filesystem (FS) Input Mechanism 

FS provisioning is a method where data is ingested directly from specified directories in the filesystem. It provides a consistent, automated way of updating connections and the product license.

  • Availability: Enabled by configuring the necessary filesystem directories and properties.
  • Data Input: The system periodically reads the provisioning.yaml descriptor and license.json from the configured directory. Additionally, any requisite certificates or materialized files should be placed in the designated subfolder.
  • Documentation: Below, you’ll find detailed configuration options, directory structure, and essential files to aid in setting up and understanding the FS provisioning mechanism.

Configuration Properties: 

To activate the FS provisioning approach, ensure you configure the following:

  1. lenses.provisioning.path:

    • Description: Determines the directory the system reads data from.
    • Requirement: Must be a valid filesystem path.
  2. lenses.provisioning.interval:

    • Description: Defines how frequently the system checks the specified directory for updates.
    • Requirement: Must adhere to the scala.concurrent.duration.FiniteDuration format.

Directory Structure: 

- configured-provisioning-folder
├── files/
│   ├── file1.txt
│   └── file2.txt
├── provisioning.yaml   (To define desired state of connections)
└── license.json        (To manage the product's license data)

Key Points:

  • provisioning.yaml: Outlines the desired state of data connections, detailing services, protocols, and configurations.

  • license.json: Captures the product’s licensing information.

  • files/: This subfolder should house any necessary certificates or materialized files. Connection properties within the provisioning.yaml pointing to these files must accurately reference the appropriate file within this directory.
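
Since every file property in the descriptor must point at a file under files/, a quick pre-flight check can catch missing files before the next polling interval. The following is a rough sketch under stated assumptions: missing_file_refs is a hypothetical helper, and it matches lines textually instead of parsing YAML, so it only handles simple "file: name" entries.

```shell
#!/usr/bin/env bash
# Print every `file:` value in provisioning.yaml that has no matching file
# under files/. Line-based matching only; this is not a YAML parser.
# $1 = provisioning folder (the directory set in lenses.provisioning.path)
missing_file_refs() {
  local dir="$1"
  # Extract the first token after "file:", ignoring trailing comments,
  # then strip any surrounding quotes.
  sed -n 's/^[[:space:]]*file:[[:space:]]*\([^[:space:]#]\{1,\}\).*$/\1/p' \
      "$dir/provisioning.yaml" |
  tr -d "\"'" |
  while IFS= read -r name; do
    [ -f "$dir/files/$name" ] || printf '%s\n' "$name"
  done
}
```

An empty output means every referenced file was found. Any printed name is a reference with no corresponding file in the files subfolder.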

The provisioning.yaml Descriptor 

Both the API and FS mechanisms employ the provisioning.yaml descriptor to outline the desired state of data connections.

Descriptor Structure - provisioning.yaml 

componentName:
  - name: uniqueName
    version: versionNumber
    tags: [ 'tag1', 'tag2', ... ]
    configuration:
      directValueProperty:
        value: propertyValue
      fileReferenceProperty:
        file: fileName
      referenceProperty:
        reference: anotherUniqueName
      ...

  • componentName: Represents the type or category of the connection or service being described (e.g., kafka, confluentSchemaRegistry, connect). This serves as a general classifier.

    • name: A unique identifier for the connection or service instance within its category. This ensures each connection is distinguishable from the others.

    • version: Represents the version number of the connection descriptor. This can help track changes or upgrades to the connection setup.

    • tags: An array of strings that serve as labels or markers for the connection. These tags can be used for categorization, filtering, or simply adding context (e.g., ‘dev’, ‘production’, ‘finance-team’).

    • configuration: This is where the specific settings and parameters for the connection are defined. It contains a mixture of direct values, file references, and references to other configurations:

      • propertyName: The name of the specific setting or parameter. Its structure can be one of three:

        • value: Directly provides the setting’s value. This can be a string, number, boolean, or even a list, depending on the requirement. Example: Setting a protocol (protocol: SSL) or providing a list of bootstrap servers.

        • file: Points to a specific file that contains data or information related to the connection. Typically, it refers to certificates, keys, or any other file needed to model the connection.

          • FS Input: The value here should match the file name stored under the files subdirectory of the provisioning folder. For instance, if the system requires a keystore for an SSL connection, this might point to keystore.jks in the files subdirectory.

          • API Input: The value should match the part name in the multipart request. For instance, if the system requires a keystore for an SSL connection, the value of this property should be the name of the part in the multipart request that contains the keystore file.

              kafka:
                configuration:
                  sslKeystore:
                    file: keystoreFileName

              # --form makes curl send multipart/form-data and set the boundary itself
              curl --location --request PUT "${LENSES_ENV}/api/v1/state/connections" \
              --header "X-Kafka-Lenses-Token: ${LENSES_SESSION_TOKEN}" \
              --form "keystoreFileName=@${PATH_TO_KEYSTORE_FILE};type=application/octet-stream" \
              --form 'provisioning=@"resources/provisioning.yaml";type=text/plain;charset=utf-8'

            In the above example, we can see that the part name is keystoreFileName, which is the value of the kafka.configuration.sslKeystore.file yaml property.

        • reference: This is a reference to another connection’s property. Instead of providing a direct value or file, it denotes that the value for this property should be fetched from another previously defined connection. The value here should be the name of the connection from which the property value will be inherited.

This structure keeps connection definitions clear and organized. Combining direct values, file references, and references between connections offers flexibility in how configurations are set up and managed.

For a comprehensive list of the properties supported by each connection type, consult the Connections API documentation. You can access it directly via Connection Templates API Documentation.

provisioning.yaml Example: 

kafka:
  - name: kafka-dev
    version: 1
    tags: [ 'kafka', 'dev' ]
    configuration:
      kafkaBootstrapServers:
        value:
          - SSL://broker1:port
      protocol:
        value: SSL
      sslKeystore:
        file: keystore

confluentSchemaRegistry:
  - name: schema-reg-dev
    version: 1
    tags: [ 'dev' ]
    configuration:
      schemaRegistryUrls:
        value:
          - host1:port
      metricsPort:
        value: 9582

connect:
  - name: connect-cluster-dev
    version: 1
    tags: [ 'dev' ]
    configuration:
      aes256Key:
        value: customPassword
      workers:
        value:
          - host:13001

Example dissection 

This provisioning.yaml file defines configurations for a Kafka cluster, a Schema Registry (Confluent-compatible API, AWS Glue), and a Kafka Connect cluster. Settings such as brokers, registry URLs, and encryption keys are specified in each component’s configuration.

Kafka Component 

  • Component Type: kafka
    • This section deals with provisioning a Kafka connection or service.
  • name: kafka-dev
    • The unique identifier for this particular Kafka instance.
  • version: 1
    • The version of the Kafka connection descriptor.
  • tags: [ 'kafka', 'dev' ]
    • Labels that provide context or categorization. This Kafka instance is tagged ‘kafka’ and ‘dev’, which likely signifies a development environment.
  • configuration:
    • kafkaBootstrapServers:
      • Lists the Kafka broker(s) to connect to over SSL (SSL://broker1:port).
    • protocol:
      • Specifies the communication protocol used, which is SSL in this instance.
    • sslKeystore:
      • Points to a file named keystore (assumed to be in the files directory). This is likely the keystore file for SSL encryption and authentication.

Confluent Schema Registry Component 

  • Component Type: confluentSchemaRegistry
    • This section deals with provisioning a connection to the Confluent Schema Registry.
  • name: schema-reg-dev
    • The unique identifier for this Schema Registry instance.
  • version: 1
    • The version of the Confluent Schema Registry connection descriptor.
  • configuration:
    • schemaRegistryUrls:
      • Lists the URL(s) for the schema registry (host1:port).
    • metricsPort:
      • Specifies the port (9582) on which metrics are exposed for this service.

Connect Component 

  • Component Type: connect
    • This section deals with provisioning a Kafka Connect service.
  • name: connect-cluster-dev
    • The unique identifier for this Connect instance.
  • version: 1
    • The version of the Kafka Connect connection descriptor.
  • configuration:
    • aes256Key:
      • Specifies the encryption key (likely for securing certain pieces of sensitive data).
    • workers:
      • Lists the worker(s) for this Connect cluster. In this example, there’s one worker at host:13001.

Always ensure the provisioning.yaml descriptor accurately represents the desired system state. Any connections not present in this file but present in the system will be dropped.
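
Because connections missing from the descriptor are dropped, it can help to list what a descriptor declares before submitting it. The following is a minimal sketch: declared_connections is a hypothetical helper that matches lines textually instead of parsing YAML, and it assumes the layout used in the example above (component keys in the first column, each connection starting with "- name:").

```shell
#!/usr/bin/env bash
# Print "<componentName>/<name>" for every connection declared in a descriptor.
# Line-based: assumes component keys start in column 1 and each connection
# starts with "- name: <value>".
# $1 = path to provisioning.yaml
declared_connections() {
  awk '
    # A top-level key such as "kafka:" opens a new component section.
    /^[A-Za-z][A-Za-z0-9_]*:[ \t]*$/ {
      component = $1
      sub(/:$/, "", component)
      next
    }
    # A list item "- name: value" declares one connection in that section.
    /^[ \t]*-[ \t]+name:[ \t]*/ {
      name = $0
      sub(/^[ \t]*-[ \t]+name:[ \t]*/, "", name)
      sub(/[ \t]+$/, "", name)
      print component "/" name
    }
  ' "$1"
}
```

Comparing this list with the connections currently present in the system shows which ones would be removed once the descriptor is applied.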