Data Policies

Lenses Data Policies module can protect data in motion. As a user you can track, secure and govern your sensitive data, as it flows, is shared or analysed via Lenses.

Data protection of sensitive or classified data is not a new problem. Analyzing and acting on data insights in real time requires to address risk for data in motion. As a Data officer you can set up global Data Policies and secure data via redaction modes, so that sensitive data can be protected while it gets analysed.

Lenses will automatically detect and apply rules to the relevant datasets, in the case of Apache Kafka the topics, across all APIs and client libraries.

Protect your data in motion

Individuals, as well as businesses, face challenges protecting Personally Identifiable Information (PII). Individuals are responsible for exposing their own information and understand the risks. However, businesses have a greater liability for exposing sensitive customer data. And since businesses are built on top of people and processes, they are fully responsible for their employee’s actions and how well their internal processes avoid exposing PII.

Businesses that don’t protect their customers and employees personally identifiable information, risk paying substantial fines as well as incur reputation damage in case of a data breach.

According to the Guide to Protecting the Confidentiality of Personally Identifiable Information (PII) from NIST (National Institute of Standards and Technology), organizations should identify and manage all PII residing in their environments.

Due to their continuous flux, Data-In-Motion poses additional challenges to organizations. With data streaming technologies (like Apache Kafka) being used at the heart of modern systems, the demand for Intelligent Data Protection requires new processes for the modern Data Officer.


Data protection requirements

When it comes to protecting the personally identifiable information, there are a few fundamental requirements to fulfill:

  • Identify where PII data is stored – without knowing where this information is retained then it is impossible to provide adequate protection.
  • Audit and control access to your data – a key control for protecting the privacy of data is access control.
  • Use Data Policies for PII data – rules to control the impact and reduction levels regarding access to the data.
  • Educate your users – everyone in your organizations handling PII should know the risks and their responsibilities for mishandling the data

The ever-evolving nature of data and schemas and the adoption of streaming data make it even more challenging to keep up with compliance and operate a certified data platform. Lenses ® gives data officers and data stewards control over sensitive data.

Data Access and Control

Lenses provides protection on data read. The original data are stored in Apache Kafka, and Lenses enables the governed access to the data content. This comes both from user/group permissions on what data a user can access, as well as from field level protection enforced via the data officer policies put in place.

Managing data policies

Enabling data protection is achieved via data policies. As a Data steward or Data Officer, you need to know your data and protect sensitive data by adding a new policy via the Policies screen:


On the data policies page, add a new policy


And create a data policy by filling in the bellow info:

  • Policy Name - A short description to say what’s the policy for
  • Redaction - How to protect the data
  • Category - A logical group to better classify the data
  • Impact - Set the confidentiality impact level
  • Fields - A collection of data fields names to protect by the new data policy rule. If field credit_card is added, for example,

Lenses will make sure the field is protected in accordance to the redaction level. If your user has the Data Officer role, you can create, update and delete policies.

Data policy details

Adding a data policy will impact the results returned by the SQL engine. If the topic data type contain the fields specified in the policy, those fields values on each query, will be redacted. Each policy will impact, potentially, different topics. A list of all affected can be seen on the policy details view.

The details policy page highlights the risk of exposing the data. For each policy the user can see the applications using the topics where the policy entry fields are present. They are grouped in three different categories: Lenses Connectors, Lenses SQL Processors and Custom Applications. The screenshot below gives you an example of a data policy entry for a field named creditCard:


Data Taxonomy

Based on the NIST specifications, Lenses provides a set of policies out of the box. This data taxonomy is configurable and can be tuned to the specific business domain and business requirements. Here is an example:

  • Name: Full Name, Maiden Name, Mother’s name, Alias
  • Personal Identification Information: SSN, Passport number, Driver’s license Number, TaxPayer identification number, Patient Identification Number, Financial Account, Credit Card Number, Login name / Username
  • Address Information: Street Address, Email Address, Zip Code, City, Country
  • Asset Information: IP Address, MAC Address
  • Telephone Number: Mobile, Business and Personal number
  • Personally owned property: Vehicle registration Number
  • Personal linkable Information: Data of birth, age, place of birth, religion, race, weight, height, activities, geographical indicators, employment information, medical info, educational info, financial info

Data Obfuscation

Lenses protects the data by obfuscating parts of it. Out of the box, the DataOps platform provides a set of functions defining the obfuscation behavior. Here is the full list:

Name Description
Masks the matching fields. For example: ‘(123) 800 2999’
will be translate to ‘**** * **‘.
None No obfuscation is applied. The matching fields values stay as they are.
Masks matching emails keeping only the first character as
unmasked. For example, '‘ will translate to
Extracts the first character of every word. For example: ‘Lenses
all the way ‘ will translate to ‘L a t w’.
Masks matching fields keeping the first character as unmasked.
For example: ‘Lenses’ will translate to ‘L*****’.
Masks matching fields keeping the first 2 characters as
unmasked. For example: ‘Lenses’ will translate to ‘Le****’.
Masks matching fields keeping the first 3 characters as
unmasked. For example: ‘Lenses’ will translate to ‘Len***’.
Masks matching fields keeping the first 4 characters as
unmasked. For example: ‘Lenses’ will translate to ‘Lens**’.
Masks matching fields keeping the last character as unmasked.
For example: ‘Lenses’ will translate to ‘*****s’.
Masks matching fields keeping the last 2 characters as
unmasked. For example: ‘Lenses’ will translate to ‘****es’.
Masks matching fields keeping the last 3 characters as
unmasked. For example: ‘Lenses’ will translate to ‘***ses’.
Masks matching fields keeping the last 4 characters as
unmasked. For example: ‘Lenses’ will translate to ‘***ses’.

Confidentiality Impact Level

Not all data has the same confidentiality impact level. Fields that can be used to fully identify a person (i.e. Passport or Social Security Number) have to be treated with additional care in comparison to information like Country or Post-Code, that can only partially reveal the identity of a subject. Lenses provides three levels of impact: high, medium and low.

Default Data Policies

Policies are applied and defined on a record field level. We have identified the most commonly used fields, in accordance to national standards in US and EU that organisations need to comply with, and you can optionally load them when you get started with Lenses policies.


When you load the default data policies, any topic that has fields matching the below ones, the privacy policy will be applied when you access data from Lenses web ui, endpoints or native clients.

Policy Name Category Impact Redaction Policy Fields  
Social Security Number PII HIGH First2 ssn, social_security, social_security_number  
Passport PII HIGH First2 passport, passport_number, national_id  
Drivers License PII HIGH First2 driver_license  
Tax Payer ID PII HIGH First2 tax_payer_id, taxpayerid, unique_taxpayer, nino, utr, tin, atin, itin, tax_reference  
Patient ID PII HIGH First2 patient_id, patientID  
Financial Account PII HIGH Last4 account_number, sort_code, accountnumber, sortcode  
Credit Card PII HIGH Last4 credit_card, creditcard  
Email PII HIGH Email email  
User Name PII LOW NoObfuscation username, user_name, login_name  
Full Name Name HIGH NoObfuscation full_name, fullname  
Surname Name MEDIUM NoObfuscation surname, lastname, last_name  
Maiden Name Name MEDIUM NoObfuscation maiden_name  
Mother’s Name Name MEDIUM NoObfuscation mother_name  
First Name Name MEDIUM NoObfuscation name  
Street Address Address MEDIUM Initials home_address, street_address, address  
Post Code Address LOW NoObfuscation post_code, postcode, zip_code, zipcode  
City Address LOW NoObfuscation city  
Country Address LOW NoObfuscation country  
IP Address Asset MEDIUM First4 ip_address, ipaddress  
MAC Address Asset MEDIUM First4 mac_address, macaddress  
Phone number Asset MEDIUM First3 phone_number, mobile_number, mobile_phone  
Vehicle number Asset MEDIUM First2 vehicle_registration_number, vehicle_number  
Date of birth Personal Linkable LOW NoObfuscation date_of_birth, dob  
Place of birth Personal Linkable LOW NoObfuscation place_of_birth  
Religion Personal Linkable LOW NoObfuscation religion  
Nationality Personal Linkable LOW NoObfuscation ethnicity, nationality