Data Policies

The Lenses Data Policies module protects data in motion. As a user, you can track, secure and govern your sensitive data as it flows, is shared or is analyzed via Lenses.

Protecting sensitive or classified data is not a new problem. Analyzing and acting on data insights in real time requires addressing the risk to data in motion. As a Data Officer, you can set up global Data Policies and secure data via redaction modes so that sensitive data is protected while it is being analyzed.

Lenses will automatically detect and apply rules to the relevant datasets, which in the case of Apache Kafka are Kafka topics, across all APIs and client libraries.

Protect your data in motion

Individuals, as well as businesses, face challenges protecting Personally Identifiable Information (PII). Individuals are responsible for exposing their own information and for understanding the risks. Businesses, however, carry greater liability for exposing sensitive customer data. Since businesses are built on top of people and processes, they are fully responsible for their employees’ actions and for how well their internal processes avoid exposing PII.

Businesses that do not protect their customers’ and employees’ personally identifiable information risk paying substantial fines and incurring reputational damage in case of a data breach.

According to the Guide to Protecting the Confidentiality of Personally Identifiable Information (PII) from NIST (National Institute of Standards and Technology), organizations should identify and manage all PII residing in their environments.

Because it is in continuous flux, Data-In-Motion poses additional challenges to organizations. With data streaming technologies (like Apache Kafka) at the heart of modern systems, the demand for Intelligent Data Protection requires new processes for the modern Data Officer.

../../_images/dataofficer.png

Data protection requirements

When it comes to protecting personally identifiable information, there are a few fundamental requirements to fulfill:

  • Identify where PII data is stored – without knowing where this information is retained, it is impossible to provide adequate protection.
  • Audit and control access to your data – a key control for protecting the privacy of data is access control.
  • Use Data Policies for PII data – rules to control the impact and redaction levels regarding data access.
  • Educate your users – everyone in your organization who handles PII should know the risks and their responsibilities if data is mishandled.

The ever-evolving nature of data and schemas, together with the adoption of streaming data, makes it even more challenging to keep up with compliance and operate a certified data platform. Lenses® gives data officers and data stewards control over sensitive data.

Data Access and Control

Lenses protects data when it is read. The original data is stored in Apache Kafka, and Lenses provides governed access to its content. This protection comes both from user/group permissions, which specify what data a user can access, and from field-level protection enforced via the policies that the data officer has put in place.

Managing data policies

Data protection is enabled via data policies. As a Data Steward or Data Officer, you need to know your data and protect sensitive data by adding a new policy via the Policies screen:

../../_images/navigate.png

On the data policies page, add a new policy:

../../_images/new_entry.png

And create a data policy by filling in the following information:

  • Policy Name - A short description to say what the policy is for.
  • Redaction - How to protect the data.
  • Category - A logical group to better classify the data.
  • Impact - Set the confidentiality impact level.
  • Fields - A collection of data field names to protect with the new data policy rule. For example, if a field named credit_card is added, Lenses makes sure that the field is protected in accordance with the redaction level.

If your user has the Data Officer role, they can create, update and delete policies.
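
The five inputs listed above can be pictured as a small record. The following is only an illustrative sketch in Python: the `DataPolicy` and `Impact` names are hypothetical and do not come from the Lenses API. The example values are the Credit Card entry from the default policies listed later on this page.

```python
from dataclasses import dataclass
from enum import Enum


class Impact(Enum):
    """The three confidentiality impact levels (hypothetical enum)."""
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"


@dataclass(frozen=True)
class DataPolicy:
    """Hypothetical in-memory model of one policy; not the Lenses API."""
    name: str          # Policy Name: short description of the policy
    redaction: str     # Redaction: how to protect the data, e.g. "Last4"
    category: str      # Category: logical group, e.g. "PII"
    impact: Impact     # Impact: confidentiality impact level
    fields: frozenset  # Fields: field names protected by this policy

    def protects(self, field_name: str) -> bool:
        """True if the given field name is covered by this policy."""
        return field_name in self.fields


credit_card_policy = DataPolicy(
    name="Credit Card",
    redaction="Last4",
    category="PII",
    impact=Impact.HIGH,
    fields=frozenset({"credit_card", "creditcard"}),
)
```

Modelling the fields as a set keeps the membership check cheap and makes it explicit that field order carries no meaning. Whether Lenses matches names case-sensitively is not stated here, so the sketch assumes exact matches.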

Data policy details

Adding a new data policy changes the results returned by the SQL engine. If the topic data contains the fields specified in the policy, those field values are redacted on each query. A single policy can affect several topics. The policy details view lists all affected topics.
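
To make the read-time behavior concrete, here is a minimal Python sketch. It is not how Lenses implements redaction: the function names, the topic names and the flat dictionary records are assumptions made for illustration. It only shows the two ideas from the paragraph above: finding the topics whose fields match a policy, and redacting matching fields when a record is returned.

```python
def affected_topics(policy_fields, topic_schemas):
    """Return the topics whose schema contains at least one protected field."""
    protected = set(policy_fields)
    return sorted(
        topic for topic, fields in topic_schemas.items()
        if protected & set(fields)
    )


def redact_record(record, policy_fields, redact):
    """Return a copy of the record with protected top-level fields redacted."""
    protected = set(policy_fields)
    return {
        key: redact(value) if key in protected else value
        for key, value in record.items()
    }


def mask_all_but_last4(value):
    """Illustrative redaction: keep the last 4 characters, mask the rest."""
    text = str(value)
    return "*" * max(len(text) - 4, 0) + text[-4:]


# Hypothetical topics and schemas, for illustration only.
topic_schemas = {
    "payments": ["id", "credit_card", "amount"],
    "clicks": ["id", "url"],
}
policy_fields = ["credit_card", "creditcard"]

topics = affected_topics(policy_fields, topic_schemas)
record = {"id": 7, "credit_card": "4111111111111111", "amount": 12.5}
redacted = redact_record(record, policy_fields, mask_all_but_last4)
```

The stored record is never modified; only the copy handed back to the reader is redacted. The sketch handles top-level fields only, so nested structures would need a recursive walk.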

The policy details page highlights the risks of exposing the data. For each policy, the user can see the applications that use the topics containing the policy’s fields. These applications are grouped into three categories: Lenses Connectors, Lenses SQL Processors and Custom Applications. The screenshot below shows an example of a data policy entry for a field named creditCard:

../../_images/affected_apps.png

Data Taxonomy

Based on the NIST specifications, Lenses provides a set of policies out of the box. This data taxonomy is configurable and can be tuned to the specific business domain and business requirements. Here is an example:

  • Name: Full Name, Maiden Name, Mother’s name, Alias
  • Personal Identification Information: SSN, Passport number, Driver’s license Number, TaxPayer identification number, Patient Identification Number, Financial Account, Credit Card Number, Login name / Username
  • Address Information: Street Address, Email Address, Zip Code, City, Country
  • Asset Information: IP Address, MAC Address
  • Telephone Number: Mobile, Business and Personal number
  • Personally owned property: Vehicle registration Number
  • Personal linkable Information: Date of birth, age, place of birth, religion, race, weight, height, activities, geographical indicators, employment information, medical info, educational info, financial info

Data Obfuscation

Lenses protects the data by obfuscating parts of it. The DataOps platform comes with a predefined set of functions that define the obfuscation behavior. Here is the full list:

| Name | Description |
| --- | --- |
| All | Masks the matching fields. For example: ‘(123) 800 2999’ will be translated to ‘**** * **’. |
| None | No obfuscation is applied. The matching field values stay as they are. |
| Email | Masks matching emails, keeping only the first character unmasked. For example: ‘support@lenses.io’ will translate to |
| Initials | Extracts the first character of every word. For example: ‘Lenses all the way ’ will translate to ‘L a t w’. |
| First-1 | Masks matching fields, keeping the first character unmasked. For example: ‘Lenses’ will translate to ‘L*****’. |
| First-2 | Masks matching fields, keeping the first 2 characters unmasked. For example: ‘Lenses’ will translate to ‘Le****’. |
| First-3 | Masks matching fields, keeping the first 3 characters unmasked. For example: ‘Lenses’ will translate to ‘Len***’. |
| First-4 | Masks matching fields, keeping the first 4 characters unmasked. For example: ‘Lenses’ will translate to ‘Lens**’. |
| Last-1 | Masks matching fields, keeping the last character unmasked. For example: ‘Lenses’ will translate to ‘*****s’. |
| Last-2 | Masks matching fields, keeping the last 2 characters unmasked. For example: ‘Lenses’ will translate to ‘****es’. |
| Last-3 | Masks matching fields, keeping the last 3 characters unmasked. For example: ‘Lenses’ will translate to ‘***ses’. |
| Last-4 | Masks matching fields, keeping the last 4 characters unmasked. For example: ‘Lenses’ will translate to ‘**nses’. |
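
The First-N, Last-N and Initials behaviors described above can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the Lenses implementation: the function names are hypothetical, the mask character is assumed to be `*`, and each masked character is assumed to be replaced one for one, as in the ‘Lenses’ examples. The All and Email modes are left out because their exact output is not fully specified here.

```python
def keep_first(value: str, n: int) -> str:
    """Keep the first n characters and mask the rest (First-1 .. First-4)."""
    return value[:n] + "*" * max(len(value) - n, 0)


def keep_last(value: str, n: int) -> str:
    """Keep the last n characters and mask the rest (Last-1 .. Last-4)."""
    if n <= 0:
        # value[-0:] would return the whole string, so handle this case explicitly.
        return "*" * len(value)
    return "*" * max(len(value) - n, 0) + value[-n:]


def initials(value: str) -> str:
    """Keep only the first character of every whitespace-separated word."""
    return " ".join(word[0] for word in value.split())


first_1 = keep_first("Lenses", 1)          # 'L*****'
last_2 = keep_last("Lenses", 2)            # '****es'
short = initials("Lenses all the way ")    # 'L a t w'
```

Values shorter than `n` are returned unmasked by this sketch. A production redaction function would likely need a stricter rule for such short values.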

Confidentiality Impact Level

Not all data has the same confidentiality impact level. Fields that can be used to fully identify a person (e.g. Passport or Social Security Number) have to be treated with extra care compared to information such as Country or Postal Code, which can only partially reveal the identity of a subject. Lenses provides three levels of impact: high, medium and low.

Default Data Policies

Policies are defined and applied at the record field level. We have identified the fields that organizations most commonly need to protect in order to comply with national standards in the US and EU. You can optionally load them when you get started with Lenses policies.

Note

When you load the default data policies, the privacy policy will be applied to any topic that has fields matching the ones below.

This will automatically happen when you access data from the Lenses web UI or any endpoints or native clients.

| Policy Name | Category | Impact | Redaction Policy | Fields |
| --- | --- | --- | --- | --- |
| Social Security Number | PII | HIGH | First2 | ssn, social_security, social_security_number |
| Passport | PII | HIGH | First2 | passport, passport_number, national_id |
| Drivers License | PII | HIGH | First2 | driver_license |
| Tax Payer ID | PII | HIGH | First2 | tax_payer_id, taxpayerid, unique_taxpayer, nino, utr, tin, atin, itin, tax_reference |
| Patient ID | PII | HIGH | First2 | patient_id, patientID |
| Financial Account | PII | HIGH | Last4 | account_number, sort_code, accountnumber, sortcode |
| Credit Card | PII | HIGH | Last4 | credit_card, creditcard |
| Email | PII | HIGH | Email | email |
| User Name | PII | LOW | NoObfuscation | username, user_name, login_name |
| Full Name | Name | HIGH | NoObfuscation | full_name, fullname |
| Surname | Name | MEDIUM | NoObfuscation | surname, lastname, last_name |
| Maiden Name | Name | MEDIUM | NoObfuscation | maiden_name |
| Mother’s Name | Name | MEDIUM | NoObfuscation | mother_name |
| First Name | Name | MEDIUM | NoObfuscation | name |
| Street Address | Address | MEDIUM | Initials | home_address, street_address, address |
| Post Code | Address | LOW | NoObfuscation | post_code, postcode, zip_code, zipcode |
| City | Address | LOW | NoObfuscation | city |
| Country | Address | LOW | NoObfuscation | country |
| IP Address | Asset | MEDIUM | First4 | ip_address, ipaddress |
| MAC Address | Asset | MEDIUM | First4 | mac_address, macaddress |
| Phone number | Asset | MEDIUM | First3 | phone_number, mobile_number, mobile_phone |
| Vehicle number | Asset | MEDIUM | First2 | vehicle_registration_number, vehicle_number |
| Date of birth | Personal Linkable | LOW | NoObfuscation | date_of_birth, dob |
| Place of birth | Personal Linkable | LOW | NoObfuscation | place_of_birth |
| Religion | Personal Linkable | LOW | NoObfuscation | religion |
| Nationality | Personal Linkable | LOW | NoObfuscation | ethnicity, nationality |
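
To check which default policies would apply to a given topic, the field names can be indexed for lookup. The Python sketch below copies only a few of the default entries, and the `DEFAULT_POLICIES`, `FIELD_INDEX` and `matching_policies` names are hypothetical. It assumes exact, case-sensitive matching on field names, which may differ from what Lenses actually does.

```python
# A few of the default policies: (policy name, impact, redaction, fields).
DEFAULT_POLICIES = [
    ("Social Security Number", "HIGH", "First2",
     ["ssn", "social_security", "social_security_number"]),
    ("Credit Card", "HIGH", "Last4", ["credit_card", "creditcard"]),
    ("Email", "HIGH", "Email", ["email"]),
    ("City", "LOW", "NoObfuscation", ["city"]),
]

# Reverse index: field name -> (policy name, impact, redaction).
FIELD_INDEX = {
    field: (name, impact, redaction)
    for name, impact, redaction, fields in DEFAULT_POLICIES
    for field in fields
}


def matching_policies(schema_fields):
    """Map each schema field covered by a default policy to that policy."""
    return {
        field: FIELD_INDEX[field]
        for field in schema_fields
        if field in FIELD_INDEX
    }


# Hypothetical topic schema, for illustration only.
matches = matching_policies(["id", "email", "city", "amount"])
```

In this example `matches` contains entries for `email` and `city` only, since `id` and `amount` are not covered by any of the listed policies.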