Data Policies

Lenses introduces Data Policies, a new module for protecting your data in motion. Lenses Data Policies enables users to secure and govern sensitive data as it flows, is shared or is analysed via the Lenses Platform.

Protecting sensitive data is not a new problem. Analyzing and acting on data insights in real time requires addressing the risk to data in motion. Data officers can set centralised rules, based on Lenses Data Policies redaction modes, so that sensitive data can be protected while it gets analysed.

Lenses automatically applies the rules to the relevant datasets (in the case of Apache Kafka, the topics), and the changes are reflected across the Lenses endpoints: web user interface, integration APIs, native clients or JDBC driver.

Protect your data in motion

Individuals, as well as businesses, face challenges protecting Personally Identifiable Information (PII). As individuals, we are responsible for exposing our own information to the risk of being sold for malicious usage. However, businesses have a greater liability for exposing sensitive customer data. And since businesses are built on top of people and processes, they are fully responsible for their employees’ actions and for how well their internal processes avoid exposing PII.

Businesses that don’t protect their customers’ and employees’ personally identifiable information risk paying a substantial fine and incurring reputational damage in case of a data breach.

According to the Guide to Protecting the Confidentiality of Personally Identifiable Information (PII) from NIST (National Institute of Standards and Technology), organizations should identify and manage all PII residing in their environments.

Due to its continuous flux, Data-In-Motion poses additional challenges to organizations. With data streaming technologies (like Apache Kafka) being used at the heart of modern systems, the demand for Intelligent Data Protection requires new processes for the Data Officer.

../../_images/dataofficer.png

Data protection requirements

When it comes to protecting personally identifiable information, there are a few fundamental requirements to fulfill:

  • Identify where PII data is stored – without knowing where this information is retained, it is impossible to provide adequate protection.
  • Know who can access your data – a key control for protecting the privacy of data is access control.
  • Create policies for handling PII data – set rules regarding access to the data. With the rules in place, only those who have a business need to access the data have the rights to do so.
  • Educate your users – everyone in your organization handling PII should know the risks of mishandling the data and their responsibilities.

The ever-evolving nature of data and schemas and the adoption of streaming data make it even more challenging to keep the policies up-to-date. The Lenses DataOps platform has been extended in order to give data officers and data stewards control over sensitive data. All the points mentioned earlier, apart from the last, can be achieved easily with Lenses.

Controlling Access and Data

Lenses provides protection on data read. This means the data stored in Apache Kafka, for example, is still unprotected. However, with the Lenses platform sitting on top of the middleware, data access as well as data content can be governed. The former comes via Lenses user group permissions, which make explicit which user can access what data. The latter is enforced through the data officer policies, which control which parts of the data are obfuscated when accessed by users.

Data Taxonomy

Based on the NIST specifications, Lenses provides a set of policies out of the box. This data taxonomy is configurable and can be tuned to the specific business domain and business requirements. Here is an example:

  • Name: Full Name, Maiden Name, Mother’s name, Alias
  • Personal Identification Information: SSN, Passport number, Driver’s license Number, TaxPayer identification number, Patient Identification Number, Financial Account, Credit Card Number, Login name / Username
  • Address Information: Street Address, Email Address, Zip Code, City, Country
  • Asset Information: IP Address, MAC Address
  • Telephone Number: Mobile, Business and Personal number
  • Personally owned property: Vehicle registration Number
  • Personal linkable Information: Date of birth, age, place of birth, religion, race, weight, height, activities, geographical indicators, employment information, medical info, educational info, financial info

Data Obfuscation

Lenses protects the data by obfuscating parts of it. Out of the box, the DataOps platform provides a set of functions defining the obfuscation behavior. Here is the full list:

  • All - Masks the matching fields. For example: ‘(123) 800 2999’ will translate to ‘**** * **’.
  • None - No obfuscation is applied. The matching field values stay as they are.
  • Email - Masks matching emails, keeping only the first character unmasked (for example, an address such as ‘support@lenses.io’).
  • Initials - Extracts the first character of every word. For example: ‘Lenses all the way’ will translate to ‘L a t w’.
  • First-1 - Masks matching fields, keeping the first character unmasked. For example: ‘Lenses’ will translate to ‘L*****’.
  • First-2 - Masks matching fields, keeping the first 2 characters unmasked. For example: ‘Lenses’ will translate to ‘Le****’.
  • First-3 - Masks matching fields, keeping the first 3 characters unmasked. For example: ‘Lenses’ will translate to ‘Len***’.
  • First-4 - Masks matching fields, keeping the first 4 characters unmasked. For example: ‘Lenses’ will translate to ‘Lens**’.
  • Last-1 - Masks matching fields, keeping the last character unmasked. For example: ‘Lenses’ will translate to ‘*****s’.
  • Last-2 - Masks matching fields, keeping the last 2 characters unmasked. For example: ‘Lenses’ will translate to ‘****es’.
  • Last-3 - Masks matching fields, keeping the last 3 characters unmasked. For example: ‘Lenses’ will translate to ‘***ses’.
  • Last-4 - Masks matching fields, keeping the last 4 characters unmasked. For example: ‘Lenses’ will translate to ‘**nses’.
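The First-N, Last-N and Initials modes described above are simple enough to sketch in a few lines. The Python snippet below is an illustration only and not Lenses code: the function names keep_first, keep_last and initials are hypothetical, and the use of * as the mask character is taken from the examples above.

```python
MASK = "*"  # mask character, as seen in the documented examples


def keep_first(value: str, n: int) -> str:
    """First-N style: keep the first n characters, mask the rest."""
    return value[:n] + MASK * max(len(value) - n, 0)


def keep_last(value: str, n: int) -> str:
    """Last-N style: keep the last n characters, mask the rest."""
    if n <= 0:
        return MASK * len(value)
    return MASK * max(len(value) - n, 0) + value[-n:]


def initials(value: str) -> str:
    """Initials style: keep the first character of every whitespace-separated word."""
    return " ".join(word[0] for word in value.split())


print(keep_first("Lenses", 2))         # Le****
print(keep_last("Lenses", 2))          # ****es
print(initials("Lenses all the way"))  # L a t w
```

The sketch masks one character per hidden character, which matches the length-preserving examples above. How Lenses treats values shorter than N is not stated in this page, so the behaviour here (the value is returned unmasked) is an assumption.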

Confidentiality Impact Level

Not all data has the same confidentiality impact level. Fields that can be used to fully identify a person (e.g. Passport or Social Security Number) have to be treated with additional care in comparison to information like Country or Post-Code, which can only partially reveal the identity of a subject. Lenses provides three levels of impact: high, medium and low.

Managing data policies

Add

Data protection is enabled via data policies. Data stewards, and ultimately the Chief Data Officer, know which data is sensitive and needs to be protected. To add a new policy, navigate to the Policies screen:

../../_images/navigate.png

On the data policies page, the + New policy button adds a new entry.

../../_images/new_entry.png

Once the input has been provided, the Save button will persist the policy.

  • Policy Name - A short description of what the policy is for. For example: Credit Card
  • Redaction - The obfuscation method to use. For example: Last-4
  • Category - Provides a logical grouping for the data policies. For example: Address
  • Impact - Sets the confidentiality level
  • Fields - A collection of data field names to protect with the new data policy rule. If the field creditCard is added, for example, Lenses will make sure the field value (irrespective of the actual table) is not rendered in plain text to the user.
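To make the shape of a policy concrete, here is a minimal Python sketch of the five inputs listed above, and of how a policy could be applied by field name regardless of which table a record comes from. It is an illustration under stated assumptions, not the Lenses API: DataPolicy, redact_record and last_4 are hypothetical names, records are assumed to be flat string-valued dictionaries, and the category and impact values are picked for the example only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


def last_4(value: str) -> str:
    """Hypothetical Last-4 redaction: mask everything but the last 4 characters."""
    return "*" * max(len(value) - 4, 0) + value[-4:]


@dataclass(frozen=True)
class DataPolicy:
    name: str                        # Policy Name, e.g. "Credit Card"
    redaction: Callable[[str], str]  # Redaction, e.g. Last-4
    category: str                    # Category, a logical grouping
    impact: str                      # Impact, the confidentiality level
    fields: FrozenSet[str]           # Fields protected by this policy


def redact_record(record: Dict[str, str], policy: DataPolicy) -> Dict[str, str]:
    """Return a copy of the record with every field named in the policy redacted."""
    return {
        key: policy.redaction(value) if key in policy.fields else value
        for key, value in record.items()
    }


policy = DataPolicy(
    name="Credit Card",
    redaction=last_4,
    category="PII",
    impact="HIGH",
    fields=frozenset({"creditCard"}),
)

row = {"customer": "c-42", "creditCard": "4111111111111111"}
print(redact_record(row, policy))
# {'customer': 'c-42', 'creditCard': '************1111'}
```

Because the lookup is keyed on the field name alone, the same policy covers any table that happens to contain a creditCard field, which is the behaviour the Fields input describes.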

Update

An existing data policy can be easily updated. While on the Policies screen, click on the entry that needs to be updated. The web application will take the user to the data policy details page. Clicking on the Edit button allows all the policy parameters to be set.

../../_images/edit.png

Remove

To remove a data policy entry, navigate to the Policies screen and click on the bin icon for the target entry. A confirmation dialog will appear. Once confirmed, the entry is removed and the field protection it provided no longer applies.

Data policy details

Adding a data policy will impact the results returned by the SQL engine. If a table’s data type contains the fields specified in the policy, the values of those fields will be redacted on each query. Each policy can potentially impact different tables. A list of all affected tables can be seen on the policy details view.
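Conceptually, the list of affected tables follows from a membership check between the policy fields and each table's schema. A rough Python sketch, assuming each schema is available as a flat set of field names (the schemas mapping, the table names and the affected_tables function are made up for illustration and do not correspond to a Lenses API):

```python
from typing import Dict, List, Set


def affected_tables(policy_fields: Set[str], schemas: Dict[str, Set[str]]) -> List[str]:
    """Return the tables whose schema contains at least one of the policy fields."""
    return sorted(
        table for table, fields in schemas.items() if fields & policy_fields
    )


# Hypothetical schemas: table name -> field names
schemas = {
    "payments": {"id", "amount", "creditCard"},
    "customers": {"id", "email", "creditCard"},
    "page_views": {"url", "timestamp"},
}

print(affected_tables({"creditCard"}, schemas))  # ['customers', 'payments']
```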

The policy details page highlights the risk of exposing the data. For each policy, the user can see the applications using the topics where the policy entry fields are present. They are grouped into three categories: Lenses Connectors, Lenses SQL Processors and Custom Applications. The screenshot below gives you an example of a data policy entry for a field named creditCard:

../../_images/affected_apps.png

Default Data Policies

Policies are defined and applied at the record field level. We have identified the most commonly used fields, in accordance with the national standards in the US and EU that organisations need to comply with, and you can optionally load them when you get started with Lenses policies.

Note: When you load the default data policies, the privacy policy is applied to any topic that has fields matching the ones below whenever you access data from the Lenses web UI, endpoints or native clients.

Policy Name             Category           Impact  Redaction Policy  Fields
Social Security Number  PII                HIGH    First2            ssn, social_security, social_security_number
Passport                PII                HIGH    First2            passport, passport_number, national_id
Drivers License         PII                HIGH    First2            driver_license
Tax Payer ID            PII                HIGH    First2            tax_payer_id, taxpayerid, unique_taxpayer, nino, utr, tin, atin, itin, tax_reference
Patient ID              PII                HIGH    First2            patient_id, patientID
Financial Account       PII                HIGH    Last4             account_number, sort_code, accountnumber, sortcode
Credit Card             PII                HIGH    Last4             credit_card, creditcard
Email                   PII                HIGH    Email             email
User Name               PII                LOW     NoObfuscation     username, user_name, login_name
Full Name               Name               HIGH    NoObfuscation     full_name, fullname
Surname                 Name               MEDIUM  NoObfuscation     surname, lastname, last_name
Maiden Name             Name               MEDIUM  NoObfuscation     maiden_name
Mother’s Name           Name               MEDIUM  NoObfuscation     mother_name
First Name              Name               MEDIUM  NoObfuscation     name
Street Address          Address            MEDIUM  Initials          home_address, street_address, address
Post Code               Address            LOW     NoObfuscation     post_code, postcode, zip_code, zipcode
City                    Address            LOW     NoObfuscation     city
Country                 Address            LOW     NoObfuscation     country
IP Address              Asset              MEDIUM  First4            ip_address, ipaddress
MAC Address             Asset              MEDIUM  First4            mac_address, macaddress
Phone number            Asset              MEDIUM  First3            phone_number, mobile_number, mobile_phone
Vehicle number          Asset              MEDIUM  First2            vehicle_registration_number, vehicle_number
Date of birth           Personal Linkable  LOW     NoObfuscation     date_of_birth, dob
Place of birth          Personal Linkable  LOW     NoObfuscation     place_of_birth
Religion                Personal Linkable  LOW     NoObfuscation     religion
Nationality             Personal Linkable  LOW     NoObfuscation     ethnicity, nationality