Lenses introduces Data Policies, a new module for protecting your data in motion. Data Policies enable users to secure and govern sensitive data as it flows, is shared, or is analysed via the Lenses platform.
Protecting sensitive data is not a new problem, but analyzing and acting on data insights in real time requires addressing the risk to data in motion. Data officers can set centralised rules, based on the Lenses Data Policies redaction modes, so that sensitive data stays protected while it is analysed.
Lenses automatically applies the rules to the relevant datasets (in the case of Apache Kafka, the topics), and the changes are reflected across the Lenses endpoints: the web user interface, integration APIs, native clients and the JDBC driver.
Protect your data in motion¶
Individuals, as well as businesses, face challenges protecting Personally Identifiable Information (PII). As individuals, we are responsible for exposing our own information to the risk of being sold for malicious use. Businesses, however, carry a greater liability for exposing sensitive customer data. And since businesses are built on people and processes, they are fully responsible for their employees’ actions and for how well their internal processes avoid exposing PII.
Businesses that don’t protect their customers’ and employees’ personally identifiable information risk paying a substantial fine and suffering reputational damage in case of a data breach.
According to the Guide to Protecting the Confidentiality of Personally Identifiable Information (PII) from NIST (National Institute of Standards and Technology), organizations should identify and manage all PII residing in their environments.
Because it is in continuous flux, data in motion poses additional challenges to organizations. With data streaming technologies (like Apache Kafka) at the heart of modern systems, intelligent data protection demands new processes from the data officer.
Data protection requirements¶
When it comes to protecting personally identifiable information, there are a few fundamental requirements to fulfill:
- Identify where PII data is stored – without knowing where this information is retained, it is impossible to provide adequate protection.
- Know who can access your data – a key control for protecting the privacy of data is access control.
- Create policies for handling PII data – set rules regarding access to the data. With the rules in place, only those who have a business need for the data have the rights to access it.
- Educate your users – everyone in your organization who handles PII should know the risks of mishandling the data and their responsibilities.
The ever-evolving nature of data and schemas and the adoption of streaming data make it even more challenging to keep the policies up-to-date. The Lenses DataOps platform has been extended in order to give data officers and data stewards control over sensitive data. All the points mentioned earlier, apart from the last, can be achieved easily with Lenses.
Controlling Access and Data¶
Lenses protects data on read. This means that the data stored in Apache Kafka, for example, is itself still unprotected. However, because the Lenses platform sits on top of the middleware, both data access and data content can be governed. Access is governed via Lenses user group permissions, which make explicit which users can access which data. Content is governed through the policies that data officers put in place to control which parts of the data are obfuscated when users access it.
Based on the NIST specifications, Lenses provides a set of policies out of the box. This data taxonomy is configurable and can be tuned to the specific business domain and business requirements. Here is an example:
- Name: Full Name, Maiden Name, Mother’s name, Alias
- Personal Identification Information: SSN, Passport number, Driver’s license number, Taxpayer identification number, Patient identification number, Financial account, Credit card number, Login name / Username
- Address Information: Street Address, Email Address, Zip Code, City, Country
- Asset Information: IP Address, MAC Address
- Telephone Number: Mobile, Business and Personal number
- Personally owned property: Vehicle registration Number
- Personal linkable Information: Date of birth, age, place of birth, religion, race, weight, height, activities, geographical indicators, employment information, medical info, educational info, financial info
Lenses protects the data by obfuscating parts of it. Out of the box, the DataOps platform provides a set of functions defining the obfuscation behavior. Here is the full list:
- Masks the matching fields. For example: ‘(123) 800 2999’ will translate to ‘**** * **’.
- None - No obfuscation is applied. The matching field values stay as they are.
- Extracts the first character of every word. For example: ‘Lenses all the way’ will translate to ‘L a t w’.
- Masks matching fields, keeping the first character unmasked. For example: ‘Lenses’ will translate to ‘L*****’.
- Masks matching fields, keeping the first 2 characters unmasked. For example: ‘Lenses’ will translate to ‘Le****’.
- Masks matching fields, keeping the first 3 characters unmasked. For example: ‘Lenses’ will translate to ‘Len***’.
- Masks matching fields, keeping the first 4 characters unmasked. For example: ‘Lenses’ will translate to ‘Lens**’.
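The obfuscation behaviors described above can be sketched in a few lines of Python. This is only an illustration of the transformations, not Lenses code: the function names are hypothetical, `mask_all` assumes every character (including spaces) is replaced, and `keep_last` assumes that a "Last-N" redaction mirrors the "first N" behavior, which the text does not spell out.

```python
def mask_all(value: str) -> str:
    """Replace every character with '*' (assumption: spaces are masked too)."""
    return "*" * len(value)


def initials(value: str) -> str:
    """Keep only the first character of every whitespace-separated word."""
    return " ".join(word[0] for word in value.split())


def keep_first(value: str, n: int) -> str:
    """Leave the first n characters readable and mask the rest."""
    return value[:n] + "*" * max(len(value) - n, 0)


def keep_last(value: str, n: int) -> str:
    """Mask everything except the last n characters (assumed 'Last-N' behavior)."""
    cut = max(len(value) - n, 0)
    return "*" * cut + value[cut:]


print(initials("Lenses all the way "))  # L a t w
print(keep_first("Lenses", 1))          # L*****
print(keep_first("Lenses", 4))          # Lens**
```

Values shorter than `n` are returned unchanged by `keep_first` and `keep_last`; whether Lenses treats that case the same way is not stated here.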
Confidentiality Impact Level¶
Not all data has the same confidentiality impact level. Fields that can fully identify a person (e.g. a passport or Social Security number) have to be treated with more care than information like country or post code, which can only partially reveal the identity of a subject. Lenses provides three levels of impact: high, medium and low.
Managing data policies¶
Data protection is enabled via data policies. Data stewards, and ultimately the Chief Data Officer, know which data is sensitive and needs to be protected. To add a new policy, navigate to the _Policies_ screen:
On the data policies page, the + New policy button lets you add a new entry.
Provide the following inputs, then click the Save button to persist the policy:
- Policy Name - A short description of what the policy is for. For example: Credit Card
- Redaction - The obfuscation method to use. For example: Last-4
- Category - Provides a logical grouping for the data policies. For example: Address
- Impact - Sets the confidentiality level
- Fields - A collection of data field names to protect with the new data policy rule. If the field creditCard is added, for example, Lenses will make sure the value of that field (irrespective of the actual table) is not rendered in plain text to the user.
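To make the policy inputs concrete, here is a minimal Python sketch of a policy and of how it could be applied to a record on read. It is an illustration under stated assumptions, not the Lenses implementation: `DataPolicy`, `last4` and `redact` are hypothetical names, field names are matched exactly and case-sensitively, and nested objects are walked recursively.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class DataPolicy:
    """Mirrors the inputs listed above (hypothetical structure)."""
    name: str
    redaction: Callable[[str], str]
    category: str
    impact: str
    fields: frozenset


def last4(value: str) -> str:
    """Assumed 'Last-4' behavior: mask all but the last four characters."""
    cut = max(len(value) - 4, 0)
    return "*" * cut + value[cut:]


def redact(record: dict, policies: list) -> dict:
    """Return a copy of the record with every policy-covered field obfuscated."""
    by_field = {field: p for p in policies for field in p.fields}
    out: dict = {}
    for key, value in record.items():
        if isinstance(value, dict):
            out[key] = redact(value, policies)  # descend into nested objects
        elif key in by_field and value is not None:
            out[key] = by_field[key].redaction(str(value))
        else:
            out[key] = value
    return out


policy = DataPolicy("Credit Card", last4, "PII", "HIGH", frozenset({"creditCard"}))
print(redact({"customer": "c-1", "creditCard": "4111111111111111"}, [policy]))
# {'customer': 'c-1', 'creditCard': '************1111'}
```

Because the lookup is keyed on the field name alone, the same rule applies to any table that carries a creditCard field, which is the behavior the Fields input describes.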
An existing data policy can be easily updated. On the Policies screen, click the entry that needs to be updated. The web application will take you to the data policy details page, where clicking the Edit button lets you set all the policy parameters.
To remove a data policy, navigate to the Policies screen and click the bin icon for the target entry. A confirmation dialog will appear. Once confirmed, the entry is removed and the field protection it provided no longer applies.
Data policy details¶
Adding a data policy will impact the results returned by the SQL engine. If a table’s data type contains fields specified in the policy, the values of those fields will be redacted on every query. Each policy can potentially impact different tables. A list of all affected tables can be seen on the policy details view.
The policy details page highlights the risk of exposing the data. For each policy, the user can see the applications that use the topics where the policy’s fields are present. The applications are grouped into three categories: Lenses Connectors, Lenses SQL Processors and Custom Applications. The screenshot below gives an example of a data policy entry for a field named creditCard:
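The idea of "affected tables" can be illustrated with a short Python sketch: a table is affected when its schema contains at least one field named in the policy. The function name and the example schemas are hypothetical, and exact field-name matching is an assumption.

```python
def affected_tables(policy_fields, schemas):
    """List the tables whose schema contains at least one policy field."""
    wanted = set(policy_fields)
    return sorted(
        table for table, fields in schemas.items() if wanted & set(fields)
    )


# Hypothetical table schemas, reduced to their field names.
schemas = {
    "payments": ["id", "amount", "creditCard"],
    "customers": ["id", "full_name", "creditCard"],
    "clicks": ["id", "url"],
}

print(affected_tables(["creditCard"], schemas))  # ['customers', 'payments']
```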
Default Data Policies¶
Policies are defined and applied at the record field level. We have identified the most commonly used fields, in accordance with the national standards in the US and EU that organisations need to comply with, and you can optionally load them when you get started with Lenses policies.
|Policy Name||Category||Impact||Redaction Policy||Fields|
|Social Security Number||PII||HIGH||First2||ssn, social_security, social_security_number|
|Passport||PII||HIGH||First2||passport, passport_number, national_id|
|Tax Payer ID||PII||HIGH||First2||tax_payer_id, taxpayerid, unique_taxpayer, nino, utr, tin, atin, itin, tax_reference|
|Patient ID||PII||HIGH||First2||patient_id, patientID|
|Financial Account||PII||HIGH||Last4||account_number, sort_code, accountnumber, sortcode|
|Credit Card||PII||HIGH||Last4||credit_card, creditcard|
|User Name||PII||LOW||NoObfuscation||username, user_name, login_name|
|Full Name||Name||HIGH||NoObfuscation||full_name, fullname|
|Surname||Name||MEDIUM||NoObfuscation||surname, lastname, last_name|
|Street Address||Address||MEDIUM||Initials||home_address, street_address, address|
|Post Code||Address||LOW||NoObfuscation||post_code, postcode, zip_code, zipcode|
|IP Address||Asset||MEDIUM||First4||ip_address, ipaddress|
|MAC Address||Asset||MEDIUM||First4||mac_address, macaddress|
|Phone number||Asset||MEDIUM||First3||phone_number, mobile_number, mobile_phone|
|Vehicle number||Asset||MEDIUM||First2||vehicle_registration_number, vehicle_number|
|Date of birth||Personal Linkable||LOW||NoObfuscation||date_of_birth, dob|
|Place of birth||Personal Linkable||LOW||NoObfuscation||place_of_birth|
|Nationality||Personal Linkable||LOW||NoObfuscation||ethnicity, nationality|
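As a rough illustration of how such a table can be used, the Python sketch below indexes a few of the rows above by field name and reports which fields of a schema are covered, highest impact first. The rows are copied from the table; the function name, the data layout and the exact, case-sensitive matching are assumptions made for the example.

```python
# A subset of the default policies: (policy name, impact, redaction, fields).
DEFAULT_POLICIES = [
    ("Social Security Number", "HIGH", "First2",
     ["ssn", "social_security", "social_security_number"]),
    ("Credit Card", "HIGH", "Last4", ["credit_card", "creditcard"]),
    ("Street Address", "MEDIUM", "Initials",
     ["home_address", "street_address", "address"]),
    ("Post Code", "LOW", "NoObfuscation",
     ["post_code", "postcode", "zip_code", "zipcode"]),
    ("Date of birth", "LOW", "NoObfuscation", ["date_of_birth", "dob"]),
]

IMPACT_ORDER = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}

# Field name -> (policy name, impact, redaction).
FIELD_INDEX = {
    field: (name, impact, redaction)
    for name, impact, redaction, fields in DEFAULT_POLICIES
    for field in fields
}


def sensitive_fields(schema_fields):
    """Return (field, policy, impact, redaction) tuples, highest impact first."""
    hits = [(f, *FIELD_INDEX[f]) for f in schema_fields if f in FIELD_INDEX]
    return sorted(hits, key=lambda hit: (IMPACT_ORDER[hit[2]], hit[0]))


for hit in sensitive_fields(["id", "dob", "ssn", "address"]):
    print(hit)
# ('ssn', 'Social Security Number', 'HIGH', 'First2')
# ('address', 'Street Address', 'MEDIUM', 'Initials')
# ('dob', 'Date of birth', 'LOW', 'NoObfuscation')
```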