blackline-core

Blackline’s open-source project makes implementing GDPR compliance easy for developers. Blackline’s CLI tool allows you to define your data, set retention periods, de-identify data and fulfil user access and forget-me requests at scale. Check out our docs to get started!

Features

  • Supports common databases out of the box, and flexible enough to easily add support for additional data stores.
  • Eliminates the need to write custom queries.
  • Maintains the structural integrity of your data.
  • Easily inject conditions to suit specific use cases.
  • Project definitions in YAML.
  • Provides a consolidated, single point of collaboration for managing data privacy compliance.

Docs

Please check the blackline documentation!

Installation

Install the latest version of blackline-core

pip install blackline-core

blackline-core includes an adapter for SQLite databases. Multiple adapters can be installed. Please see the supported data stores section (https://docs.getblackline.com/supported-platforms/) for more information regarding adapters. To include additional adapters, use pip install blackline-<adapter name>. For example, the PostgreSQL adapter is installed using

pip install blackline-postgres

Quickstart

Requirements

  • Python 3.9+

Getting Started

  1. The quickest way to start a Blackline project is to use blackline init -p <folder name>. This creates the required folder structure and project file. You can change the names of these folders, but it is important that the folder layouts under “adapters” and “catalogue” mirror each other.

foo@bar:~$ pip install blackline-core

foo@bar:~$ blackline init -p quickstart
Initialized blackline project at: quickstart

# apt-get install tree
foo@bar:~$ tree quickstart
quickstart/
├── adapters
│   └── organization
│       └── system
│           └── resource
│               └── dataset.yaml
├── blackline_project.yml
└── catalogue
    └── organization
        ├── organization.yaml
        └── system
            ├── resource
            │   ├── dataset
            │   │   └── dataset.yaml
            │   └── resource.yaml
            └── system.yaml
9 directories, 6 files
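
If you rename folders, a quick check can confirm that the two trees still line up. The following is a minimal sketch, not part of Blackline: the helper name missing_in_catalogue is hypothetical, and it assumes that "mirror" means every folder under adapters/ has a folder with the same relative path under catalogue/.

```python
from pathlib import Path


def missing_in_catalogue(project: Path) -> list[str]:
    """Return adapter sub-folders that have no counterpart under catalogue/.

    Hypothetical helper: assumes "mirroring" means matching relative
    folder paths, which is an interpretation of the layout shown above.
    """
    adapters = project / "adapters"
    catalogue = project / "catalogue"
    missing = []
    for folder in sorted(p for p in adapters.rglob("*") if p.is_dir()):
        relative = folder.relative_to(adapters)
        if not (catalogue / relative).is_dir():
            missing.append(relative.as_posix())
    return missing
```

Calling missing_in_catalogue(Path("quickstart")) on a freshly initialized project should return an empty list, since the generated trees share the organization/system/resource folders.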
2. Before working on your production data, familiarise yourself with how Blackline works with our sample database. To create this:

foo@bar:~$ blackline sample -p quickstart --data-only
Created sample data at: quickstart
3. In order to interact with the sample database, you’ll need to define the adapter configuration located at: quickstart/adapters/organization/system/resource/dataset.yaml

quickstart/adapters/organization/system/resource/dataset.yaml
profiles:
  dev:
    type: sqlite
    config:
      connection:
        database: "blackline_sample.db"
        uri: true # TODO: check if required
4. To ensure the adapter is configured correctly, test the connection using:

foo@bar:~$ blackline debug -p quickstart --profile dev
Testing connections for profile: dev
  dataset: good
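
To see what a connection check like this amounts to for SQLite, here is a minimal sketch using only the standard library. It is not Blackline's implementation: the function can_connect is hypothetical, and it assumes the adapter hands the database and uri values from the profile to sqlite3.connect.

```python
import sqlite3
from contextlib import closing


def can_connect(database: str, uri: bool = True) -> bool:
    """Open the database and run a trivial query.

    Returns False on any sqlite3 error instead of raising.
    Hypothetical helper, not a Blackline API.
    """
    try:
        with closing(sqlite3.connect(database, uri=uri)) as conn:
            conn.execute("SELECT 1").fetchone()
        return True
    except sqlite3.Error:
        return False
```

With uri=True, a plain file name such as "blackline_sample.db" is still treated as an ordinary path. A URI such as "file:blackline_sample.db?mode=rw" additionally refuses to create the file when it is missing, so a wrong path shows up as a failed check rather than as a new, empty database.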
5. Explore the sample database created in step 2 to view the unmodified data.

explore_data.py
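from sqlite3 import connect

# Open the sample database created in step 2.
conn = connect("blackline_sample.db")

# List the tables in the database.
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table';"
).fetchall()

# Each row is a one-element tuple, so unpack the table name.
print([name for (name,) in tables])
# ['user', 'shipment']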

user = conn.execute("SELECT * FROM user")
shipment = conn.execute("SELECT * FROM shipment")
User
print([column[0] for column in user.description])
['id', 'name', 'email', 'ip', 'verified', 'created_at']
print(user.fetchall())
[
    ('00', 'Bar', 'bar@example.com', '555.444.3.2', 1, '2021-02-01 00:00:00'),
    ('01', 'Biz', 'biz@example.com', '555.444.3.3', 1, '2022-06-01 00:00:00'),
    ('02', 'Baz', 'baz@example.com', '555.444.3.4', 0, '2022-02-01 00:00:00'),
    ('03', 'Cat', 'cat@example.com', '555.444.3.5', 1, '2023-01-01 00:00:00'),
    ('04', 'Dog', 'dog@example.com', '555.444.3.6', 0, '2023-01-01 00:00:00')
]

Shipment
print([column[0] for column in shipment.description])
['id', 'user_id', 'order_date', 'street', 'postcode', 'city', 'status']
print(shipment.fetchall())
[
    ('00', '01', '2022-06-01 00:00:00', 'Ceintuurbaan 282', '1072 GK', 'Amsterdam', 'delivered'),
    ('01', '02', '2022-03-01 00:00:00', 'Singel 542', '1017 AZ', 'Amsterdam', 'delivered'),
    ('02', '02', '2022-04-15 00:00:00', 'Singel 542', '1017 AZ', 'Amsterdam', 'delivered'),
    ('03', '03', '2023-01-05 00:00:00', 'Wibautstraat 150', '1091 GR', 'Amsterdam', 'delivered'),
    ('04', '03', '2023-01-06 00:00:00', 'Wibautstraat 150', '1091 GR', 'Amsterdam', 'returned'),
    ('05', '03', '2023-01-06 00:00:00', 'Wibautstraat 150', '1091 GR', 'Amsterdam', 'delivered')
]
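
Tuples are hard to read once a table has more than a few columns. As an optional convenience, the sketch below returns rows as dictionaries keyed by column name, using sqlite3.Row from the standard library. The helper rows_as_dicts is hypothetical and not part of Blackline.

```python
import sqlite3


def rows_as_dicts(conn: sqlite3.Connection, table: str) -> list[dict]:
    """Return every row of `table` as a dict keyed by column name."""
    cursor = conn.cursor()
    # sqlite3.Row lets each row be converted to a dict of column -> value.
    cursor.row_factory = sqlite3.Row
    # Table names cannot be bound as parameters, so only pass trusted names.
    cursor.execute(f'SELECT * FROM "{table}"')
    return [dict(row) for row in cursor]
```

For example, rows_as_dicts(conn, "user") on the sample database would yield one dictionary per user with the keys id, name, email, ip, verified, and created_at.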
6. Populate the catalogue YAML files with metadata that defines the organisation, system, and resource. You can copy the sample files below.

quickstart/catalogue/organization/organization.yaml
organization:
  - name: organization_demo
quickstart/catalogue/organization/system/system.yaml
system:
  - name: system_demo
quickstart/catalogue/organization/system/resource/resource.yaml
resource:
  - name: resource_demo
    resource_type: Service
    privacy_declarations:
      - name: Analyze customer behaviour for improvements.
        data_categories:
          - user.contact
          - user.contact.address
        data_use: improve.system
        data_subjects:
          - customer
        data_qualifier: identified_data

  7. Define the de-identification methods, values, and retention periods for each field containing personally identifiable information.
    quickstart/catalogue/organization/system/resource/dataset/dataset.yaml
    dataset:
      - name: Demo Database
        description: Demo database for Blackline
        collections:
          user:
            name: user
            description: User collection
            datetime_field:
              name: created_at
            fields:
              - name: name
                description: Name of user
                deidentifier:
                  type: redact
                period: P365D
              - name: email
                deidentifier:
                  type: replace
                  value: fake@email.com
                period: P365D
              - name: ip
                deidentifier:
                  type: mask
                  value: "#"
                period: P280D
          shipment:
            name: shipment
            datetime_field:
              name: order_date
            fields:
              - name: street
                deidentifier:
                  type: redact
                period: P185D
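
The three de-identifier types used above can be pictured as small functions. This is a sketch only, inferred from the sample results shown at the end of this quickstart, not Blackline's code: it assumes redact stores NULL, replace swaps in the configured value, and mask substitutes letters and digits while leaving separators such as "." in place.

```python
from typing import Optional


def redact(value: Optional[str]) -> None:
    """Drop the value entirely (stored as NULL in the database)."""
    return None


def replace(value: Optional[str], new_value: str) -> str:
    """Swap the original value for a fixed replacement."""
    return new_value


def mask(value: str, mask_char: str = "#") -> str:
    """Replace every letter and digit, keeping separators (assumed behaviour)."""
    return "".join(mask_char if ch.isalnum() else ch for ch in value)


print(mask("555.444.3.2"))  # ###.###.#.#
```

Keeping the separators means a masked value keeps its original shape, which can help downstream code that expects a particular format.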
    
  8. Run de-identification from the root of the Blackline project.

foo@bar:~$ cd quickstart
foo@bar:~$ blackline run --profile dev --start-date 2023-01-01
Running project: /quickstart
Running profile: dev
Running start date: 2023-01-01 00:00:00
Finished project: /quickstart
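
The retention periods in dataset.yaml, such as P365D, are ISO 8601 durations, and --start-date is the date they are measured back from. The sketch below shows one way to reason about which rows a run affects. It is an assumption-laden illustration, not Blackline's logic: the helpers parse_days and is_expired are hypothetical, only day-based durations are handled, and a row is treated as expired when its timestamp is strictly earlier than the start date minus the period.

```python
import re
from datetime import datetime, timedelta

# Only the day-based form used in this quickstart (for example "P365D").
_DAYS = re.compile(r"^P(\d+)D$")


def parse_days(period: str) -> timedelta:
    """Convert a day-based ISO 8601 duration into a timedelta."""
    match = _DAYS.match(period)
    if match is None:
        raise ValueError(f"unsupported period: {period!r}")
    return timedelta(days=int(match.group(1)))


def is_expired(timestamp: str, period: str, start_date: str) -> bool:
    """True if the timestamp is older than start_date minus the period."""
    cutoff = datetime.fromisoformat(start_date) - parse_days(period)
    return datetime.fromisoformat(timestamp) < cutoff


# A user created on 2021-02-01 is past a 365-day period on 2023-01-01.
print(is_expired("2021-02-01 00:00:00", "P365D", "2023-01-01"))  # True
# A user created on 2022-02-01 is not.
print(is_expired("2022-02-01 00:00:00", "P365D", "2023-01-01"))  # False
```

Under these assumptions, the street field with period P185D expires for shipments ordered before 2022-06-30 when the start date is 2023-01-01.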
9. Explore the de-identified data.

explore_data.py
from sqlite3 import connect

# Re-open the sample database after the de-identification run.
conn = connect("blackline_sample.db")

# Query both tables again to compare against the unmodified data.
user = conn.execute("SELECT * FROM user")
shipment = conn.execute("SELECT * FROM shipment")
User
print([column[0] for column in user.description])
['id', 'name', 'email', 'ip', 'verified', 'created_at']
print(user.fetchall())
[
  ('00', None, 'fake@email.com', '###.###.#.#', 1, '2021-02-01 00:00:00'),
  ('01', 'Biz', 'biz@example.com', '555.444.3.3', 1, '2022-06-01 00:00:00'),
  ('02', 'Baz', 'baz@example.com', '###.###.#.#', 0, '2022-02-01 00:00:00'),
  ('03', 'Cat', 'cat@example.com', '555.444.3.5', 1, '2023-01-01 00:00:00'),
  ('04', 'Dog', 'dog@example.com', '555.444.3.6', 0, '2023-01-01 00:00:00')
]
Shipment
print([column[0] for column in shipment.description])
['id', 'user_id', 'order_date', 'street', 'postcode', 'city', 'status']
print(shipment.fetchall())
[
  ('00', '01', '2022-06-01 00:00:00', None, '1072 GK', 'Amsterdam', 'delivered'),
  ('01', '02', '2022-03-01 00:00:00', None, '1017 AZ', 'Amsterdam', 'delivered'),
  ('02', '02', '2022-04-15 00:00:00', None, '1017 AZ', 'Amsterdam', 'delivered'),
  ('03', '03', '2023-01-05 00:00:00', 'Wibautstraat 150', '1091 GR', 'Amsterdam', 'delivered'),
  ('04', '03', '2023-01-06 00:00:00', 'Wibautstraat 150', '1091 GR', 'Amsterdam', 'returned'),
  ('05', '03', '2023-01-06 00:00:00', 'Wibautstraat 150', '1091 GR', 'Amsterdam', 'delivered')
]
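
For a larger database, scanning printed rows is impractical. One rough way to summarise what a run redacted is to count NULL values per column. The helper below is a hypothetical sketch using only the standard library, and it assumes that redacted fields are stored as NULL, as the output above suggests.

```python
import sqlite3


def null_counts(conn: sqlite3.Connection, table: str) -> dict:
    """Map each column of `table` to the number of rows where it is NULL."""
    # A zero-row SELECT still populates cursor.description with column names.
    # Table and column names are interpolated, so only pass trusted names.
    description = conn.execute(f'SELECT * FROM "{table}" LIMIT 0').description
    columns = [column[0] for column in description]
    return {
        column: conn.execute(
            f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" IS NULL'
        ).fetchone()[0]
        for column in columns
    }
```

On the shipment table shown above, this would report 3 for street and 0 for the remaining columns.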

Contributing

This project is new and could use your help. Please open an issue or make a feature request. If you would like to contribute code, fork blackline-core, commit your changes, and make a pull request.

Code of Conduct

Blackline is a Python project, so we follow the PSF Code of Conduct. In general, be a decent and polite human.