Data Classifier

Identify what data types are being processed or stored in every data source across your inventory.

What is the Data Classifier? 

Once you've identified all data sources across your organization, you can use our technology to identify what data types are being processed or stored in each data source.

Mine supports three different technologies for classification of data types, which are reflected in the 'Scan method' column of the Data Classifier screen:

  1. MineAI
    Mine uses machine learning to analyze each vendor's privacy policy, its supported features, and your own business (company size, industry, etc.). How does it work? We analyze three inputs:
    • Auto-scan and analysis of legal documents (privacy policy, ToS, DPA)
    • Analysis of the data source's capabilities and features
    • Analysis of your company (size, vertical, etc.)
    From these inputs, we predict which data types are being processed in each data source.
  2. Smart Data Sampling
    MineOS provides a more cost-effective option for companies that want the overall picture of a data source without a complex, intensive, and time-consuming scan.
    How it works: Our algorithms map the data source and sample its data selectively. They scan a minimal share of the data and files, usually between 0.5% and 50%, to provide you with 100% of the core data required for compliance.
  3. Full scan
    Using an API integration for specific data sources (e.g. Salesforce, S3, GoogleDrive, BigQuery, etc.), we classify what types of information are stored there (e.g. credit card records, emails, location, etc.) by exhaustively scanning through potentially terabytes of data to find all data possible: core data, incremental data, outliers, and more. The comprehensive scan is ideal as the basis for policy rules that govern data and create alerts.
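
To make the difference between sampling and a full scan more concrete, here is a minimal Python sketch. It is not how MineOS works internally: the two regular expressions, the helper names `classify` and `scan`, and the `sample_rate` parameter are all assumptions made for illustration, and real classifiers are far more robust than a pair of patterns.

```python
import random
import re
from collections import Counter

# Hypothetical, deliberately simple detectors (illustration only).
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def classify(record: str) -> set[str]:
    """Return the data types whose pattern matches the record."""
    return {name for name, pattern in PATTERNS.items() if pattern.search(record)}


def scan(records: list[str], sample_rate: float = 1.0, seed: int = 0) -> Counter:
    """Count records per data type over a random sample (1.0 means a full scan)."""
    if not 0 < sample_rate <= 1:
        raise ValueError("sample_rate must be in (0, 1]")
    # Always look at at least one record when there is any data at all.
    size = max(1, round(len(records) * sample_rate)) if records else 0
    sample = random.Random(seed).sample(records, size)
    counts = Counter()
    for record in sample:
        counts.update(classify(record))
    return counts
```

Calling `scan(records)` walks every record, while `scan(records, sample_rate=0.1)` inspects roughly a tenth of them. A sample is usually enough to learn which data types are present, but only the full scan yields exact counts and catches rare outliers.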

Let's get started

5 steps to identifying your data types

  1. Identify sources you'd like to scan
  2. Set up integrations
  3. Smart Data Sampling and statistics report
  4. View scan results
  5. Conduct a full scan of your data source

1. Identify sources you'd like to scan

To get started, identify the sources you'd like to scan that are also supported by Content Discovery. Head to your "Data types" page and click "Manage" to open the Content Discovery settings.

If you don't have any supported systems, click "Add source" to view a list of supported data sources and select the source you'd like to add.

Once you have at least one compatible system in your Inventory, the system will appear in your Content Discovery settings.

2. Set up integrations

To use smart sampling or Content Discovery, click "Connect" below the system you'd like to connect. This will take you to the system's page. 

Note: If you have multiple instances of a system, you can enable Content Discovery for each instance.

Next, enable "Use source in Content Discovery",  set up the integration and save. 

Integration types vary depending on the data source. To find documentation for developers, click here.

3. Smart Data Sampling and statistics report

Click “Start scan” to begin scanning for records. After a few moments, you can click “View statistics report” (if available) while Content Discovery is running. The statistics report helps you understand the types of objects and the distribution of data types in your system.

Note: Scans are limited to once every 48 hours.  
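
If you trigger or schedule scans from your own tooling, a small check like the one below can tell you when the next scan becomes possible. This is a sketch only: the function names are hypothetical, and it assumes the 48-hour window is measured from the moment the previous scan started, which this article does not specify.

```python
from datetime import datetime, timedelta

SCAN_INTERVAL = timedelta(hours=48)  # the limit stated in the note above


def next_scan_allowed(last_scan_started: datetime) -> datetime:
    """Earliest time a new scan could start (assumes the window begins at scan start)."""
    return last_scan_started + SCAN_INTERVAL


def can_scan(last_scan_started: datetime, now: datetime) -> bool:
    """True once the 48-hour window since the last scan has passed."""
    return now >= next_scan_allowed(last_scan_started)
```

For example, a scan started on Monday at 12:00 would allow the next one from Wednesday at 12:00 onward.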

When the scan is complete, you will see the data types discovered across all objects, along with the number of records for each.

Use policies to set alerts for unexpected data types or record counts found by Content Discovery.
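
Conceptually, a policy of this kind compares scan results against what you expect to find. The Python sketch below illustrates the idea only. It is not the MineOS policy engine, and the function name `check_policy`, its parameters, and the alert wording are all hypothetical.

```python
def check_policy(
    scan_counts: dict[str, int],
    expected_types: set[str],
    max_records: dict[str, int],
) -> list[str]:
    """Return alert messages for unexpected data types or record counts."""
    alerts = []
    for data_type, count in sorted(scan_counts.items()):
        if data_type not in expected_types:
            # A data type turned up that this source is not expected to hold.
            alerts.append(f"unexpected data type: {data_type} ({count} records)")
        elif count > max_records.get(data_type, float("inf")):
            # The type is expected, but there are more records than the set limit.
            alerts.append(f"too many records for {data_type}: {count}")
    return alerts
```

With expected types `{"email", "phone"}` and a limit of 100 email records, a scan that finds 120 emails and 3 credit card records would raise two alerts: one for the unexpected credit card data and one for the email count.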

4. View scan results

To view your scan results, head to your Data Classifier page. Here you can see an aggregated number of records per data type.

You can view smart scan results in two ways:

  1. View records per system
  2. Identify sources that contain records of a given data type (e.g. contact info)
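
The two views, together with the aggregated count per data type, are different groupings of the same scan results. The sketch below shows this with made-up numbers. The nested-dictionary layout and the function names are assumptions for illustration and do not reflect a MineOS export format.

```python
from collections import Counter

# Made-up scan results: system -> data type -> number of records.
example_results = {
    "Salesforce": {"contact info": 1200, "location": 40},
    "S3": {"contact info": 300, "financial": 15},
    "BigQuery": {"location": 900},
}


def records_per_system(results: dict[str, dict[str, int]]) -> dict[str, int]:
    """View 1: total number of records found in each system."""
    return {system: sum(counts.values()) for system, counts in results.items()}


def sources_with_type(results: dict[str, dict[str, int]], data_type: str) -> list[str]:
    """View 2: systems that hold at least one record of the given data type."""
    return sorted(s for s, counts in results.items() if counts.get(data_type, 0) > 0)


def records_per_type(results: dict[str, dict[str, int]]) -> Counter:
    """Aggregated number of records per data type across all systems."""
    totals = Counter()
    for counts in results.values():
        totals.update(counts)
    return totals
```

Here, `sources_with_type(example_results, "contact info")` returns the two systems that hold contact info, while `records_per_type(example_results)` gives the totals across the whole inventory.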

When the scan is complete, MineOS will automatically update the number of records per data system.

On the system's pane, the smart scan will also show you a breakdown of the system's data types. 

Deep dive

Once integrations have been set up and a smart scan is complete, you can conduct a thorough review of records per data type. This review, or deep dive, is useful when a data type seems unfamiliar or misplaced, or when the number of records is unexpectedly high or low.

To open the deep dive dialog, click “View”. The dialog contains a link for every data type per object (the link depends on the integration).

Note: Deep dives are limited to 100 records per data type per object.

Note: Multiple records can be found in the same link (e.g. one unique ticket might contain two distinct phone number records).
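
The two notes above interact: because the limit applies to records rather than links, a deep dive can show 100 records that point to fewer than 100 distinct links. The sketch below illustrates this under assumed inputs. The tuple layout `(object_name, data_type, link)` and the function name `deep_dive` are hypothetical.

```python
from collections import defaultdict

DEEP_DIVE_LIMIT = 100  # per data type per object, as stated in the note above


def deep_dive(records):
    """Group record links by (object, data type), keeping at most 100 per group.

    `records` is an iterable of (object_name, data_type, link) tuples.
    """
    grouped = defaultdict(list)
    for object_name, data_type, link in records:
        bucket = grouped[(object_name, data_type)]
        if len(bucket) < DEEP_DIVE_LIMIT:
            bucket.append(link)
    return dict(grouped)
```

If every ticket contains two phone numbers, 150 phone records come from 75 tickets, and the 100 records shown in the deep dive link to only 50 distinct tickets.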

5. Conduct a full scan of your data source.

If Smart Data Sampling or MineAI don't meet your needs and you'd prefer a full scan of the data source, please contact us.

If you need any help with Content Discovery, talk to us via our chat or at portal@saymine.com, and we'll be happy to assist! 🙂

Next steps...