Content Discovery

Identify what data types are being processed or stored in every data source across your inventory.

What is Content Discovery? 

Once you've identified all data sources across your organization, you can use our proprietary technology to identify what data types are being processed or stored in each data source.

Mine supports three different technologies for content discovery:

  1. Context classification
    Mine uses its ML capabilities to analyze each vendor's privacy policy and supported features, together with your own business (company size, industry, etc.). How does it work? We analyze three inputs:
    • Automatic scan and analysis of legal documents (privacy policy, terms of service, DPA)
    • Analysis of the data source's capabilities and features
    • Analysis of your company (size, vertical, etc.)
    From these inputs, we predict which data types are being processed in each data source.
  2. PII tracing (full scan)
    Using an API integration for specific data sources (e.g. Salesforce, S3, Google Drive, BigQuery, etc.), we scan through (potentially) terabytes of data and classify what types of information are stored there (e.g. credit card records, emails, location, etc.). We can classify more than 150 different personal data types.
  3. PII scanning (smart sampling)
    With this option, Mine provides a more cost-effective solution for companies that want to understand the overall picture of a data source with a lighter, less complex scanning process.
    How it works: Our algorithms map the data source and sample a minimal amount of data and files, aiming for the most comprehensive coverage possible of what data exists in the source (the 80/20 rule).
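
To make the sampling idea concrete, here is a minimal Python sketch. It is not Mine's actual algorithm: the two regular expressions, the function names (sample_records, classify), and the sample size are all assumptions chosen for illustration. It draws a random sample from a source's records and counts how many of the sampled records contain each data type.

```python
import random
import re

# Hypothetical patterns for two data types; a real classifier covers many more.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def sample_records(records, sample_size, seed=0):
    """Return a reproducible random sample of at most sample_size records."""
    if len(records) <= sample_size:
        return list(records)
    return random.Random(seed).sample(records, sample_size)


def classify(records):
    """Count how many records contain at least one match for each data type."""
    counts = {name: 0 for name in PATTERNS}
    for text in records:
        for name, pattern in PATTERNS.items():
            if pattern.search(text):
                counts[name] += 1
    return counts


# Usage: classify a small sample instead of every record in the source.
records = [
    "alice@example.com",
    "call +1 415 555 0100",
    "nothing here",
    "bob@example.org or +44 20 7946 0958",
]
sample = sample_records(records, sample_size=2)
print(classify(sample))
```

Scanning only the sample trades some accuracy for speed, which is the trade-off described above; a full scan would call classify on every record instead.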

Let's get started

5 steps to identifying your data types

  1. Identify sources you'd like to scan
  2. Set up their integrations
  3. Smart scan & review statistics report
  4. View smart scan results
  5. Conduct a full scan 

1. Identify sources you'd like to scan

To get started, identify sources you'd like to scan that are also supported by Content Discovery.  Head to your "Data types" page and click "Manage" to open Content Discovery settings. 

If you don't have any supported systems,  click "Add source" to view a list of supported data sources and select the source you'd like to add. 

Once you have at least one compatible system in your Inventory, the system will appear in your Content Discovery settings.

2. Set up integrations for Content Discovery


To use smart sampling or Content Discovery, click "Connect" below the system you'd like to connect. This will take you to the system's page. 

Note: If you have multiple instances of a system, you can enable Content Discovery for each instance.

Next, enable "Use source in Content Discovery",  set up the integration and save. 

 

There are different types of integrations depending on the data source. To find documentation for developers, click here.

3. Smart scan & review statistics report

Click “Start scan” to begin scanning for records. After a few moments, you can click “View statistics report” (if available) while Content Discovery is running. The statistics report will help you understand the types of objects and the distribution of data types in your system.

Note: Scans are limited to once every 48 hours.  
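
If you track scan times yourself, the 48-hour limit can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and it assumes the 48 hours are counted from the moment the previous scan started, which this article does not specify.

```python
from datetime import datetime, timedelta, timezone

SCAN_INTERVAL = timedelta(hours=48)  # limit stated in the note above


def next_scan_allowed(last_scan_started):
    """Earliest time a new scan could start (assumes the clock starts at scan start)."""
    return last_scan_started + SCAN_INTERVAL


def can_scan(last_scan_started, now):
    """True if at least 48 hours have passed since the last scan started."""
    return now >= next_scan_allowed(last_scan_started)


last = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
print(next_scan_allowed(last))
print(can_scan(last, datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)))
```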

When the smart scan is complete, you will see the data types discovered across all objects, along with the number of records found for each.

Use policies to set alerts for unexpected data types or record counts found using Content Discovery.
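
As an illustration of the kind of check such a policy performs, the Python sketch below compares scan results against expected limits. It does not use Mine's policy engine; the function name find_unexpected, the dictionary shapes, and the example numbers are assumptions made for this example.

```python
def find_unexpected(counts, expected):
    """Return alert messages for a scan.

    counts:   {data_type: record_count} found by a scan (hypothetical shape)
    expected: {data_type: max_records} for the data types you expect to see
    """
    alerts = []
    for data_type, n in sorted(counts.items()):
        if data_type not in expected:
            alerts.append(f"unexpected data type: {data_type} ({n} records)")
        elif n > expected[data_type]:
            alerts.append(f"{data_type}: {n} records exceeds limit {expected[data_type]}")
    return alerts


counts = {"email": 1200, "credit card": 3, "phone": 50}
expected = {"email": 1000, "phone": 100}
for alert in find_unexpected(counts, expected):
    print(alert)
```

Here "credit card" is flagged because it was not expected at all, and "email" is flagged because its count is above the limit.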

4. View smart scanning results


To view smart scanning results, head to your data types page. Here you can see an aggregated number of records per data type.

You can view smart scanning results in two ways:

  1. View records per system
  2. Identify sources that have records of a given data type (e.g. contact info)
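
The two views, and the aggregated count per data type, are different groupings of the same results. The Python sketch below shows this on made-up data; the tuple layout, function names, and record counts are assumptions for illustration, not an export format provided by Mine.

```python
from collections import defaultdict

# Hypothetical flat scan results: (system, data_type, record_count)
RESULTS = [
    ("Salesforce", "contact info", 1200),
    ("Salesforce", "location", 300),
    ("S3", "contact info", 50),
]


def records_per_system(results):
    """View 1: record counts grouped by system, then by data type."""
    view = defaultdict(dict)
    for system, data_type, count in results:
        view[system][data_type] = view[system].get(data_type, 0) + count
    return dict(view)


def systems_with_type(results, data_type):
    """View 2: systems that hold at least one record of the given data type."""
    return sorted({s for s, t, c in results if t == data_type and c > 0})


def total_per_type(results):
    """Aggregated number of records per data type across all systems."""
    totals = defaultdict(int)
    for _system, data_type, count in results:
        totals[data_type] += count
    return dict(totals)


print(records_per_system(RESULTS))
print(systems_with_type(RESULTS, "contact info"))
print(total_per_type(RESULTS))
```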

 

When the smart scan is complete, Mine will automatically update the number of records per data system.

You will also see any systems in your Inventory that have Content Discovery available but are not yet connected.

On the system's pane, the smart scan also shows a breakdown of the system's data types.

Deep dive

Once integrations have been set up and a smart scan is complete, you can conduct a thorough review of records per data type. This review, or deep dive, is useful when a data type seems unfamiliar or misplaced, or when the number of records is unexpectedly high or low.

To open the deep dive dialog, click “View”. The dialog contains a link for every data type per object (the link format depends on the integration).

Note: Deep dives are limited to 100 records per data type per object.

Note: Multiple records can be found in the same link (e.g. one unique ticket might contain two distinct phone number records).
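
The two notes above can be pictured with a short Python sketch that groups records by link and stops at 100 records per data type per object. The record layout, the deep_dive function, and the example.com links are hypothetical; this shows the idea only, not how Mine builds the dialog.

```python
from collections import defaultdict

DEEP_DIVE_LIMIT = 100  # records per data type per object, from the note above


def deep_dive(records, limit=DEEP_DIVE_LIMIT):
    """Group records by link, keeping at most `limit` per (object, data type).

    records: iterable of (object_name, data_type, link, value) tuples
    returns: {(object_name, data_type): {link: [values]}}
    """
    taken = defaultdict(int)
    grouped = defaultdict(lambda: defaultdict(list))
    for obj, data_type, link, value in records:
        key = (obj, data_type)
        if taken[key] >= limit:
            continue  # cap reached for this data type in this object
        taken[key] += 1
        grouped[key][link].append(value)
    return {key: dict(links) for key, links in grouped.items()}


# One ticket (one link) holding two distinct phone number records.
records = [
    ("Ticket", "phone number", "https://example.com/t/1", "555-0100"),
    ("Ticket", "phone number", "https://example.com/t/1", "555-0101"),
    ("Ticket", "email", "https://example.com/t/2", "a@example.com"),
]
print(deep_dive(records))
```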

5. Conduct a full scan

Please contact us for a full scan. 

 

Talk to us if you need any help with Content Discovery via our chat or at portal@saymine.com, and we'll be happy to assist!🙂