Data Discovery terms
- Data Object - Data container, such as a table in a database or a file in a file system
- Data Source - Instance of a data management technology that contains data objects
- Scan - Analysis performed on data sampled from one or more data sources
- Classifier - Artifact that a scan uses to classify a data element or document type
- Attribute - Classified data element identified in a structured or unstructured data source
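To make the relationships between these terms concrete, here is a minimal sketch in Python. All class names, fields, and the `scan` function are hypothetical illustrations, not SmallID's actual data model or API; it assumes a classifier is a regular expression and that a scan samples the first few values of each data object.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataObject:
    """A data container, such as a table or a file (hypothetical model)."""
    name: str
    values: List[str] = field(default_factory=list)


@dataclass
class DataSource:
    """An instance of a data management technology that holds data objects."""
    name: str
    objects: List[DataObject] = field(default_factory=list)


@dataclass
class Classifier:
    """A named pattern a scan uses to classify a data element (assumed: a regex)."""
    name: str
    pattern: str


@dataclass
class Attribute:
    """A classified data element identified by a scan."""
    classifier: str
    source: str
    data_object: str
    value: str


def scan(sources, classifiers, sample_size=100):
    """Sample values from each data object and apply every classifier to them."""
    found = []
    for source in sources:
        for obj in source.objects:
            # Only a sample of the data is analyzed, not the full object.
            for value in obj.values[:sample_size]:
                for c in classifiers:
                    if re.fullmatch(c.pattern, value):
                        found.append(Attribute(c.name, source.name, obj.name, value))
    return found


crm = DataSource("crm", [DataObject("customers", ["a@example.com", "hello"])])
email = Classifier("email", r"[^@\s]+@[^@\s]+\.[^@\s]+")
attributes = scan([crm], [email])
```

In this sketch, each match of a classifier against a sampled value yields one attribute that records where it was found.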
Data sources
- Primary input for discovering personal information about specific data subjects within a specific organization.
- Can be structured, semi-structured, unstructured, or cloud.
- An explicit connection to the target data source is required to perform scanning and discover data.
- Can be broadly or tightly scoped to include all or only some data objects.
- Examples: structured database tables based on schema, or unstructured files based on path
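The scoping idea above can be sketched as a simple include/exclude filter. This is only an illustration under stated assumptions: the `in_scope` helper, the wildcard syntax, and the object names are hypothetical and do not reflect how SmallID expresses scope.

```python
from fnmatch import fnmatchcase


def in_scope(object_name, include=("*",), exclude=()):
    """Return True if a data object falls inside the scan scope.

    Patterns use shell-style wildcards; exclusions take precedence.
    """
    if any(fnmatchcase(object_name, p) for p in exclude):
        return False
    return any(fnmatchcase(object_name, p) for p in include)


# Structured example: tables addressed as schema.table (hypothetical names).
tables = ["sales.customers", "sales.orders", "hr.employees"]

broad = [t for t in tables if in_scope(t)]                       # all data objects
tight = [t for t in tables if in_scope(t, include=["sales.*"])]  # one schema only

# Unstructured example: files selected by path prefix.
hr_file_in_scope = in_scope("/exports/hr/payroll.csv", include=["/exports/hr/*"])
```

A broad scope keeps every data object, while a tight scope narrows the scan to a single schema or directory.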
Declaration Methods
- Manual – Single declaration and test of a data source connection. You can also import multiple data sources at once with the CSV import method.
- Automated for cloud providers – SmallID accesses your cloud account and finds all supported data sources
  - Currently for AWS:
    - Structured: Athena, DynamoDB, and EMR
    - Unstructured: S3
  - Azure and GCP autodiscovery will be supported soon.
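As an illustration of the manual CSV import method, the sketch below validates a batch of data source declarations before they would be submitted. The column names (`name`, `type`, `host`) and the `load_declarations` helper are assumptions made for this example; the actual CSV layout expected by SmallID is not described here.

```python
import csv
import io

# Hypothetical required columns; the real import format may differ.
REQUIRED = ("name", "type", "host")


def load_declarations(csv_text):
    """Parse CSV text into valid declarations and a list of rejected rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    valid, rejected = [], []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        missing = [c for c in REQUIRED if not (row.get(c) or "").strip()]
        if missing:
            rejected.append((line_no, missing))
        else:
            valid.append({c: row[c].strip() for c in REQUIRED})
    return valid, rejected


sample = """name,type,host
crm-db,mysql,db1.example.internal
file-share,smb,
"""

valid, rejected = load_declarations(sample)
```

Checking rows up front mirrors the "declare and test" step of a single manual declaration: incomplete entries are reported by line number instead of failing later at connection time.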