Notes on data security
1. In June I wrote about burgeoning interest in data security. I’d now like to add:
- Even more than I previously thought, demand seems to be driven largely by issues of regulatory compliance.
- In an exception to that general rule, many enterprises have vague mandates for data encryption.
- In awkward contradiction to that general rule, there’s a widespread sense that it’s just security’s “turn” to be a differentiating feature, since various other “enterprise” needs are already being well addressed.
We can reconcile these anecdata pretty well if we postulate that:
- Enterprises generally agree that data security is an important need.
- Exactly how they meet this need depends upon what regulators choose to require.
2. My current impressions of the legal privacy vs. surveillance tradeoffs are basically:
- The freer non-English-speaking countries are more concerned about ensuring data privacy. In particular, the European Union’s upcoming GDPR (General Data Protection Regulation) seems like a massive addition to the compliance challenge.
- The “Five Eyes” (US, UK, Canada, Australia, New Zealand) are more concerned about maintaining the efficacy of surveillance.
- Authoritarian countries, of course, emphasize surveillance as well.
3. Multiple people have told me that security concerns include (data) lineage and (data) governance as well. I’m fairly OK with that conflation.
- By citing “lineage” I think they’re referring to the point that if you don’t know where data came from, you don’t know if it’s trustworthy. This fits well with standard uses of the “data lineage” term.
- By “data governance” they seem to mean policies and procedures to limit the chance of unauthorized or uncontrolled data change, or technology to support those policies. Calling that “data governance” is a bit of a stretch, but it’s not so ridiculous that we need to make a big fuss about it.
In other words: If your data transformation pipelines aren’t locked down, then your data isn’t locked down either.
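To make the lineage point concrete, here is a minimal Python sketch, assuming a toy model in which each dataset records the datasets it was derived from. The `Dataset` class, its fields, and `is_trustworthy` are hypothetical names invented for illustration, not part of any real lineage tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical record of a dataset and where it came from."""
    name: str
    sources: tuple = ()          # upstream datasets this one was derived from
    origin_known: bool = False   # meaningful only for raw inputs (no sources)


def is_trustworthy(ds: Dataset) -> bool:
    """A raw dataset is trusted only if its origin is documented;
    a derived dataset is trusted only if every one of its sources is."""
    if not ds.sources:
        return ds.origin_known
    return all(is_trustworthy(src) for src in ds.sources)


# Illustrative data: one documented source, one of unknown origin.
crm_export = Dataset("crm_export", origin_known=True)
mystery_csv = Dataset("mystery_csv")
clean_report = Dataset("clean_report", sources=(crm_export,))
mixed_report = Dataset("mixed_report", sources=(crm_export, mystery_csv))
```

In this sketch `clean_report` counts as trustworthy while `mixed_report` does not, because a single input of unknown origin taints everything derived from it.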
4. But how seriously does that last point need to be taken? For starters, the possibility of erroneous calculations:
- Is a strong threat to analytic accuracy, as has been recognized for at least the decades that “one version of the truth” has been a catchphrase.
- Has some regulatory risk, e.g. in the United States around Sarbanes-Oxley.
- Is not as big a deal for the core security threat of data theft/exfiltration.
Further, it’s not too hard architecturally to have a divide between:
- Data transformation for operational use cases, which may need to be locked down.
- Data transformation for purely investigative analytics, which can be very fluid, whether the transformation technology is Hadoop, Spark or Excel.
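One way to picture that divide is the rough Python sketch below, which assumes that "locked down" simply means "only reviewed transformation steps may run." The allow-list `APPROVED`, the `approved` decorator, `LockedDownError`, and `run_pipeline` are all hypothetical names for illustration; a real deployment would rely on access controls in the pipeline tooling itself.

```python
from typing import Callable, List, Set

Transform = Callable[[List[float]], List[float]]

# Hypothetical allow-list of transforms that have passed review.
APPROVED: Set[Transform] = set()


def approved(fn: Transform) -> Transform:
    """Decorator that registers a transform as reviewed and approved."""
    APPROVED.add(fn)
    return fn


class LockedDownError(Exception):
    """Raised when an unapproved step is used in an operational pipeline."""


def run_pipeline(data: List[float], steps: List[Transform],
                 operational: bool) -> List[float]:
    """Apply steps in order. Operational runs accept only approved steps;
    investigative runs accept anything."""
    for step in steps:
        if operational and step not in APPROVED:
            raise LockedDownError(f"unapproved step: {step.__name__}")
        data = step(data)
    return data


@approved
def drop_negatives(rows: List[float]) -> List[float]:
    return [r for r in rows if r >= 0]


def ad_hoc_double(rows: List[float]) -> List[float]:
    # An analyst's quick experiment; never reviewed.
    return [r * 2 for r in rows]
```

Calling `run_pipeline(rows, [drop_negatives, ad_hoc_double], operational=False)` runs both steps, while the same call with `operational=True` raises `LockedDownError` at the ad hoc step. The investigative path stays fluid precisely because nothing is enforced on it.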
Bottom line: Data transformation security is an accessible must-have in some use cases, but an impractical nice-to-have in others.