Five Reasons to Use Data-Centric Security to Secure Your Hadoop Deployment
Apache Hadoop is designed to enable very rapid time-to-insight, decision support, and operational efficiencies. But Hadoop poses many security and regulatory compliance challenges, including automatic replication of data across multiple nodes, multiple types of data concentrated in the Hadoop “Data Lake,” and access by many different users with varying analytic needs. As more companies adopt Hadoop, the cyberattack landscape is changing. Traditional IT security controls like firewalls and intrusion prevention systems establish a security perimeter that’s designed to keep hackers out. But these technologies cannot fully protect an organization from data breaches and data leakage once that perimeter is crossed. This is where a data-centric security model, which protects the data itself rather than the systems around it, becomes essential.
1. Protects sensitive data. Analytics consume increasingly large volumes of sensitive data. Customer profiles and personally identifiable information, corporate intellectual property, payment and transaction data, protected health information and more are all streaming into Hadoop, promising profound new insights and real-time decision-making. The best analytics include sensitive data, so that data has to be protected.
2. Enables analytics on protected data. Data-centric security de-identifies the data at the field and sub-field level. It’s format-preserving, so an email looks like an email, a credit card number looks like a credit card number, and so on. The protected values keep the characteristics of the original data, including numbers, symbols, letters and numeric relationships such as date and salary ranges. They also maintain referential integrity across distributed data sets, so joined data tables continue to operate properly. Up to 90% of your analytics can be performed on protected data with no decryption required, so those jobs incur no decryption overhead.
3. Protects data in motion, at rest and in use. Traditional infrastructure security methods protect individual systems and network boundaries, leaving security gaps throughout the data ecosystem. Today’s mega-breaches exploit those gaps. The solution: replace the data with encrypted and tokenized values that preserve the format, behavior and meaning of the data for secure analytics. Data-centric security protects data pervasively throughout your ecosystem, not just in Hadoop and other Big Data systems but across the multi-platform enterprise. Protection travels with the data.
4. Neutralizes the value of data to cyber attackers. Cyber thieves today are increasingly sophisticated, and always looking for systems to attack. Hadoop is changing the attack landscape because it simplifies their search by concentrating the data in massive clusters. But de-identified data in Hadoop is protected data. Even in the event of a data breach, it yields nothing of value to the thieves, which helps the organization avoid the penalties and costs such an event would otherwise have triggered.
5. Delivers regulatory compliance and risk reduction. These are board-level issues, and they can slow or halt a Hadoop implementation that lacks a strong and proven data security strategy from the outset. Data-centric security delivers the safe harbor protection needed in the event of a data breach, along with the ongoing assurance of compliance with data privacy regulations.
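To make the format-preserving idea in reason 2 more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Voltage SecureData's algorithm and not a standardized format-preserving encryption mode: it uses a keyed hash (HMAC-SHA256) to replace digits with digits and letters with letters while leaving symbols in place. It is one-way (there is no decryption), and the modulo mapping is slightly biased, so treat it only as a demonstration of two properties described above: the protected value keeps the shape of the original, and the same input always yields the same output, so joins on protected keys still match. The function names, the key, and the sample records are all hypothetical.

```python
import hashlib
import hmac
import string


def _keystream(value: str, key: bytes, length: int) -> bytes:
    """Derive `length` pseudorandom bytes from the key and the whole value."""
    stream = b""
    counter = 0
    while len(stream) < length:
        message = value.encode("utf-8") + counter.to_bytes(4, "big")
        stream += hmac.new(key, message, hashlib.sha256).digest()
        counter += 1
    return stream[:length]


def pseudonymize(value: str, key: bytes) -> str:
    """Illustrative only: a one-way, deterministic, shape-preserving substitute.

    Digits map to digits, ASCII letters to letters of the same case, and
    every other character (such as '@', '.', '-') is left unchanged.
    """
    protected = []
    for char, byte in zip(value, _keystream(value, key, len(value))):
        if char in string.digits:
            protected.append(string.digits[byte % 10])
        elif char in string.ascii_lowercase:
            protected.append(string.ascii_lowercase[byte % 26])
        elif char in string.ascii_uppercase:
            protected.append(string.ascii_uppercase[byte % 26])
        else:
            protected.append(char)  # keep separators so the format survives
    return "".join(protected)


def join_on_protected_key(customers: dict, orders: list, key: bytes) -> list:
    """Protect the join key in both data sets, then join on the protected value."""
    protected_customers = {
        pseudonymize(email, key): region for email, region in customers.items()
    }
    joined = []
    for email, amount in orders:
        token = pseudonymize(email, key)
        joined.append((token, protected_customers.get(token), amount))
    return joined


if __name__ == "__main__":
    demo_key = b"hypothetical-demo-key"  # assumption: real keys come from a key manager
    print(pseudonymize("4111-1111-1111-1111", demo_key))
    print(pseudonymize("alice@example.com", demo_key))
    print(join_on_protected_key(
        {"alice@example.com": "EMEA"},
        [("alice@example.com", 42.0)],
        demo_key,
    ))
```

Because the substitution is deterministic for a given key, the customer and order records above still line up after protection even though neither side holds the original email address. A production system would instead use a vetted, reversible scheme with managed keys.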
Voltage Security® is the global leader in data-centric security for Hadoop. Voltage SecureData™ encryption and tokenization protection can be applied at the source before the data gets into Hadoop, invoked during an ETL transfer to a landing zone, or invoked from the Hadoop process that transfers the data into HDFS. For more information on Voltage SecureData for Hadoop, please go to www.voltage.com/hadoop.