Are Enterprises Collecting and Protecting Data? New Survey Results

In September 2016, HPE Security – Data Security attended the Strata + Hadoop World Conference in New York and the Teradata PARTNERS 2016 Conference in Atlanta, GA. What do these two shows have in common? Attendees are passionate about data, and their work reflects the aptly named PARTNERS 2016 conference theme: “Data. Changes. Everything.” These data lovers come from virtually all industries to discover how they can get more insights and business value out of their Big Data analytics and Hadoop projects. We noticed a clear shift in the audience mindset at Strata. In previous years, attendees were the first to admit they were there to “learn” about Hadoop. This year, they were looking to improve the way they work with their Hadoop implementations and to expand them.

Over the course of both shows, we conducted an anonymous industry survey asking attendees about the use and protection of sensitive data. Respondents represented verticals such as finance, energy, healthcare, service providers, retail, telecoms, transportation, and government. With over 400 attendees participating, the results are revealing and show that protecting sensitive data in Big Data/Hadoop is a top-of-mind concern. Here are the results of the seven-question survey.

Question One:
Does your business currently use sensitive data such as PCI (payment card information), PII (personally identifiable information) or PHI (protected health information)? If yes, what type of sensitive data?

With 70% of respondents at the Strata + Hadoop World Conference and over half (54%) of respondents at the Teradata conference stating they are collecting sensitive data, it is our hope that they are also taking preventive steps to secure that data today, before they become a headline tomorrow. Best practices include a data-centric approach to data security that enables businesses to protect data over its entire lifecycle: from the point at which it’s captured, throughout its movement across the extended enterprise, and in use in analytics and applications, all without exposing live information to a data breach.

As for the type of information collected, among those who chose to answer, PII was the most common: 55% at Strata and 34% at Teradata. At Teradata, PCI came in second (28%), with PHI (14%) a distant third. At Strata, PCI and PHI were tied at 22%. Respondents could pick more than one answer.

At Teradata PARTNERS this year, respondents described new and specific use cases and data types, such as home energy remote devices and geolocation data. With the explosion of IoT devices, such as connected thermostats, smart energy meters, and connected cars, and a plethora of mobile apps collecting geolocation data, businesses are realizing that a growing volume of this data is sensitive, should be considered PII, and can become toxic when combined with other data types in the data lake. The next step is to help these enterprises build in security at the data level, with data-centric protection not only in the enterprise back end but also in the connected devices and mobile applications in their Big Data and IoT ecosystems.

Question Two:
What steps, if any, are you currently using to protect this data?


The clear winner was encryption, picked by 53% of respondents at Strata and 44% at Teradata. Tokenization was second at both conferences, with 17% at Strata and 22% at Teradata, which makes sense, as the PCI DSS recommends tokenization as a technology to protect credit card data. Only 6% at Strata and 5% at Teradata mentioned Data Leakage Protection. Somewhat disturbing was that 9% of respondents at Strata and 4% at Teradata said they do not protect their sensitive data. Add to that the 19% (Strata) and 17% (Teradata) who did not know if or how they protected their collected data, and it could mean those businesses are playing with fire by not adopting a comprehensive approach to protecting their sensitive data.

Question Three:
Are you currently planning any big data projects involving sensitive data? Or at Teradata: are you currently planning any big data projects involving sensitive data in the Teradata® Unified Data Architecture (UDA)™?

At Strata, 65% said they were planning big data projects involving sensitive data, while 22% said they were not. For clarity, the Teradata Unified Data Architecture (UDA) is an integrated solution designed to make it easy to transform data into meaningful insights from big data environments. At Teradata, 46% said they were planning big data projects involving sensitive data in the Teradata UDA, with 34% saying no.

Question Four:
What kind of data do you need to secure for your big data project?

For this question, respondents could pick more than one choice, and we chose four of the top identifiers of consumers. Choices were credit card information (CC), social security numbers (SSN), name and address, date of birth (DOB), and other. Name and address ranked first at both conferences at nearly the same rate, with 41% at Strata and 43% at Teradata. DOB was second at both conferences, with 28% at Strata and 38% at Teradata. Credit card and social security numbers were close to each other at each conference, with CC at 21% and SSN at 24% at Strata, and CC at 34% and SSN at 35% at Teradata. The answer of other data came in at 19% at Strata and 17% at Teradata, corresponding to those collecting geolocation or IoT-type data.

Question Five:
Which Hadoop distribution or Big Data platform environment are you using?

At Strata, Cloudera came in at 36%, Hortonworks 14%, IBM 4%, MapR 9%, HPE Vertica 3%, Teradata 8%, and other 18%. At Teradata, it was no surprise that Teradata came in first at 53%, followed by Hortonworks at 20%, Cloudera 13%, MapR 4%, IBM 1%, and other at 8%.

Question Six:
Are you implementing IoT or offering products that could be considered part of the “Internet of Things”? (The IoT is a growing network of everyday objects that feature an IP address for internet connectivity, e.g., smoke detectors, wearables, door locks, thermostats, etc.)

At Strata, 26% answered yes and 56% said no. Those polled at Teradata answered yes significantly more often: 39% said yes and 42% said no. Those who answered yes were asked a follow-up question, question seven.

Question Seven:
If yes, are you protecting IoT device data and/or mobile app data from breach?

At Strata, 31% answered yes, they are protecting IoT connected device and app data, and 30% said no. A quarter of those polled at Teradata also answered yes, and 26% answered no. A full 38% at Strata and 48% at Teradata did not know whether they were or not. The "no" and "don't know" answers are chilling. According to an HPE Internet of Things Research Study, 60% of IoT devices tested raised security concerns with their user interfaces. As the number of IoT connected devices in the enterprise multiplies and the amount of data collected grows exponentially, organizations will need to act quickly and adopt new security methodologies and best practices in their Big Data projects and IoT offerings.

Our Big Data, Hadoop and IoT security experts enjoy going to these conferences and interacting with the data architects, data scientists, engineers, developers and technology experts deploying Big Data/Hadoop projects in the field. We hear fascinating use cases and stories from customers and contacts who are doing truly wonderful and innovative things with Big Data. Next time you are at a Big Data conference or event, stop by the HPE booth to talk data-centric security.
