Securing big data: Architecture tips for building security in


Securing big data: Architecture tips for building security in

There’s a lot of hype about “big data” and Hadoop in our organization right now. What are the key network security considerations we should think about with Hadoop network design and deployment?

Continue Reading This Article

Enjoy this article as well as all of our content, including E-Guides, news, tips and more.

By submitting your email address, you agree to receive emails regarding relevant topic offers from TechTarget and its partners. You can withdraw your consent at any time. Contact TechTarget at 275 Grove Street, Newton, MA.

You also agree that your personal information may be transferred and processed in the United States, and that you have read and agree to the Terms of Use and the Privacy Policy.

Safe Harbor

Since “big data” is a hot topic these days, there’s no question an increasing number of enterprise infosec teams are going to be asked about the security-related ramifications of big data projects. There are many issues to look into, but here are a few tips for making big data security efforts more secure during architecture and implementation phases:

  1. Create data controls as close to the data as possible, since much of this data isn’t “owned” by the security team. The risk of having big data traversing your network is that you have large amounts of confidential data – such as credit card data, Social Security numbers, personally identifiable information (PII), etc. -- that’s residing in new places and being used in new ways. Also, you’re usually not going to see terabytes of data siphoned from an organization, but the search for patterns to find the content in these databases is something to be concerned about. Keep the security as close to the data as possible and don’t rely on firewalls, IPS, DLP or other systems to protect the data. 
  2. Verify that sensitive fields are indeed protected by using encryption so when the data is analyzed, manipulated or sent to other areas of the organization, you’re limiting risk of exposure. All sensitive information needs to be encrypted once you have control over it.
  3. After you’ve made the move to encrypt data, the next logical step is to concern yourself with key management. There are a few new ways to perform key management, including creating keys on an as-needed basis so you don’t have to store them.
  4. In Hadoop designs, review the HDFS permissions of the cluster and verify all access to HDFS is authenticated. When first implemented, Hadoop frameworks were notoriously bad at performing authentication of users and services. This allows users to impersonate as a user the cluster services itself. You can be authenticated to the Hadoop framework using Kerberos, which can be used with HDFS access tokens to authenticate to the name node.

There are many other areas of security in big data systems, like Hadoop,  but when securing big data, authentication, encryption and permissions are three of the largest areas of concern during the big data architecture phase. As with most IT projects, building security in from the beginning is always better than trying to add security later.

This was first published in June 2012