What technologies and processes are needed to lay the groundwork for using big data to enhance enterprise information security? Sound log management? A SIEM implementation? What training is needed? What needs to be done to secure the data center?
In this tip, I will offer a realistic primer so enterprise information security teams know what technology they must have and what processes must be in place to take advantage of big data.
What is big data? Why does it matter to information security?
Like the sentient machines from The Matrix, or Skynet in the Terminator movies, today's big data environments consist of massively parallel processing database products -- though, fortunately, not self-aware ones -- crunching petabytes (10^15 bytes) to zettabytes (10^21 bytes) of seemingly disparate data to reveal trends and data mappings. By creating this macro level of information, big data allows businesses to comprehend how their offerings are performing at a previously unattainable level of economic understanding. More specifically, by combining and analyzing huge volumes of data in new ways, new business insights can be revealed.
While big data has numerous valuable applications in the business world, it's important to remember that this type of information can be valuable to enterprise information security teams as well. So what should be done to lay the groundwork for security teams to take advantage of big data to enhance the security of the organization against both inside and outside threats?
Securing big data: Preparing the infrastructure
First, an organization's security team must understand the infrastructure differences between traditional security remediation tools and security tools that leverage big data systems to analyze the activities occurring within the organization. In today's corporate security offices, it's not uncommon to find a mix of disparate security tools reporting on diverse segments of security data that are of interest to the security analyst looking for problems. Logging tools, security monitoring tools, perimeter security devices, application access control devices, provisioning systems, vendor risk analysis programs, GRC products and others collect large volumes of information that must be broken down and normalized to identify security risks.
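The break-down-and-normalize step described above can be illustrated with a short sketch. The log formats and the common schema's field names here are assumptions for illustration only, not any particular product's output:

```python
def normalize_syslog(line):
    """Parse a simplified syslog-style line into a common event schema."""
    # Example input: "2012-10-01T12:00:00Z host1 sshd: Failed password for root"
    ts, host, rest = line.split(" ", 2)
    return {"timestamp": ts, "source": host, "tool": "syslog", "message": rest}

def normalize_firewall_csv(line):
    """Parse a simplified CSV firewall export into the same schema."""
    # Example input: "2012-10-01T12:00:05Z,fw-edge,DENY,10.0.0.5,443"
    ts, device, action, src_ip, port = line.split(",")
    return {
        "timestamp": ts,
        "source": device,
        "tool": "firewall",
        "message": f"{action} {src_ip}:{port}",
    }

def normalize_feed(lines, parser):
    """Apply one tool-specific parser to a whole feed of raw lines."""
    return [parser(line) for line in lines]
```

Once every feed is reduced to the same shape, downstream analysis can treat events from different tools uniformly.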
While these traditional tools provide data views into their particular variety of control, the outputs of these systems are generally not consolidated; instead, the data is boiled down into summaries and fed into one or more SIEM tools to visually highlight predetermined events of interest for the security team. Once a trend and a potential incident have been identified, one or more teams of security professionals must sift through the evidence in the massive output data to discover any unauthorized or malicious activities. This "loosely federated" approach to security management generally works, but it is slow, prone to missing well-camouflaged malicious events, and may not uncover serious security events until large tracts of historical data have been collected, analyzed and summarized.
In contrast, the creation of a big data security environment relies on the previously mentioned tools to act as input into a single logical big data warehouse for security information. This warehouse has the advantage of treating the data as part of a larger security ecosystem, with strong analytic and trend-analysis tools to identify threats that may only be evidenced by examining multiple data sets, rather than the traditional approach, in which the security team pores over a conglomeration of loosely coupled data with a virtual magnifying glass.
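A threat "only evidenced by examining multiple data sets" can be sketched as a cross-feed correlation. The two feeds here (VPN logins and building badge entries) and the time window are hypothetical: each event looks benign in its own tool, but a user who logs in remotely while badged into a facility suggests a stolen credential:

```python
def correlate_feeds(vpn_logins, badge_entries, window_minutes=60):
    """Flag users who logged in remotely while badged into a facility.

    Events are (user, minute_of_day) tuples; two events for the same
    user within the window are treated as contradictory and suspicious.
    """
    badge_by_user = {}
    for user, minute in badge_entries:
        badge_by_user.setdefault(user, []).append(minute)

    suspicious = []
    for user, vpn_minute in vpn_logins:
        for badge_minute in badge_by_user.get(user, []):
            if abs(vpn_minute - badge_minute) <= window_minutes:
                suspicious.append(user)
                break
    return suspicious
```

Neither the VPN concentrator nor the badge system would flag these events on its own; the signal only exists once both data sets sit in one warehouse.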
Security big data: Infrastructure support
Of course, at its core, this new environment will require infrastructure changes so it can collect and analyze the data.
To create an infrastructure that supports a big data environment, a secure, high-speed network is necessary to collect the many security system data feeds needed to sustain its collection requirements. Because of the virtualized, distributed nature of big data infrastructures, companies need to look at virtualized networks as the underlying communication infrastructure: technologies such as VLANs between data centers, and virtual switches acting as the network inside a virtualized host, are the best choices for carrying big data. Because firewalls must examine every packet of every session that flows through them, they are a bottleneck to big data's speedy computational capabilities. Organizations therefore need to separate their traditional user traffic from the traffic that makes up the main big data security feed. By ensuring only trusted server traffic moves through the encrypted network tunnels and eliminating traditional infrastructure firewalls in between, the system can communicate at the unhindered speeds it needs.
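The decision of which flows belong on the unhindered big data path can be sketched as classification by trusted subnet; the subnet addresses below are hypothetical assumptions standing in for the warehouse and collector segments:

```python
import ipaddress

# Hypothetical trusted server segments carrying the big data security feed.
BIG_DATA_SUBNETS = [
    ipaddress.ip_network("10.10.0.0/16"),  # warehouse nodes (assumed)
    ipaddress.ip_network("10.20.0.0/16"),  # collector tier (assumed)
]

def is_big_data_feed(src_ip, dst_ip):
    """True when both endpoints sit on the trusted server VLANs, so the
    flow may take the encrypted tunnel path that bypasses the
    user-traffic firewalls."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)

    def trusted(addr):
        return any(addr in net for net in BIG_DATA_SUBNETS)

    return trusted(src) and trusted(dst)
```

Any flow touching an address outside the trusted segments stays on the normal, firewalled user path.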
Next, the virtual servers of the security data warehouse need to be protected. Hardening them to NIST standards is always good practice, as is uninstalling unneeded services, such as FTP utilities, and making sure a sound patch management process is in place. Because the data on the servers needs to be trustworthy, backup services for the big data centers are needed; furthermore, backups, whether on tape media or secondary drives, must be encrypted, because breaches of secure data sites too often occur through lost or stolen backup media. Operating system updates should be installed on a scheduled basis, and a good system monitoring tool should be deployed, with a formal operations center for centralized monitoring and control.
Big data security: Integrating with existing tools, processes
To ensure that the big data security warehouse sits at the top of the security incident ecosystem, it must be integrated with the security tools and processes already in place. Of course, these integration points should run in parallel with the existing connections, because an organization can't drop its security analysis function while it retools for big data. The best approach for a new deployment is to minimize the number of connections by feeding the output of the corporate and/or line-of-business SIEM tools into the big data security warehouse. Since this data has been preprocessed, it allows the company to begin testing its analysis algorithms against a refined data set.
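Running the new warehouse feed in parallel with the existing connections can be sketched as a dispatcher that hands each SIEM output event to every registered consumer; the class and sink names below are illustrative, not any vendor's API:

```python
class ParallelDispatcher:
    """Deliver each SIEM output event to every registered consumer, so
    the existing incident pipeline keeps running while the new big data
    warehouse loader is brought up alongside it."""

    def __init__(self):
        self.sinks = {}

    def register(self, name, handler):
        """Add a downstream consumer, e.g. 'existing' or 'warehouse'."""
        self.sinks[name] = handler

    def dispatch(self, event):
        """Fan one event out to all consumers in parallel paths."""
        for handler in self.sinks.values():
            handler(event)
```

Because both paths see identical events, the warehouse's results can be compared against the established pipeline before any cutover.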
Once the integration work with the security information and event management tools is complete and initial trends and events have begun to show themselves, a program should be developed to decouple the SIEM tools' inputs so they feed directly into the warehouse. A best practice is to select well-defined, standardized data formats for input; doing so greatly reduces the integration and normalization steps needed and helps ensure continued validation of the warehouse's analysis algorithms.
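Enforcing a standardized input format at the warehouse boundary might be sketched as a simple schema gate; the required field names here are assumptions, and a real deployment would adopt an established event schema rather than this one:

```python
import json

# Illustrative required fields for the warehouse's standardized input
# format; these names are assumptions, not a published standard.
REQUIRED_FIELDS = {"timestamp", "source", "event_type", "severity"}

def validate_record(raw_json):
    """Accept a record only if it carries every required field, so the
    warehouse's analysis algorithms always see a consistent shape."""
    record = json.loads(raw_json)
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return record
```

Rejecting malformed records at the door keeps normalization logic out of the analysis layer and keeps the developed algorithms valid as feeds are added.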
Over time, the improved analysis capabilities will make the data warehouse the main collection point for the corporate security tools, and the business's security office will have a single point of entry for security event analysis.
Finally, because big data operates in a new and different environment, a customized training program should be developed for security office personnel. The program should focus on the newly developed analysis and remediation processes the security big data warehouse performs when flagging and reporting unusual activities and network traffic. The day-to-day operation of a big data ecosystem is highly routine, so unauthorized changes or access should be easy to spot.
Big data promises to open new ways for security teams to function without all the "machines taking over" drama presented by science-fiction movies. By understanding the benefits, setting realistic expectations and taking advantage of existing security technology, security managers can feel comfortable knowing that the investment they make in big data will be worthwhile.
About the Author:
Randall Gamby is the information security officer for the Medicaid Information Service Center of New York (MISCNY). MISCNY manages and maintains the largest state-run Medicaid claims data warehouse in the U.S. Prior to this position, he was the enterprise security architect for a Fortune 500 insurance and finance company. His experience also includes many years as an analyst for the Burton Group's Security and Risk Management Services group. His coverage areas included secure messaging, security infrastructure, identity and access management, security policies and procedures, credential services, and regulatory compliance.
This was first published in October 2012