Security best practices
11/4/2024
Why data security needs a platform: what recent DSPM and DLP acquisitions tell us about the future of cybersecurity
In recent years, consolidation across the cybersecurity industry has accelerated, with data security at the heart of this evolution. The rise of AI use cases has made protecting sensitive data more urgent for most enterprises. Major acquisitions like Dig Security and Laminar in the Data Security Posture Management (DSPM) space, alongside acquisitions involving Data Loss Prevention (DLP) companies like Next DLP, Trail, Mimecast, and Code42, signal that we are entering a new phase in how enterprises secure their most sensitive asset: their data.
This consolidation reveals a growing recognition across the industry: standalone tools are no longer sufficient. The fragmented approach of deploying multiple point solutions to address data security challenges, from cloud assets to endpoints, is riddled with inefficiencies. As data continues to move across increasingly complex environments—from cloud to on-prem to endpoints—the need for a unified platform has never been more critical.
A brief history of data security platforms
Consider this: data can reside in a cloud database, be exported to a CSV, emailed, stored on a laptop, and eventually shared via cloud storage. Each of these stages in the data lifecycle represents an opportunity for security breakdowns when disparate tools are tasked with covering different parts of the workflow. Over the years, the industry has responded by tackling the problem one narrow slice at a time.
The first generation: Early data security products (e.g., Symantec Vontu, McAfee, Forcepoint) used pattern-matching technologies to identify sensitive data on endpoints and on-premises servers. They protected this data from being exfiltrated using hard-to-tune policies that worked for limited types of sensitive data. Some companies extended this protection with a network control point. Companies like Varonis focused on right-sizing access to these on-premises data stores. However, most of these products required complex implementation, struggled with detection accuracy, and used techniques that hurt system stability and performance.
Innovation cycle 1: As data began moving to SaaS applications, the Cloud Access Security Broker (CASB) market was born. Companies like Skyhigh Networks, Adallom, Netskope, and Palo Alto Networks leveraged both APIs and proxy control points to focus on what data was stored in these sanctioned SaaS applications and to protect sensitive data from being exfiltrated. This was the first true attempt at understanding “what type of data exists in these applications, who has access to this data, and how the data is being accessed and shared.” Unfortunately, most of these solutions only solved the SaaS security problem and never built out a connected platform.
Innovation cycle 2: A new breed of companies came to market to discover and classify data both on on-premises servers and in the cloud. Companies like BigID and Securiti.ai built out privacy, security, and governance use cases on top of a discovery and classification engine. This was a great attempt at building a connected story. However, these solutions focused on data at rest and not data in motion, and they lose visibility once the data is downloaded to endpoints. Some of these solutions rely on machine learning technology to classify data but are complex to implement and manage.
Innovation cycle 3: As cloud data warehouses became more popular, there was a need to understand and protect the data stored in PaaS and IaaS environments. This gave birth to DSPM vendors like Cyera, Normalyze, and Concentric. Most of these companies rely on pattern-matching technology to classify content and are prone to false positives, just like the first-generation solutions. Some have incorporated AI into their detection algorithms and have extended their platforms to govern access. However, they are far from having full visibility and control over the end-to-end data lifecycle.
Many of these products from prior innovation cycles lack end-to-end visibility. These different innovation cycles have also given birth to different market categories like DSPM, CASB, IRM, and SASE, resulting in many siloed tools that create blind spots and redundancies.
For example, without unified visibility, an employee could download sensitive HR data from Workday and email it to a colleague, who then emails it to a personal account, all while flying under the radar of point solutions that aren't correlated. One would also need to understand context. For example, if this employee is downloading their own W-2 forms from Workday during tax season and emailing them to their spouse, that should be acceptable behavior. The lack of end-to-end context and siloed, inefficient policies leads to duplicated effort and, most critically, a higher likelihood of missing threats.
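To make the Workday scenario concrete, here is a minimal Python sketch of what correlation buys you. Everything in it is an assumption for illustration: the event fields, the shared "fingerprint" (for instance, a content hash), the example domain, and the function names are hypothetical and do not describe any particular product. Each event looks harmless to the tool that logged it; only the joined chain shows HR data leaving the organization.

```python
from collections import defaultdict

# Hypothetical events, as separate tools might log them. The shared
# "fingerprint" field is an assumption that makes correlation possible.
EVENTS = [
    {"ts": 1, "tool": "casb", "user": "alice", "action": "download",
     "target": "workday", "fingerprint": "f1"},
    {"ts": 2, "tool": "email", "user": "alice", "action": "send",
     "target": "bob@corp.example", "fingerprint": "f1"},
    {"ts": 3, "tool": "email", "user": "bob", "action": "send",
     "target": "bob@personal.example", "fingerprint": "f1"},
]

INTERNAL_DOMAIN = "corp.example"  # hypothetical company domain


def build_chains(events):
    """Group events by fingerprint, ordered by timestamp."""
    chains = defaultdict(list)
    for event in sorted(events, key=lambda e: e["ts"]):
        chains[event["fingerprint"]].append(event)
    return dict(chains)


def is_external(target):
    """True for an email address outside the internal domain."""
    return "@" in target and not target.endswith("@" + INTERNAL_DOMAIN)


def risky_chains(events, sensitive_sources=("workday",)):
    """Return fingerprints whose chain starts with a download from a
    sensitive source and later includes a send to an external address."""
    flagged = []
    for fingerprint, chain in build_chains(events).items():
        first = chain[0]
        from_sensitive = (first["action"] == "download"
                          and first["target"] in sensitive_sources)
        leaves_org = any(e["action"] == "send" and is_external(e["target"])
                         for e in chain)
        if from_sensitive and leaves_org:
            flagged.append(fingerprint)
    return flagged


print(risky_chains(EVENTS))
```

Drop the download event and the same external email is no longer flagged, because the chain has lost its origin. That is the blind spot uncorrelated tools live with. A real policy would also need the extra context described above, such as whether the record belongs to the person sending it.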
Just as cloud security has consolidated into comprehensive platforms like Wiz and Orca, data security must follow suit. These Cloud-Native Application Protection Platforms (CNAPPs) combine security functionalities that were once managed by separate tools—like Cloud Security Posture Management (CSPM), Cloud Workload Protection Platforms (CWPP), and Cloud Infrastructure Entitlement Management (CIEM). The same level of cohesion is required for data security to effectively safeguard data at rest, in use, and in motion across the enterprise.
The ingredients needed for building the all-in-one data security platform
The case for consolidation is clear: to gain full visibility into the data lifecycle, you need to break down silos. Data no longer lives in just one place, and neither should your security controls. Data can live on your endpoints, servers, SaaS applications, cloud databases, and data warehouses. Every single day, petabytes of data move between these sources and are shared within and outside the organization.
Unparalleled data visibility
It is important to gain visibility into where data resides in an organization. Without visibility into endpoints, servers, SaaS, PaaS, and IaaS data sources, a data security platform would have significant gaps. A modern data security platform must understand and protect both structured and unstructured data equally well.
Accurate classification of data
It is crucial to understand the sensitivity of the data. Traditional approaches have relied on pattern-matching technology, which is severely prone to false positives. But what if you could use context surrounding the data? Understanding where the data originated and how it moved through the organization can allow for very accurate classification without relying solely on content inspection.
This concept of “Lineage-Based Classification” is something Cyberhaven has pioneered and perfected over the years, and it is being used by hundreds of large organizations. Leveraging AI to classify content is also very promising. The best-in-class solution will leverage both context and content inspection to accurately classify data.
Accurate Data Classification = Context-Based Classification + AI-Based Content Classification
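As a rough illustration of that equation, the Python sketch below combines a content signal with a context signal. It is not Cyberhaven's implementation: the regular expression stands in for whatever content classifier (pattern-based or AI-based) a platform uses, and the origin-to-label mapping, the DataItem type, and the function names are all hypothetical.

```python
import re
from dataclasses import dataclass, field

# Stand-in for a content classifier: a simple pattern for US Social
# Security numbers. Real detectors are far richer.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical mapping from origin systems to sensitivity labels.
SENSITIVE_ORIGINS = {"workday": "hr", "salesforce": "customer"}


@dataclass
class DataItem:
    content: str
    # Places the data has been, origin first (assumed to be recorded
    # by the platform as the data moves).
    lineage: list = field(default_factory=list)


def classify_by_content(item):
    """Content signal: label the item if its text matches the pattern."""
    return "pii" if SSN_PATTERN.search(item.content) else None


def classify_by_context(item):
    """Context signal: label the item by the first sensitive system
    found in its lineage, regardless of what the content looks like."""
    for hop in item.lineage:
        label = SENSITIVE_ORIGINS.get(hop)
        if label:
            return label
    return None


def classify(item):
    """Combine both signals into a sorted list of labels."""
    labels = {classify_by_content(item), classify_by_context(item)}
    labels.discard(None)
    return sorted(labels) or ["unclassified"]


plan = DataItem("Q3 headcount plan", ["workday", "laptop", "gmail"])
print(classify(plan))
```

The headcount plan contains nothing a pattern would catch, yet it is labeled "hr" because of where it came from. That is the gap context-based classification is meant to close.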
Understanding the lifecycle of the data
Historically, data security has been siloed in solutions that focus on scanning data at rest and assessing its posture and access, as well as in solutions that actively detect threats and respond.
Solutions like DSPM and DAG (Data Access Governance) sit on the "left" side of the data lifecycle: they assess the risks associated with data from the moment it’s created or ingested into your systems. This often involves data at rest. In addition to scanning content and classifying data, some of these tools also look for important context. For example, when data is stored in the “Board Meetings” folder in OneDrive, you already know vital context, like which people the file is associated with and how it’s used. But as data moves, it loses this context.
Another category of data security tools sits further right. DLP and IRM (Insider Risk Management) tools are laser-focused on preventing data exfiltration and insider threats—the "right" side of the data lifecycle. These tools take actions to protect data at this stage but sometimes lack important context, such as where the data originated and how it was handled or edited—its data lineage.
However, the game is changing. Effective data security requires a shift to both the left and the right—monitoring and protecting data from creation through storage, sharing, and eventual deletion. A modern data security platform connects both sides, enriching detection capabilities with metadata, access permissions, and contextual signals from the moment data is created. This results in more accurate detection, as you’re not solely relying on content inspection but leveraging the rich context associated with the data from the outset.
Connecting all the dots — a Data Security Knowledge Graph
Building a connected data story in terms of where the data is stored, what type of data it is, where it came from, who has access to it, and who is accessing it is game-changing for understanding and protecting data. As we have seen with cloud security, the true winners in this market were the ones with an architecture that could easily add and leverage context.
Building a Data Security Knowledge Graph is a challenging technical problem. In a typical organization, there could be millions of data elements and billions of data flows over time. To store, correlate, and access this information in milliseconds requires a strong architectural foundation. This is one of the reasons why previous attempts at building an end-to-end Data Security Platform have fallen short.
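To give a feel for the data structure involved, here is a toy Python sketch of a data flow graph that can answer "where did this file originally come from?". The class name, the node naming scheme, and the in-memory storage are assumptions for illustration only. A production knowledge graph would need persistent, indexed storage to handle billions of flows at the speeds described above.

```python
from collections import defaultdict, deque


class DataFlowGraph:
    """Toy in-memory graph of data flows (hypothetical design)."""

    def __init__(self):
        # Maps each node to the set of nodes its data was copied from.
        self._parents = defaultdict(set)

    def add_flow(self, source, destination):
        """Record that data moved from source to destination."""
        self._parents[destination].add(source)

    def origins(self, node):
        """Walk flows backwards (breadth-first) and return every
        reachable node that has no recorded parent."""
        seen = {node}
        queue = deque([node])
        roots = set()
        while queue:
            current = queue.popleft()
            parents = self._parents.get(current, set())
            if not parents:
                roots.add(current)
            for parent in parents:
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        return roots


graph = DataFlowGraph()
graph.add_flow("workday:comp_report", "laptop:report.csv")
graph.add_flow("laptop:report.csv", "gmail:attachment")
print(graph.origins("gmail:attachment"))
```

Even in this toy form, the attachment can be traced back to the Workday report it started as. The hard part is doing that across millions of data elements in milliseconds.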
The challenge of unstructured data
The future of data security platforms hinges on their ability to handle both structured and unstructured data. Structured data, while spread across sprawling cloud and on-prem databases, is at least predictable. It’s tabular, neatly organized, and typically comes with defined metadata fields, making it relatively easy to track and protect. This is where DSPM products initially focused before customers pushed them to expand into SaaS and other repositories for a more comprehensive view of their data, much of which is unstructured.
Unstructured data is a completely different beast. From documents and PDFs to images and videos, unstructured data lacks the inherent organization of structured data and is far more difficult to secure. It requires broader visibility across diverse systems: endpoints, email, cloud platforms, and more. It also requires a combination of content and context to classify accurately. Sometimes the line between structured and unstructured data is blurred, for example when someone exports data from a database and creates charts from it, or when AI turns unstructured data back into structured data. A successful data security platform must account for the variety of users, contexts, and data lineage that accompany unstructured data as it flows across the organization.
Platforms for tomorrow’s problems
As we look ahead, the data security platforms that will win aren’t simply applying yesterday’s technologies to tomorrow’s problems. Today’s security challenges require a more sophisticated approach. The traditional method of relying on pattern-matching or even AI-powered content analysis falls short. For instance, a file containing names and phone numbers could either be a sensitive customer list or a public directory downloaded from the internet. Content analysis alone cannot determine the context or risk associated with that data. Instead, security needs to be data-first, meaning that platforms must work directly with the data and its associated metadata to understand where it came from, who’s using it, and whether it’s at risk.
The wave of consolidation sweeping through the data security space is a clear signal of where the industry is heading. The rise of DSPM vendors and the acquisitions of DLP tools mark the beginning of a new era in cybersecurity—one where platforms, not point solutions, lead the way.