The Future of the SOC: How Microsoft Sentinel Is Converging SIEM and Data Lake Architectures


Microsoft CEO Satya Nadella has positioned cybersecurity as a core pillar of the company’s strategy, with a focus on building a unified platform that spans on-prem, cloud, and data. As one of the leading AI organizations through its partnership with OpenAI, Microsoft aims to deliver a scalable platform that enables organizations to detect, protect, and respond across their environments. Microsoft’s security vision is both comprehensive and integrated, covering identity, endpoint, email security, application and data protection, cloud security, SIEM and, importantly, agentic security. This report is the beginning of our Microsoft security coverage. Over the next few months, expect to learn more about Microsoft’s strategic plans around the SOC, identity, and data as we uncover aspects of its platform strategy.

Actionable Summary

  1. A couple of weeks ago, we wrote about the disaggregation happening with security operations. We examined Cribl’s role in disrupting aspects of the data pipeline. Earlier, we covered the larger trends happening with the rise of security data pipelines and how SIEMs must evolve over time to meet current demands.
  2. In April 2024, we also wrote about the changing dynamics within the modern cybersecurity data platform, focusing on how data lakes are disrupting aspects of the traditional SIEM model.
  3. Microsoft, the largest technology provider in cybersecurity, has now launched a key release that builds on a core trend we’ve been tracking. The company has introduced Sentinel data lake in public preview, marking a significant evolution of its cloud-native SIEM capabilities.
  4. This is my first write-up on Microsoft in over two years, since my earlier report, “Microsoft’s $20B Cybersecurity Platform”, so I will use the opportunity to refresh practitioners on the company’s security business. This report offers our analysis of the architectural components, core capabilities, and strategic implications of this offering for the future of SIEM.
  5. At its core, Microsoft Sentinel data lake addresses critical challenges facing security operations teams: the rapid growth of security data volumes, rising storage costs, and the demand for broad threat visibility without exceeding budgets. By introducing a unified, cost-effective storage layer with advanced analytics support, Microsoft is evolving into a comprehensive security platform, not just a traditional SIEM.

Microsoft Security Portfolio


Diagram of the Microsoft Security portfolio including Sentinel, Defender, Entra, Purview, and Intune.

This is a good opportunity to remind practitioners how Microsoft’s security portfolio is organized: five major product lines, plus Security Copilot, that work together to provide comprehensive protection across an organization’s digital estate.

    1. Microsoft Sentinel: Microsoft’s cloud-native Security Information and Event Management (SIEM) and Security Orchestration, Automation, and Response (SOAR) platform. It serves as the central repository for security data from across the Microsoft security ecosystem and beyond. Sentinel is the focus of our analysis today and is covered in depth later in the report.
    2. Microsoft Defender: Microsoft’s flagship cybersecurity product, which protects workloads across endpoints, applications, and the cloud. It is broken down into four areas: Defender for Cloud, Defender for Cloud Apps, Defender for Endpoint and XDR, and Defender for IoT and vulnerability management. This is the product that competes against CrowdStrike and SentinelOne. We discuss Defender here because of its close relationship with Sentinel and the data lake. Microsoft Defender includes modular products for various domains:
      • Defender for Endpoint delivers EDR, XDR, threat and vulnerability management, and auto-remediation.
      • Defender for Identity protects Active Directory environments against lateral movement and insider threats.
      • Defender for Office 365 secures collaboration tools (email, Teams) from phishing, BEC, and ransomware.
      • Defender for Cloud provides CWPP + CSPM capabilities for hybrid/multi-cloud environments. It helps strengthen cloud security posture, protect workloads, and unify DevOps security across hybrid and multicloud environments, providing unified security from code to runtime.
      • Defender for Cloud Apps (CASB) enables SaaS visibility and control, and integrates with Entra SSE for SaaS security edge enforcement.
    3. Microsoft Entra: Microsoft’s identity security platform. Entra includes Entra ID (formerly Azure AD) and addresses workforce, workload, and consumer identity use cases. It provides directory services, workload identities, and identity governance to ensure only authorized users access systems and data, along with decentralized identity (Verified ID) and Zero Trust SSE access (Internet Access and Private Access). The products in the Entra family help provide secure access to everything for everyone through identity and access management, cloud infrastructure entitlement management, identity verification, and identity governance.

With roughly 26.5% market share in IAM (per analyst estimates), Entra is the clear market leader. Its deep integration with Office 365 and Windows environments gives it unique leverage over standalone IAM vendors like Okta. Entra provides identity and access management for workforces, collaborators, customers, workloads, and AI agents. Microsoft Entra ID offers multi-cloud IAM with phishing-resistant authentication and adaptive conditional access. The Entra Suite adds identity governance, risk-based identity protection, decentralized identity verification, and Zero Trust SSE for securing access to public and private networks.

This also includes CIAM capabilities and access management for workloads and AI agents. With Entra Internet Access and Private Access, Microsoft has entered the SSE market, aiming to integrate network and identity management, competing with vendors like Zscaler, Netskope, and Palo Alto Networks.

    4. Microsoft Purview: Microsoft’s solution for data security, compliance, and governance, helping organizations protect sensitive information across their environments. Purview unifies data governance, compliance, and DLP into a cloud-native platform spanning Microsoft 365, Azure, and third-party data sources. It includes tools for audit logging, data mapping, insider risk management, lifecycle governance, and adaptive DLP. Microsoft positions Purview as the backbone for modern GRC and data protection efforts, aiming to replace fragmented compliance tooling.
      • Adjacent to Purview is Priva, Microsoft’s data privacy solution focused on risk management and data subject rights. It automates data access requests and identifies overexposed data or privacy risks using policy templates. As privacy regulations expand (e.g., GDPR, CCPA), Priva is Microsoft’s attempt to fill enterprise privacy gaps without relying on third-party GRC tools.
    5. Microsoft Intune: Microsoft’s Unified Endpoint Management (UEM) suite, which lets companies deploy and manage all their devices, including computers, servers, and mobile devices, through a single console (similar to Tanium). It combines mobile device management, endpoint security, and privilege management. Integrated natively with Defender, Entra ID, and Windows Autopilot, it provides full policy enforcement and conditional access. The Intune Suite offers advanced tools like Remote Help and Endpoint Privilege Management, designed to enforce least privilege and enable Zero Trust endpoint control at scale.
    6. Microsoft Security Copilot: Microsoft’s major security launch since the ChatGPT moment. Security Copilot is a generative AI assistant for security professionals, integrated into products like Sentinel, Defender, Intune, and Entra. It allows natural language threat hunting, investigation, and response workflows. In testing, it improved junior analyst accuracy by 35% and senior analyst speed by 22%. At $4 per hour (per Security Compute Unit), its usage-based pricing model reflects its role as an assistive (not autonomous) AI layer. Security Copilot also powers a growing set of AI agents that automate routine tasks, accelerate triage, and scale incident response across the Microsoft Security portfolio.

The SIEM Market Today vs Microsoft Sentinel


Chart showing the competitive SIEM market landscape.

Our firm has written extensively about the competitive pressures within the SIEM market. In 2025, the market is undergoing a major transformation driven by cost pressures, data volume challenges, and the rise of modular architectures. As traditional SIEMs strain under exponential data growth, the market is shifting from monolithic systems to more flexible, cost-effective solutions.

Microsoft Sentinel has evolved significantly since its 2019 launch, positioning itself as more than just a cloud-native SIEM. With the introduction of Sentinel data lake, Microsoft is addressing the industry’s biggest pain point: the tradeoff between comprehensive visibility and escalating storage costs. According to Microsoft, the data lake stores data at less than 15% of traditional analytics log costs, allowing security teams to maintain full visibility without budget compromises.

Logos of major SIEM market competitors.

In comparison to key competitors:

  • Splunk (now Cisco): Remains deeply entrenched in large enterprises with over $4B in ARR as of 2024. While maintaining significant market share, Splunk struggles with its shift from on-premises to cloud and is notorious for high costs. The recent Cisco acquisition has raised concerns among customers about future direction.
  • Elastic: Has gained popularity, particularly in the public sector, due to competitive pricing. Like Splunk, Elastic has introduced data tiering options to address cost concerns, but lacks the comprehensive security ecosystem that Microsoft offers.
  • Palo Alto XSIAM: Positioned as a comprehensive platform merging SIEM, XDR, ASM, and SOAR capabilities. XSIAM is powerful and gaining traction, and it builds on Google’s BigQuery.

Microsoft Sentinel, Splunk, Elastic, and Palo Alto XSIAM are the major providers in the SIEM market today. In our view, however, several of them fall short of the breadth of Microsoft’s capabilities.


Microsoft Sentinel (Core SIEM)


Microsoft Sentinel is the company’s core SIEM product, competing with other major players in the space. Since its general availability in September 2019, Sentinel has rapidly evolved into a leading cloud-native Security Information and Event Management (SIEM) solution for SOC teams.

Offering cloud-scale analytics, Sentinel provides organizations with a wide view across their digital estate by collecting security data from diverse sources. These include Microsoft 365, Microsoft Defender, Azure, Entra ID, third-party cloud providers like AWS and Google Cloud, and over 350 third-party tools and custom integrations. Its cloud-native architecture eliminates the need for on-prem infrastructure and supports rapid deployment through a flexible, consumption-based pricing model.

Microsoft Sentinel is built to detect previously undiscovered threats and correlate seemingly unrelated alerts into actionable incidents. One of its key strengths is its use of Kusto Query Language (KQL), which gives security teams robust capabilities for crafting custom detections and conducting in-depth threat hunting. Analysts frequently compare KQL favorably to Splunk’s SPL, citing its flexibility and depth.

Furthermore, its integrated SOAR capabilities, powered by Azure Logic Apps, enable automated responses to common security incidents, reducing manual effort. In terms of capabilities, Sentinel’s “better together” narrative with Microsoft Defender is a standout, providing a unified view of threats from endpoint to cloud. Security analysts laud the seamless integration with Azure services and Microsoft 365, which streamlines data collection and enhances visibility.

One of the key benefits of Sentinel is its ability to scale to meet the needs of organizations of all sizes. It’s a cloud-native system, which means it can handle large volumes of data, and can be easily scaled up or down as needed. This makes it an ideal solution for organizations that are growing quickly, or that need to process large volumes of security data.


Architectural and Technological Components


Architectural diagram of Microsoft Sentinel data lake.

Microsoft Sentinel data lake introduces a security-optimized data tier that extends the existing Sentinel SIEM architecture. Built on Microsoft’s OneLake infrastructure, the lake adds a unified storage layer that merges three key components:

  1. Activity Store (containing security logs and activity data),
  2. Asset Store (data from Microsoft and non-Microsoft sources, used for exposure management and future post-breach analysis),
  3. Threat Intelligence Store (integrating Microsoft Defender Threat Intelligence).

Technically, the platform uses open formats such as Delta Parquet. This allows compatibility with multiple analytics engines and ensures long-term flexibility. Integration with the Microsoft Defender portal provides a more unified experience across the security operations stack. An added architectural feature is the mirroring of data between tiers. Data placed in the analytics tier is automatically available in the data lake, ensuring the lake serves as the system of record for all security data.

Diagram showing the data flow and storage tiers in Microsoft Sentinel.

The data lake operates as an extension of Sentinel rather than a replacement, allowing security teams to configure data routing between the analytics tier and data lake tier, based on their specific needs. This enables organizations to optimize their data management strategy without disrupting existing workflows. Microsoft has designed the system to support data retention for up to 12 years, addressing both immediate security needs and long-term compliance requirements.


Key Capabilities


Many capabilities have already launched and more will follow, but two standout features are worth highlighting for cybersecurity teams.

Federation Support

A core capability of Microsoft Sentinel data lake is its support for data federation, enabling integration with external repositories such as Databricks, Snowflake, S3, and ADLS. This addresses a common challenge for enterprises with large data ecosystems by reducing the need for data migration projects. Rather than requiring organizations to move all security data into Microsoft-controlled storage, the federation feature allows them to retain data in existing repositories while using Sentinel’s analytics layer.

The federation model is designed to be transparent to users. When configuring the data lake, customers can specify external data sources, which then appear as tables in the query experience. This allows analysts to run queries across both data stored in Microsoft’s lake and federated sources. For organizations with existing investments in platforms like Snowflake or Databricks, this offers continuity with their current data infrastructure while enabling security use cases through Sentinel.
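To make the "external sources appear as tables" idea concrete, here is a minimal sketch in plain Python. It is not Sentinel’s API: the `Catalog` and `Table` classes, the origin labels, and the sample rows are all hypothetical, and the `scan` callables stand in for whatever connector a real federated source would use.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator

Row = dict[str, str]


@dataclass
class Table:
    """A queryable table; `origin` records where the rows physically live."""
    name: str
    origin: str  # made-up labels such as "sentinel-lake" or "federated:snowflake"
    scan: Callable[[], Iterable[Row]]  # stand-in for a real connector


class Catalog:
    """Hypothetical catalog listing native and federated tables side by side."""

    def __init__(self) -> None:
        self._tables: dict[str, Table] = {}

    def register(self, table: Table) -> None:
        self._tables[table.name] = table

    def query(self, names: list[str], predicate: Callable[[Row], bool]) -> Iterator[Row]:
        # Scan each requested table and tag every match with its origin,
        # so one query spans lake-resident and externally stored data.
        for name in names:
            table = self._tables[name]
            for row in table.scan():
                if predicate(row):
                    yield {**row, "_origin": table.origin}


catalog = Catalog()
catalog.register(Table("SigninLogs", "sentinel-lake",
                       lambda: [{"user": "alice", "ip": "203.0.113.7"}]))
catalog.register(Table("vpn_events", "federated:snowflake",
                       lambda: [{"user": "bob", "ip": "198.51.100.2"},
                                {"user": "carol", "ip": "203.0.113.7"}]))

# One predicate, evaluated across both the native and the federated table.
hits = list(catalog.query(["SigninLogs", "vpn_events"],
                          lambda r: r["ip"] == "203.0.113.7"))
for hit in hits:
    print(hit["_origin"], hit["user"])
```

The point of the sketch is that the analyst writes the filter once, and the catalog hides where each table is stored.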

This approach supports organizations with hybrid or multi-cloud strategies and those that prefer to avoid vendor lock-in. By enabling federation across multiple platforms, Microsoft is aligning with common enterprise data architectures. Sentinel data lake can function as a security overlay that works alongside existing systems.

Microsoft has clarified that federation capabilities will be implemented gradually, with full rollout expected by the end of calendar year 2025.

Security Analytics Capabilities

Microsoft Sentinel data lake is not simply a cost-effective storage option. It includes a fully managed analytics engine that supports advanced operational capabilities. The engine works with KQL, maintaining continuity with existing Sentinel environments, and also supports Python notebooks via a VS Code extension. This opens the door to new security data science use cases.

The analytics engine is particularly helpful for historical data analysis, supporting critical security functions such as retroactive threat intelligence matching, forensic investigations, and anomaly detection. Security teams can create asynchronous jobs to perform deep analysis over high-volume logs, helping identify low-and-slow attack patterns that might otherwise go undetected. This approach allows security teams to balance immediate, real-time analysis with thorough historical investigation without performance compromises.
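As an illustration of the kind of historical job described above, the following sketch flags "low-and-slow" sources: addresses that fail logins on many distinct days but never often enough in one day to trip a real-time threshold. It is plain Python over an in-memory list, not a Sentinel job definition, and the function name and every threshold are assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def low_and_slow_sources(events, window_days=90, min_days_active=20, max_per_day=5):
    """Return sources that fail logins on many days but only a few times per day.

    `events` is a list of (timestamp, source_ip, outcome) tuples.
    All thresholds are illustrative, not recommended values.
    """
    if not events:
        return []
    latest = max(ts for ts, _, _ in events)
    cutoff = latest - timedelta(days=window_days)

    # failures[source][date] -> number of failed attempts on that day
    failures = defaultdict(lambda: defaultdict(int))
    for ts, source, outcome in events:
        if outcome == "failure" and ts >= cutoff:
            failures[source][ts.date()] += 1

    flagged = []
    for source, per_day in failures.items():
        persistent = len(per_day) >= min_days_active
        quiet = max(per_day.values()) <= max_per_day
        if persistent and quiet:
            flagged.append(source)
    return sorted(flagged)


start = datetime(2025, 1, 1)
events = []
for day in range(60):
    # One quiet failure per day from the same address for two months.
    events.append((start + timedelta(days=day), "198.51.100.9", "failure"))
# A noisy burst on a single day, which a real-time rule would already catch.
events += [(start + timedelta(days=59, seconds=i), "203.0.113.50", "failure")
           for i in range(500)]

print(low_and_slow_sources(events))
```

A pattern like this only becomes visible when months of high-volume logs stay queryable, which is the argument the data lake tier is making.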

A notable feature is the integration of GitHub Copilot with the Python notebook environment, bringing AI-assisted analytics to security operations. This helps bridge skill gaps in security teams by allowing analysts who may not be proficient in Python to leverage data science capabilities through natural language interactions. The underlying Spark engine provides the computing power needed for sophisticated machine learning models and data analysis across massive datasets, enabling security teams to apply advanced techniques to their security data.

Rather than simply parking data in cold storage where it becomes difficult to access, Microsoft’s approach ensures that all data remains analyzable. This fundamentally changes the equation for security teams who have traditionally had to make difficult choices about which data to retain for analysis versus which to archive due to cost constraints.


Additional Core Features


Support for Multiple Data Formats

Microsoft Sentinel data lake provides robust support for multiple data formats, significantly enhancing its ability to normalize and analyze diverse security data. It currently supports advanced hunting schema and ASIM (Advanced Security Information Model), with OCSF (Open Cybersecurity Schema Framework) support on the roadmap. This multi-schema approach addresses one of the most challenging aspects of security data management – the inconsistent formats and taxonomies across different security tools and data sources.

The support for standard schemas enables more effective correlation and search capabilities across disparate data sources. By normalizing security events into consistent formats, Microsoft enables security teams to develop detection rules and analytics that work consistently regardless of the original data source. This approach also provides data transformation capabilities at various stages of the data lifecycle (during ingestion, through ETL jobs, or at query time), giving organizations flexibility in how they standardize their security data.
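The value of normalization is easiest to see in a small example. The sketch below maps two made-up vendor formats onto one shared schema so that a single detection predicate covers both. The target field names are ASIM-style but were chosen here for illustration only, and the vendor names, mappings, and result values are assumptions; consult the actual schema documentation before relying on any of them.

```python
# Per-source mappings from vendor field names to one shared schema.
# Target names are ASIM-style but chosen here for illustration only.
FIELD_MAPS = {
    "vendor_a_firewall": {"src": "SrcIpAddr", "dst": "DstIpAddr", "act": "EventResult"},
    "vendor_b_proxy": {"client_ip": "SrcIpAddr", "server_ip": "DstIpAddr", "status": "EventResult"},
}

# Harmonize vendor-specific outcome values into two shared labels.
RESULT_VALUES = {"allow": "Success", "200": "Success", "deny": "Failure", "403": "Failure"}


def normalize(source: str, raw: dict) -> dict:
    """Rename vendor fields to the shared schema and harmonize result values."""
    mapping = FIELD_MAPS[source]
    event = {target: raw[field] for field, target in mapping.items() if field in raw}
    event["EventResult"] = RESULT_VALUES.get(str(event.get("EventResult")).lower(), "NA")
    event["EventProduct"] = source
    return event


def is_blocked_to(event: dict, dst: str) -> bool:
    """A single detection predicate that works for every normalized source."""
    return event["EventResult"] == "Failure" and event.get("DstIpAddr") == dst


events = [
    normalize("vendor_a_firewall", {"src": "10.0.0.5", "dst": "192.0.2.10", "act": "DENY"}),
    normalize("vendor_b_proxy", {"client_ip": "10.0.0.8", "server_ip": "192.0.2.10", "status": 403}),
]
print([e["SrcIpAddr"] for e in events if is_blocked_to(e, "192.0.2.10")])
```

Whether the mapping runs at ingestion, in an ETL job, or at query time is a deployment choice; the detection logic stays the same in each case.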

This matters because Microsoft is now supporting open formats rather than proprietary ones. It represents a shift in the company’s approach to security data: an acknowledgment of the increasingly heterogeneous nature of enterprise security environments, and a positioning of Sentinel data lake as an open integration layer rather than a closed ecosystem.

Integration of Microsoft Defender Threat Intelligence

A significant value proposition of Microsoft Sentinel data lake is the inclusion of Microsoft Defender Threat Intelligence (MDTI) at no additional licensing cost. This removes the licensing barrier previously required for access to threat intel data and gives security teams immediate visibility into Microsoft’s threat intelligence corpus. That corpus is built from over 84 trillion signals and supported by more than 10,000 security professionals.

Integration goes beyond raw data access. It enables retroactive threat matching, allowing analysts to scan historical logs for indicators of compromise as new threats emerge. The threat intelligence store operates alongside activity and asset data in the unified storage layer, supporting more comprehensive investigations and contextual correlation.
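Retroactive matching can be sketched in a few lines. The example below scans already-stored events for indicators that were published only later, and reports how long each hit went unnoticed. It is a simplified illustration in plain Python: the `retro_match` function, the tuple layout, and the sample addresses and dates are all hypothetical, not part of MDTI or Sentinel.

```python
from datetime import date


def retro_match(history, indicators):
    """Find past events that touch indicators published only afterwards.

    history: list of (event_date, ip) tuples already held in long-term storage.
    indicators: dict mapping ip -> date the indicator was published.
    Returns (event_date, ip, days_undetected) for each retroactive hit.
    """
    hits = []
    for event_date, ip in history:
        published = indicators.get(ip)
        # Only events that predate the indicator are "retroactive" hits;
        # later events would be caught by ordinary real-time matching.
        if published is not None and event_date < published:
            hits.append((event_date, ip, (published - event_date).days))
    return hits


history = [
    (date(2025, 2, 3), "203.0.113.77"),
    (date(2025, 3, 14), "198.51.100.23"),
    (date(2025, 6, 1), "203.0.113.77"),
]
indicators = {"203.0.113.77": date(2025, 5, 20)}

for event_date, ip, gap in retro_match(history, indicators):
    print(f"{event_date} {ip} seen {gap} days before the indicator was published")
```

The February event is the interesting one: it was benign-looking at the time and only becomes a lead because the log was retained until the indicator appeared.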


Benefits for Microsoft Customers


Cost Optimization

The cost of running a SIEM is a well-known complaint among security teams. Perhaps the most immediate and tangible benefit of Microsoft Sentinel data lake is therefore cost optimization for security data storage. Microsoft claims that storing data in the Sentinel data lake costs less than 15% of traditional analytics log storage costs. This addresses one of the most significant pain points for security teams, who have long struggled with the trade-off between comprehensive security visibility and escalating data storage costs.

The cost structure allows organizations to retain security data for much longer periods without compromising their budgets. Instead of being forced to make difficult choices regarding which data to retain for analysis, organizations can now maintain comprehensive visibility across their security landscape. This extended retention enables more thorough investigations, better compliance with regulatory requirements, and the ability to conduct retrospective analysis as new threats emerge.

The cost benefits extend beyond simple storage savings. By providing configurable data routing between the analytics tier and data lake tier, Microsoft enables organizations to optimize their spending based on the criticality and analytics needs of different data types. High-volume, relatively noisy data sources can be directed to the more cost-effective data lake storage while maintaining their analytical value, rather than either being deleted or consuming expensive analytics-tier resources.
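A rough back-of-the-envelope calculation shows how routing affects the bill. The sketch below compares keeping everything in the analytics tier with routing the noisy sources to the lake. The per-GB price is a placeholder input rather than a Microsoft list price, the source names and volumes are invented, and the 0.15 ratio simply treats the "less than 15%" claim as an upper bound.

```python
def monthly_storage_cost(gb_by_source, lake_sources, analytics_price_per_gb, lake_ratio=0.15):
    """Compare an all-analytics layout with a tiered layout.

    analytics_price_per_gb is a placeholder input, not a Microsoft list price.
    lake_ratio=0.15 treats the "less than 15%" claim as an upper bound.
    """
    all_analytics = sum(gb_by_source.values()) * analytics_price_per_gb
    tiered = 0.0
    for source, gb in gb_by_source.items():
        # Sources routed to the lake pay only a fraction of the analytics price.
        ratio = lake_ratio if source in lake_sources else 1.0
        tiered += gb * analytics_price_per_gb * ratio
    return all_analytics, tiered


# Hypothetical monthly volumes in GB; firewall and netflow are the noisy sources.
volumes = {"identity": 200, "endpoint": 300, "firewall": 1500, "netflow": 2000}
baseline, tiered = monthly_storage_cost(volumes, {"firewall", "netflow"},
                                        analytics_price_per_gb=1.0)
print(baseline, tiered, f"{1 - tiered / baseline:.0%} saved")
```

With these made-up numbers the tiered layout costs roughly a quarter of the baseline, because the two high-volume sources dominate the total. Real savings depend on the actual mix of sources and on query charges, which this sketch ignores.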

Additional cost benefits come from the inclusion of Microsoft Defender Threat Intelligence at no additional cost and the ability to leverage existing data repositories through federation, avoiding expensive data migration projects and maximizing return on existing investments in data infrastructure.


Broader Implications for the Future of SOC


Microsoft Sentinel data lake signals a shift in how security operations centers will function moving forward. It reflects an evolution from alert-centric to data-centric security operations, where the comprehensive collection, retention, and analysis of security data becomes the base layer for both human-led and AI-driven workflows.

This shift enables several key changes in SOC operations. First, it allows for more proactive security practices. Teams are no longer limited to reacting to real-time alerts. With access to long-term historical data, they can hunt for threats across extended timelines. As threat intelligence improves, analysts can re-evaluate previously collected data to identify indicators that may have gone unnoticed.

Second, the data lake architecture lays a strong foundation for AI applications in security. By centralizing normalized data and supporting both KQL and Python-based analytics, Microsoft is building the conditions needed for AI to become a practical tool in day-to-day operations. The GitHub Copilot integration marks an early step toward embedding AI into the analyst workflow.

Third, this shift will reshape the makeup of SOC teams. While traditional security analyst skills remain important, organizations will increasingly need team members with data science experience and Python proficiency. Microsoft’s use of GitHub Copilot may help bridge this transition, but the broader trend toward data-centric operations will continue to influence hiring and team structures.

Finally, Microsoft’s architectural choices, including federation and open format support, signal a move toward platforms that serve as integration layers across diverse environments. This reflects the reality of most enterprise infrastructures today, which often span multiple clouds, on-prem systems, and SaaS platforms.


The future of the SOC involves a SIEM-data lake convergence


Overall, we believe that the future of the SOC will bring about a convergence of SIEM and data lake technologies. This will represent one of the biggest architectural shifts in security operations since 2023. Traditional SIEMs were designed as monolithic systems that both stored and analyzed security data, often charging based on data ingestion volumes. As data volumes have grown exponentially, with many organizations seeing log volumes double every 18 months, this model has become economically unsustainable.

The biggest change in recent years is the separation of storage and analytics tiers. Modern SIEM architectures are increasingly modular, decoupling the analytics layer from the storage layer. Microsoft Sentinel data lake exemplifies this approach, allowing security teams to configure data routing between the analytics tier and the data lake tier based on their specific needs. This enables organizations to optimize their data management strategy without disrupting existing workflows.
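To see why ingestion-based pricing becomes hard to sustain, it helps to work through the growth claim. The sketch below assumes smooth exponential growth with an 18-month doubling period, which is a simplification of the "doubling every 18 months" observation; the starting volume of 100 GB per day is an arbitrary example.

```python
def projected_volume(current_gb_per_day: float, months: float,
                     doubling_months: float = 18.0) -> float:
    """Exponential growth: volume doubles every `doubling_months`."""
    return current_gb_per_day * 2 ** (months / doubling_months)


# Project an arbitrary 100 GB/day starting point over 1.5, 3, and 5 years.
for years in (1.5, 3, 5):
    print(years, round(projected_volume(100, years * 12)))
```

Under this assumption, daily volume quadruples in three years and grows roughly tenfold in five, so a bill that scales linearly with ingestion grows at the same rate.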

Cost-optimized tiered storage is, once again, a big deal for organizations. The query layer model allows security teams to store high-volume data in significantly cheaper storage options like Azure Blob Storage, Snowflake, or other cloud object storage, while maintaining query capabilities. Microsoft claims storing data in the Sentinel data lake costs less than 15% of traditional analytics log storage. This addresses one of the most significant pain points for security teams who have long struggled with the trade-off between comprehensive visibility and budget constraints.


Conclusion


Microsoft Sentinel data lake represents a strategic evolution in Microsoft’s security portfolio. It addresses critical market needs around storage cost, data integration, and advanced analytics. By combining cost-effective storage with full analytics capabilities, the offering removes traditional trade-offs between visibility and budget constraints.

Federation with external repositories and the bundled access to Microsoft Defender Threat Intelligence provide immediate operational value. Support for multiple data formats and schema models gives organizations the flexibility needed in mixed environments. Together, these capabilities enable more scalable and sophisticated security operations, including retroactive analysis and anomaly detection.

Microsoft has also indicated that the data lake is only the beginning. Additional platform capabilities are expected in September 2025, as part of a larger roadmap.

For security leaders, Microsoft Sentinel data lake offers near-term benefits around cost and data access, while also serving as a foundation for longer-term changes in how security teams operate. Organizations evaluating this offering should consider how it aligns with their existing Microsoft investments and prepare for the architectural and skill shifts that come with a more data-centric approach.

By addressing immediate operational challenges and enabling future growth in AI-driven security, Microsoft Sentinel data lake marks a meaningful advancement in the evolution of security operations.

A summary diagram of the Microsoft security ecosystem.

Future Reports

This is the beginning of our coverage of Microsoft capabilities. Please expect more in-depth reports covering their platform strategy, especially relative to other key players across the cybersecurity ecosystem. We will publish more reports on the future of the SOC in the upcoming months.

