When Amazon Security Lake Is Not Enough on Its Own: The Missing Layer Most Organisations Overlook
Amazon Security Lake is a genuinely significant development in security data architecture. The ability to centralise security data from across an AWS environment, third-party sources and on-premises infrastructure into a single, standardised store, built on open formats and designed for interoperability, addresses a problem that has frustrated security teams for years. The interest from CISOs and security architects is well-founded.
But there is a pattern that emerges consistently once organisations move from evaluation to live deployment, and it is worth talking about honestly. Amazon Security Lake is a powerful foundation. On its own, it is not a complete security operations capability. The gap between what it delivers out of the box and what security teams need to detect, investigate and respond effectively is larger than many organisations anticipate, and closing that gap requires investment in a layer that the platform itself does not provide.
This is not a criticism of Amazon Security Lake. It is an observation about what the platform is and what it is not, and about the additional architecture that organisations need to build around it if they want it to deliver on its promise.
What Amazon Security Lake Gets Right
It is worth starting here, because the foundation is genuinely strong and it matters for understanding why the gaps that follow are architectural rather than fundamental.
Security Lake solves the aggregation problem. Pulling security data from AWS services, third-party tools, custom sources and on-premises environments into a single store has historically required significant engineering effort and ongoing maintenance. Security Lake reduces that burden considerably, particularly for organisations already operating within the AWS ecosystem.
It solves part of the normalisation problem. The adoption of the Open Cybersecurity Schema Framework as the standard schema for data ingested into Security Lake is an important step toward interoperability. OCSF provides a common structure that allows different data sources to be queried and compared without the kind of bespoke transformation work that has traditionally consumed significant analyst and engineering time.
It solves the storage economics problem. Security Lake is built on Amazon S3, which means that the cost of retaining large volumes of security data for extended periods is substantially lower than equivalent retention in a traditional SIEM. For organisations that have been making uncomfortable compromises on retention periods to control costs, this is a meaningful change.
And it solves the access problem. Data stored in Security Lake can be queried by a growing ecosystem of compatible tools, which means security teams are not locked into a single vendor for analytics and detection. That flexibility is valuable.
These are real and substantial benefits. They are also the starting point, not the destination.
Where the Gaps Appear
The gaps that organisations discover after going live with Security Lake tend to fall into a consistent set of categories. Understanding them in advance is considerably more useful than discovering them through operational experience.
Normalisation Is Incomplete Without Enrichment
OCSF provides a structural standard, but structure alone does not produce the kind of enriched, contextualised data that security operations teams need. Raw log data normalised into OCSF fields still lacks the contextual layer that makes it operationally useful: asset classification, user context, business criticality, threat intelligence enrichment, geolocation, identity resolution.
Without that enrichment layer, analysts working with Security Lake data are working with well-structured raw material that still requires significant manual effort to interpret. The normalisation that OCSF provides is necessary but not sufficient. The enrichment that sits on top of it determines whether the data is genuinely actionable.
Building that enrichment layer requires decisions about:
- Which threat intelligence feeds are integrated, and how that intelligence is applied to incoming data at ingestion.
- How assets are classified and tagged so that the same IP address or hostname carries context about what it represents in the business.
- How identity data from directory services is resolved against log events so that user activity is attributable to real individuals rather than service accounts or generic identifiers.
- How business context, such as environment classification and data sensitivity, is attached to events so that the same alert can be weighted differently depending on whether it fires in a production environment or a development one.
This is substantive work. It does not happen automatically when you deploy Security Lake, and it does not happen once. It requires ongoing maintenance as the environment changes.
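As a rough illustration of the decisions listed above, the following minimal sketch attaches asset, identity and threat intelligence context to a single event. It assumes events arrive as dictionaries with OCSF-style nested fields (`src_endpoint.ip`, `dst_endpoint.ip`, `actor.user.name`). The lookup tables, the `enrich_event` function and the `x_context` key are all hypothetical names chosen for this example; they are not part of Security Lake or of the OCSF schema.

```python
# Hypothetical lookup tables. In practice these would be fed from an asset
# inventory, a directory service and one or more threat intelligence feeds.
ASSET_INVENTORY = {
    "10.0.1.15": {"hostname": "pay-db-01", "environment": "production", "criticality": "high"},
    "10.0.9.40": {"hostname": "dev-build-03", "environment": "development", "criticality": "low"},
}
IDENTITY_DIRECTORY = {
    "svc-deploy": {"owner": "platform-team", "is_service_account": True},
    "jsmith": {"owner": "Jane Smith", "is_service_account": False},
}
THREAT_INTEL_IPS = {"203.0.113.7"}  # documentation-range address, illustrative only


def enrich_event(event: dict) -> dict:
    """Return a copy of the event with business and threat context attached."""
    src_ip = event.get("src_endpoint", {}).get("ip")
    dst_ip = event.get("dst_endpoint", {}).get("ip")
    user = event.get("actor", {}).get("user", {}).get("name")

    enriched = dict(event)  # shallow copy so the caller's event is not modified
    # "x_context" is an assumed, non-OCSF key used only for this sketch.
    enriched["x_context"] = {
        "asset": ASSET_INVENTORY.get(dst_ip),        # what the target represents
        "identity": IDENTITY_DIRECTORY.get(user),    # who or what acted
        "threat_intel_match": src_ip in THREAT_INTEL_IPS,
    }
    return enriched
```

The lookup tables are the part that needs ongoing maintenance: the function itself rarely changes, but the inventory, directory and intelligence data behind it go stale as the environment does.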
Querying Security Lake Requires Expertise That Many Teams Do Not Have
Security Lake stores data in Apache Parquet format, queryable via Amazon Athena. For data engineers and cloud architects, this is familiar territory. For SOC analysts whose primary tooling has been a SIEM with a proprietary query language and a purpose-built interface, it is not.
The practical consequence is that the data in Security Lake is accessible in principle but inaccessible in practice for the analysts who need to use it day-to-day. Investigation workflows that worked efficiently in a SIEM break down when analysts are expected to write Athena SQL against Parquet-backed tables. Threat hunting that was achievable with a SIEM’s built-in search capabilities becomes significantly harder without a purpose-built interface sitting in front of the data.
This is not insurmountable, but it requires deliberate investment either in capability building (training analysts on the relevant tooling) or in an abstraction layer that presents Security Lake data through an interface that security operations teams can work with effectively. Without one or the other, the data exists but the operational value does not.
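One small piece of such an abstraction layer might look like the sketch below: a helper that turns an analyst-level question ("what did this user do in this time window?") into Athena SQL. It is an illustration under stated assumptions, not a reference implementation. The function name is hypothetical, the table name is supplied by the caller, and the column names (`time` as epoch milliseconds, `actor.user.name`, `api.operation`, `src_endpoint.ip`) are assumed OCSF-style fields that should be checked against the tables in the actual deployment.

```python
import re
from datetime import datetime, timezone

# Conservative allow-lists so analyst input cannot alter the query structure.
_TABLE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")
_USERNAME = re.compile(r"^[A-Za-z0-9_.@\-]+$")


def build_user_activity_query(table: str, username: str,
                              start: datetime, end: datetime,
                              limit: int = 100) -> str:
    """Build an Athena SQL string listing one user's activity in a time window."""
    if not _TABLE.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not _USERNAME.match(username):
        raise ValueError(f"unexpected username: {username!r}")
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")

    # Assumption: the table exposes "time" as epoch milliseconds.
    start_ms = int(start.timestamp() * 1000)
    end_ms = int(end.timestamp() * 1000)

    return (
        f'SELECT "time", api.operation, src_endpoint.ip '
        f"FROM {table} "
        f"WHERE actor.user.name = '{username}' "
        f'AND "time" BETWEEN {start_ms} AND {end_ms} '
        f'ORDER BY "time" DESC '
        f"LIMIT {int(limit)}"
    )
```

The resulting string could then be submitted through whatever Athena client the team already uses. The point of the design is that the analyst supplies a username and a time range, and the SQL, table layout and input validation stay behind the interface.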
Detection Is Not Included
This is the gap that surprises organisations most consistently. Security Lake is a data store. It does not include native detection capability. There is no engine continuously evaluating incoming data against detection logic, no alert queue for analysts to work from, no correlation of events across sources to identify patterns that individual events would not reveal.
For organisations migrating from a SIEM, this is a significant adjustment. The SIEM, whatever its cost and scalability problems, provided an integrated detection and alerting capability. Security Lake requires that detection capability to be built or integrated separately.
That might mean integrating a dedicated detection engine that queries Security Lake on a scheduled or streaming basis. It might mean connecting Security Lake to a SIEM that handles detection while the lake handles storage and long-term retention. It might mean building a custom detection layer using AWS-native services. What it cannot mean is assuming that the detection problem is solved by deploying Security Lake alone.
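To make the idea of a separately built detection layer concrete, here is a minimal sketch of one rule that a scheduled job could run over a batch of events already retrieved from the lake. It flags users with a burst of failed authentications inside a sliding time window. The event shape is a simplified, flattened stand-in loosely modelled on OCSF (the `class_uid` value 3002 and the `"Failure"` status string are assumptions to verify against the schema version in use), and the function name and thresholds are hypothetical.

```python
from collections import defaultdict


def detect_failed_login_bursts(events, threshold=5, window_ms=300_000):
    """Return one alert per user with >= threshold failures within window_ms."""
    failures_by_user = defaultdict(list)
    for event in events:
        # Assumed simplified fields: class_uid, status, user, time (epoch ms).
        if event.get("class_uid") == 3002 and event.get("status") == "Failure":
            failures_by_user[event["user"]].append(event["time"])

    alerts = []
    for user, times in failures_by_user.items():
        times.sort()
        start = 0
        for end, current in enumerate(times):
            # Shrink the window from the left until it spans at most window_ms.
            while current - times[start] > window_ms:
                start += 1
            count = end - start + 1
            if count >= threshold:
                alerts.append({"user": user, "count": count, "window_end": current})
                break  # one alert per user per batch is enough for this sketch
    return alerts
```

A real detection layer would add many such rules, correlation across sources, deduplication between runs and an alert queue. None of that comes with the data store, which is the point of this section.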
Response Capability Requires Additional Architecture
Even where detection is addressed, response capability is a separate question. Security Lake does not include security orchestration, automation and response (SOAR) functionality. When a detection fires, the workflow that determines what happens next (who is notified, what actions are taken, how the incident is tracked) has to be built outside the platform.
For organisations with mature SOAR deployments that they are retaining alongside Security Lake, this may be a straightforward integration. For those who were relying on a SIEM’s built-in workflow and ticketing capability, it is an additional architectural component that needs to be planned and resourced.
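The kind of decision logic that has to live outside the platform can be sketched in a few lines. The example below maps an alert to a response plan covering notification, actions and ticket priority. Every name in it is hypothetical: the `ResponsePlan` structure, the severity and environment fields on the alert, the recipient names and the action identifiers are placeholders for whatever the organisation's SOAR or ticketing tooling actually exposes.

```python
from dataclasses import dataclass


@dataclass
class ResponsePlan:
    notify: list            # who is notified
    actions: list           # what actions are taken
    ticket_priority: str    # how the incident is tracked


def plan_response(alert: dict) -> ResponsePlan:
    """Choose a response plan from alert severity and environment context."""
    severity = alert.get("severity", "low")
    in_production = alert.get("environment") == "production"

    if severity == "high" and in_production:
        # Placeholder recipients and actions; a real playbook would be agreed
        # with incident management and tested before automation is enabled.
        return ResponsePlan(["on-call-analyst", "incident-manager"],
                            ["isolate_host", "open_ticket"], "P1")
    if severity == "high":
        return ResponsePlan(["on-call-analyst"], ["open_ticket"], "P2")
    return ResponsePlan(["soc-queue"], ["open_ticket"], "P3")
```

Note that the production check only works if events carry environment context, which depends on the enrichment work described earlier.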
Compliance Reporting Is Not Automatic
Security Lake’s ability to retain large volumes of security data at low cost makes it attractive for compliance use cases. The data that organisations need in order to demonstrate their controls is in the lake. Getting it out in the format that auditors and regulators require is a different challenge.
Compliance reporting against frameworks like NIST, ISO 27001 or sector-specific requirements typically needs data that has been mapped to specific control objectives, presented in standardised formats and supported by evidence chains that auditors can follow. Security Lake stores the raw data. Producing compliance-ready outputs from it requires a reporting layer that maps Security Lake data to control frameworks and generates the documentation that audit processes demand.
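At its simplest, the mapping step of such a reporting layer could look like the sketch below, which counts events per control as a first cut at an evidence summary. The control identifiers are deliberately made-up placeholders (prefixed `EXAMPLE-`) and do not correspond to any real framework's numbering. The mapping table, the `class_name` field and the function name are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical mapping from event class to placeholder control identifiers.
# A real mapping would be authored against the chosen framework and reviewed
# with the audit function.
CONTROL_MAP = {
    "Authentication": ["EXAMPLE-ACCESS-01"],
    "API Activity": ["EXAMPLE-AUDIT-01", "EXAMPLE-CHANGE-01"],
}


def summarise_evidence(events):
    """Count supporting events per control, sorted by control identifier."""
    counts = Counter()
    for event in events:
        for control in CONTROL_MAP.get(event.get("class_name"), []):
            counts[control] += 1
    return [{"control": control, "event_count": count}
            for control, count in sorted(counts.items())]
```

An audit-ready output would also need time ranges, sampling of the underlying records and a traceable link back to the source data, so this covers only the mapping step.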
What the Missing Layer Needs to Provide
Taken together, the gaps above describe a consistent set of capabilities that organisations need to build or acquire in addition to Security Lake itself:
- Enrichment at ingestion, attaching threat intelligence, asset context, identity resolution and business classification to data as it enters the lake.
- An operational interface that allows SOC analysts to investigate and hunt effectively without requiring data engineering expertise.
- A detection layer that continuously evaluates data against detection logic and surfaces alerts for analyst review.
- Response workflow integration that connects detections to ticketing, escalation and automated response actions.
- A compliance reporting capability that maps Security Lake data to control frameworks and produces audit-ready outputs.
This is not a small build. For organisations approaching Security Lake as a complete solution rather than a foundation, the discovery that this additional layer is required tends to arrive as an unwelcome surprise, often after the initial deployment budget has been spent.
For organisations that plan for it from the outset, it is a tractable architecture problem. The components exist. The integration patterns are well understood. The key is recognising that Security Lake is the starting point and planning the surrounding architecture accordingly.
The Honest Conversation Security Leaders Need to Have
The decision to adopt Amazon Security Lake is often made on the strength of the storage economics and the aggregation capability, which are real and compelling. The conversation that does not always happen alongside that decision is about total cost of ownership, including the additional architecture required to make the platform operationally useful.
That conversation is not a reason to avoid Security Lake. The platform is a strong foundation and the trajectory of development is positive. It is a reason to go into the deployment with a clear-eyed view of what you are building, what it will take to build it, and what you will need in addition to the platform itself.
The organisations that are getting the most value from Security Lake are not the ones that deployed it and hoped for the best. They are the ones that treated it as a foundation and invested seriously in the enrichment, detection and operations layers that sit on top of it. The difference in outcome between the two approaches is significant, and it is entirely predictable in advance.
HOOP Cyber specialises in Amazon Security Lake deployments and the security data architecture that surrounds them. If you are evaluating Security Lake or looking to get more from an existing deployment, we would be glad to talk. Visit hoopcyber.com to find out more.