Mar 27, 2026
Document Processing with AI Agents: When to Use Bedrock AgentCore Runtime (vs Lambda)

Background
Intelligent document processing is a pattern we have built on AWS for financial services clients, where documents routinely contain sensitive data including personally identifiable information (PII). When Amazon Bedrock AgentCore Runtime became available, the natural question arose: when does it make sense to use it over a more familiar Lambda-based architecture?
The short answer: it depends on whether your documents are independent or form part of a related batch requiring shared context, on how long your processing takes and, critically, on how you need to manage data governance boundaries.
This post lays out the trade-offs, the governance considerations and the architecture patterns we have found most useful.
What is Amazon Bedrock AgentCore Runtime?
Amazon Bedrock AgentCore is a managed platform for building, deploying and operating AI agents at scale without managing the underlying infrastructure. The Runtime component specifically handles agent hosting and execution.
The key architectural difference from Lambda is that AgentCore runs each session in a dedicated microVM — an isolated virtual machine with its own CPU, memory and filesystem. This is not a shared container that gets reused between invocations. When a session ends, the entire microVM is terminated and its memory is sanitised [1].
This isolation model has direct implications for both cost and governance, which we will explore in detail.
Governance of Sensitive Data
This is the pillar that organisations in regulated industries need to get right before anything else. AI agent workloads introduce governance challenges that traditional compute does not: agents may read, reason over and store context that includes PII, financial records, legally privileged materials or health data. The boundaries of where that data lives, and for how long, must be explicit and defensible.
The Session as a Data Boundary
The most important governance primitive in AgentCore Runtime is the session. Each runtimeSessionId defines an isolated microVM. Data written to memory or disk within that session does not bleed across to any other session. When the session ends, the microVM is terminated and memory is sanitised [1].
This means the session boundary is your PII boundary. The engineering consequence is:
- One document per session if documents must not share context (e.g. independent patient records, separate legal matters)
- One session per batch if related documents belong together and the agent needs to reason across them (e.g. all documents for a single case file)
This is not just a technical choice — it is a data governance decision that should be driven by your data classification policy and any applicable regulatory obligations.
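The boundary choice above can be made explicit in code. The sketch below is a minimal illustration, not a prescribed API: the helper and its naming are my own, and the minimum session ID length (33 characters at the time of writing, per the AgentCore documentation) should be checked against the current quotas page.

```python
import uuid
from typing import Optional

def session_id_for(document_id: str, batch_id: Optional[str] = None) -> str:
    """Derive a runtimeSessionId that enforces the chosen PII boundary.

    - Independent documents: call with only document_id, giving one
      session (one microVM) per document, so context never bleeds
      between unrelated records.
    - Related batches: pass batch_id, so every document in the batch
      lands in the same session and the agent can reason across them.

    The UUID suffix keeps IDs unique and comfortably above the
    documented minimum session ID length.
    """
    scope = batch_id if batch_id is not None else document_id
    return f"{scope}-{uuid.uuid4()}"

# One session per independent document: distinct IDs, distinct microVMs.
a = session_id_for("patient-record-001")
b = session_id_for("patient-record-002")

# One session per batch: every document in case-42 shares this ID.
case_session = session_id_for("doc-17", batch_id="case-42")
```

Deriving the ID from the batch or document identifier (rather than a bare UUID) also makes the session traceable back to a case in the CloudWatch logs.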
Operational Logging
There do not appear to be CloudTrail data plane events for AgentCore Runtime session invocations. Unlike AgentCore Gateway, where data events are explicitly documented (though not enabled by default), no equivalent documentation exists for Runtime. Management-plane calls — creating or updating a runtime resource — are logged in CloudTrail as standard, but these tell you little about what data your agent processed.
What AgentCore Runtime does provide is CloudWatch Logs. When you create a runtime resource, a log group is created automatically at /aws/bedrock-agentcore/runtimes/<agent_id>-<endpoint_name>/runtime-logs. Each invocation record includes session_id, request_id, trace_id, operation and the request_payload and response_payload [9].
This is operational logging rather than a formal audit trail. For regulated workloads, the distinction matters: CloudWatch Logs can be queried and retained, but they are not necessarily the tamper-evident, account-wide audit record that CloudTrail provides. If you need to demonstrate to auditors what data was passed to the agent and when, think carefully about how you will provide this.
Data payloads
The agent receives whatever you put in the invocation payload. Resist the temptation to send entire raw documents if a structured extract would suffice. This limits the PII surface area and reduces payload size. AgentCore supports payloads up to 100 MB [3], but that does not mean you should use all of it.
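One way to enforce that discipline is an explicit allow-list applied before invocation. This is a minimal sketch with hypothetical field names; the point is the projection, not the schema.

```python
import json

# Hypothetical allow-list: only the fields the agent actually needs.
AGENT_FIELDS = {"case_id", "document_type", "extracted_text"}

def build_payload(document: dict) -> bytes:
    """Project a document down to the allow-listed fields.

    Everything else (raw scans, identifiers the agent does not need)
    is dropped before it can enter the session, limiting the PII
    surface area and keeping the payload far below the 100 MB ceiling.
    """
    extract = {k: v for k, v in document.items() if k in AGENT_FIELDS}
    return json.dumps(extract).encode("utf-8")

doc = {
    "case_id": "case-42",
    "document_type": "invoice",
    "extracted_text": "Total due: 1,200.00",
    "customer_ssn": "***",        # never leaves this function
    "raw_pdf_base64": "...",      # ditto
}
payload = build_payload(doc)
```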
Governance Summary
| Control | AgentCore Runtime | Lambda |
|---|---|---|
| Session isolation | Dedicated microVM, memory sanitised on termination | Shared container, reused between invocations |
| PII boundary | Session ID | Invocation (but no persistent state) |
| Encryption at rest | Encrypted; customer-managed KMS available | No native in-function storage |
| Logging | CloudWatch Logs (per-invocation); CloudTrail for management plane only | CloudTrail |
| Model training use | Not used | Not applicable |
Lambda has no persistent session state between invocations (beyond what you store), which makes it simpler from a PII isolation perspective for independent documents. The trade-off is that you lose the context-sharing capability that makes AgentCore valuable for multi-document batches.
Synchronous vs Asynchronous Processing
Per the AWS documentation: “The Amazon Bedrock AgentCore SDK supports both synchronous and asynchronous processing through a unified API. This creates a flexible implementation pattern for both clients and agent developers.” [4]
Short processing (seconds to a minute): Synchronous is acceptable; the caller simply waits for the response. Beyond roughly a minute, the cost of holding that connection open adds up quickly.
Anything longer: Asynchronous is the right approach. The session remains active and the agent processes in the background, with the caller polling or receiving a callback.
Timeout limits:
- Lambda: 15 minutes maximum [5]
- AgentCore synchronous: 15 minutes maximum
- AgentCore asynchronous: 8 hours maximum [6]
AgentCore appears to have no built-in completion notification mechanism at the moment. If you need to notify downstream systems when processing completes, your agent code must publish an event.
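In the absence of a built-in notification, the caller typically polls a status source (a DynamoDB tracking table in the patterns later in this post). A generic sketch, with the status source injected as a callable so the polling logic stands alone:

```python
import time

def wait_for_completion(get_status, session_id: str,
                        poll_interval: float = 30.0,
                        timeout: float = 8 * 3600) -> str:
    """Poll a status source until the session completes or times out.

    get_status is any callable returning 'RUNNING', 'SUCCESS' or
    'FAILURE' for a session ID -- in practice a DynamoDB lookup that
    the agent updates when it finishes. The 8-hour default mirrors
    the AgentCore async session limit.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status(session_id)
        if status in ("SUCCESS", "FAILURE"):
            return status
        time.sleep(poll_interval)
    return "TIMED_OUT"

# Demo with an in-memory stub standing in for the DynamoDB lookup.
responses = iter(["RUNNING", "RUNNING", "SUCCESS"])
result = wait_for_completion(lambda sid: next(responses),
                             "session-1", poll_interval=0.0)
```

An event-driven callback (the agent publishing to SQS or EventBridge on completion) is usually preferable to polling; the polling loop is the fallback when the caller must block.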
Cost: The I/O Wait Difference
This is where AgentCore has a structural advantage over Lambda for AI workloads. Most of the wall-clock time in an LLM-based agent is spent waiting for model responses — not doing computation.
Lambda: Charges for the full execution duration at the full CPU plus memory rate, including all time spent waiting for Bedrock API responses.
AgentCore Runtime: CPU is charged only during active processing. During I/O wait — waiting for LLM responses, tool calls, external API responses — CPU is not charged. Memory is charged for the full session duration [7].
Example for a document with 1.5 seconds of CPU processing and 8 seconds waiting for LLM responses:
| | Lambda | AgentCore |
|---|---|---|
| Billed CPU | 9.5 seconds | 1.5 seconds |
| Billed memory | 9.5 seconds | 9.5 seconds |
For I/O-heavy workloads, AgentCore is more cost-effective per document. For short, fast processing where session management adds overhead, the cost difference is negligible and Lambda’s simplicity wins.
Event-Driven Integration
AgentCore Runtime cannot currently be triggered directly by S3 events, SQS or EventBridge. It is not listed as an EventBridge target and has no native event source mapping [8].
This means Lambda is required as an intermediary for event-driven architectures:
Event Source (S3 / SQS / EventBridge) → Lambda → AgentCore API
This is not a major constraint, but it is a cost and latency consideration. The Lambda function in this pattern is short-lived — it simply validates the event, creates a session ID and invokes AgentCore — so the additional cost is minimal.
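A minimal sketch of that intermediary follows. The data-plane call shown (invoke_agent_runtime on the bedrock-agentcore boto3 client) reflects my reading of the current SDK and should be verified against the API reference; the ARN is a placeholder, and the client is injected so the handler can be exercised without AWS credentials.

```python
import json
import uuid

AGENT_RUNTIME_ARN = "arn:aws:bedrock-agentcore:eu-west-2:123456789012:runtime/example"  # placeholder

def handler(event, context, client):
    """Validate the S3 event, mint a session ID, hand off to AgentCore.

    In Lambda, client would be boto3.client("bedrock-agentcore");
    it is a parameter here purely for testability.
    """
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]
    # One independent document, one session (see the governance section).
    session_id = f"doc-{uuid.uuid4()}"
    client.invoke_agent_runtime(
        agentRuntimeArn=AGENT_RUNTIME_ARN,
        runtimeSessionId=session_id,
        payload=json.dumps({"bucket": bucket, "key": key}).encode(),
    )
    return {"sessionId": session_id, "document": f"s3://{bucket}/{key}"}

# Exercise the handler with a stub client standing in for boto3.
class _StubClient:
    def __init__(self):
        self.calls = []
    def invoke_agent_runtime(self, **kwargs):
        self.calls.append(kwargs)

stub = _StubClient()
event = {"Records": [{"s3": {"bucket": {"name": "docs"},
                             "object": {"key": "a.pdf"}}}]}
out = handler(event, None, client=stub)
```

Keeping this function to validation and hand-off, with no document content touched, also keeps it out of scope for most of the PII controls discussed above.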
Session Limits
AgentCore Runtime enforces limits on concurrent sessions [6]:
- Active session workloads: 1,000 in US East (N. Virginia) and US West (Oregon); 500 in other regions
- These limits can be raised via AWS Service Quotas
For high-volume document processing, these limits require consideration. Lambda-based architectures with SQS event source mapping and concurrency controls may be more straightforward to scale.
Failure Handling
AgentCore Runtime has no built-in API to query session status. Silent failures — where the agent crashes without publishing a result — are a concern. Two complementary strategies can help:
1. DynamoDB tracking table with a scheduled checker:
- A Lambda function logs each session to DynamoDB before invoking AgentCore (with caseId, sessionId, status and startTime)
- A scheduled Lambda (e.g. every 15 minutes) queries for batches that have not completed after a threshold (processing time estimate multiplied by a safety margin)
- Stale batches are marked as failed and retried
2. Agent publishes its own result:
- On completion, the agent publishes to SQS or EventBridge for downstream processing
- The agent then updates DynamoDB status — only after a successful SQS publish
- This provides immediate feedback without waiting for the scheduled checker
Both strategies together are recommended: option 2 gives immediate feedback, option 1 is the safety net for crashes and quiet failures.
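The stale-batch check at the heart of strategy 1 is simple enough to show in full. This sketch works on in-memory dicts mirroring the DynamoDB tracking items (a real implementation would query on the status attribute and store startTime as a string or epoch number); the threshold and safety margin values are illustrative.

```python
from datetime import datetime, timedelta, timezone

def find_stale(batches: list, now: datetime,
               estimate: timedelta, safety_margin: float = 3.0) -> list:
    """Return batches still IN_PROGRESS past estimate * safety_margin.

    batches mirrors the DynamoDB tracking items: dicts carrying
    sessionId, status and startTime. Anything returned here would be
    marked failed and retried by the scheduled checker.
    """
    threshold = timedelta(seconds=estimate.total_seconds() * safety_margin)
    return [
        b for b in batches
        if b["status"] == "IN_PROGRESS" and now - b["startTime"] > threshold
    ]

now = datetime(2026, 3, 27, 12, 0, tzinfo=timezone.utc)
batches = [
    {"sessionId": "s-1", "status": "IN_PROGRESS",
     "startTime": now - timedelta(hours=2)},    # silent failure: stale
    {"sessionId": "s-2", "status": "IN_PROGRESS",
     "startTime": now - timedelta(minutes=5)},  # still within bounds
    {"sessionId": "s-3", "status": "SUCCESS",
     "startTime": now - timedelta(hours=3)},    # finished, ignore
]
stale = find_stale(batches, now, estimate=timedelta(minutes=10))
```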
Lambda has built-in failure handling. When using SQS as an event source, Lambda automatically retries failed invocations and routes messages to a dead-letter queue after the configured maximum retry count. This comes as standard.
Comparison
Single Document Processing (Independent, No Shared Context)
| Factor | Lambda + Bedrock Runtime | AgentCore Runtime |
|---|---|---|
| PII isolation | Per invocation | Per unique session (must be 1:1) |
| Event-driven integration | Native (S3, SQS, EventBridge) | Requires intermediary |
| Session management | None needed | Session IDs |
| I/O wait charging | Full CPU + memory | Memory only (CPU free) |
| Async invocation | Built-in (SNS / EventBridge destinations) | Manual (agent publishes) |
| Failure detection | Built-in (Lambda failures tracked) | Manual (tracking table + logs) |
| Timeout limit | 15 minutes | 8 hours (async) |
| Context sharing | Not needed | Not used |
| Recommendation | Default choice | Only if processing exceeds a minute |
Multi-Document Batches (Shared Context Required)
| Factor | Lambda + Bedrock Runtime | AgentCore Runtime |
|---|---|---|
| Shared context | Must pass via payload or external state store | Session memory preserves context natively |
| PII isolation | Per invocation | Per session (one session per batch) |
| I/O wait charging | Full CPU + memory | Memory only (CPU free) |
| Orchestration complexity | High (external state management) | Low (agent handles internally) |
| Failure detection | Built-in | Custom tracking required |
| Recommendation | Simple batches | Complex batches where context is critical |
Architecture Patterns
Pattern 1: Lambda + Bedrock Runtime (Independent Documents)
S3 Upload → S3 Event → SQS Queue
↓
Lambda Event Source Mapping
↓
Lambda → Bedrock Runtime API
↓
S3 (results) + SNS (completion)
Lost message protection: Lambda deletes the SQS message only after successful processing. After the configured maximum retries, the message moves to a dead-letter queue for investigation.
This pattern is the right default for independent document processing. It is simpler to operate, easier to debug and has well-understood failure modes.
Pattern 2: AgentCore Runtime (Multi-Document Batches)
S3 Upload → S3 Event → SQS Queue
↓
Lambda (log to DynamoDB, invoke AgentCore)
↓
AgentCore Session (unique per batch)
↓
Agent (process batch documents, publish result to SQS, update DynamoDB)
↓
Lambda (results processor) → S3 + downstream
--- Compensation (crash / silent failure)
EventBridge Scheduler (periodic)
↓
Lambda (stale batch detector) → mark failed, trigger retry
Lost message protection:
- Agent publishes result to SQS results queue first
- Agent updates DynamoDB status (SUCCESS or FAILURE) only after successful SQS publish
- DynamoDB tracks caseId, sessionId, status and startTime
- Scheduled Lambda detects stale batches and triggers retry
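The ordering in the first two steps is the crux, so here it is as a sketch. The publish and update operations are injected as callables (in the agent they would wrap SQS SendMessage and a DynamoDB UpdateItem); the function name is my own.

```python
def finish_batch(session_id: str, result: dict, sqs_publish, ddb_update):
    """Publish-then-update: the result goes to the SQS results queue
    first, and the DynamoDB status flips only after that publish
    succeeds. If the publish raises, the item stays IN_PROGRESS and
    the scheduled stale detector picks it up -- the worst case is a
    retry, never a batch marked SUCCESS with no result on the queue.
    """
    sqs_publish(session_id, result)    # may raise; status untouched
    ddb_update(session_id, "SUCCESS")  # reached only after publish

# Exercise the ordering with in-memory stand-ins for SQS and DynamoDB.
published, statuses = [], {}
finish_batch(
    "case-42",
    {"documents": 5},
    sqs_publish=lambda sid, r: published.append((sid, r)),
    ddb_update=lambda sid, s: statuses.__setitem__(sid, s),
)
```

Inverting the order would open a window where the status reads SUCCESS but the result never reached the queue, which is exactly the failure the compensation path cannot detect.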
When to Use Each
Use Lambda + Bedrock Runtime when:
- Documents are independent with no shared context requirements
- Processing completes within 15 minutes
- You want native event-driven integration without an intermediary
- You prefer simpler failure handling with built-in DLQ support
- You are starting out and want to minimise operational complexity
Use AgentCore Runtime when:
- Related documents must share context across a batch or session
- Processing exceeds a minute or two (especially if it runs for tens of minutes or hours)
- The I/O-heavy nature of your workload makes the CPU billing model significantly cheaper
Conclusion
Amazon Bedrock AgentCore Runtime is a genuine step forward for AI agent workloads on AWS. The microVM session isolation model gives you a sensible boundary for regulated workloads, the 8-hour async timeout removes a real constraint for complex document processing and the CPU billing model is structurally more efficient for I/O-heavy agents.
That said, it is not the right tool for every job. Independent document processing — where there is no need to share context across documents — is better served by Lambda in most cases. Lambda is simpler, has native event-driven integration, well-understood failure handling and no session management overhead.
The decision comes down to context: do your documents form part of a related batch that the agent needs to reason across? If so, AgentCore’s session model is compelling. If not, Lambda remains the right default.
Whatever you choose, treat the governance of sensitive data as a first-class architectural concern, not an afterthought. The boundaries you define in code are the boundaries auditors will scrutinise.
References
- Amazon Bedrock AgentCore Runtime — Session Isolation, AWS Documentation
- Amazon Bedrock AgentCore — Data Protection, AWS Documentation
- Amazon Bedrock AgentCore Runtime — Limits and Quotas, AWS Documentation
- Handle Asynchronous and Long-Running Agents, AWS Documentation
- AWS Lambda — Configuring Function Timeout, AWS Documentation
- Amazon Bedrock AgentCore — Service Quotas, AWS Documentation
- Amazon Bedrock AgentCore — Pricing, AWS Pricing
- Amazon EventBridge — Targets, AWS Documentation
- AgentCore Runtime Observability — Runtime Metrics and Log Formats, AWS Documentation