Posts Tagged ‘AmazonBedrock’
[AWSReInvent2025] Scaling Customer Support, Compliance, and Productivity with Conversational AI at Coinbase
Lecturer
Joshua Smith is a Senior Solutions Architect at Amazon Web Services (AWS), specializing in financial services. He collaborates closely with major institutions to design scalable, secure cloud architectures.
Vara Maharivan serves as Director of Machine Learning and Artificial Intelligence at Coinbase, leading the company’s efforts to integrate advanced AI and machine learning capabilities across its cryptocurrency platform.
Abstract
This session examines how Coinbase, a leading cryptocurrency exchange, has deployed a unified generative AI platform built on Amazon Bedrock to transform three critical operational domains: customer support, regulatory compliance, and internal developer productivity. The presentation details the architectural approach, key AWS services leveraged, real-world performance metrics, and the strategic roadmap ahead. By combining retrieval-augmented generation (RAG), tool execution, and domain-specific agents, Coinbase has achieved substantial automation, cost efficiencies, and enhanced user experiences while maintaining rigorous security and compliance standards.
The Evolution of Generative AI in Financial Services
Joshua Smith opened the discussion by contextualizing the rapid maturation of generative AI within financial services. In 2023, early adoption centered on foundational concerns such as data trust and secure retrieval mechanisms. By 2024, the introduction of Amazon Bedrock enabled broader experimentation in areas like customer support, with focus shifting toward scalability, granular access controls, and integration with existing enterprise tools. Entering 2025, the landscape has progressed toward fully agentic, multi-agent systems capable of autonomously orchestrating complex workflows.
Smith emphasized that the primary challenge is no longer prototyping conversational interfaces but rather re-engineering entire business processes to deliver measurable impact on key performance indicators. This shift demands robust infrastructure, advanced security primitives, and operational frameworks tailored for agentic workloads.
AWS Services Enabling Production-Grade Agentic AI
Central to the discussion was Amazon Bedrock, a fully managed service providing access to leading foundation models through a unified API. Bedrock supports private model customization, guardrails for safety, cost-latency optimization, and, notably, Agent Core—a suite of capabilities designed to operationalize agents at scale.
Agent Core addresses critical production gaps: a serverless runtime supporting long-running multimodal agents (up to eight hours), checkpointing and recovery, identity management compatible with existing providers, secure token vaults, shared and private memory, tool discovery with fine-grained controls, and centralized observability combining logs, traces, and metrics. These components collectively mitigate risks highlighted in industry reports, such as escalating costs, unclear value, and insufficient security, which threaten the viability of agentic initiatives.
Coinbase’s Strategic Vision for AI Integration
Vara Maharivan outlined Coinbase’s mission to increase economic freedom through a trusted global cryptocurrency platform. The company rests on three pillars: building trust via top-tier security, enhancing accessibility through intuitive experiences, and scaling operations efficiently across more than 100 countries.
AI and machine learning have long underpinned fraud detection, risk assessment, personalization, and infrastructure scaling at Coinbase. Recent innovations include graph neural network-based risk scoring for blockchain addresses, ERC-20 scam token detection combining smart contract auditing with ML, and predictive scaling models to handle market volatility.
With the advent of large language models, Coinbase identified three high-impact generative AI domains: customer support automation, compliance process acceleration, and developer productivity enhancement.
Transforming Customer Support with Agentic Workflows
Crypto markets exhibit extreme volatility, driving unpredictable spikes in user inquiries that challenge traditional human-staffed support models. Coinbase addressed this through a unified generative AI platform granting fluid access to models and internal data via standardized interfaces.
The architecture features a virtual assistant handling routine interactions autonomously and an agent-assist tool empowering human representatives. The virtual assistant resolves straightforward cases end-to-end, while the assistive tool synthesizes real-time information from knowledge bases and tools, providing agents with contextual summaries, suggested responses, and multilingual capabilities.
Results demonstrate significant impact: approximately 65% of customer contacts are now automated, yielding nearly five million annualized employee-hour savings. Automated cases resolve in under ten minutes—contrasting sharply with up to forty minutes for human-handled escalations—dramatically improving customer satisfaction and operational efficiency.
Streamlining Compliance through AI-Augmented Investigations
Regulatory compliance in financial services demands rigorous processes such as KYC, KYB, and transaction monitoring. These workflows are labor-intensive, require exhaustive explainability, and must adapt to diverse jurisdictional requirements.
Coinbase augmented traditional ML-based risk detection models (deployed via Anyscale on AWS EKS) with generative AI. A compliance-assist tool aggregates data from internal systems and open-source intelligence, producing narrative summaries and risk signals for human reviewers.
At the core lies an autoresolution engine orchestrating holistic reviews. Upon a high-risk alert, the engine coordinates data synthesis, automated actions, human-in-the-loop feedback, and customer information requests. Final decisions—such as filing Suspicious Activity Reports—remain with human compliance officers, preserving accountability while accelerating throughput and consistency.
Boosting Developer Productivity across the SDLC
Developer efficiency emerged as another strategic priority. Coinbase provides multiple best-in-class coding assistants (e.g., Claude Code, Cursor) powered by Anthropic models via Bedrock, allowing engineers to select preferred tools.
A custom GitHub Action automates pull-request reviews: summarizing changes, generating natural-language comments, enforcing conventions, identifying testing gaps, and offering debugging guidance for CI failures. This shifts human review toward higher-value architectural concerns.
For quality assurance, an in-house UI testing tool translates natural-language test descriptions into autonomous browser actions across form factors, achieving parity with human accuracy, triple the bug-detection rate, and 86% cost reduction versus manual testing.
Quantifiable outcomes include nearly 40% of daily code being AI-generated or influenced (targeting 50%), 75,000 annual hours saved via automated PR reviews, and dramatically faster test introduction.
Future Directions and Platform Modernization
Coinbase aims to democratize agentic AI across the organization, enabling every employee to experiment and innovate. Ongoing efforts focus on modernizing existing tools and scaling enterprise-wide impact.
Agent Core features—secure deployment, robust identity management, advanced memory, and interoperability—are viewed as pivotal for the next phase of expansion.
Conclusion
The Coinbase case illustrates a mature approach to generative AI deployment: leveraging a unified platform on Amazon Bedrock to address volatility-driven operational challenges while upholding security and regulatory standards. By combining autonomous agents, human augmentation, and rigorous evaluation, the company has realized substantial automation, cost savings, and quality improvements across support, compliance, and engineering functions. As agentic systems evolve, such integrated architectures offer a blueprint for financial institutions seeking transformative efficiency without compromising trust.
Links:
[AWSReInventPartnerSessions2024] Constructing Real-Time Generative AI Systems through Integrated Streaming, Managed Models, and Safety-Centric Language Architectures
Lecturer
Pascal Vuylsteker serves as Senior Director of Innovation at Confluent, where he spearheads advancements in scalable data streaming platforms designed to empower enterprise artificial intelligence initiatives. Mario Rodriguez operates as Senior Partner Solutions Architect at AWS, concentrating on seamless integrations of generative AI services within cloud ecosystems. Gavin Doyle heads the Applied AI team at Anthropic, directing efforts toward developing reliable, interpretable, and ethically aligned large language models.
Abstract
This comprehensive scholarly analysis investigates the foundational principles and practical methodologies for deploying real-time generative AI applications by harmonizing Confluent’s data streaming capabilities with Amazon Bedrock’s fully managed foundation model access and Anthropic’s advanced language models. The discussion centers on establishing robust data governance frameworks, implementing retrieval-augmented generation with continuous contextual updates, and leveraging Flink SQL for instantaneous inference. Through detailed architectural examinations and illustrative configurations, the article elucidates how these components dismantle data silos, ensure up-to-date relevance in AI responses, and facilitate scalable, secure innovation across organizational boundaries.
Establishing Governance-Centric Modern Data Infrastructures
Contemporary enterprise environments increasingly acknowledge the indispensable role of data streaming in fostering operational agility. Empirical insights reveal that seventy-nine percent of information technology executives consider real-time data flows essential for maintaining competitive advantage. Nevertheless, persistent obstacles—ranging from fragmented technical competencies and isolated data repositories to escalating governance complexities and heightened expectations from generative AI adoption—continue to hinder comprehensive exploitation of these potentials.
To counteract such impediments, contemporary data architectures prioritize governance as the pivotal nucleus. This core ensures that information remains secure, compliant with regulatory standards, and readily accessible to authorized stakeholders. Encircling this nucleus are interdependent elements including data warehouses for structured storage, streaming analytics for immediate processing, and generative AI applications that derive actionable intelligence. Such a holistic configuration empowers institutions to eradicate silos, achieve elastic scalability, and satisfy burgeoning demands for instantaneous insights.
Confluent emerges as the vital connective framework within this paradigm, facilitating uninterrupted real-time data synchronization across disparate systems. By bridging ingestion pipelines, data lakes, and batch-oriented workflows, Confluent guarantees that information arrives at designated destinations precisely when required. Absent this foundational layer, the construction of cohesive generative AI solutions becomes substantially more arduous, often resulting in delayed or inconsistent outputs.
Complementing this streaming backbone, Amazon Bedrock delivers a fully managed service granting access to an array of foundation models sourced from leading providers such as AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon itself. Bedrock supports diverse experimentation modalities, enables model customization through fine-tuning or extended pre-training, and permits the orchestration of intelligent agents without necessitating extensive coding expertise. From a security perspective, Bedrock rigorously prohibits the incorporation of customer data into baseline models, maintains isolation for fine-tuned variants, implements encryption protocols, enforces granular access controls aligned with AWS identity management, and adheres to certifications including HIPAA, GDPR, SOC, ISO, and CSA STAR.
The differentiation of generative AI applications hinges predominantly on proprietary datasets. Organizations possessing comparable access to foundation models achieve superiority by capitalizing on unique internal assets. Three principal techniques harness this advantage: retrieval-augmented generation incorporates external knowledge directly into prompt engineering; fine-tuning crafts specialized models tailored to domain-specific corpora; continued pre-training broadens model comprehension using enterprise-scale information repositories.
For instance, an online travel agency might synthesize personalized itineraries by amalgamating live flight availability, client profiles, inventory levels, and historical preferences. AWS furnishes an extensive suite of services accommodating unstructured, structured, streaming, and vectorized data formats, thereby enabling seamless integration across heterogeneous sources while preserving lifecycle security.
Orchestrating Real-Time Contextual Enrichment and Inference Mechanisms
Confluent assumes a critical position by directly interfacing with vector databases, thereby assuring that conversational AI frameworks consistently operate upon the most pertinent and current information. This integration transcends basic data translocation, emphasizing the delivery of contextualized, AI-actionable content.
Central to this orchestration is Flink Inference, a sophisticated capability within Confluent Cloud that facilitates instantaneous machine learning predictions through Flink SQL syntax. This approach dramatically simplifies the embedding of predictive models into operational workflows, yielding immediate analytical outcomes and supporting real-time decision-making grounded in accurate, contemporaneous data.
Configuration commences with establishing connectivity between Flink environments and target models utilizing the Confluent command-line interface. Parameters specify endpoints, authentication credentials, and model identifiers—accommodating various Claude iterations alongside other compatible architectures. Subsequent commands define reusable prompt templates, allowing baseline instructions to persist while dynamic elements vary per invocation. Finally, data insertion invokes the ML_PREDICT function, passing relevant parameters for processing.
Architecturally, the pipeline initiates with document or metadata publication to Kafka topics, forming ingress points for downstream transformation. Where appropriate, documents undergo segmentation into manageable chunks to promote parallel execution and enhance computational efficiency. Embeddings are then generated for each segment leveraging Bedrock or Anthropic services, after which these vector representations—accompanied by original chunks—are indexed within a vector store such as MongoDB Atlas.
To accelerate adoption, dedicated quick-start repositories provide deployable templates encapsulating this workflow. Notably, these templates incorporate structured document summarization via Claude, converting tabular or hierarchical data into narrative abstracts suitable for natural language querying.
Interactive sessions begin through API gateways or direct Kafka clients, enabling bidirectional real-time communication. User queries generate embeddings, which subsequently retrieve semantically aligned documents from the vector repository. Retrieved artifacts, augmented by available streaming context, inform prompt construction to maximize relevance and precision. The resultant engineered prompt undergoes processing by Claude on Anthropic Cloud, producing responses that reflect both historical knowledge and live situational awareness.
Efficiency enhancements include conversational summarization to mitigate token proliferation and refine large language model performance. Empirical observations indicate that Claude-generated query reformulations for vector retrieval substantially outperform direct human phrasing, yielding markedly superior document recall.
CREATE MODEL anthropic_claude WITH (
'connector' = 'anthropic',
'endpoint' = 'https://api.anthropic.com/v1/messages',
'api.key' = 'sk-ant-your-key-here',
'model' = 'claude-3-opus-20240229'
);
CREATE TABLE refined_queries AS
SELECT ML_PREDICT(
'anthropic_claude',
CONCAT('Rephrase for vector search: ', user_query)
) AS optimized_query
FROM raw_interactions;
Flink’s value proposition extends beyond connectivity to encompass cost-effectiveness, automatic scaling for voluminous workloads, and native interoperability with extensive ecosystems. Confluent maintains certified integrations across major AWS offerings, prominent data warehouses including Snowflake and Databricks, and leading vector databases such as MongoDB. Anthropic models remain comprehensively accessible via Bedrock, reflecting strategic collaborations spanning product interfaces to silicon-level optimizations.
Analytical Implications and Strategic Trajectories for Enterprise AI Deployment
The methodological synthesis presented—encompassing streaming orchestration, managed model accessibility, and safety-oriented language processing—fundamentally reconfigures retrieval-augmented generation from static knowledge injection to dynamic reasoning augmentation. This evolution proves indispensable for domains requiring precise interpretation, such as regulatory compliance or legal analysis.
Strategic ramifications are profound. Organizations unlock domain-specific differentiation by leveraging proprietary datasets within real-time contexts, achieving decision-making superiority unattainable through generic models alone. Governance frameworks scale securely, accommodating enterprise-grade requirements without sacrificing velocity.
Persistent challenges, including data provenance assurance and model drift mitigation, necessitate ongoing refinement protocols. Future pathways envision declarative inference paradigms wherein prompts and policies are codified as infrastructure, alongside hybrid architectures merging vector search with continuous streaming for anticipatory intelligence.