RAG Made Serverless - Amazon Bedrock Knowledge Base With Spring AI

What if you could build an AI assistant with access to your own data in under 40 lines of Java? That’s now possible with my contribution to the just-released Spring AI 2.0.0 M2 - Amazon Bedrock Knowledge Base support. RAG (Retrieval-Augmented Generation) lets AI models answer questions using your own documents instead of relying solely on their training data, and Amazon Bedrock Knowledge Base is a fully managed RAG service that handles document ingestion, embeddings, and vector storage for you - and now you can use it with Spring AI!

In this post, I’ll show you how to build a working AI agent with RAG in minutes using JBang - no Maven project setup required. You’ll have an AI assistant answering questions from your company documents with minimal code.
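To give a feel for the shape of the code, here is a minimal sketch of the wiring, assuming the new module exposes a Bedrock Knowledge Base retriever that implements Spring AI’s DocumentRetriever interface; the actual retriever class and its configuration are not shown here and may differ from the released API.

```java
// Minimal sketch of the RAG wiring, assuming the new Spring AI module
// provides a DocumentRetriever backed by a Bedrock Knowledge Base
// (configured elsewhere with the knowledge base ID and AWS region).
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.rag.advisor.RetrievalAugmentationAdvisor;
import org.springframework.ai.rag.retrieval.search.DocumentRetriever;

public class KnowledgeBaseAgent {

    public static String ask(ChatModel chatModel, DocumentRetriever kbRetriever, String question) {
        // The advisor retrieves relevant chunks from the knowledge base
        // and injects them into the prompt before the model is called.
        var ragAdvisor = RetrievalAugmentationAdvisor.builder()
                .documentRetriever(kbRetriever)
                .build();

        return ChatClient.builder(chatModel)
                .defaultAdvisors(ragAdvisor)
                .build()
                .prompt()
                .user(question)
                .call()
                .content();
    }
}
```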

Modern Java Observability in 2026 - Spring Boot 4 on Amazon EKS

If you’re running Spring Boot applications on Kubernetes, you’ve probably hit the same wall I did: containers restart, logs disappear, and when something goes wrong or slows down, you’re left guessing which microservice caused the issue.

Two recent developments prompted me to revisit my observability setup. First, the OpenTelemetry Java Agent v2.0 (January 2024) shifted from “instrument everything automatically” to “explicit over implicit” - requiring adjustments to maintain visibility into business logic. Second, Spring Boot 4.0 (November 2025) introduced the new spring-boot-starter-opentelemetry, making it easier than ever to export metrics, traces, and logs via OpenTelemetry Protocol (OTLP).

In this post, I’ll walk you through setting up observability for Spring Boot applications on Amazon EKS - starting with the basics (logs and metrics), diving into distributed tracing, and finishing with Application Signals. Hopefully this saves you some time.
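As a taste of what “explicit over implicit” means in practice, here is a minimal sketch of annotating business logic so it still shows up in traces. It assumes the opentelemetry-instrumentation-annotations dependency is on the classpath and that either the OTel Java agent or an OTLP exporter is wired up; PaymentService and processPayment are hypothetical names used only for illustration.

```java
// Explicit span around a business method, so it appears in traces even
// where automatic instrumentation no longer covers it.
import io.opentelemetry.instrumentation.annotations.SpanAttribute;
import io.opentelemetry.instrumentation.annotations.WithSpan;
import org.springframework.stereotype.Service;

@Service
public class PaymentService {

    // Creates a span named "process-payment" and records the order id
    // as a span attribute for easier filtering in the tracing backend.
    @WithSpan("process-payment")
    public void processPayment(@SpanAttribute("order.id") String orderId) {
        // business logic here
    }
}
```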

In Part 5 of this series, we added MCP to our AI agent, enabling dynamic tool integration without code changes. However, we discovered another limitation: when users want to upload expense receipts, invoices, or travel documents, the agent can only process text—it cannot analyze images or extract information from visual documents.

Text-only AI agents miss critical information embedded in images, scanned documents, charts, and diagrams. In business scenarios like expense management, travel booking confirmations, or invoice processing, most information arrives as images or PDFs rather than structured text.

In this post, we’ll add multi-modal capabilities (vision and document analysis) and multi-model support to our AI agent, allowing it to analyze images, extract structured data from receipts, and use different AI models optimized for specific tasks.
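To illustrate the vision side, here is a short Spring AI sketch that sends an image alongside a text prompt; the receipt file name and the extraction prompt are just examples, not the code from the post.

```java
// Sketch: ask the model to extract structured data from a receipt image.
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.core.io.ClassPathResource;
import org.springframework.util.MimeTypeUtils;

public class ReceiptAnalyzer {

    private final ChatClient chatClient;

    public ReceiptAnalyzer(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    public String extractExpense() {
        return chatClient.prompt()
                .user(u -> u
                        // Text instruction plus the image as additional media.
                        .text("Extract merchant, date, and total amount from this receipt.")
                        .media(MimeTypeUtils.IMAGE_PNG, new ClassPathResource("receipt.png")))
                .call()
                .content();
    }
}
```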

In Part 4 of this series, we added tool calling to our AI agent, allowing it to access real-time information like weather forecasts and current dates. However, we discovered another limitation: when asked to book flights or hotels, the agent couldn’t access those systems because we’d need to hardcode every API integration into our application.

Hardcoding API integrations creates maintenance challenges. Every new service requires code changes, recompilation, and redeployment. As your organization adds more systems (booking platforms, inventory systems, CRM tools), your AI agent becomes increasingly difficult to maintain and extend.

In this post, we’ll add Model Context Protocol (MCP) to our AI agent, allowing it to dynamically discover and use tools from external services without code changes or redeployment. You’ll learn how MCP enables you to expose your legacy systems, enterprise applications, or microservices to AI agents without rewriting them.
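As a rough sketch of what this looks like with Spring AI’s MCP client support, assuming the MCP client starter is on the classpath and the MCP servers are declared in application properties; the exact bean and method names may vary between Spring AI versions.

```java
// Sketch: expose every tool discovered on the configured MCP servers to the
// ChatClient, so new integrations become a configuration change only.
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.tool.ToolCallbackProvider;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class McpAgentConfig {

    @Bean
    ChatClient chatClient(ChatClient.Builder builder, ToolCallbackProvider mcpTools) {
        // mcpTools is auto-configured from the MCP client starter and
        // returns the tools advertised by the connected MCP servers.
        return builder
                .defaultToolCallbacks(mcpTools.getToolCallbacks())
                .build();
    }
}
```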

In Part 3 of this series, we enhanced our AI agent with domain-specific knowledge through RAG, allowing it to answer questions based on company documents. However, we discovered another critical limitation: when asked about real-time information like weather forecasts or current dates, the agent couldn’t provide accurate answers.

AI models are trained on historical data with a knowledge cutoff date. They don’t have access to real-time information like current weather, today’s date, live flight prices, or currency exchange rates. This makes them unable to help with time-sensitive tasks that require up-to-date information.

In this post, we’ll add tool calling (also known as function calling) to our AI agent, allowing it to access real-time information and take actions by calling external APIs and services.
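Here is a minimal sketch of what such tools can look like with Spring AI’s @Tool annotation; the class name is just an example and the weather lookup is stubbed out rather than calling a real API.

```java
// Sketch: methods the model can call to obtain real-time information.
import java.time.LocalDate;
import org.springframework.ai.tool.annotation.Tool;

public class TravelTools {

    @Tool(description = "Returns today's date in ISO-8601 format")
    public String currentDate() {
        return LocalDate.now().toString();
    }

    @Tool(description = "Returns the weather forecast for a city")
    public String weatherForecast(String city) {
        // In the real agent this would call an external weather API.
        return "Sunny, 24°C in " + city;
    }
}
```

Passing an instance to the prompt, for example with .tools(new TravelTools()) on the ChatClient call, lets the model decide when to invoke these methods.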

In Part 2 of this series, we enhanced our AI agent with conversation memory, allowing it to remember previous interactions and maintain context across sessions. However, we discovered another critical limitation: when asked about company-specific information like travel policies, the agent couldn’t provide accurate answers.

Generic AI models are trained on broad internet data but don’t know your company’s specific policies, procedures, or domain knowledge. They might provide plausible-sounding but incorrect answers (hallucinations), which is unacceptable for business applications.

In this post, we’ll add domain-specific knowledge to our AI agent through RAG (Retrieval-Augmented Generation), allowing it to answer questions based on company documents with accuracy and confidence.
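A minimal sketch of the idea, assuming an auto-configured VectorStore and a travel-policy.txt file on the classpath; the QuestionAnswerAdvisor package has moved between Spring AI releases, so check the imports for your version.

```java
// Sketch: ingest a company document into a vector store, then answer a
// question using the retrieved chunks as context.
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor;
import org.springframework.ai.reader.TextReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.core.io.ClassPathResource;

public class PolicyRag {

    public static String ask(ChatClient.Builder builder, VectorStore vectorStore, String question) {
        // Split the travel policy into chunks and store their embeddings.
        var documents = new TextReader(new ClassPathResource("travel-policy.txt")).get();
        vectorStore.add(new TokenTextSplitter().apply(documents));

        // Retrieve relevant chunks and inject them into the prompt.
        return builder
                .defaultAdvisors(QuestionAnswerAdvisor.builder(vectorStore).build())
                .build()
                .prompt()
                .user(question)
                .call()
                .content();
    }
}
```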

In Part 1 of this series, we built a functional AI agent using Java, Spring AI, and Amazon Bedrock. However, we discovered a critical limitation: the agent couldn’t remember previous conversations. When we asked “What is my name?” after introducing ourselves, the agent had no recollection of our earlier interaction.

This lack of memory creates a frustrating user experience and limits the agent’s usefulness for real-world applications. Imagine a customer service agent that forgets your issue every time you send a message, or a travel assistant that can’t recall your preferences from previous conversations.

In this post, we’ll enhance our AI agent with a three-tier memory architecture that provides both short-term conversation context and long-term user knowledge. We’ll implement this incrementally, starting with persistent session memory, then adding conversation summaries, and finally user preferences—all backed by PostgreSQL for production reliability.
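As a sketch of the first tier only (persistent session memory), assuming a ChatMemory bean backed by Spring AI’s JDBC chat-memory repository pointed at PostgreSQL; class and constant names may differ slightly between Spring AI versions.

```java
// Sketch: attach persistent conversation memory to the ChatClient and key
// each exchange by session id.
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.memory.ChatMemory;

public class MemoryAgent {

    private final ChatClient chatClient;

    public MemoryAgent(ChatClient.Builder builder, ChatMemory chatMemory) {
        // chatMemory is backed by the JDBC repository (PostgreSQL), so the
        // conversation survives container restarts.
        this.chatClient = builder
                .defaultAdvisors(MessageChatMemoryAdvisor.builder(chatMemory).build())
                .build();
    }

    public String chat(String sessionId, String message) {
        return chatClient.prompt()
                .user(message)
                // Scope the stored messages to this user's session.
                .advisors(a -> a.param(ChatMemory.CONVERSATION_ID, sessionId))
                .call()
                .content();
    }
}
```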

Building AI-powered applications has become increasingly important for modern Java developers. With the rise of large language models and AI services, integrating intelligent capabilities into Java applications is no longer a luxury — it’s a necessity for staying competitive.

Spring AI makes this integration seamless by providing a unified framework for building AI-powered applications with Java. Combined with Amazon Bedrock, developers can create sophisticated AI agents that leverage state-of-the-art foundation models without managing complex infrastructure.

In this post, I’ll guide you through creating your first AI agent using Java and Spring AI, connected to Amazon Bedrock. We’ll build a complete application with both REST API and web interface, demonstrating how to integrate AI capabilities into your Java applications.
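To show the shape of the result, here is a minimal sketch of the REST side, assuming the Bedrock starter auto-configures the ChatClient.Builder from application properties; the endpoint path and system prompt are just examples.

```java
// Sketch: a REST endpoint that forwards user messages to the model.
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class AgentController {

    private final ChatClient chatClient;

    public AgentController(ChatClient.Builder builder) {
        this.chatClient = builder
                .defaultSystem("You are a helpful travel assistant.")
                .build();
    }

    @PostMapping("/chat")
    public String chat(@RequestBody String message) {
        return chatClient.prompt().user(message).call().content();
    }
}
```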

Modernizing Java applications has always been a challenge. As a Java developer, I’ve faced the difficulties of upgrading legacy code—managing technical debt, dealing with dependency issues, and investing significant time to ensure a smooth transition. Moving from Java 8 to a newer version often felt like more trouble than it was worth.

Now, Amazon Q Developer makes this process much easier. This AI-powered tool automates the transformation of Java 8 applications to Java 21, significantly reducing the time and effort required. What used to take weeks can now be done quickly, allowing developers to focus on building new features instead of fixing old code.

Easy Start Into Kubernetes With EKS Auto Mode and Eksctl

As a Solutions Architect, my area of interest has always been automation and how it can help developers be more effective and make their lives easier. Many developers are embracing Kubernetes as an environment for their applications. However, setting up a vanilla Kubernetes cluster in the cloud can be quite challenging for a developer who is not keen on doing it the “hard way”. A local setup with various tools is easier, but it consumes resources on the developer’s machine. Wouldn’t it be great if we had a simple way to spin up and tear down a Kubernetes cluster in the cloud in a matter of minutes and immediately use it? Now, with Amazon EKS Auto Mode, it is possible! Let’s find out how.