In Part 2 of this series, we enhanced our AI agent with conversation memory, allowing it to remember previous interactions and maintain context across sessions. However, we discovered another critical limitation: when asked about company-specific information like travel policies, the agent couldn’t provide accurate answers.

Generic AI models are trained on broad internet data but don’t know your company’s specific policies, procedures, or domain knowledge. They might provide plausible-sounding but incorrect answers (hallucinations), which is unacceptable for business applications.

In this post, we’ll add domain-specific knowledge to our AI agent through RAG (Retrieval-Augmented Generation), allowing it to answer questions based on company documents with accuracy and confidence.

Overview of the Solution

We’ll solve the knowledge problem with RAG (Retrieval-Augmented Generation):

  1. Convert company documents (travel policy) into vector embeddings
  2. Store embeddings in a vector database (PGVector)
  3. Automatically retrieve relevant sections when users ask questions
  4. Ground AI responses in actual company documents

What are vector embeddings?

Vector embeddings are numerical representations of words, sentences, or other data in a continuous vector space, where items with similar meanings are located close together. For example, the vectors for “Paris” and “France” will be close to each other, just as “Tokyo” and “Japan” are, because the model learns their relationship from context. This lets computers measure semantic similarity, and it is widely used in search engines, recommendation systems, and chatbots.
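
To make “close together” concrete, here is a minimal sketch that compares vectors with cosine similarity, a common closeness measure for embeddings. The three-dimensional vectors and the class name are made up for illustration; they are not output from Titan Embeddings or any other real model.

```java
// Toy illustration: hand-made 3-dimensional "embeddings". Real embedding
// models produce vectors with hundreds or thousands of dimensions.
final class CosineSimilarityDemo {

    // Made-up vectors for illustration only; not output of any real model.
    static final double[] PARIS  = {0.9, 0.8, 0.1};
    static final double[] FRANCE = {0.8, 0.9, 0.2};
    static final double[] BANANA = {0.1, 0.2, 0.9};

    // Cosine similarity: dot(a, b) / (|a| * |b|). Values near 1.0 mean the
    // vectors point in almost the same direction.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        System.out.printf("Paris vs France: %.3f%n", cosine(PARIS, FRANCE));
        System.out.printf("Paris vs Banana: %.3f%n", cosine(PARIS, BANANA));
    }
}
```

With these toy values, the Paris/France score comes out much higher than the Paris/Banana score, which is the property a semantic search relies on.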

Architecture Overview

User Question
      ↓
[Vector Store] ← PGVector (company policies, RAG)
      ↓
AI Model (Amazon Bedrock)
      ↓
Response (grounded in documents)

Key Spring AI Components

Prerequisites

Before you start, ensure you have:

  • Completed Part 2 of this series with the working ai-agent application
  • Java 21 JDK installed (Amazon Corretto 21)
  • Maven 3.6+ installed
  • Docker Desktop running (for Testcontainers with PostgreSQL/PGVector)
  • AWS CLI configured with access to Amazon Bedrock
  • Access to Amazon Bedrock models (specifically Titan Embeddings) in your AWS account

Navigate to your project directory from Part 2:

cd ai-agent

Travel Policy and RAG

Why RAG?

RAG (Retrieval-Augmented Generation) grounds AI responses in actual documents, which reduces hallucinations and improves accuracy.

Why PGVector?

Spring AI supports multiple vector databases:

  • PGVector (PostgreSQL extension - what we’ll use)
  • OpenSearch
  • Pinecone
  • Weaviate
  • Milvus
  • Chroma
  • Redis

We chose PGVector because:

  • ✅ Reuses existing PostgreSQL database (from Part 2)
  • ✅ Single database for both memory and knowledge
  • ✅ No additional infrastructure needed
  • ✅ Avoids the added complexity of OpenSearch and other specialized vector databases, which are excellent for large-scale production
  • ✅ PGVector is sufficient for most applications

For large-scale production with millions of documents, consider dedicated vector databases like OpenSearch or Pinecone.
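
Whichever store you choose, the core operation is the same: keep text together with its embedding, then return the entries nearest to a query vector. The sketch below is an in-memory stand-in, not how PGVector or Spring AI's VectorStore is implemented. It assumes embeddings are normalized to unit length so that a plain dot product can rank results, and all names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// In-memory stand-in for a vector database (illustration only).
final class InMemoryVectorStoreSketch {

    // A stored text together with its embedding vector.
    record Entry(String text, double[] embedding) {}

    private final List<Entry> entries = new ArrayList<>();

    void add(String text, double[] embedding) {
        entries.add(new Entry(text, embedding));
    }

    // Brute-force search: score every entry, return the topK best texts.
    List<String> search(double[] query, int topK) {
        return entries.stream()
                .sorted(Comparator.comparingDouble((Entry e) -> -dot(query, e.embedding())))
                .limit(topK)
                .map(Entry::text)
                .toList();
    }

    // Dot product; ranks like cosine similarity when vectors have unit length.
    static double dot(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        InMemoryVectorStoreSketch store = new InMemoryVectorStoreSketch();
        store.add("Accommodation limits", new double[]{1.0, 0.0});
        store.add("Air travel rules", new double[]{0.0, 1.0});
        System.out.println(store.search(new double[]{0.8, 0.6}, 1));
    }
}
```

A real vector database does the same job at scale, typically with indexes that avoid scanning every entry.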

Add RAG Dependencies

Open pom.xml and add these dependencies to the <dependencies> section:

<!-- RAG Dependencies -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-vector-store</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-advisors-vector-store</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-vector-store-pgvector</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-bedrock</artifactId>
</dependency>

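Configure Titan Embeddings and PGVector. The dimensions value must match the embedding model’s output size (Titan Text Embeddings V2 produces 1024-dimensional vectors by default):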

cat >> src/main/resources/application.properties << 'EOF'

# RAG Configuration
spring.ai.model.embedding=bedrock-titan
spring.ai.bedrock.titan.embedding.model=amazon.titan-embed-text-v2:0
spring.ai.bedrock.titan.embedding.input-type=text
spring.ai.vectorstore.pgvector.initialize-schema=true
spring.ai.vectorstore.pgvector.dimensions=1024
EOF

Create Travel Policy

Create the company travel and expense policy document:

mkdir -p samples
cat <<'EOF' > samples/policy-travel.md
# Travel Policy

## Purpose
Guidelines for booking business travel including flights, accommodation, and transportation.

## Accommodation

### Regional Limits
- Europe: Maximum €130 per night
- North America: Maximum $150 per night
- Asia Pacific: Maximum $120 per night

### Booking Requirements
- Use company-preferred vendors when available
- Exceeding regional limits requires Director approval

## Transportation

### Air Travel
- Economy class for standard flights
- Business class requires Director approval

### Ground Transportation
- Taxi/rideshare for airport transfers and business meetings
- Car rentals require Manager approval

## Approval Requirements
- All travel must be pre-approved by Manager
- Exceptions to policy limits require Director approval
EOF

Create VectorStoreService and Controller

Create a service to load documents into the vector store:

cat <<'EOF' > src/main/java/com/example/ai/agent/service/VectorStoreService.java
package com.example.ai.agent.service;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.List;

@Service
public class VectorStoreService {
    private static final Logger logger = LoggerFactory.getLogger(VectorStoreService.class);

    private final VectorStore vectorStore;

    public VectorStoreService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    // Add content to vector store for semantic search.
    // The whole text is stored as a single Document; no chunking is applied.
    public void addContent(String content) {
        logger.info("Adding content to vector store: {} chars", content.length());
        vectorStore.add(List.of(new Document(content)));
    }
}
EOF
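
The service above stores the whole policy as one Document, which is fine for a short file. For longer documents, a common approach is to split the text into smaller chunks so that retrieval returns only the relevant section. The following is a minimal plain-Java sketch of one possible strategy, splitting Markdown at "## " headings. The class and method names are hypothetical and not part of the project or of Spring AI.

```java
import java.util.ArrayList;
import java.util.List;

final class MarkdownSectionSplitter {

    // Splits Markdown into chunks, starting a new chunk at every "## " heading.
    // Text before the first "## " heading (for example the title) becomes its own chunk.
    static List<String> splitByH2(String markdown) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : markdown.split("\n", -1)) {
            // "### " subsections do not match and stay inside their parent section.
            if (line.startsWith("## ") && !current.toString().isBlank()) {
                chunks.add(current.toString().strip());
                current.setLength(0);
            }
            current.append(line).append('\n');
        }
        if (!current.toString().isBlank()) {
            chunks.add(current.toString().strip());
        }
        return chunks;
    }

    public static void main(String[] args) {
        String policy = """
                # Travel Policy

                ## Accommodation
                - Europe: Maximum €130 per night

                ## Transportation
                - Economy class for standard flights
                """;
        splitByH2(policy).forEach(chunk -> System.out.println("--- chunk ---\n" + chunk));
    }
}
```

Each chunk could then be wrapped in its own Document before calling vectorStore.add. Spring AI also ships text splitters such as TokenTextSplitter, which may be a better fit than a hand-written splitter.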

Create a controller to expose the load endpoint:

cat <<'EOF' > src/main/java/com/example/ai/agent/controller/VectorStoreController.java
package com.example.ai.agent.controller;

import com.example.ai.agent.service.VectorStoreService;

import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("api/admin")
public class VectorStoreController {
    private final VectorStoreService vectorStoreService;

    public VectorStoreController(VectorStoreService vectorStoreService) {
        this.vectorStoreService = vectorStoreService;
    }

    @PostMapping("rag-load")
    public void loadDataToVectorStore(@RequestBody String content) {
        vectorStoreService.addContent(content);
    }
}
EOF

Why a separate controller? The /api/admin/rag-load endpoint is administrative (it loads company documents) and will require different security permissions (ROLE_ADMIN) than the user-facing chat endpoints (ROLE_USER).

Update ChatService with RAG

Add QuestionAnswerAdvisor to ChatService:

src/main/java/com/example/ai/agent/service/ChatService.java

...
    public ChatService(ChatMemoryService chatMemoryService,
                       VectorStore vectorStore,
                       ChatClient.Builder chatClientBuilder) {

        // Build ChatClient with RAG and memory
        this.chatClient = chatClientBuilder
                .defaultSystem(SYSTEM_PROMPT)
                .defaultAdvisors(QuestionAnswerAdvisor.builder(vectorStore).build()) // RAG for policies
                .build();

        this.chatMemoryService = chatMemoryService;
    }
...
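
Conceptually, a RAG advisor such as QuestionAnswerAdvisor looks up documents similar to the user’s question and adds them to the prompt before the model is called. The sketch below shows that idea in plain Java. It is a simplification under our own assumptions, not Spring AI’s actual implementation; the prompt wording and all names are hypothetical.

```java
import java.util.List;

final class PromptAugmentationSketch {

    // Combines retrieved document chunks and the user's question into one prompt.
    static String augment(String question, List<String> retrievedChunks) {
        if (retrievedChunks.isEmpty()) {
            // In this sketch, the question is sent unchanged when nothing was retrieved.
            return question;
        }
        String context = String.join("\n---\n", retrievedChunks);
        return """
                Answer the question using only the context below.
                If the context does not contain the answer, say you don't know.

                Context:
                %s

                Question: %s
                """.formatted(context, question);
    }

    public static void main(String[] args) {
        String prompt = augment(
                "What is the hotel budget for France?",
                List.of("### Regional Limits\n- Europe: Maximum €130 per night"));
        System.out.println(prompt);
    }
}
```

Because the retrieved policy text travels with the question, the model can answer from the company document rather than from its general training data.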

Testing Knowledge

Let’s test the RAG system with policy questions:

./mvnw spring-boot:test-run

In another terminal, load the policy document:

curl -X POST http://localhost:8080/api/admin/rag-load \
-H "Content-Type: text/plain" \
--data-binary @samples/policy-travel.md

Test with REST API:

# Test hotel budget question
curl -X POST http://localhost:8080/api/chat/message \
-H "Content-Type: application/json" \
-d '{"prompt": "What is the hotel budget for France?", "userId": "alice"}'
# Response: "Based on your company's travel policy, the hotel budget for France is:
# Hotel Budget for France
# **Maximum:** €130 per night"


Success! The agent provides accurate policy information grounded in company documents.

You can also test in the UI at http://localhost:8080 - ask policy questions and see grounded responses.

Let’s continue the chat:

curl -X POST http://localhost:8080/api/chat/message \
-H "Content-Type: application/json" \
-d '{"prompt": "I would like to travel to Paris next week from Monday to Friday. What is the weather forecast?", "userId": "alice"}'
# Response: "I don't know - I don't have access to the current information"


Problem discovered: The agent can’t perform actions or access real-time information such as the current date and the weather forecast.

We’ll address this limitation in the next part of the series! Stay tuned!

Cleanup

To stop the application, press Ctrl+C in the terminal where it’s running.

The PostgreSQL container will continue running (due to withReuse(true)). If necessary, stop and remove it:

docker stop ai-agent-postgres
docker rm ai-agent-postgres

(Optional) To remove all data and start fresh (note that this removes unused Docker volumes in general, not only those created by this project):

docker volume prune

Commit Changes

git add .
git commit -m "Add RAG with travel policy"

Conclusion

In this post, we’ve added domain-specific knowledge to our AI agent through RAG:

  • PGVector for vector similarity search
  • Titan Embeddings for semantic understanding
  • QuestionAnswerAdvisor for automatic context retrieval
  • Grounded responses in company documents

The foundation we’ve built—memory and knowledge—is essential for any production AI agent. You now have the tools to create intelligent assistants that remember conversations and provide accurate, domain-specific information.

What’s Next

In the next part, we’ll give our AI agent access to real-time information. The agent will finally be able to answer questions like “What is the weather forecast?” by calling tools and executing functions.

Challenges and Solutions - Part 3

Learn More

Let’s continue building intelligent Java applications with Spring AI!
