Spring AI¶
Overview¶
Spring AI is a project within the Spring ecosystem that simplifies the integration of AI capabilities into Spring applications. It provides abstractions and integrations for various AI models, services, and tools, making it easier for Spring developers to incorporate AI functionality into their applications without deep AI expertise. This guide covers the fundamentals of Spring AI, its integration with different AI providers, and practical examples of using AI in Spring Boot applications.
Prerequisites¶
- Basic knowledge of Spring Boot
- Understanding of Java and RESTful API concepts
- Familiarity with AI/ML concepts (helpful but not required)
- Spring Boot application setup experience
Learning Objectives¶
- Understand the Spring AI project and its components
- Set up Spring AI in a Spring Boot application
- Work with different AI model providers through Spring AI
- Implement common AI use cases with Spring AI
- Build AI-powered features in Spring applications
- Handle AI responses and errors appropriately
- Optimize and monitor AI integrations
Table of Contents¶
- Introduction to Spring AI
- Getting Started with Spring AI
- Working with LLMs in Spring
- Text Embedding and Vector Databases
- Image Generation
- Document Processing
- AI-Powered Web Applications
- Prompt Engineering in Spring AI
- Error Handling and Resiliency
- Performance Considerations
- Best Practices
Introduction to Spring AI¶
Spring AI is designed to simplify the integration of AI capabilities into Spring applications by providing consistent abstractions across different AI providers.
Key Concepts¶
- AI Models: Pre-trained algorithms that can perform specific tasks like text generation, classification, image generation, etc.
- Prompts: Input text that guides AI models to produce relevant outputs
- Completion: The output generated by an AI model in response to a prompt
- Embeddings: Numerical representations of text that capture semantic meaning
- Tokens: Units of text that AI models process (words, subwords, or characters)
Spring AI Architecture¶
Spring AI follows a provider-based architecture:
- Core abstractions: Define common interfaces for AI functionality
- Provider implementations: Integrate with specific AI services (OpenAI, Azure, etc.)
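- Supporting components: Cover shared concerns such as prompt templates and vector store integrations; caching, rate limiting, and retries are usually layered on with standard Spring features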
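The layering above can be pictured with a small plain-Java sketch. The names used here (`TextGenerator`, `EchoProvider`, `UppercaseProvider`, `ProviderRegistry`) are hypothetical and are not Spring AI types; they only illustrate how application code can depend on one interface while providers are swapped behind it.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical core abstraction: application code depends only on this interface.
interface TextGenerator {
    String generate(String prompt);
}

// Hypothetical provider implementations standing in for real AI services.
final class EchoProvider implements TextGenerator {
    @Override
    public String generate(String prompt) {
        return "echo: " + prompt;
    }
}

final class UppercaseProvider implements TextGenerator {
    @Override
    public String generate(String prompt) {
        return prompt.toUpperCase(Locale.ROOT);
    }
}

final class ProviderRegistry {
    private final Map<String, TextGenerator> providers = Map.of(
            "echo", new EchoProvider(),
            "upper", new UppercaseProvider());

    // Select a provider by name, much as configuration selects a provider starter.
    TextGenerator byName(String name) {
        TextGenerator generator = providers.get(name);
        if (generator == null) {
            throw new IllegalArgumentException("Unknown provider: " + name);
        }
        return generator;
    }
}
```

Calling code holds a `TextGenerator` and never names a concrete provider, which is the property the core abstractions are meant to give you.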
Supported AI Providers¶
Spring AI supports multiple AI providers:
- OpenAI (GPT models)
- Azure OpenAI
- Google Vertex AI (PaLM 2, Gemini)
- Amazon Bedrock
- Anthropic Claude
- Ollama (for local models)
- HuggingFace
- More providers are added regularly
Getting Started with Spring AI¶
Setting Up Spring AI¶
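Add the Spring AI starter for your provider to your Spring Boot project. Spring AI has no generic spring-ai-starter artifact; each provider ships its own starter. The 0.8.0 release was served from the Spring milestone repository rather than Maven Central, so that repository is declared as well (check the release notes for the version you use):
<!-- Maven -->
<repositories>
    <repository>
        <id>spring-milestones</id>
        <url>https://repo.spring.io/milestone</url>
    </repository>
</repositories>
<!-- Provider-specific starter (OpenAI shown here) -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    <version>0.8.0</version>
</dependency>
Or for Gradle:
repositories {
    mavenCentral()
    maven { url 'https://repo.spring.io/milestone' }
}
dependencies {
    implementation 'org.springframework.ai:spring-ai-openai-spring-boot-starter:0.8.0'
}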
Configuration¶
Configure Spring AI in your application.properties or application.yml:
# OpenAI configuration
spring.ai.openai.api-key=${OPENAI_API_KEY}
# Chat model settings live under chat.options
spring.ai.openai.chat.options.model=gpt-4
spring.ai.openai.chat.options.temperature=0.7
spring.ai.openai.chat.options.max-tokens=500
For Azure OpenAI:
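spring.ai.azure.openai.api-key=${AZURE_OPENAI_API_KEY}
spring.ai.azure.openai.endpoint=${AZURE_OPENAI_ENDPOINT}
# The deployment name is a chat option. Its key has been renamed across releases
# (early versions used chat.options.model), so check the reference for your version
spring.ai.azure.openai.chat.options.deployment-name=${AZURE_OPENAI_DEPLOYMENT}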
Basic Usage¶
A simple example of using Spring AI with OpenAI:
import java.util.List;
import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.SystemPromptTemplate;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
@RestController
public class AiController {
    private final ChatClient chatClient;
    // A single constructor is injected automatically, so @Autowired is not needed
    public AiController(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
    @GetMapping("/chat")
    public String chat(@RequestParam String message) {
        String systemPrompt = "You are a helpful assistant. Provide concise responses.";
        // createMessage() returns a system Message; create() would return a whole Prompt
        Prompt prompt = new Prompt(List.of(
                new SystemPromptTemplate(systemPrompt).createMessage(),
                new UserMessage(message)));
        return chatClient.call(prompt).getResult().getOutput().getContent();
    }
}
Working with LLMs in Spring¶
Large Language Models (LLMs) are the core AI models used for text generation and understanding.
ChatClient for Conversations¶
The ChatClient interface is the primary way to interact with LLMs:
@Service
public class ConversationService {
    private final ChatClient chatClient;
    public ConversationService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
    public String generateResponse(String userMessage) {
        // User input is wrapped in a UserMessage from org.springframework.ai.chat.messages
        Prompt prompt = new Prompt(new UserMessage(userMessage));
        return chatClient.call(prompt).getResult().getOutput().getContent();
    }
    public String generateStructuredResponse(String userMessage) {
        // Prompt takes a list of messages; the system message comes first
        Prompt prompt = new Prompt(List.of(
                new SystemMessage("You are a helpful assistant. Format your response as JSON."),
                new UserMessage(userMessage)));
        return chatClient.call(prompt).getResult().getOutput().getContent();
    }
}
Working with Multiple Messages¶
Build more complex conversations with multiple messages:
public String multiTurnConversation(List<String> conversation) {
    List<Message> messages = new ArrayList<>();
    // Add system message first
    messages.add(new SystemMessage("You are a helpful assistant."));
    // Add conversation history: even indexes are user turns, odd indexes are assistant turns
    for (int i = 0; i < conversation.size(); i++) {
        if (i % 2 == 0) {
            messages.add(new UserMessage(conversation.get(i)));
        } else {
            messages.add(new AssistantMessage(conversation.get(i)));
        }
    }
    Prompt prompt = new Prompt(messages);
    return chatClient.call(prompt).getResult().getOutput().getContent();
}
Streaming Responses¶
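For long responses, stream the output as it is generated so users see text sooner:
// Assumes a StreamingChatClient field. In 0.8.x, streaming lives on this separate
// interface, which OpenAiChatClient implements alongside ChatClient
@GetMapping(value = "/chat/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<String> streamChat(@RequestParam String message) {
    Prompt prompt = new Prompt(new UserMessage(message));
    return streamingChatClient.stream(prompt)
            .map(response -> {
                // Some chunks (for example the final one) may carry no content
                String content = response.getResult().getOutput().getContent();
                return content != null ? content : "";
            });
}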
Handling Structured Data¶
Extract structured data from LLM responses:
public ProductRecommendation getProductRecommendation(String userQuery) {
    String systemPrompt = """
            You are a product recommendation system.
            Analyze the user query and recommend a product.
            Respond with JSON only, using the following structure:
            {
              "productName": "string",
              "description": "string",
              "price": number,
              "relevanceScore": number
            }
            """;
    // SystemMessage is used instead of SystemPromptTemplate because the literal
    // braces in the JSON example would be read as template placeholders
    Prompt prompt = new Prompt(List.of(
            new SystemMessage(systemPrompt),
            new UserMessage(userQuery)));
    String response = chatClient.call(prompt).getResult().getOutput().getContent();
    // Parse JSON to object using Jackson (in real code, inject a shared ObjectMapper)
    ObjectMapper mapper = new ObjectMapper();
    try {
        return mapper.readValue(response, ProductRecommendation.class);
    } catch (JsonProcessingException e) {
        throw new RuntimeException("Failed to parse AI response", e);
    }
}
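Models do not always return bare JSON: replies are often wrapped in a Markdown code fence or surrounded by a sentence of explanation, which makes a direct Jackson parse fail. A minimal cleanup step, assuming the reply contains exactly one top-level JSON object, is to cut from the first opening brace to the last closing brace before parsing. `JsonExtractor` is a hypothetical helper, not part of Spring AI.

```java
final class JsonExtractor {
    private JsonExtractor() {
    }

    // Returns the text from the first '{' to the last '}' (inclusive).
    // Throws if the response contains no such span.
    static String extractJsonObject(String response) {
        int start = response.indexOf('{');
        int end = response.lastIndexOf('}');
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("No JSON object found in response");
        }
        return response.substring(start, end + 1);
    }
}
```

This is deliberately naive: braces in the surrounding prose would confuse it, so treat a parse failure after extraction as a normal error path rather than an impossible case.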
Text Embedding and Vector Databases¶
Text embeddings convert text into numerical vectors that capture semantic meaning, enabling similarity search and retrieval.
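To make "similarity search" concrete, the sketch below compares vectors with cosine similarity and returns the ids of the closest entries. It is an in-memory illustration with hypothetical names (`SimilaritySearchSketch`, `cosine`, `topK`); a real vector store does the same ranking with indexes built for scale.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class SimilaritySearchSketch {
    private SimilaritySearchSketch() {
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means same direction, 0.0 means orthogonal.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // A zero vector has no direction
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the ids of the k entries most similar to the query, best match first.
    static List<String> topK(double[] query, Map<String, double[]> index, int k) {
        List<Map.Entry<String, double[]>> entries = new ArrayList<>(index.entrySet());
        entries.sort(Comparator.comparingDouble(
                (Map.Entry<String, double[]> e) -> cosine(query, e.getValue())).reversed());
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            ids.add(entries.get(i).getKey());
        }
        return ids;
    }
}
```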
Generating Embeddings¶
Use the EmbeddingClient to generate embeddings:
@Service
public class EmbeddingService {
    private final EmbeddingClient embeddingClient;
    public EmbeddingService(EmbeddingClient embeddingClient) {
        this.embeddingClient = embeddingClient;
    }
    public List<Double> generateEmbedding(String text) {
        // embed(String) returns the vector directly as a List<Double>
        return embeddingClient.embed(text);
    }
    public List<List<Double>> batchEmbeddings(List<String> texts) {
        // One vector per input text, in the same order
        return embeddingClient.embed(texts);
    }
}
Integrating with Vector Databases¶
Spring AI supports various vector databases for storing and retrieving embeddings:
Redis as a Vector Store¶
@Configuration
public class VectorStoreConfig {
    @Bean
    public RedisVectorStore redisVectorStore(EmbeddingClient embeddingClient) {
        // Constructor and config builder as in the 0.8.x Redis module; verify against your version
        RedisVectorStoreConfig config = RedisVectorStoreConfig.builder()
                .withURI("redis://localhost:6379")
                .build();
        return new RedisVectorStore(config, embeddingClient);
    }
}
@Service
public class DocumentSearchService {
    // Depending on the VectorStore interface keeps the service independent of Redis
    private final VectorStore vectorStore;
    public DocumentSearchService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }
    public void storeDocument(String id, String content) {
        // The store embeds the content itself using its configured EmbeddingClient
        Document document = new Document(content, Map.of("id", id));
        vectorStore.add(List.of(document));
    }
    public List<Document> searchSimilarDocuments(String query, int k) {
        return vectorStore.similaritySearch(SearchRequest.query(query).withTopK(k));
    }
}
Using PostgreSQL with pgvector¶
@Configuration
public class PgVectorConfig {
    // Spring Boot auto-configures the DataSource and JdbcTemplate from the
    // spring.datasource.* properties, so no explicit DataSource bean is needed
    @Bean
    public PgVectorStore pgVectorStore(JdbcTemplate jdbcTemplate, EmbeddingClient embeddingClient) {
        return new PgVectorStore(jdbcTemplate, embeddingClient);
    }
}
Retrieval Augmented Generation (RAG)¶
Combine document retrieval with LLM generation for knowledge-grounded responses:
@Service
public class RagService {
    private final VectorStore vectorStore;
    private final ChatClient chatClient;
    public RagService(VectorStore vectorStore, ChatClient chatClient) {
        this.vectorStore = vectorStore;
        this.chatClient = chatClient;
    }
    public String answerWithContext(String question) {
        // 1. Retrieve the three most relevant documents
        List<Document> relevantDocs = vectorStore.similaritySearch(
                SearchRequest.query(question).withTopK(3));
        // 2. Format documents as context
        String context = relevantDocs.stream()
                .map(Document::getContent)
                .collect(Collectors.joining("\n\n"));
        // 3. Create prompt with context
        String systemPrompt = """
                You are a helpful assistant. Answer the user's question using the provided context.
                If the answer cannot be found in the context, say "I don't have enough information."
                Context:
                %s
                """.formatted(context);
        // The text is already formatted, so it goes into a plain SystemMessage;
        // a template would misread any braces in the retrieved documents
        Prompt prompt = new Prompt(List.of(
                new SystemMessage(systemPrompt),
                new UserMessage(question)));
        // 4. Generate answer
        return chatClient.call(prompt).getResult().getOutput().getContent();
    }
}
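Retrieval works best when the stored documents are small enough that three of them fit comfortably into the prompt. A common preparation step, sketched here as an assumption rather than something the service above requires, is to split long texts into overlapping chunks before adding them to the vector store. `TextChunker` is a hypothetical helper and splits by characters for simplicity; production code usually splits on sentence or token boundaries.

```java
import java.util.ArrayList;
import java.util.List;

final class TextChunker {
    private TextChunker() {
    }

    // Splits text into chunks of at most chunkSize characters, where each chunk
    // repeats the last `overlap` characters of the previous one.
    static List<String> chunk(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("Require chunkSize > 0 and 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // The last chunk reached the end of the text
            }
        }
        return chunks;
    }
}
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing some text twice.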
Image Generation¶
Spring AI supports image generation capabilities through various providers.
Generating Images with DALL-E¶
@Service
public class ImageGenerationService {
    // ImageClient is the provider-neutral interface; OpenAiImageClient implements it
    private final ImageClient imageClient;
    public ImageGenerationService(ImageClient imageClient) {
        this.imageClient = imageClient;
    }
    public String generateImage(String prompt) {
        ImageOptions options = OpenAiImageOptions.builder()
                .withModel("dall-e-3")
                .withWidth(1024)
                .withHeight(1024)
                .withQuality("standard")
                .build();
        // Instructions and options are combined into an ImagePrompt
        ImageResponse response = imageClient.call(new ImagePrompt(prompt, options));
        return response.getResult().getOutput().getUrl();
    }
    public void generateAndSaveImage(String prompt, String outputPath) throws IOException {
        ImageResponse response = imageClient.call(new ImagePrompt(prompt));
        String imageUrl = response.getResult().getOutput().getUrl();
        // Download and save image
        try (InputStream in = URI.create(imageUrl).toURL().openStream()) {
            Files.copy(in, Path.of(outputPath), StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
Image Variation and Editing¶
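Some providers, such as OpenAI with DALL-E 2, support image variations and editing. The ImageClient abstraction in Spring AI 0.8.x covers generation only, so variations mean calling the provider's API directly. The sketch below assumes a hypothetical variationClient wrapper of your own around that API:
public String createImageVariation(File baseImage) throws IOException {
    ImageOptions options = OpenAiImageOptions.builder()
            .withModel("dall-e-2")
            .withN(1)
            .withWidth(1024)
            .withHeight(1024)
            .build();
    // variationClient and createVariation are hypothetical, not part of Spring AI
    ImageResponse response = variationClient.createVariation(
            new FileSystemResource(baseImage), options);
    return response.getResult().getOutput().getUrl();
}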
Document Processing¶
Spring AI can help process and analyze documents with AI capabilities.
PDF Analysis with LLMs¶
@Service
public class DocumentAnalysisService {
    private final ChatClient chatClient;
    public DocumentAnalysisService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
    public String analyzePdf(MultipartFile pdfFile) throws IOException {
        // 1. Extract text from PDF
        String pdfText = extractTextFromPdf(pdfFile.getInputStream());
        // 2. Generate analysis with LLM
        String systemPrompt = """
                You are a document analysis assistant. Analyze the provided document and extract:
                1. Key points
                2. Main topics
                3. Action items
                4. Summary
                Format your response in Markdown.
                """;
        Prompt prompt = new Prompt(List.of(
                new SystemMessage(systemPrompt),
                new UserMessage("Here's the document: " + pdfText)));
        return chatClient.call(prompt).getResult().getOutput().getContent();
    }
    private String extractTextFromPdf(InputStream pdfStream) throws IOException {
        // Example with Apache PDFBox 2.x; PDFBox 3.x loads documents through its Loader class instead
        try (PDDocument document = PDDocument.load(pdfStream)) {
            PDFTextStripper stripper = new PDFTextStripper();
            return stripper.getText(document);
        }
    }
}
Document Question Answering¶
public String askQuestionAboutDocument(String documentText, String question) {
    String systemPrompt = """
            You are an assistant that answers questions about documents.
            Use only the information in the provided document to answer the question.
            If the answer is not in the document, say "I don't have that information."
            Document:
            %s
            """.formatted(documentText);
    // A plain SystemMessage avoids template parsing of braces inside the document text
    Prompt prompt = new Prompt(List.of(
            new SystemMessage(systemPrompt),
            new UserMessage(question)));
    return chatClient.call(prompt).getResult().getOutput().getContent();
}
AI-Powered Web Applications¶
Integrating Spring AI into web applications creates powerful user experiences.
AI-Powered Chat Interface¶
@Controller
public class ChatController {
    private final ChatClient chatClient;
    public ChatController(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
    @GetMapping("/chat-ui")
    public String chatPage() {
        return "chat";
    }
    // ChatMessage is an application-defined DTO with role and content fields.
    // The inbound payload reuses it rather than a type named UserMessage, which
    // would clash with Spring AI's own UserMessage class
    @MessageMapping("/chat")
    @SendTo("/topic/messages")
    public ChatMessage processMessage(ChatMessage message) {
        String userContent = message.getContent();
        Prompt prompt = new Prompt(List.of(
                new SystemMessage("You are a helpful customer service assistant."),
                new UserMessage(userContent)));
        String response = chatClient.call(prompt).getResult().getOutput().getContent();
        return new ChatMessage("assistant", response);
    }
}
Content Generation for Websites¶
@Service
public class ContentGenerationService {
    private final ChatClient chatClient;
    public ContentGenerationService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
    public String generateBlogPost(String topic, String keywords, int wordCount) {
        String systemPrompt = """
                You are a professional content writer. Generate a blog post about the given topic.
                Use the provided keywords naturally throughout the text.
                The blog post should be approximately %d words.
                Format the post in Markdown with appropriate headings, paragraphs, and bullet points.
                """.formatted(wordCount);
        String userPrompt = """
                Topic: %s
                Keywords: %s
                """.formatted(topic, keywords);
        Prompt prompt = new Prompt(List.of(
                new SystemMessage(systemPrompt),
                new UserMessage(userPrompt)));
        return chatClient.call(prompt).getResult().getOutput().getContent();
    }
    public String generateMetaDescription(String content, int maxLength) {
        // The model treats the length as guidance; enforce maxLength on the result if it is a hard limit
        String systemPrompt = """
                Generate an SEO-friendly meta description for the provided content.
                The description should be engaging and under %d characters.
                """.formatted(maxLength);
        Prompt prompt = new Prompt(List.of(
                new SystemMessage(systemPrompt),
                new UserMessage("Content: " + content)));
        return chatClient.call(prompt).getResult().getOutput().getContent();
    }
}
Prompt Engineering in Spring AI¶
Effective prompt engineering is crucial for getting good results from AI models.
Creating Effective Prompts¶
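@Service
public class PromptService {
    // parameters must supply "format", "instructions", and "userInput"
    public Prompt createStructuredPrompt(String task, Map<String, Object> parameters) {
        // Spring AI templates use {name} placeholders (StringTemplate syntax), not ${name}
        String systemTemplate = """
                You are an AI assistant specialized in {task}.
                Follow these guidelines:
                - Be concise and clear
                - Focus only on the requested task
                - Format your response as {format}
                Additional instructions: {instructions}
                """;
        // Merge the task into the template model so every placeholder has a value
        Map<String, Object> model = new HashMap<>(parameters);
        model.put("task", task);
        SystemPromptTemplate systemPrompt = new SystemPromptTemplate(systemTemplate);
        return new Prompt(List.of(
                systemPrompt.createMessage(model),
                new UserMessage(String.valueOf(parameters.get("userInput")))));
    }
}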
Working with Prompt Templates¶
@Component
public class PromptTemplateService {
    public String processTemplate(String template, Map<String, Object> variables) {
        // PromptTemplate renders {name} placeholders from the supplied map
        return new PromptTemplate(template).render(variables);
    }
    public Prompt createPromptFromTemplate(String systemTemplate,
                                           String userTemplate,
                                           Map<String, Object> variables) {
        Message systemMessage = new SystemPromptTemplate(systemTemplate)
                .createMessage(variables);
        // A plain PromptTemplate produces a user message
        Message userMessage = new PromptTemplate(userTemplate)
                .createMessage(variables);
        return new Prompt(List.of(systemMessage, userMessage));
    }
}
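Template rendering boils down to replacing named placeholders with values from a map. The sketch below shows that mechanism in plain Java, assuming placeholders written as `{name}`. It is a simplified stand-in for illustration, not the template engine Spring AI uses, and `SimpleTemplate` is a hypothetical name.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SimpleTemplate {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    private SimpleTemplate() {
    }

    // Replaces every {name} placeholder with the matching map value.
    // Fails fast when a placeholder has no value, so gaps are caught before the model sees them.
    static String render(String template, Map<String, Object> variables) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            if (!variables.containsKey(name)) {
                throw new IllegalArgumentException("Missing template variable: " + name);
            }
            // quoteReplacement keeps '$' and '\' in values from being treated as regex syntax
            String value = String.valueOf(variables.get(name));
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Failing on a missing variable is a design choice: a prompt sent with an unresolved placeholder tends to produce confusing model output that is harder to trace than an exception.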
Few-Shot Learning Examples¶
// Example is an application-defined class with getInput() and getOutput()
public Prompt createFewShotPrompt(String task, List<Example> examples, String userInput) {
    StringBuilder fewShotExamples = new StringBuilder();
    for (Example example : examples) {
        fewShotExamples.append("Input: ").append(example.getInput()).append("\n");
        fewShotExamples.append("Output: ").append(example.getOutput()).append("\n\n");
    }
    // Placeholders use {name} syntax (StringTemplate), not ${name}
    String systemTemplate = """
            You are an assistant that helps with {task}.
            Here are some examples of how to respond:
            {examples}
            Follow the same pattern for the user's input.
            """;
    Map<String, Object> variables = Map.of(
            "task", task,
            "examples", fewShotExamples.toString());
    return new Prompt(List.of(
            new SystemPromptTemplate(systemTemplate).createMessage(variables),
            new UserMessage(userInput)));
}
Error Handling and Resiliency¶
Proper error handling is essential when working with external AI services.
Handling AI Service Errors¶
@Service
public class ResilientAiService {
    private final ChatClient chatClient;
    private final Logger logger = LoggerFactory.getLogger(ResilientAiService.class);
    public ResilientAiService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
    // AiServiceException is an application-defined RuntimeException
    public String generateContentWithRetry(String prompt) {
        int maxRetries = 3;
        long baseDelayMs = 1000; // 1 second
        for (int attempt = 1; ; attempt++) {
            try {
                return chatClient.call(new Prompt(new UserMessage(prompt)))
                        .getResult().getOutput().getContent();
            } catch (RuntimeException e) {
                logger.warn("AI service call failed (attempt {}): {}", attempt, e.getMessage());
                if (attempt >= maxRetries) {
                    logger.error("Max retries reached for AI call", e);
                    throw new AiServiceException("Failed to generate content after multiple attempts", e);
                }
                try {
                    // Exponential backoff: the delay doubles each attempt (1s, 2s, 4s, ...)
                    Thread.sleep(baseDelayMs << (attempt - 1));
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new AiServiceException("Retry interrupted", ie);
                }
            }
        }
    }
}
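When many clients retry at the same moments, their retries can arrive in synchronized waves. A common refinement, sketched here under the assumption that a capped delay with random jitter suits your workload, is to compute the exponential delay separately and then pick a random wait up to that value. `Backoff` and its method names are hypothetical.

```java
import java.util.concurrent.ThreadLocalRandom;

final class Backoff {
    private Backoff() {
    }

    // Delay before retry number `attempt` (1-based): base * 2^(attempt - 1), capped at maxDelayMs.
    static long exponentialDelayMs(int attempt, long baseDelayMs, long maxDelayMs) {
        if (attempt < 1 || baseDelayMs < 0 || maxDelayMs < 0) {
            throw new IllegalArgumentException("attempt must be >= 1 and delays must be >= 0");
        }
        long delay = baseDelayMs;
        // Double step by step and stop at the cap, which also avoids overflow
        for (int i = 1; i < attempt && delay < maxDelayMs; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxDelayMs);
    }

    // "Full jitter": a random delay between 0 and the exponential delay, inclusive.
    static long withFullJitterMs(int attempt, long baseDelayMs, long maxDelayMs) {
        long cap = exponentialDelayMs(attempt, baseDelayMs, maxDelayMs);
        return ThreadLocalRandom.current().nextLong(cap + 1);
    }
}
```

In a retry loop, `Thread.sleep(Backoff.withFullJitterMs(attempt, 1000, 30_000))` would replace a fixed or purely exponential wait.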
Circuit Breaker Pattern¶
@Configuration
public class ResiliencyConfig {
    // Uses Spring Cloud Circuit Breaker with Resilience4J
    // (spring-cloud-starter-circuitbreaker-resilience4j)
    @Bean
    public Customizer<Resilience4JCircuitBreakerFactory> aiServiceCircuitBreaker() {
        CircuitBreakerConfig circuitBreakerConfig = CircuitBreakerConfig.custom()
                .failureRateThreshold(50)
                .waitDurationInOpenState(Duration.ofSeconds(10))
                .permittedNumberOfCallsInHalfOpenState(5)
                .slidingWindowSize(10)
                .slidingWindowType(CircuitBreakerConfig.SlidingWindowType.COUNT_BASED)
                .build();
        // The default time limit is short for LLM calls, so allow 30 seconds here
        TimeLimiterConfig timeLimiterConfig = TimeLimiterConfig.custom()
                .timeoutDuration(Duration.ofSeconds(30))
                .build();
        return factory -> factory.configure(builder -> builder
                .circuitBreakerConfig(circuitBreakerConfig)
                .timeLimiterConfig(timeLimiterConfig), "aiService");
    }
}
@Service
public class CircuitBreakerAiService {
    private final ChatClient chatClient;
    private final CircuitBreakerFactory<?, ?> circuitBreakerFactory;
    public CircuitBreakerAiService(ChatClient chatClient, CircuitBreakerFactory<?, ?> circuitBreakerFactory) {
        this.chatClient = chatClient;
        this.circuitBreakerFactory = circuitBreakerFactory;
    }
    public String generateContent(String prompt) {
        CircuitBreaker circuitBreaker = circuitBreakerFactory.create("aiService");
        return circuitBreaker.run(
                () -> chatClient.call(new Prompt(new UserMessage(prompt)))
                        .getResult().getOutput().getContent(),
                throwable -> getFallbackResponse(prompt, throwable));
    }
    private String getFallbackResponse(String prompt, Throwable throwable) {
        return "I'm unable to generate a response at this time. Please try again later.";
    }
}
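For readers new to the pattern, the state machine behind a circuit breaker can be shown in a few lines. The sketch below is a simplified illustration with hypothetical names: it opens after a number of consecutive failures rather than tracking a failure rate over a sliding window as the configuration above does, and it takes the current time as a parameter so the behaviour can be exercised without sleeping.

```java
import java.util.function.Supplier;

final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openDurationMs;
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAtMs = 0;

    SimpleCircuitBreaker(int failureThreshold, long openDurationMs) {
        this.failureThreshold = failureThreshold;
        this.openDurationMs = openDurationMs;
    }

    synchronized State state() {
        return state;
    }

    // Runs the call unless the breaker is open; on failure or while open, returns the fallback.
    synchronized <T> T run(Supplier<T> call, Supplier<T> fallback, long nowMs) {
        if (state == State.OPEN) {
            if (nowMs - openedAtMs < openDurationMs) {
                return fallback.get(); // Fail fast without touching the remote service
            }
            state = State.HALF_OPEN; // Wait time elapsed: allow one trial call
        }
        try {
            T result = call.get();
            consecutiveFailures = 0;
            state = State.CLOSED;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            // A failed trial call, or too many failures in a row, opens the breaker
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;
                openedAtMs = nowMs;
            }
            return fallback.get();
        }
    }
}
```

The sketch holds its lock while the call runs, which serializes callers; a production library avoids that, which is one more reason to prefer an existing implementation over a hand-rolled one.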
Performance Considerations¶
Optimizing AI interactions improves application performance and reduces costs.
Caching Responses¶
@Configuration
@EnableCaching
public class CacheConfig {
    @Bean
    public CacheManager cacheManager() {
        // In-memory cache without eviction or expiry; suitable for demos and small deployments
        SimpleCacheManager cacheManager = new SimpleCacheManager();
        cacheManager.setCaches(List.of(
                new ConcurrentMapCache("aiResponses")));
        return cacheManager;
    }
}
@Service
public class CachedAiService {
    private final ChatClient chatClient;
    public CachedAiService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
    // The prompt text itself is the cache key. Keying on hashCode() risks collisions
    // that would return one prompt's answer for a different prompt
    @Cacheable("aiResponses")
    public String generateResponse(String promptText) {
        return chatClient.call(new Prompt(new UserMessage(promptText)))
                .getResult().getOutput().getContent();
    }
    @CacheEvict(value = "aiResponses", allEntries = true)
    public void clearCache() {
        // Method to clear the cache
    }
}
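If the cache has to distinguish between models, or prompts are long enough that storing them as keys is wasteful, a fixed-length digest of the inputs makes a compact key. The helper below is a sketch with hypothetical names (`CacheKeys`, `forPrompt`); it assumes that the model name and the trimmed prompt text are the only inputs that affect the response.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class CacheKeys {
    private CacheKeys() {
    }

    // Builds a 64-character hex key from the model name and the prompt text.
    static String forPrompt(String model, String promptText) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            digest.update(model.getBytes(StandardCharsets.UTF_8));
            // Separator byte so ("ab", "c") and ("a", "bc") produce different keys
            digest.update((byte) 0);
            digest.update(promptText.strip().getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

Caching only pays off for deterministic settings; with a high temperature, returning a stored answer changes behaviour that callers may rely on.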
Batching Requests¶
@Service
public class BatchProcessingService {
    private final EmbeddingClient embeddingClient;
    public BatchProcessingService(EmbeddingClient embeddingClient) {
        this.embeddingClient = embeddingClient;
    }
    public List<List<Double>> batchProcessDocuments(List<String> documents) {
        // Process in batches of 20 to stay within provider request limits
        int batchSize = 20;
        List<List<Double>> allEmbeddings = new ArrayList<>();
        for (int i = 0; i < documents.size(); i += batchSize) {
            int end = Math.min(i + batchSize, documents.size());
            List<String> batch = documents.subList(i, end);
            // embed(List<String>) returns one vector per input, in order
            allEmbeddings.addAll(embeddingClient.embed(batch));
        }
        return allEmbeddings;
    }
}
Optimizing Token Usage¶
@Service
public class TokenOptimizationService {
    // Rough input budget in tokens; adjust to the context window of your model
    private static final int MAX_INPUT_TOKENS = 4000;
    private final ChatClient chatClient;
    public TokenOptimizationService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
    public String summarizeWithTokenLimit(String text, int maxTokens) {
        // Truncate the input if it exceeds the budget
        if (estimateTokens(text) > MAX_INPUT_TOKENS) {
            text = truncateText(text, MAX_INPUT_TOKENS);
        }
        String systemPrompt = """
                Summarize the following text concisely.
                Focus on the most important information.
                """;
        ChatOptions options = OpenAiChatOptions.builder()
                .withMaxTokens(maxTokens)
                .withTemperature(0.3f) // Lower temperature for more focused response
                .build();
        // Options are passed with the Prompt; ChatClient.call takes a single argument
        List<Message> messages = List.of(
                new SystemMessage(systemPrompt),
                new UserMessage(text));
        Prompt prompt = new Prompt(messages, options);
        return chatClient.call(prompt).getResult().getOutput().getContent();
    }
    private int estimateTokens(String text) {
        // Simple estimation: ~4 characters per token for English text
        return text.length() / 4;
    }
    private String truncateText(String text, int tokenLimit) {
        // Simple truncation strategy
        int charLimit = tokenLimit * 4;
        if (text.length() <= charLimit) {
            return text;
        }
        return text.substring(0, charLimit) + "...";
    }
}
Best Practices¶
Security Considerations¶
@Configuration
public class AiSecurityConfig {
    @Bean
    public OpenAiChatClient openAiChatClient(
            @Value("${spring.ai.openai.api-key}") String apiKey) {
        // Fail fast on a missing key; the "sk-" prefix check is a heuristic for OpenAI-issued keys
        if (apiKey == null || !apiKey.startsWith("sk-")) {
            throw new IllegalArgumentException("Invalid OpenAI API key format");
        }
        // The key goes to the low-level API client; model settings go in the options
        return new OpenAiChatClient(new OpenAiApi(apiKey),
                OpenAiChatOptions.builder()
                        .withModel("gpt-4")
                        .build());
    }
}
// Registered once through @Component; an additional @Bean method would create a duplicate bean
@Component
public class SecurityFilter {
    public String sanitizeUserInput(String input) {
        // Remove characters that are risky if the text is later rendered as HTML.
        // This does not protect against prompt injection
        return input.replaceAll("[<>]", "")
                .replaceAll("(?i)javascript:", "")
                .trim();
    }
    public boolean containsSensitiveInformation(String text) {
        // Check for patterns that might indicate sensitive data
        Pattern creditCardPattern = Pattern.compile("\\d{4}[- ]?\\d{4}[- ]?\\d{4}[- ]?\\d{4}");
        Pattern ssnPattern = Pattern.compile("\\d{3}[- ]?\\d{2}[- ]?\\d{4}");
        return creditCardPattern.matcher(text).find() || ssnPattern.matcher(text).find();
    }
}
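The credit card pattern above matches any run of sixteen digits, including order numbers and timestamps. One way to cut false positives, offered here as a sketch, is to confirm a candidate with the Luhn checksum that real card numbers satisfy. `CardNumberCheck` is a hypothetical helper.

```java
final class CardNumberCheck {
    private CardNumberCheck() {
    }

    // Returns true when the candidate (digits with optional spaces or dashes)
    // has a plausible card length and a valid Luhn checksum.
    static boolean passesLuhn(String candidate) {
        String digits = candidate.replaceAll("[- ]", "");
        if (!digits.matches("[0-9]{13,19}")) {
            return false;
        }
        int sum = 0;
        boolean doubleIt = false; // Every second digit from the right is doubled
        for (int i = digits.length() - 1; i >= 0; i--) {
            int d = digits.charAt(i) - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) {
                    d -= 9; // Same as adding the two digits of the product
                }
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return sum % 10 == 0;
    }
}
```

A passing checksum does not prove that a number is a real card, only that it is worth treating as sensitive.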
Cost Management¶
@Service
public class AiCostManagementService {
    private final ChatClient chatClient;
    private final AtomicLong tokenUsage = new AtomicLong(0);
    public AiCostManagementService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
    public String generateResponse(String prompt, boolean isHighPriority) {
        // Select model based on priority/complexity
        String model = isHighPriority ? "gpt-4" : "gpt-3.5-turbo";
        ChatOptions options = OpenAiChatOptions.builder()
                .withModel(model)
                .withMaxTokens(isHighPriority ? 500 : 250)
                .build();
        // Options travel with the Prompt and override the configured defaults
        ChatResponse response = chatClient.call(new Prompt(prompt, options));
        // Track usage: Usage reports prompt (input) and generation (output) tokens
        Usage usage = response.getMetadata().getUsage();
        tokenUsage.addAndGet(usage.getPromptTokens() + usage.getGenerationTokens());
        return response.getResult().getOutput().getContent();
    }
    public long getTotalTokenUsage() {
        return tokenUsage.get();
    }
    public double estimateCost() {
        // Approximate cost calculation with a single flat rate
        // Rates vary by model and provider, and input and output are often priced differently
        return tokenUsage.get() * 0.00002; // $0.02 per 1000 tokens
    }
}
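A single flat rate hides the fact that providers commonly price input and output tokens differently and charge different amounts per model. The sketch below keeps a rate table keyed by model name. The class names are hypothetical, and any rates passed in are placeholders to be replaced with your provider's current price list.

```java
import java.util.Map;

final class CostEstimator {
    // Price in USD per 1,000 tokens, split by direction.
    static final class Rate {
        final double inputPer1k;
        final double outputPer1k;

        Rate(double inputPer1k, double outputPer1k) {
            this.inputPer1k = inputPer1k;
            this.outputPer1k = outputPer1k;
        }
    }

    private final Map<String, Rate> rates;

    CostEstimator(Map<String, Rate> rates) {
        this.rates = Map.copyOf(rates);
    }

    // Estimated cost in USD for one call, given its token counts.
    double estimateUsd(String model, long inputTokens, long outputTokens) {
        Rate rate = rates.get(model);
        if (rate == null) {
            throw new IllegalArgumentException("No rate configured for model: " + model);
        }
        return inputTokens / 1000.0 * rate.inputPer1k
                + outputTokens / 1000.0 * rate.outputPer1k;
    }
}
```

Tracking input and output counts separately per model, instead of one running total, is what makes this kind of estimate possible.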
Monitoring AI Services¶
@Configuration
public class MonitoringConfig {
    // Only needed without Spring Boot Actuator, which auto-configures a MeterRegistry
    @Bean
    public MeterRegistry meterRegistry() {
        return new SimpleMeterRegistry();
    }
}
@Service
public class MonitoredAiService {
    private final ChatClient chatClient;
    private final MeterRegistry meterRegistry;
    private final Logger logger = LoggerFactory.getLogger(MonitoredAiService.class);
    public MonitoredAiService(ChatClient chatClient, MeterRegistry meterRegistry) {
        this.chatClient = chatClient;
        this.meterRegistry = meterRegistry;
    }
    public String generateContent(String prompt) {
        Timer.Sample sample = Timer.start(meterRegistry);
        String outcome = "success";
        try {
            ChatResponse response = chatClient.call(new Prompt(new UserMessage(prompt)));
            // Record token metrics
            Usage usage = response.getMetadata().getUsage();
            meterRegistry.counter("ai.tokens.input").increment(usage.getPromptTokens());
            meterRegistry.counter("ai.tokens.output").increment(usage.getGenerationTokens());
            return response.getResult().getOutput().getContent();
        } catch (RuntimeException e) {
            outcome = "error";
            meterRegistry.counter("ai.request.errors").increment();
            logger.error("AI request failed", e);
            throw e;
        } finally {
            // Stop the timer on both paths so failed calls are measured too. The tag is the
            // outcome because the 0.8.x response metadata does not expose the model name
            sample.stop(meterRegistry.timer("ai.request.duration", "outcome", outcome));
        }
    }
}
Testing AI Integrations¶
// Plain Mockito unit test: mocking the ChatClient interface avoids loading the
// Spring context and never calls the real provider
@ExtendWith(MockitoExtension.class)
class ConversationServiceTests {
    @Mock
    private ChatClient chatClient;
    @InjectMocks
    private ConversationService conversationService;
    @Test
    void shouldGenerateResponse() {
        // Arrange
        String expectedResponse = "This is a mock response";
        when(chatClient.call(any(Prompt.class)))
                .thenReturn(new ChatResponse(List.of(new Generation(expectedResponse))));
        // Act
        String response = conversationService.generateResponse("Test prompt");
        // Assert
        assertEquals(expectedResponse, response);
        verify(chatClient).call(any(Prompt.class));
    }
    @Test
    void shouldPropagateApiError() {
        // Arrange
        when(chatClient.call(any(Prompt.class))).thenThrow(new RuntimeException("API error"));
        // Act & Assert
        assertThrows(RuntimeException.class,
                () -> conversationService.generateResponse("Test prompt"));
    }
}
Summary¶
Spring AI provides a powerful abstraction layer for integrating AI capabilities into Spring applications. Key benefits include:
- Consistent API across different AI providers
- Seamless integration with the Spring ecosystem
- Support for various AI models and capabilities
- Tools for text generation, embeddings, image generation, and more
- Easy to combine with Spring's error handling, resiliency, caching, and monitoring tooling
By following the practices outlined in this guide, you can effectively leverage AI to enhance your Spring applications while maintaining good performance, security, and cost management.