# Transform Your Backend into a Smart Autonomous Decision Layer

Building Intelligent, Agentic APIs with ASP.NET Core and Azure OpenAI

## Executive Summary
Modern applications need far more than static JSON—they require intelligence, reasoning, and autonomous action. By integrating Azure OpenAI into ASP.NET Core, you can build agentic APIs capable of understanding natural language, analyzing content, and orchestrating workflows with minimal human intervention.
This guide shows how to go beyond basic chatbot calls and create production-ready AI APIs, unlocking:

- Natural language decision-making
- Content analysis pipelines
- Real-time streaming responses
- Tool calling for agent workflows
- Resilient patterns suited for enterprise delivery
Whether you’re automating business operations or creating smart assistants, this blueprint gives you everything you need.
## Prerequisites

Before writing a single line of code, make sure you have:

- .NET 8 or later (the samples use C# 12 primary constructors)
- An Azure subscription
- An Azure OpenAI model deployment (gpt-4o-mini recommended)
- An IDE (Visual Studio or VS Code)
- Your Azure OpenAI API key and endpoint
- Familiarity with async patterns and dependency injection
## Required NuGet Packages

Install these packages in your ASP.NET Core project:

```bash
dotnet add package Azure.AI.OpenAI
dotnet add package Azure.Identity
dotnet add package Microsoft.Extensions.AI
dotnet add package Microsoft.Extensions.Configuration.UserSecrets
```
## Step 1 — Securely Configure Azure OpenAI

### Options class

Start by setting up secure credential management. Create a configuration class to encapsulate Azure OpenAI settings:

```csharp
namespace YourApp.AI.Configuration;

public class AzureOpenAIOptions
{
    public string Endpoint { get; set; } = string.Empty;
    public string DeploymentName { get; set; } = string.Empty;
    public string ApiKey { get; set; } = string.Empty;
}
```
Add your credentials to `appsettings.json`:

```json
{
  "AzureOpenAI": {
    "Endpoint": "https://your-resource.openai.azure.com/",
    "DeploymentName": "gpt-4o-mini",
    "ApiKey": "your-api-key-here"
  }
}
```
For local development, use .NET user secrets to avoid committing credentials:

```bash
dotnet user-secrets init
dotnet user-secrets set "AzureOpenAI:Endpoint" "https://your-resource.openai.azure.com/"
dotnet user-secrets set "AzureOpenAI:DeploymentName" "gpt-4o-mini"
dotnet user-secrets set "AzureOpenAI:ApiKey" "your-api-key-here"
```
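To fail fast on a missing setting rather than at the first AI call, you can also validate the options when the host starts. A minimal sketch using the standard options builder (an alternative to the plain `Configure` call shown in Step 3):

```csharp
// In Program.cs: binds the "AzureOpenAI" section and rejects empty values at startup.
builder.Services.AddOptions<AzureOpenAIOptions>()
    .Bind(builder.Configuration.GetSection("AzureOpenAI"))
    .Validate(o => !string.IsNullOrWhiteSpace(o.Endpoint), "AzureOpenAI:Endpoint is required.")
    .Validate(o => !string.IsNullOrWhiteSpace(o.ApiKey), "AzureOpenAI:ApiKey is required.")
    .ValidateOnStart();
```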
## Step 2 — Create an AI Abstraction Service

Build a clean abstraction layer that isolates Azure OpenAI details from your business logic:

```csharp
namespace YourApp.AI.Services;

using Azure;
using Azure.AI.OpenAI;
using Microsoft.Extensions.Options;
using YourApp.AI.Configuration;

public interface IAIService
{
    Task<string> GenerateResponseAsync(string userMessage, CancellationToken cancellationToken = default);
    Task<string> AnalyzeContentAsync(string content, string analysisPrompt, CancellationToken cancellationToken = default);
    IAsyncEnumerable<string> StreamResponseAsync(string userMessage, CancellationToken cancellationToken = default);
}

// NOTE: targets the Azure.AI.OpenAI 1.0.0-beta client surface; newer SDK
// versions expose a different (ChatClient-based) API, so adjust to your package.
public class AzureOpenAIService(IOptions<AzureOpenAIOptions> options) : IAIService
{
    private readonly AzureOpenAIOptions _options = options.Value;
    private OpenAIClient? _client;

    // Lazily create a single client instance so HTTP connections are reused.
    private OpenAIClient Client => _client ??= new OpenAIClient(
        new Uri(_options.Endpoint),
        new AzureKeyCredential(_options.ApiKey));

    public async Task<string> GenerateResponseAsync(string userMessage, CancellationToken cancellationToken = default)
    {
        var chatCompletionsOptions = new ChatCompletionsOptions
        {
            DeploymentName = _options.DeploymentName,
            Temperature = 0.7f,
            MaxTokens = 2000,
            Messages =
            {
                new ChatRequestSystemMessage("You are a helpful assistant that provides accurate, concise responses."),
                new ChatRequestUserMessage(userMessage)
            }
        };

        var response = await Client.GetChatCompletionsAsync(chatCompletionsOptions, cancellationToken);
        return response.Value.Choices[0].Message.Content;
    }

    public async Task<string> AnalyzeContentAsync(string content, string analysisPrompt, CancellationToken cancellationToken = default)
    {
        var chatCompletionsOptions = new ChatCompletionsOptions
        {
            DeploymentName = _options.DeploymentName,
            Messages =
            {
                new ChatRequestSystemMessage($"You are an expert analyst. {analysisPrompt}"),
                new ChatRequestUserMessage(content)
            }
        };

        var response = await Client.GetChatCompletionsAsync(chatCompletionsOptions, cancellationToken);
        return response.Value.Choices[0].Message.Content;
    }

    public async IAsyncEnumerable<string> StreamResponseAsync(
        string userMessage,
        [System.Runtime.CompilerServices.EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var chatCompletionsOptions = new ChatCompletionsOptions
        {
            DeploymentName = _options.DeploymentName,
            Messages =
            {
                new ChatRequestSystemMessage("You are a helpful assistant."),
                new ChatRequestUserMessage(userMessage)
            }
        };

        using var streamingResponse = await Client.GetChatCompletionsStreamingAsync(chatCompletionsOptions, cancellationToken);
        await foreach (var update in streamingResponse)
        {
            if (update.ContentUpdate is not null)
            {
                yield return update.ContentUpdate;
            }
        }
    }
}
```
## Step 3 — Register Services in Dependency Injection

Configure your services in `Program.cs`:

```csharp
var builder = WebApplication.CreateBuilder(args);

// Bind configuration
builder.Services.Configure<AzureOpenAIOptions>(
    builder.Configuration.GetSection("AzureOpenAI"));

// Register the AI service
builder.Services.AddScoped<IAIService, AzureOpenAIService>();

// Add HTTP client for downstream integrations
builder.Services.AddHttpClient();

builder.Services.AddControllers();
builder.Services.AddOpenApi(); // Requires .NET 9; on .NET 8, use Swashbuckle or NSwag instead

var app = builder.Build();

if (app.Environment.IsDevelopment())
{
    app.MapOpenApi();
}

app.UseHttpsRedirection();
app.MapControllers();
app.Run();
```
## Step 4 — Build REST Intelligence Endpoints

Create a controller that exposes AI capabilities as REST endpoints:

```csharp
namespace YourApp.Controllers;

using Microsoft.AspNetCore.Mvc;
using YourApp.AI.Services;

[ApiController]
[Route("api/[controller]")]
public class IntelligenceController(IAIService aiService) : ControllerBase
{
    [HttpPost("analyze")]
    public async Task<IActionResult> AnalyzeContent(
        [FromBody] AnalysisRequest request,
        CancellationToken cancellationToken)
    {
        if (string.IsNullOrWhiteSpace(request.Content))
            return BadRequest("Content is required.");

        var analysis = await aiService.AnalyzeContentAsync(
            request.Content,
            request.AnalysisPrompt ?? "Provide a detailed analysis.",
            cancellationToken);

        return Ok(new { analysis });
    }

    [HttpPost("chat")]
    public async Task<IActionResult> Chat(
        [FromBody] ChatRequest request,
        CancellationToken cancellationToken)
    {
        if (string.IsNullOrWhiteSpace(request.Message))
            return BadRequest("Message is required.");

        var response = await aiService.GenerateResponseAsync(
            request.Message,
            cancellationToken);

        return Ok(new { response });
    }

    [HttpPost("stream")]
    public async IAsyncEnumerable<string> StreamChat(
        [FromBody] ChatRequest request,
        [System.Runtime.CompilerServices.EnumeratorCancellation] CancellationToken cancellationToken)
    {
        if (string.IsNullOrWhiteSpace(request.Message))
            yield break;

        await foreach (var chunk in aiService.StreamResponseAsync(request.Message, cancellationToken))
        {
            yield return chunk;
        }
    }
}

public record AnalysisRequest(string Content, string? AnalysisPrompt = null);
public record ChatRequest(string Message);
```
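To exercise the chat endpoint from any .NET client, here is a minimal sketch (assuming the API listens on `https://localhost:5001`; adjust to your launch profile):

```csharp
using System.Net.Http.Json;

using var http = new HttpClient { BaseAddress = new Uri("https://localhost:5001") };

// Posts a ChatRequest and reads the anonymous { response } payload back.
var reply = await http.PostAsJsonAsync("api/intelligence/chat", new { message = "Summarize our Q3 goals." });
reply.EnsureSuccessStatusCode();
Console.WriteLine(await reply.Content.ReadAsStringAsync());
```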
## Step 5 — Enable Agentic Behavior (Tool Calling)

Create an advanced service that enables the AI to call functions autonomously:

```csharp
namespace YourApp.AI.Services;

using Azure.AI.OpenAI;

public interface IAgentService
{
    Task<AgentResponse> ExecuteAgentAsync(string userRequest, CancellationToken cancellationToken = default);
}

public class AgentService(IAIService aiService, IHttpClientFactory httpClientFactory) : IAgentService
{
    public async Task<AgentResponse> ExecuteAgentAsync(string userRequest, CancellationToken cancellationToken = default)
    {
        var conversationHistory = new List<ChatRequestMessage>
        {
            new ChatRequestSystemMessage(
                "You are an intelligent agent. When asked to perform tasks, use available tools. " +
                "Available tools: GetWeather, FetchUserData, SendNotification."),
            new ChatRequestUserMessage(userRequest)
        };

        var response = await aiService.GenerateResponseAsync(userRequest, cancellationToken);

        // In production, implement actual tool calling logic here (see the sketch below).
        // This involves declaring tool definitions, parsing the AI response for tool
        // calls, executing them, and feeding the results back to the model.
        return new AgentResponse
        {
            InitialResponse = response,
            ExecutedActions = new List<string>(),
            FinalResult = response
        };
    }
}

public class AgentResponse
{
    public string InitialResponse { get; set; } = string.Empty;
    public List<string> ExecutedActions { get; set; } = new();
    public string FinalResult { get; set; } = string.Empty;
}
```
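Here is what the missing tool-calling loop can look like. This is a sketch, not the only shape: it assumes the Azure.AI.OpenAI 1.0.0-beta function-tool types (`ChatCompletionsFunctionToolDefinition`, `ChatRequestToolMessage`), and `GetWeather`/`ExecuteGetWeather` are hypothetical names standing in for your real tools:

```csharp
using System.Linq;
using System.Text.Json;
using Azure.AI.OpenAI;

public class ToolCallingAgent(OpenAIClient client, string deploymentName)
{
    // A hypothetical tool the model may choose to invoke.
    private static readonly ChatCompletionsFunctionToolDefinition GetWeatherTool = new()
    {
        Name = "GetWeather",
        Description = "Gets the current weather for a city.",
        Parameters = BinaryData.FromObjectAsJson(new
        {
            type = "object",
            properties = new { city = new { type = "string" } },
            required = new[] { "city" }
        })
    };

    public async Task<string> RunAsync(string userRequest, CancellationToken ct = default)
    {
        var options = new ChatCompletionsOptions
        {
            DeploymentName = deploymentName,
            Messages =
            {
                new ChatRequestSystemMessage("You are an agent. Use tools when they help."),
                new ChatRequestUserMessage(userRequest)
            },
            Tools = { GetWeatherTool }
        };

        // Loop until the model answers instead of requesting another tool call.
        while (true)
        {
            var response = await client.GetChatCompletionsAsync(options, ct);
            var choice = response.Value.Choices[0];

            if (choice.FinishReason != CompletionsFinishReason.ToolCalls)
                return choice.Message.Content;

            // Echo the assistant's tool request into the history, then answer each call.
            options.Messages.Add(new ChatRequestAssistantMessage(choice.Message));
            foreach (var toolCall in choice.Message.ToolCalls.OfType<ChatCompletionsFunctionToolCall>())
            {
                // In real code, dispatch on toolCall.Name; here only GetWeather exists.
                string result = ExecuteGetWeather(toolCall.Arguments);
                options.Messages.Add(new ChatRequestToolMessage(result, toolCall.Id));
            }
        }
    }

    // Hypothetical local implementation of the GetWeather tool.
    private static string ExecuteGetWeather(string argumentsJson)
    {
        var city = JsonDocument.Parse(argumentsJson).RootElement.GetProperty("city").GetString();
        return JsonSerializer.Serialize(new { city, temperatureC = 21, condition = "sunny" });
    }
}
```

The loop terminates when the model stops requesting tools and produces a final answer; that request-execute-feed-back cycle is the core of agentic orchestration.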
## Production-Ready C# Enhancements

### Retry and Resilience Using Polly
```csharp
namespace YourApp.AI.Services;

using Azure;
using Azure.AI.OpenAI;
using Microsoft.Extensions.Options;
using Polly;
using YourApp.AI.Configuration;

public class ResilientAzureOpenAIService(
    IOptions<AzureOpenAIOptions> options,
    ILogger<ResilientAzureOpenAIService> logger) : IAIService
{
    private readonly AzureOpenAIOptions _options = options.Value;
    private OpenAIClient? _client;
    private IAsyncPolicy<Response<ChatCompletions>>? _retryPolicy;

    private OpenAIClient Client => _client ??= new OpenAIClient(
        new Uri(_options.Endpoint),
        new AzureKeyCredential(_options.ApiKey));

    // Retry on server errors, rate limiting (429), and error responses,
    // with exponential backoff between attempts.
    private IAsyncPolicy<Response<ChatCompletions>> RetryPolicy =>
        _retryPolicy ??= Policy
            .Handle<RequestFailedException>(ex => ex.Status >= 500 || ex.Status == 429)
            .OrResult<Response<ChatCompletions>>(r => r.GetRawResponse().IsError)
            .WaitAndRetryAsync(
                retryCount: 3,
                sleepDurationProvider: attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)),
                onRetry: (outcome, timespan, retryCount, context) =>
                {
                    logger.LogWarning(
                        "Retry {RetryCount} after {DelayMs}ms due to {Reason}",
                        retryCount,
                        timespan.TotalMilliseconds,
                        outcome.Exception?.Message ?? "error response");
                });

    public async Task<string> GenerateResponseAsync(
        string userMessage,
        CancellationToken cancellationToken = default)
    {
        var chatCompletionsOptions = new ChatCompletionsOptions
        {
            DeploymentName = _options.DeploymentName,
            MaxTokens = 2000,
            Messages =
            {
                new ChatRequestSystemMessage("You are a helpful assistant."),
                new ChatRequestUserMessage(userMessage)
            }
        };

        try
        {
            var response = await RetryPolicy.ExecuteAsync(
                ct => Client.GetChatCompletionsAsync(chatCompletionsOptions, ct),
                cancellationToken);

            return response.Value.Choices[0].Message.Content;
        }
        catch (RequestFailedException ex) when (ex.Status == 429)
        {
            logger.LogError("Rate limit still exceeded after retries. Consider throttling or a quota increase.");
            throw;
        }
    }

    public Task<string> AnalyzeContentAsync(
        string content,
        string analysisPrompt,
        CancellationToken cancellationToken = default)
    {
        // Implementation mirrors GenerateResponseAsync with an analyst system prompt
        throw new NotImplementedException();
    }

    public IAsyncEnumerable<string> StreamResponseAsync(
        string userMessage,
        CancellationToken cancellationToken = default)
    {
        throw new NotImplementedException();
    }
}
```
### Content Analysis Pipelines

```csharp
namespace YourApp.Features.ContentAnalysis;

using YourApp.AI.Services;

public interface IContentAnalyzer
{
    Task<ContentAnalysisResult> AnalyzeAsync(string content, CancellationToken cancellationToken = default);
}

public class ContentAnalyzer(IAIService aiService, ILogger<ContentAnalyzer> logger) : IContentAnalyzer
{
    public async Task<ContentAnalysisResult> AnalyzeAsync(
        string content,
        CancellationToken cancellationToken = default)
    {
        logger.LogInformation("Starting content analysis for {ContentLength} characters", content.Length);

        // The three analyses are independent, so run them concurrently.
        var sentimentTask = aiService.AnalyzeContentAsync(
            content,
            "Analyze the sentiment. Respond with: positive, negative, or neutral.",
            cancellationToken);

        var summaryTask = aiService.AnalyzeContentAsync(
            content,
            "Provide a concise summary in 2-3 sentences.",
            cancellationToken);

        var keywordsTask = aiService.AnalyzeContentAsync(
            content,
            "Extract 5 key topics or keywords as a comma-separated list.",
            cancellationToken);

        await Task.WhenAll(sentimentTask, summaryTask, keywordsTask);

        return new ContentAnalysisResult
        {
            Sentiment = await sentimentTask,
            Summary = await summaryTask,
            Keywords = (await keywordsTask).Split(',').Select(k => k.Trim()).ToList(),
            AnalyzedAt = DateTime.UtcNow
        };
    }
}

public class ContentAnalysisResult
{
    public string Sentiment { get; set; } = string.Empty;
    public string Summary { get; set; } = string.Empty;
    public List<string> Keywords { get; set; } = new();
    public DateTime AnalyzedAt { get; set; }
}
```
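Wiring the pipeline in takes one registration plus an endpoint. A minimal sketch (the `/api/content/analyze` route and `AnalyzeBody` record are illustrative):

```csharp
// Program.cs
builder.Services.AddScoped<IContentAnalyzer, ContentAnalyzer>();

// A minimal endpoint that runs the full pipeline and returns the structured result.
app.MapPost("/api/content/analyze", async (AnalyzeBody body, IContentAnalyzer analyzer, CancellationToken ct) =>
    Results.Ok(await analyzer.AnalyzeAsync(body.Content, ct)));

public record AnalyzeBody(string Content);
```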
## Common Pitfalls & Troubleshooting

### Pitfall 1: Hardcoded Credentials

**Problem:** Storing API keys directly in code or configuration files committed to version control.

**Solution:** Always use Azure Key Vault, Azure App Configuration, or .NET user secrets:

```csharp
// In production, pull settings from Azure App Configuration
// (which can reference Key Vault secrets) instead of appsettings.json.
builder.Configuration.AddAzureAppConfiguration(options =>
    options.Connect(builder.Configuration["AppConfig:ConnectionString"])
        .Select(KeyFilter.Any, LabelFilter.Null)
        .Select(KeyFilter.Any, builder.Environment.EnvironmentName));
```
### Pitfall 2: Unhandled Rate Limiting

**Problem:** Azure OpenAI enforces rate limits; exceeding them causes request failures.

**Solution:** Implement exponential backoff and circuit breaker patterns (shown in the resilient example above).
### Pitfall 3: Streaming Without Proper Cancellation

**Problem:** Long-running streaming operations that don’t respect cancellation tokens keep consuming resources after the client disconnects.

**Solution:** Always pass `CancellationToken` through the entire call chain and apply the `[EnumeratorCancellation]` attribute to async iterators.
### Pitfall 4: Memory Leaks from Unclosed Clients
**Problem:** Creating new `OpenAIClient` instances repeatedly without disposal.
**Solution:** Use lazy initialization or dependency injection to maintain a single client instance:
```csharp
private OpenAIClient Client => _client ??= new OpenAIClient(
    new Uri(_options.Endpoint),
    new AzureKeyCredential(_options.ApiKey));
```
### Pitfall 5: Ignoring Token Limits
**Problem:** Sending prompts that exceed the model’s token limit, causing failures.
**Solution:** Implement token counting and truncation:
```csharp
private const int MaxTokens = 2000;
private const int SafetyMargin = 100;

private string TruncateIfNeeded(string content)
{
    // Rough estimate: 1 token ≈ 4 characters
    var estimatedTokens = content.Length / 4;
    if (estimatedTokens > MaxTokens - SafetyMargin)
    {
        var maxChars = (MaxTokens - SafetyMargin) * 4;
        return content[..maxChars];
    }
    return content;
}
```
## Performance & Scalability Considerations
### 1. Connection Pooling
Reuse HTTP connections by maintaining a single `OpenAIClient` instance per application:
```csharp
// ✓ Good: single instance, reused across requests
private OpenAIClient Client => _client ??= new OpenAIClient(...);

// ✗ Bad: new instance per request
var client = new OpenAIClient(...);
```
### 2. Async All the Way
Never block on async operations:
```csharp
// ✓ Good
var result = await aiService.GenerateResponseAsync(message);

// ✗ Bad: blocks the thread and risks deadlocks
var result = aiService.GenerateResponseAsync(message).Result;
```
### 3. Implement Caching for Repeated Queries
```csharp
namespace YourApp.AI.Services;

using Microsoft.Extensions.Caching.Memory;

public class CachedAIService(IAIService innerService, IMemoryCache cache) : IAIService
{
    private const string CacheKeyPrefix = "ai_response_";
    private const int CacheDurationSeconds = 3600;

    public async Task<string> GenerateResponseAsync(
        string userMessage,
        CancellationToken cancellationToken = default)
    {
        var cacheKey = $"{CacheKeyPrefix}{userMessage.GetHashCode()}";

        if (cache.TryGetValue(cacheKey, out string? cachedResponse))
            return cachedResponse!;

        var response = await innerService.GenerateResponseAsync(userMessage, cancellationToken);
        cache.Set(cacheKey, response, TimeSpan.FromSeconds(CacheDurationSeconds));
        return response;
    }

    // Other IAIService methods delegate to innerService...
}
```
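Because the cache is a decorator over `IAIService`, registration is a matter of composing the two implementations. A minimal sketch without any decorator library:

```csharp
// Program.cs: compose CachedAIService around the real client by hand.
builder.Services.AddMemoryCache();
builder.Services.AddScoped<AzureOpenAIService>();
builder.Services.AddScoped<IAIService>(sp => new CachedAIService(
    sp.GetRequiredService<AzureOpenAIService>(),
    sp.GetRequiredService<IMemoryCache>()));
```

If you prefer not to hand-roll the composition, a library such as Scrutor can register the decorator with `services.Decorate<IAIService, CachedAIService>()`.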
### 4. Batch Processing for High Volume
```csharp
public class BatchAnalysisService(IAIService aiService)
{
    public async Task<List<string>> AnalyzeBatchAsync(
        IEnumerable<string> items,
        string analysisPrompt,
        int maxConcurrency = 5,
        CancellationToken cancellationToken = default)
    {
        var semaphore = new SemaphoreSlim(maxConcurrency);
        var tasks = new List<Task<string>>();

        foreach (var item in items)
        {
            await semaphore.WaitAsync(cancellationToken);
            tasks.Add(Task.Run(async () =>
            {
                try
                {
                    return await aiService.AnalyzeContentAsync(item, analysisPrompt, cancellationToken);
                }
                finally
                {
                    semaphore.Release();
                }
            }, cancellationToken));
        }

        var results = await Task.WhenAll(tasks);
        return results.ToList();
    }
}
```
### 5. Regional Deployment for Low Latency
Deploy your ASP.NET Core application in the same Azure region as your OpenAI resource to minimize network latency.
## Practical Best Practices
### 1. Structured Logging
```csharp
logger.LogInformation(
    "AI request completed. Model: {Model}, Tokens: {Tokens}, Duration: {Duration}ms",
    _options.DeploymentName,
    response.Value.Usage.TotalTokens,
    stopwatch.ElapsedMilliseconds);
```
### 2. Input Validation and Sanitization
```csharp
private void ValidateInput(string userMessage)
{
    if (string.IsNullOrWhiteSpace(userMessage))
        throw new ArgumentException("Message cannot be empty.");

    if (userMessage.Length > 10000)
        throw new ArgumentException("Message exceeds maximum length.");

    // Basic prompt-injection screening (a denylist is a first line of defense, not a complete one)
    if (userMessage.Contains("ignore previous instructions", StringComparison.OrdinalIgnoreCase))
        throw new ArgumentException("Invalid message content.");
}
```
### 3. Testing with Mocks
```csharp
public class MockAIService : IAIService
{
    public Task<string> GenerateResponseAsync(string userMessage, CancellationToken cancellationToken = default)
    {
        return Task.FromResult("Mock response for testing");
    }

    public Task<string> AnalyzeContentAsync(string content, string analysisPrompt, CancellationToken cancellationToken = default)
    {
        return Task.FromResult("Mock analysis");
    }

    public async IAsyncEnumerable<string> StreamResponseAsync(
        string userMessage,
        [System.Runtime.CompilerServices.EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        yield return "Mock ";
        yield return "streaming ";
        yield return "response";
    }
}
```
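A unit test can then exercise the controller without any network calls. A minimal xUnit sketch:

```csharp
using Microsoft.AspNetCore.Mvc;
using Xunit;
using YourApp.AI.Services;
using YourApp.Controllers;

public class IntelligenceControllerTests
{
    [Fact]
    public async Task Chat_ReturnsOk_ForValidMessage()
    {
        var controller = new IntelligenceController(new MockAIService());

        var result = await controller.Chat(new ChatRequest("Hello"), CancellationToken.None);

        var ok = Assert.IsType<OkObjectResult>(result);
        Assert.NotNull(ok.Value);
    }

    [Fact]
    public async Task Chat_ReturnsBadRequest_ForEmptyMessage()
    {
        var controller = new IntelligenceController(new MockAIService());

        var result = await controller.Chat(new ChatRequest(""), CancellationToken.None);

        Assert.IsType<BadRequestObjectResult>(result);
    }
}
```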
### 4. Monitoring and Observability
```csharp
builder.Services.AddApplicationInsightsTelemetry();

// In your service
using var activity = new Activity("AIRequest").Start();
activity?.SetTag("model", _options.DeploymentName);
activity?.SetTag("message_length", userMessage.Length);

try
{
    var response = await Client.GetChatCompletionsAsync(...);
    activity?.SetTag("success", true);
}
catch (Exception ex)
{
    activity?.SetTag("error", ex.Message);
    throw;
}
```
## Conclusion
You’ve now built a production-grade AI-augmented backend with Azure OpenAI and ASP.NET Core. The architecture you’ve implemented provides:
- **Abstraction layers** that isolate AI logic from business logic
- **Resilience patterns** that handle failures gracefully
- **Scalability mechanisms** for high-volume scenarios
- **Security practices** that protect sensitive credentials
- **Observability** for monitoring and debugging
**Next steps:**
1. Deploy your application to Azure App Service or Azure Container Instances
2. Implement Azure Key Vault for credential management
3. Set up Application Insights for production monitoring
4. Experiment with different models (gpt-4, gpt-4o) to optimize cost vs. capability
5. Build domain-specific agents that leverage your business data
6. Implement fine-tuning for specialized use cases
The foundation is solid. Now extend it with your domain expertise.
---
## Frequently Asked Questions
### Q1: How do I choose between gpt-35-turbo, gpt-4o-mini, and gpt-4?
**A:** This is a cost-vs-capability tradeoff:
- **gpt-35-turbo**: Fastest and cheapest. Use for simple tasks like classification or summarization.
- **gpt-4o-mini**: Balanced option. Recommended for most production applications.
- **gpt-4**: Most capable but expensive. Use for complex reasoning, code generation, or specialized analysis.
Start with gpt-4o-mini and benchmark against your requirements.
### Q2: What’s the difference between streaming and non-streaming responses?
**A:** Streaming returns tokens progressively, enabling real-time UI updates and perceived faster responses. Non-streaming waits for the complete response. Use streaming for user-facing chat applications; use non-streaming for backend analysis where you need the full result before proceeding.
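On the consuming side, the `/api/intelligence/stream` endpoint from Step 4 serializes the `IAsyncEnumerable<string>` as a JSON array that can be read incrementally. A minimal sketch, assuming .NET 8's `ReadFromJsonAsAsyncEnumerable` and a local base address:

```csharp
using System.Net.Http.Json;

using var http = new HttpClient { BaseAddress = new Uri("https://localhost:5001") };

// ResponseHeadersRead avoids buffering the whole body before we start reading.
using var request = new HttpRequestMessage(HttpMethod.Post, "api/intelligence/stream")
{
    Content = JsonContent.Create(new { message = "Tell me a story." })
};
using var response = await http.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
response.EnsureSuccessStatusCode();

// Renders each chunk as soon as it is deserialized from the response stream.
await foreach (var chunk in response.Content.ReadFromJsonAsAsyncEnumerable<string>())
{
    Console.Write(chunk);
}
```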
### Q3: How do I prevent prompt injection attacks?
**A:** Implement strict input validation, use system prompts that define boundaries, and never concatenate user input directly into prompts. Instead, use structured formats:
```csharp
// ✗ Vulnerable: user input is spliced into the instruction text
var prompt = $"Analyze this: {userInput}";

// ✓ Safer: user input stays in its own message, separate from the system instructions
var messages = new ChatRequestMessage[]
{
    new ChatRequestSystemMessage("You are an analyzer. Only respond with analysis."),
    new ChatRequestUserMessage(userInput)
};
```
### Q4: How do I handle Azure OpenAI quota limits?
**A:** Monitor your usage in the Azure Portal, implement request throttling with `SemaphoreSlim`, and use exponential backoff for retries. Consider requesting quota increases for production workloads.
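One simple way to apply that throttling is another `IAIService` decorator that caps in-flight requests. A sketch; the limit of 4 is arbitrary and should match your provisioned quota:

```csharp
public class ThrottledAIService(IAIService innerService) : IAIService
{
    // Caps concurrent upstream calls so bursts queue locally instead of hitting quota errors.
    private static readonly SemaphoreSlim Gate = new(initialCount: 4);

    public async Task<string> GenerateResponseAsync(string userMessage, CancellationToken cancellationToken = default)
    {
        await Gate.WaitAsync(cancellationToken);
        try
        {
            return await innerService.GenerateResponseAsync(userMessage, cancellationToken);
        }
        finally
        {
            Gate.Release();
        }
    }

    // The other IAIService members would gate innerService the same way...
    public Task<string> AnalyzeContentAsync(string content, string analysisPrompt, CancellationToken cancellationToken = default)
        => innerService.AnalyzeContentAsync(content, analysisPrompt, cancellationToken);

    public IAsyncEnumerable<string> StreamResponseAsync(string userMessage, CancellationToken cancellationToken = default)
        => innerService.StreamResponseAsync(userMessage, cancellationToken);
}
```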
### Q5: Can I use Azure OpenAI with other .NET frameworks like Blazor or MAUI?
**A:** Yes. The Azure.AI.OpenAI SDK works with any .NET application. For Blazor, call your ASP.NET Core backend API instead of directly accessing Azure OpenAI from the browser (for security). For MAUI, use the same patterns shown here.
### Q6: How do I optimize costs for high-volume AI requests?
**A:** Implement caching for repeated queries, batch similar requests together, use gpt-4o-mini instead of gpt-4 when possible, and monitor token usage. Consider implementing a request queue with off-peak processing.
### Q7: What’s the best way to handle long conversations with context?
**A:** Maintain conversation history in memory or a database, but truncate old messages to stay within token limits. Implement a sliding window approach:
```csharp
private const int MaxHistoryMessages = 10;

private List<ChatRequestMessage> TrimHistory(List<ChatRequestMessage> history)
{
    if (history.Count > MaxHistoryMessages)
        return history.Skip(history.Count - MaxHistoryMessages).ToList();
    return history;
}
```
### Q8: How do I test AI functionality without hitting Azure OpenAI every time?
**A:** Use the `MockAIService` pattern shown earlier. Inject `IAIService` as a dependency, allowing you to swap implementations in tests. Use xUnit or NUnit with Moq for unit testing.
### Q9: What should I do if the AI response is inappropriate or harmful?
**A:** Implement content filtering using Azure Content Safety API or similar services. Add a validation layer after receiving the response:
```csharp
private Task<bool> IsContentSafeAsync(string content)
{
    // Call the Azure AI Content Safety API here and
    // return true if safe, false otherwise.
    throw new NotImplementedException();
}
```
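A sketch of that check using the Azure.AI.ContentSafety SDK (install with `dotnet add package Azure.AI.ContentSafety`); the severity threshold of 2 is an assumption to tune per scenario:

```csharp
using System.Linq;
using Azure;
using Azure.AI.ContentSafety;

public class ContentSafetyChecker(string endpoint, string apiKey)
{
    private readonly ContentSafetyClient _client = new(new Uri(endpoint), new AzureKeyCredential(apiKey));

    public async Task<bool> IsContentSafeAsync(string content, CancellationToken cancellationToken = default)
    {
        Response<AnalyzeTextResult> response = await _client.AnalyzeTextAsync(
            new AnalyzeTextOptions(content), cancellationToken);

        // Treat any category (hate, violence, sexual, self-harm) at severity >= 2 as unsafe.
        return response.Value.CategoriesAnalysis.All(c => (c.Severity ?? 0) < 2);
    }
}
```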
### Q10: How do I monitor token usage and costs?
**A:** Log token counts from the response object and aggregate them:
```csharp
var response = await Client.GetChatCompletionsAsync(...);

var totalTokens = response.Value.Usage.TotalTokens;
var promptTokens = response.Value.Usage.PromptTokens;
var completionTokens = response.Value.Usage.CompletionTokens;

logger.LogInformation(
    "Tokens used - Prompt: {PromptTokens}, Completion: {CompletionTokens}, Total: {TotalTokens}",
    promptTokens,
    completionTokens,
    totalTokens);
```
Send this data to Application Insights for cost tracking and optimization.
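For aggregation, a small recorder around Application Insights' `TelemetryClient` works well. A sketch; the metric names are arbitrary:

```csharp
using Microsoft.ApplicationInsights;

public class TokenUsageRecorder(TelemetryClient telemetry)
{
    public void Record(string deployment, int promptTokens, int completionTokens)
    {
        // GetMetric pre-aggregates values locally before sending, keeping telemetry volume low.
        telemetry.GetMetric("openai_prompt_tokens", "deployment").TrackValue(promptTokens, deployment);
        telemetry.GetMetric("openai_completion_tokens", "deployment").TrackValue(completionTokens, deployment);
    }
}
```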
## External Resources

1. Microsoft Learn – ASP.NET Core Documentation: https://learn.microsoft.com/aspnet/core
2. Azure OpenAI Service Overview: https://learn.microsoft.com/azure/ai-services/openai/overview
3. Azure OpenAI Chat Completions API Reference: https://learn.microsoft.com/azure/ai-services/openai/reference
