# Transform Your Backend into a Smart Autonomous Decision Layer

Building Intelligent, Agentic APIs with ASP.NET Core and Azure OpenAI

## Executive Summary
Modern applications need far more than static JSON—they require intelligence, reasoning, and autonomous action. By integrating Azure OpenAI into ASP.NET Core, you can build agentic APIs capable of understanding natural language, analyzing content, and orchestrating workflows with minimal human intervention.
This guide shows how to go beyond basic chatbot calls and create production-ready AI APIs, unlocking:

- Natural language decision-making
- Content analysis pipelines
- Real-time streaming responses
- Tool calling for agent workflows
- Resilient patterns suited for enterprise delivery
Whether you’re automating business operations or creating smart assistants, this blueprint gives you everything you need.
## Prerequisites

Before writing a single line of code, make sure you have:

- .NET 8 or later (the samples use C# 12 primary constructors)
- An Azure subscription
- An Azure OpenAI model deployment (gpt-4o-mini recommended)
- An IDE (Visual Studio or VS Code)
- Your Azure OpenAI API key and endpoint
- Familiarity with async patterns and dependency injection
## Required NuGet Packages

Install these packages in your ASP.NET Core project:

```bash
dotnet add package Azure.AI.OpenAI
dotnet add package Azure.Identity
dotnet add package Microsoft.Extensions.AI
dotnet add package Microsoft.Extensions.Configuration.UserSecrets
```
## Step 1 — Securely Configure Azure OpenAI

### Options class

Start by setting up secure credential management. Create a configuration class to encapsulate Azure OpenAI settings:

```csharp
namespace YourApp.AI.Configuration;

public class AzureOpenAIOptions
{
    public string Endpoint { get; set; } = string.Empty;
    public string DeploymentName { get; set; } = string.Empty;
    public string ApiKey { get; set; } = string.Empty;
}
```
Add your credentials to `appsettings.json`:

```json
{
  "AzureOpenAI": {
    "Endpoint": "https://your-resource.openai.azure.com/",
    "DeploymentName": "gpt-4o-mini",
    "ApiKey": "your-api-key-here"
  }
}
```
For local development, use .NET user secrets to avoid committing credentials:

```bash
dotnet user-secrets init
dotnet user-secrets set "AzureOpenAI:Endpoint" "https://your-resource.openai.azure.com/"
dotnet user-secrets set "AzureOpenAI:DeploymentName" "gpt-4o-mini"
dotnet user-secrets set "AzureOpenAI:ApiKey" "your-api-key-here"
```
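To fail fast on a missing setting rather than at the first AI call, you can also validate the options when the host starts. A minimal sketch using the standard options builder (an alternative to the plain `Configure` call shown in Step 3):

```csharp
// In Program.cs: binds the "AzureOpenAI" section and rejects empty values at startup.
builder.Services.AddOptions<AzureOpenAIOptions>()
    .Bind(builder.Configuration.GetSection("AzureOpenAI"))
    .Validate(o => !string.IsNullOrWhiteSpace(o.Endpoint), "AzureOpenAI:Endpoint is required.")
    .Validate(o => !string.IsNullOrWhiteSpace(o.ApiKey), "AzureOpenAI:ApiKey is required.")
    .ValidateOnStart();
```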
## Step 2 — Create an AI Abstraction Service

Build a clean abstraction layer that isolates Azure OpenAI details from your business logic:

```csharp
namespace YourApp.AI.Services;

using Azure;
using Azure.AI.OpenAI;
using Microsoft.Extensions.Options;
using YourApp.AI.Configuration;

public interface IAIService
{
    Task<string> GenerateResponseAsync(string userMessage, CancellationToken cancellationToken = default);
    Task<string> AnalyzeContentAsync(string content, string analysisPrompt, CancellationToken cancellationToken = default);
    IAsyncEnumerable<string> StreamResponseAsync(string userMessage, CancellationToken cancellationToken = default);
}

// NOTE: targets the Azure.AI.OpenAI 1.0.0-beta client surface; newer SDK
// versions expose a different (ChatClient-based) API, so adjust to your package.
public class AzureOpenAIService(IOptions<AzureOpenAIOptions> options) : IAIService
{
    private readonly AzureOpenAIOptions _options = options.Value;
    private OpenAIClient? _client;

    // Lazily create a single client instance so HTTP connections are reused.
    private OpenAIClient Client => _client ??= new OpenAIClient(
        new Uri(_options.Endpoint),
        new AzureKeyCredential(_options.ApiKey));

    public async Task<string> GenerateResponseAsync(string userMessage, CancellationToken cancellationToken = default)
    {
        var chatCompletionsOptions = new ChatCompletionsOptions
        {
            DeploymentName = _options.DeploymentName,
            Temperature = 0.7f,
            MaxTokens = 2000,
            Messages =
            {
                new ChatRequestSystemMessage("You are a helpful assistant that provides accurate, concise responses."),
                new ChatRequestUserMessage(userMessage)
            }
        };

        var response = await Client.GetChatCompletionsAsync(chatCompletionsOptions, cancellationToken);
        return response.Value.Choices[0].Message.Content;
    }

    public async Task<string> AnalyzeContentAsync(string content, string analysisPrompt, CancellationToken cancellationToken = default)
    {
        var chatCompletionsOptions = new ChatCompletionsOptions
        {
            DeploymentName = _options.DeploymentName,
            Messages =
            {
                new ChatRequestSystemMessage($"You are an expert analyst. {analysisPrompt}"),
                new ChatRequestUserMessage(content)
            }
        };

        var response = await Client.GetChatCompletionsAsync(chatCompletionsOptions, cancellationToken);
        return response.Value.Choices[0].Message.Content;
    }

    public async IAsyncEnumerable<string> StreamResponseAsync(
        string userMessage,
        [System.Runtime.CompilerServices.EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var chatCompletionsOptions = new ChatCompletionsOptions
        {
            DeploymentName = _options.DeploymentName,
            Messages =
            {
                new ChatRequestSystemMessage("You are a helpful assistant."),
                new ChatRequestUserMessage(userMessage)
            }
        };

        using var streamingResponse = await Client.GetChatCompletionsStreamingAsync(chatCompletionsOptions, cancellationToken);
        await foreach (var update in streamingResponse)
        {
            if (update.ContentUpdate is not null)
            {
                yield return update.ContentUpdate;
            }
        }
    }
}
```
## Step 3 — Register Services in Dependency Injection

Configure your services in `Program.cs`:

```csharp
var builder = WebApplication.CreateBuilder(args);

// Bind configuration
builder.Services.Configure<AzureOpenAIOptions>(
    builder.Configuration.GetSection("AzureOpenAI"));

// Register the AI service
builder.Services.AddScoped<IAIService, AzureOpenAIService>();

// Add HTTP client for downstream integrations
builder.Services.AddHttpClient();

builder.Services.AddControllers();
builder.Services.AddOpenApi(); // Requires .NET 9; on .NET 8, use Swashbuckle or NSwag instead

var app = builder.Build();

if (app.Environment.IsDevelopment())
{
    app.MapOpenApi();
}

app.UseHttpsRedirection();
app.MapControllers();
app.Run();
```
## Step 4 — Build REST Intelligence Endpoints

Create a controller that exposes AI capabilities as REST endpoints:

```csharp
namespace YourApp.Controllers;

using Microsoft.AspNetCore.Mvc;
using YourApp.AI.Services;

[ApiController]
[Route("api/[controller]")]
public class IntelligenceController(IAIService aiService) : ControllerBase
{
    [HttpPost("analyze")]
    public async Task<IActionResult> AnalyzeContent(
        [FromBody] AnalysisRequest request,
        CancellationToken cancellationToken)
    {
        if (string.IsNullOrWhiteSpace(request.Content))
            return BadRequest("Content is required.");

        var analysis = await aiService.AnalyzeContentAsync(
            request.Content,
            request.AnalysisPrompt ?? "Provide a detailed analysis.",
            cancellationToken);

        return Ok(new { analysis });
    }

    [HttpPost("chat")]
    public async Task<IActionResult> Chat(
        [FromBody] ChatRequest request,
        CancellationToken cancellationToken)
    {
        if (string.IsNullOrWhiteSpace(request.Message))
            return BadRequest("Message is required.");

        var response = await aiService.GenerateResponseAsync(
            request.Message,
            cancellationToken);

        return Ok(new { response });
    }

    [HttpPost("stream")]
    public async IAsyncEnumerable<string> StreamChat(
        [FromBody] ChatRequest request,
        [System.Runtime.CompilerServices.EnumeratorCancellation] CancellationToken cancellationToken)
    {
        if (string.IsNullOrWhiteSpace(request.Message))
            yield break;

        await foreach (var chunk in aiService.StreamResponseAsync(request.Message, cancellationToken))
        {
            yield return chunk;
        }
    }
}

public record AnalysisRequest(string Content, string? AnalysisPrompt = null);
public record ChatRequest(string Message);
```
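To exercise the chat endpoint from any .NET client, here is a minimal sketch (assuming the API listens on `https://localhost:5001`; adjust to your launch profile):

```csharp
using System.Net.Http.Json;

using var http = new HttpClient { BaseAddress = new Uri("https://localhost:5001") };

// Posts a ChatRequest and reads the anonymous { response } payload back.
var reply = await http.PostAsJsonAsync("api/intelligence/chat", new { message = "Summarize our Q3 goals." });
reply.EnsureSuccessStatusCode();
Console.WriteLine(await reply.Content.ReadAsStringAsync());
```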
## Step 5 — Enable Agentic Behavior (Tool Calling)

Create an advanced service that enables the AI to call functions autonomously:

```csharp
namespace YourApp.AI.Services;

using Azure.AI.OpenAI;

public interface IAgentService
{
    Task<AgentResponse> ExecuteAgentAsync(string userRequest, CancellationToken cancellationToken = default);
}

public class AgentService(IAIService aiService, IHttpClientFactory httpClientFactory) : IAgentService
{
    public async Task<AgentResponse> ExecuteAgentAsync(string userRequest, CancellationToken cancellationToken = default)
    {
        var conversationHistory = new List<ChatRequestMessage>
        {
            new ChatRequestSystemMessage(
                "You are an intelligent agent. When asked to perform tasks, use available tools. " +
                "Available tools: GetWeather, FetchUserData, SendNotification."),
            new ChatRequestUserMessage(userRequest)
        };

        var response = await aiService.GenerateResponseAsync(userRequest, cancellationToken);

        // In production, implement actual tool calling logic here (see the sketch below).
        // This involves declaring tool definitions, parsing the AI response for tool
        // calls, executing them, and feeding the results back to the model.
        return new AgentResponse
        {
            InitialResponse = response,
            ExecutedActions = new List<string>(),
            FinalResult = response
        };
    }
}

public class AgentResponse
{
    public string InitialResponse { get; set; } = string.Empty;
    public List<string> ExecutedActions { get; set; } = new();
    public string FinalResult { get; set; } = string.Empty;
}
```
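Here is what the missing tool-calling loop can look like. This is a sketch, not the only shape: it assumes the Azure.AI.OpenAI 1.0.0-beta function-tool types (`ChatCompletionsFunctionToolDefinition`, `ChatRequestToolMessage`), and `GetWeather`/`ExecuteGetWeather` are hypothetical names standing in for your real tools:

```csharp
using System.Linq;
using System.Text.Json;
using Azure.AI.OpenAI;

public class ToolCallingAgent(OpenAIClient client, string deploymentName)
{
    // A hypothetical tool the model may choose to invoke.
    private static readonly ChatCompletionsFunctionToolDefinition GetWeatherTool = new()
    {
        Name = "GetWeather",
        Description = "Gets the current weather for a city.",
        Parameters = BinaryData.FromObjectAsJson(new
        {
            type = "object",
            properties = new { city = new { type = "string" } },
            required = new[] { "city" }
        })
    };

    public async Task<string> RunAsync(string userRequest, CancellationToken ct = default)
    {
        var options = new ChatCompletionsOptions
        {
            DeploymentName = deploymentName,
            Messages =
            {
                new ChatRequestSystemMessage("You are an agent. Use tools when they help."),
                new ChatRequestUserMessage(userRequest)
            },
            Tools = { GetWeatherTool }
        };

        // Loop until the model answers instead of requesting another tool call.
        while (true)
        {
            var response = await client.GetChatCompletionsAsync(options, ct);
            var choice = response.Value.Choices[0];

            if (choice.FinishReason != CompletionsFinishReason.ToolCalls)
                return choice.Message.Content;

            // Echo the assistant's tool request into the history, then answer each call.
            options.Messages.Add(new ChatRequestAssistantMessage(choice.Message));
            foreach (var toolCall in choice.Message.ToolCalls.OfType<ChatCompletionsFunctionToolCall>())
            {
                // In real code, dispatch on toolCall.Name; here only GetWeather exists.
                string result = ExecuteGetWeather(toolCall.Arguments);
                options.Messages.Add(new ChatRequestToolMessage(result, toolCall.Id));
            }
        }
    }

    // Hypothetical local implementation of the GetWeather tool.
    private static string ExecuteGetWeather(string argumentsJson)
    {
        var city = JsonDocument.Parse(argumentsJson).RootElement.GetProperty("city").GetString();
        return JsonSerializer.Serialize(new { city, temperatureC = 21, condition = "sunny" });
    }
}
```

The loop terminates when the model stops requesting tools and produces a final answer; that request-execute-feed-back cycle is the core of agentic orchestration.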
## Production-Ready C# Enhancements

### Retry and Resilience Using Polly
```csharp
namespace YourApp.AI.Services;

using Azure;
using Azure.AI.OpenAI;
using Microsoft.Extensions.Options;
using Polly;
using YourApp.AI.Configuration;

public class ResilientAzureOpenAIService(
    IOptions<AzureOpenAIOptions> options,
    ILogger<ResilientAzureOpenAIService> logger) : IAIService
{
    private readonly AzureOpenAIOptions _options = options.Value;
    private OpenAIClient? _client;
    private IAsyncPolicy<Response<ChatCompletions>>? _retryPolicy;

    private OpenAIClient Client => _client ??= new OpenAIClient(
        new Uri(_options.Endpoint),
        new AzureKeyCredential(_options.ApiKey));

    // Retry on server errors, rate limiting (429), and error responses,
    // with exponential backoff between attempts.
    private IAsyncPolicy<Response<ChatCompletions>> RetryPolicy =>
        _retryPolicy ??= Policy
            .Handle<RequestFailedException>(ex => ex.Status >= 500 || ex.Status == 429)
            .OrResult<Response<ChatCompletions>>(r => r.GetRawResponse().IsError)
            .WaitAndRetryAsync(
                retryCount: 3,
                sleepDurationProvider: attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)),
                onRetry: (outcome, timespan, retryCount, context) =>
                {
                    logger.LogWarning(
                        "Retry {RetryCount} after {DelayMs}ms due to {Reason}",
                        retryCount,
                        timespan.TotalMilliseconds,
                        outcome.Exception?.Message ?? "error response");
                });

    public async Task<string> GenerateResponseAsync(
        string userMessage,
        CancellationToken cancellationToken = default)
    {
        var chatCompletionsOptions = new ChatCompletionsOptions
        {
            DeploymentName = _options.DeploymentName,
            MaxTokens = 2000,
            Messages =
            {
                new ChatRequestSystemMessage("You are a helpful assistant."),
                new ChatRequestUserMessage(userMessage)
            }
        };

        try
        {
            var response = await RetryPolicy.ExecuteAsync(
                ct => Client.GetChatCompletionsAsync(chatCompletionsOptions, ct),
                cancellationToken);

            return response.Value.Choices[0].Message.Content;
        }
        catch (RequestFailedException ex) when (ex.Status == 429)
        {
            logger.LogError("Rate limit still exceeded after retries. Consider throttling or a quota increase.");
            throw;
        }
    }

    public Task<string> AnalyzeContentAsync(
        string content,
        string analysisPrompt,
        CancellationToken cancellationToken = default)
    {
        // Implementation mirrors GenerateResponseAsync with an analyst system prompt
        throw new NotImplementedException();
    }

    public IAsyncEnumerable<string> StreamResponseAsync(
        string userMessage,
        CancellationToken cancellationToken = default)
    {
        throw new NotImplementedException();
    }
}
```
### Content Analysis Pipelines

```csharp
namespace YourApp.Features.ContentAnalysis;

using YourApp.AI.Services;

public interface IContentAnalyzer
{
    Task<ContentAnalysisResult> AnalyzeAsync(string content, CancellationToken cancellationToken = default);
}

public class ContentAnalyzer(IAIService aiService, ILogger<ContentAnalyzer> logger) : IContentAnalyzer
{
    public async Task<ContentAnalysisResult> AnalyzeAsync(
        string content,
        CancellationToken cancellationToken = default)
    {
        logger.LogInformation("Starting content analysis for {ContentLength} characters", content.Length);

        // The three analyses are independent, so run them concurrently.
        var sentimentTask = aiService.AnalyzeContentAsync(
            content,
            "Analyze the sentiment. Respond with: positive, negative, or neutral.",
            cancellationToken);

        var summaryTask = aiService.AnalyzeContentAsync(
            content,
            "Provide a concise summary in 2-3 sentences.",
            cancellationToken);

        var keywordsTask = aiService.AnalyzeContentAsync(
            content,
            "Extract 5 key topics or keywords as a comma-separated list.",
            cancellationToken);

        await Task.WhenAll(sentimentTask, summaryTask, keywordsTask);

        return new ContentAnalysisResult
        {
            Sentiment = await sentimentTask,
            Summary = await summaryTask,
            Keywords = (await keywordsTask).Split(',').Select(k => k.Trim()).ToList(),
            AnalyzedAt = DateTime.UtcNow
        };
    }
}

public class ContentAnalysisResult
{
    public string Sentiment { get; set; } = string.Empty;
    public string Summary { get; set; } = string.Empty;
    public List<string> Keywords { get; set; } = new();
    public DateTime AnalyzedAt { get; set; }
}
```
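Wiring the pipeline in takes one registration plus an endpoint. A minimal sketch (the `/api/content/analyze` route and `AnalyzeBody` record are illustrative):

```csharp
// Program.cs
builder.Services.AddScoped<IContentAnalyzer, ContentAnalyzer>();

// A minimal endpoint that runs the full pipeline and returns the structured result.
app.MapPost("/api/content/analyze", async (AnalyzeBody body, IContentAnalyzer analyzer, CancellationToken ct) =>
    Results.Ok(await analyzer.AnalyzeAsync(body.Content, ct)));

public record AnalyzeBody(string Content);
```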
## Common Pitfalls & Troubleshooting

### Pitfall 1: Hardcoded Credentials

**Problem:** Storing API keys directly in code or configuration files committed to version control.

**Solution:** Always use Azure Key Vault, Azure App Configuration, or .NET user secrets:

```csharp
// In production, pull settings from Azure App Configuration
// (which can reference Key Vault secrets) instead of appsettings.json.
builder.Configuration.AddAzureAppConfiguration(options =>
    options.Connect(builder.Configuration["AppConfig:ConnectionString"])
        .Select(KeyFilter.Any, LabelFilter.Null)
        .Select(KeyFilter.Any, builder.Environment.EnvironmentName));
```
### Pitfall 2: Unhandled Rate Limiting

**Problem:** Azure OpenAI enforces rate limits; exceeding them causes request failures.

**Solution:** Implement exponential backoff and circuit breaker patterns (shown in the resilient example above).
### Pitfall 3: Streaming Without Proper Cancellation

**Problem:** Long-running streaming operations that don’t respect cancellation tokens keep consuming resources after the client disconnects.

**Solution:** Always pass `CancellationToken` through the entire call chain and apply the `[EnumeratorCancellation]` attribute to async iterators.
### Pitfall 4: Memory Leaks from Unclosed Clients
**Problem:** Creating new `OpenAIClient` instances repeatedly without disposal.
**Solution:** Use lazy initialization or dependency injection to maintain a single client instance:
```csharp
private OpenAIClient Client => _client ??= new OpenAIClient(
    new Uri(_options.Endpoint),
    new AzureKeyCredential(_options.ApiKey));
```
### Pitfall 5: Ignoring Token Limits
**Problem:** Sending prompts that exceed the model’s token limit, causing failures.
**Solution:** Implement token counting and truncation:
```csharp
private const int MaxTokens = 2000;
private const int SafetyMargin = 100;

private string TruncateIfNeeded(string content)
{
    // Rough estimate: 1 token ≈ 4 characters
    var estimatedTokens = content.Length / 4;
    if (estimatedTokens > MaxTokens - SafetyMargin)
    {
        var maxChars = (MaxTokens - SafetyMargin) * 4;
        return content[..maxChars];
    }
    return content;
}
```
## Performance & Scalability Considerations
### 1. Connection Pooling
Reuse HTTP connections by maintaining a single `OpenAIClient` instance per application:
```csharp
// ✓ Good: single instance, reused across requests
private OpenAIClient Client => _client ??= new OpenAIClient(...);

// ✗ Bad: new instance per request
var client = new OpenAIClient(...);
```
### 2. Async All the Way
Never block on async operations:
```csharp
// ✓ Good
var result = await aiService.GenerateResponseAsync(message);

// ✗ Bad: blocks the thread and risks deadlocks
var result = aiService.GenerateResponseAsync(message).Result;
```
### 3. Implement Caching for Repeated Queries
```csharp
namespace YourApp.AI.Services;

using Microsoft.Extensions.Caching.Memory;

public class CachedAIService(IAIService innerService, IMemoryCache cache) : IAIService
{
    private const string CacheKeyPrefix = "ai_response_";
    private const int CacheDurationSeconds = 3600;

    public async Task<string> GenerateResponseAsync(
        string userMessage,
        CancellationToken cancellationToken = default)
    {
        var cacheKey = $"{CacheKeyPrefix}{userMessage.GetHashCode()}";

        if (cache.TryGetValue(cacheKey, out string? cachedResponse))
            return cachedResponse!;

        var response = await innerService.GenerateResponseAsync(userMessage, cancellationToken);
        cache.Set(cacheKey, response, TimeSpan.FromSeconds(CacheDurationSeconds));
        return response;
    }

    // Other IAIService methods delegate to innerService...
}
```
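Because the cache is a decorator over `IAIService`, registration is a matter of composing the two implementations. A minimal sketch without any decorator library:

```csharp
// Program.cs: compose CachedAIService around the real client by hand.
builder.Services.AddMemoryCache();
builder.Services.AddScoped<AzureOpenAIService>();
builder.Services.AddScoped<IAIService>(sp => new CachedAIService(
    sp.GetRequiredService<AzureOpenAIService>(),
    sp.GetRequiredService<IMemoryCache>()));
```

If you prefer not to hand-roll the composition, a library such as Scrutor can register the decorator with `services.Decorate<IAIService, CachedAIService>()`.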
### 4. Batch Processing for High Volume
```csharp
public class BatchAnalysisService(IAIService aiService)
{
    public async Task<List<string>> AnalyzeBatchAsync(
        IEnumerable<string> items,
        string analysisPrompt,
        int maxConcurrency = 5,
        CancellationToken cancellationToken = default)
    {
        var semaphore = new SemaphoreSlim(maxConcurrency);
        var tasks = new List<Task<string>>();

        foreach (var item in items)
        {
            await semaphore.WaitAsync(cancellationToken);
            tasks.Add(Task.Run(async () =>
            {
                try
                {
                    return await aiService.AnalyzeContentAsync(item, analysisPrompt, cancellationToken);
                }
                finally
                {
                    semaphore.Release();
                }
            }, cancellationToken));
        }

        var results = await Task.WhenAll(tasks);
        return results.ToList();
    }
}
```
### 5. Regional Deployment for Low Latency
Deploy your ASP.NET Core application in the same Azure region as your OpenAI resource to minimize network latency.
## Practical Best Practices
### 1. Structured Logging
```csharp
logger.LogInformation(
    "AI request completed. Model: {Model}, Tokens: {Tokens}, Duration: {Duration}ms",
    _options.DeploymentName,
    response.Value.Usage.TotalTokens,
    stopwatch.ElapsedMilliseconds);
```
### 2. Input Validation and Sanitization
```csharp
private void ValidateInput(string userMessage)
{
    if (string.IsNullOrWhiteSpace(userMessage))
        throw new ArgumentException("Message cannot be empty.");

    if (userMessage.Length > 10000)
        throw new ArgumentException("Message exceeds maximum length.");

    // Basic prompt-injection screening (a denylist is a first line of defense, not a complete one)
    if (userMessage.Contains("ignore previous instructions", StringComparison.OrdinalIgnoreCase))
        throw new ArgumentException("Invalid message content.");
}
```
### 3. Testing with Mocks
```csharp
public class MockAIService : IAIService
{
    public Task<string> GenerateResponseAsync(string userMessage, CancellationToken cancellationToken = default)
    {
        return Task.FromResult("Mock response for testing");
    }

    public Task<string> AnalyzeContentAsync(string content, string analysisPrompt, CancellationToken cancellationToken = default)
    {
        return Task.FromResult("Mock analysis");
    }

    public async IAsyncEnumerable<string> StreamResponseAsync(
        string userMessage,
        [System.Runtime.CompilerServices.EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        yield return "Mock ";
        yield return "streaming ";
        yield return "response";
    }
}
```
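A unit test can then exercise the controller without any network calls. A minimal xUnit sketch:

```csharp
using Microsoft.AspNetCore.Mvc;
using Xunit;
using YourApp.AI.Services;
using YourApp.Controllers;

public class IntelligenceControllerTests
{
    [Fact]
    public async Task Chat_ReturnsOk_ForValidMessage()
    {
        var controller = new IntelligenceController(new MockAIService());

        var result = await controller.Chat(new ChatRequest("Hello"), CancellationToken.None);

        var ok = Assert.IsType<OkObjectResult>(result);
        Assert.NotNull(ok.Value);
    }

    [Fact]
    public async Task Chat_ReturnsBadRequest_ForEmptyMessage()
    {
        var controller = new IntelligenceController(new MockAIService());

        var result = await controller.Chat(new ChatRequest(""), CancellationToken.None);

        Assert.IsType<BadRequestObjectResult>(result);
    }
}
```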
### 4. Monitoring and Observability
```csharp
builder.Services.AddApplicationInsightsTelemetry();

// In your service
using var activity = new Activity("AIRequest").Start();
activity?.SetTag("model", _options.DeploymentName);
activity?.SetTag("message_length", userMessage.Length);

try
{
    var response = await Client.GetChatCompletionsAsync(...);
    activity?.SetTag("success", true);
}
catch (Exception ex)
{
    activity?.SetTag("error", ex.Message);
    throw;
}
```
## Conclusion
You’ve now built a production-grade AI-augmented backend with Azure OpenAI and ASP.NET Core. The architecture you’ve implemented provides:
- **Abstraction layers** that isolate AI logic from business logic
- **Resilience patterns** that handle failures gracefully
- **Scalability mechanisms** for high-volume scenarios
- **Security practices** that protect sensitive credentials
- **Observability** for monitoring and debugging
**Next steps:**
1. Deploy your application to Azure App Service or Azure Container Instances
2. Implement Azure Key Vault for credential management
3. Set up Application Insights for production monitoring
4. Experiment with different models (gpt-4, gpt-4o) to optimize cost vs. capability
5. Build domain-specific agents that leverage your business data
6. Implement fine-tuning for specialized use cases
The foundation is solid. Now extend it with your domain expertise.
---
## Frequently Asked Questions
### Q1: How do I choose between gpt-35-turbo, gpt-4o-mini, and gpt-4?
**A:** This is a cost-vs-capability tradeoff:
- **gpt-35-turbo**: Fastest and cheapest. Use for simple tasks like classification or summarization.
- **gpt-4o-mini**: Balanced option. Recommended for most production applications.
- **gpt-4**: Most capable but expensive. Use for complex reasoning, code generation, or specialized analysis.
Start with gpt-4o-mini and benchmark against your requirements.
### Q2: What’s the difference between streaming and non-streaming responses?
**A:** Streaming returns tokens progressively, enabling real-time UI updates and perceived faster responses. Non-streaming waits for the complete response. Use streaming for user-facing chat applications; use non-streaming for backend analysis where you need the full result before proceeding.
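On the consuming side, the `/api/intelligence/stream` endpoint from Step 4 serializes the `IAsyncEnumerable<string>` as a JSON array that can be read incrementally. A minimal sketch, assuming .NET 8's `ReadFromJsonAsAsyncEnumerable` and a local base address:

```csharp
using System.Net.Http.Json;

using var http = new HttpClient { BaseAddress = new Uri("https://localhost:5001") };

// ResponseHeadersRead avoids buffering the whole body before we start reading.
using var request = new HttpRequestMessage(HttpMethod.Post, "api/intelligence/stream")
{
    Content = JsonContent.Create(new { message = "Tell me a story." })
};
using var response = await http.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
response.EnsureSuccessStatusCode();

// Renders each chunk as soon as it is deserialized from the response stream.
await foreach (var chunk in response.Content.ReadFromJsonAsAsyncEnumerable<string>())
{
    Console.Write(chunk);
}
```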
### Q3: How do I prevent prompt injection attacks?
**A:** Implement strict input validation, use system prompts that define boundaries, and never concatenate user input directly into prompts. Instead, use structured formats:
```csharp
// ✗ Vulnerable: user input is spliced into the instruction text
var prompt = $"Analyze this: {userInput}";

// ✓ Safer: user input stays in its own message, separate from the system instructions
var messages = new ChatRequestMessage[]
{
    new ChatRequestSystemMessage("You are an analyzer. Only respond with analysis."),
    new ChatRequestUserMessage(userInput)
};
```
### Q4: How do I handle Azure OpenAI quota limits?
**A:** Monitor your usage in the Azure Portal, implement request throttling with `SemaphoreSlim`, and use exponential backoff for retries. Consider requesting quota increases for production workloads.
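One simple way to apply that throttling is another `IAIService` decorator that caps in-flight requests. A sketch; the limit of 4 is arbitrary and should match your provisioned quota:

```csharp
public class ThrottledAIService(IAIService innerService) : IAIService
{
    // Caps concurrent upstream calls so bursts queue locally instead of hitting quota errors.
    private static readonly SemaphoreSlim Gate = new(initialCount: 4);

    public async Task<string> GenerateResponseAsync(string userMessage, CancellationToken cancellationToken = default)
    {
        await Gate.WaitAsync(cancellationToken);
        try
        {
            return await innerService.GenerateResponseAsync(userMessage, cancellationToken);
        }
        finally
        {
            Gate.Release();
        }
    }

    // The other IAIService members would gate innerService the same way...
    public Task<string> AnalyzeContentAsync(string content, string analysisPrompt, CancellationToken cancellationToken = default)
        => innerService.AnalyzeContentAsync(content, analysisPrompt, cancellationToken);

    public IAsyncEnumerable<string> StreamResponseAsync(string userMessage, CancellationToken cancellationToken = default)
        => innerService.StreamResponseAsync(userMessage, cancellationToken);
}
```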
### Q5: Can I use Azure OpenAI with other .NET frameworks like Blazor or MAUI?
**A:** Yes. The Azure.AI.OpenAI SDK works with any .NET application. For Blazor, call your ASP.NET Core backend API instead of directly accessing Azure OpenAI from the browser (for security). For MAUI, use the same patterns shown here.
### Q6: How do I optimize costs for high-volume AI requests?
**A:** Implement caching for repeated queries, batch similar requests together, use gpt-4o-mini instead of gpt-4 when possible, and monitor token usage. Consider implementing a request queue with off-peak processing.
### Q7: What’s the best way to handle long conversations with context?
**A:** Maintain conversation history in memory or a database, but truncate old messages to stay within token limits. Implement a sliding window approach:
```csharp
private const int MaxHistoryMessages = 10;

private List<ChatRequestMessage> TrimHistory(List<ChatRequestMessage> history)
{
    if (history.Count > MaxHistoryMessages)
        return history.Skip(history.Count - MaxHistoryMessages).ToList();
    return history;
}
```
### Q8: How do I test AI functionality without hitting Azure OpenAI every time?
**A:** Use the `MockAIService` pattern shown earlier. Inject `IAIService` as a dependency, allowing you to swap implementations in tests. Use xUnit or NUnit with Moq for unit testing.
### Q9: What should I do if the AI response is inappropriate or harmful?
**A:** Implement content filtering using Azure Content Safety API or similar services. Add a validation layer after receiving the response:
```csharp
private Task<bool> IsContentSafeAsync(string content)
{
    // Call the Azure AI Content Safety API here and
    // return true if safe, false otherwise.
    throw new NotImplementedException();
}
```
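A sketch of that check using the Azure.AI.ContentSafety SDK (install with `dotnet add package Azure.AI.ContentSafety`); the severity threshold of 2 is an assumption to tune per scenario:

```csharp
using System.Linq;
using Azure;
using Azure.AI.ContentSafety;

public class ContentSafetyChecker(string endpoint, string apiKey)
{
    private readonly ContentSafetyClient _client = new(new Uri(endpoint), new AzureKeyCredential(apiKey));

    public async Task<bool> IsContentSafeAsync(string content, CancellationToken cancellationToken = default)
    {
        Response<AnalyzeTextResult> response = await _client.AnalyzeTextAsync(
            new AnalyzeTextOptions(content), cancellationToken);

        // Treat any category (hate, violence, sexual, self-harm) at severity >= 2 as unsafe.
        return response.Value.CategoriesAnalysis.All(c => (c.Severity ?? 0) < 2);
    }
}
```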
### Q10: How do I monitor token usage and costs?
**A:** Log token counts from the response object and aggregate them:
```csharp
var response = await Client.GetChatCompletionsAsync(...);

var totalTokens = response.Value.Usage.TotalTokens;
var promptTokens = response.Value.Usage.PromptTokens;
var completionTokens = response.Value.Usage.CompletionTokens;

logger.LogInformation(
    "Tokens used - Prompt: {PromptTokens}, Completion: {CompletionTokens}, Total: {TotalTokens}",
    promptTokens,
    completionTokens,
    totalTokens);
```
Send this data to Application Insights for cost tracking and optimization.
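For aggregation, a small recorder around Application Insights' `TelemetryClient` works well. A sketch; the metric names are arbitrary:

```csharp
using Microsoft.ApplicationInsights;

public class TokenUsageRecorder(TelemetryClient telemetry)
{
    public void Record(string deployment, int promptTokens, int completionTokens)
    {
        // GetMetric pre-aggregates values locally before sending, keeping telemetry volume low.
        telemetry.GetMetric("openai_prompt_tokens", "deployment").TrackValue(promptTokens, deployment);
        telemetry.GetMetric("openai_completion_tokens", "deployment").TrackValue(completionTokens, deployment);
    }
}
```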
## External Resources

1. Microsoft Learn – ASP.NET Core Documentation: https://learn.microsoft.com/aspnet/core
2. Azure OpenAI Service Overview: https://learn.microsoft.com/azure/ai-services/openai/overview
3. Azure OpenAI Chat Completions API Reference: https://learn.microsoft.com/azure/ai-services/openai/reference
