Building an AI Chatbot with Laravel and Streaming Responses

Most AI chatbot tutorials show you how to send a prompt and wait for a complete response. In production, that means users stare at a spinner for 5-15 seconds. Streaming fixes this — tokens appear as they're generated, just like ChatGPT.

This guide builds a complete chatbot: streaming backend with Server-Sent Events, conversation memory, multi-provider support, and a lightweight frontend with vanilla JS.

Architecture

┌──────────┐    POST /chat      ┌──────────────┐   Stream    ┌───────────┐
│  Browser  │ ──────────────────▶│   Laravel     │ ──────────▶│  OpenAI/  │
│  (SSE)    │ ◀─────────────────│   Controller  │ ◀──────────│  Claude   │
│           │  text/event-stream │              │   Chunks    │           │
└──────────┘                    └──────────────┘             └───────────┘

Key decisions:

  • Server-Sent Events (SSE) over WebSockets — simpler, HTTP-native, no extra infrastructure
  • Session-based memory — no database for conversations (kept simple)
  • Provider abstraction — swap OpenAI ↔ Anthropic without touching controllers
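To make the SSE choice concrete, here is a minimal sketch of the wire format this guide relies on: each event is a single `data:` line holding a JSON payload, terminated by a blank line. The helper name `formatSseEvent` is hypothetical and exists only for illustration; the payload shapes (`token`, `done`) match what the controller later in this guide emits.

```javascript
// Sketch of SSE framing as used in this guide: one JSON payload per event.
// formatSseEvent is a hypothetical helper name, not part of any library.
function formatSseEvent(payload) {
    // JSON.stringify escapes newlines inside strings, so the payload always
    // fits on a single "data:" line. The blank line ends the event.
    return `data: ${JSON.stringify(payload)}\n\n`;
}

// What travels over the connection for a two-token reply plus completion.
const wire = [
    formatSseEvent({ token: 'Hello' }),
    formatSseEvent({ token: ' world' }),
    formatSseEvent({ done: true }),
].join('');

// Print with escapes visible so the framing is easy to inspect.
console.log(JSON.stringify(wire));
```

Because the framing is plain text over a normal HTTP response, it passes through standard proxies and needs no protocol upgrade, which is the main reason it is simpler to operate than WebSockets for one-way token delivery.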

Install Dependencies

composer require openai-php/laravel
# Anthropic needs no extra package in this guide: the provider below calls
# the Messages API directly through Laravel's HTTP client.

# .env
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o

# Or Anthropic
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-sonnet-4-20250514

The Chat Service

Provider Interface

// app/Services/AI/ChatProviderInterface.php

namespace App\Services\AI;

use Generator;

interface ChatProviderInterface
{
    /**
     * Send messages and get a complete response.
     *
     * @param array<int, array{role: string, content: string}> $messages
     */
    public function chat(array $messages, string $model = ''): string;

    /**
     * Send messages and stream the response token by token.
     *
     * @param array<int, array{role: string, content: string}> $messages
     * @return Generator<int, string, void, void>
     */
    public function stream(array $messages, string $model = ''): Generator;
}

OpenAI Implementation

// app/Services/AI/OpenAIChatProvider.php

namespace App\Services\AI;

use Generator;
use OpenAI\Laravel\Facades\OpenAI;

class OpenAIChatProvider implements ChatProviderInterface
{
    public function chat(array $messages, string $model = ''): string
    {
        $model = $model ?: config('services.openai.model', 'gpt-4o');

        $response = OpenAI::chat()->create([
            'model' => $model,
            'messages' => $messages,
            'max_tokens' => 2048,
        ]);

        return $response->choices[0]->message->content ?? '';
    }

    public function stream(array $messages, string $model = ''): Generator
    {
        $model = $model ?: config('services.openai.model', 'gpt-4o');

        $stream = OpenAI::chat()->createStreamed([
            'model' => $model,
            'messages' => $messages,
            'max_tokens' => 2048,
        ]);

        foreach ($stream as $response) {
            $delta = $response->choices[0]->delta->content ?? '';
            if ($delta !== '') {
                yield $delta;
            }
        }
    }
}

Anthropic Implementation

// app/Services/AI/AnthropicChatProvider.php

namespace App\Services\AI;

use Generator;
use Illuminate\Support\Facades\Http;

class AnthropicChatProvider implements ChatProviderInterface
{
    private string $baseUrl = 'https://api.anthropic.com/v1';

    public function chat(array $messages, string $model = ''): string
    {
        $model = $model ?: config('services.anthropic.model', 'claude-sonnet-4-20250514');

        $response = Http::withHeaders($this->headers())
            ->post("{$this->baseUrl}/messages", [
                'model' => $model,
                'max_tokens' => 2048,
                'messages' => $this->formatMessages($messages),
                'system' => $this->extractSystem($messages),
            ]);

        return $response->json('content.0.text', '');
    }

    public function stream(array $messages, string $model = ''): Generator
    {
        $model = $model ?: config('services.anthropic.model', 'claude-sonnet-4-20250514');

        $response = Http::withHeaders($this->headers())
            ->withOptions(['stream' => true])
            ->post("{$this->baseUrl}/messages", [
                'model' => $model,
                'max_tokens' => 2048,
                'messages' => $this->formatMessages($messages),
                'system' => $this->extractSystem($messages),
                'stream' => true,
            ]);

        $body = $response->getBody();
        $buffer = '';

        while (!$body->eof()) {
            $buffer .= $body->read(1024);

            while (($pos = strpos($buffer, "\n")) !== false) {
                $line = substr($buffer, 0, $pos);
                $buffer = substr($buffer, $pos + 1);

                if (!str_starts_with($line, 'data: ')) {
                    continue;
                }

                $data = json_decode(substr($line, 6), true);

                if ($data === null) {
                    continue;
                }

                if (($data['type'] ?? '') === 'content_block_delta') {
                    $text = $data['delta']['text'] ?? '';
                    if ($text !== '') {
                        yield $text;
                    }
                }

                if (($data['type'] ?? '') === 'message_stop') {
                    return;
                }
            }
        }
    }

    private function headers(): array
    {
        return [
            'x-api-key' => config('services.anthropic.api_key'),
            'anthropic-version' => '2023-06-01',
            'Content-Type' => 'application/json',
        ];
    }

    private function formatMessages(array $messages): array
    {
        return array_values(array_filter(
            $messages,
            fn (array $msg) => $msg['role'] !== 'system'
        ));
    }

    private function extractSystem(array $messages): string
    {
        foreach ($messages as $msg) {
            if ($msg['role'] === 'system') {
                return $msg['content'];
            }
        }

        return 'You are a helpful assistant.';
    }
}

Service Provider Binding

// app/Providers/AppServiceProvider.php

use App\Services\AI\ChatProviderInterface;
use App\Services\AI\OpenAIChatProvider;
use App\Services\AI\AnthropicChatProvider;

public function register(): void
{
    $this->app->singleton(ChatProviderInterface::class, function () {
        $provider = config('services.ai.default', 'openai');

        return match ($provider) {
            'anthropic' => new AnthropicChatProvider(),
            default => new OpenAIChatProvider(),
        };
    });
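}

// config/services.php

'ai' => [
    'default' => env('AI_PROVIDER', 'openai'),
],
// Keys read by the providers above via config('services.openai.*') and config('services.anthropic.*')
'openai' => ['model' => env('OPENAI_MODEL', 'gpt-4o')],
'anthropic' => [
    'api_key' => env('ANTHROPIC_API_KEY'),
    'model' => env('ANTHROPIC_MODEL', 'claude-sonnet-4-20250514'),
],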

Streaming Controller

This is the core — returning a StreamedResponse with Server-Sent Events:

// app/Http/Controllers/ChatController.php

namespace App\Http\Controllers;

use App\Services\AI\ChatProviderInterface;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\RateLimiter;
use Symfony\Component\HttpFoundation\StreamedResponse;

class ChatController extends Controller
{
    public function __construct(
        private ChatProviderInterface $chatProvider,
    ) {}

    public function index()
    {
        return view('chat.index');
    }

    public function stream(Request $request): StreamedResponse
    {
        $request->validate([
            'message' => ['required', 'string', 'max:2000'],
        ]);

        // Rate limiting: 20 messages per minute per session
        $key = 'chat:' . $request->session()->getId();
        if (RateLimiter::tooManyAttempts($key, 20)) {
            abort(429, 'Too many messages. Please wait a moment.');
        }
        RateLimiter::hit($key, 60);

        $userMessage = $request->input('message');

        // Build conversation from session
        $conversation = $request->session()->get('chat_history', []);
        $conversation[] = ['role' => 'user', 'content' => $userMessage];

        // Prepare messages with system prompt
        $messages = array_merge(
            [['role' => 'system', 'content' => $this->systemPrompt()]],
            $this->trimConversation($conversation),
        );

        return new StreamedResponse(function () use ($messages, $conversation, $request) {
            $fullResponse = '';

            // Send each token as an SSE event
            foreach ($this->chatProvider->stream($messages) as $token) {
                $fullResponse .= $token;

                echo "data: " . json_encode(['token' => $token]) . "\n\n";

                if (ob_get_level() > 0) {
                    ob_flush();
                }
                flush();
            }

            // Send completion event
            echo "data: " . json_encode(['done' => true]) . "\n\n";

            if (ob_get_level() > 0) {
                ob_flush();
            }
            flush();

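            // Save assistant response to session. The session middleware has
            // already written the session before this callback runs, so
            // persist the change explicitly.
            $conversation[] = ['role' => 'assistant', 'content' => $fullResponse];
            $request->session()->put('chat_history', $conversation);
            $request->session()->save();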

        }, 200, [
            'Content-Type' => 'text/event-stream',
            'Cache-Control' => 'no-cache',
            'Connection' => 'keep-alive',
            'X-Accel-Buffering' => 'no', // Disable Nginx buffering
        ]);
    }

    public function clear(Request $request)
    {
        $request->session()->forget('chat_history');

        return response()->json(['status' => 'cleared']);
    }

    private function systemPrompt(): string
    {
        return <<<'PROMPT'
        You are a helpful technical assistant for a Laravel developer blog.
        Answer questions about Laravel, PHP, DevOps, and web development.
        Be concise, use code examples when helpful, and format with Markdown.
        If you're unsure, say so. Do not make up information.
        PROMPT;
    }

    /**
     * Keep only the last N messages to stay within token limits.
     */
    private function trimConversation(array $conversation, int $maxMessages = 20): array
    {
        if (count($conversation) <= $maxMessages) {
            return $conversation;
        }

        return array_slice($conversation, -$maxMessages);
    }
}

Routes

// routes/web.php

Route::get('/chat', [ChatController::class, 'index'])->name('chat.index');
Route::post('/chat/stream', [ChatController::class, 'stream'])->name('chat.stream');
Route::post('/chat/clear', [ChatController::class, 'clear'])->name('chat.clear');

Frontend: Vanilla JS with SSE

No React, no Vue — just Blade and vanilla JavaScript:

{{-- resources/views/chat/index.blade.php --}}

@extends('layouts.app')

@section('content')
<div class="max-w-3xl mx-auto px-4 py-8">
    <div class="flex items-center justify-between mb-6">
        <h1 class="text-2xl font-bold dark:text-white">AI Chat</h1>
        <button id="clear-btn"
                class="text-sm text-gray-500 hover:text-red-500 transition">
            Clear History
        </button>
    </div>

    {{-- Message Container --}}
    <div id="messages"
         class="space-y-4 mb-6 max-h-[60vh] overflow-y-auto scroll-smooth">
        <div class="text-gray-400 text-center py-8" id="empty-state">
            Ask me anything about Laravel, PHP, or web development.
        </div>
    </div>

    {{-- Input Form --}}
    <form id="chat-form" class="flex gap-3">
        @csrf
        <input type="text"
               id="message-input"
               name="message"
               placeholder="Type your message..."
               maxlength="2000"
               autocomplete="off"
               class="flex-1 rounded-lg border border-gray-300 dark:border-gray-600
                      bg-white dark:bg-gray-800 px-4 py-3
                      text-gray-900 dark:text-white
                      focus:ring-2 focus:ring-indigo-500 focus:border-transparent
                      outline-none transition"
               required>
        <button type="submit"
                id="send-btn"
                class="bg-indigo-600 hover:bg-indigo-700 text-white px-6 py-3
                       rounded-lg font-medium transition disabled:opacity-50
                       disabled:cursor-not-allowed">
            Send
        </button>
    </form>
</div>

<script>
document.addEventListener('DOMContentLoaded', () => {
    const form = document.getElementById('chat-form');
    const input = document.getElementById('message-input');
    const messages = document.getElementById('messages');
    const sendBtn = document.getElementById('send-btn');
    const clearBtn = document.getElementById('clear-btn');
    const emptyState = document.getElementById('empty-state');

    let isStreaming = false;

    form.addEventListener('submit', async (e) => {
        e.preventDefault();

        const message = input.value.trim();
        if (!message || isStreaming) return;

        // Remove empty state (looked up each time, since Clear History re-creates it)
        document.getElementById('empty-state')?.remove();

        // Add user message
        appendMessage('user', message);
        input.value = '';
        setStreaming(true);

        // Create assistant message container
        const assistantEl = appendMessage('assistant', '');
        const contentEl = assistantEl.querySelector('.message-content');

        try {
            const response = await fetch('{{ route("chat.stream") }}', {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json',
                    'X-CSRF-TOKEN': '{{ csrf_token() }}',
                    'Accept': 'text/event-stream',
                },
                body: JSON.stringify({ message }),
            });

            if (!response.ok) {
                throw new Error(`HTTP ${response.status}`);
            }

            const reader = response.body.getReader();
            const decoder = new TextDecoder();
            let buffer = '';
            let fullText = '';

            while (true) {
                const { done, value } = await reader.read();
                if (done) break;

                buffer += decoder.decode(value, { stream: true });
                const lines = buffer.split('\n');
                buffer = lines.pop(); // Keep incomplete line

                for (const line of lines) {
                    if (!line.startsWith('data: ')) continue;

                    try {
                        const data = JSON.parse(line.slice(6));

                        if (data.token) {
                            fullText += data.token;
                            contentEl.innerHTML = renderMarkdown(fullText);
                            scrollToBottom();
                        }

                        if (data.done) {
                            // Streaming complete
                        }
                    } catch {
                        // Skip malformed JSON
                    }
                }
            }
        } catch (error) {
            contentEl.textContent = `Error: ${error.message}. Please try again.`;
            contentEl.classList.add('text-red-500');
        } finally {
            setStreaming(false);
            input.focus();
        }
    });

    clearBtn.addEventListener('click', async () => {
        await fetch('{{ route("chat.clear") }}', {
            method: 'POST',
            headers: {
                'X-CSRF-TOKEN': '{{ csrf_token() }}',
            },
        });

        messages.innerHTML = `
            <div class="text-gray-400 text-center py-8" id="empty-state">
                Ask me anything about Laravel, PHP, or web development.
            </div>
        `;
    });

    function appendMessage(role, content) {
        const wrapper = document.createElement('div');
        wrapper.className = `flex ${role === 'user' ? 'justify-end' : 'justify-start'}`;

        const bubble = document.createElement('div');
        bubble.className = role === 'user'
            ? 'bg-indigo-600 text-white rounded-2xl rounded-br-md px-4 py-3 max-w-[80%]'
            : 'bg-gray-100 dark:bg-gray-800 text-gray-900 dark:text-gray-100 rounded-2xl rounded-bl-md px-4 py-3 max-w-[80%]';

        const contentEl = document.createElement('div');
        contentEl.className = 'message-content prose dark:prose-invert prose-sm max-w-none';
        contentEl.innerHTML = content ? renderMarkdown(content) : '<span class="animate-pulse">●●●</span>';

        bubble.appendChild(contentEl);
        wrapper.appendChild(bubble);
        messages.appendChild(wrapper);

        scrollToBottom();
        return wrapper;
    }

    function renderMarkdown(text) {
        // Basic Markdown rendering. For production, use a library like marked.js
        // together with an HTML sanitizer.

        // Escape HTML first so model output cannot inject markup via innerHTML
        const escaped = text
            .replace(/&/g, '&amp;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;');

        return escaped
            // Code blocks
            .replace(/```(\w+)?\n([\s\S]*?)```/g, '<pre><code class="language-$1">$2</code></pre>')
            // Inline code
            .replace(/`([^`]+)`/g, '<code>$1</code>')
            // Bold
            .replace(/\*\*(.+?)\*\*/g, '<strong>$1</strong>')
            // Italic
            .replace(/\*(.+?)\*/g, '<em>$1</em>')
            // Line breaks
            .replace(/\n/g, '<br>');
    }

    function scrollToBottom() {
        messages.scrollTop = messages.scrollHeight;
    }

    function setStreaming(state) {
        isStreaming = state;
        sendBtn.disabled = state;
        input.disabled = state;
    }

    // Focus input on load
    input.focus();
});
</script>
@endsection
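The read loop above splits on single newlines and works because the controller sends exactly one `data:` line per event. As a slightly more defensive alternative, here is a sketch of an incremental parser that splits on the blank line that ends each event, so a frame cut in half by a network chunk is held back until the rest arrives. The name `createSseParser` is hypothetical, and the sketch assumes the server uses `\n` line endings, as the controller in this guide does.

```javascript
// Hypothetical helper: buffers raw text and emits one parsed payload
// per complete SSE event. Assumes "\n" line endings.
function createSseParser(onEvent) {
    let buffer = '';

    return function feed(chunk) {
        buffer += chunk;
        let boundary;

        // An event ends at a blank line; text after the last boundary is incomplete.
        while ((boundary = buffer.indexOf('\n\n')) !== -1) {
            const rawEvent = buffer.slice(0, boundary);
            buffer = buffer.slice(boundary + 2);

            // The SSE format allows several data lines per event; join them.
            const data = rawEvent
                .split('\n')
                .filter((line) => line.startsWith('data:'))
                .map((line) => line.slice(5).replace(/^ /, ''))
                .join('\n');

            if (data === '') continue;

            try {
                onEvent(JSON.parse(data));
            } catch {
                // Ignore payloads that are not valid JSON.
            }
        }
    };
}

// Simulate three network chunks that cut events at awkward places.
const received = [];
const feed = createSseParser((payload) => received.push(payload));
feed('data: {"tok');
feed('en":"Hel"}\n\ndata: {"token":"lo"}\n');
feed('\ndata: {"done":true}\n\n');
console.log(received);
```

In the page script this would replace the manual `buffer.split('\n')` handling: create the parser once per request and call `feed(decoder.decode(value, { stream: true }))` inside the loop.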

Nginx Configuration

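Nginx buffers proxied responses by default, so tokens arrive in bursts instead of as they are generated. Disable buffering for the streaming route: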

location /chat/stream {
    proxy_pass http://127.0.0.1:8000;
    proxy_buffering off;
    proxy_cache off;
    proxy_set_header Connection '';
    proxy_http_version 1.1;
    chunked_transfer_encoding off;

    # Increase timeout for long streams
    proxy_read_timeout 120s;
}

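Or handle it in the response header (already done in our controller). The header also covers setups where Nginx talks to PHP-FPM through fastcgi_pass, where the matching directive is fastcgi_buffering off rather than proxy_buffering off: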

'X-Accel-Buffering' => 'no', // This tells Nginx to disable buffering

Conversation Memory

Session-Based (Simple)

Already implemented in the controller. Conversations live in the session and disappear when the session expires.

Database-Based (Persistent)

For persistent conversations across sessions:

// database/migrations/create_chat_conversations_table.php

Schema::create('chat_conversations', function (Blueprint $table) {
    $table->id();
    $table->string('session_id')->index();
    $table->json('messages');
    $table->string('title')->nullable();
    $table->timestamps();
});

// app/Models/ChatConversation.php

namespace App\Models;

use Illuminate\Database\Eloquent\Model;

class ChatConversation extends Model
{
    protected $fillable = ['session_id', 'messages', 'title'];

    protected function casts(): array
    {
        return [
            'messages' => 'array',
        ];
    }
}

Token Counting and Cost Control

Prevent runaway API costs:

// app/Services/AI/TokenCounter.php

namespace App\Services\AI;

class TokenCounter
{
    /**
     * Rough estimate: ~4 characters per token for English text.
     */
    public static function estimate(string $text): int
    {
        return (int) ceil(mb_strlen($text) / 4);
    }

    public static function estimateMessages(array $messages): int
    {
        $total = 0;
        foreach ($messages as $message) {
            $total += self::estimate($message['content']);
            $total += 4; // Message overhead
        }
        return $total;
    }
}

Use in the controller:

// Before sending to API
$estimatedTokens = TokenCounter::estimateMessages($messages);
$maxInputTokens = 4000;

if ($estimatedTokens > $maxInputTokens) {
    // Trim oldest messages until within budget
    while (TokenCounter::estimateMessages($messages) > $maxInputTokens && count($messages) > 2) {
        array_splice($messages, 1, 1); // Remove oldest non-system message
    }
}
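To see the arithmetic behind the estimate, here is a standalone JavaScript port of the same estimate-and-trim logic that can be run outside Laravel. It is a sketch for experimentation only: the function names are hypothetical, and it keeps the same rough assumptions as `TokenCounter` (about 4 characters per token, plus a flat 4 tokens of overhead per message).

```javascript
// Standalone port of the estimate-and-trim logic, for experimentation only.
// Assumes roughly 4 characters per token, like TokenCounter::estimate().
function estimateTokens(text) {
    // Count code points rather than UTF-16 units, similar to mb_strlen.
    return Math.ceil([...text].length / 4);
}

function estimateMessages(messages) {
    // Add a flat 4-token overhead guess per message.
    return messages.reduce((total, m) => total + estimateTokens(m.content) + 4, 0);
}

function trimToBudget(messages, maxInputTokens) {
    const trimmed = [...messages];
    // Index 0 is the system prompt; drop the oldest message after it.
    while (estimateMessages(trimmed) > maxInputTokens && trimmed.length > 2) {
        trimmed.splice(1, 1);
    }
    return trimmed;
}

const history = [
    { role: 'system', content: 'x'.repeat(40) },     // 10 + 4 = 14
    { role: 'user', content: 'a'.repeat(400) },      // 100 + 4 = 104
    { role: 'assistant', content: 'b'.repeat(400) }, // 100 + 4 = 104
    { role: 'user', content: 'c'.repeat(80) },       // 20 + 4 = 24
];

console.log(estimateMessages(history)); // 14 + 104 + 104 + 24 = 246
console.log(trimToBudget(history, 150).map((m) => m.role));
```

With a budget of 150, only the oldest user message is dropped: the remaining three messages come to 142 estimated tokens. Note that trimming one message at a time can leave the history starting with an assistant turn, which some providers reject, so dropping messages in user/assistant pairs may be the safer choice.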

Error Handling

// In ChatController::stream()

return new StreamedResponse(function () use ($messages, $conversation, $request) {
    try {
        $fullResponse = '';

        foreach ($this->chatProvider->stream($messages) as $token) {
            $fullResponse .= $token;
            echo "data: " . json_encode(['token' => $token]) . "\n\n";

            if (ob_get_level() > 0) {
                ob_flush();
            }
            flush();
        }

        echo "data: " . json_encode(['done' => true]) . "\n\n";

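        // Save to session. The session middleware has already written the
        // session before this callback runs, so persist explicitly.
        $conversation[] = ['role' => 'assistant', 'content' => $fullResponse];
        $request->session()->put('chat_history', $conversation);
        $request->session()->save();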

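    } catch (\OpenAI\Exceptions\ErrorException $e) {
        // getCode() does not carry the HTTP status for this exception. Recent
        // openai-php/client releases expose it via getStatusCode(); this
        // assumes your installed version provides that method.
        $error = match ($e->getStatusCode()) {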
            429 => 'Rate limited by AI provider. Please try again in a moment.',
            500, 503 => 'AI service is temporarily unavailable.',
            default => 'An error occurred. Please try again.',
        };

        echo "data: " . json_encode(['error' => $error]) . "\n\n";

    } catch (\Throwable $e) {
        report($e);
        echo "data: " . json_encode(['error' => 'An unexpected error occurred.']) . "\n\n";
    }

    if (ob_get_level() > 0) {
        ob_flush();
    }
    flush();

}, 200, [
    'Content-Type' => 'text/event-stream',
    'Cache-Control' => 'no-cache',
    'Connection' => 'keep-alive',
    'X-Accel-Buffering' => 'no',
]);

Handle errors in the frontend:

// In the SSE reader loop
if (data.error) {
    contentEl.textContent = data.error;
    contentEl.classList.add('text-red-500');
    setStreaming(false);
    return;
}
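One way to keep the three payload shapes (`token`, `done`, `error`) in one place is to fold each parsed event into a small state object and let the UI render from that state. The sketch below is illustrative only: `applyStreamEvent` and the state fields are hypothetical names, not part of the page script above.

```javascript
// Hypothetical reducer: folds one parsed SSE payload into the UI state.
function applyStreamEvent(state, data) {
    if (data.error) {
        // An error ends the stream; keep any text received so far.
        return { ...state, status: 'error', error: data.error };
    }
    if (data.done) {
        return { ...state, status: 'done' };
    }
    if (typeof data.token === 'string') {
        return { ...state, text: state.text + data.token };
    }
    // Unknown payloads leave the state unchanged.
    return state;
}

// Simulate a stream that fails after two tokens.
const events = [
    { token: 'Partial ' },
    { token: 'answer' },
    { error: 'AI service is temporarily unavailable.' },
];

let state = { text: '', status: 'streaming', error: null };
for (const event of events) {
    state = applyStreamEvent(state, event);
    if (state.status !== 'streaming') break; // stop reading on done or error
}

console.log(state);
```

Keeping the partial text in the state lets the UI decide whether to show what already arrived next to the error message, instead of replacing it outright.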

Testing

// tests/Feature/ChatTest.php

namespace Tests\Feature;

use App\Services\AI\ChatProviderInterface;
use Tests\TestCase;

class ChatTest extends TestCase
{
    public function test_chat_page_loads(): void
    {
        $response = $this->get('/chat');
        $response->assertStatus(200);
        $response->assertSee('AI Chat');
    }

    public function test_stream_requires_message(): void
    {
        $response = $this->postJson('/chat/stream', []);
        $response->assertStatus(422);
    }

    public function test_stream_returns_event_stream(): void
    {
        // Mock the provider
        $mock = $this->mock(ChatProviderInterface::class);
        $mock->shouldReceive('stream')
            ->once()
            ->andReturnUsing(function () {
                yield 'Hello';
                yield ' world';
            });

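        $response = $this->post('/chat/stream', [
            'message' => 'Hi',
        ]);

        // Symfony appends a charset to text/* types, so match the prefix only
        $this->assertStringStartsWith(
            'text/event-stream',
            (string) $response->headers->get('content-type')
        );

        // The callback (and the mocked provider) only runs when the body is read
        $content = $response->streamedContent();
        $this->assertStringContainsString('"token":"Hello"', $content);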
    }

    public function test_clear_removes_chat_history(): void
    {
        $this->session(['chat_history' => [
            ['role' => 'user', 'content' => 'Hello'],
        ]]);

        $response = $this->postJson('/chat/clear');
        $response->assertJson(['status' => 'cleared']);
        $response->assertSessionMissing('chat_history');
    }

    public function test_rate_limiting(): void
    {
        for ($i = 0; $i < 21; $i++) {
            $response = $this->post('/chat/stream', [
                'message' => "Message {$i}",
            ]);
        }

        $response->assertStatus(429);
    }
}

Production Checklist

Item                                          Status
API keys in .env, not in code                 Required
Rate limiting per session                     Required
Input validation and max length               Required
Nginx buffering disabled                      Required
Error handling with user-friendly messages    Required
Token counting / conversation trimming        Recommended
CSRF protection on POST endpoints             Required
Conversation persistence (database)           Optional
Response content filtering                    Recommended
Monitoring API costs                          Recommended

Summary

The streaming chatbot pattern:

  1. Provider interface → swap AI providers without code changes
  2. StreamedResponse → SSE delivers tokens in real-time
  3. Session memory → conversations persist across messages
  4. Vanilla JS + fetch → read the stream, render incrementally
  5. Nginx config → disable buffering for SSE
  6. Rate limiting → protect against abuse and cost overrun

The biggest gotcha is Nginx buffering. If streaming "works locally but not in production," check proxy_buffering and X-Accel-Buffering first.
