Building an AI Chatbot with Laravel and Streaming Responses

Most AI chatbot tutorials only show how to send a prompt and wait for the complete response. In production, that means users stare at a spinner for 5-15 seconds. Streaming solves this: tokens appear as they are generated, just like in ChatGPT.

This article builds a complete chatbot: a streaming backend using Server-Sent Events, conversation memory, support for multiple providers, and a lightweight frontend in vanilla JS.

Architecture

┌──────────┐    POST /chat      ┌──────────────┐   Stream    ┌───────────┐
│  Browser  │ ──────────────────▶│   Laravel     │ ──────────▶│  OpenAI/  │
│  (SSE)    │ ◀─────────────────│   Controller  │ ◀──────────│  Claude   │
│           │  text/event-stream │              │   Chunks    │           │
└──────────┘                    └──────────────┘             └───────────┘

Design decisions:

Server-Sent Events (SSE) instead of WebSockets: simpler, HTTP-native, and no extra infrastructure required
Session-based memory: conversations are not stored in a database (to keep things simple)
Provider abstraction: switch between OpenAI and Anthropic without touching the controller

Installing Dependencies

composer require openai-php/laravel
# Anthropic needs no extra package here: the provider below calls the API
# through Laravel's built-in HTTP client

# .env
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o

# Or Anthropic
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-sonnet-4-20250514

Chat Service

Provider Interface

// app/Services/AI/ChatProviderInterface.php

namespace App\Services\AI;

use Generator;

interface ChatProviderInterface
{
    /**
     * Send messages and receive the complete response.
     *
     * @param array<int, array{role: string, content: string}> $messages
     */
    public function chat(array $messages, string $model = ''): string;

    /**
     * Send messages and stream the response token by token.
     *
     * @param array<int, array{role: string, content: string}> $messages
     * @return Generator<int, string, void, void>
     */
    public function stream(array $messages, string $model = ''): Generator;
}

OpenAI Implementation

// app/Services/AI/OpenAIChatProvider.php

namespace App\Services\AI;

use Generator;
use OpenAI\Laravel\Facades\OpenAI;

class OpenAIChatProvider implements ChatProviderInterface
{
    public function chat(array $messages, string $model = ''): string
    {
        $model = $model ?: config('services.openai.model', 'gpt-4o');

        $response = OpenAI::chat()->create([
            'model' => $model,
            'messages' => $messages,
            'max_tokens' => 2048,
        ]);

        return $response->choices[0]->message->content ?? '';
    }

    public function stream(array $messages, string $model = ''): Generator
    {
        $model = $model ?: config('services.openai.model', 'gpt-4o');

        $stream = OpenAI::chat()->createStreamed([
            'model' => $model,
            'messages' => $messages,
            'max_tokens' => 2048,
        ]);

        foreach ($stream as $response) {
            $delta = $response->choices[0]->delta->content ?? '';
            if ($delta !== '') {
                yield $delta;
            }
        }
    }
}

Anthropic Implementation

// app/Services/AI/AnthropicChatProvider.php

namespace App\Services\AI;

use Generator;
use Illuminate\Support\Facades\Http;

class AnthropicChatProvider implements ChatProviderInterface
{
    private string $baseUrl = 'https://api.anthropic.com/v1';

    public function chat(array $messages, string $model = ''): string
    {
        $model = $model ?: config('services.anthropic.model', 'claude-sonnet-4-20250514');

        $response = Http::withHeaders($this->headers())
            ->post("{$this->baseUrl}/messages", [
                'model' => $model,
                'max_tokens' => 2048,
                'messages' => $this->formatMessages($messages),
                'system' => $this->extractSystem($messages),
            ]);

        return $response->json('content.0.text', '');
    }

    public function stream(array $messages, string $model = ''): Generator
    {
        $model = $model ?: config('services.anthropic.model', 'claude-sonnet-4-20250514');

        $response = Http::withHeaders($this->headers())
            ->withOptions(['stream' => true])
            ->post("{$this->baseUrl}/messages", [
                'model' => $model,
                'max_tokens' => 2048,
                'messages' => $this->formatMessages($messages),
                'system' => $this->extractSystem($messages),
                'stream' => true,
            ]);

        $body = $response->getBody();
        $buffer = '';

        while (!$body->eof()) {
            $buffer .= $body->read(1024);

            while (($pos = strpos($buffer, "\n")) !== false) {
                $line = substr($buffer, 0, $pos);
                $buffer = substr($buffer, $pos + 1);

                if (!str_starts_with($line, 'data: ')) {
                    continue;
                }

                $data = json_decode(substr($line, 6), true);

                if ($data === null) {
                    continue;
                }

                if (($data['type'] ?? '') === 'content_block_delta') {
                    $text = $data['delta']['text'] ?? '';
                    if ($text !== '') {
                        yield $text;
                    }
                }

                if (($data['type'] ?? '') === 'message_stop') {
                    return;
                }
            }
        }
    }

    private function headers(): array
    {
        return [
            'x-api-key' => config('services.anthropic.api_key'),
            'anthropic-version' => '2023-06-01',
            'Content-Type' => 'application/json',
        ];
    }

    private function formatMessages(array $messages): array
    {
        return array_values(array_filter(
            $messages,
            fn (array $msg) => $msg['role'] !== 'system'
        ));
    }

    private function extractSystem(array $messages): string
    {
        foreach ($messages as $msg) {
            if ($msg['role'] === 'system') {
                return $msg['content'];
            }
        }

        return 'You are a helpful assistant.';
    }
}

Service Provider Binding

// app/Providers/AppServiceProvider.php

use App\Services\AI\ChatProviderInterface;
use App\Services\AI\OpenAIChatProvider;
use App\Services\AI\AnthropicChatProvider;

public function register(): void
{
    $this->app->singleton(ChatProviderInterface::class, function () {
        $provider = config('services.ai.default', 'openai');

        return match ($provider) {
            'anthropic' => new AnthropicChatProvider(),
            default => new OpenAIChatProvider(),
        };
    });
}

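// config/services.php

'ai' => ['default' => env('AI_PROVIDER', 'openai')],
// Read by the providers via config('services.openai.*') / config('services.anthropic.*')
'openai' => ['model' => env('OPENAI_MODEL', 'gpt-4o')],
'anthropic' => [
    'api_key' => env('ANTHROPIC_API_KEY'),
    'model' => env('ANTHROPIC_MODEL', 'claude-sonnet-4-20250514'),
],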

Streaming Controller

This is the core piece: return a StreamedResponse that emits Server-Sent Events:

// app/Http/Controllers/ChatController.php

namespace App\Http\Controllers;

use App\Services\AI\ChatProviderInterface;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\RateLimiter;
use Symfony\Component\HttpFoundation\StreamedResponse;

class ChatController extends Controller
{
    public function __construct(
        private ChatProviderInterface $chatProvider,
    ) {}

    public function index()
    {
        return view('chat.index');
    }

    public function stream(Request $request): StreamedResponse
    {
        $request->validate([
            'message' => ['required', 'string', 'max:2000'],
        ]);

        // Rate limiting: 20 messages per minute per session
        $key = 'chat:' . $request->session()->getId();
        if (RateLimiter::tooManyAttempts($key, 20)) {
            abort(429, 'Too many messages. Please wait a moment.');
        }
        RateLimiter::hit($key, 60);

        $userMessage = $request->input('message');

        // Build the conversation from the session
        $conversation = $request->session()->get('chat_history', []);
        $conversation[] = ['role' => 'user', 'content' => $userMessage];

        // Prepare the messages with the system prompt
        $messages = array_merge(
            [['role' => 'system', 'content' => $this->systemPrompt()]],
            $this->trimConversation($conversation),
        );

        return new StreamedResponse(function () use ($messages, $conversation, $request) {
            $fullResponse = '';

            // Send each token as an SSE event
            foreach ($this->chatProvider->stream($messages) as $token) {
                $fullResponse .= $token;

                echo "data: " . json_encode(['token' => $token]) . "\n\n";

                if (ob_get_level() > 0) {
                    ob_flush();
                }
                flush();
            }

            // Send the completion event
            echo "data: " . json_encode(['done' => true]) . "\n\n";

            if (ob_get_level() > 0) {
                ob_flush();
            }
            flush();

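            // Store the assistant response in the session. The session middleware
            // has already written the session before this callback runs, so save
            // it explicitly.
            $conversation[] = ['role' => 'assistant', 'content' => $fullResponse];
            $request->session()->put('chat_history', $conversation);
            $request->session()->save();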

        }, 200, [
            'Content-Type' => 'text/event-stream',
            'Cache-Control' => 'no-cache',
            'Connection' => 'keep-alive',
            'X-Accel-Buffering' => 'no', // Disable Nginx buffering
        ]);
    }

    public function clear(Request $request)
    {
        $request->session()->forget('chat_history');

        return response()->json(['status' => 'cleared']);
    }

    private function systemPrompt(): string
    {
        return <<<'PROMPT'
        You are a helpful technical assistant for a Laravel developer blog.
        Answer questions about Laravel, PHP, DevOps, and web development.
        Be concise, use code examples when helpful, and format with Markdown.
        If you're unsure, say so. Do not make up information.
        PROMPT;
    }

    /**
     * Keep only the last N messages to stay within the token limit.
     */
    private function trimConversation(array $conversation, int $maxMessages = 20): array
    {
        if (count($conversation) <= $maxMessages) {
            return $conversation;
        }

        return array_slice($conversation, -$maxMessages);
    }
}
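
On the wire, each `echo` in the controller produces one SSE frame: a `data:` line followed by a blank line. The sketch below reproduces that framing in JavaScript purely for illustration; `formatSseEvent` is a hypothetical helper, not part of the controller. PHP's `json_encode` may escape some characters differently from `JSON.stringify` (for example `/`), but both decode to the same payload.

```javascript
// Sketch: encode one payload as an SSE frame, mirroring the controller's echo.
// formatSseEvent is a hypothetical name used only for this illustration.
function formatSseEvent(payload) {
    // JSON.stringify escapes a newline inside a token as the two characters
    // "\" and "n", so the frame body always stays on a single "data:" line.
    return `data: ${JSON.stringify(payload)}\n\n`;
}

// Three frames, as a short stream would contain them.
const frames = [
    formatSseEvent({ token: 'Hello' }),
    formatSseEvent({ token: '\nworld' }),
    formatSseEvent({ done: true }),
].join('');

console.log(frames);
```

Because the payload is JSON-encoded, a newline inside a token is escaped and cannot be mistaken for the blank line that ends a frame.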

Routes

// routes/web.php

Route::get('/chat', [ChatController::class, 'index'])->name('chat.index');
Route::post('/chat/stream', [ChatController::class, 'stream'])->name('chat.stream');
Route::post('/chat/clear', [ChatController::class, 'clear'])->name('chat.clear');

Frontend: Vanilla JS with SSE

No React, no Vue: just Blade and vanilla JavaScript:

{{-- resources/views/chat/index.blade.php --}}

@extends('layouts.app')

@section('content')
<div class="max-w-3xl mx-auto px-4 py-8">
    <div class="flex items-center justify-between mb-6">
        <h1 class="text-2xl font-bold dark:text-white">AI Chat</h1>
        <button id="clear-btn"
                class="text-sm text-gray-500 hover:text-red-500 transition">
            Clear History
        </button>
    </div>

    {{-- Message Container --}}
    <div id="messages"
         class="space-y-4 mb-6 max-h-[60vh] overflow-y-auto scroll-smooth">
        <div class="text-gray-400 text-center py-8" id="empty-state">
            Ask me anything about Laravel, PHP, or web development.
        </div>
    </div>

    {{-- Input Form --}}
    <form id="chat-form" class="flex gap-3">
        @csrf
        <input type="text"
               id="message-input"
               name="message"
               placeholder="Type a message..."
               maxlength="2000"
               autocomplete="off"
               class="flex-1 rounded-lg border border-gray-300 dark:border-gray-600
                      bg-white dark:bg-gray-800 px-4 py-3
                      text-gray-900 dark:text-white
                      focus:ring-2 focus:ring-indigo-500 focus:border-transparent
                      outline-none transition"
               required>
        <button type="submit"
                id="send-btn"
                class="bg-indigo-600 hover:bg-indigo-700 text-white px-6 py-3
                       rounded-lg font-medium transition disabled:opacity-50
                       disabled:cursor-not-allowed">
            Send
        </button>
    </form>
</div>

<script>
document.addEventListener('DOMContentLoaded', () => {
    const form = document.getElementById('chat-form');
    const input = document.getElementById('message-input');
    const messages = document.getElementById('messages');
    const sendBtn = document.getElementById('send-btn');
    const clearBtn = document.getElementById('clear-btn');
    const emptyState = document.getElementById('empty-state');

    let isStreaming = false;

    form.addEventListener('submit', async (e) => {
        e.preventDefault();

        const message = input.value.trim();
        if (!message || isStreaming) return;

        // Remove the empty state (look it up again, since clearing re-creates it)
        document.getElementById('empty-state')?.remove();

        // Add the user message
        appendMessage('user', message);
        input.value = '';
        setStreaming(true);

        // Create the assistant message container
        const assistantEl = appendMessage('assistant', '');
        const contentEl = assistantEl.querySelector('.message-content');

        try {
            const response = await fetch('{{ route("chat.stream") }}', {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json',
                    'X-CSRF-TOKEN': '{{ csrf_token() }}',
                    'Accept': 'text/event-stream',
                },
                body: JSON.stringify({ message }),
            });

            if (!response.ok) {
                throw new Error(`HTTP ${response.status}`);
            }

            const reader = response.body.getReader();
            const decoder = new TextDecoder();
            let buffer = '';
            let fullText = '';

            while (true) {
                const { done, value } = await reader.read();
                if (done) break;

                buffer += decoder.decode(value, { stream: true });
                const lines = buffer.split('\n');
                buffer = lines.pop(); // Keep the incomplete trailing line

                for (const line of lines) {
                    if (!line.startsWith('data: ')) continue;

                    try {
                        const data = JSON.parse(line.slice(6));

                        if (data.token) {
                            fullText += data.token;
                            contentEl.innerHTML = renderMarkdown(fullText);
                            scrollToBottom();
                        }

                        if (data.done) {
                            // Streaming finished
                        }
                    } catch {
                        // Ignore malformed JSON
                    }
                }
            }
        } catch (error) {
            contentEl.textContent = `Error: ${error.message}. Please try again.`;
            contentEl.classList.add('text-red-500');
        } finally {
            setStreaming(false);
            input.focus();
        }
    });

    clearBtn.addEventListener('click', async () => {
        await fetch('{{ route("chat.clear") }}', {
            method: 'POST',
            headers: {
                'X-CSRF-TOKEN': '{{ csrf_token() }}',
            },
        });

        messages.innerHTML = `
            <div class="text-gray-400 text-center py-8" id="empty-state">
                Ask me anything about Laravel, PHP, or web development.
            </div>
        `;
    });

    function appendMessage(role, content) {
        const wrapper = document.createElement('div');
        wrapper.className = `flex ${role === 'user' ? 'justify-end' : 'justify-start'}`;

        const bubble = document.createElement('div');
        bubble.className = role === 'user'
            ? 'bg-indigo-600 text-white rounded-2xl rounded-br-md px-4 py-3 max-w-[80%]'
            : 'bg-gray-100 dark:bg-gray-800 text-gray-900 dark:text-gray-100 rounded-2xl rounded-bl-md px-4 py-3 max-w-[80%]';

        const contentEl = document.createElement('div');
        contentEl.className = 'message-content prose dark:prose-invert prose-sm max-w-none';
        contentEl.innerHTML = content ? renderMarkdown(content) : '<span class="animate-pulse">●●●</span>';

        bubble.appendChild(contentEl);
        wrapper.appendChild(bubble);
        messages.appendChild(wrapper);

        scrollToBottom();
        return wrapper;
    }

    function renderMarkdown(text) {
        // Basic Markdown rendering; in production use a library such as marked.js
        // together with an HTML sanitizer
        return text
            // Escape HTML first so model output cannot inject markup via innerHTML
            .replace(/&/g, '&amp;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;')
            // Code blocks
            .replace(/```(\w+)?\n([\s\S]*?)```/g, '<pre><code class="language-$1">$2</code></pre>')
            // Inline code
            .replace(/`([^`]+)`/g, '<code>$1</code>')
            // Bold
            .replace(/\*\*(.+?)\*\*/g, '<strong>$1</strong>')
            // Italic
            .replace(/\*(.+?)\*/g, '<em>$1</em>')
            // Line breaks
            .replace(/\n/g, '<br>');
    }

    function scrollToBottom() {
        messages.scrollTop = messages.scrollHeight;
    }

    function setStreaming(state) {
        isStreaming = state;
        sendBtn.disabled = state;
        input.disabled = state;
    }

    // Focus the input on load
    input.focus();
});
</script>
@endsection
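
The read loop above splits on newlines inline. As a minimal sketch, the same buffering logic can be pulled into a small reusable parser. `createSseParser` is a hypothetical name, and the sketch assumes every `data:` line carries one JSON payload, as the controller in this article emits.

```javascript
// Hypothetical helper: returns a push(chunk) function that buffers partial
// lines across chunks and returns the payloads completed by each chunk.
function createSseParser() {
    let buffer = '';

    return function push(chunk) {
        buffer += chunk;
        const events = [];
        let newline;

        while ((newline = buffer.indexOf('\n')) !== -1) {
            // Take one complete line; tolerate CRLF line endings.
            const line = buffer.slice(0, newline).replace(/\r$/, '');
            buffer = buffer.slice(newline + 1);

            // Skip blank separator lines, comments, and other SSE fields.
            if (!line.startsWith('data:')) continue;

            try {
                events.push(JSON.parse(line.slice(5).trimStart()));
            } catch {
                // Not valid JSON: skip the line instead of aborting the stream.
            }
        }

        return events;
    };
}

// Usage: a frame split across two network chunks is still parsed once.
const push = createSseParser();
console.log(push('data: {"tok'));
console.log(push('en":"Hi"}\n\ndata: {"done":true}\n\n'));
```

Inside the read loop, `for (const data of push(decoder.decode(value, { stream: true })))` could then replace the manual split and the `lines.pop()` bookkeeping.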

Nginx Configuration

Nginx buffers proxied responses by default, which holds back SSE events, so turn buffering off:

location /chat/stream {
    proxy_pass http://127.0.0.1:8000;
    proxy_buffering off;
    proxy_cache off;
    proxy_set_header Connection '';
    proxy_http_version 1.1;
    chunked_transfer_encoding off;

    # Increase the timeout for long streams
    proxy_read_timeout 120s;
}

Alternatively, handle it with a response header (already done in the controller):

'X-Accel-Buffering' => 'no', // Tells Nginx to disable buffering

Conversation Memory

Session-Based (Simple)

Already implemented in the controller. The conversation lives in the session and is lost when the session expires.

Database-Based (Persistent)

For conversations that persist across sessions:

// database/migrations/create_chat_conversations_table.php

Schema::create('chat_conversations', function (Blueprint $table) {
    $table->id();
    $table->string('session_id')->index();
    $table->json('messages');
    $table->string('title')->nullable();
    $table->timestamps();
});

// app/Models/ChatConversation.php

namespace App\Models;

use Illuminate\Database\Eloquent\Model;

class ChatConversation extends Model
{
    protected $fillable = ['session_id', 'messages', 'title'];

    protected function casts(): array
    {
        return [
            'messages' => 'array',
        ];
    }
}

Token Counting and Cost Control

Keep API costs from spiking:

// app/Services/AI/TokenCounter.php

namespace App\Services\AI;

class TokenCounter
{
    /**
     * Rough estimate: ~4 characters per token for English text.
     */
    public static function estimate(string $text): int
    {
        return (int) ceil(mb_strlen($text) / 4);
    }

    public static function estimateMessages(array $messages): int
    {
        $total = 0;
        foreach ($messages as $message) {
            $total += self::estimate($message['content']);
            $total += 4; // Message overhead
        }
        return $total;
    }
}

Usage in the controller:

// Before sending to the API
$estimatedTokens = TokenCounter::estimateMessages($messages);
$maxInputTokens = 4000;

if ($estimatedTokens > $maxInputTokens) {
    // Drop the oldest messages until they fit the budget
    while (TokenCounter::estimateMessages($messages) > $maxInputTokens && count($messages) > 2) {
        array_splice($messages, 1, 1); // Remove the oldest non-system message
    }
}
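
The trimming loop can be exercised in isolation. The following sketch mirrors `TokenCounter` and the loop above in JavaScript, for instance to warn the user in the browser before a long history is sent. `trimToBudget` and the sample messages are assumptions made for illustration, and the 4-characters-per-token ratio is only a rough heuristic.

```javascript
// Rough estimate: about 4 characters per token, as in TokenCounter.
// Spreading the string counts code points, similar to PHP's mb_strlen.
function estimateTokens(text) {
    return Math.ceil([...text].length / 4);
}

// Sum the estimate over all messages, adding 4 tokens of overhead each.
function estimateMessages(messages) {
    return messages.reduce((sum, m) => sum + estimateTokens(m.content) + 4, 0);
}

// Drop the oldest non-system messages until the estimate fits the budget.
// Assumes index 0 holds the system prompt; the input array is not mutated.
function trimToBudget(messages, maxTokens) {
    const trimmed = [...messages];
    while (estimateMessages(trimmed) > maxTokens && trimmed.length > 2) {
        trimmed.splice(1, 1);
    }
    return trimmed;
}

// Hypothetical history: 14 + 104 + 104 + 14 = 236 estimated tokens.
const history = [
    { role: 'system', content: 'S'.repeat(40) },
    { role: 'user', content: 'a'.repeat(400) },
    { role: 'assistant', content: 'b'.repeat(400) },
    { role: 'user', content: 'c'.repeat(40) },
];

const fitted = trimToBudget(history, 150);
console.log(fitted.map((m) => m.role), estimateMessages(fitted));
```

With a budget of 150, only the oldest user message is dropped: the remaining three messages are estimated at 132 tokens, and the system prompt and the latest user message are kept.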

Error Handling

// In ChatController::stream()

return new StreamedResponse(function () use ($messages, $conversation, $request) {
    try {
        $fullResponse = '';

        foreach ($this->chatProvider->stream($messages) as $token) {
            $fullResponse .= $token;
            echo "data: " . json_encode(['token' => $token]) . "\n\n";

            if (ob_get_level() > 0) {
                ob_flush();
            }
            flush();
        }

        echo "data: " . json_encode(['done' => true]) . "\n\n";

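        // Save to the session. The session middleware has already written the
        // session before this callback runs, so save it explicitly.
        $conversation[] = ['role' => 'assistant', 'content' => $fullResponse];
        $request->session()->put('chat_history', $conversation);
        $request->session()->save();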

    } catch (\OpenAI\Exceptions\ErrorException $e) {
        $error = match ($e->getCode()) {
            429 => 'Rate limited by the AI provider. Please try again later.',
            500, 503 => 'The AI service is temporarily unavailable.',
            default => 'An error occurred. Please try again.',
        };

        echo "data: " . json_encode(['error' => $error]) . "\n\n";

    } catch (\Throwable $e) {
        report($e);
        echo "data: " . json_encode(['error' => 'An unexpected error occurred.']) . "\n\n";
    }

    if (ob_get_level() > 0) {
        ob_flush();
    }
    flush();

}, 200, [
    'Content-Type' => 'text/event-stream',
    'Cache-Control' => 'no-cache',
    'Connection' => 'keep-alive',
    'X-Accel-Buffering' => 'no',
]);

Error handling on the frontend:

// Inside the SSE reader loop
if (data.error) {
    contentEl.textContent = data.error;
    contentEl.classList.add('text-red-500');
    setStreaming(false);
    return;
}
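
One way to keep the three payload types (`token`, `done`, `error`) straight is to fold each parsed event into a small state object and render from it. This is a sketch, not the article's implementation; `applyEvent` and the state shape are hypothetical.

```javascript
// Hypothetical reducer: folds one parsed SSE payload into the UI state.
// It returns a new state object and never mutates the previous one.
function applyEvent(state, data) {
    if (data.error) {
        // A server-side error ends the stream and replaces normal rendering.
        return { ...state, error: data.error, finished: true };
    }
    if (data.done) {
        return { ...state, finished: true };
    }
    if (typeof data.token === 'string') {
        return { ...state, text: state.text + data.token };
    }
    return state; // Unknown payloads are ignored
}

// Usage: replay a short stream of payloads.
const initialState = { text: '', error: null, finished: false };
const finalState = [
    { token: 'Hel' },
    { token: 'lo' },
    { done: true },
].reduce(applyEvent, initialState);

console.log(finalState);
```

Keeping the state transitions in a pure function makes them easy to test without a browser; the DOM update then only has to render `text` or `error`.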

Testing

// tests/Feature/ChatTest.php

namespace Tests\Feature;

use App\Services\AI\ChatProviderInterface;
use Tests\TestCase;

class ChatTest extends TestCase
{
    public function test_chat_page_loads(): void
    {
        $response = $this->get('/chat');
        $response->assertStatus(200);
        $response->assertSee('AI Chat');
    }

    public function test_stream_requires_message(): void
    {
        $response = $this->postJson('/chat/stream', []);
        $response->assertStatus(422);
    }

    public function test_stream_returns_event_stream(): void
    {
        // Mock provider
        $mock = $this->mock(ChatProviderInterface::class);
        $mock->shouldReceive('stream')
            ->once()
            ->andReturnUsing(function () {
                yield 'Hello';
                yield ' world';
            });

        $response = $this->post('/chat/stream', [
            'message' => 'Hi',
        ]);

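        // Symfony may append a charset to text/* content types
        $this->assertStringStartsWith('text/event-stream', $response->headers->get('Content-Type'));

        // The stream callback only runs once the streamed content is read
        $this->assertStringContainsString('"token":"Hello"', $response->streamedContent());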
    }

    public function test_clear_removes_chat_history(): void
    {
        $this->session(['chat_history' => [
            ['role' => 'user', 'content' => 'Hello'],
        ]]);

        $response = $this->postJson('/chat/clear');
        $response->assertJson(['status' => 'cleared']);
    }

    public function test_rate_limiting(): void
    {
        for ($i = 0; $i < 21; $i++) {
            $response = $this->post('/chat/stream', [
                'message' => "Message {$i}",
            ]);
        }

        $response->assertStatus(429);
    }
}

Production Checklist

Item	Status
API keys in `.env`, not in code	Required
Per-session rate limiting	Required
Input validation and length limits	Required
Nginx buffering disabled	Required
Error handling with user-friendly messages	Required
Token counting / conversation trimming	Recommended
CSRF protection on POST endpoints	Required
Persistent conversation storage (database)	Optional
Response content filtering	Recommended
API cost monitoring	Recommended

Summary

The streaming chatbot pattern:

Provider interface → swap AI providers without changing application code
StreamedResponse → SSE delivers tokens in real time
Session memory → the conversation persists across messages
Vanilla JS + fetch → read the stream and render incrementally
Nginx configuration → disable buffering for SSE
Rate limiting → protect against abuse and runaway costs

The most common failure is Nginx buffering. If streaming "works locally but not in production," check proxy_buffering and X-Accel-Buffering first.