MySQL Query Optimization with EXPLAIN — Understanding and Tuning Queries from the Ground Up

· 9 min read

Have you ever written a query that runs fast on your local machine but "freezes" in production with millions of records? Or wondered why a simple query takes several seconds to execute?

EXPLAIN is the "X-ray" tool that lets you see exactly how MySQL processes your query. Instead of guessing, you'll know precisely:

  • Whether MySQL is using indexes or not
  • How many rows need to be scanned
  • The JOIN order
  • Whether there are temporary tables or filesort operations

EXPLAIN Basics

Syntax

EXPLAIN SELECT * FROM users WHERE email = 'john@example.com';

Or with more readable formats:

EXPLAIN FORMAT=JSON SELECT * FROM users WHERE email = 'john@example.com';
EXPLAIN FORMAT=TREE SELECT * FROM users WHERE email = 'john@example.com'; -- MySQL 8.0+

Sample Output

+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | users | ALL  | NULL          | NULL | NULL    | NULL | 1000 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+

This result shows a serious problem: type = ALL means MySQL is scanning all 1000 rows!

Understanding Each EXPLAIN Column

1. id — Execution Order

  • Same id: executed simultaneously (JOIN)
  • Different id: executed sequentially (higher numbers execute first)
EXPLAIN 
SELECT * FROM orders 
WHERE user_id IN (SELECT id FROM users WHERE status = 'active');
+----+-------------+--------+...
| id | select_type | table  |...
+----+-------------+--------+...
|  1 | PRIMARY     | orders |...
|  2 | SUBQUERY    | users  |...
+----+-------------+--------+...

2. select_type — Query Type

Value Meaning
SIMPLE Simple query, no subquery or UNION
PRIMARY Outermost query
SUBQUERY Subquery in SELECT or WHERE
DERIVED Subquery in FROM (creates temp table)
UNION Second or later part of UNION
DEPENDENT SUBQUERY Subquery dependent on outer query (⚠️ slow!)

Warning: DEPENDENT SUBQUERY is usually very slow because MySQL must run the subquery for EACH row of the outer query.

3. type — How MySQL Accesses Data (MOST IMPORTANT!)

Ranked from best to worst:

Type Description Performance
system Table has only 1 row 🟢 Excellent
const Matches at most 1 row (PRIMARY KEY or UNIQUE) 🟢 Excellent
eq_ref JOIN with PRIMARY KEY, each left row matches exactly 1 right row 🟢 Excellent
ref Lookup using non-unique index 🟢 Good
range Search within an index range (BETWEEN, <, >, IN) 🟡 OK
index Full scan on index (doesn't read data rows) 🟠 Poor
ALL Full table scan 🔴 Terrible

Goal: Your query should have type const, eq_ref, ref, or range.

-- BAD: type = ALL (full table scan)
EXPLAIN SELECT * FROM users WHERE YEAR(created_at) = 2024;

-- GOOD: type = range
EXPLAIN SELECT * FROM users 
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';

4. possible_keys and key

  • possible_keys: Indexes MySQL could use
  • key: Index MySQL actually chose
possible_keys: idx_email,idx_status,idx_created_at
key: idx_email

If possible_keys has many indexes but MySQL chooses wrong, you can use index hints:

SELECT * FROM users USE INDEX (idx_status) WHERE status = 'active';
SELECT * FROM users FORCE INDEX (idx_created_at) WHERE created_at > '2024-01-01';

5. key_len — Length of Index Used

Number of bytes of index used. Important for composite indexes:

-- Index: (country, city, district)
-- key_len = 3 (only uses country)
SELECT * FROM addresses WHERE country = 'VN';

-- key_len = 255 (uses country + city)
SELECT * FROM addresses WHERE country = 'VN' AND city = 'Hanoi';

-- key_len = 507 (uses all 3 columns)
SELECT * FROM addresses WHERE country = 'VN' AND city = 'Hanoi' AND district = 'CauGiay';

6. rows — Number of Rows to Scan (Estimate)

This number is an estimate, not 100% accurate. But it shows the scale of work MySQL must do.

Rule of thumb:

  • rows < 100: Good
  • rows < 1000: Acceptable
  • rows > 10000: Needs review
  • rows > 100000: Red alert!

7. Extra — Important Additional Information

Value Meaning
Using index 🟢 Covering index — no need to read data row
Using where 🟡 Filter applied after reading data
Using temporary 🔴 Creates temp table (often from GROUP BY, DISTINCT)
Using filesort 🔴 Sort without index (resource intensive)
Using index condition 🟢 Index Condition Pushdown (ICP)

Goal: Avoid Using temporary and Using filesort when possible.

EXPLAIN ANALYZE — Measure Actual Time (MySQL 8.0.18+)

EXPLAIN only gives estimates. EXPLAIN ANALYZE actually runs the query and measures time:

EXPLAIN ANALYZE 
SELECT u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2024-01-01'
GROUP BY u.id;

Output:

-> Group aggregate: count(o.id)  (cost=1234.56 rows=500) (actual time=12.34..45.67 rows=487 loops=1)
    -> Nested loop left join  (cost=987.65 rows=5000) (actual time=0.12..34.56 rows=4523 loops=1)
        -> Filter: (u.created_at > '2024-01-01')  (actual time=0.05..2.34 rows=487 loops=1)
            -> Table scan on u  (cost=123.45 rows=1000) (actual time=0.03..1.23 rows=1000 loops=1)
        -> Index lookup on o using idx_user_id (user_id=u.id)  (cost=1.23 rows=5) (actual time=0.02..0.05 rows=9.28 loops=487)

Explanation:

  • cost: Estimated cost
  • rows: Estimated rows
  • actual time: Actual time (ms) — time to first row..total time
  • loops: Number of iterations (important for nested loops)

Common Problems and Solutions

Problem 1: Full Table Scan

-- ❌ BAD
EXPLAIN SELECT * FROM orders WHERE YEAR(created_at) = 2024;
-- type: ALL, rows: 1000000

Cause: Using a function on a column prevents MySQL from using the index.

-- ✅ GOOD
EXPLAIN SELECT * FROM orders 
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';
-- type: range, rows: 45000

Problem 2: Using filesort with ORDER BY

-- ❌ BAD
EXPLAIN SELECT * FROM products WHERE category_id = 5 ORDER BY price;
-- Extra: Using filesort

Solution: Create a composite index including both WHERE and ORDER BY columns:

CREATE INDEX idx_category_price ON products(category_id, price);

-- ✅ GOOD - no more filesort
EXPLAIN SELECT * FROM products WHERE category_id = 5 ORDER BY price;

Problem 3: Using temporary with GROUP BY

-- ❌ BAD
EXPLAIN SELECT user_id, COUNT(*) FROM orders GROUP BY user_id;
-- Extra: Using temporary; Using filesort

Solution:

CREATE INDEX idx_user_id ON orders(user_id);

-- ✅ GOOD - uses index for grouping
EXPLAIN SELECT user_id, COUNT(*) FROM orders GROUP BY user_id;
-- Extra: Using index

Problem 4: DEPENDENT SUBQUERY

-- ❌ BAD - Subquery runs for EVERY row of outer query
EXPLAIN SELECT * FROM products p 
WHERE price > (SELECT AVG(price) FROM products WHERE category_id = p.category_id);
-- select_type: DEPENDENT SUBQUERY

Solution: Convert to JOIN:

-- ✅ GOOD
EXPLAIN 
SELECT p.* FROM products p
JOIN (
    SELECT category_id, AVG(price) as avg_price 
    FROM products 
    GROUP BY category_id
) cat_avg ON p.category_id = cat_avg.category_id
WHERE p.price > cat_avg.avg_price;

Problem 5: JOIN Without Index

-- ❌ BAD
EXPLAIN 
SELECT * FROM orders o 
JOIN customers c ON o.customer_email = c.email;
-- type: ALL for both tables

Solution:

CREATE INDEX idx_customer_email ON orders(customer_email);
CREATE INDEX idx_email ON customers(email);

-- ✅ GOOD - type: ref or eq_ref

Real-World Example: Optimizing a Complex Query

Original Query (Slow)

EXPLAIN
SELECT 
    u.name,
    u.email,
    COUNT(o.id) as total_orders,
    SUM(o.total_amount) as total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.status = 'active'
    AND u.created_at > '2024-01-01'
GROUP BY u.id
ORDER BY total_spent DESC
LIMIT 10;

EXPLAIN Result:

+----+-------------+-------+------+---------------+------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows   | Extra                                        |
+----+-------------+-------+------+---------------+------+---------+------+--------+----------------------------------------------+
|  1 | SIMPLE      | u     | ALL  | PRIMARY       | NULL | NULL    | NULL | 100000 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | o     | ALL  | NULL          | NULL | NULL    | NULL | 500000 | Using where; Using join buffer               |
+----+-------------+-------+------+---------------+------+---------+------+--------+----------------------------------------------+

Problems:

  1. Full table scan on users (type: ALL)
  2. Full table scan on orders (type: ALL)
  3. Using temporary + filesort

Step 1: Add Index for WHERE Conditions

CREATE INDEX idx_users_status_created ON users(status, created_at);

Step 2: Add Index for JOIN

CREATE INDEX idx_orders_user_id ON orders(user_id);

Step 3: Create Covering Index (Optional)

CREATE INDEX idx_orders_user_total ON orders(user_id, total_amount);

Query After Optimization

EXPLAIN
SELECT 
    u.name,
    u.email,
    COUNT(o.id) as total_orders,
    SUM(o.total_amount) as total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.status = 'active'
    AND u.created_at > '2024-01-01'
GROUP BY u.id
ORDER BY total_spent DESC
LIMIT 10;

Result:

+----+-------------+-------+-------+---------------------------+------------------------+---------+--------+------+----------------------------------------------+
| id | select_type | table | type  | possible_keys             | key                    | key_len | ref    | rows | Extra                                        |
+----+-------------+-------+-------+---------------------------+------------------------+---------+--------+------+----------------------------------------------+
|  1 | SIMPLE      | u     | range | idx_users_status_created  | idx_users_status_created | 8      | NULL   | 5000 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | o     | ref   | idx_orders_user_total     | idx_orders_user_total  | 4       | u.id   | 5    | Using index                                  |
+----+-------------+-------+-------+---------------------------+------------------------+---------+--------+------+----------------------------------------------+

Improvements:

  • users: from 100,000 rows → 5,000 rows (type: range)
  • orders: from 500,000 rows → 5 rows per user (type: ref, Using index)

Using EXPLAIN in Laravel

Using toRawSql() + EXPLAIN

$query = User::where('status', 'active')
    ->where('created_at', '>', '2024-01-01')
    ->with('orders');

// Get raw SQL
$sql = $query->toRawSql();
dump($sql);

// Run EXPLAIN
$explain = DB::select('EXPLAIN ' . $sql);
dump($explain);

Debug with Laravel Debugbar

// config/debugbar.php
'collectors' => [
    'queries' => true,
],

Debugbar will display all queries and their execution times.

Macro for Direct EXPLAIN

// app/Providers/AppServiceProvider.php
use Illuminate\Database\Query\Builder;

Builder::macro('explain', function () {
    $sql = $this->toSql();
    $bindings = $this->getBindings();
    
    return DB::select('EXPLAIN ' . $sql, $bindings);
});

// Usage
User::where('status', 'active')->explain();

Query Optimization Checklist

When facing a slow query, check in this order:

  1. Run EXPLAIN and look at the type column

    • type is ALL? → Need to add index
    • type is index? → Can be improved
  2. Check rows

    • rows > 10000? → Need better filtering
  3. Check Extra

    • Using filesort? → Review ORDER BY
    • Using temporary? → Review GROUP BY
  4. Check WHERE clause

    • Using functions on columns?
    • Implicit type conversion?
  5. Check JOINs

    • Indexes on JOIN columns?
    • Is JOIN order reasonable?

Conclusion

EXPLAIN is an indispensable tool for any developer working with MySQL. Instead of guessing or trial-and-error, EXPLAIN gives you an accurate view of how MySQL processes your query.

Key takeaways:

  1. Always run EXPLAIN before deploying complex queries
  2. Goal: type = const/eq_ref/ref/range, avoid ALL
  3. Avoid Using temporary and Using filesort
  4. Don't use functions on columns in WHERE
  5. Composite index column order matters
  6. Use EXPLAIN ANALYZE (MySQL 8.0+) for actual timing

In the next article, we'll dive deep into MySQL Indexing Strategies — how to design effective indexes for real-world use cases.

Comments