MySQL Query Optimization with EXPLAIN — Understanding and Tuning Queries from the Ground Up
Have you ever written a query that runs fast on your local machine but "freezes" in production with millions of records? Or wondered why a simple query takes several seconds to execute?
EXPLAIN is the "X-ray" tool that lets you see exactly how MySQL processes your query. Instead of guessing, you'll know precisely:
- Whether MySQL is using indexes or not
- How many rows need to be scanned
- The JOIN order
- Whether there are temporary tables or filesort operations
EXPLAIN Basics
Syntax
EXPLAIN SELECT * FROM users WHERE email = 'john@example.com';
Or with more readable formats:
EXPLAIN FORMAT=JSON SELECT * FROM users WHERE email = 'john@example.com';
EXPLAIN FORMAT=TREE SELECT * FROM users WHERE email = 'john@example.com'; -- MySQL 8.0+
Sample Output
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | users | ALL | NULL | NULL | NULL | NULL | 1000 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
This result shows a serious problem: type = ALL means MySQL is scanning all 1000 rows!
Understanding Each EXPLAIN Column
1. id — Execution Order
- Same
id: executed simultaneously (JOIN) - Different
id: executed sequentially (higher numbers execute first)
EXPLAIN
SELECT * FROM orders
WHERE user_id IN (SELECT id FROM users WHERE status = 'active');
+----+-------------+--------+...
| id | select_type | table |...
+----+-------------+--------+...
| 1 | PRIMARY | orders |...
| 2 | SUBQUERY | users |...
+----+-------------+--------+...
2. select_type — Query Type
| Value | Meaning |
|---|---|
| SIMPLE | Simple query, no subquery or UNION |
| PRIMARY | Outermost query |
| SUBQUERY | Subquery in SELECT or WHERE |
| DERIVED | Subquery in FROM (creates temp table) |
| UNION | Second or later part of UNION |
| DEPENDENT SUBQUERY | Subquery dependent on outer query (⚠️ slow!) |
Warning: DEPENDENT SUBQUERY is usually very slow because MySQL must run the subquery for EACH row of the outer query.
3. type — How MySQL Accesses Data (MOST IMPORTANT!)
Ranked from best to worst:
| Type | Description | Performance |
|---|---|---|
system |
Table has only 1 row | 🟢 Excellent |
const |
Matches at most 1 row (PRIMARY KEY or UNIQUE) | 🟢 Excellent |
eq_ref |
JOIN with PRIMARY KEY, each left row matches exactly 1 right row | 🟢 Excellent |
ref |
Lookup using non-unique index | 🟢 Good |
range |
Search within an index range (BETWEEN, <, >, IN) | 🟡 OK |
index |
Full scan on index (doesn't read data rows) | 🟠 Poor |
ALL |
Full table scan | 🔴 Terrible |
Goal: Your query should have type const, eq_ref, ref, or range.
-- BAD: type = ALL (full table scan)
EXPLAIN SELECT * FROM users WHERE YEAR(created_at) = 2024;
-- GOOD: type = range
EXPLAIN SELECT * FROM users
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';
4. possible_keys and key
possible_keys: Indexes MySQL could usekey: Index MySQL actually chose
possible_keys: idx_email,idx_status,idx_created_at
key: idx_email
If possible_keys has many indexes but MySQL chooses wrong, you can use index hints:
SELECT * FROM users USE INDEX (idx_status) WHERE status = 'active';
SELECT * FROM users FORCE INDEX (idx_created_at) WHERE created_at > '2024-01-01';
5. key_len — Length of Index Used
Number of bytes of index used. Important for composite indexes:
-- Index: (country, city, district)
-- key_len = 3 (only uses country)
SELECT * FROM addresses WHERE country = 'VN';
-- key_len = 255 (uses country + city)
SELECT * FROM addresses WHERE country = 'VN' AND city = 'Hanoi';
-- key_len = 507 (uses all 3 columns)
SELECT * FROM addresses WHERE country = 'VN' AND city = 'Hanoi' AND district = 'CauGiay';
6. rows — Number of Rows to Scan (Estimate)
This number is an estimate, not 100% accurate. But it shows the scale of work MySQL must do.
Rule of thumb:
rows< 100: Goodrows< 1000: Acceptablerows> 10000: Needs reviewrows> 100000: Red alert!
7. Extra — Important Additional Information
| Value | Meaning |
|---|---|
Using index |
🟢 Covering index — no need to read data row |
Using where |
🟡 Filter applied after reading data |
Using temporary |
🔴 Creates temp table (often from GROUP BY, DISTINCT) |
Using filesort |
🔴 Sort without index (resource intensive) |
Using index condition |
🟢 Index Condition Pushdown (ICP) |
Goal: Avoid Using temporary and Using filesort when possible.
EXPLAIN ANALYZE — Measure Actual Time (MySQL 8.0.18+)
EXPLAIN only gives estimates. EXPLAIN ANALYZE actually runs the query and measures time:
EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2024-01-01'
GROUP BY u.id;
Output:
-> Group aggregate: count(o.id) (cost=1234.56 rows=500) (actual time=12.34..45.67 rows=487 loops=1)
-> Nested loop left join (cost=987.65 rows=5000) (actual time=0.12..34.56 rows=4523 loops=1)
-> Filter: (u.created_at > '2024-01-01') (actual time=0.05..2.34 rows=487 loops=1)
-> Table scan on u (cost=123.45 rows=1000) (actual time=0.03..1.23 rows=1000 loops=1)
-> Index lookup on o using idx_user_id (user_id=u.id) (cost=1.23 rows=5) (actual time=0.02..0.05 rows=9.28 loops=487)
Explanation:
cost: Estimated costrows: Estimated rowsactual time: Actual time (ms) — time to first row..total timeloops: Number of iterations (important for nested loops)
Common Problems and Solutions
Problem 1: Full Table Scan
-- ❌ BAD
EXPLAIN SELECT * FROM orders WHERE YEAR(created_at) = 2024;
-- type: ALL, rows: 1000000
Cause: Using a function on a column prevents MySQL from using the index.
-- ✅ GOOD
EXPLAIN SELECT * FROM orders
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';
-- type: range, rows: 45000
Problem 2: Using filesort with ORDER BY
-- ❌ BAD
EXPLAIN SELECT * FROM products WHERE category_id = 5 ORDER BY price;
-- Extra: Using filesort
Solution: Create a composite index including both WHERE and ORDER BY columns:
CREATE INDEX idx_category_price ON products(category_id, price);
-- ✅ GOOD - no more filesort
EXPLAIN SELECT * FROM products WHERE category_id = 5 ORDER BY price;
Problem 3: Using temporary with GROUP BY
-- ❌ BAD
EXPLAIN SELECT user_id, COUNT(*) FROM orders GROUP BY user_id;
-- Extra: Using temporary; Using filesort
Solution:
CREATE INDEX idx_user_id ON orders(user_id);
-- ✅ GOOD - uses index for grouping
EXPLAIN SELECT user_id, COUNT(*) FROM orders GROUP BY user_id;
-- Extra: Using index
Problem 4: DEPENDENT SUBQUERY
-- ❌ BAD - Subquery runs for EVERY row of outer query
EXPLAIN SELECT * FROM products p
WHERE price > (SELECT AVG(price) FROM products WHERE category_id = p.category_id);
-- select_type: DEPENDENT SUBQUERY
Solution: Convert to JOIN:
-- ✅ GOOD
EXPLAIN
SELECT p.* FROM products p
JOIN (
SELECT category_id, AVG(price) as avg_price
FROM products
GROUP BY category_id
) cat_avg ON p.category_id = cat_avg.category_id
WHERE p.price > cat_avg.avg_price;
Problem 5: JOIN Without Index
-- ❌ BAD
EXPLAIN
SELECT * FROM orders o
JOIN customers c ON o.customer_email = c.email;
-- type: ALL for both tables
Solution:
CREATE INDEX idx_customer_email ON orders(customer_email);
CREATE INDEX idx_email ON customers(email);
-- ✅ GOOD - type: ref or eq_ref
Real-World Example: Optimizing a Complex Query
Original Query (Slow)
EXPLAIN
SELECT
u.name,
u.email,
COUNT(o.id) as total_orders,
SUM(o.total_amount) as total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.status = 'active'
AND u.created_at > '2024-01-01'
GROUP BY u.id
ORDER BY total_spent DESC
LIMIT 10;
EXPLAIN Result:
+----+-------------+-------+------+---------------+------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+--------+----------------------------------------------+
| 1 | SIMPLE | u | ALL | PRIMARY | NULL | NULL | NULL | 100000 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | o | ALL | NULL | NULL | NULL | NULL | 500000 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+--------+----------------------------------------------+
Problems:
- Full table scan on
users(type: ALL) - Full table scan on
orders(type: ALL) - Using temporary + filesort
Step 1: Add Index for WHERE Conditions
CREATE INDEX idx_users_status_created ON users(status, created_at);
Step 2: Add Index for JOIN
CREATE INDEX idx_orders_user_id ON orders(user_id);
Step 3: Create Covering Index (Optional)
CREATE INDEX idx_orders_user_total ON orders(user_id, total_amount);
Query After Optimization
EXPLAIN
SELECT
u.name,
u.email,
COUNT(o.id) as total_orders,
SUM(o.total_amount) as total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.status = 'active'
AND u.created_at > '2024-01-01'
GROUP BY u.id
ORDER BY total_spent DESC
LIMIT 10;
Result:
+----+-------------+-------+-------+---------------------------+------------------------+---------+--------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------------------+------------------------+---------+--------+------+----------------------------------------------+
| 1 | SIMPLE | u | range | idx_users_status_created | idx_users_status_created | 8 | NULL | 5000 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | o | ref | idx_orders_user_total | idx_orders_user_total | 4 | u.id | 5 | Using index |
+----+-------------+-------+-------+---------------------------+------------------------+---------+--------+------+----------------------------------------------+
Improvements:
users: from 100,000 rows → 5,000 rows (type: range)orders: from 500,000 rows → 5 rows per user (type: ref, Using index)
Using EXPLAIN in Laravel
Using toRawSql() + EXPLAIN
$query = User::where('status', 'active')
->where('created_at', '>', '2024-01-01')
->with('orders');
// Get raw SQL
$sql = $query->toRawSql();
dump($sql);
// Run EXPLAIN
$explain = DB::select('EXPLAIN ' . $sql);
dump($explain);
Debug with Laravel Debugbar
// config/debugbar.php
'collectors' => [
'queries' => true,
],
Debugbar will display all queries and their execution times.
Macro for Direct EXPLAIN
// app/Providers/AppServiceProvider.php
use Illuminate\Database\Query\Builder;
Builder::macro('explain', function () {
$sql = $this->toSql();
$bindings = $this->getBindings();
return DB::select('EXPLAIN ' . $sql, $bindings);
});
// Usage
User::where('status', 'active')->explain();
Query Optimization Checklist
When facing a slow query, check in this order:
-
Run EXPLAIN and look at the
typecolumn- type is
ALL? → Need to add index - type is
index? → Can be improved
- type is
-
Check
rows- rows > 10000? → Need better filtering
-
Check
Extra- Using filesort? → Review ORDER BY
- Using temporary? → Review GROUP BY
-
Check WHERE clause
- Using functions on columns?
- Implicit type conversion?
-
Check JOINs
- Indexes on JOIN columns?
- Is JOIN order reasonable?
Conclusion
EXPLAIN is an indispensable tool for any developer working with MySQL. Instead of guessing or trial-and-error, EXPLAIN gives you an accurate view of how MySQL processes your query.
Key takeaways:
- Always run EXPLAIN before deploying complex queries
- Goal: type = const/eq_ref/ref/range, avoid ALL
- Avoid Using temporary and Using filesort
- Don't use functions on columns in WHERE
- Composite index column order matters
- Use EXPLAIN ANALYZE (MySQL 8.0+) for actual timing
In the next article, we'll dive deep into MySQL Indexing Strategies — how to design effective indexes for real-world use cases.