Interview questions & answers

SQL interview questions & answers

Q: What is an index and what are its trade-offs?

An index is a data structure, usually a B-tree, that speeds up row lookups by avoiding full table scans. It dramatically improves SELECT and JOIN performance on indexed columns. The trade-off is extra storage and slower INSERT, UPDATE, and DELETE operations, because the index must also be maintained.

Q: What is the difference between a clustered and a non-clustered index?

A clustered index determines the physical order in which rows are stored on disk, so a table can have only one. A non-clustered index is a separate structure that stores key values plus pointers back to the rows, and a table can have many. Lookups via a non-clustered index may require an extra step to fetch the full row.

Q: What are aggregate functions? Give examples.

Aggregate functions compute a single summary value over a set of rows, ignoring NULLs except for COUNT(*). Common examples are COUNT, SUM, AVG, MIN, and MAX. They are typically used with GROUP BY to produce one result row per group.

SQL (Structured Query Language) is the standard language for defining, querying, and manipulating data in relational databases. It lets you retrieve and combine rows from one or more tables, aggregate them, and enforce structure through constraints and keys. Nearly every data, backend, and analytics role expects working SQL fluency.

Practise in a free AI mock interview Create my free profile

Updated 2026-06-18 · 15 real, commonly-asked questions with answers.

Key takeaways

SQL (Structured Query Language) is the standard language for defining, querying, and manipulating data in relational databases.
Core areas to revise for SQL: Joins, Aggregation & GROUP BY, Subqueries, Indexes, Window functions.
This guide answers 15 of the most-asked SQL interview questions — rehearse them in OnJob's free AI mock interview.

JoinsAggregation & GROUP BYSubqueriesIndexesWindow functionsNormalizationTransactions & ACIDQuery optimization

Top 15 SQL interview questions

Q1.What is the difference between WHERE and HAVING?

WHERE filters individual rows before any grouping or aggregation happens, so it cannot reference aggregate functions. HAVING filters groups after GROUP BY has aggregated the rows, so it can use functions like COUNT or SUM. In short: use WHERE for row-level conditions and HAVING for conditions on aggregated results.

Q2.Explain the different types of JOIN.

An INNER JOIN returns only rows that match in both tables. A LEFT (or RIGHT) OUTER JOIN returns all rows from the left (or right) table plus matching rows from the other, filling non-matches with NULL. A FULL OUTER JOIN returns matched rows plus unmatched rows from both sides, and a CROSS JOIN returns the Cartesian product of every row combination.

Q3.What is the difference between DELETE, TRUNCATE, and DROP?

DELETE removes rows one at a time based on an optional WHERE clause and can be rolled back; it is logged per row. TRUNCATE removes all rows quickly by deallocating data pages without logging individual deletions, and it cannot be filtered. DROP removes the entire table structure along with its data, indexes, and constraints.

Q4.What are primary keys and foreign keys?

A primary key uniquely identifies each row in a table and cannot be NULL or duplicated. A foreign key is a column (or set of columns) in one table that references the primary key of another table, enforcing referential integrity. Together they model relationships and prevent orphaned records.

Q5.What is normalization and why is it used?

Normalization is the process of organizing columns and tables to reduce data redundancy and improve integrity, typically through normal forms (1NF, 2NF, 3NF). It splits data into related tables so each fact is stored once, avoiding update anomalies. The trade-off is that heavily normalized schemas may need more joins, so analytics systems sometimes denormalize for read speed.

Q6.What is the difference between UNION and UNION ALL?

UNION combines the result sets of two queries and removes duplicate rows, which requires an extra sort or hash step. UNION ALL also combines them but keeps every row including duplicates, making it faster. Both queries must have the same number of columns with compatible data types.

Q7.How do you find the second highest salary in a table?

One reliable method is a correlated subquery: SELECT MAX(salary) FROM employees WHERE salary < (SELECT MAX(salary) FROM employees). Alternatively use a window function with DENSE_RANK ordered by salary descending and filter for rank 2. DENSE_RANK handles ties cleanly so equal top salaries do not skip the second position.

Q8.What is an index and what are its trade-offs?

An index is a data structure, usually a B-tree, that speeds up row lookups by avoiding full table scans. It dramatically improves SELECT and JOIN performance on indexed columns. The trade-off is extra storage and slower INSERT, UPDATE, and DELETE operations, because the index must also be maintained.

Q9.What is the difference between a clustered and a non-clustered index?

A clustered index determines the physical order in which rows are stored on disk, so a table can have only one. A non-clustered index is a separate structure that stores key values plus pointers back to the rows, and a table can have many. Lookups via a non-clustered index may require an extra step to fetch the full row.

Q10.What are aggregate functions? Give examples.

Aggregate functions compute a single summary value over a set of rows, ignoring NULLs except for COUNT(*). Common examples are COUNT, SUM, AVG, MIN, and MAX. They are typically used with GROUP BY to produce one result row per group.

Q11.Explain GROUP BY with an example.

GROUP BY collapses rows that share the same value in the listed columns into a single summary row, which you then aggregate. For example, selecting department and COUNT(*) grouped by department returns one row per department with its headcount. Any non-aggregated column in the SELECT must appear in the GROUP BY.

Q12.What is a subquery and what is a correlated subquery?

A subquery is a query nested inside another query, often in WHERE, FROM, or SELECT. A correlated subquery references a column from the outer query, so it is re-evaluated once for each outer row. Non-correlated subqueries run once and can be cached, making them generally faster.

Q13.How does NULL behave in SQL comparisons?

NULL represents an unknown or missing value, and any comparison with it using equals, less-than, or greater-than yields UNKNOWN rather than TRUE. To test for NULL you must use IS NULL or IS NOT NULL. Aggregate functions skip NULLs, and functions like COALESCE let you substitute a default.

Q14.What are window functions and how do they differ from GROUP BY?

Window functions perform calculations across a set of rows related to the current row using an OVER clause, without collapsing those rows. Unlike GROUP BY, they preserve every input row while adding computed columns like running totals or rankings. Examples include ROW_NUMBER, RANK, DENSE_RANK, LAG, and LEAD.

Q15.What is the difference between RANK, DENSE_RANK, and ROW_NUMBER?

ROW_NUMBER assigns a unique sequential number to each row with no ties. RANK gives tied rows the same number but then skips the next values, producing 1, 2, 2, 4. DENSE_RANK also gives ties the same number but does not skip, producing 1, 2, 2, 3.

Practise & prepare

Free AI mock interview

Practise SQL out loud

Reading answers is step one. Rehearse them in OnJob's free AI mock interview, get instant feedback, then apply to AI-matched jobs in one click.

Start a free mock interview Create my free profile