Top 50 SQL & RDBMS Interview Questions with Answers (2026): Beginner to SQL Expert

SQL is tested in virtually every backend, data engineering, full-stack, and analytics interview. Whether you're applying for a junior developer role or a senior data engineering position, interviewers use SQL to assess your ability to think in sets, model data relationships, and write queries that perform at scale.
This guide covers the 50 most frequently asked SQL interview questions with technically precise answers and a "Why Interviewers Ask This" section that reveals the trap or insight behind every question, so you never get blindsided.
Topics covered: SQL vs MySQL, DDL/DML/DCL/TCL subsets, Primary & Foreign Keys, GROUP BY & HAVING, all JOIN types (INNER, LEFT, RIGHT, FULL, CROSS, SELF), UNION vs UNION ALL, aggregate functions, window functions (RANK, DENSE_RANK, ROW_NUMBER), CTEs, correlated subqueries, views, materialized views, clustered & non-clustered indexes, TRUNCATE vs DELETE vs DROP, ACID properties, stored procedures, triggers, transactions, normalization, denormalization, and SQL injection.
Contents
- 1. Basic SQL & RDBMS Concepts (Q1-Q7): SQL · MySQL · DDL/DML · Primary Key · Foreign Key · NULL
- 2. Querying & Filtering Data (Q8-Q14): WHERE · HAVING · GROUP BY · ORDER BY · IN · BETWEEN · LIKE · DISTINCT
- 3. SQL Joins (Q15-Q23): INNER · LEFT · RIGHT · FULL · CROSS · SELF · UNION · UNION ALL
- 4. Functions & Aggregation (Q24-Q29): Aggregate Fns · COUNT(*) · String Fns · COALESCE · Window Fns · RANK
- 5. Subqueries & Views (Q30-Q34): Subquery · Correlated Subquery · CTE · View · Materialized View
- 6. Indexes, Transactions & Security (Q35-Q50): Index · Clustered · TRUNCATE vs DELETE · Constraints · Procedures · Triggers · ACID · Normalization · SQL Injection
- 7. Common Interview Mistakes: WHERE vs HAVING · NULL behavior · SELECT * in production · Index trade-offs
- 8. Expert Interview Strategy: Query execution order · Window functions · EXPLAIN plans · Normalization trade-offs
- 9. Real-World Job Applications: Data Analyst · Backend Engineer · Data Engineer
Basic SQL & RDBMS Concepts Interview Questions (Q1-Q7)
1. What is SQL?
SQL (Structured Query Language) is the standard language for managing, querying, and manipulating relational databases. It lets users create, retrieve, update, and delete data (CRUD operations) in structured tables. SQL is designed for relational (tabular) databases, not NoSQL systems like MongoDB or Redis.
💡 Why Interviewers Ask This: The ultimate baseline test. You must explicitly mention that SQL is used for relational (tabular) databases, not NoSQL. This distinction is what they are checking for.
2. What is the difference between SQL and MySQL?
- SQL: The standard language used to query and operate any relational database system. It is not software; it is a specification.
- MySQL: A popular open-source Relational Database Management System (RDBMS) that uses SQL as its query language to manage the actual database.
- Other RDBMS examples: PostgreSQL, Microsoft SQL Server, Oracle Database, and SQLite all use SQL as their language.
💡 Why Interviewers Ask This: To ensure you understand the difference between a programming language specification (SQL) and a database engine implementation (MySQL). Many beginners confuse the two.
3. What are the subsets of SQL commands?
- DDL (Data Definition Language): Defines the database structure. Commands: CREATE, ALTER, DROP, TRUNCATE.
- DML (Data Manipulation Language): Manipulates the data within tables. Commands: INSERT, UPDATE, DELETE.
- DQL (Data Query Language): Retrieves data. Command: SELECT.
- DCL (Data Control Language): Manages user permissions. Commands: GRANT, REVOKE.
- TCL (Transaction Control Language): Manages transactions. Commands: COMMIT, ROLLBACK, SAVEPOINT.
💡 Why Interviewers Ask This: Highly tested. Classic trap: if an interviewer asks what type of command TRUNCATE is, the answer is DDL, not DML. Beginners often say DML.
4. What is a Primary Key?
A Primary Key is a constraint that uniquely identifies each row in a database table. Rules: (1) All values must be unique: no two rows can share the same primary key. (2) It cannot contain a NULL value. (3) A table can have only one primary key (though it can be composite, spanning multiple columns). In SQL Server (by default) and MySQL's InnoDB engine, the primary key is also the clustered index, which physically orders the table data; other engines such as PostgreSQL back it with an ordinary unique index.
💡 Why Interviewers Ask This: The most fundamental constraint in database design. Explicitly mention: unique, no NULLs, only one per table. These three rules are the expected answer.
5. What is a Foreign Key?
A Foreign Key is a column (or group of columns) in one table that references the Primary Key of another table, creating a link between the two. It enforces referential integrity: you cannot insert a value in the foreign key column that doesn't exist in the referenced parent table, and (depending on configuration) it controls what happens when a parent row is deleted or updated.
💡 Why Interviewers Ask This: Tests your knowledge of relational mapping. The expected follow-up is about ON DELETE CASCADE vs ON DELETE RESTRICT, so know both behaviours.
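The CASCADE vs RESTRICT follow-up is easier to answer once you have seen both fail and succeed. The sketch below is one way to try it, assuming SQLite driven from Python's built-in sqlite3 module; the customers, orders, and invoices tables are hypothetical, and other databases report the violations with different error types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id) ON DELETE CASCADE
    );
    CREATE TABLE invoices (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id) ON DELETE RESTRICT
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1);
    INSERT INTO invoices VALUES (20, 2);
""")

# Referential integrity: an order for a non-existent customer is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (11, 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# CASCADE: deleting customer 1 also deletes that customer's orders.
conn.execute("DELETE FROM customers WHERE id = 1")
orders_left = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# RESTRICT: deleting customer 2 fails while an invoice still references it.
try:
    conn.execute("DELETE FROM customers WHERE id = 2")
    restrict_blocked = False
except sqlite3.IntegrityError:
    restrict_blocked = True

print(orphan_rejected, orders_left, restrict_blocked)
```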
6. What is a Unique Constraint? How is it different from a Primary Key?
- Unique Constraint: Ensures all non-NULL values in a column are distinct. A table can have multiple unique constraints. The column may contain NULLs; how many depends on the database (SQL Server allows only one, while PostgreSQL, MySQL, and SQLite allow several by default because NULLs are not considered equal to each other).
- Primary Key: Also enforces uniqueness but cannot accept NULL. A table can have only one primary key. It is the table's main identifier.
- Summary: Primary Key = Unique + NOT NULL + only one per table. Unique Constraint = Unique + allows NULL + multiple per table.
💡 Why Interviewers Ask This: The classic PK vs Unique comparison. The ability to accept NULL is the critical differentiator that trips up candidates.
7. What is a NULL value in SQL?
A NULL value represents a missing, unknown, or inapplicable value in a column. It is not the same as zero (0), an empty string (''), or false. Critical rules: (1) Checking for NULL requires IS NULL or IS NOT NULL. Using = NULL in a WHERE clause never returns rows because NULL = NULL evaluates to UNKNOWN (not TRUE). (2) Most aggregate functions automatically ignore NULL values (except COUNT(*)).
💡 Why Interviewers Ask This: NULLs cause catastrophic bugs in production. The key insight: NULL = NULL is UNKNOWN, not TRUE. You must use IS NULL.
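A quick way to convince yourself of the = NULL rule is to run both predicates against the same rows. This is a minimal sketch, assuming SQLite via Python's sqlite3 module and a hypothetical users table; the three-valued logic it shows is standard SQL behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, phone TEXT);
    INSERT INTO users VALUES (1, '555-0100'), (2, NULL), (3, NULL);
""")

# "= NULL" compares against UNKNOWN, so no row ever qualifies.
eq_null = conn.execute("SELECT COUNT(*) FROM users WHERE phone = NULL").fetchone()[0]

# "IS NULL" is the correct test for missing values.
is_null = conn.execute("SELECT COUNT(*) FROM users WHERE phone IS NULL").fetchone()[0]

# NULL = NULL itself evaluates to NULL (UNKNOWN), which Python receives as None.
null_eq_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

print(eq_null, is_null, null_eq_null)  # 0 2 None
```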
Querying & Filtering Data Interview Questions (Q8-Q14)
8. What is the difference between WHERE and HAVING?
- WHERE: Filters individual rows before any grouping occurs. Cannot reference aggregate functions (SUM, COUNT, etc.).
- HAVING: Filters groups after the GROUP BY clause has aggregated rows. Its conditions normally use aggregate functions.
- Execution order: WHERE runs first → GROUP BY aggregates → HAVING filters groups.
💡 Why Interviewers Ask This: The most frequently asked filtering question in SQL interviews. Memorize the execution order: WHERE → GROUP BY → HAVING.
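The execution order above can be seen in a single query that uses both filters. The sketch below assumes SQLite via Python's sqlite3 module; the employees table and its salary figures are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ann', 'Eng', 120), ('Bob', 'Eng', 100), ('Cy', 'Eng', 40),
        ('Dee', 'Ops', 90), ('Eli', 'Ops', 30), ('Fay', 'HR', 80);
""")

rows = conn.execute("""
    SELECT department, COUNT(*) AS headcount
    FROM employees
    WHERE salary >= 50        -- row filter: runs before grouping
    GROUP BY department
    HAVING COUNT(*) >= 2      -- group filter: runs after aggregation
    ORDER BY department
""").fetchall()

print(rows)  # [('Eng', 2)]
```

WHERE first drops Cy and Eli, leaving Eng with two rows and Ops and HR with one each; HAVING then keeps only the groups with at least two remaining rows.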
9. What does the GROUP BY statement do?
The GROUP BY statement groups rows with the same values in specified columns into summary rows, collapsing multiple rows into one per group. It is almost always used with aggregate functions like COUNT(), SUM(), AVG(), MAX(), or MIN() to perform calculations on each group. Example: SELECT Department, COUNT(*) FROM Employees GROUP BY Department returns one row per department with the employee count.
💡 Why Interviewers Ask This: Essential for all data analysis and reporting roles. Know that every column in SELECT that is not an aggregate function must appear in the GROUP BY clause.
10. What does the ORDER BY clause do?
The ORDER BY clause sorts the result set in ascending (ASC) or descending (DESC) order based on one or more columns. By default, ORDER BY sorts in ascending order; you must explicitly write DESC for descending. Logically it is evaluated after WHERE, GROUP BY, HAVING, and SELECT (only LIMIT/OFFSET-style row limiting comes later), so it can reference column aliases defined in SELECT.
💡 Why Interviewers Ask This: Basic syntax knowledge. The key fact: ASC is the default and can be omitted. DESC must always be explicit.
11. What is the IN operator?
The IN operator allows you to specify multiple values in a WHERE clause as a shorthand for multiple OR conditions. Example: WHERE Status IN ('Shipped', 'Processing', 'Delivered') is equivalent to WHERE Status = 'Shipped' OR Status = 'Processing' OR Status = 'Delivered'. IN can also take a subquery: WHERE UserID IN (SELECT UserID FROM PremiumUsers).
💡 Why Interviewers Ask This: Tests your ability to write clean, readable queries. It also sets up the IN vs EXISTS comparison: EXISTS can stop at the first matching row, so it is often the better choice when the subquery returns many rows, although modern optimizers frequently produce the same plan for both.
12. What is the BETWEEN operator?
The BETWEEN operator selects values within a given range (inclusive on both ends). Values can be numbers, text, or dates. Example: WHERE Price BETWEEN 10 AND 20 returns rows where Price is 10, 20, or any value in between. NOT BETWEEN excludes the range. For dates: WHERE OrderDate BETWEEN '2026-01-01' AND '2026-03-31'. If OrderDate also stores a time of day, the upper bound means midnight at the start of March 31, so most of that day is excluded; a half-open range (>= '2026-01-01' AND < '2026-04-01') avoids this.
💡 Why Interviewers Ask This: Often asked with date filtering. Critical fact: both boundary values (10 and 20) are included in the result, so the range is inclusive.
13. What is the LIKE operator?
The LIKE operator searches for a specified pattern in a column using two wildcards:
- % (percent): Represents zero, one, or multiple characters. LIKE 'A%' = starts with A. LIKE '%son' = ends with "son". LIKE '%an%' = contains "an".
- _ (underscore): Represents exactly one character. LIKE '_ohn' matches "John" or "Bohn".
💡 Why Interviewers Ask This: Essential for text search. Performance note: LIKE '%value%' (leading wildcard) generally cannot use a standard B-tree index, so it usually forces a full table scan.
14. What does the DISTINCT keyword do?
The DISTINCT keyword in a SELECT statement returns only unique (non-duplicate) values, filtering out any duplicate rows from the result set. Example: SELECT DISTINCT Department FROM Employees returns each department name exactly once, regardless of how many employees are in it. Performance note: DISTINCT requires a sort or hash operation to deduplicate, so it is more expensive than a plain SELECT. Use GROUP BY as an alternative when you also need aggregation.
💡 Why Interviewers Ask This: Practical data-cleaning test. DISTINCT operates on the entire row, not just one column: if multiple columns are selected, all combinations are deduplicated together.
SQL Joins Interview Questions (Q15-Q23)
15. What is a JOIN in SQL?
A JOIN clause combines rows from two or more tables based on a related column between them, typically a Foreign Key in one table matching the Primary Key of another. JOINs are the fundamental mechanism that makes relational databases "relational": they let you query data spread across multiple normalized tables in a single statement. Without JOINs, you would need to run multiple queries and merge the results manually in application code.
💡 Why Interviewers Ask This: The bread and butter of relational databases. The follow-up is always naming the types. Know: INNER, LEFT, RIGHT, FULL OUTER, CROSS, and SELF.
16. What is an INNER JOIN?
An INNER JOIN returns only the rows that have matching values in both tables based on the join condition. If a row in either table does not have a match in the other table, that row is completely excluded from the result set. JOIN without a type keyword defaults to an INNER JOIN. Use case: get all orders that have a matching customer; orphaned orders with no customer record are excluded.
💡 Why Interviewers Ask This: The most common join type. Know that just writing JOIN (no type) defaults to an INNER JOIN in all major databases.
17. What is a LEFT JOIN (LEFT OUTER JOIN)?
A LEFT JOIN returns all records from the left table, and the matched records from the right table. For left table rows that have no match in the right table, the result contains NULL values for all right table columns. Use case: get a list of all customers, including those who have made zero purchases (their order columns will be NULL).
💡 Why Interviewers Ask This: The most practical join. Interviewers confirm you know when to choose LEFT JOIN over INNER JOIN: when you need the "all from left, matched from right" pattern.
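To see the difference between the two join types side by side, the sketch below runs an INNER JOIN and a LEFT JOIN over the same data. It assumes SQLite via Python's sqlite3 module; the customers and orders tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO orders VALUES (10, 1, 50), (11, 1, 25), (12, 2, 70);
""")

# INNER JOIN: Linus has no orders, so he disappears from the result.
inner_rows = conn.execute("""
    SELECT c.name, o.id FROM customers c
    JOIN orders o ON o.customer_id = c.id
    ORDER BY c.name, o.id
""").fetchall()

# LEFT JOIN: every customer is kept; Linus gets NULL (None) for the order columns.
left_rows = conn.execute("""
    SELECT c.name, o.id FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.name, o.id
""").fetchall()

# Common follow-up: customers with zero orders (the "anti-join" pattern).
no_orders = conn.execute("""
    SELECT c.name FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    WHERE o.id IS NULL
""").fetchall()

print(inner_rows)
print(left_rows)
print(no_orders)  # [('Linus',)]
```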
18. What is a RIGHT JOIN (RIGHT OUTER JOIN)?
A RIGHT JOIN returns all records from the right table, and the matched records from the left table. For right table rows with no match in the left table, the result contains NULL values for the left table columns. A RIGHT JOIN is the mirror image of a LEFT JOIN: any RIGHT JOIN can be rewritten as a LEFT JOIN by swapping the table order. In practice, most developers default to LEFT JOINs and swap table order, making RIGHT JOIN rare in real code.
💡 Why Interviewers Ask This: Proves you understand how table order affects an outer join. The key insight: it is exactly the inverse of LEFT JOIN.
19. What is a FULL OUTER JOIN?
A FULL OUTER JOIN returns all records from both tables, matched where possible. Unmatched rows from either side appear with NULL values for the columns of the table that had no match; it is the union of a LEFT JOIN and a RIGHT JOIN. Not all databases support FULL OUTER JOIN: MySQL does not support it natively. In MySQL, it is simulated by combining a LEFT JOIN and a RIGHT JOIN with UNION.
💡 Why Interviewers Ask This: Tests knowledge that MySQL requires a UNION workaround. This MySQL limitation is the expected gotcha in the answer.
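The UNION workaround looks roughly like the sketch below. It assumes SQLite via Python's sqlite3 module rather than MySQL, hypothetical employees and departments tables, and it writes the RIGHT JOIN half as a LEFT JOIN with the tables swapped so that it also runs on older SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO employees VALUES (1, 'Ann', 10), (2, 'Bob', NULL);
    INSERT INTO departments VALUES (10, 'Eng'), (20, 'Ops');
""")

rows = conn.execute("""
    -- Half 1: every employee, with the department when there is one
    SELECT e.name, d.name FROM employees e
    LEFT JOIN departments d ON d.id = e.dept_id
    UNION
    -- Half 2: every department, with the employee when there is one
    -- (equivalent to employees RIGHT JOIN departments)
    SELECT e.name, d.name FROM departments d
    LEFT JOIN employees e ON e.dept_id = d.id
""").fetchall()

print(sorted(rows, key=str))
```

One caveat with this emulation: UNION removes duplicates, so two genuinely identical rows in the joined output collapse into one, which a native FULL OUTER JOIN would not do.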
20. What is a CROSS JOIN?
A CROSS JOIN returns the Cartesian product of two tables: every row in the first table is paired with every row in the second table. If Table A has 10 rows and Table B has 10 rows, the result has 100 rows. There is no ON clause. Use cases: generating test data, creating all possible combinations (e.g., all colours × all sizes for a product). Danger: an accidental Cartesian product. In MySQL, a JOIN written without an ON clause is treated as a CROSS JOIN (PostgreSQL, SQL Server, and Oracle reject it as a syntax error), and in any database an old-style comma join (FROM A, B) with a missing WHERE condition does the same, potentially returning millions of rows.
💡 Why Interviewers Ask This: The "forgotten join condition" danger is the real reason this is asked. A CROSS JOIN on two production tables of 1M rows each generates 1 trillion rows.
21. What is a SELF JOIN?
A SELF JOIN is a join where a table is joined with itself. It is used to query hierarchical data or compare rows within the same table. Classic use case: finding employee-manager relationships where both the employee and their manager live in the same Employees table. Table aliases are mandatory: you must give the same table two different aliases to distinguish the two "copies": FROM Employees A JOIN Employees B ON A.ManagerID = B.ID.
💡 Why Interviewers Ask This: Advanced logic test. The employee-manager hierarchy is the canonical example. Aliases are technically required, so mention them explicitly.
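Here is the employee-manager self join as a runnable sketch. It assumes SQLite via Python's sqlite3 module and reuses the Employees, ManagerID, and ID names from the answer; the Name column and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (ID INTEGER PRIMARY KEY, Name TEXT, ManagerID INTEGER);
    INSERT INTO Employees VALUES
        (1, 'Maya', NULL),  -- top of the hierarchy, no manager
        (2, 'Noor', 1),
        (3, 'Omar', 1),
        (4, 'Pia', 2);
""")

# A = the employee's row, B = the manager's row in the same table.
pairs = conn.execute("""
    SELECT A.Name AS employee, B.Name AS manager
    FROM Employees A
    JOIN Employees B ON A.ManagerID = B.ID
    ORDER BY A.ID
""").fetchall()

print(pairs)  # [('Noor', 'Maya'), ('Omar', 'Maya'), ('Pia', 'Noor')]
```

Maya is missing because an inner join drops rows whose ManagerID is NULL; switching to LEFT JOIN would keep her with a NULL manager.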
22. What is the difference between JOIN and UNION?
- JOIN: Combines tables horizontally by adding new columns to the result, matching rows from multiple tables based on a related key. The result is wider.
- UNION: Combines the result sets of two or more SELECT statements vertically by adding new rows. The queries must have the same number of columns and compatible data types. The result is taller.
💡 Why Interviewers Ask This: The most frequently confused concept for junior SQL developers. Memorize: JOIN adds columns. UNION adds rows.
23. What is the difference between UNION and UNION ALL?
- UNION: Combines two result sets and removes duplicate rows. Internally performs a sort/hash deduplication step, which makes it slower.
- UNION ALL: Combines two result sets and keeps all duplicate rows. There is no deduplication step, so it is typically faster.
- Best practice: Always use UNION ALL unless duplicate removal is explicitly required. The deduplication cost of UNION is unnecessary overhead if your data is already distinct or duplicates are acceptable.
💡 Why Interviewers Ask This: Performance tuning awareness. Senior engineers default to UNION ALL; it's a signal that you think about query cost, not just correctness.
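The difference is easiest to see when the two inputs overlap. A minimal sketch, assuming SQLite via Python's sqlite3 module and two hypothetical customer tables that share one email address:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE online_customers (email TEXT);
    CREATE TABLE store_customers (email TEXT);
    INSERT INTO online_customers VALUES ('a@example.com'), ('b@example.com');
    INSERT INTO store_customers VALUES ('b@example.com'), ('c@example.com');
""")

# UNION deduplicates: b@example.com appears once.
union_rows = conn.execute("""
    SELECT email FROM online_customers
    UNION
    SELECT email FROM store_customers
    ORDER BY email
""").fetchall()

# UNION ALL keeps every row: b@example.com appears twice.
union_all_rows = conn.execute("""
    SELECT email FROM online_customers
    UNION ALL
    SELECT email FROM store_customers
    ORDER BY email
""").fetchall()

print(len(union_rows), len(union_all_rows))  # 3 4
```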
Functions & Aggregation Interview Questions (Q24-Q29)
24. What are Aggregate Functions?
Aggregate Functions perform a calculation on a set of values and return a single scalar result. They are often used with GROUP BY to compute metrics per group:
- COUNT(): number of rows
- SUM(): total of a numeric column
- AVG(): arithmetic mean
- MIN() / MAX(): smallest / largest value
All aggregate functions automatically ignore NULL values, except COUNT(*), which counts all rows including those with NULLs.
💡 Why Interviewers Ask This: Core data analysis knowledge. The NULL-ignoring behaviour is what they test, because it affects AVG() results in ways developers don't expect.
25. What is the difference between COUNT(*) and COUNT(column_name)?
- COUNT(*): Counts the total number of rows in the result set, including rows where the column values are NULL.
- COUNT(column_name): Counts only the rows where that specific column is NOT NULL. Rows with NULL in that column are excluded from the count.
- Example: If a table has 10 rows and 3 have Phone = NULL, then COUNT(*) = 10 but COUNT(Phone) = 7.
💡 Why Interviewers Ask This: A classic trap question. Many developers wrongly assume COUNT(column) and COUNT(*) always return the same value, and the NULL distinction trips them up.
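The 10-rows, 3-NULLs example can be reproduced directly. The sketch below assumes SQLite via Python's sqlite3 module and a hypothetical contacts table; it also shows how AVG() skips NULLs, which is the related surprise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, Phone TEXT)")

# 10 rows; the first 3 have no phone number (NULL).
rows = [(i, None if i <= 3 else f"555-01{i:02d}") for i in range(1, 11)]
conn.executemany("INSERT INTO contacts VALUES (?, ?)", rows)

count_all, count_phone = conn.execute(
    "SELECT COUNT(*), COUNT(Phone) FROM contacts"
).fetchone()

# AVG ignores the NULL: (10 + 20) / 2, not (10 + 0 + 20) / 3.
avg_ignoring_null = conn.execute("""
    SELECT AVG(v) FROM (SELECT 10 AS v UNION ALL SELECT NULL UNION ALL SELECT 20) AS t
""").fetchone()[0]

print(count_all, count_phone, avg_ignoring_null)  # 10 7 15.0
```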
26. What are String Functions in SQL?
String Functions manipulate text data in queries. Common examples:
- CONCAT(str1, str2): Concatenates two or more strings.
- SUBSTRING(str, start, length): Extracts a portion of a string.
- UPPER(str) / LOWER(str): Converts case.
- LENGTH(str): Returns the length of the string (the name and unit vary by database: MySQL's LENGTH counts bytes and CHAR_LENGTH counts characters, and SQL Server uses LEN).
- TRIM(str): Removes leading and trailing spaces.
- REPLACE(str, old, new): Replaces occurrences of a substring.
💡 Why Interviewers Ask This: Tests ability to clean and format raw data. Data engineers use these constantly to prepare messy imported data for downstream consumption.
27. What is the COALESCE() function?
COALESCE() evaluates a comma-separated list of arguments and returns the first non-NULL value. If all arguments are NULL, it returns NULL. It is the SQL equivalent of a null-coalescing fallback. Classic use: SELECT COALESCE(Phone, Email, 'N/A') FROM Users. If Phone is NULL it tries Email; if Email is also NULL it returns the literal string 'N/A'. It can accept any number of arguments and short-circuits at the first non-NULL.
💡 Why Interviewers Ask This: The ultimate tool for handling missing data gracefully. COALESCE is far cleaner than chaining multiple CASE WHEN ... IS NULL expressions.
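The Phone, Email, 'N/A' fallback chain from the answer can be checked with three rows that exercise each branch. This sketch assumes SQLite via Python's sqlite3 module; the sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (id INTEGER PRIMARY KEY, Phone TEXT, Email TEXT);
    INSERT INTO Users VALUES
        (1, '555-0100', 'a@example.com'),  -- Phone present: Phone wins
        (2, NULL,       'b@example.com'),  -- Phone missing: falls back to Email
        (3, NULL,       NULL);             -- both missing: falls back to 'N/A'
""")

contacts = [
    row[0]
    for row in conn.execute(
        "SELECT COALESCE(Phone, Email, 'N/A') FROM Users ORDER BY id"
    )
]

print(contacts)  # ['555-0100', 'b@example.com', 'N/A']
```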
28. What are Window Functions?
Window Functions perform a calculation across a set of rows that are "related" to the current row, without collapsing those rows into a single summary row (unlike aggregate functions). Each row retains its own identity in the output. Window functions use the OVER() clause, which defines the "window" of rows. The PARTITION BY sub-clause divides the window by group; ORDER BY within OVER() defines the order for ranking/running functions. Examples: RANK(), DENSE_RANK(), ROW_NUMBER(), LAG(), LEAD(), SUM() OVER().
💡 Why Interviewers Ask This: The dividing line between junior and mid-level data engineers. The OVER() clause is the mandatory signature of a window function, so mention it explicitly.
29. What is the difference between RANK(), DENSE_RANK(), and ROW_NUMBER()?
All three are ranking window functions. The difference lies in how they handle tied values:
- ROW_NUMBER(): Assigns a unique sequential integer to every row, so there are never ties. (1, 2, 3, 4)
- RANK(): Assigns the same rank to ties, then skips the next rank number. (1, 2, 2, 4): rank 3 is skipped.
- DENSE_RANK(): Assigns the same rank to ties, then does not skip the next rank. (1, 2, 2, 3): no gaps.
Use DENSE_RANK() when finding the Nth highest value, to avoid rank gaps from ties.
💡 Why Interviewers Ask This: A guaranteed question for Data Analyst and Backend Engineer roles. The "skip vs no-skip" distinction between RANK and DENSE_RANK is the exact differentiator they test.
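Running all three functions over the same tied data makes the numbering patterns concrete. The sketch assumes SQLite 3.25 or newer (the first version with window functions) via Python's sqlite3 module; the employees table and salaries are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, salary INTEGER);
    INSERT INTO employees VALUES ('Ann', 300), ('Bob', 200), ('Cy', 200), ('Dee', 100);
""")

ranked = conn.execute("""
    SELECT name,
           ROW_NUMBER() OVER (ORDER BY salary DESC, name) AS row_num,
           RANK()       OVER (ORDER BY salary DESC)       AS rnk,
           DENSE_RANK() OVER (ORDER BY salary DESC)       AS dense_rnk
    FROM employees
    ORDER BY salary DESC, name
""").fetchall()

# Nth highest salary (here N = 2) using DENSE_RANK, so ties do not create gaps.
second_highest = conn.execute("""
    SELECT DISTINCT salary
    FROM (SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS d FROM employees) AS t
    WHERE d = 2
""").fetchone()[0]

for row in ranked:
    print(row)
print(second_highest)  # 200
```

Bob and Cy tie on salary: ROW_NUMBER gives them 2 and 3, RANK gives both 2 and then jumps to 4 for Dee, and DENSE_RANK gives both 2 and continues with 3.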
Subqueries & Views Interview Questions (Q30-Q34)
30. What is a Subquery?
A Subquery (Inner Query or Nested Query) is a query embedded inside another SQL query, such as a SELECT, INSERT, UPDATE, or DELETE statement. For a non-correlated subquery, the inner query is conceptually evaluated first and its result is passed to the outer query. Subqueries can appear in the WHERE clause, the FROM clause (as a derived table), or the SELECT list. Example: SELECT Name FROM Employees WHERE Salary > (SELECT AVG(Salary) FROM Employees) finds all employees earning above average.
💡 Why Interviewers Ask This: Proves you can write multi-step logic in a single statement. The avg-salary example is the canonical illustration, so know it.
31. What is a Correlated Subquery?
A Correlated Subquery is a subquery that references one or more columns from the outer query, creating a dependency between the inner and outer queries. Because it cannot run independently, it is conceptually evaluated once for every row processed by the outer query, which can approach O(N²) work on large tables. Example: finding all employees who earn more than the average salary in their own department requires a correlated subquery that re-computes the department average for each employee row. They are often slow, and a JOIN or window function is usually the better choice when one is available, although some optimizers rewrite correlated subqueries into joins automatically.
💡 Why Interviewers Ask This: Performance test. The O(N²) worst case and the recommendation to replace with JOINs/window functions is the senior-level insight they look for.
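The "above the department average" example can be written both ways and compared. The sketch below assumes SQLite 3.25+ via Python's sqlite3 module and a hypothetical employees table; it only shows that the two forms return the same rows, and makes no claim about their relative speed on a given engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ann', 'Eng', 120), ('Bob', 'Eng', 100), ('Cy', 'Eng', 40),
        ('Dee', 'Ops', 90), ('Eli', 'Ops', 30), ('Fay', 'HR', 80);
""")

# Correlated form: the inner query refers to e.department from the outer row.
correlated = conn.execute("""
    SELECT name FROM employees e
    WHERE salary > (SELECT AVG(salary) FROM employees WHERE department = e.department)
    ORDER BY name
""").fetchall()

# Window-function form: compute each department average once, then filter.
windowed = conn.execute("""
    SELECT name
    FROM (SELECT name, salary,
                 AVG(salary) OVER (PARTITION BY department) AS dept_avg
          FROM employees) AS t
    WHERE salary > dept_avg
    ORDER BY name
""").fetchall()

print(correlated)  # [('Ann',), ('Bob',), ('Dee',)]
print(windowed)
```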
32. What is a CTE (Common Table Expression)?
A CTE is a temporary, named result set defined with the WITH clause that you can reference in the subsequent SELECT, INSERT, UPDATE, or DELETE statement. Unlike subqueries, CTEs are defined once at the top and are referenced by name, making complex multi-step queries readable from top to bottom. Recursive CTEs (WITH RECURSIVE) can traverse hierarchical data like an org chart or file system. They are not physically materialized (unless the database optimizer decides to do so) and exist only for the duration of the query.
💡 Why Interviewers Ask This: Senior developers strongly prefer CTEs over deeply nested subqueries. Mentioning Recursive CTEs for hierarchical data is a senior-level signal.
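A recursive CTE walking an org chart might look like the sketch below. It assumes SQLite via Python's sqlite3 module; the Employees table, the CTE name chain, and the depth column are all illustrative choices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (ID INTEGER PRIMARY KEY, Name TEXT, ManagerID INTEGER);
    INSERT INTO Employees VALUES
        (1, 'Maya', NULL), (2, 'Noor', 1), (3, 'Omar', 1), (4, 'Pia', 2);
""")

org_chart = conn.execute("""
    WITH RECURSIVE chain(id, name, depth) AS (
        -- Anchor member: start from the person with no manager
        SELECT ID, Name, 0 FROM Employees WHERE ManagerID IS NULL
        UNION ALL
        -- Recursive member: add everyone who reports to a row already in the chain
        SELECT e.ID, e.Name, c.depth + 1
        FROM Employees e
        JOIN chain c ON e.ManagerID = c.id
    )
    SELECT name, depth FROM chain ORDER BY depth, name
""").fetchall()

print(org_chart)  # [('Maya', 0), ('Noor', 1), ('Omar', 1), ('Pia', 2)]
```

The recursion stops on its own once a pass adds no new rows, which happens here after Pia because nobody reports to her.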
33. What is a View?
A View is a virtual table defined by a stored SQL query. It has rows and columns like a real table, but it does not physically store data; every time you query a view, the underlying SQL runs to generate the result. Benefits: (1) Simplification: hide complex JOIN logic behind an easy-to-query name. (2) Security: expose only selected columns to certain users (e.g., show employee names but hide salaries). (3) Consistency: multiple queries/apps share the same query definition.
💡 Why Interviewers Ask This: Security use case (hiding sensitive columns) is the most important reason to mention. Views are a common mechanism for column-level security in enterprise databases.
34. What is a Materialized View?
Unlike a standard View, a Materialized View physically stores the query result data on disk and is refreshed periodically (on schedule, on demand, or on data change, depending on the database). The trade-off: vastly faster reads for expensive aggregations at the cost of potentially stale data between refreshes. Use cases: pre-computing complex dashboard aggregations in data warehouses (Snowflake, BigQuery, PostgreSQL). Materialized views are a common building block of OLAP (analytical) systems where queries aggregate billions of rows.
💡 Why Interviewers Ask This: Advanced Data Warehousing knowledge. The stale-vs-fast trade-off and the mention of OLAP/Snowflake signals genuine experience beyond basic SQL.
Indexes, Transactions & Security Interview Questions (Q35-Q50)
35. What is an Index in SQL?
An Index is an internal data structure, typically a B-Tree, that the database engine maintains to speed up data retrieval. Like a book's index, it allows the engine to jump directly to relevant rows instead of scanning every row in the table (a full table scan). Trade-offs: indexes dramatically speed up SELECT queries but slow down INSERT, UPDATE, and DELETE operations because the index structures must also be updated with every data change. They also consume additional storage.
💡 Why Interviewers Ask This: The #1 database performance tuning question. Always mention both sides: faster reads, slower writes, extra storage. Over-indexing is a real production problem.
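You can watch the planner switch from a full scan to an index lookup by asking for the query plan before and after creating the index. This is a sketch under assumptions: it uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module, the employees table and the index name idx_emp_department are hypothetical, and the exact plan wording differs between SQLite versions and between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employees (name, department) VALUES (?, ?)",
    [(f"emp{i}", f"dept{i % 50}") for i in range(5000)],
)

def plan(sql):
    # The fourth column of each EXPLAIN QUERY PLAN row is the readable description.
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM employees WHERE department = 'dept7'"

before = plan(query)  # expected to describe a SCAN of the whole table
conn.execute("CREATE INDEX idx_emp_department ON employees(department)")
after = plan(query)   # expected to describe a SEARCH using idx_emp_department

print(before)
print(after)
```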
36. What is a Clustered Index?
A Clustered Index determines the physical order of data rows on disk: the table rows are sorted and stored in the same order as the index key. Because data can only be physically sorted one way, a table can have only one clustered index. In SQL Server (by default) and MySQL's InnoDB engine, the Primary Key automatically becomes the clustered index. Clustered indexes make range queries (BETWEEN, date ranges) very fast because the relevant rows are stored next to each other.
💡 Why Interviewers Ask This: "Only one per table" and "physically sorts the data" are the two key facts. Knowing that range queries benefit most from clustered indexes shows query optimization depth.
37. What is a Non-Clustered Index?
A Non-Clustered Index creates a separate data structure (like the index at the back of a textbook) that holds the indexed column values along with pointers (row locators) back to the actual data rows, but it does not reorder the physical data. A table can have multiple non-clustered indexes (SQL Server allows up to 999). A lookup via a non-clustered index requires an extra step, following the pointer to the actual row (a "key lookup" or "bookmark lookup"), making it slightly slower than a clustered index for the same column.
💡 Why Interviewers Ask This: Distinguishes logical lookup (non-clustered) from physical ordering (clustered). Multiple non-clustered indexes are allowed, and that's the key contrast.
38. What is the difference between TRUNCATE, DELETE, and DROP?
- DELETE (DML): Removes specific rows matching a WHERE clause (or all rows if no WHERE). Logs every deleted row individually, so it is the slowest of the three, but it can be rolled back inside a transaction. Row-level triggers fire.
- TRUNCATE (DDL): Removes all rows from the table by deallocating data pages. Does not log individual rows, so it is very fast. Whether it can be rolled back depends on the database: MySQL and Oracle commit it implicitly, while SQL Server and PostgreSQL allow it to be rolled back inside an explicit transaction. In many databases it resets identity/auto-increment counters. Row-level DELETE triggers do not fire.
- DROP (DDL): Removes the entire table (structure, all data, all indexes, all constraints) from the database. Once committed, it is irreversible without a backup.
💡 Why Interviewers Ask This: A guaranteed interview question. Confusing these three commands is how production databases get destroyed. TRUNCATE = DDL (not DML) is the classic trap.
39. What are SQL Constraints?
Constraints are rules enforced on columns to maintain data integrity at the schema level:
- NOT NULL: Column cannot contain NULL.
- UNIQUE: All values in the column must be distinct.
- PRIMARY KEY: NOT NULL + UNIQUE; uniquely identifies each row.
- FOREIGN KEY: Enforces referential integrity between two tables.
- CHECK: Ensures values satisfy a specific boolean condition (e.g., CHECK (Age >= 18)).
- DEFAULT: Assigns a default value if none is provided during INSERT.
💡 Why Interviewers Ask This: Data integrity is a senior engineering concern. Constraints are the database's last line of defense against bad data: they enforce rules even when application code has bugs.
40. What is a Stored Procedure?
A Stored Procedure is a named block of SQL code stored in the database that can be invoked with a CALL or EXEC statement. Many databases compile it and cache its query plan, so repeated executions can be faster than sending raw SQL over the network each time. Benefits: (1) Reduced network traffic: send one procedure name rather than large SQL strings. (2) Reusability: centralized business logic. (3) Security: users can execute a procedure without having direct table access.
💡 Why Interviewers Ask This: Enterprise architecture knowledge. Many legacy banking and ERP systems run almost entirely through stored procedures. Know the three benefits above.
41. What is the difference between a Stored Procedure and a Function?
- Function: Must return a single value (scalar) or a table. In some databases, SQL Server for example, it cannot modify database state (no INSERT/UPDATE/DELETE). Can be used directly inside a SELECT statement: SELECT dbo.GetTax(Price) FROM Products. Best for: calculations and transformations.
- Stored Procedure: Can return zero, one, or multiple result sets. Can modify database state. Must be called explicitly with CALL or EXEC; it cannot be embedded in a SELECT. Best for: business logic workflows with side effects.
💡 Why Interviewers Ask This: You must know when to use which. Functions = pure calculations. Procedures = stateful business logic. The "can't use a procedure in SELECT" rule is the syntax trap.
42. What is a Database Trigger?
A Trigger is a special stored procedure that automatically fires in response to a specific event (INSERT, UPDATE, or DELETE) on a particular table, without being explicitly called. Triggers run either BEFORE or AFTER the triggering event (some databases also support INSTEAD OF triggers, mainly on views). Primary use case: audit logging, for example automatically inserting a record into a UserAuditLog table every time a user's profile is updated, capturing who changed what and when. Also used for enforcing complex business rules that CHECK constraints cannot handle.
💡 Why Interviewers Ask This: Audit trail use case is the standard expected example. Know that triggers fire automatically: they are invisible to the application code and can cause unexpected performance overhead if misused.
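An audit-log trigger along those lines could be sketched as follows. It assumes SQLite's trigger syntax via Python's sqlite3 module (other databases differ in details such as how OLD and NEW are referenced); the Users table, the trigger name, and the audit columns are hypothetical, while UserAuditLog is the table name used in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE UserAuditLog (
        user_id    INTEGER,
        old_email  TEXT,
        new_email  TEXT,
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP
    );

    -- Fires automatically after any UPDATE that touches the email column.
    CREATE TRIGGER trg_users_email_audit
    AFTER UPDATE OF email ON Users
    FOR EACH ROW
    BEGIN
        INSERT INTO UserAuditLog (user_id, old_email, new_email)
        VALUES (OLD.id, OLD.email, NEW.email);
    END;

    INSERT INTO Users VALUES (1, 'old@example.com');
""")

# The application only issues the UPDATE; the audit row is written by the trigger.
conn.execute("UPDATE Users SET email = 'new@example.com' WHERE id = 1")

audit_rows = conn.execute(
    "SELECT user_id, old_email, new_email FROM UserAuditLog"
).fetchall()

print(audit_rows)  # [(1, 'old@example.com', 'new@example.com')]
```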
43. What is a Transaction?
A Transaction is a sequence of one or more SQL operations that are executed as a single, indivisible logical unit of work. The defining rule: a transaction must execute completely (COMMIT) or not at all (ROLLBACK). If any step in the sequence fails, the entire transaction is rolled back, leaving the database in the state it was in before the transaction began. Classic example: a bank transfer, where debiting account A and crediting account B must happen atomically. If the credit fails after the debit succeeds, you must roll back the debit to avoid losing money.
💡 Why Interviewers Ask This: The bank transfer is the canonical example, so know it verbatim. Transactions are the foundation of every financial, e-commerce, and booking system.
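The bank transfer can be sketched as below, assuming SQLite via Python's sqlite3 module. The accounts table, the CHECK constraint used to force a failure, and the transfer helper are illustrative; the credit is deliberately applied first so that the failed overdraft shows an already-successful step being undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        id      TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)  -- overdrafts are rejected
    );
    INSERT INTO accounts VALUES ('A', 100), ('B', 50);
""")

def transfer(src, dst, amount):
    """Move funds atomically; returns True on COMMIT, False on ROLLBACK."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return dict(conn.execute("SELECT id, balance FROM accounts"))

ok_small = transfer("A", "B", 30)   # succeeds: A=70, B=80
ok_big = transfer("A", "B", 500)    # debit violates the CHECK, so the credit is undone too
final = balances()

print(ok_small, ok_big, final)  # True False {'A': 70, 'B': 80}
```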
44. What are the ACID properties?
ACID is a set of four properties that guarantee reliable database transactions:
- Atomicity: "All or nothing." Every operation in the transaction succeeds completely, or the entire transaction is rolled back. No partial commits.
- Consistency: A transaction brings the database from one valid state to another valid state. No integrity constraint is violated at the end.
- Isolation: Concurrent transactions execute as if they were run serially; intermediate states of a transaction are invisible to other transactions.
- Durability: Once a transaction is committed, the data is permanently saved and survives system crashes, power failures, or restarts (via write-ahead logging).
💡 Why Interviewers Ask This: The most famous question in database history. You must define all four letters clearly. Common follow-up: "Which ACID property is sacrificed in NoSQL systems for scalability?" The commonly expected answer is Consistency (the BASE model relaxes it to eventual consistency).
45. What does the COMMIT command do?
The COMMIT command permanently saves all changes made during the current transaction to the database, ending the transaction. After a COMMIT, the changes are: (1) Durable: persisted to disk via the write-ahead log, surviving crashes. (2) Visible: now visible to all other transactions and users. Once committed, the changes cannot be undone with ROLLBACK; they are permanent. If AUTOCOMMIT mode is on (MySQL default), every individual statement is automatically committed immediately after execution.
💡 Why Interviewers Ask This: Transaction lifecycle test. The AUTOCOMMIT mode detail is the advanced insight: in MySQL, you must explicitly start a transaction with BEGIN/START TRANSACTION when you want multi-statement atomicity.
46. What does the ROLLBACK command do?
The ROLLBACK command undoes all changes made during the current transaction, reverting the database to its state before the transaction began. It is the "undo" mechanism when an error occurs mid-transaction. In application code, ROLLBACK should be called in the error handler (CATCH block) when any step in a transaction fails, ensuring the database is never left in a partial state. SAVEPOINT allows partial rollbacks: you can roll back to a named intermediate point without discarding the entire transaction.
💡 Why Interviewers Ask This: Error handling architecture test. Mentioning SAVEPOINT for partial rollbacks is a senior-level detail most candidates miss.
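A partial rollback with SAVEPOINT might look like the following. This is a sketch under the assumption of SQLite driven from Python's sqlite3 module; the audit_log table, the savepoint name before_coupon, and the step labels are invented for illustration.

```python
import sqlite3

# isolation_level=None lets us issue BEGIN / SAVEPOINT / COMMIT ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE audit_log (step TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO audit_log VALUES ('create order')")
conn.execute("SAVEPOINT before_coupon")
conn.execute("INSERT INTO audit_log VALUES ('apply coupon')")
# Undo only the work done since the savepoint; the transaction stays open.
conn.execute("ROLLBACK TO SAVEPOINT before_coupon")
conn.execute("INSERT INTO audit_log VALUES ('charge card')")
conn.execute("COMMIT")

steps = [r[0] for r in conn.execute("SELECT step FROM audit_log ORDER BY rowid")]

# A plain ROLLBACK, by contrast, discards the whole open transaction.
conn.execute("BEGIN")
conn.execute("INSERT INTO audit_log VALUES ('refund')")
conn.execute("ROLLBACK")
steps_after_full_rollback = [
    r[0] for r in conn.execute("SELECT step FROM audit_log ORDER BY rowid")
]
```

Only the coupon step disappears from the first transaction, while the second transaction leaves no trace at all.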
47. What is Data Normalization?
Normalization is the process of organizing database tables to reduce data redundancy and improve data integrity by applying a series of normal forms:
- 1NF (First Normal Form): Eliminate repeating groups. Each column must hold atomic (indivisible) values. Each row must be unique.
- 2NF (Second Normal Form): Must be in 1NF and have no partial dependencies: every non-key column must depend on the entire primary key (relevant for composite keys).
- 3NF (Third Normal Form): Must be in 2NF and have no transitive dependencies: non-key columns must depend only on the primary key, not on other non-key columns.
💡 Why Interviewers Ask This: Normalization optimizes for write-heavy (OLTP) workloads: less data redundancy means fewer places to update. Know all three normal forms in order.
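The "fewer places to update" benefit of 3NF can be made concrete. The sketch below assumes SQLite via Python's sqlite3 module and a hypothetical employees/departments schema: dept_name depends on dept_id rather than on the employee, so it is moved to its own table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 3NF: dept_name depends on dept_id, so it lives in its own table
    -- instead of being repeated on every employee row.
    CREATE TABLE departments (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employees (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES departments (dept_id)
    );
    INSERT INTO departments VALUES (10, 'Platform');
    INSERT INTO employees VALUES (1, 'Ann', 10), (2, 'Bob', 10), (3, 'Cy', 10);
""")

# Renaming the department touches one row, however many employees it has.
rows_touched = conn.execute(
    "UPDATE departments SET dept_name = 'Infrastructure' WHERE dept_id = 10"
).rowcount
conn.commit()

# Every employee sees the new name through the join.
names_seen = {
    row[0]
    for row in conn.execute(
        "SELECT d.dept_name FROM employees e JOIN departments d USING (dept_id)"
    )
}
```

Had dept_name been stored on each employee row, the same rename would have required three updates and risked leaving them inconsistent.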
48. What is Denormalization?
Denormalization is the intentional process of adding redundancy back into a previously normalized database: merging tables and duplicating data to reduce the number of JOIN operations required for common read queries. It trades data consistency and write performance for significantly faster reads. Use case: reporting dashboards where complex 5-table JOINs run millions of times per day. Denormalization collapses those JOINs by pre-combining the data. It underpins OLAP (analytical) data warehouse design, most visibly the star schema with its wide, denormalized dimension tables (a snowflake schema normalizes those dimensions again).
💡 Why Interviewers Ask This: Tests real engineering trade-offs. While normalization optimizes writes (OLTP), denormalization optimizes reads (OLAP). Knowing both and when to apply each is what senior engineers are tested on.
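The read-versus-consistency trade-off can be shown in a few lines. This is only a sketch, assuming SQLite via Python's sqlite3 module; the customers, orders, and order_report tables are hypothetical, and a real warehouse would refresh the copy on a schedule or through a pipeline.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers (customer_id),
        total       INTEGER NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (100, 1, 250), (101, 1, 40), (102, 2, 90);

    -- Denormalized copy: the customer name is duplicated onto every order row.
    CREATE TABLE order_report AS
    SELECT o.order_id, c.name AS customer_name, o.total
    FROM orders o JOIN customers c ON c.customer_id = o.customer_id;
""")

# Benefit: the dashboard query needs no JOIN.
report = conn.execute(
    "SELECT customer_name, SUM(total) FROM order_report"
    " GROUP BY customer_name ORDER BY customer_name"
).fetchall()

# Cost: a rename in the source table does not reach the duplicated copy.
conn.execute("UPDATE customers SET name = 'Acme Corp' WHERE customer_id = 1")
stale_names = {r[0] for r in conn.execute("SELECT customer_name FROM order_report")}
```

The report still shows the old customer name after the rename, which is the consistency cost the answer refers to.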
49. What is SQL Injection and how is it prevented?
SQL Injection is a critical security vulnerability where an attacker injects malicious SQL code through an input field to manipulate the database query. Example: a login form builds its query by concatenation, as in "SELECT * FROM Users WHERE username = '" + input + "'". An attacker who enters ' OR '1'='1 makes the WHERE condition always true, bypassing authentication. Prevention: always use Parameterized Queries (Prepared Statements), so the database driver treats user input as a data value, never as executable SQL code. Additional defenses: ORMs, input validation, and the principle of least privilege (the database user should have minimal permissions).
💡 Why Interviewers Ask This: The most critical web security question in SQL interviews. Parameterized queries are the correct, non-negotiable answer. Mentioning ORMs and least privilege shows security depth.
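The attack and its fix can be reproduced side by side. The sketch below assumes SQLite via Python's sqlite3 module, whose placeholder is ?; the users table and both login functions are hypothetical, and the plaintext password is there only to keep the demo short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY, password TEXT)")
# Plaintext password for demonstration only; real systems store a salted hash.
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")


def login_vulnerable(username, password):
    """Builds SQL by string concatenation: user input becomes SQL code."""
    query = (
        "SELECT COUNT(*) FROM users WHERE username = '" + username
        + "' AND password = '" + password + "'"
    )
    return conn.execute(query).fetchone()[0] > 0


def login_safe(username, password):
    """Parameterized query: the driver binds input strictly as data values."""
    query = "SELECT COUNT(*) FROM users WHERE username = ? AND password = ?"
    return conn.execute(query, (username, password)).fetchone()[0] > 0


payload = "' OR '1'='1"
bypassed = login_vulnerable("anyone", payload)  # WHERE ... OR '1'='1' is always true
blocked = login_safe("anyone", payload)         # payload is compared as a literal
legit = login_safe("alice", "s3cret")
```

With the payload, the concatenated query matches every row and the vulnerable login succeeds, while the parameterized version simply finds no user with that literal password.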
50. How do you find the Nth highest salary using SQL?
The modern, scalable solution uses DENSE_RANK(), the window function that ranks without skipping numbers on ties:
WITH RankedSalaries AS (
    SELECT
        Salary,
        -- "Rank" is a reserved word in some engines (e.g. MySQL 8), so use another alias
        DENSE_RANK() OVER (ORDER BY Salary DESC) AS SalaryRank
    FROM Employees
)
SELECT DISTINCT Salary -- DISTINCT returns one row even when several employees tie
FROM RankedSalaries
WHERE SalaryRank = N; -- Replace N with 2, 3, etc.
Use DENSE_RANK instead of RANK to correctly handle tied salary values: with RANK(), ties cause rank gaps that could skip your target N.
💡 Why Interviewers Ask This: The most famous SQL logic question. It tests CTEs, window functions, and sorting simultaneously. Using DENSE_RANK() instead of RANK() is what separates an advanced answer from a basic one.
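The rank gaps are easiest to see on a small data set. The following sketch assumes SQLite 3.25 or later (the first version with window functions) via Python's sqlite3 module; the employees rows and the nth_highest_salary helper are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", 300), ("Bob", 300), ("Cy", 200), ("Di", 100)],
)

# Side-by-side comparison: RANK leaves a gap after the tie, DENSE_RANK does not.
comparison = conn.execute("""
    SELECT salary,
           RANK()       OVER (ORDER BY salary DESC) AS rank_with_gaps,
           DENSE_RANK() OVER (ORDER BY salary DESC) AS rank_no_gaps
    FROM employees
    ORDER BY salary DESC
""").fetchall()

# With RANK there is no row ranked 2, so "second highest" would come back empty.
salaries_with_rank_2 = [salary for salary, rank, _ in comparison if rank == 2]


def nth_highest_salary(conn, n):
    """Return the n-th highest distinct salary, or None if there is none."""
    row = conn.execute("""
        WITH ranked AS (
            SELECT salary,
                   DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_rank
            FROM employees
        )
        SELECT DISTINCT salary FROM ranked WHERE salary_rank = ?
    """, (n,)).fetchone()
    return row[0] if row else None


second = nth_highest_salary(conn, 2)
fifth = nth_highest_salary(conn, 5)
```

Because two employees tie at 300, RANK jumps from 1 to 3, whereas DENSE_RANK assigns 2 to the salary of 200.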
Common Mistakes in SQL Interviews
- Confusing WHERE with HAVING: WHERE filters rows before grouping; HAVING filters groups after aggregation. Writing WHERE COUNT(*) > 5 is an error that signals weak SQL fundamentals.
- Not understanding NULL behavior: NULL = NULL is not TRUE; it evaluates to UNKNOWN. Using = instead of IS NULL silently drops rows. NULL in aggregations is ignored by COUNT(column) but counted by COUNT(*).
- Using SELECT * in production queries: Fetching all columns wastes bandwidth, breaks index-only scans, and makes queries fragile when the schema changes. Always specify the exact columns you need.
- Forgetting about indexing when discussing performance: Saying "add an index" without explaining B-tree vs hash indexes, covering indexes, or the write penalty of excessive indexing shows surface-level optimization knowledge.
- Mixing up INNER JOIN, LEFT JOIN, and CROSS JOIN: INNER JOIN returns only matching rows. LEFT JOIN keeps all left rows with NULLs for non-matches. CROSS JOIN produces a Cartesian product. Using the wrong join type silently returns incorrect result sets.
- Not knowing window functions: ROW_NUMBER(), RANK(), LAG(), LEAD(), and SUM() OVER() are essential for modern SQL interviews. Solving these with subqueries shows you're stuck in pre-2003 SQL.
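The first two mistakes above are easy to verify for yourself. This is a small sketch, assuming SQLite via Python's sqlite3 module and a hypothetical employees table; the exact error text for an aggregate in WHERE varies by engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, manager_id INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", "eng", None), ("Bob", "eng", 1), ("Ed", "eng", 1),
     ("Cy", "ops", 1), ("Di", "ops", None)],
)


def scalar(sql):
    """Run a query that returns a single value."""
    return conn.execute(sql).fetchone()[0]


# "= NULL" evaluates to UNKNOWN for every row, so nothing matches.
eq_null = scalar("SELECT COUNT(*) FROM employees WHERE manager_id = NULL")
is_null = scalar("SELECT COUNT(*) FROM employees WHERE manager_id IS NULL")

# COUNT(*) counts rows; COUNT(column) skips NULLs in that column.
count_star = scalar("SELECT COUNT(*) FROM employees")
count_col = scalar("SELECT COUNT(manager_id) FROM employees")

# Aggregates belong in HAVING; using one in WHERE is rejected by the engine.
big_depts = [r[0] for r in conn.execute(
    "SELECT dept FROM employees GROUP BY dept HAVING COUNT(*) > 2"
)]
try:
    conn.execute("SELECT dept FROM employees WHERE COUNT(*) > 2 GROUP BY dept")
    where_with_aggregate_ok = True
except sqlite3.Error:
    where_with_aggregate_ok = False
```

The = NULL query returns zero rows without raising any error, which is exactly why this mistake tends to go unnoticed.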
Expert Interview Strategy for SQL Roles
- Always explain the query execution order. FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT. Knowing this order explains why you can't use column aliases in WHERE and why HAVING can reference aggregates.
- Use window functions to show modern SQL skills. Replace correlated subqueries with ROW_NUMBER() OVER(PARTITION BY ...). Interviewers increasingly test window functions; they're often faster, usually cleaner, and expected at mid-level and above.
- Discuss query optimization with EXPLAIN plans. Mention sequential scan vs index scan, join algorithms (nested loop, hash join, merge join), and how to read an execution plan. This differentiates you from candidates who only write queries.
- Know normalization AND when to denormalize. Explain 1NF through 3NF and BCNF. Then explain that read-heavy analytics workloads benefit from denormalization. Showing both sides demonstrates practical database design experience.
- Write clean, formatted SQL. Use uppercase keywords (SELECT, FROM, WHERE), proper indentation, and meaningful aliases. Clean SQL is easier to debug and review, and it demonstrates professional habits.
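To make the window-function advice concrete, here is a "top earner per department" query written both ways. It is a sketch only, assuming SQLite 3.25 or later via Python's sqlite3 module and a hypothetical employees table with no salary ties inside a department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", "eng", 120), ("Bob", "eng", 150), ("Cy", "ops", 90), ("Di", "ops", 80)],
)

# Window-function version: number the rows inside each department, keep row 1.
top_by_window = conn.execute("""
    WITH ranked AS (
        SELECT name, dept, salary,
               ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS rn
        FROM employees
    )
    SELECT dept, name, salary
    FROM ranked
    WHERE rn = 1
    ORDER BY dept
""").fetchall()

# Correlated-subquery version: the inner query is evaluated per outer row.
top_by_subquery = conn.execute("""
    SELECT dept, name, salary
    FROM employees AS e
    WHERE salary = (SELECT MAX(salary) FROM employees WHERE dept = e.dept)
    ORDER BY dept
""").fetchall()
```

Both queries return the same rows here; the window version extends naturally to "top 3 per department" by changing rn = 1 to rn <= 3, which the subquery form cannot do as easily.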
How These Concepts Apply in Real SQL Jobs
Data Analyst
Writes complex queries with window functions for trend analysis, uses CTEs for readable multi-step aggregations, joins multiple tables for cross-departmental reports, and optimizes slow dashboards by analyzing EXPLAIN plans.
Backend Engineer
Designs normalized schemas for transactional systems, writes efficient parameterized queries to prevent SQL injection, manages migrations and index strategies, and uses transactions with proper isolation levels for data consistency.
Data Engineer
Builds ETL pipelines with complex SQL transformations, optimizes query performance on data warehouses (Snowflake, BigQuery, Redshift), uses partitioning and clustering for petabyte-scale tables, and writes stored procedures for automated data workflows.
Conclusion: Master SQL Interviews
These 50 SQL interview questions cover the essential concepts you'll encounter in data analyst, backend engineer, data engineer, and database administrator roles. Mastering these topics demonstrates a solid understanding of query writing, joins, aggregations, window functions, indexing, normalization, and query optimization.
SQL interviews test both correctness and efficiency. Each answer includes the optimal query approach, performance considerations, and what interviewers are evaluating, from basic syntax to advanced optimization techniques.
After reviewing these answers, reinforce your learning with hands-on query practice on real datasets. The combination of query writing + EXPLAIN analysis + schema design creates the strongest foundation for SQL interviews.
Topics covered in this guide
Topics in this guide: SQL syntax, JOIN types, window functions, CTEs, indexes, transaction isolation, normalization, stored procedures, SQL injection.
For freshers: SELECT queries, basic filtering (WHERE, HAVING), GROUP BY, simple JOINs, primary/foreign keys, basic index usage.
For experienced professionals: Analytical window functions (LEAD, LAG, RANK), complex correlated subqueries, query execution plans, index tuning, transaction concurrency anomalies.
Interview preparation tips: Always explain your JOIN strategy and how indexes speed up specific queries. Be ready to write window functions and CTEs on the whiteboard.
Frequently Asked Questions
Q. Is SQL important for software engineering interviews?
Q. What SQL topics are most asked for freshers?
Q. What SQL topics are tested for experienced roles?
Q. What is the difference between OLTP and OLAP in SQL context?
Q. How do you optimize a slow SQL query?
Common Interview Mistakes
Errors that eliminate candidates
- Giving textbook definitions without showing a concrete SQL use case.
- Skipping trade-offs and answering as if there is only one correct engineering decision.
- Over-answering for 2-3 minutes without structure, metrics, or outcomes.
Expert Interview Strategy
30-second answer rule
- Start with a one-line definition, then explain one real scenario where you used it in SQL.
- Use a 3-step structure: concept, practical example, and interviewer intent.
- Close with one trade-off (performance, scale, security, or maintainability).
Real-World Job Applications
These SQL patterns are directly tested for production roles where interviewers expect clear debugging steps, architecture trade-offs, and communication under time pressure.
Conclusion
Mastering these SQL interview questions means explaining concepts quickly, connecting them to real systems, and justifying decisions with practical trade-offs.
Frequently Asked Questions
How should I prepare for this topic in 7 days? Focus on high-frequency patterns, rehearse 30-second answers, and revise one practical example per category.
What do interviewers score most? Clarity, structured thinking, and your ability to reason through constraints and trade-offs.