Direct Answer: What are SQL window functions?
SQL window functions perform calculations across a set of table rows that are related to the current row, without collapsing the rows into a single summary output (unlike GROUP BY). Defined by the OVER() clause, they allow analysts to compute metrics like running totals, row rankings, and period-over-period differences while retaining detail-level records.
In structured database querying, **SQL window functions** are the defining line between junior and senior data analysts. They are tested in virtually every technical interview at top MNCs and tech startups across India. If you cannot explain the difference between `RANK()` and `DENSE_RANK()`, or write a running total query, you will struggle to clear data analytics interviews.
This comprehensive guide details the syntax, operations, and business use cases for the most critical SQL window functions.
1. Understanding the OVER() Clause
The defining characteristic of any window function is the `OVER()` clause. This clause specifies how to partition and order the rows before the function calculates values. The general syntax is:
FUNCTION_NAME() OVER (
PARTITION BY partition_column
ORDER BY sort_column
)
- PARTITION BY: Divides the result set into groups (like `GROUP BY` but doesn't collapse rows).
- ORDER BY: Sorts the rows inside each partition to define the "window" of execution.
2. Ranking Functions: ROW_NUMBER, RANK, and DENSE_RANK
Ranking functions assign a sequential number to rows inside a partition. However, they handle duplicate values (ties) differently:
- `ROW_NUMBER()`: Assigns a unique sequential integer to every single row, regardless of ties.
- `RANK()`: Assigns the same rank to ties, but leaves gaps in the sequential numbering.
- `DENSE_RANK()`: Assigns the same rank to ties, leaving **no gaps** in the sequential numbering.
Ties Comparison Example
Consider a sales table with duplicate values:
| Employee | Sales | ROW_NUMBER() | RANK() | DENSE_RANK() |
|---|---|---|---|---|
| Anil | 10,000 | 1 | 1 | 1 |
| Vijay | 8,000 | 2 | 2 | 2 |
| Rahul | 8,000 | 3 | 2 | 2 |
| Kiran | 5,000 | 4 | 4 (gap left!) | 3 (no gaps) |
3. Value Functions: LAG and LEAD
Value functions allow you to reference data from surrounding rows without performing self-joins. This is extremely useful for calculating month-on-month growth or time differences:
- `LAG(column, offset)`: Retrieves a value from a specified number of rows **before** the current row.
- `LEAD(column, offset)`: Retrieves a value from a specified number of rows **after** the current row.
For example, to calculate sales growth compared to the previous month:
SELECT
SalesMonth,
Revenue,
LAG(Revenue, 1) OVER (ORDER BY SalesMonth) AS PreviousMonthRevenue,
Revenue - LAG(Revenue, 1) OVER (ORDER BY SalesMonth) AS RevenueChange
FROM MonthlySales;
4. Aggregate Window Functions
You can also use standard aggregates like `SUM()`, `AVG()`, `COUNT()`, and `MIN()/MAX()` as window functions. By adding `ORDER BY` inside `OVER()`, you compute running aggregates.
Here is how to calculate a running total of revenue over time:
SELECT
OrderDate,
DailyRevenue,
SUM(DailyRevenue) OVER (ORDER BY OrderDate) AS RunningTotalRevenue
FROM SalesOrders;
Master SQL Server for Data Analytics
SQL is the bedrock of data analytics. At Sasthra Analytics, Mr. Anil Kumar leads comprehensive SQL training covering basics, joins, subqueries, CTEs, complex window functions, performance indexing, and stored procedures. Prepare for MNC technical interviews with our bank of 500+ SQL interview Q&As and live project involvement.
Enquire About the Program