Window functions are a SQL feature for performing calculations across a set of rows related to the current row, such as the rows in the same department or the rows that precede it in some ordering. They are particularly useful in data analysis and reporting, which makes them a common topic in technical interviews for software engineers and data scientists. This article covers the concept of window functions, their syntax, and how to apply them in interview scenarios.
Window functions operate on a set of rows defined by an OVER() clause. Unlike aggregate functions, which return a single value for a group of rows, window functions return a value for each row in the result set. This allows you to perform calculations like running totals, moving averages, and ranking without collapsing the result set.
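To make the contrast with aggregate functions concrete, here is a minimal sketch that runs both kinds of query against the same data. It uses Python's built-in sqlite3 module only as a convenient way to execute the SQL (window functions need SQLite 3.25 or newer); the employees rows are invented for illustration.

```python
import sqlite3

# In-memory database with a small, made-up employees table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER, department_id INTEGER, salary INTEGER);
    INSERT INTO employees VALUES
        (1, 10, 5000), (2, 10, 6000),
        (3, 20, 4000), (4, 20, 6000), (5, 20, 8000);
""")

# Aggregate with GROUP BY: one output row per department.
grouped = conn.execute("""
    SELECT department_id, AVG(salary) AS dept_avg
    FROM employees
    GROUP BY department_id
    ORDER BY department_id
""").fetchall()

# Window function: every employee row is kept, with the department
# average attached to each one.
windowed = conn.execute("""
    SELECT employee_id, department_id, salary,
           AVG(salary) OVER (PARTITION BY department_id) AS dept_avg
    FROM employees
    ORDER BY employee_id
""").fetchall()

print(len(grouped), "rows from GROUP BY")
print(len(windowed), "rows from the window query")
for row in windowed:
    print(row)
```

The GROUP BY query collapses five employees into two department rows, while the window query returns all five rows, each carrying its department's average.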
The basic syntax of a window function is as follows:
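<window_function>(<arguments>) OVER ([PARTITION BY <columns>] [ORDER BY <columns>] [<frame_clause>])
PARTITION BY splits the rows into groups, ORDER BY sorts the rows within each group, and the optional frame clause (for example, ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) narrows the rows used for the current row's calculation. All three parts are optional, and some functions, such as ROW_NUMBER(), take no arguments.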
ROW_NUMBER(): Assigns a unique sequential integer to rows within a partition of a result set.
SELECT employee_id, department_id, ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS row_num
FROM employees;
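RANK(): Similar to ROW_NUMBER(), but rows with equal values in the ORDER BY columns receive the same rank, and the ranks that follow a tie are skipped (two rows tied at rank 2 are followed by rank 4).
SELECT employee_id, department_id, RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rank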
FROM employees;
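The difference between ROW_NUMBER() and RANK() only shows up when there are ties, so a small sketch with two equal salaries may help. It uses Python's built-in sqlite3 module purely as a harness, the data is invented, and the employee_id tie-breaker in the ROW_NUMBER() ordering is an addition for this example so that the numbering is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER, department_id INTEGER, salary INTEGER);
    -- Employees 2 and 3 are tied on salary (made-up data).
    INSERT INTO employees VALUES
        (1, 10, 9000), (2, 10, 8000), (3, 10, 8000), (4, 10, 7000);
""")

rows = conn.execute("""
    SELECT employee_id, salary,
           -- employee_id breaks ties so the row numbers are repeatable
           ROW_NUMBER() OVER (PARTITION BY department_id
                              ORDER BY salary DESC, employee_id) AS row_num,
           RANK()       OVER (PARTITION BY department_id
                              ORDER BY salary DESC) AS salary_rank
    FROM employees
    ORDER BY salary DESC, employee_id
""").fetchall()

for row in rows:
    print(row)
```

ROW_NUMBER() numbers the tied rows 2 and 3, whereas RANK() gives both of them rank 2 and then jumps to 4. Without a tie-breaker in the ORDER BY, which of the tied rows gets the lower row number is not guaranteed.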
SUM(): Adds up a column over the window. Combined with ORDER BY inside OVER, it produces a cumulative (running) sum.
SELECT employee_id, salary, SUM(salary) OVER (ORDER BY employee_id) AS running_total
FROM employees;
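One detail worth knowing for running totals: when OVER contains an ORDER BY but no frame clause, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which treats rows with equal ORDER BY values as a single step. The sketch below illustrates this with a hypothetical sales table (not from the examples above) in which two sales fall on the same day, again using Python's sqlite3 module only to run the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table; sales 2 and 3 share the same day.
    CREATE TABLE sales (sale_id INTEGER, sale_day TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        (1, '2024-01-01', 100),
        (2, '2024-01-02', 50),
        (3, '2024-01-02', 70),
        (4, '2024-01-03', 30);
""")

rows = conn.execute("""
    SELECT sale_id, amount,
           -- Default frame: tied rows (same sale_day) are summed together.
           SUM(amount) OVER (ORDER BY sale_day) AS total_default_frame,
           -- Explicit ROWS frame with a unique ordering: strictly row by row.
           SUM(amount) OVER (ORDER BY sale_day, sale_id
                             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
               AS total_rows_frame
    FROM sales
    ORDER BY sale_id
""").fetchall()

for row in rows:
    print(row)
```

With the default frame, both sales on 2024-01-02 show 220, the total through the end of that day. With the explicit ROWS frame, the total grows one row at a time (150, then 220). When the ORDER BY column is unique, as with employee_id in the query above, the two forms give the same result.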
AVG(): Computes the average of a column over a specified window.
SELECT employee_id, department_id, salary, AVG(salary) OVER (PARTITION BY department_id) AS avg_salary
FROM employees;
When preparing for interviews, you may encounter questions that require you to demonstrate your understanding of window functions. Here are a few common scenarios:
Question: Write a query to calculate the running total of sales for each month.
Solution:
SELECT month, sales,
SUM(sales) OVER (ORDER BY month) AS running_total
FROM monthly_sales;
Question: Write a query to rank employees within each department based on their salary.
Solution:
SELECT employee_id, department_id, salary,
RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rank
FROM employees;
Question: Write a query to calculate a 3-month moving average of sales.
Solution:
SELECT month, sales,
AVG(sales) OVER (ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg
FROM monthly_sales;
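A quick way to check the moving-average query is to run it on a few rows. The sketch below assumes monthly_sales holds exactly one row per month with no gaps and stores the month as 'YYYY-MM' text so that it sorts chronologically; both are assumptions, as are the sample figures. The months_in_window column is an extra added here to show how many rows each average actually covers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed layout: one row per month, month stored as sortable text.
    CREATE TABLE monthly_sales (month TEXT, sales INTEGER);
    INSERT INTO monthly_sales VALUES
        ('2024-01', 100), ('2024-02', 200), ('2024-03', 300), ('2024-04', 400);
""")

rows = conn.execute("""
    SELECT month, sales,
           AVG(sales) OVER (ORDER BY month
                            ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg,
           -- Number of rows that actually fell inside the frame.
           COUNT(*)   OVER (ORDER BY month
                            ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS months_in_window
    FROM monthly_sales
    ORDER BY month
""").fetchall()

for row in rows:
    print(row)
```

The first two rows are averaged over one and two months respectively, because fewer than two preceding rows exist. If an interviewer wants only full 3-month windows, months_in_window can be used to filter or blank out the early rows. Note also that a ROWS frame counts rows, not calendar months, so a missing month would silently stretch the window.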
Window functions are essential for performing complex calculations in SQL without losing the detail of individual rows. Mastering these functions can significantly enhance your data manipulation skills and improve your performance in technical interviews. Practice using window functions with different datasets to become proficient and confident in your SQL abilities.