Window Functions: Ranking, Aggregating, and Filtering in SQL

Window functions are a powerful feature in SQL that allow you to perform calculations across a set of table rows that are somehow related to the current row. This capability is particularly useful for tasks such as ranking, aggregating, and filtering data without collapsing the result set into a single output row. In this article, we will explore how to effectively use window functions in SQL, focusing on their applications in ranking, aggregating, and filtering data.

Understanding Window Functions

A window function operates on a set of rows defined by an OVER() clause. The basic syntax of a window function is:

function_name(column_name) OVER (PARTITION BY column_name ORDER BY column_name)
  • PARTITION BY: Divides the result set into partitions to which the window function is applied.
  • ORDER BY: Defines the order of rows within each partition.

Ranking Data with Window Functions

One of the most common uses of window functions is to rank data. SQL provides several ranking functions:

  • ROW_NUMBER(): Assigns a unique sequential integer to rows within a partition.
  • RANK(): Similar to ROW_NUMBER(), but rows with equal values receive the same rank, and the next rank is skipped.
  • DENSE_RANK(): Similar to RANK(), but does not skip ranks for ties.

Example: Ranking Employees by Salary

SELECT employee_id, salary,
       RANK() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees;

In this example, employees are ranked based on their salary in descending order. Tied salaries will receive the same rank, and the next rank will be incremented accordingly.

Aggregating Data with Window Functions

Window functions can also be used for aggregating data without collapsing the result set. Common aggregate functions include SUM(), AVG(), COUNT(), etc.

Example: Calculating Running Total of Sales

SELECT order_id, sale_amount,
       SUM(sale_amount) OVER (ORDER BY order_date) AS running_total
FROM sales;

This query calculates a running total of sales amounts ordered by the order date, allowing you to see cumulative sales over time.

Filtering Data with Window Functions

While window functions do not directly filter rows, you can use them in conjunction with common table expressions (CTEs) or subqueries to filter based on the results of a window function.

Example: Filtering Top N Sales per Region

WITH ranked_sales AS (
    SELECT region, sale_amount,
           RANK() OVER (PARTITION BY region ORDER BY sale_amount DESC) AS rank
    FROM sales
)
SELECT region, sale_amount
FROM ranked_sales
WHERE rank <= 3;

In this example, we first rank sales within each region and then filter to get the top 3 sales per region.

Conclusion

Window functions are essential for performing complex data analysis in SQL. By mastering ranking, aggregating, and filtering with window functions, you can enhance your data manipulation skills and prepare effectively for technical interviews. Understanding these concepts will not only help you in interviews but also in real-world data analysis tasks.