Window Functions
Window functions perform calculations across a set of rows that are related to the current row. Unlike aggregate functions that return a single result for a group of rows, window functions return a value for each row while considering a "window" of related rows.
See also:
[SELECT] [WITH (CTEs)] [ Window Function Reference ]
Syntax
Window functions use the OVER clause to define the window specification:
SELECT <windowFunction> OVER ( <windowSpec> ) FROM <multipartIdentifier>;
windowFunctionCall::=
Window Function
Window functions are specialized functions that perform calculations across a set of rows defined by the window specification. They can be either dedicated window functions (like ROW_NUMBER or LAG) or aggregate functions (like SUM or COUNT) used with an OVER clause.
{ <aggregateFunction> | ROW_NUMBER() | RANK() | DENSE_RANK() | PERCENT_RANK()
| CUME_DIST() | NTILE(<int>) | LAG(<expr> [,<offset> [,<default>]])
| LEAD(<expr> [,<offset> [,<default>]]) | FIRST_VALUE(<expr>)
| LAST_VALUE(<expr>) | NTH_VALUE(<expr>, <int>) }
windowFunction::=
Window Specification
The window specification defines how rows are partitioned, ordered, and framed for the window function calculation. PARTITION BY divides rows into groups, ORDER BY determines the sequence within each partition, and the optional frame clause specifies which rows relative to the current row are included in the calculation.
[ <windowName> ]
[ PARTITION BY <fieldList> ]
[ ORDER BY <fieldList> [ ASC | DESC ] ]
[ { ROWS | RANGE | GROUPS }
{ UNBOUNDED PRECEDING | <int> PRECEDING | CURRENT ROW }
| { ROWS | RANGE | GROUPS } BETWEEN
{ UNBOUNDED PRECEDING | <int> PRECEDING | CURRENT ROW | <int> FOLLOWING }
AND
{ <int> PRECEDING | CURRENT ROW | <int> FOLLOWING | UNBOUNDED FOLLOWING } ]
windowSpec::=
Available Window Functions
| Category | Functions |
|---|---|
| Ranking | ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE() |
| Offset | LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE(), NTH_VALUE() |
| Distribution | PERCENT_RANK(), CUME_DIST() |
| Aggregate | SUM(), COUNT(), AVG(), MIN(), MAX() with the OVER clause |
Examples
Ranking with ROW_NUMBER, RANK, and DENSE_RANK
Use ranking functions to assign rankings to rows based on column values.
-- Rank contributors by contribution count
SELECT
login,
contributions,
ROW_NUMBER() OVER (ORDER BY contributions DESC) as row_num,
RANK() OVER (ORDER BY contributions DESC) as rank,
DENSE_RANK() OVER (ORDER BY contributions DESC) as dense_rank
FROM github.repos.contributors
WHERE owner = 'stackql'
AND repo = 'stackql';
Running totals and percentages
Aggregate functions like SUM and COUNT can be used as window functions to calculate running totals and percentages.
-- Running total and percentage of contributions
SELECT
login,
contributions,
SUM(contributions) OVER (ORDER BY contributions DESC) as running_total,
SUM(contributions) OVER () as total_contributions,
ROUND(100.0 * contributions / SUM(contributions) OVER (), 2) as pct_of_total
FROM github.repos.contributors
WHERE owner = 'stackql'
AND repo = 'stackql';
Comparing rows with LAG and LEAD
Use LAG and LEAD to access values from previous or subsequent rows.
-- Compare each release to previous release
SELECT
tag_name,
name,
published_at,
LAG(tag_name, 1) OVER (ORDER BY published_at) as previous_release,
LEAD(tag_name, 1) OVER (ORDER BY published_at) as next_release,
julianday(published_at) - julianday(LAG(published_at, 1) OVER (ORDER BY published_at)) as days_since_last_release
FROM github.repos.releases
WHERE owner = 'stackql'
AND repo = 'stackql'
ORDER BY published_at;
Creating groups with NTILE
Use NTILE to divide rows into a specified number of groups.
-- Group contributors into quartiles
SELECT
login,
contributions,
NTILE(4) OVER (ORDER BY contributions DESC) as quartile,
CASE NTILE(4) OVER (ORDER BY contributions DESC)
WHEN 1 THEN 'Top Contributors'
WHEN 2 THEN 'Active Contributors'
WHEN 3 THEN 'Moderate Contributors'
WHEN 4 THEN 'Occasional Contributors'
END as tier
FROM github.repos.contributors
WHERE owner = 'stackql'
AND repo = 'stackql';
Using PARTITION BY for grouped analysis
Use PARTITION BY to perform window calculations within groups.
-- Track cumulative issues by state
SELECT
number,
title,
state,
created_at,
ROW_NUMBER() OVER (ORDER BY created_at) as issue_sequence,
COUNT(*) OVER (ORDER BY created_at) as cumulative_issues,
COUNT(*) OVER (PARTITION BY state ORDER BY created_at) as cumulative_by_state
FROM github.issues.issues
WHERE owner = 'stackql'
AND repo = 'stackql'
ORDER BY created_at;
Using window frames
Specify a frame clause to control which rows are included in the window calculation.
-- Calculate moving average of commit activity
WITH weekly_totals AS (
SELECT
week,
SUM(json_each.value) as commits_this_week
FROM github.repos.stats_commit_activity, JSON_EACH(days)
WHERE owner = 'stackql'
AND repo = 'stackql'
GROUP BY week
)
SELECT
week,
commits_this_week,
ROUND(AVG(commits_this_week) OVER (ORDER BY week ROWS BETWEEN 3 PRECEDING AND CURRENT ROW), 1) as four_week_moving_avg,
SUM(commits_this_week) OVER (ORDER BY week) as cumulative_commits
FROM weekly_totals
ORDER BY week;
For detailed documentation on each window function, see the Window Function Reference.
For more information, see https://sqlite.org/windowfunctions.html.