Skip to main content

Window Functions

Window functions perform calculations across a set of rows that are related to the current row. Unlike aggregate functions that return a single result for a group of rows, window functions return a value for each row while considering a "window" of related rows.

See also: [SELECT] [WITH (CTEs)] [ Window Function Reference ]


Syntax

Window functions use the OVER clause to define the window specification:

SELECT <windowFunction> OVER ( <windowSpec> ) FROM <multipartIdentifier>;

windowFunctionCall::=

windowFunctionOVER'('windowSpec')'

Window Function

Window functions are specialized functions that perform calculations across a set of rows defined by the window specification. They can be either dedicated window functions (like ROW_NUMBER or LAG) or aggregate functions (like SUM or COUNT) used with an OVER clause.

{ <aggregateFunction> | ROW_NUMBER() | RANK() | DENSE_RANK() | PERCENT_RANK() 
| CUME_DIST() | NTILE(<int>) | LAG(<expr> [,<offset> [,<default>]])
| LEAD(<expr> [,<offset> [,<default>]]) | FIRST_VALUE(<expr>)
| LAST_VALUE(<expr>) | NTH_VALUE(<expr>, <int>) }

windowFunction::=

ROW_NUMBER'('')'RANK'('')'DENSE_RANK'('')'PERCENT_RANK'('')'CUME_DIST'('')'NTILE'('number')'LAG'('expression','number','expression')'LEAD'('expression','number','expression')'FIRST_VALUE'('expression')'LAST_VALUE'('expression')'NTH_VALUE'('expression','number')'aggregateFunction

Window Specification

The window specification defines how rows are partitioned, ordered, and framed for the window function calculation. PARTITION BY divides rows into groups, ORDER BY determines the sequence within each partition, and the optional frame clause specifies which rows relative to the current row are included in the calculation.

[ <windowName> ]
[ PARTITION BY <fieldList> ]
[ ORDER BY <fieldList> [ ASC | DESC ] ]
[ { ROWS | RANGE | GROUPS }
{ UNBOUNDED PRECEDING | <int> PRECEDING | CURRENT ROW }
| { ROWS | RANGE | GROUPS } BETWEEN
{ UNBOUNDED PRECEDING | <int> PRECEDING | CURRENT ROW | <int> FOLLOWING }
AND
{ <int> PRECEDING | CURRENT ROW | <int> FOLLOWING | UNBOUNDED FOLLOWING } ]

windowSpec::=

PARTITIONBYfullFieldNameCOMMAfullFieldNameORDERBYfullFieldNameCOMMAfullFieldNameASCDESCROWSRANGEGROUPSUNBOUNDEDPRECEDINGnumberPRECEDINGCURRENTROWBETWEENUNBOUNDEDPRECEDINGnumberPRECEDINGCURRENTROWnumberFOLLOWINGANDnumberPRECEDINGCURRENTROWnumberFOLLOWINGUNBOUNDEDFOLLOWING

Available Window Functions

CategoryFunctions
RankingROW_NUMBER(), RANK(), DENSE_RANK(), NTILE()
OffsetLAG(), LEAD(), FIRST_VALUE(), LAST_VALUE(), NTH_VALUE()
DistributionPERCENT_RANK(), CUME_DIST()
AggregateSUM(), COUNT(), AVG(), MIN(), MAX() with the OVER clause

Examples

Ranking with ROW_NUMBER, RANK, and DENSE_RANK

Use ranking functions to assign rankings to rows based on column values.

-- Rank contributors by contribution count
SELECT
login,
contributions,
ROW_NUMBER() OVER (ORDER BY contributions DESC) as row_num,
RANK() OVER (ORDER BY contributions DESC) as rank,
DENSE_RANK() OVER (ORDER BY contributions DESC) as dense_rank
FROM github.repos.contributors
WHERE owner = 'stackql'
AND repo = 'stackql';

Running totals and percentages

Aggregate functions like SUM and COUNT can be used as window functions to calculate running totals and percentages.

-- Running total and percentage of contributions
SELECT
login,
contributions,
SUM(contributions) OVER (ORDER BY contributions DESC) as running_total,
SUM(contributions) OVER () as total_contributions,
ROUND(100.0 * contributions / SUM(contributions) OVER (), 2) as pct_of_total
FROM github.repos.contributors
WHERE owner = 'stackql'
AND repo = 'stackql';

Comparing rows with LAG and LEAD

Use LAG and LEAD to access values from previous or subsequent rows.

-- Compare each release to previous release
SELECT
tag_name,
name,
published_at,
LAG(tag_name, 1) OVER (ORDER BY published_at) as previous_release,
LEAD(tag_name, 1) OVER (ORDER BY published_at) as next_release,
julianday(published_at) - julianday(LAG(published_at, 1) OVER (ORDER BY published_at)) as days_since_last_release
FROM github.repos.releases
WHERE owner = 'stackql'
AND repo = 'stackql'
ORDER BY published_at;

Creating groups with NTILE

Use NTILE to divide rows into a specified number of groups.

-- Group contributors into quartiles
SELECT
login,
contributions,
NTILE(4) OVER (ORDER BY contributions DESC) as quartile,
CASE NTILE(4) OVER (ORDER BY contributions DESC)
WHEN 1 THEN 'Top Contributors'
WHEN 2 THEN 'Active Contributors'
WHEN 3 THEN 'Moderate Contributors'
WHEN 4 THEN 'Occasional Contributors'
END as tier
FROM github.repos.contributors
WHERE owner = 'stackql'
AND repo = 'stackql';

Using PARTITION BY for grouped analysis

Use PARTITION BY to perform window calculations within groups.

-- Track cumulative issues by state
SELECT
number,
title,
state,
created_at,
ROW_NUMBER() OVER (ORDER BY created_at) as issue_sequence,
COUNT(*) OVER (ORDER BY created_at) as cumulative_issues,
COUNT(*) OVER (PARTITION BY state ORDER BY created_at) as cumulative_by_state
FROM github.issues.issues
WHERE owner = 'stackql'
AND repo = 'stackql'
ORDER BY created_at;

Using window frames

Specify a frame clause to control which rows are included in the window calculation.

-- Calculate moving average of commit activity
WITH weekly_totals AS (
SELECT
week,
SUM(json_each.value) as commits_this_week
FROM github.repos.stats_commit_activity, JSON_EACH(days)
WHERE owner = 'stackql'
AND repo = 'stackql'
GROUP BY week
)
SELECT
week,
commits_this_week,
ROUND(AVG(commits_this_week) OVER (ORDER BY week ROWS BETWEEN 3 PRECEDING AND CURRENT ROW), 1) as four_week_moving_avg,
SUM(commits_this_week) OVER (ORDER BY week) as cumulative_commits
FROM weekly_totals
ORDER BY week;

For detailed documentation on each window function, see the Window Function Reference.

For more information, see https://sqlite.org/windowfunctions.html.