Skip to main content

Azure Activity Observability using StackQL

In this guide, you'll learn how to query and report Azure activity logs using StackQL, a powerful dev tool that enables querying and deploying cloud infrastructure and resources using SQL syntax.

Tested with embedded sql backend macos linux powershell

Basic query

The basic query shown below uses the azure.monitor.activity_logs resource in the azure provider; substitute the subscriptionId with yours. In this query we will use jsonnet with external variables for StackQL query pre-processing, see Using Variables for more information. In this case we have done this as an inline jsonnet code block, but this can also be supplied as an external file using either the --iqldata or -q arguments to stackql exec or stackql shell.

The JSON_EXTRACT and SPLIT_PART functions are used to manipulate response data.

<<<jsonnet
{
stackqlQuery: {
subscriptionId: '631d1c6d-2a65-43e7-93c2-688bfe4e1468',
filter: "eventTimestamp ge ''" + std.extVar('startdate') + "''",
}
}
>>>
--
-- StackQL query with jsonnet preprocessing to retrieve all recent Azure activity
--
SELECT
eventTimestamp as event_timestamp,
caller,
JSON_EXTRACT(claims, '$.ipaddr') as ip_address,
subscriptionId,
tenantId,
SPLIT_PART(JSON_EXTRACT(authorization, '$.scope'), '/', 5) as resource_group,
SPLIT_PART(resourceId, '/', -1) as resource_name,
level,
JSON_EXTRACT(authorization, '$.action') as action,
JSON_EXTRACT(category, '$.localizedValue') as category,
JSON_EXTRACT(operationName, '$.localizedValue') as operation,
JSON_EXTRACT(status, '$.localizedValue') as status
FROM azure.monitor.activity_logs
WHERE $filter = '{{ .stackqlQuery.filter }}'
AND subscriptionId = '{{ .stackqlQuery.subscriptionId }}';

Running the query interactively

From a machine that is authenticated to Azure (including the Azure Portal Cloud Shell):

# to run on mac or linux
./stackql shell --var startdate=$(date -u -d '1 day ago' '+%Y-%m-%dT%H:%M:%SZ')

or using PowerShell:

# to run using windows powershell
$startDate = (Get-Date).AddDays(-1).ToString("yyyy-MM-ddTHH:mm:ssZ")
stackql.exe shell --var startdate=$startDate

You should see a tabular output with columns for: event_timestamp, caller, ip_address, subscriptionId, tenantId, resource_group, resource_name, level, action, category, operation and status.

Generating a CSV report

By saving the StackQL query to a file (activity_report.iql), you can produce a csv report from the activity query as follows (see --output csv for more information on stackql outputs):

# to run on mac or linux
./stackql exec \
--output csv \
-i activity_report.iql \
-f activity_report.csv \
--var startdate=$(date -u -d '1 day ago' '+%Y-%m-%dT%H:%M:%SZ')

or using PowerShell:

# to run using windows powershell
$startDate = (Get-Date).AddDays(-1).ToString("yyyy-MM-ddTHH:mm:ssZ")
stackql.exe exec `
--output csv `
-i activity_report.iql `
-f activity_report.csv `
--var startdate=$startDate

Server-side filtering with KQL

Server-side filtering in Azure is achieved through the use of KQL (Kusto Query Language). By extending the $filter clause in your query, you can filter data directly on the server. This method is highly efficient as it reduces the amount of data transferred over the network. Below are some examples of server-side filtering (for brevity these queries are using parameters directly instead of using the jsonnet preprocessor):

Specifying a date range for activities

--
-- List events in a time range
--
SELECT
eventTimestamp as event_timestamp,
caller,
JSON_EXTRACT(claims, '$.ipaddr') as ip_address,
subscriptionId,
tenantId,
SPLIT_PART(JSON_EXTRACT(authorization, '$.scope'), '/', 5) as resource_group,
SPLIT_PART(resourceId, '/', -1) as resource_name,
level,
JSON_EXTRACT(authorization, '$.action') as action,
JSON_EXTRACT(category, '$.localizedValue') as category,
JSON_EXTRACT(operationName, '$.localizedValue') as operation,
JSON_EXTRACT(status, '$.localizedValue') as status
FROM azure.monitor.activity_logs
WHERE $filter = 'eventTimestamp ge ''2024-01-24T00:00:00Z'' and eventTimestamp le ''2024-01-25T00:00:00Z'''
AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468';

Getting activities for a resource group

--
-- List events for a resource group
--
SELECT
eventTimestamp as event_timestamp,
caller,
JSON_EXTRACT(claims, '$.ipaddr') as ip_address,
subscriptionId,
tenantId,
SPLIT_PART(resourceId, '/', -1) as resource_name,
level,
JSON_EXTRACT(authorization, '$.action') as action,
JSON_EXTRACT(category, '$.localizedValue') as category,
JSON_EXTRACT(operationName, '$.localizedValue') as operation,
JSON_EXTRACT(status, '$.localizedValue') as status
FROM azure.monitor.activity_logs
WHERE $filter = 'eventTimestamp ge ''2024-01-24T00:00:00Z'' and eventTimestamp le ''2024-01-25T00:00:00Z'' and resourceGroupName eq ''MC_kube-03_aks862a_australiaeast'''
AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468';

Getting activities for a resource type

--
-- List events for a resource provider
--
SELECT
eventTimestamp as event_timestamp,
caller,
JSON_EXTRACT(claims, '$.ipaddr') as ip_address,
subscriptionId,
tenantId,
SPLIT_PART(JSON_EXTRACT(authorization, '$.scope'), '/', 5) as resource_group,
SPLIT_PART(resourceId, '/', -1) as resource_name,
level,
JSON_EXTRACT(authorization, '$.action') as action,
JSON_EXTRACT(category, '$.localizedValue') as category,
JSON_EXTRACT(operationName, '$.localizedValue') as operation,
JSON_EXTRACT(status, '$.localizedValue') as status
FROM azure.monitor.activity_logs
WHERE $filter = 'eventTimestamp ge ''2024-01-24T00:00:00Z'' and eventTimestamp le ''2024-01-25T00:00:00Z'' and resourceProvider eq ''Microsoft.Compute'''
AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468';

Getting activities for a specific resource

Note we added an additional field (correlationId) which we will use in the next server-side filter...

--
-- List events for resource using the resourceUri
--
SELECT
eventTimestamp as event_timestamp,
caller,
JSON_EXTRACT(claims, '$.ipaddr') as ip_address,
subscriptionId,
tenantId,
SPLIT_PART(resourceId, '/', -1) as resource_name,
correlationId,
level,
JSON_EXTRACT(authorization, '$.action') as action,
JSON_EXTRACT(category, '$.localizedValue') as category,
JSON_EXTRACT(operationName, '$.localizedValue') as operation,
JSON_EXTRACT(status, '$.localizedValue') as status
FROM azure.monitor.activity_logs
WHERE $filter = 'eventTimestamp ge ''2024-01-24T00:00:00Z'' and eventTimestamp le ''2024-01-25T00:00:00Z'' and resourceUri eq ''/subscriptions/631d1c6d-2a65-43e7-93c2-688bfe4e1468/resourceGroups/MC_kube-03_aks862a_australiaeast/providers/Microsoft.Compute/virtualMachines/aks-agentpool-35064155-1'''
AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468';

Getting activities for a specific resource

The correlation ID is a unique identifier used to track and group events related to a particular operation or transaction in Azure. This query helps in pinpointing specific events within a given time frame that are related to a particular operation, making it easier to troubleshoot or understand the sequence of activities. Here is an example using a correlationId captured in the previous query:

--
-- List events for a correlation Id
--
SELECT
eventTimestamp as event_timestamp,
caller,
JSON_EXTRACT(claims, '$.ipaddr') as ip_address,
subscriptionId,
tenantId,
SPLIT_PART(JSON_EXTRACT(authorization, '$.scope'), '/', 5) as resource_group,
SPLIT_PART(resourceId, '/', -1) as resource_name,
level,
JSON_EXTRACT(authorization, '$.action') as action,
JSON_EXTRACT(category, '$.localizedValue') as category,
JSON_EXTRACT(operationName, '$.localizedValue') as operation,
JSON_EXTRACT(status, '$.localizedValue') as status
FROM azure.monitor.activity_logs
WHERE $filter = 'eventTimestamp ge ''2024-01-24T00:00:00Z'' and eventTimestamp le ''2024-01-25T00:00:00Z'' and correlationId eq ''30b92c6d-4a82-401f-ac73-e28377669539'''
AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468';

Client-side filtering with the WHERE Clause

Client-side filtering can be done using the traditional SQL WHERE clause. This method filters the data after it has been retrieved from the server, offering flexibility in data manipulation and analysis within the StackQL environment.

Getting activities for a specific user

In the example query, the WHERE clause filters records based on the caller (the user), providing an additional layer of data refinement.

--
-- Filter events for a particular user (caller)
--
SELECT
eventTimestamp as event_timestamp,
caller,
JSON_EXTRACT(claims, '$.ipaddr') as ip_address,
subscriptionId,
tenantId,
SPLIT_PART(JSON_EXTRACT(authorization, '$.scope'), '/', 5) as resource_group,
SPLIT_PART(resourceId, '/', -1) as resource_name,
level,
JSON_EXTRACT(authorization, '$.action') as action,
JSON_EXTRACT(category, '$.localizedValue') as category,
JSON_EXTRACT(operationName, '$.localizedValue') as operation,
JSON_EXTRACT(status, '$.localizedValue') as status
FROM azure.monitor.activity_logs
WHERE $filter = 'eventTimestamp ge ''2024-01-01T00:00:00Z'''
AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468'
AND caller = 'javen@stackql.io';

Running summary/aggregate queries

As StackQL is a complete SQL engine, GROUP BY/aggregate queries (including AVG, COUNT, SUM, MIN, MAX and more are natively available for all providers, services and resources.

Count events by severity level

In this example we will group events from a start datetime by their severity and calculate the counts for each level.

--
-- Summarize events by severity level
--
SELECT
level,
COUNT(*) as num_events
FROM azure.monitor.activity_logs
WHERE $filter = 'eventTimestamp ge ''2024-01-29T00:00:00Z'''
AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468'
GROUP BY level;
/* returns..
|---------------|------------|
| level | num_events |
|---------------|------------|
| Error | 58 |
|---------------|------------|
| Informational | 57 |
|---------------|------------|
| Warning | 57 |
|---------------|------------|
*/

Using JOIN to enrich activities

SELECT
l.eventTimestamp as event_timestamp,
l.level,
JSON_EXTRACT(l.operationName, '$.localizedValue') as operation,
vm.name,
vm.location,
JSON_EXTRACT(vm.properties, '$.timeCreated') as time_created
FROM azure.monitor.activity_logs l
INNER JOIN azure.compute.virtual_machines vm
ON vm.name = SPLIT_PART(l.resourceId, '/', -1)
WHERE vm.resourceGroupName = 'MC_kube-03_aks862a_australiaeast'
AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468'
AND l.$filter = 'eventTimestamp ge ''2024-01-28T00:00:00Z'' and resourceGroupName eq ''MC_kube-03_aks862a_australiaeast''';

or using variables from jsonnet, provided as a file using either the --iqldata or -q arguments, or as a jsonnet block as shown here:

<<<jsonnet
{
stackqlQuery: {
subscriptionId: '631d1c6d-2a65-43e7-93c2-688bfe4e1468',
resourceGroupName: 'MC_kube-03_aks862a_australiaeast',
}
}
>>>
SELECT
l.eventTimestamp as event_timestamp,
l.level,
JSON_EXTRACT(l.operationName, '$.localizedValue') as operation,
vm.name,
vm.location,
JSON_EXTRACT(vm.properties, '$.timeCreated') as time_created
FROM azure.monitor.activity_logs l
INNER JOIN azure.compute.virtual_machines vm
ON vm.name = SPLIT_PART(l.resourceId, '/', -1)
WHERE vm.resourceGroupName = '{{ .stackqlQuery.resourceGroupName}}'
AND subscriptionId = '{{ .stackqlQuery.subscriptionId}}'
AND l.$filter = 'eventTimestamp ge ''2024-01-28T00:00:00Z'' and resourceGroupName eq ''{{ .stackqlQuery.resourceGroupName}}''';
info

WHERE parameters in a JOIN query which are not disambiguated, such as subscriptionId in the above case are shared by both resources (azure.monitor.activity_logs and azure.compute.virtual_machines in this case).