Azure Activity Observability using StackQL
In this guide, you'll learn how to query and report Azure activity logs using StackQL, a powerful dev tool that enables querying and deploying cloud infrastructure and resources using SQL syntax.
Tested with embedded sql backend macos linux powershell
Basic query
The basic query shown below uses the azure.monitor.activity_logs
resource in the azure
provider; substitute the subscriptionId
with yours. In this query we will use jsonnet
with external variables for StackQL query pre-processing, see Using Variables for more information. In this case we have done this as an inline jsonnet
code block, but this can also be supplied as an external file using either the --iqldata
or -q
arguments to stackql exec
or stackql shell
.
The JSON_EXTRACT
and SPLIT_PART
functions are used to manipulate response data.
<<<jsonnet
{
stackqlQuery: {
subscriptionId: '631d1c6d-2a65-43e7-93c2-688bfe4e1468',
filter: "eventTimestamp ge ''" + std.extVar('startdate') + "''",
}
}
>>>
--
-- StackQL query with jsonnet preprocessing to retrieve all recent Azure activity
--
SELECT
eventTimestamp as event_timestamp,
caller,
JSON_EXTRACT(claims, '$.ipaddr') as ip_address,
subscriptionId,
tenantId,
SPLIT_PART(JSON_EXTRACT(authorization, '$.scope'), '/', 5) as resource_group,
SPLIT_PART(resourceId, '/', -1) as resource_name,
level,
JSON_EXTRACT(authorization, '$.action') as action,
JSON_EXTRACT(category, '$.localizedValue') as category,
JSON_EXTRACT(operationName, '$.localizedValue') as operation,
JSON_EXTRACT(status, '$.localizedValue') as status
FROM azure.monitor.activity_logs
WHERE $filter = '{{ .stackqlQuery.filter }}'
AND subscriptionId = '{{ .stackqlQuery.subscriptionId }}';
Running the query interactively
From a machine that is authenticated to Azure (including the Azure Portal Cloud Shell):
# to run on mac or linux
./stackql shell --var startdate=$(date -u -d '1 day ago' '+%Y-%m-%dT%H:%M:%SZ')
or using PowerShell:
# to run using windows powershell
$startDate = (Get-Date).AddDays(-1).ToString("yyyy-MM-ddTHH:mm:ssZ")
stackql.exe shell --var startdate=$startDate
You should see a tabular output with columns for: event_timestamp
, caller
, ip_address
, subscriptionId
, tenantId
, resource_group
, resource_name
, level
, action
, category
, operation
and status
.
Generating a CSV report
By saving the StackQL query to a file (activity_report.iql
), you can produce a csv
report from the activity query as follows (see --output csv
for more information on stackql
outputs):
# to run on mac or linux
./stackql exec \
--output csv \
-i activity_report.iql \
-f activity_report.csv \
--var startdate=$(date -u -d '1 day ago' '+%Y-%m-%dT%H:%M:%SZ')
or using PowerShell:
# to run using windows powershell
$startDate = (Get-Date).AddDays(-1).ToString("yyyy-MM-ddTHH:mm:ssZ")
stackql.exe exec `
--output csv `
-i activity_report.iql `
-f activity_report.csv `
--var startdate=$startDate
Server-side filtering with KQL
Server-side filtering in Azure is achieved through the use of KQL (Kusto Query Language). By extending the $filter
clause in your query, you can filter data directly on the server. This method is highly efficient as it reduces the amount of data transferred over the network. Below are some examples of server-side filtering (for brevity these queries are using parameters directly instead of using the jsonnet preprocessor):
Specifying a date range for activities
--
-- List events in a time range
--
SELECT
eventTimestamp as event_timestamp,
caller,
JSON_EXTRACT(claims, '$.ipaddr') as ip_address,
subscriptionId,
tenantId,
SPLIT_PART(JSON_EXTRACT(authorization, '$.scope'), '/', 5) as resource_group,
SPLIT_PART(resourceId, '/', -1) as resource_name,
level,
JSON_EXTRACT(authorization, '$.action') as action,
JSON_EXTRACT(category, '$.localizedValue') as category,
JSON_EXTRACT(operationName, '$.localizedValue') as operation,
JSON_EXTRACT(status, '$.localizedValue') as status
FROM azure.monitor.activity_logs
WHERE $filter = 'eventTimestamp ge ''2024-01-24T00:00:00Z'' and eventTimestamp le ''2024-01-25T00:00:00Z'''
AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468';
Getting activities for a resource group
--
-- List events for a resource group
--
SELECT
eventTimestamp as event_timestamp,
caller,
JSON_EXTRACT(claims, '$.ipaddr') as ip_address,
subscriptionId,
tenantId,
SPLIT_PART(resourceId, '/', -1) as resource_name,
level,
JSON_EXTRACT(authorization, '$.action') as action,
JSON_EXTRACT(category, '$.localizedValue') as category,
JSON_EXTRACT(operationName, '$.localizedValue') as operation,
JSON_EXTRACT(status, '$.localizedValue') as status
FROM azure.monitor.activity_logs
WHERE $filter = 'eventTimestamp ge ''2024-01-24T00:00:00Z'' and eventTimestamp le ''2024-01-25T00:00:00Z'' and resourceGroupName eq ''MC_kube-03_aks862a_australiaeast'''
AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468';
Getting activities for a resource type
--
-- List events for a resource provider
--
SELECT
eventTimestamp as event_timestamp,
caller,
JSON_EXTRACT(claims, '$.ipaddr') as ip_address,
subscriptionId,
tenantId,
SPLIT_PART(JSON_EXTRACT(authorization, '$.scope'), '/', 5) as resource_group,
SPLIT_PART(resourceId, '/', -1) as resource_name,
level,
JSON_EXTRACT(authorization, '$.action') as action,
JSON_EXTRACT(category, '$.localizedValue') as category,
JSON_EXTRACT(operationName, '$.localizedValue') as operation,
JSON_EXTRACT(status, '$.localizedValue') as status
FROM azure.monitor.activity_logs
WHERE $filter = 'eventTimestamp ge ''2024-01-24T00:00:00Z'' and eventTimestamp le ''2024-01-25T00:00:00Z'' and resourceProvider eq ''Microsoft.Compute'''
AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468';
Getting activities for a specific resource
Note we added an additional field (correlationId
) which we will use in the next server-side filter...
--
-- List events for resource using the resourceUri
--
SELECT
eventTimestamp as event_timestamp,
caller,
JSON_EXTRACT(claims, '$.ipaddr') as ip_address,
subscriptionId,
tenantId,
SPLIT_PART(resourceId, '/', -1) as resource_name,
correlationId,
level,
JSON_EXTRACT(authorization, '$.action') as action,
JSON_EXTRACT(category, '$.localizedValue') as category,
JSON_EXTRACT(operationName, '$.localizedValue') as operation,
JSON_EXTRACT(status, '$.localizedValue') as status
FROM azure.monitor.activity_logs
WHERE $filter = 'eventTimestamp ge ''2024-01-24T00:00:00Z'' and eventTimestamp le ''2024-01-25T00:00:00Z'' and resourceUri eq ''/subscriptions/631d1c6d-2a65-43e7-93c2-688bfe4e1468/resourceGroups/MC_kube-03_aks862a_australiaeast/providers/Microsoft.Compute/virtualMachines/aks-agentpool-35064155-1'''
AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468';
Getting activities for a specific resource
The correlation ID is a unique identifier used to track and group events related to a particular operation or transaction in Azure. This query helps in pinpointing specific events within a given time frame that are related to a particular operation, making it easier to troubleshoot or understand the sequence of activities. Here is an example using a correlationId
captured in the previous query:
--
-- List events for a correlation Id
--
SELECT
eventTimestamp as event_timestamp,
caller,
JSON_EXTRACT(claims, '$.ipaddr') as ip_address,
subscriptionId,
tenantId,
SPLIT_PART(JSON_EXTRACT(authorization, '$.scope'), '/', 5) as resource_group,
SPLIT_PART(resourceId, '/', -1) as resource_name,
level,
JSON_EXTRACT(authorization, '$.action') as action,
JSON_EXTRACT(category, '$.localizedValue') as category,
JSON_EXTRACT(operationName, '$.localizedValue') as operation,
JSON_EXTRACT(status, '$.localizedValue') as status
FROM azure.monitor.activity_logs
WHERE $filter = 'eventTimestamp ge ''2024-01-24T00:00:00Z'' and eventTimestamp le ''2024-01-25T00:00:00Z'' and correlationId eq ''30b92c6d-4a82-401f-ac73-e28377669539'''
AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468';
Client-side filtering with the WHERE
Clause
Client-side filtering can be done using the traditional SQL WHERE
clause. This method filters the data after it has been retrieved from the server, offering flexibility in data manipulation and analysis within the StackQL environment.
Getting activities for a specific user
In the example query, the WHERE
clause filters records based on the caller
(the user), providing an additional layer of data refinement.
--
-- Filter events for a particular user (caller)
--
SELECT
eventTimestamp as event_timestamp,
caller,
JSON_EXTRACT(claims, '$.ipaddr') as ip_address,
subscriptionId,
tenantId,
SPLIT_PART(JSON_EXTRACT(authorization, '$.scope'), '/', 5) as resource_group,
SPLIT_PART(resourceId, '/', -1) as resource_name,
level,
JSON_EXTRACT(authorization, '$.action') as action,
JSON_EXTRACT(category, '$.localizedValue') as category,
JSON_EXTRACT(operationName, '$.localizedValue') as operation,
JSON_EXTRACT(status, '$.localizedValue') as status
FROM azure.monitor.activity_logs
WHERE $filter = 'eventTimestamp ge ''2024-01-01T00:00:00Z'''
AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468'
AND caller = 'javen@stackql.io';
Running summary/aggregate queries
As StackQL is a complete SQL engine, GROUP BY
/aggregate queries (including AVG
, COUNT
, SUM
, MIN
, MAX
and more are natively available for all providers, services and resources.
Count events by severity level
In this example we will group events from a start datetime by their severity and calculate the counts for each level.
--
-- Summarize events by severity level
--
SELECT
level,
COUNT(*) as num_events
FROM azure.monitor.activity_logs
WHERE $filter = 'eventTimestamp ge ''2024-01-29T00:00:00Z'''
AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468'
GROUP BY level;
/* returns..
|---------------|------------|
| level | num_events |
|---------------|------------|
| Error | 58 |
|---------------|------------|
| Informational | 57 |
|---------------|------------|
| Warning | 57 |
|---------------|------------|
*/
Using JOIN
to enrich activities
SELECT
l.eventTimestamp as event_timestamp,
l.level,
JSON_EXTRACT(l.operationName, '$.localizedValue') as operation,
vm.name,
vm.location,
JSON_EXTRACT(vm.properties, '$.timeCreated') as time_created
FROM azure.monitor.activity_logs l
INNER JOIN azure.compute.virtual_machines vm
ON vm.name = SPLIT_PART(l.resourceId, '/', -1)
WHERE vm.resourceGroupName = 'MC_kube-03_aks862a_australiaeast'
AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468'
AND l.$filter = 'eventTimestamp ge ''2024-01-28T00:00:00Z'' and resourceGroupName eq ''MC_kube-03_aks862a_australiaeast''';
or using variables from jsonnet
, provided as a file using either the --iqldata
or -q
arguments, or as a jsonnet block as shown here:
<<<jsonnet
{
stackqlQuery: {
subscriptionId: '631d1c6d-2a65-43e7-93c2-688bfe4e1468',
resourceGroupName: 'MC_kube-03_aks862a_australiaeast',
}
}
>>>
SELECT
l.eventTimestamp as event_timestamp,
l.level,
JSON_EXTRACT(l.operationName, '$.localizedValue') as operation,
vm.name,
vm.location,
JSON_EXTRACT(vm.properties, '$.timeCreated') as time_created
FROM azure.monitor.activity_logs l
INNER JOIN azure.compute.virtual_machines vm
ON vm.name = SPLIT_PART(l.resourceId, '/', -1)
WHERE vm.resourceGroupName = '{{ .stackqlQuery.resourceGroupName}}'
AND subscriptionId = '{{ .stackqlQuery.subscriptionId}}'
AND l.$filter = 'eventTimestamp ge ''2024-01-28T00:00:00Z'' and resourceGroupName eq ''{{ .stackqlQuery.resourceGroupName}}''';
WHERE
parameters in a JOIN
query which are not disambiguated, such as subscriptionId
in the above case are shared by both resources (azure.monitor.activity_logs
and azure.compute.virtual_machines
in this case).