I grappled with Terraform for the better part of a day trying to provision a GKE Autopilot cluster in a Shared VPC service project. With StackQL, I was able to do it in two minutes. Here's how...
Before starting you will need the following to use GKE Autopilot in your Shared VPC:
control plane IP address range
control plane authorized networks (if desired)
the host network and node subnet you intend to use
pod and services secondary CIDR ranges
(all of the above would typically be pre-provisioned in the Shared VPC design and deployment)
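If you need to look up the subnet and secondary range details, a StackQL query against the host project along these lines can help (a sketch only; the project and region values are placeholders, and I'm assuming the google.compute.subnetworks resource exposes the ipCidrRange and secondaryIpRanges fields):
-- sketch: list subnets in the Shared VPC host project with their primary and secondary ranges
-- 'my-host-project' and 'australia-southeast1' are placeholders
SELECT name, ipCidrRange, secondaryIpRanges
FROM google.compute.subnetworks
WHERE project = 'my-host-project'
AND region = 'australia-southeast1';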
Step 1: Using the GCP console, navigate to your service project, go to Kubernetes Engine --> Clusters --> Create --> GKE Autopilot --> Configure. Enter all of the desired configuration options (including the network configuration specified above). Do not select CREATE.
Step 2: At the bottom of the dialog used to configure the cluster in the console, use the Equivalent REST button to generate the GKE Autopilot API request body.
Step 3: Supply this as input data to a StackQL INSERT command, either via an iql file or as inline configuration. Optionally, you can convert this to Jsonnet and parameterise it for use in other environments.
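For illustration, the resulting INSERT could look roughly like the following. This is a sketch only: the google.container.clusters resource path and the parent/data__cluster column naming are assumptions, the project, location and network names are placeholders, and the request body should be the JSON captured in Step 2.
-- sketch: create the Autopilot cluster using the request body captured in Step 2
-- resource path and column naming are assumptions; all names are placeholders
INSERT INTO google.container.clusters(
 parent,
 data__cluster
)
SELECT
 'projects/my-service-project/locations/australia-southeast1',
 '{"name": "autopilot-cluster-1", "autopilot": {"enabled": true},
   "network": "projects/my-host-project/global/networks/my-shared-vpc",
   "subnetwork": "projects/my-host-project/regions/australia-southeast1/subnetworks/my-node-subnet",
   "ipAllocationPolicy": {"clusterSecondaryRangeName": "pods-range", "servicesSecondaryRangeName": "services-range"}}';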
Jsonnet is a fantastic configuration language as discussed in Using Jsonnet to Configure Multiple Environments. Going slightly beyond the basics, this article is an introduction to anonymous functions and the map and format methods in the Jsonnet standard library.
Similar to map methods in various other functional programming languages or data processing frameworks, map in Jsonnet evaluates a named or anonymous function for each element within an array. map is a higher-order function, meaning it takes another function as an argument. Its signature is shown here:
std.map(func, arr)
The func argument can be a named function or an unnamed (anonymous) function. arr is an input array, which can include embedded dictionaries or other lists as well.
In this example I am templating some config for a NAT gateway in GCP for use in a StackQL routine, where I have a list of external IPs that need to be formatted as Google selfLink URLs. This is a perfect use for the map method, along with the format method, which is similar to printf and equivalent functions found in various other languages. The easiest way to use format is similar to the way you would invoke it in Python:
"%s/%s/%s" % [string1, string2, string3]
Putting it all together in the following practical example, you can see the input Jsonnet in the Jsonnet tab and the templated or rendered output in the Json tab. x represents an element of the extIps array, and the anonymous function returns the fully qualified selfLink URL for each element.
Jsonnet
Json
{
  local project_id = 'myproject-123',
  local region = 'australia-southeast1',
  local self_link_prefix = 'https://compute.googleapis.com/compute/v1/projects/',
  local extIps = [
    { name: 'syd-extip1', region: region },
    { name: 'syd-extip2', region: region },
  ],
  nats: [
    {
      name: 'nat-config',
      natIpAllocateOption: 'MANUAL_ONLY',
      natIps: std.map(
        (function(x) "%s%s/regions/%s/addresses/%s" % [self_link_prefix, project_id, x.region, x.name]),
        extIps
      ),
      sourceSubnetworkIpRangesToNat: 'ALL_SUBNETWORKS_ALL_IP_RANGES',
    },
  ],
}
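To give a sense of how the rendered nats block could then be used, a StackQL INSERT against the Cloud Router resource might look roughly like this. This is a sketch only; the google.compute.routers resource path, the data__ column naming, the templating placeholder and the router and network names are all assumptions.
-- sketch: create a Cloud Router using the nats config rendered from the Jsonnet above
-- '{{ nats }}' is a hypothetical placeholder for the rendered nats array
INSERT INTO google.compute.routers(
 project,
 region,
 data__name,
 data__network,
 data__nats
)
SELECT
 'myproject-123',
 'australia-southeast1',
 'nat-router',
 'projects/myproject-123/global/networks/my-vpc',
 '{{ nats }}';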
BigQuery provides a wealth of metrics and statistics for jobs run against it, whether they are queries, load jobs or export jobs. This article demonstrates some queries you can run using StackQL to bring back live statistics from load operations into BigQuery, as well as detail regarding errors encountered while loading data into BigQuery.
Loading Data into BigQuery from GCS using StackQL
In a previous blog, we demonstrated how to create a BigQuery dataset and how to create a BigQuery table using StackQL INSERT statements. Having created a target dataset and table in BigQuery, we can invoke a load job using StackQL by performing an INSERT into the google.bigquery.jobs resource.
The data for this operation, supplied in Jsonnet format, is shown in the Data tab.
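For illustration, a load job invocation might look something like this (a sketch only; the data__configuration column naming is an assumption, and the dataset, table and bucket names are placeholders):
-- sketch: invoke a BigQuery load job from a GCS object
-- dataset, table and bucket names are placeholders
INSERT INTO google.bigquery.jobs(
 projectId,
 data__configuration
)
SELECT
 'stackql',
 '{"load": {"destinationTable": {"projectId": "stackql", "datasetId": "my_dataset", "tableId": "my_table"},
   "sourceUris": ["gs://my-bucket/my-file.csv"],
   "sourceFormat": "CSV"}}';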
To see the available fields with their data types and descriptions, you can run the following StackQL DESCRIBE statement:
DESCRIBE EXTENDED google.bigquery.jobs;
As you can see from running the above command or looking at the API documentation, there is a state field, which is an enum showing the state of the job. Since we are only concerned with completed jobs, we will filter on jobs with a state of DONE. The errorResult field is an object, but its presence alone indicates that an error has occurred, so we will add another filter to only show results where errorResult is not null.
A simple query to start with is to count the number of errors; this covers all job types (load, extract and query):
StackQL
Results
SELECT COUNT(*) as num_errors
FROM google.bigquery.jobs
WHERE projectId = 'stackql'
AND state = 'DONE'
AND errorResult IS NOT null;
To get a little more information about BigQuery errors, we can run a more detailed query, extracting fields from the errorResult object using the JSON_EXTRACT built-in function. This function is exceptionally useful, as many of the fields returned from Google APIs are complex objects.
StackQL
Results
SELECT id,
JSON_EXTRACT(errorResult, '$.reason') AS errorReason
FROM google.bigquery.jobs
WHERE projectId = 'stackql'
AND state = 'DONE'
AND errorResult IS NOT null;
The previous queries returned all errors across all BigQuery job types. If we want to narrow our query to just BigQuery load operations, we can use the BigQuery JobStatistics object, which includes fields for each job type.
To refine results to only load operations add the following expression to the WHERE clause:
AND JSON_EXTRACT(statistics, '$.load') IS NOT null;
Date values returned in job responses are in Unix timestamp format; to present them in a human-readable form we can use the DATETIME built-in function. Here is a more advanced example:
StackQL
Results
SELECT id,
JSON_EXTRACT(errorResult, '$.message') AS errorMessage,
JSON_EXTRACT(errorResult, '$.reason') AS errorReason,
DATETIME(SUBSTR(JSON_EXTRACT(statistics, '$.creationTime'), 1, 10), 'unixepoch') AS creationTime,
DATETIME(SUBSTR(JSON_EXTRACT(statistics, '$.startTime'), 1, 10), 'unixepoch') AS startTime,
DATETIME(SUBSTR(JSON_EXTRACT(statistics, '$.endTime'), 1, 10), 'unixepoch') AS endTime
FROM google.bigquery.jobs
WHERE projectId = 'stackql'
AND state = 'DONE'
AND errorResult IS NOT null
AND JSON_EXTRACT(statistics, '$.load') IS NOT null;
The Cloud Asset API has recently gone GA. This is an exceptionally useful service that stores the history and inventory of cloud resources in your GCP org. Using the Cloud Asset API via StackQL, you can enumerate all of the services and resources in your GCP org, including billable resources such as Cloud Storage buckets and Compute Engine instances, as well as other objects such as billing accounts, folders, projects, firewalls, service accounts and much more. All of this can be done using SQL!
Let’s start by exploring the available fields in this service:
Use the DESCRIBE or DESCRIBE EXTENDED command to see the fields available in the google.cloudasset.assets resource, as shown here:
StackQL
Response
DESCRIBE EXTENDED google.cloudasset.assets;
| name | type | description |
|------|------|-------------|
| name | string | The full name of the asset. Example: `//compute.googleapis.com/projects/my_project_123/zones/zone1/instances/instance1` See [Resource names](https://cloud.google.com/apis/design/resource_names#full_resource_name) for more information. |
| orgPolicy | array | A representation of an [organization policy](https://cloud.google.com/resource-manager/docs/organization-policy/overview#organization_policy). There can be more than one organization policy with different constraints set on a given resource. |
| servicePerimeter | object | `ServicePerimeter` describes a set of Google Cloud resources which can freely import and export data amongst themselves, but not export outside of the `ServicePerimeter`. If a request with a source within this `ServicePerimeter` has a target outside of the `ServicePerimeter`, the request will be blocked. Otherwise the request is allowed. There are two types of Service Perimeter - Regular and Bridge. Regular Service Perimeters cannot overlap, a single Google Cloud project can only belong to a single regular Service Perimeter. Service Perimeter Bridges can contain only Google Cloud projects as members, a single Google Cloud project may belong to multiple Service Perimeter Bridges. |
| osInventory | object | This API resource represents the available inventory data for a Compute Engine virtual machine (VM) instance at a given point in time. You can use this API resource to determine the inventory data of your VM. For more information, see [Information provided by OS inventory management](https://cloud.google.com/compute/docs/instances/os-inventory-management#data-collected). |
| relatedAssets | object | The detailed related assets with the `relationship_type`. |
| accessPolicy | object | `AccessPolicy` is a container for `AccessLevels` (which define the necessary attributes to use Google Cloud services) and `ServicePerimeters` (which define regions of services able to freely pass data within a perimeter). An access policy is globally visible within an organization, and the restrictions it specifies apply to all projects within an organization. |
| iamPolicy | object | An Identity and Access Management (IAM) policy, which specifies access controls for Google Cloud resources. A `Policy` is a collection of `bindings`. A `binding` binds one or more `members` to a single `role`. Members can be user accounts, service accounts, Google groups, and domains (such as G Suite). A `role` is a named list of permissions; each `role` can be an IAM predefined role or a user-created custom role. For some types of Google Cloud resources, a `binding` can also specify a `condition`, which is a logical expression that allows access to a resource only if the expression evaluates to `true`. A condition can add constraints based on attributes of the request, the resource, or both. To learn which resources support conditions in their IAM policies, see the [IAM documentation](https://cloud.google.com/iam/help/conditions/resource-policies). |
| ancestors | array | The ancestry path of an asset in Google Cloud [resource hierarchy](https://cloud.google.com/resource-manager/docs/cloud-platform-resource-hierarchy), represented as a list of relative resource names. An ancestry path starts with the closest ancestor in the hierarchy and ends at root. If the asset is a project, folder, or organization, the ancestry path starts from the asset itself. Example: `["projects/123456789", "folders/5432", "organizations/1234"]` |
| assetType | string | The type of the asset. Example: `compute.googleapis.com/Disk` See [Supported asset types](https://cloud.google.com/asset-inventory/docs/supported-asset-types) for more information. |
| accessLevel | object | An `AccessLevel` is a label that can be applied to requests to Google Cloud services, along with a list of requirements necessary for the label to be applied. |
| resource | object | A representation of a Google Cloud resource. |
| updateTime | string | The last update timestamp of an asset. update_time is updated when create/update/delete operation is performed. |
As you can see there is some very interesting stuff here, including where the asset fits in the organization hierarchy as well as whether the asset is included in a service perimeter.
To start querying, you just need to supply a root node from which to enumerate assets; this can be at the org, folder or project level.
A simple query to group and count all of the different types of assets in a GCP project is shown here:
StackQL
Response
SELECT assetType, COUNT(*)
FROM google.cloudasset.assets
WHERE parent = 'projects/123123123123'
GROUP BY assetType;
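The same aggregation can be run higher up the resource hierarchy by changing the parent, for example at the organization level (the organization id below is a placeholder):
SELECT assetType, COUNT(*)
FROM google.cloudasset.assets
WHERE parent = 'organizations/123456789012'
GROUP BY assetType;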
It's easy enough for anyone to deploy a Cloud Storage bucket in Google Cloud; this can be done through the console, gcloud, Terraform or StackQL, as shown here: Deploying and Querying GCS Buckets using StackQL. It is also easy to inadvertently allow users to set public ACLs on a bucket, thereby making its contents publicly visible by default. There is an easy way to prevent this from happening, described in Using public access prevention.
Let's work through a real life scenario using StackQL.
Step 1: Run a query to find buckets which do not have public access prevention enforced
Run the following StackQL query from the shell or via exec: