Skip to main content

6 posts tagged with "aws"

View All Tags

· 5 min read

AWS creates default VPCs in each region for convenience. However, these default VPCs often contain noncompliant network ACLs and security group rules that do not align with best practices for AWS Config and Security Hub. Deleting these default VPCs is beneficial, especially for regions not used by your organization or architectures that do not utilize a VPC.  

This guide demonstrates how to use StackQL to enumerate and delete all default VPCs and their associated resources in all AWS regions.  

What you need

All you need to do is to install the pystackql package using:

pip install pystackql

Deleting resources will require a privileged AWS IAM user. You can do this from a terminal with the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables set (and optionally AWS_SESSION_TOKEN set if you are using sts assume-role).  

You can also use the StackQL Cloud Shell Scripts, from within AWS Cloud Shell, to run authenticated StackQL queries using:

sh stackql-aws-cloud-shell.sh

How it works

Default VPCs in AWS regions have a CIDR block of 172.31.0.0/16, before deleting anything, the program ensures it is not in use by using:

SELECT
id
FROM aws.ec2.network_interfaces
WHERE region = '{region}'
AND vpc_id = '{vpc_id}'

If any resources are using the VPC (such as EC2, RDS, ELB/ALB, VPC attached Lambda functions, etc), it will skip these VPCs.  Once determined that the VPC is not in use, all of the resources associated with the VPC must be deleted before the VPC itself can be deleted; this includes:

  • External Routes (aws.ec2.routes)
  • Internet Gateways (aws.ec2.internet_gateways)
  • NACLs (aws.ec2.network_acls)
  • Subnets (aws.ec2.subnets)
  • and then finally, the VPC (aws.ec2.vpcs)

The default route table and default security group are deleted automatically when deleting the VPC

This is done by discovering the resources using StackQL SELECT queries and deleting them using StackQL DELETE queries, like:

DELETE FROM aws.ec2.network_acls
WHERE data__Identifier = '{nacl_id}'
AND region = '{region}'

Complete code

The following Python program uses StackQL to list and delete default VPCs and their associated resources (subnets, route tables, internet gateways, security groups, and network ACLs) across all AWS regions.

Get the complete code here
from pystackql import StackQL
stackql = StackQL()
stackql.executeStmt("REGISTRY PULL aws")

def ensure_one_or_zero(resource_list, resource_name, region):
if len(resource_list) > 1:
raise RuntimeError(f"ah snap! multiple default {resource_name} resources found in {region}")
elif len(resource_list) == 0:
print(f"/* no default {resource_name} found in {region} */\n")
return False
return True

regions = [
'us-east-1', 'us-east-2', 'us-west-1', 'us-west-2',
'eu-central-1', 'eu-central-2', 'eu-west-1', 'eu-west-2', 'eu-west-3', 'eu-north-1', 'eu-south-1',
'ap-east-1', 'ap-south-1', 'ap-south-2', 'ap-northeast-1', 'ap-northeast-2', 'ap-northeast-3',
'ap-southeast-1', 'ap-southeast-2', 'ap-southeast-3', 'ap-southeast-4', 'af-south-1',
'ca-central-1', 'me-south-1', 'me-central-1', 'sa-east-1'
]

for region in regions:
vpc_ids = stackql.execute(
f"""
SELECT
vpc_id
FROM aws.ec2.vpcs
WHERE region = '{region}'
AND cidr_block = '172.31.0.0/16'
AND JSON_ARRAY_LENGTH(tags) = 0
"""
)
if not ensure_one_or_zero(vpc_ids, 'VPC', region): continue
vpc_id = vpc_ids[0]['vpc_id']

# check if vpc is in use
network_interfaces = stackql.execute(
f"""
SELECT
id
FROM aws.ec2.network_interfaces
WHERE region = '{region}'
AND vpc_id = '{vpc_id}'
"""
)
if len(network_interfaces) > 0:
print(f"/* skipping deletion of default VPC ({vpc_id}) in {region} because it is in use */\n")
continue

print(f"/* deleting resources for default VPC ({vpc_id}) in {region} */\n")

# get route table
route_table_ids = stackql.execute(
f"""
SELECT
route_table_id
FROM aws.ec2.route_tables
WHERE region = '{region}'
AND vpc_id = '{vpc_id}'
"""
)
ensure_one_or_zero(route_table_ids, 'route table', region)
route_table_id = route_table_ids[0]['route_table_id']

# get inet gateway id
inet_gateway_ids = stackql.execute(
f"""
SELECT gateway_id
FROM aws.ec2.routes
WHERE data__Identifier = '{route_table_id}|0.0.0.0/0'
AND region = '{region}'
"""
)
ensure_one_or_zero(inet_gateway_ids, 'internet gateway', region)
inet_gateway_id = inet_gateway_ids[0]['gateway_id']

# delete routes
print(f"/* deleting default VPC routes in route table ({route_table_id}) in {region} */")
print(f"""
DELETE FROM aws.ec2.routes
WHERE data__Identifier = '{route_table_id}|0.0.0.0/0'
AND region = '{region}';
""")

# detatch inet gateway
print(f"/* detaching default VPC internet gateway ({inet_gateway_id}) in {region} */")
print(f"""
DELETE FROM aws.ec2.vpc_gateway_attachments
WHERE data__Identifier = 'IGW|{vpc_id}'
AND region = '{region}';
""")

# delete inet gateway
print(f"/* deleting default VPC internet gateway ({inet_gateway_id}) in {region} */")
print(f"""
DELETE FROM aws.ec2.internet_gateways
WHERE data__Identifier = '{inet_gateway_id}'
AND region = '{region}';
""")

# delete nacl
nacl_ids = stackql.execute(
f"""
SELECT
id
FROM aws.ec2.network_acls
WHERE vpc_id = '{vpc_id}'
AND region = '{region}'
"""
)
ensure_one_or_zero(nacl_ids, 'network acl', region)
nacl_id = nacl_ids[0]['id']
print(f"/* deleting default VPC NACL ({nacl_id}) in {region} */")
print(f"""
DELETE FROM aws.ec2.network_acls
WHERE data__Identifier = '{nacl_id}'
AND region = '{region}';
""")

# delete subnets
subnet_ids = stackql.execute(
f"""
SELECT
subnet_id
FROM aws.ec2.subnets
WHERE vpc_id = '{vpc_id}'
AND region = '{region}'
"""
)
for subnet_id in subnet_ids:
print(f"/* deleting default VPC subnet ({subnet_id['subnet_id']}) in {region} */")
print(f"""
DELETE FROM aws.ec2.subnets
WHERE data__Identifier = '{subnet_id['subnet_id']}'
AND region = '{region}';
""")

# delete vpc
print(f"/* deleting default VPC ({vpc_id}) in {region} */")
print(f"""
DELETE FROM aws.ec2.vpcs
WHERE data__Identifier = '{vpc_id}'
AND region = '{region}';
""")

run the program using:

python3 delete-all-default-vpcs.py > delete_default_vpcs.iql

to generate a StackQL script you can run as a batch using stackql exec or as individual statements in the stackql shell.

Conclusion

Deleting default VPCs can help improve your AWS security posture by removing potentially noncompliant network configurations. This program leverages StackQL to automate the process, ensuring consistency across all regions.

Let us know what you think! ⭐ us on GitHub.

· One min read

StackQL allows you to query and interact with your cloud and SaaS assets using a simple SQL framework. Use cases include CSPM, asset inventory and analysis, finops and more, as well as IaC and sysops (lifecycle management).

Using stackql and the aws provider (AWS Cloud Control provider for stackql), here's how you can query your entire AWS estate in real time (globally) and generate a simple report like this...

aws-inventory-example

Check out the code at AWS Global Inventory!

Visit us and give us a ⭐ on GitHub

· 3 min read

StackQL allows you to query and interact with your cloud and SaaS assets using a simple SQL framework. Use cases include CSPM, asset inventory and analysis, finops and more, as well as our IaC and ops (lifecycle management).

The three major cloud providers all offer a built-in Linux shell for executing commands using their respective CLIs; in some cases, they come with tools like terraform pre-installed. They are pre-authorized with your credentials in the cloud console for the user you authenticated with.

Now you can easily use stackql - a unified analytics and IaC dev tool - in all major cloud providers' built-in shells, using cloud shell scripts packaged with the stackql Linux binary (available from v0.5.587 onwards).

StackQL is particularly useful for asynchronously querying across regions in AWS, projects in Google, or resource groups in Azure, which is challenging to do via the CLIs. For example:

SELECT region, COUNT(*) as num_functions
FROM aws.lambda.functions
WHERE region IN (
'us-east-1','us-east-2','us-west-1','us-west-2',
'ap-south-1','ap-northeast-3','ap-northeast-2',
'ap-southeast-1','ap-southeast-2','ap-northeast-1',
'ca-central-1','eu-central-1','eu-west-1',
'eu-west-2','eu-west-3','eu-north-1','sa-east-1')
GROUP BY region;

Additionally, you could authenticate to another provider from one cloud shell simultaneously and run multi-cloud inventory commands. For example:

SELECT 
name,
SPLIT_PART(machineType, '/', -1) as instance_type,
'google' as provider
FROM google.compute.instances
WHERE project IN ('myproject1','myproject2')
UNION
SELECT
instanceId as name,
instanceType as instance_type,
'aws' as provider
FROM aws.ec2.instances
WHERE region IN (
'us-east-1','us-east-2','us-west-1','us-west-2',
'ap-south-1','ap-northeast-3','ap-northeast-2',
'ap-southeast-1','ap-southeast-2','ap-northeast-1',
'ca-central-1','eu-central-1','eu-west-1',
'eu-west-2','eu-west-3','eu-north-1','sa-east-1');

Getting Started

To get started with StackQL in your preferred cloud shell environment, download the StackQL package using the following command:

curl -L https://bit.ly/stackql-zip -O \
&& unzip stackql-zip

This command downloads the StackQL package, unzips it, and sets the appropriate permissions. From there, you can use our tailored scripts for AWS, Google Cloud, or Azure to integrate StackQL seamlessly into your cloud shell environment.

Using StackQL in the AWS Cloud Shell

Run the stackql-aws-cloud-shell.sh as follows to use the StackQL command shell within the AWS cloud shell:

sh stackql-aws-cloud-shell.sh

An example is shown here:

aws-cloud-shell-example

You can also run stackql exec commands using the stackql-aws-cloud-shell.sh script; for instance, this command will write a CSV file for the results of a query that could be downloaded from the Cloud Shell.

sh stackql-aws-cloud-shell.sh exec \
--output csv --outfile instances.csv \
"SELECT region, instanceType FROM aws.ec2.instances WHERE region IN ('us-east-1')"

Additionally, you can supply an IAM role using the --role-arn argument to assume another role for your query or mutation operation, an example is shown here:

sh stackql-aws-cloud-shell.sh \
--role-arn arn:aws:iam::824532806693:role/SecurityReviewerRole exec \
--infile query.iql \
--output csv --outfile output.csv

Using StackQL in the Azure Cloud Shell

Run the stackql-azure-cloud-shell.sh as follows to open a StackQL command shell from the Azure Cloud Shell:

sh stackql-azure-cloud-shell.sh

An example is shown here:

azure-cloud-shell-example

Similar to the AWS script, you can also invoke stackql exec as well, an example is shown here:

sh stackql-azure-cloud-shell.sh exec \
--output csv --outfile instances_by_location.csv \
"SELECT location, COUNT(*) as num_instances FROM azure.compute.virtual_machines WHERE resourceGroupName = 'stackql-ops-cicd-dev-01' AND subscriptionId = '631d1c6d-2a65-43e7-93c2-688bfe4e1468' GROUP BY location"

Using StackQL in the Google Cloud Shell

Run the stackql-google-cloud-shell.sh as shown below to launch a StackQL command shell from within the google cloud shell:

sh stackql-google-cloud-shell.sh

An example is shown here:

google-cloud-shell-example

As with the other two providers, you can run exec commands following the example below:

sh stackql-google-cloud-shell.sh exec \
--output csv --outfile instances.csv \
"SELECT name, status FROM google.compute.instances WHERE project = 'stackql-demo'"

Please give us your feedback! Star us at github.com/stackql.

· 3 min read
info

stackql is a dev tool that allows you to query and manage cloud and SaaS resources using SQL, which developers and analysts can use for CSPM, assurance, user access management reporting, IaC, XOps and more.

Most AWS services and resources are regionally scoped, meaning the UI, CLI, SDKs, and all other methods of querying the aws provider give you a regional view (us-east-1 or ap-southeast-2, for instance). Many customer AWS estates span multiple regions - for multinational organizations, for example, or organizations with numerous dispersed locations within the US.

Sure, you could write custom scripts wrapping the CLI or SDKs - which would require development effort (not reusable for other providers); or get an abstract view with tools like AWS Config or Systems Manager, which requires these services to be enabled and configured (not flexible and not extendible to other providers). In either case:

  1. You can't write and run customized queries and generate custom reports - as you can do in SQL
  2. Any solutions you build will have to be rebuilt entirely for other providers

Using the latest (AWS provider for StackQL - which leverages the AWS Cloud Control API) and the executeQueriesAsync method in the pystackql Python package, I've put together an example here which runs a query to bring back attributes from all AWS Lambda functions deployed across 17 different AWS regions asynchronously. Results can be returned as a list of Python dictionaries or a Pandas dataframe. I am doing the former here, which took less than 10s.

from pystackql import StackQL
from pprint import pprint
from asyncio import run
stackql = StackQL()
stackql.executeStmt("REGISTRY PULL aws") # not required if the aws provider is already installed

async def stackql_async_queries(queries):
return await stackql.executeQueriesAsync(queries)

regions= ["us-east-1","us-east-2","us-west-1","us-west-2","ap-south-1","ap-northeast-3","ap-northeast-2","ap-southeast-1",
"ap-southeast-2","ap-northeast-1","ca-central-1","eu-central-1","eu-west-1","eu-west-2","eu-west-3","eu-north-1",
"sa-east-1"]

# list functions from all regions asynchronously
get_fns = [
f"""
SELECT *
FROM aws.lambda.functions
WHERE region = '{region}'
"""
for region in regions
]

functions = run(stackql_async_queries(get_fns))

# get function details for all functions across all regions asynchronously
get_fn_details = [
f"""
SELECT
function_name,
region,
arn,
description,
architectures,
memory_size,
runtime
FROM aws.lambda.function
WHERE region = '{function['region']}'
AND data__Identifier = '{function['function_name']}'
"""
for function in functions
]

function_details = run(stackql_async_queries(get_fn_details))
pprint(function_details)

which returns...

[{'architectures': '["x86_64"]',
'arn': 'arn:aws:lambda:us-east-1:824532806693:function:stackql-helloworld-fn',
'description': '',
'function_name': 'stackql-helloworld-fn',
'memory_size': '128',
'region': 'us-east-1',
'runtime': 'nodejs18.x'},
{'architectures': '["x86_64"]',
'arn': 'arn:aws:lambda:us-east-2:824532806693:function:stackql-helloworld-fn',
'description': '',
'function_name': 'stackql-helloworld-fn',
'memory_size': '128',
'region': 'us-east-2',
'runtime': 'nodejs18.x'},
{'architectures': '["x86_64"]',
'arn': 'arn:aws:lambda:us-west-1:824532806693:function:stackql-helloworld-fn',
'description': '',
'function_name': 'stackql-helloworld-fn',
'memory_size': '128',
'region': 'us-west-1',
'runtime': 'nodejs18.x'},
...

You could customize the StackQL query to run specific reports and visualize the results in a Jupyter notebook, for example:

  • Functions by runtimes
  • Function by memory size
  • Functions by tags
  • etc...

You could do something similar for other hyperscalars, for example, GCP, which scopes resources by projects, or Azure, which scopes resources by resource groups.

Let us know your thoughts! Visit us and give us a ⭐ on GitHub

· 6 min read

This exercise will show you how to run a real-time query across your AWS and Google cloud environments. You may do this for inventory analysis, security analysis, or any other reason you can think of. We will use stackql to query the state of your cloud resources across your AWS and Google environments. You can also use stackql to provision, de-provision or manage resources across different cloud and SaaS providers.

The steps we will take are:

  1. Prepare your environment for stackql usage.
  2. Use stackql to provision some resources in cloud. optional
  3. Use stackql to query resources present in the cloud.
  4. Use stackql to tear down resources created in step (2), if any. Important: you must destroy any resources created through this exercise, or you will incur ongoing charges.

Preparation

For this exercise, credentials with privileges against google and aws are required. It is outside the scope of this document to go into great detail on the various topics and options relevant to this. Instead, the below steps provide both: (i) reference to vendor documentation and (ii) suggestions for workarounds to get yourself going.

for old hands

All the materials required for this exercise are:

  1. A current stackql executable.
  2. A Google Service Account Key JSON file, where the corresponding Service Account possesses permissions sufficient to create, interrogate and delete compute block storage.
  3. AWS credentials stored in the traditional AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, where the corresponding Service Account possesses permissions sufficient to create, interrogate and delete ec2 block storage.

step by step

First, please do the following:

  1. Download and install stackql from our website.
  2. For google:
    • (i) Create and download a Google Service Account Key as per Google documentation. Remember the location of your key file.
    • (ii) You will need to grant the Service Account at least read, list, create, and delete privileges. For more information about google iam and Service Accounts in particular, please consult the documentation. For this exercise, grant your service account the roles/compute.storageAdmin role would be adequate.
  3. For AWS:
    • (i) Create and download AWS user credentials as per AWS documentation. We will require long-lived credentials. In keeping with vendor advice, we strongly recommend against using root user credentials. We have created a dedicated CICD user for this exercise.
    • (ii) Set up the AWS CLI environment variables as per the documentation.
    • (iii) The user will need create / read / delete privileges against ec2 volumes. This can be done though the AWS IAM console in various ways. For example, one can use groups and permission policies. Adding your user to a group with AmazonEC2FullAccess will certainly work, although lesser privileges may be adequate.

Then, create some shell variables:

# you will need to edit the file path as appropriate

GOOGLE_DOWNLOADED_KEY_FILE_PATH="/path/to/your/downloaded/key.json"

AWS_AUTH_FRAGMENT='{ "type": "aws_signing_v4", "credentialsenvvar": "AWS_SECRET_ACCESS_KEY", "keyIDenvvar": "AWS_ACCESS_KEY_ID" }'

GOOGLE_AUTH_FRAGMENT='{ "credentialsfilepath": "'"${GOOGLE_DOWNLOADED_KEY_FILE_PATH}"'", "type": "service_account" }'

export STACKQL_AUTH_CTX='{ "aws": '"${AWS_AUTH_FRAGMENT}"', "google": '"${GOOGLE_AUTH_FRAGMENT}"' }'
Setting up Provider Auth in PowerShell
$GOOGLE_DOWNLOADED_KEY_FILE_PATH = "C:\path\to\your\downloaded\key.json"

$AWS_AUTH_FRAGMENT = '{ "type": "aws_signing_v4", "credentialsenvvar": "AWS_SECRET_ACCESS_KEY", "keyIDenvvar": "AWS_ACCESS_KEY_ID" }'

$GOOGLE_AUTH_FRAGMENT = '{ "credentialsfilepath": "' + $GOOGLE_DOWNLOADED_KEY_FILE_PATH + '", "type": "service_account" }'

$env:STACKQL_AUTH_CTX = '{ "aws": ' + $AWS_AUTH_FRAGMENT + ', "google": ' + $GOOGLE_AUTH_FRAGMENT + ' }'

Start a stackql shell session

To start an interactive shell session, in the same shell you setup your envrioment variables, run:

stackql --auth="${STACKQL_AUTH_CTX}" shell

You can exit at any time with ctrl + C.

Setup and meta queries to get started

StackQL providers are installed from the StackQL Provider Registry using the REGISTRY command. StackQL supports meta queries such as SHOW and DESCRIBE which can be used to explore the available services, resources, fields, and operations available in a given cloud or SaaS provider.

-- see available providers
registry pull list;

-- pull the required providers
registry pull google;

registry pull aws;

-- some the installed providers
show providers;

-- some meta queries
show services in google;

show resources in google.compute;

describe google.compute.disks;

Create block storage (optional)

You will need to replace the items in <ANGLE_BRACKETS>.

-- create a google volume, await and verify creation completes successfully
insert /*+ AWAIT */ into google.compute.disks(
project,
zone,
data__name,
data__sizeGb
)
select
'<YOUR_GCP_PROJECT>',
'australia-southeast1-a',
'my-stackql-demo-disk-01',
'10' ;

-- create an aws volume, operation despatched on a BEST EFFORT basis
insert into aws.ec2.volumes(
AvailabilityZone,
Size,
region)
select
'ap-southeast-2a',
10,
'ap-southeast-2';

Interrogate cloud block storage


-- query one resource from google
select
name,
split_part(split_part(type, '/', 11), '-', 2) as type,
status,
sizeGb as size
from google.compute.disks
where project = '<YOUR_GCP_PROJECT>'
and zone = 'australia-southeast1-a';

-- query the equivalent from aws
select
volumeId as name,
volumeType as type,
status,
size
from aws.ec2.volumes
where region = 'ap-southeast-2';

-- union the equivalent resources across clouds
select
'google' as vendor,
name,
split_part(split_part(type, '/', 11), '-', 2) as type,
status,
sizeGb as size
from google.compute.disks
where project = '<YOUR_GCP_PROJECT>'
and zone = 'australia-southeast1-a'
union
select
'aws' as vendor,
volumeId as name,
volumeType as type,
status,
size
from aws.ec2.volumes
where region = 'ap-southeast-2';

-- create a view for convenience
create view dual_cloud_block_storage as
select
'google' as vendor,
name,
split_part(split_part(type, '/', 11), '-', 2) as type,
status,
sizeGb as size
from google.compute.disks
where project = '<YOUR_GCP_PROJECT>'
and zone = 'australia-southeast1-a'
union
select
'aws' as vendor,
volumeId as name,
volumeType as type,
status,
size
from aws.ec2.volumes
where region = 'ap-southeast-2';

-- select from the newly created view, with ordering
select * from dual_cloud_block_storage order by name desc;

Delete block storage (if required)

This will only work if the disks are deletable. For example, aws.ec2.volumes must have status = available; you can check this with the view we created above.

/* delete a google volume, await and verify creation completes successfully.
One at a time only... */
delete /*+ AWAIT */ from google.compute.disks
where project = '<YOUR_GCP_PROJECT>'
and zone = 'australia-southeast1-a'
and disk = 'my-stackql-demo-disk-01';

-- delete an aws volume, operation despatched on a BEST EFFORT basis
delete from aws.ec2.volumes
where VolumeId = 'vol-049ee07b31aff451a'
and region = 'ap-southeast-2';

Verify the cleanup was successful

select * from dual_cloud_block_storage order by name desc;

That's it for the scripted demo!

Get involved

We Need Your Help!

if you find bugs, want features, have tech questions then go to github.com/stackql/stackql/issues and raise the appropriate issue 🙏