
21 posts tagged with "Amazon Web Services"

Amazon Web Services


AWS DeepRacer

· 9 min read
Chiwai Chan
Tinkerer

This blog details my first experiences with AWS DeepRacer as somebody who knows very little about how AI works under the hood, and who at first didn't fully understand the difference between Supervised Learning, Unsupervised Learning and Reinforcement Learning when writing my first Python code for the "reward_function".

AWS DeepRacer Training

DeepRacer is a Reinforcement Learning based AWS Machine Learning service that provides a quick and fun way to get into ML by letting you build and train a model that can drive a car around a virtual, as well as a physical, race track.

I'm a racing fan in many ways, whether it is watching Formula 1, racing my mates in go-karting or having a hoon on the scooter, so once I found out about the AWS DeepRacer service I always wanted to dip my toes into it. More than 2 years later I found an excuse to play with it while preparing for the AWS Certified Machine Learning Specialty exam: I am expecting a question or two on DeepRacer in the exam, so what better way to learn about DeepRacer than to try it out by cutting some Python code.

Goal

Have a realistic one this time and keep it simple: produce an ML model that can drive a car around a few different virtual tracks for a few laps without going off the track.

My Machine Learning background and relevant experience

  • Statistics, Calculus and Physics: was pretty good at these during high school and did OK in Statistics and Calculus during uni.
  • Python: have been writing some Python code in the past couple of years on and off, mainly in AWS Lambda functions.
  • Machine Learning: none
  • Writing code for mathematics: had a job that involved writing complex mathematical equations and tree-based algorithms in Ruby and Java for about 7 years

Approach

Code a Python Reward Function to return a Reinforcement Reward value based on the state of the DeepRacer vehicle - the reward can be positive for good behaviour, or negative to discourage the agent (vehicle) from a state that is not going to give us a good race pace. The state of the vehicle is a set of key/values, shown below, that is available to the Python Reward Function at runtime for us to use to calculate a reward value.

    # "all_wheels_on_track": Boolean,        # flag to indicate if the agent is on the track
# "x": float, # agent's x-coordinate in meters
# "y": float, # agent's y-coordinate in meters
# "closest_objects": [int, int], # zero-based indices of the two closest objects to the agent's current position of (x, y).
# "closest_waypoints": [int, int], # indices of the two nearest waypoints.
# "distance_from_center": float, # distance in meters from the track center
# "is_crashed": Boolean, # Boolean flag to indicate whether the agent has crashed.
# "is_left_of_center": Boolean, # Flag to indicate if the agent is on the left side to the track center or not.
# "is_offtrack": Boolean, # Boolean flag to indicate whether the agent has gone off track.
# "is_reversed": Boolean, # flag to indicate if the agent is driving clockwise (True) or counter clockwise (False).
# "heading": float, # agent's yaw in degrees
# "objects_distance": [float, ], # list of the objects' distances in meters between 0 and track_length in relation to the starting line.
# "objects_heading": [float, ], # list of the objects' headings in degrees between -180 and 180.
# "objects_left_of_center": [Boolean, ], # list of Boolean flags indicating whether elements' objects are left of the center (True) or not (False).
# "objects_location": [(float, float),], # list of object locations [(x,y), ...].
# "objects_speed": [float, ], # list of the objects' speeds in meters per second.
# "progress": float, # percentage of track completed
# "speed": float, # agent's speed in meters per second (m/s)
# "steering_angle": float, # agent's steering angle in degrees
# "steps": int, # number steps completed
# "track_length": float, # track length in meters.
# "track_width": float, # width of the track
# "waypoints": [(float, float), ] # list of (x,y) as milestones along the track center

Based on this set of key/values we can get a pretty good idea of the state of the vehicle/agent and what it was getting up to on the track.

So using these key/values we calculate and return a value from the Reward Function in Python. For example, if the value for "is_offtrack" is true then the vehicle has come off the track, so we can return a negative reward; we might also want to amplify the negative reward if the vehicle was doing something else it should not be doing - like steering right into a left turn (steering_angle).

Conversely, we return a positive reward value for good behaviour such as steering straight on a long stretch of the track going as fast as possible within the center of the track.
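To make this concrete, here is a minimal sketch of the idea (not the full reward function shown later in this post): penalise an off-track state, and penalise it even harder if the car was also steering sharply at the time. The threshold values are purely illustrative.

def reward_function(params):
    reward = 1.0

    if params["is_offtrack"]:
        # near-zero reward for leaving the track
        reward = 1e-3

        # amplify the penalty if the car was also steering hard at the time,
        # e.g. turning sharply in the wrong direction for the corner
        if abs(params["steering_angle"]) > 15.0:
            reward = 1e-4

    return float(reward)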

My approach to coding the Reward Functions was pretty simple: calculate the reward value based on how I would physically drive on a go-kart track, and factor as much into the calculations as possible - such as how the vehicle is hitting the apex (from the outside of the track or the inside), and whether the vehicle is in a good position to take the next turn or two. For each iteration of the code I train a new model in AWS DeepRacer, watch the video of the simulation to pay attention to what could be improved in the next iteration, then do the whole process all over again.

Within the Reward Function I work out a bunch of sub-rewards such as:

  • steering straight on a long stretch of the track as fast as possible within the center of the track
  • is the vehicle in a good position to take the next turn or two
  • is the vehicle doing something it should not be doing, like steering right into a left turn

These are just some examples of sub-rewards I work out - and the list grows as I iterate and improve (or worsen) each version of the reward function. At the end of each function I calculate the net reward value as the sum of the weighted sub-rewards; each sub-reward can be more important than another, so I've taken a weighted approach to the calculation to allow a sub-reward to amplify the effect it has on the net reward value.

Code

Here is the very first version of the Reward Function I coded:

MAX_SPEED = 4.0


def reward_function(params):
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']
    steering_angle = params['steering_angle']
    speed = params['speed']

    weighted_sub_rewards = []

    half_track_width = track_width / 2.0

    within_percentage_of_center_weight = 1.0
    steering_angle_weight = 1.0
    speed_weight = 0.5
    steering_angle_and_speed_weight = 1.0

    add_weighted_sub_reward(weighted_sub_rewards, "within_percentage_of_center_weight", within_percentage_of_center_weight, get_sub_reward_within_percentage_of_center(distance_from_center, track_width))
    add_weighted_sub_reward(weighted_sub_rewards, "steering_angle_weight", steering_angle_weight, get_sub_reward_steering_angle(steering_angle))
    add_weighted_sub_reward(weighted_sub_rewards, "speed_weight", speed_weight, get_sub_reward_speed(speed))
    add_weighted_sub_reward(weighted_sub_rewards, "steering_angle_and_speed_weight", steering_angle_and_speed_weight, get_sub_reward_steering_angle_and_speed_weight(steering_angle, speed))

    print(weighted_sub_rewards)

    weight_total = 0.0
    numerator = 0.0

    for weighted_sub_reward in weighted_sub_rewards:
        sub_reward = weighted_sub_reward["sub_reward"]
        weight = weighted_sub_reward["weight"]

        weight_total += weight
        numerator += sub_reward * weight

        print("sub numerator", weighted_sub_reward["sub_reward_name"], (sub_reward * weight))

    print(numerator)
    print(weight_total)
    print(numerator / weight_total)

    return numerator / weight_total


def add_weighted_sub_reward(weighted_sub_rewards, sub_reward_name, weight, sub_reward):
    weighted_sub_rewards.append({"sub_reward_name": sub_reward_name, "sub_reward": sub_reward, "weight": weight})


def get_sub_reward_within_percentage_of_center(distance_from_center, track_width):
    half_track_width = track_width / 2.0
    percentage_from_center = (distance_from_center / half_track_width * 100.0)

    if percentage_from_center <= 10.0:
        return 1.0
    elif percentage_from_center <= 20.0:
        return 0.8
    elif percentage_from_center <= 40.0:
        return 0.5
    elif percentage_from_center <= 50.0:
        return 0.4
    elif percentage_from_center <= 70.0:
        return 0.15
    else:
        return 1e-3


# The reward is better if going straight
# steering_angle of -30.0 is max right
def get_sub_reward_steering_angle(steering_angle):
    is_left_turn = True if steering_angle > 0.0 else False
    abs_steering_angle = abs(steering_angle)

    print("abs_steering_angle", abs_steering_angle)

    if abs_steering_angle <= 3.0:
        return 1.0
    elif abs_steering_angle <= 5.0:
        return 0.9
    elif abs_steering_angle <= 8.0:
        return 0.75
    elif abs_steering_angle <= 10.0:
        return 0.7
    elif abs_steering_angle <= 15.0:
        return 0.5
    elif abs_steering_angle <= 23.0:
        return 0.35
    elif abs_steering_angle <= 27.0:
        return 0.2
    else:
        return 1e-3


def get_sub_reward_speed(speed):
    percentage_of_max_speed = speed / MAX_SPEED * 100.0

    print("percentage_of_max_speed", percentage_of_max_speed)

    if percentage_of_max_speed >= 90.0:
        return 0.7
    elif percentage_of_max_speed >= 65.0:
        return 0.8
    elif percentage_of_max_speed >= 50.0:
        return 0.9
    else:
        return 1.0


def get_sub_reward_steering_angle_and_speed_weight(steering_angle, speed):
    abs_steering_angle = abs(steering_angle)
    percentage_of_max_speed = speed / MAX_SPEED * 100.0

    steering_angle_weight = 1.0
    speed_weight = 1.0

    steering_angle_reward = get_sub_reward_steering_angle(steering_angle)
    speed_reward = get_sub_reward_speed(speed)

    return (((steering_angle_reward * steering_angle_weight) + (speed_reward * speed_weight)) / (steering_angle_weight + speed_weight))

Here is a video of one of the simulation runs:

Here is a link to my GitHub repository where I have all the versions of the reward functions I created: https://github.com/chiwaichan/aws-deepracer/tree/main/models

Conclusion

After a few weeks of training and doing about 20 runs, with each run using a different reward function, I did not meet the goal I set out to achieve - getting the agent/vehicle to do 3 laps without coming off the track on a few different tracks. On average each model was only able to race the virtual car around each track for a little over a lap without crashing. At times it felt like I had hit a bit of a wall and could not improve the results, and in some instances the model got worse. I need to take a break from this to think of a better approach; the way I am doing it now is by improving areas without measuring the progress in each area and the amount of improvement made in each.

Next steps

  • Learn how to train a DeepRacer model in SageMaker Studio (outside of the DeepRacer Service) using a Jupyter notebook so I can have more control over how models are trained
  • Learn and perform HyperParameter Optimizations using some of the SageMaker features and services
  • Take a Data and Visualisation driven approach to derive insights into where improvements can be made to the next model iteration
  • Learn to optimise the training, e.g. stop the training early when the model is not performing well
  • Sit the AWS Certified Machine Learning Specialty exam

Breaking Down Monolithic Subnets

· 5 min read
Chiwai Chan
Tinkerer

As my knowledge and experience of Cloud networking grew from designing network architectures over time, and more lately from reviewing client network architectures, I've come to realise and appreciate the need to design a proper network architecture that includes long-term considerations as early as possible - especially before a project begins, and definitely before any resources are deployed into any VPCs.

When I first started in IT I didn't have much of an interest in how a network was configured in an office, or how routing to a publicly accessible on-premise hosted application was set up; this was mainly due to understanding very little about networking, and also because the networking was looked after by somebody else. It was only when I started using Cloud services that I was able to learn networking much more easily, perhaps because I could design, build and play around with my own dedicated isolated network in minutes without worrying about breaking things.

In this blog we will illustrate what a Monolithic Subnet looks like and the problems that come along with it, and illustrate one way to break a Monolithic Subnet down into multiple smaller Subnets - showing how solutions can benefit from designing workloads to leverage dedicated individual Subnets, where each Subnet is for a common resource type or grouping. VPCs are also susceptible to becoming monoliths, so I will write a separate blog about that in the future. Workloads should always be deployed across multiple AZs for high availability, but to make this blog more digestible we will talk about a single-AZ architecture.

We often see systems evolve over time whether they are applications or databases, that get to a point where they are too big to run, maintain or work with: these systems are commonly known as Monolithic Applications or Monolithic Databases.

Networks and the constructs of networking can also be susceptible to becoming a monolith. Early signs and symptoms could be:

  1. CIDR block based rules in Security Groups or NACLs encompass IP addresses of resources that should not be opened up to. A cause of this may be the number of different groups of resources within a Subnet where the IP address of each resource is non-deterministic - it may be difficult to design a minimum set of CIDR block values for rules that satisfy the least privilege principle.
  2. Conversely, CIDR block rules in Security Groups or NACLs with too many granular rules may also be a sign of a Monolithic Subnet - the common symptom is rule quotas being reached too often.

Let’s take the example of 4 groups of compute resources, each group has a different network traffic usage behaviour than the other groups – group #1 communicates with resources in VPC X and VPC Y, while group #3 communicates with resources in VPC Y and VPC Z.

1

This is an example of how the groups of resources could be represented in a Subnet ordered by their Private IP address:

2

Often CIDR block rules that are too broad are used, which opens access to resources that should not be included. The following rules also allow resources in Groups #2 and #4 to communicate with resources in VPC X, Y and Z, when they are not expected to interact with resources in any of those VPCs.

3

Conversely, implementing granular rules to follow the best practice of least privilege may lead to the quotas of Security Groups and NACLs being reached; in any case, least privilege should be followed.

4

Solution

The solution is to break the groups of resources down into a Subnet for each group. There is no hard rule that states a VPC must contain X number of tiers of Subnets - Subnets are used to group similar resources with similar network traffic patterns, so if there are many groupings of resources then it is perfectly fine to create as many tiers of Subnets as needed - one Subnet for each grouping.
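As a rough illustration of the CIDR arithmetic involved (the block sizes below are hypothetical), Python's ipaddress module can show how a single monolithic subnet range could be carved into one smaller subnet per resource group:

import ipaddress

# hypothetical monolithic subnet range to be split into 4 group subnets
monolith = ipaddress.ip_network("10.0.0.0/24")

# one /26 per resource group: 10.0.0.0/26, 10.0.0.64/26, 10.0.0.128/26, 10.0.0.192/26
group_subnets = list(monolith.subnets(new_prefix=26))

for group_number, subnet in enumerate(group_subnets, start=1):
    print(f"Group #{group_number}: {subnet} ({subnet.num_addresses} addresses)")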

5

As a result, rules are more specific and targeted, and it becomes straightforward to implement the least privilege principle.

6

When groups of resources are broken down from a monolith Subnet into multiple Subnets, there are other benefits created as a by-product:

  • With each resource group deployed in a separate dedicated Subnet early on, you will likely reduce or eliminate (a good solution is to not have a problem to begin with) future re-work to combat increased architecture complexity, which may often require re-deploying resources into new Subnets - to me this is unnecessary effort if we can avoid it, especially for resources that require a lot of manual effort to deploy
  • NACL rules are broken down and grouped with their respective resources and Subnet, which leads to fewer rules in each NACL - reducing the possibility of reaching the quotas
  • When all resources are deployed within one Subnet, only Security Groups can be leveraged to implement firewall rules; when resources are broken down into multiple Subnets, NACLs can be leveraged as well
  • Security posture is improved because certain traffic does not enter the Subnet from adjacent Subnets, if NACL rules are implemented appropriately
  • Depending on how granularly you break down your monolithic Subnets - if it is a very fine breakdown - you are setting up your network architecture to implement tighter controls, gearing towards a micro-segmentation network architecture

This solution complements the use of networking solutions in other blogs I have written:

Leveraging AWS Prefix Lists

· 7 min read
Chiwai Chan
Tinkerer

AWS VPC Prefix Lists are a feature of AWS Networking that has been around for a while; however, I have yet to see it leveraged to its full potential, and more often than not I have not seen it used at all.

There are 2 types of Prefix Lists:

  • AWS-managed Prefix Lists: as the name indicates these lists are managed by AWS, and they are used to maintain a set of IP address ranges for AWS services, e.g. S3, DynamoDB and CloudFront.
  • Customer-managed Prefix Lists: these are created and maintained by anyone who has access to the AWS Console, AWS APIs or AWS SDKs. This is what we will be focusing on.

In this blog we will go into:

  • What Customer-managed Prefix Lists are
  • How they can be leveraged by AWS Security Groups
  • How they can be leveraged by AWS Subnet Route Tables
  • How they can be leveraged by AWS Transit Gateway Route Tables
  • Considerations

AWS VPC Customer-managed Prefix List is a great tool to have available as it provides the ability to track and maintain a list of CIDR block values, which can then be referenced by other AWS Networking components in their rules or route tables. Each Prefix List supports either IPv4 or IPv6 based addresses, and a number of expected Max Entries for the list must be defined; the number of entries in the list cannot exceed the Max Entries.

You can use a Prefix List to maintain a list of CIDR blocks of Subnets or VPCs, or to track a list of similar IP addresses based on a grouping of your choice, e.g. EC2 instances with a certain function - you can even track CIDR values of Subnets, VPCs and EC2 instances within the same list.
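As a hedged sketch of what creating one programmatically could look like with boto3 (the name, CIDRs and Max Entries below are purely illustrative):

import boto3

ec2 = boto3.client("ec2")

# create an IPv4 Customer-managed Prefix List with room for up to 10 entries
response = ec2.create_managed_prefix_list(
    PrefixListName="example-admin-hosts",  # hypothetical name
    AddressFamily="IPv4",
    MaxEntries=10,
    Entries=[
        {"Cidr": "10.1.2.3/32", "Description": "admin host 1"},
        {"Cidr": "10.1.2.4/32", "Description": "admin host 2"},
    ],
)

print(response["PrefixList"]["PrefixListId"])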

I have a blog on how to automatically maintain a list of EC2 instances Private IP addresses based on a Tag set against an EC2 instance: Maintain a Prefix List of EC2 Private IP Addresses using EventBridge

Let's create a Prefix List in the AWS Console

1

2

3

Prefix List – Security Group Reference

Customer-managed Prefix Lists are a great option for centrally managing and tracking a list of CIDR blocks allowed to ingress an ENI, by referencing Prefix Lists in Security Groups; a single Prefix List instance can be referenced by one or many Security Groups within the same account or cross-account.

Let's take a look at an example

This is especially useful in scenarios where you have a fleet of EC2 instances where you would like to allow the same network traffic sources to ingress on Port 22 to perform administration tasks; these fleet EC2 instances could be scattered across multiple VPCs, and may even be scattered across multiple AWS accounts.

4

Often we add a new Source CIDR to all Security Groups when we allow a new machine to perform administration tasks against the same fleet of EC2 instances, or remove (or not, when we forget) a Source CIDR when a machine is retired. In the past we would have modified each and every one of these Security Groups.

Here is how we can leverage Customer-managed Prefix Lists with Security Groups:

5

Here, with the same Security Group rules outcome, we externalise the CIDR values into a Prefix List and reference the list in all 3 Security Groups; in the case of Security Groups spanning multiple AWS accounts, the Prefix List can be shared with other AWS accounts using Resource Access Manager (RAM). Now we can allow a new machine to perform administration tasks across the entire fleet of EC2 instances by adding a new Source CIDR in a single location; conversely, we can remove a machine by deleting its Source CIDR. There is also the added benefit of reduced effort in identifying which Security Groups have a rule for an IP address if we were to remove access across the entire fleet - because it is maintained in a single location.
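A minimal boto3 sketch of referencing a Prefix List in a Security Group ingress rule might look like this (the Security Group and Prefix List IDs are placeholders):

import boto3

ec2 = boto3.client("ec2")

# allow SSH ingress from every CIDR maintained in the Prefix List
ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",  # placeholder Security Group ID
    IpPermissions=[
        {
            "IpProtocol": "tcp",
            "FromPort": 22,
            "ToPort": 22,
            "PrefixListIds": [
                {"PrefixListId": "pl-0123456789abcdef0", "Description": "admin hosts"}
            ],
        }
    ],
)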

Prefix List – Subnet Route Table Reference

Another way to use Prefix Lists is to centrally manage and track a list of CIDR block destinations that route traffic out of a Subnet's Route Table to the same Target; a Prefix List can be referenced by one or many Subnet Route Tables within the same account, or cross-account using RAM.

Let's take a look at an example

Below we have a scenario with 3 different Route Tables across two VPCs, with each Route Table having the same Transit Gateway Target for the same set of Destinations, and also the same Destinations routed to the respective Egress-Only Internet Gateway (EIGW) for their VPC.

6

Here is how we can leverage Customer-managed Prefix Lists with Subnet Route Tables:

7

We have externalised the Destination CIDR values of the 3 Route Tables into 2 separate Prefix Lists: the 1st Prefix List contains the CIDR block values of Destinations routed to the EIGW in their respective VPC; the 2nd Prefix List contains the CIDR block values of Destinations routed to the same Transit Gateway instance that all the VPCs are attached to.
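A hedged boto3 sketch of the equivalent route entry, pointing a Prefix List of destinations at a Transit Gateway (all IDs below are placeholders):

import boto3

ec2 = boto3.client("ec2")

# route every destination CIDR in the Prefix List towards the Transit Gateway
ec2.create_route(
    RouteTableId="rtb-0123456789abcdef0",            # placeholder Subnet Route Table ID
    DestinationPrefixListId="pl-0123456789abcdef0",  # placeholder Prefix List ID
    TransitGatewayId="tgw-0123456789abcdef0",        # placeholder Transit Gateway ID
)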

Prefix List – Transit Gateway Route Table Reference

Lastly, in a Transit Gateway Route Table you have the option to either define static routes or have routes dynamically propagated from a Transit Gateway attachment. You also have the option to use a Prefix List for routing.

Here is how we can leverage Customer-managed Prefix Lists with Transit Gateway Route Tables:

8

To reference a Prefix List in a Transit Gateway Route Table, you have to reference it under the "Prefix list references" section:

9
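The same reference could be created with boto3 along these lines (a sketch with placeholder IDs; the attachment is where traffic matching the Prefix List gets sent):

import boto3

ec2 = boto3.client("ec2")

# add a Prefix List reference to a Transit Gateway Route Table,
# sending traffic for those CIDRs to a specific attachment
ec2.create_transit_gateway_prefix_list_reference(
    TransitGatewayRouteTableId="tgw-rtb-0123456789abcdef0",     # placeholder
    PrefixListId="pl-0123456789abcdef0",                        # placeholder
    TransitGatewayAttachmentId="tgw-attach-0123456789abcdef0",  # placeholder
)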

Considerations

  • The aggregated total Max Entries of all Prefix Lists referenced by a resource (e.g. a Security Group) is counted towards the resource's quota - not the aggregated total of actual entries of all the Prefix Lists. Be conscious of the Prefix Lists you reference in a resource: does the resource referencing the Prefix List require all the CIDR values offered in the list? If not, you are not using Prefix Lists economically.
  • If the same Prefix List instance is referenced by multiple AWS resources then consistency is enforced - operational effort is reduced because fewer changes are required, since values don't have to be changed in multiple locations.
  • Before you add or remove a CIDR value from a Prefix List, consider the flow-on impact it may have on the downstream resources that reference this list, as you may inadvertently terminate some traffic flow, or worse, open up traffic to sources you don't intend to.

Conclusion

One of the things I have noticed during my short time in consulting so far is that organising Cloud resources (networking in particular) and structuring them correctly and consistently across multiple environments sets up a solid foundation for organisations in the long term; however, it is often an area that is overlooked and is only paid attention to when the rate of innovation slows down due to complexities and inconsistencies. Prefix Lists are a great option for improving consistency and operational efficiency.

Here I have only detailed the basic use of Customer-managed Prefix Lists, but in my other blog I have a more advanced use case leveraging Prefix Lists: Work-around for cross-account Transit Gateway Security Group Reference

This solution complements the use of networking solutions in other blogs I have written:

Maintain a Prefix List of EC2 Private IP Addresses using EventBridge

· 7 min read
Chiwai Chan
Tinkerer

An AWS VPC customer-managed prefix list is a great feature to have in the toolbox as it provides the ability to track and maintain a list of CIDR block values that can be referenced by other AWS Networking components in their rules and route tables. Each Prefix List supports either IPv4 or IPv6 based addresses, and an expected number of Max Entries for the list must be defined; the number of entries in the list cannot exceed the Max Entries. Check out my blog on AWS Prefix Lists to learn how they can be referenced and leveraged by other AWS Networking components.

In this blog we will:

  • Walk-through the proposed solution
  • Deploy the solution from a SAM project hosted in my GitHub repository
  • Stop the running EC2 instance provisioned by the SAM project's CloudFormation stack - this will de-register the Private IP address of the EC2 instance from the Prefix List (also provisioned by the CloudFormation stack)
  • Start the same EC2 instance - this will register the Private IP address of the provisioned EC2 instance back into the Prefix List
  • Manually create an EC2 instance with a Tag value of "prefix-list=eventbridge-managed-prefix-list"

In this solution we propose an architecture to maintain a list of EC2 Private IPs in a Prefix List by leveraging EventBridge to listen for EC2 Instance State Change Events.

1

Depending on the EC2 Instance State Change value we will perform a different action against the Prefix List using a Lambda Function: if the Instance State is “running" then we register the Private IP address into the Prefix List; or, deregister the Private IP address from the Prefix list when the Instance State is “stopping”.

2

When the event is received by the Lambda function, it performs a lookup on the Tags of the EC2 instance for a Tag (e.g. prefix-list=eventbridge-managed-prefix-list) that indicates which Prefix List (or Lists) the Lambda function should register/de-register the Private IP against. The Prefix List should be maintained economically - because it affects the quotas of resources that reference it, as described in the AWS documentation: Prefix lists concepts and rules - so the Lambda function should ideally set the Prefix List Max Entries to the number of entries expected in the list before an entry is registered, or afterwards if an entry is de-registered.
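For context, the EventBridge rule that feeds the Lambda function matches EC2 instance state-change notifications. The actual rule is created by the SAM project described below, so the following boto3 sketch (with a hypothetical rule name) is only for illustration:

import json
import boto3

events = boto3.client("events")

# match EC2 instance state-change events for the states we care about
events.put_rule(
    Name="ec2-state-change-to-prefix-list",  # hypothetical rule name
    EventPattern=json.dumps({
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Instance State-change Notification"],
        "detail": {"state": ["running", "stopping"]},
    }),
    State="ENABLED",
)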

By maintaining a Prefix List and leveraging this pattern in your solutions, your solutions may potentially benefit in the following ways:

  • Reusability of configurations which will reduce the operational burden and improve consistency.
  • Re-use of Prefix Lists by sharing it with other AWS accounts by leveraging Resource Access Manager
  • Creates an automated mechanism to track and maintain a definitive list of Private IP addresses of similarly grouped EC2 instances with non-deterministic IP addresses
  • High cohesion and low coupling designs: reduce manual flow-on changes when a change is implemented
  • Leverage programmatic mechanisms for automatic changes and maintenance - minimise deployments and/or manual tasks
  • Improve security posture: this may potentially reduce occurrences of overly broad CIDR values used in rules or route tables, where they are used to encompass a small number of IP addresses within a wide IP range

Deploying the solution

Here we will walk-through the steps involved to deploy a SAM project of this solution hosted in my GitHub repository: https://github.com/chiwaichan/prefix-list-of-ec2-private-ip-addresses-using-eventbridge

Prerequisites:

Run the following commands to check out the code

git clone git@github.com:chiwaichan/prefix-list-of-ec2-private-ip-addresses-using-eventbridge.git

cd prefix-list-of-ec2-private-ip-addresses-using-eventbridge/

3

Run the following command to configure the SAM deploy

sam deploy --guided

Enter the following arguments in the prompt:

  • Stack Name: prefix-list-of-ec2-private-ip-addresses-using-eventbridge
  • AWS Region: ap-southeast-2 or the value of your preferred Region
  • Parameter ImageID: ami-0c6120f461d6b39e9 (the Amazon Linux AMI ID in ap-southeast-2), you can use any AMI ID for your Region
  • Parameter SecurityGroupId: the Security Group ID to use for the EC2 instance provisioned, e.g. sg-0123456789
  • Parameter SubnetId: the Subnet ID of the Subnet to deploy the EC2 instance in, e.g. subnet-0123456678

4

5

6

Confirm the deployment

Let's check to see that everything has been deployed correctly in our AWS account.

Here we can see the list of AWS resources deployed in the CloudFormation Stack

7

Here we can see the details of the EC2 instance provisioned in a "Running" state. Take note of the Private IPv4 address.

8

This is the Prefix List provisioned; here we can see the Private IPv4 address of the EC2 instance in the Prefix List entries. Also, note that the Max Entries is currently set to 1.

9

Stopping the running EC2 Instance

Let's stop the EC2 instance

10

We should see the Private IP address of the EC2 instance removed from the Prefix List Entries; the Max Entries remains at 1 - this is because the minimum value must be 1, even when there are no Entries in the Prefix List.

11

This is the snippet of Python code in the Lambda function that removes the Private IP address from the Prefix List:

# if the instance state change is 'stopping' we remove the private IP CIDR from the Prefix List
elif ec2_state == "stopping":
    if is_in_list:
        print("remove")

        response = client.modify_managed_prefix_list(
            PrefixListId=prefix_list_id,
            CurrentVersion=current_prefix_list_version,
            RemoveEntries=[
                {
                    'Cidr': private_id_address + "/32"
                },
            ]
        )

        if len(current_entries) != 1:
            sleep(3)

            response = client.modify_managed_prefix_list(
                PrefixListId=prefix_list_id,
                MaxEntries=len(current_entries) - 1
            )
    else:
        print("not in list so no action")

Starting the stopped EC2 Instance

Let's start the EC2 instance

12

We should see the Private IP address of the EC2 instance added back to the Prefix List Entries. Note the description is different to what it was when we first saw it earlier.

13

This is the snippet of Python code in the Lambda function that adds the Private IP address to the Prefix List:

# if the instance state change is 'running' so we add the private IP CIDR to the Prefix List
if ec2_state == "running":
    if is_in_list:
        print("already in list so no action")
    else:
        print("add")

        if len(current_entries) + 1 != prefix_list["MaxEntries"]:
            response = client.modify_managed_prefix_list(
                PrefixListId=prefix_list_id,
                MaxEntries=len(current_entries) + 1
            )

            sleep(3)

        response = client.modify_managed_prefix_list(
            PrefixListId=prefix_list_id,
            CurrentVersion=current_prefix_list_version,
            AddEntries=[
                {
                    'Cidr': private_id_address + "/32",
                    'Description': 'added by EventBridge Lambda'
                },
            ]
        )

Manually create an EC2 instance with a Prefix List Tag

Let's launch a new EC2 instance (using any AMI, deployed in any Subnet with any Security Group) with a value of "eventbridge-managed-prefix-list" for the "prefix-list" Tag; EventBridge and the Lambda function will register the Private IP address of this newly created instance into the Prefix List "eventbridge-managed-prefix-list".

14

15

16

Here we see the Private IP address of the new manually created EC2 instance appear in the Prefix List Entries; also, the Max Entries has been updated to 2 by the Lambda function.

17

FYI, you can adapt this pattern and Lambda function to add or remove Private IP addresses based on the EC2 instance state change values of your choosing.

Clean up

  • Delete the manually created EC2 instance; afterwards, you can see it removed from the Prefix List and the Prefix List's Max Entries decreased back down to 1 by the Lambda function
  • Delete the CloudFormation stack with the name "prefix-list-of-ec2-private-ip-addresses-using-eventbridge"

This solution complements the use of networking solutions in other blogs I have written:

Work-around for cross-account Transit Gateway Security Group Reference

· 8 min read
Chiwai Chan
Tinkerer

Have you ever tried to create a Security Group with a Source or Destination rule that references another Security Group? How about referencing a Security Group from another AWS account to allow ingress network traffic over a Transit Gateway architecture? If these questions piqued your interest then you should keep reading.

In this blog we will go into:

  • Prerequisites
  • What we like to have
  • What we probably end up doing most of the time
  • What we could do instead using AWS Customer-managed Prefix Lists
  • Considerations

This blog builds on top of the Prefix List patterns I described in this blog: AWS Prefix List, so have a read of it to provide you with a better context as you read on.

What we like to have

How many of us have tried to implement the following architecture but realised it was not technically possible?

1

I certainly tried to implement this myself a couple of years ago but to no avail; recently, a client said they had also tried to implement this very same pattern. As per usual I did a bit of googling and confirmed that it is still not possible today.

What we probably end up doing most of the time

This is probably what most of us do to allow cross-account network traffic to ingress into an EC2 instance over a Transit Gateway architecture.

2

In VPC A, instead of being able to reference a Security Group (outside of AWS account A, so from either account B, C or D) as the Source traffic of an ENI (via Security Group rules) attached to the EC2 instance in VPC A, one of the current methods is to add the CIDR blocks of the source traffic to the Source rules of the Security Group in VPC A. The CIDR value could be the entire VPC CIDR block (of VPC B, C or D) to allow all traffic from a VPC; or a Subnet's CIDR block to narrow down the ingress traffic to flow only from within a sub-section of a source VPC; or the specific Private IP addresses of the source EC2 instances (e.g. 172.20.15.1/32).

The approach you decide on for this pattern depends on the level of security posture you are comfortable with implementing in your network architecture:

  • VPC CIDR block values: this leaves ingress traffic wide open from the entire external source VPC; if you intend for all resources from a source VPC to send traffic to your target resources then this option is fine
  • Subnet CIDR block values: this provides a narrower approach with a slightly tighter level of network security than above; if you intend for all resources from a source Subnet to send traffic to your target resources then this option is fine
  • Specific CIDR values of Private IP addresses: this option provides the tightest network security control of the 3, however maintaining a list of Private IP addresses of EC2 instances outside of your AWS account (whether you or a 3rd party owns the account) will require some operational effort. The solution proposed below provides an automated mechanism to solve this particular problem

Network security controls could be further tightened when coupled with the use of NACLs; have a read of my blog for an example of incorporating NACLs into your network architecture: Swiss Cheese Network Security: Factorising Security Group Rules into NACLs and Security Group Rules

An example scenario that could be problematic for this architecture is when the Source Private IP addresses (for resources outside of the account) need to be constantly added to or removed from the Security Group in VPC A - pet EC2 instances being provisioned and terminated: this is a burden for the operations team, as they would constantly need to update the Security Group rules to reflect changes happening outside of the AWS account. This would not be a problem if we were able to reference Security Groups from other AWS accounts in rules; perhaps one day AWS will have this ability. It is especially burdensome when you have to co-ordinate changes with 3rd party owners of AWS accounts outside of your control - imagine having to maintain changes from a dozen external AWS accounts.

What we could do instead using AWS Customer-managed Prefix Lists

Here we propose a pattern to achieve the same outcome, but instead leveraging Prefix Lists to externalise the management of CIDR blocks in the AWS accounts (B, C and D) where the network traffic originates, then referencing the external Prefix List from each of those accounts in the Security Group rules of account A - with the help of AWS Resource Access Manager (RAM), as the Prefix Lists are shared with AWS account A by accounts B, C and D.
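A minimal sketch of the RAM sharing step from one of the spoke accounts (B, C or D), with placeholder ARNs and account IDs, might look like this in boto3:

import boto3

ram = boto3.client("ram")

# share the spoke account's Prefix List with account A so its Security Groups can reference it
ram.create_resource_share(
    name="share-prefix-list-with-account-a",  # hypothetical share name
    resourceArns=[
        "arn:aws:ec2:ap-southeast-2:222222222222:prefix-list/pl-0123456789abcdef0"  # placeholder
    ],
    principals=["111111111111"],  # placeholder account ID for account A
    allowExternalPrincipals=True,
)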

3

In the diagram above we have 3 options for the CIDR values maintained in these Prefix Lists outside of AWS account A; these types of values are similar to the 3 options described earlier when the rules were defined in the Security Group in VPC A, and the principle of network security controls remains the same in terms of tightness.

This pattern achieves the same outcome as what we desire if Security Groups could be referenced over a Transit Gateway (this is not supported by AWS at the time of writing), but it does have its drawbacks: the Max Entries (not the actual entries) of a Prefix List is counted towards the quota of the Security Group that references it - so the example illustrated above results in 3 rules (1 for each Prefix List for each account) created in the Security Group in VPC A. This pattern has merit when you want to allow inversion of control, enabling external AWS accounts to control what network traffic is allowed to enter with the help of Prefix Lists shared through RAM - remember the control is essentially delegated to the external AWS accounts, so you have to trust that the level of scoping for the CIDR value entries is maintained in those accounts.

Bonus - Extra tight network security controls

The pattern above solves a small to medium sized problem on its own, but if we were to combine it with the patterns detailed in these two blogs: Leveraging AWS Prefix Lists and Maintaining a Prefix List of EC2 Private IP Addresses using AWS EventBridge, we can achieve the following:

4

By combining the 3 patterns we will end up with a network architecture that achieves the following:

  • A work-around for cross-account Security Group reference over a Transit Gateway.
  • List of Private IP addresses of similar EC2 instances (any grouping of your choosing) automatically tracked and managed in a Prefix List within each spoke account based on a Tagged value on the EC2 instances.
  • The same Prefix List in each spoke account can be referenced (via Resource Access Manager) to route return traffic from the Subnet Route Table in VPC A back to the originating Transit Gateway - this could potentially fully automate routing of traffic to the Transit Gateway, which is great for scenarios where you only want return traffic for one or two IP addresses (especially when they are pets) in account B, C or D.
  • The same Prefix List in each spoke account can be referenced (via Resource Access Manager) to route return traffic from the Transit Gateway back to the destined source Transit Gateway Attachment - this could potentially be used to automate routing if static or propagated routing is not used in a Transit Gateway Route Table. We can narrow it down to a very small subset of allowable return-traffic IP blocks for a spoke source traffic attachment - so only a subset of return traffic is allowed back to the originating TGW spoke.
  • Depending on the narrowness of the CIDR values used in the Prefix Lists - e.g. a few distributed /32 IPs in the Prefix List for a source VPC with 1024 addresses in its CIDR block - least privilege for network security is achieved using this pattern, if used effectively.

Considerations

As with any pattern, service or component, the pros and cons need to be weighed against each other and thought out in the interest of the long-term overall benefits for your solution and, most importantly, for your organisation. Restructuring existing networking and migrating workloads into it can be difficult and time consuming - especially if manual steps are required to deploy your infrastructure. Use Prefix Lists economically so that you do not under-consume the Max Entries you have set, by leveraging Lambdas to automatically update the Max Entries; check out my blog on Maintaining a Prefix List of EC2 Private IP Addresses using AWS EventBridge.

This solution complements the use of networking solutions in other blogs I have written:

Swiss Cheese Network Security Factorising Security Group Rules into NACLs and Security Group Rules

· 9 min read
Chiwai Chan
Tinkerer

Introduction

Lately I've been doing some networking configuration reviews for some of the projects I've been put on; to balance out the #crazycatlady blogs, I'll be adding to the pipeline of blogs some posts about network patterns and components that don't often get much attention, or don't get used at all.

Today I'll be talking about Network Access Control List (NACL) and examples of how it could be used; and most importantly why it should be used.

NACLs are firewall rules for your Subnets in the same way Security Groups (SGs) are firewall rules for your ENIs - SGs control what traffic is allowed to enter your ENIs, and NACLs control what network traffic is allowed to enter your Subnet. Think of an onion and its layers: the NACL is the outer layer around your SGs, so if your traffic is blocked by NACL rules (the outer layer) then it will not be able to get into your Subnet, and therefore it is impossible for the traffic to reach your ENIs (the next layer in).

I've only reviewed a small handful of AWS network configurations, but one thing I've noticed is that I've only ever seen the same default single NACL rule used, which allows all network traffic sources on all ports into a Subnet.

Problem

We've reached the maximum allowable limit for rules in a Security Group and attached as many Security Groups to an ENI as we are allowed to.

Short summary of the solution

Reduce the number of rules: incorporating some NACL rules into a network design can reduce the overall number of Security Group rules if used effectively - by pulling firewall rules out into the Subnet layer using NACLs. At the same time it improves security posture, as traffic is checked and blocked before it enters a Subnet, as opposed to traffic being checked and blocked at the resource layer by Security Groups after it has entered a Subnet - this is effectively adopting a defence-in-layers approach.

Example of the problem

problem nacls

We commonly open up All Ports, Protocols and Sources/Destinations into and out of a Subnet using NACLs without leveraging Deny rules.

problem security groups

We commonly apply all firewall rules at the resource's ENI layer via Security Groups - after all of the traffic routed into a Subnet has already been allowed to enter.

problem intersect

The network traffic allowed into an AWS resource depends on the combination of the rules applied to the Subnet's NACL as well as the rules applied at the Security Group layer: the intersection of the 2 rule sets is what allows network traffic to enter an ENI - think of it like the intersection of a Venn Diagram, or the commonly known "Swiss Cheese" model.

venn diagram

venn diagram

This is the net result of network traffic sources and ports allowed to enter an AWS resource by the 2 layers of rules - as you would expect, it is all the rules applied at the Security Group layer. Below we show the equivalent configuration in the AWS Console as depicted by the diagrams above.

problem nacl aws console problem sg aws console

Note, we have 1 Allow rule in the NACL for all Protocols, Ports and Sources, and 9 Security Group rules made up of 3 CIDR blocks, each allowed to enter on the same 3 Ports.

Solution

Here we have a solution that achieves the same outcome as the example described in the problem, but we will achieve it with the use of NACLs.

solution nacls

In the NACL, instead of using a single Allow rule for network traffic for all Protocols, Ports and Sources/Destinations, we have the following 3 rules:

  1. To allow traffic from source 0.0.0.0/0 to enter the Subnet on Port 22
  2. To allow traffic from source 0.0.0.0/0 to enter the Subnet on Port 80
  3. To allow traffic from source 0.0.0.0/0 to enter the Subnet on Port 443

solution security groups

In the Security Group, we have the following rules:

  1. To allow traffic from source 10.0.0.0/8 to hit the ENI on all Ports
  2. To allow traffic from source 172.16.0.0/12 to hit the ENI on all Ports
  3. To allow traffic from source 192.168.0.0/16 to hit the ENI on all Ports

At first glance, when you look at the Security Group rules you may think they are overly permissive because all Ports are opened for the 3 CIDR blocks; however, if we apply the logic of Venn Diagram intersects to the 2 rule sets made up of the NACL and the Security Group, you will realise the net result of traffic Sources and Ports allowed into an ENI is identical to the example in the problem without using NACLs.
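As a hedged sketch, the three NACL allow rules above could be created with boto3 like this (the NACL ID and rule numbers are placeholders):

import boto3

ec2 = boto3.client("ec2")

# allow inbound traffic from anywhere on ports 22, 80 and 443 only
for rule_number, port in [(100, 22), (110, 80), (120, 443)]:
    ec2.create_network_acl_entry(
        NetworkAclId="acl-0123456789abcdef0",  # placeholder NACL ID
        RuleNumber=rule_number,
        Protocol="6",            # TCP
        RuleAction="allow",
        Egress=False,            # ingress rule
        CidrBlock="0.0.0.0/0",
        PortRange={"From": port, "To": port},
    )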

solution intersect

solution intersect result

Here is what the NACL and Security configuration looks like in the AWS Console for the proposed pattern:

solution nacl aws console

solution intersect result

The net result of the 2 rule sets is identical and the traffic allowed to enter an ENI remains the same; but notice that in this pattern we have 3 Allow rules in the NACL and 3 rules in the Security Group (a total of 6, where previously it was 10). In effect, we've reduced the number of rules in the Security Group by a factor of 3 but achieved the same outcome by leveraging NACLs, so this pattern is useful if you constantly find yourself hitting the AWS quota limits for the number of rules in a Security Group, or even hitting the limit for the number of Security Groups attached to an ENI.

Now let's consider a more problematic example where many more Ports are used, spread out with gaps in between, together with many specific CIDR values. Under the current pattern, imagine 60 rules in a Security Group made up of combinations of 6 different Ports and 10 different Sources with the following configuration:

Port   Source
310    10.1.0.1/32
310    10.3.0.1/32
310    10.9.0.1/32
310    172.16.1.0/32
310    172.16.4.0/32
310    172.16.8.0/32
310    192.168.1.1/32
310    192.168.4.1/32
310    192.168.8.1/32
310    192.168.9.1/32
320    10.1.0.1/32
320    10.3.0.1/32
320    10.9.0.1/32
320    172.16.1.0/32
320    172.16.4.0/32
320    172.16.8.0/32
320    192.168.1.1/32
320    192.168.4.1/32
320    192.168.8.1/32
320    192.168.9.1/32
322    10.1.0.1/32
322    10.3.0.1/32
322    10.9.0.1/32
322    172.16.1.0/32
322    172.16.4.0/32
322    172.16.8.0/32
322    192.168.1.1/32
322    192.168.4.1/32
322    192.168.8.1/32
322    192.168.9.1/32
400    10.1.0.1/32
400    10.3.0.1/32
400    10.9.0.1/32
400    172.16.1.0/32
400    172.16.4.0/32
400    172.16.8.0/32
400    192.168.1.1/32
400    192.168.4.1/32
400    192.168.8.1/32
400    192.168.9.1/32
420    10.1.0.1/32
420    10.3.0.1/32
420    10.9.0.1/32
420    172.16.1.0/32
420    172.16.4.0/32
420    172.16.8.0/32
420    192.168.1.1/32
420    192.168.4.1/32
420    192.168.8.1/32
420    192.168.9.1/32
500    10.1.0.1/32
500    10.3.0.1/32
500    10.9.0.1/32
500    172.16.1.0/32
500    172.16.4.0/32
500    172.16.8.0/32
500    192.168.1.1/32
500    192.168.4.1/32
500    192.168.8.1/32
500    192.168.9.1/32

When we split the 60 Security Group rules between a NACL and a Security Group, we get:

Port             Source
ALL or 310-500   10.1.0.1/32
ALL or 310-500   10.3.0.1/32
ALL or 310-500   10.9.0.1/32
ALL or 310-500   172.16.1.0/32
ALL or 310-500   172.16.4.0/32
ALL or 310-500   172.16.8.0/32
ALL or 310-500   192.168.1.1/32
ALL or 310-500   192.168.4.1/32
ALL or 310-500   192.168.8.1/32
ALL or 310-500   192.168.9.1/32

Port   Source
310    0.0.0.0/0
320    0.0.0.0/0
322    0.0.0.0/0
400    0.0.0.0/0
420    0.0.0.0/0
500    0.0.0.0/0

We have gone from 61 rules (60 SG rules + the NACL Allow-all rule) down to 16 rules between the NACL and Security Group - and the net result is identical. I have not stated which of the 2 tables above is for the NACL rules and which is for the Security Group rules; this is because it does not matter which attribute is used to factorise the rules into the NACL - remember the intersect of a Venn Diagram. However, I suggest picking Port or Source depending on which network construct you are most likely to hit the rule limits on - the area you want to leave wiggle room for. If we use table 2 for the Security Group rules then we've effectively reduced those rules by 90%.

To fully take advantage of this pattern, careful consideration needs to happen at the beginning of any VPC and Subnet design with respect to how resources are grouped within a VPC, and especially within a Subnet: too many groupings of dissimilar resources in terms of Source traffic, Protocols and Ports could have the consequence of too many rules - a blog on this is in the pipeline. Of course it is best practice to implement security in all layers, so if there is room left in your Security Groups you should lock down your rules by Ports and Sources as much as you can.

This solution complements the use of networking solutions in other blogs I have written:

Smart Cat Feeder – Part 4

· 5 min read
Chiwai Chan
Tinkerer

This is the Part 4 and final blog of the series where I detail my journey in learning to build an IoT solution.

Please have a read of my previous blogs to get the full context leading up to this point before continuing.

  • Part 1: I talked about setting up a Seeed AWS IoT Button
  • Part 2: I talked about publishing events to an Arduino Micro-controller from AWS
  • Part 3: I talked about my experience of using a 3D Printer for the first time to print a Cat Feeder

Why am I building this Feeder?

I've always wanted to dip my toes into building IoT solutions beyond what a typical tutorial teaches in only turning on LEDs - I wanted to build something that would be used every day. Plus, I often forget to feed the cats while I am away from home (for the day), so it would be nice to come home to a non-grumpy cat by feeding them remotely, any time and from anywhere in the world, using the internet.

What was used to build this Feeder?

  • A 3D Printer using PLA as the filament material.
  • An Arduino based micro-controller - in this case a Seeed Studio XIAO ESP32C3
  • A couple of motors and controllers
  • AWS Services
  • Seeed AWS IoT Button
  • Some code
  • and some cat food

So how does it work and how is it put together?

To describe it simply, the Feeder uses an IoT button click to trigger events over the internet that instruct the feeder to dispense food into one or both food bowls.

cat feeder

Here are some diagrams describing the architecture of the solution - the technical things that happens in-between the IoT button and the Cat Feeder.

architecture diagram seeed sequence diagram

When the Feeder receives an MQTT message from the AWS IoT Core service, it runs a motor for 10 seconds to dispense food into one of the food bowls; if the message contains an event value to dispense food into both bowls, it runs both motors concurrently using the L298N controller.

Here's a video of some timelapse pictures captured during the 3 weeks it took to 3D print the feeder.

The Feeder is made up of a small handful of basic hardware components; below is a Breadboard diagram depicting the components used and how they are all wired up together. A regular 12V 2A DC power adapter is used to power all the components.

breadboard diagram seeed

The code to start and stop a motor is about 10 lines, as shown below. This is the completed version of the Arduino Sketch shown in Part 2 of this blog series, where it was only partially written at the time.

#include "secrets.h"
#include <WiFiClientSecure.h>
#include <MQTTClient.h>
#include <ArduinoJson.h>
#include "WiFi.h"

// The MQTT topics that this device should publish/subscribe
#define AWS_IOT_PUBLISH_TOPIC "cat-feeder/states"
#define AWS_IOT_SUBSCRIBE_TOPIC "cat-feeder/action"

WiFiClientSecure net = WiFiClientSecure();
MQTTClient client = MQTTClient(256);

int motor1pin1 = 32;
int motor1pin2 = 33;
int motor2pin1 = 16;
int motor2pin2 = 17;

void connectAWS()
{
WiFi.mode(WIFI_STA);
WiFi.begin(WIFI_SSID, WIFI_PASSWORD);

Serial.println("Connecting to Wi-Fi");
Serial.println(AWS_IOT_ENDPOINT);

while (WiFi.status() != WL_CONNECTED) {
delay(500);
Serial.print(".");
}

// Configure WiFiClientSecure to use the AWS IoT device credentials
net.setCACert(AWS_CERT_CA);
net.setCertificate(AWS_CERT_CRT);
net.setPrivateKey(AWS_CERT_PRIVATE);

// Connect to the MQTT broker on the AWS endpoint we defined earlier
client.begin(AWS_IOT_ENDPOINT, 8883, net);

// Create a message handler
client.onMessage(messageHandler);

Serial.println("Connecting to AWS IOT");
Serial.println(THINGNAME);

while (!client.connect(THINGNAME)) {
Serial.print(".");
delay(100);
}

if (!client.connected()) {
Serial.println("AWS IoT Timeout!");
return;
}

Serial.println("About to subscribe");
// Subscribe to a topic
client.subscribe(AWS_IOT_SUBSCRIBE_TOPIC);

Serial.println("AWS IoT Connected!");
}

void publishMessage()
{
StaticJsonDocument<200> doc;
doc["time"] = millis();
doc["state_1"] = millis();
doc["state_2"] = 2 * millis();
char jsonBuffer[512];
serializeJson(doc, jsonBuffer); // print to client

client.publish(AWS_IOT_PUBLISH_TOPIC, jsonBuffer);

Serial.println("publishMessage states to AWS IoT" );
}

void messageHandler(String &topic, String &payload) {
Serial.println("incoming: " + topic + " - " + payload);

StaticJsonDocument<200> doc;
deserializeJson(doc, payload);
const char* event = doc["event"];

Serial.println(event);

feedMe(event);
}

void setup() {
Serial.begin(9600);
connectAWS();

pinMode(motor1pin1, OUTPUT);
pinMode(motor1pin2, OUTPUT);
pinMode(motor2pin1, OUTPUT);
pinMode(motor2pin2, OUTPUT);
}

void feedMe(String event) {
Serial.println(event);

bool feedLeft = false;
bool feedRight = false;

if (event == "SINGLE") {
feedLeft = true;
}
if (event == "DOUBLE") {
feedRight = true;
}
if (event == "LONG") {
feedLeft = true;
feedRight = true;
}

if (feedLeft) {
Serial.println("run left");
digitalWrite(motor1pin1, HIGH);
digitalWrite(motor1pin2, LOW);
}

if (feedRight) {
Serial.println("run right");
digitalWrite(motor2pin1, HIGH);
digitalWrite(motor2pin2, LOW);
}

delay(10000);
digitalWrite(motor1pin1, LOW);
digitalWrite(motor1pin2, LOW);
digitalWrite(motor2pin1, LOW);
digitalWrite(motor2pin2, LOW);
delay(2000);

Serial.println("fed");
}

void loop() {
publishMessage();
client.loop();
delay(3000);
}

Demo Time

The Seeed AWS IoT Button is able to detect 3 different types of click events: Long, Single and Double, and we are able to leverage this all the way to the feeder, so it will perform certain actions based on the click event type.

The video below demonstrates the following scenarios:

  • Long Click: this will dispense food into both cat bowls
  • Single Click: this will dispense food into Ebok's cat bowl
  • Double Click: this will dispense food into Queenie's cat bowl

What's next?

Build the nervous system of an ultimate nerd project I have in mind that would allow me to voice-control actions - driving servos, LEDs and audio outputs - using a mesh of Seeed XIAO BLE Sense micro-controllers and TinyML Machine Learning.

Hosting multiple subsites under a serverless website instance

· 7 min read
Chiwai Chan
Tinkerer

Introduction

Recently, I was tasked with coming up with a solution for a single website instance to host various pockets of documentation scattered across a growing number of Git repositories; each repository hosts documentation for a specific subject domain written in Markdown format - you may have come across README.md files all over the internet, which are a classic example of Markdown.

Here is a list of requirements based on what the solution has to solve:

  • Website Hosting: the documentation website must be accessible from anywhere over the public internet. Optionally, we could limit access to a list of whitelisted IPs.
  • Authentication: access is only granted to those that should have it. Federating an IdP is ideal, e.g. Azure AD.
  • Serverless.
  • Host multiple sets of documentation scattered across multiple Azure DevOps Git Repositories.
  • Versioning: store each set of documentation in source control for all its goodness.
  • Format: create the documentation in plain text without having to worry much about styling and formatting. This is where Markdown file format comes in.
  • Pipelines to detect changes to documentation that would in turn trigger builds and deployments.
  • Azure AD Federation for SSO, this is especially useful for organisations with many applications and users so existing credentials can be re-used and managed the same way.

Solution

Serverless Website Hosting Infrastructure

website infrastructure chiwaichan

The Serverless Website Hosting Infrastructure I am about to talk about is built on top of an AWS sample solution found here. I added resources on top of the example to suit our needs.

  • The user visits https://docs.example.co.nz from a browser on any device.
  • CloudFront: We are leveraging this component as the Content Distribution Network for the website, using the standard pattern of serving the CDN using an S3 Bucket.
    • Successful Lambda@Edge Check Auth: Static website content stored in S3 will only be served if the user is authenticated - a valid JWT (JSON Web Token) is found in the request.
    • Unsuccessful Lambda@Edge Check Auth: Return an HTTP 302 in the response to the user's browser to redirect the user to Cognito so the user can sign in.
    • This CloudFront instance is configured with the following settings:
      • Website content is cached for 30 minutes; each expired content file will be retrieved from S3 individually.
      • Configured with the Alternative Domain Name: docs.example.co.nz
      • Configured with an SSL certificate for the sub-domain docs.example.co.nz using the ACM (AWS Certificate Manager) Service; the certificate is free and will be automatically renewed and managed by AWS.
  • Lambda@Edge: Validates every incoming request to CloudFront for the existence of a cookie containing the user's authentication information/JWT (a minimal sketch of this check follows the list below).
    • No authentication information: Respond to CloudFront that the user needs to log in.
    • Contains authentication cookie: Exchange the authentication information for a JWT and store the JWT in the cookies of the HTTP response.
  • S3: This bucket is used as a CloudFront Origin and contains the static content files for the Documentation Website, e.g. HTML/CSS/JS/Images.
  • Amazon Cognito: This is the component used as the entry point for Authentication into the website, we will Federate Azure AD as an IdP using SAML integration - the user will be redirected to Azure AD for authentication.
    • Post back: When Cognito receives a SAML Assertion/Token from Azure AD after a successful login, a profile of that user is saved into the Cognito User Pool by collecting the user attributes (claims) from the SAML Assertion.
  • Azure Active Directory: This solution will Federate Azure AD into Cognito using SAML; I suggest following this walkthrough if you have a requirement for Azure AD Federation: https://aws.amazon.com/blogs/security/how-to-set-up-amazon-cognito-for-federated-authentication-using-azure-ad/
    • Successful authentication: the IdP posts a SAML Assertion/Token back to Cognito.
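To make the Lambda@Edge step more concrete, below is a minimal sketch of the kind of viewer-request check it performs. This is not the implementation from the AWS sample solution (which also verifies the JWT signature and handles the Cognito code/token exchange); the id_token cookie name and the COGNITO_LOGIN_URL value are illustrative assumptions.

# Minimal sketch of a Lambda@Edge viewer-request "Check Auth" handler.
# NOT the AWS sample solution's implementation - it only checks whether a JWT
# cookie exists; the cookie name and login URL below are assumptions.
COGNITO_LOGIN_URL = (
    "https://auth.example.co.nz/login"
    "?client_id=REPLACE&response_type=code&redirect_uri=https://docs.example.co.nz/"
)

def handler(event, context):
    request = event["Records"][0]["cf"]["request"]
    headers = request.get("headers", {})

    # Collect the Cookie header values, if any
    cookies = ""
    if "cookie" in headers:
        cookies = ";".join(h["value"] for h in headers["cookie"])

    if "id_token=" in cookies:
        # Authenticated (the real solution also verifies the JWT): serve from S3
        return request

    # Not authenticated: redirect the browser to the Cognito hosted UI
    return {
        "status": "302",
        "statusDescription": "Found",
        "headers": {
            "location": [{"key": "Location", "value": COGNITO_LOGIN_URL}]
        },
    }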

Set up instructions - Website Infrastructure

  • Create an AWS CloudFormation stack for the Website Hosting Infrastructure from the existing YML file "templates/aws-website-infrastructure.yml" found in this repository. We'll need the Stack's Outputs later on when we create the AWS Pipeline.

website infrastructure chiwaichan cloudformation create website infrastructure chiwaichan cloudformation outputs

Azure DevOps and AWS CodePipeline

deployment pipeline chiwaichan

There are 2 types of pipelines that make up the end-to-end pipeline for this solution: the first is for the Azure side to push Markdown files into AWS, and the other is for AWS to compile the Markdown files and deploy them into S3 where the Website Content is hosted.

In the Azure pipeline, we take the raw documentation (Markdown) from a Git repository hosted in Azure DevOps Git Repositories; each time a set of code changes is pushed into any one of the Git repositories, an Azure Pipeline "Run" is triggered, and the Azure Pipeline uploads the Markdown and asset files to a centralised S3 bucket repository (created by the Website Infrastructure CloudFormation Stack earlier).

Each Azure DevOps repository will host documentation for a specific domain topic; this Pipeline pattern is designed to cater for a growing number of repositories that have a requirement to host all documentation within a single Website instance, so the Azure Pipeline needs to be configured for each Azure DevOps Git Repository. Once the Markdown files are converted to HTML during the CodeBuild stage of the CodePipeline execution, the output files are uploaded to the S3 bucket that is served behind the CloudFront/Website stack.
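To make the upload step concrete, here is an illustrative sketch in Python (not the actual templates/azure-pipeline.yml implementation) of what the Azure Pipeline effectively does: package a repository's Markdown and asset files and push them to the centralised SourceZipBucket. The bucket name, sub-site name and folder layout below are placeholder assumptions.

# Illustrative sketch of the Azure Pipeline's upload step: zip the repo's docs
# and put the archive into the centralised SourceZipBucket. Names are placeholders.
import shutil
import boto3

SOURCE_ZIP_BUCKET = "replace-with-sourcezipbucket-output-value"
SUB_SITE_NAME = "my-repository-docs"   # becomes https://docs.example.co.nz/${sub-site-name}

def upload_docs(docs_dir: str = "docs") -> None:
    # Zip the Markdown and asset files for this repository
    archive_path = shutil.make_archive(SUB_SITE_NAME, "zip", docs_dir)

    # Upload the zip to the bucket watched by the AWS CodePipeline
    s3 = boto3.client("s3")
    s3.upload_file(archive_path, SOURCE_ZIP_BUCKET, f"{SUB_SITE_NAME}.zip")

if __name__ == "__main__":
    upload_docs()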

Set up instructions - Azure & AWS Pipelines

1 Skip this step if the website infrastructure was previously set up for another (the first) set of documentation; in that case, re-use the Access Keys created at that time in the subsequent steps. Create a set of Access Keys for an AWS IAM User with a policy that permits the following actions on the "SourceZipBucket" bucket created by the Website Infrastructure CloudFormation stack earlier:

  • s3:PutObject
  • s3:GetObject
  • s3:DeleteObject
  • s3:ListBucket
#example

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::${REPLACE-WITH-SOURCE-ZIP-BUCKET-NAME}/*",
                "arn:aws:s3:::${REPLACE-WITH-SOURCE-ZIP-BUCKET-NAME}"
            ]
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "s3:ListAllMyBuckets",
            "Resource": "*"
        }
    ]
}

2 Create a new ADO pipeline from the existing YML file "templates/azure-pipeline.yml" in this repository.

azure devops pipeline 1 azure devops pipeline 2 azure devops pipeline 3 azure devops pipeline 4

Use the following as the variables for the Pipeline, keeping the same case:

  • S3-documentation-bucket-name: use the Outputs value of "SourceZipBucket" from the AWS CloudFormation Website Infrastructure Stack created earlier - this is the same S3 bucket name used in the IAM User policy.
  • AWS_ACCESS_KEY_ID: The value of the Access Key ID created earlier.
  • AWS_SECRET_ACCESS_KEY: The value of the Secret Access Key created earlier.
  • AWS_REGION: The region where the SourceZipBucket was created in.
  • sub-site-name: This is the name of the URL path for this set of documentation; it could be the Azure DevOps Repository name for easy reference. E.g. https://docs.example.co.nz/${sub-site-name}

3 Hit Run to start a pipeline execution

4 Skip this Step if you skipped Step 1. Create a CloudFormation stack for the Pipeline that deploys new Documentation, using the CloudFormation YML file "templates/aws-pipeline.yml" in this repository. Use the following as the Parameter values for the Pipeline:

  • SourceBucket: This is the Outputs value of "SourceZipBucket" from the AWS CloudFormation Website Infrastructure Stack created earlier.
  • StaticFilesBucket: This is the Outputs value of "DocumentationS3Bucket" from the AWS CloudFormation Website Infrastructure Stack created earlier.

aws cloudformation pipeline 1

Populate the website skeleton for Docusaurus

The CodeBuild instance in the pipeline runs a set of commands that take the Markdown and asset files and produce, as output, the HTML equivalent of the entire website for all sub-sites. For the CodeBuild instance to run successfully, it expects the skeleton files to be in the root of the "DocumentationS3Bucket" S3 Bucket found in the Outputs of the Website Infrastructure CloudFormation Stack; this is so Docusaurus knows how to render the Markdown files into HTML.

To generate the skeleton files and upload it to the S3 bucket use the following commands on a local machine:

npx create-docusaurus@latest website classic
aws s3 cp website/ s3://${DocumentationS3Bucket} --recursive

Smart Cat Feeder – Part 2

· 8 min read
Chiwai Chan
Tinkerer

seeed studio xiao esp32c3

The source code for this blog can be found in my Github repository: https://github.com/chiwaichan/aws-iot-cat-feeder. This repository only includes the source code for the solution implemented up to this stage/blog in the project.

In the end I decided to go with the Seeed Studio XIAO ESP32C3 implementation of the ESP32 micro-controller for $4.99 (USD). I also ordered some other bits and pieces from AliExpress that are going to take some time to arrive.

In this Part 2 of the blog series I will demonstrate the exchange of messages (JSON payloads) using the MQTT protocol between the ESP32 and the AWS IoT Core Service, as well as the exchange of messages between a Lambda Function and the ESP32 - this Lambda is written in Python and is intended to replace the Lambda triggered by the IoT button event found in Part 1.

Prerequisites if you'd like to try out the solution using the source code

  • An AWS account.
  • An IoT button. Follow Part 1 of this blog series to onboard your IoT button into the AWS IoT 1-Click Service.
  • Create 2 Certificates in the AWS IoT Core Service. One certificate is for the ESP32 to publish and subscribe to Topics in IoT Core, and the other is used by the IoT button's Lambda to publish a message to a Topic subscribed to by the ESP32.

aws iot certificate list

Create a Certificate using the recommended One-Click option.

aws iot certificate create

Download the following files and take note of which device (the ESP32 or the IoT Lambda) you'd like to use this certificate for:

aws iot certificate created

Activate the Certificate.

aws iot certificate activated

Click on Done. Then repeat the steps to create the second Certificate.

Publish ESP32 States to AWS IoT Core

seeed studio xiao esp32c3 aws iot

The diagram above depicts the components required for the ESP32 to send the States of the Cat Feeder. I've yet to decide what to send, but examples could be: 1.) battery level, 2.) cat weight (based on a cat's RFID chip and somehow weighing them while they eat), or 3.) how much food is remaining in the feeder. So many options.

  1. ESP32: This is the micro-controller that will eventually have a bunch of hardware components that we will take States from, then publish to a Topic.
  2. MQTT: This is the lightweight pub/sub protocol used to send IoT messages over TCP/IP to AWS IoT Core.
  3. AWS IoT Core: This is the managed service that receives the State messages published by the ESP32.
  4. IoT Topic: The ESP32 publishes its State messages to a Topic, which downstream consumers can subscribe to.
  5. Do something later on: I'll decide later on what to do downstream with the State values. This could be anything really, e.g. save a time series of the data into a database or a bunch of DynamoDB tables, or get an alert to remind me to charge the Cat Feeder's battery with a customizable threshold? A hypothetical sketch of this option follows the list below.
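A hypothetical sketch of option 5, which I have not built or committed to: a Lambda invoked by an AWS IoT Rule on the State Topic, writing each payload into a DynamoDB table as a time series. The table name, partition key and the IoT Rule wiring are assumptions; the payload keys match the publishMessage() function on the ESP32.

# Hypothetical downstream consumer (nothing has been decided yet): a Lambda
# triggered by an AWS IoT Rule, storing each State payload in DynamoDB.
# "cat-feeder-states" and "device_id" are placeholder assumptions.
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("cat-feeder-states")  # placeholder table name

def handler(event, context):
    # The IoT Rule passes the published JSON payload through as the event
    table.put_item(
        Item={
            "device_id": "cat-feeder-esp32",   # partition key (assumption)
            "time": int(event["time"]),        # sort key: millis() from the ESP32
            "state_1": int(event["state_1"]),
            "state_2": int(event["state_2"]),
        }
    )
    return {"stored": True}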

Instructions to try out the Arduino/ESP32 part of the solution for yourself

  1. Install the Arduino IDE.
  2. Follow this AWS blog on setting up an IoT device, starting from "Installing and configuring the Arduino IDE" up to and including "Configuring and flashing an ESP32 IoT device". Their blog walks us through preparing the Arduino IDE and flashing the ESP32 with a Sketch.
  3. Clone the Arduino source code from my Github repository: https://github.com/chiwaichan/aws-iot-cat-feeder
  4. Go to the "secrets.h" tab and replace the following variables:

arduino secrets

  • WIFI_SSID: This is the name of your Wifi Access Point
  • WIFI_PASSWORD: The password for your Wifi.
  • AWS_IOT_ENDPOINT: This is the regional endpoint of your AWS IoT Core Service.

aws iot endpoint

  • AWS_CERT_CA: The content of the Amazon Root CA 1 file created in the prerequisites for the first certificate.
  • AWS_CERT_CRT: The content of the xxxxx.cert.pem file created in the prerequisites for the first certificate.
  • AWS_CERT_PRIVATE: The content of the xxxxx.private.key file created in the prerequisites for the first certificate.
  5. Flash the code onto the ESP32

arduino flash code

You might need to push a button on the micro-controller during the flashing process, depending on your ESP32 micro-controller.

  6. Check the Arduino console to ensure the ESP32 can connect to AWS IoT and publish messages.

arduino console

  7. Verify the MQTT messages are received by AWS IoT Core

aws iot mqtt test client

Sending a message to the ESP32 when the IoT button is pressed

architecture diagram seeed

The diagram above depicts the components used to send a message to the ESP32 each time the Seeed AWS IoT button is pressed.

  1. AWS IoT button: this is the IoT button I detail in Part 1; it's a physical button that can be anywhere in the world where I can press to feed the fur babies once the final solution is built.
  2. AWS Lambda: This will replace the Lambda from the previous blog with the one shown in the diagram (a minimal sketch of it follows this list).
  3. IoT Topic: The Lambda will publish a message along with the type of button event (One click, long click or double click) to the Topic "cat-feeder/action", the value of the event is subject to what is supported by the IoT button you use.
  4. AWS IoT Core: This is the service that will forward messages to the ESP32 micro-controllers that are subscribed to Topics.
  5. ESP32: We will see details of the button event from each click in the Arduino console once this part is set up.
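Below is a minimal sketch of what the button Lambda described above might look like; the actual implementation lives in the GitHub repository. It pulls the three certificate values out of Secrets Manager (as described in the setup steps below), writes them to /tmp, and publishes the click type to the "cat-feeder/action" Topic using the AWSIoTPythonSDK, which would need to be packaged with the function. The secret names, client id and endpoint are placeholder assumptions, as is the exact shape of the IoT 1-Click event.

# Minimal sketch of the button Lambda - not the repository's implementation.
# Secret names, client id and endpoint are placeholder assumptions.
import json
import boto3
from AWSIoTPythonSDK.MQTTLib import AWSIoTMQTTClient

IOT_ENDPOINT = "REPLACE-WITH-YOUR-ATS-ENDPOINT.iot.ap-southeast-2.amazonaws.com"
SECRET_NAMES = {
    "ca": "cat-feeder-lambda-cert-ca",        # placeholder Secrets Manager names
    "key": "cat-feeder-lambda-cert-private",
    "cert": "cat-feeder-lambda-cert-crt",
}

def _write_secret_to_tmp(secrets_client, name, path):
    # Lambda can only write to /tmp, so dump each PEM secret to a file there
    value = secrets_client.get_secret_value(SecretId=name)["SecretString"]
    with open(path, "w") as f:
        f.write(value)
    return path

def handler(event, context):
    secrets = boto3.client("secretsmanager")
    ca = _write_secret_to_tmp(secrets, SECRET_NAMES["ca"], "/tmp/ca.pem")
    key = _write_secret_to_tmp(secrets, SECRET_NAMES["key"], "/tmp/private.key")
    cert = _write_secret_to_tmp(secrets, SECRET_NAMES["cert"], "/tmp/cert.pem")

    # Assumed shape of the IoT 1-Click event: SINGLE, DOUBLE or LONG
    click_type = event["deviceEvent"]["buttonClicked"]["clickType"]

    client = AWSIoTMQTTClient("cat-feeder-lambda")
    client.configureEndpoint(IOT_ENDPOINT, 8883)
    client.configureCredentials(ca, key, cert)
    client.connect()
    client.publish("cat-feeder/action", json.dumps({"event": click_type}), 1)
    client.disconnect()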

Instructions to set up the AWS IoT button part of the solution

  1. Take the 3 files created for the second Certificate created in the AWS IoT Core Service in the prerequisites, then create 3 AWS Secrets Manager "Other type of secret: Plaintext" values. We need a Secret value for each file. This is to provide the Lambda Function the Certificate it needs to call AWS IoT Core.

aws secrets manager

  2. Get a copy of the AWS code from my Github repository: https://github.com/chiwaichan/aws-iot-cat-feeder

  3. In a terminal, go into the aws folder and run the commands found in the "sam-commands.text" file; be sure to replace the following values in the commands to reflect the values for your AWS account. This will create a CloudFormation Stack of the AWS IoT Services used by this entire solution.

  • YOUR_BUCKET_NAME
  • Value for IoTEndpoint
  • Value for CatFeederThingLambdaCertName, this is the name of the long certificate value found in IoT Core created in the prerequisites for the second certificate.
  • Value for CatFeederThingLambdaSecretNameCertCA, e.g. "cat-feeder-lambda-cert-ca-aaVaa2", check the name in Secrets Manager.
  • Value for CatFeederThingLambdaSecretNameCertCRT
  • Value for CatFeederThingLambdaSecretNameCertPrivate
  • Value for CatFeederThingControllerCertName, this is the name of the long certificate value found in IoT Core created in the prerequisites for the first certificate, which is used by the ESP32.
  • Find the Lambda created in the CloudFormation stack and Test the Lambda to manually trigger the event.
  • If you have set up an IoT 1-Click Button as found in Part 1, you can replace that Lambda with the one created by the CloudFormation Stack. Go to the "AWS IoT 1-Click" Service and edit the "template" for the CatFeeder project.

aws iot one click lambda

  4. Let's press the IoT Button in the following ways:
  • Single Click
  • Double Click
  • Long Click
  5. Verify the button events are received by the ESP32 by going to the Arduino console; you should see something like this:

arduino console aws iot mqtt messages

What's next?

I recently got a Creality3D Ender-3 V2 printer, and there are many known unknowns I need to get up to speed with regarding the fundamentals of 3D printing and all the tools, techniques and software associated with it. I'll attempt to print an enclosure to house the ESP32 controller, the wires, the power supply/battery (if I can source a battery that lasts for more than a month on a single charge) and, most importantly, the dry cat food; I'd like to use some mechanical components to dispense food each time we press the IoT button described in Part 1. I'll talk in depth about the progress made on 3D printing in Part 3.

Smart Cat Feeder – Part 1

· 3 min read
Chiwai Chan
Tinkerer

If you are forgetful like me when it comes to feeding your fur babies, and you often only realise you need to put some dry food into the bowl when you are at work, then you should read this series of blogs. Over time, I'll be designing and building a smart cat feeder using a combination of components such as Arduino micro-controllers, motors, sensors, IoT devices and Cloud services. I'll publish the steps taken in this series of blogs, and I'll also publish any designs and source code as I figure things out and make decisions on aspects of the design.

In this Part 1 of the series, I will walk through setting up an AWS IoT 1-Click device to trigger a Lambda Function. I got myself one of these Seeed IoT buttons for $20; I also bought an NCR18650B battery, which I realised later on is only required if I want to run the device without it being powered by a USB Type-C cable (also used for charging the battery).

seeed iot button for aws

Firstly, make sure you have an AWS account. Then install the AWS IoT 1-Click app onto your phone and log in using your AWS account. With these we will be able to link IoT devices to our AWS account.

aws iot app login

Claim the IoT device with Device ID

aws iot app claim

Scan the barcode on the back of the device; you can scan multiple devices in bulk.

aws iot app scan aws iot app added aws iot app complete claim aws iot app claim wait for click

Next, I'll set up the Wifi on the device so that it can reach the internet from home. I can't see why I couldn't set it up against my phone's AP for feeding on the go; I'll try it out some other time.

aws iot app wifi

Now we'll create a project and add the IoT device to a placement group in the AWS Console. Give a name and description for the project.

aws iot new project

Next, define a template; this is where we create a Lambda function, and all the plumbing between the IoT device and Lambda will be handled for us.

aws iot project template

Next we create a placement for the Iot device.

aws iot project placement

aws iot project new placement

aws iot project placement created

Since I have no Arduino micro-controllers (have yet to buy one), I will get the Lambda to log a message.

aws iot lambda
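As a minimal sketch, a logging-only Lambda might look like the following; the clickType field path reflects my understanding of the IoT 1-Click event shape, so verify it against the event that actually gets logged.

# Minimal sketch of a logging-only Lambda for the IoT 1-Click button.
# The field names below are assumptions about the 1-Click event shape.
import json

def handler(event, context):
    # Log the whole event so we can see exactly what the button sends
    print(json.dumps(event))

    click_type = (
        event.get("deviceEvent", {}).get("buttonClicked", {}).get("clickType", "UNKNOWN")
    )
    print(f"Button click type: {click_type}")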

Push the button on the IoT device, wait for the event LED to turn green after flashing white, then check the logs in CloudWatch Logs.

aws iot lambda logs

At some point I'll have to code the Lambda to perform a real action as each event comes through, instead of just logging to CloudWatch Logs; this will be demonstrated in a following blog in the series.

Within the app on your phone you can see the status of each IoT device, such as the remaining battery life percentage.

aws iot app devices

As well as a history of the button's events.

aws iot app device events

In the next blog, I'll configure the Lambda to push the event to an AWS IoT Core Topic that an ESP32 (I've yet to decide on a specific version of the micro-controller) subscribes to, using the MQTT protocol.