===== AWS Transit Gateway Appliance Mode =====
{{tag>AWS VyOS TGW}}
==== Description ====
While looking through some API changes for EC2 (https://awsapichanges.info/archive/changes/12caed-ec2.html), I noticed a mention of "Appliance Mode" for the Transit Gateway. I also found the VPC Transit Gateway documentation (https://docs.aws.amazon.com/vpc/latest/tgw/transit-gateway-appliance-scenario.html), which covers this feature and has a very nice summary. I decided to expand on that summary and perform my own testing.
In my testing I used Amazon Linux 2 instances running Apache in the Spoke VPCs and the free community edition of VyOS in the Shared VPC as the Instance Firewall. I used VPC Endpoints for SSM, SSM Messages, and EC2 Messages so that I could use Systems Manager Session Manager to obtain CLI access to the test instances without needing any NAT configuration on the VyOS instances. This allowed me to confirm the traffic flows without worrying about NAT modifying the traffic.
At the time of testing, Appliance Mode on a Transit Gateway VPC attachment can only be modified via the API / CLI. Changing or viewing the Appliance Mode status is not currently available in the AWS web console.
==== Diagram ====
{{ :images:svg:tgw_appliance_mode.svg?1000 | Transit Gateway Appliance Mode }}
\\
https://wiki.nerdydrunk.info/_media/images:svg:tgw_appliance_mode.svg
==== Testing Network ====
Below are the networks and route tables that were used during testing. I used the default Security Group for each VPC, but added a rule to allow all traffic from 10.0.0.0/8. In a production environment the security groups should be more restrictive.
^ ^ Shared ^ Spoke 1 ^ Spoke 2 ^
| **CIDR** | **10.0.0.0/23** | **10.1.0.0/24** | **10.1.1.0/24** |
| Public A | 10.0.0.0/25 | N/A | N/A |
| Public B | 10.0.0.128/25 | N/A | N/A |
| Private A | 10.0.1.0/25 | 10.1.0.0/25 | 10.1.1.0/25 |
| Private B | 10.0.1.128/25 | 10.1.0.128/25 | 10.1.1.128/25 |
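
As a quick sanity check on the addressing above, the following is a minimal sketch using Python's standard ''ipaddress'' module. It confirms that each subnet sits inside its VPC CIDR and that no two subnets in a VPC overlap. The function name ''check_plan'' and the dictionary layout are my own illustration, not part of any AWS tooling.

```python
import ipaddress

# CIDR plan copied from the table above.
VPCS = {
    "Shared": "10.0.0.0/23",
    "Spoke 1": "10.1.0.0/24",
    "Spoke 2": "10.1.1.0/24",
}
SUBNETS = {
    "Shared": ["10.0.0.0/25", "10.0.0.128/25", "10.0.1.0/25", "10.0.1.128/25"],
    "Spoke 1": ["10.1.0.0/25", "10.1.0.128/25"],
    "Spoke 2": ["10.1.1.0/25", "10.1.1.128/25"],
}


def check_plan(vpcs, subnets):
    """Return a list of problems found in the CIDR plan (empty if none)."""
    problems = []
    for name, cidr in vpcs.items():
        vpc_net = ipaddress.ip_network(cidr)
        nets = [ipaddress.ip_network(s) for s in subnets[name]]
        # Every subnet must be contained in its VPC CIDR.
        for net in nets:
            if not net.subnet_of(vpc_net):
                problems.append(f"{net} is outside {name} ({vpc_net})")
        # No two subnets in the same VPC may overlap.
        for i, a in enumerate(nets):
            for b in nets[i + 1:]:
                if a.overlaps(b):
                    problems.append(f"{a} overlaps {b} in {name}")
    return problems


print(check_plan(VPCS, SUBNETS) or "CIDR plan is consistent")
```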
=== Spoke VPC Route Tables ===
^ VPC Spoke 1 Route Table ^^
^ Destination ^ Target ^
| 10.1.0.0/24 | Local |
| 0.0.0.0/0 | TGW |
^ VPC Spoke 2 Route Table ^^
^ Destination ^ Target ^
| 10.1.1.0/24 | Local |
| 0.0.0.0/0 | TGW |
=== Transit Gateway Route Tables ===
^ TGW Spoke Route Table ^^
^ Destination ^ Target ^
| 0.0.0.0/0 | Shared VPC |
^ TGW Shared Route Table ^^
^ Destination ^ Target ^
| 10.1.0.0/24 | Spoke 1 VPC |
| 10.1.1.0/24 | Spoke 2 VPC |
=== Shared VPC Route Tables ===
^ VPC Shared Private A Route Table ^^
^ Destination ^ Target ^
| 10.0.0.0/23 | Local |
| 0.0.0.0/0 | VyOS A |
^ VPC Shared Public A Route Table ^^
^ Destination ^ Target ^
| 10.0.0.0/23 | Local |
| 10.1.0.0/16 | TGW |
| 0.0.0.0/0 | IGW |
^ VPC Shared Private B Route Table ^^
^ Destination ^ Target ^
| 10.0.0.0/23 | Local |
| 0.0.0.0/0 | VyOS B |
^ VPC Shared Public B Route Table ^^
^ Destination ^ Target ^
| 10.0.0.0/23 | Local |
| 10.1.0.0/16 | TGW |
| 0.0.0.0/0 | IGW |
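
VPC route tables pick the most specific matching route for a destination. To illustrate how the Shared Public A route table resolves different destinations, here is a small sketch of a longest-prefix-match lookup. The ''lookup'' helper and the sample addresses are hypothetical and only model the table above; they do not query AWS.

```python
import ipaddress

# Shared VPC Public A route table from above; targets are labels only.
SHARED_PUBLIC_A = [
    ("10.0.0.0/23", "Local"),
    ("10.1.0.0/16", "TGW"),
    ("0.0.0.0/0", "IGW"),
]


def lookup(route_table, destination):
    """Return the target of the most specific route matching destination."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table
        if addr in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # The longest prefix (largest prefix length) wins.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Sample destinations: Shared VPC, Spoke 2, and a documentation-range address.
for dest in ("10.0.1.10", "10.1.1.10", "198.51.100.7"):
    print(dest, "->", lookup(SHARED_PUBLIC_A, dest))
```

Return traffic from VyOS to a spoke instance therefore goes back to the TGW, while everything else leaves through the IGW.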
==== Instance Firewall (VyOS) Configuration ====
  * Set global firewall state policy to drop invalid traffic.
  * NAT rule 100 to exclude private network traffic from NAT.
  * NAT rule 200 to masquerade all other traffic leaving eth0.

  set firewall state-policy invalid action 'drop'
  set nat source rule 100 destination address '10.1.0.0/16'
  set nat source rule 100 exclude
  set nat source rule 100 outbound-interface 'eth0'
  set nat source rule 100 source address '10.1.0.0/16'
  set nat source rule 200 outbound-interface 'eth0'
  set nat source rule 200 translation address 'masquerade'
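
To make the effect of the two NAT rules concrete, here is a simplified model of how I understand them to be evaluated: rules are checked in ascending rule number and the first match wins. This is a sketch of the logic only, not VyOS code, and the function name ''source_nat_action'' is made up for illustration.

```python
import ipaddress

# Spoke address space used by NAT rule 100 above.
SPOKES = ipaddress.ip_network("10.1.0.0/16")


def source_nat_action(src, dst, out_iface):
    """Return the source NAT action for a packet: 'exclude', 'masquerade' or 'none'."""
    if out_iface != "eth0":
        # Both rules only match traffic leaving eth0.
        return "none"
    # Rule 100: spoke-to-spoke traffic is excluded from NAT.
    if ipaddress.ip_address(src) in SPOKES and ipaddress.ip_address(dst) in SPOKES:
        return "exclude"
    # Rule 200: everything else leaving eth0 is masqueraded.
    return "masquerade"


print(source_nat_action("10.1.0.10", "10.1.1.10", "eth0"))     # spoke to spoke
print(source_nat_action("10.1.0.10", "198.51.100.7", "eth0"))  # spoke to internet
```

Spoke-to-spoke traffic keeps its original source address, which is what makes it possible to confirm the traffic flows on the test instances.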
==== Change Transit Gateway Attachment Mode ====
Show TGW attachments and locate the Shared VPC TGW attachment:

  aws ec2 describe-transit-gateway-vpc-attachments

Show info on a single TGW attachment:

  aws ec2 describe-transit-gateway-vpc-attachments \
    --transit-gateway-attachment-ids tgw-attach-11111111111111111

Enable appliance mode for a TGW attachment:

  aws ec2 modify-transit-gateway-vpc-attachment \
    --transit-gateway-attachment-id tgw-attach-11111111111111111 \
    --options ApplianceModeSupport=enable

Disable appliance mode for a TGW attachment:

  aws ec2 modify-transit-gateway-vpc-attachment \
    --transit-gateway-attachment-id tgw-attach-11111111111111111 \
    --options ApplianceModeSupport=disable
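
The same changes can be made through the API. Below is a minimal sketch that assumes a boto3 EC2 client is passed in by the caller. The helper names ''set_appliance_mode'' and ''get_appliance_mode'' are my own, and the attachment ID is the same placeholder used above. I did not use this during testing, so treat it as a starting point rather than a tested procedure.

```python
def set_appliance_mode(ec2, attachment_id, enabled):
    """Enable or disable appliance mode; return the value the API reports back."""
    value = "enable" if enabled else "disable"
    response = ec2.modify_transit_gateway_vpc_attachment(
        TransitGatewayAttachmentId=attachment_id,
        Options={"ApplianceModeSupport": value},
    )
    return response["TransitGatewayVpcAttachment"]["Options"]["ApplianceModeSupport"]


def get_appliance_mode(ec2, attachment_id):
    """Return 'enable' or 'disable' for a single TGW VPC attachment."""
    response = ec2.describe_transit_gateway_vpc_attachments(
        TransitGatewayAttachmentIds=[attachment_id]
    )
    return response["TransitGatewayVpcAttachments"][0]["Options"]["ApplianceModeSupport"]
```

With boto3 installed and credentials configured, calling ''set_appliance_mode(boto3.client("ec2"), "tgw-attach-11111111111111111", True)'' should be equivalent to the enable command above.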
==== Findings ====
With TGW appliance mode disabled and VyOS configured to allow invalid traffic, all instances are able to reach each other, and VyOS traffic monitoring shows traffic on both VyOS instances.
With TGW appliance mode disabled and VyOS configured to drop invalid traffic, only instances within the same availability zone are able to reach each other.
With TGW appliance mode enabled and VyOS configured to drop invalid traffic, all instances are able to reach each other, and VyOS traffic monitoring shows traffic on only a single VyOS instance.
I did find that with appliance mode enabled and invalid traffic dropped, traffic would "flip flop" between the VyOS instances depending on which instance initiated the most recent traffic flow. Traffic from Instance 1 A to Instance 2 B shows on VyOS A. If I immediately initiated traffic from Instance 2 B to Instance 1 A, that traffic would also show on VyOS A. With sufficient time between traffic initiations, traffic would show on the "correct" (expected) VyOS instance.
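
The first three findings can be reproduced with a toy model of the routing decision. This sketch rests on two assumptions: without appliance mode the TGW keeps traffic in the availability zone it arrived from, and with appliance mode a direction-independent hash pins both directions of a flow to one zone. The SHA-256 hash here is only a stand-in; I do not know the actual flow hash the TGW uses. The model does not attempt to reproduce the "flip flop" timing behavior.

```python
import hashlib

FIREWALL_AZS = ("A", "B")


def pick_firewall_az(arrival_az, endpoints, appliance_mode):
    """Return the AZ whose firewall receives a packet from the TGW."""
    if not appliance_mode:
        # Assumed default behavior: stay in the AZ the packet arrived from.
        return arrival_az
    # Assumption: sorting the endpoints makes the hash identical in both directions.
    key = "|".join(sorted(endpoints)).encode()
    return FIREWALL_AZS[hashlib.sha256(key).digest()[0] % len(FIREWALL_AZS)]


def connection_works(client_az, server_az, appliance_mode, drop_invalid):
    """Model one request/reply exchange between two spoke instances."""
    endpoints = (f"spoke1-{client_az}", f"spoke2-{server_az}")
    request_fw = pick_firewall_az(client_az, endpoints, appliance_mode)
    reply_fw = pick_firewall_az(server_az, endpoints, appliance_mode)
    # Only the firewall that saw the request has connection state for the reply.
    reply_has_state = reply_fw == request_fw
    return reply_has_state or not drop_invalid


# The three tested combinations, for every client/server AZ pairing.
for appliance_mode, drop_invalid in ((False, False), (False, True), (True, True)):
    results = {
        f"{c}->{s}": connection_works(c, s, appliance_mode, drop_invalid)
        for c in FIREWALL_AZS
        for s in FIREWALL_AZS
    }
    print(f"appliance_mode={appliance_mode} drop_invalid={drop_invalid}: {results}")
```

In this model the only failing case is cross-zone traffic with appliance mode disabled and invalid traffic dropped, which matches what I saw in testing.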