Troubleshooting EC2 Worker Pool Instances That Continuously Terminate
Last updated: March 13, 2025
When running worker pools on AWS EC2 instances, you may need to access the instances for troubleshooting. Our AWS EC2 Terraform module includes SSM (AWS Systems Manager) access by default, which provides a secure way to access instances without requiring SSH keys.
Accessing Worker Pool Instance Logs
To access and troubleshoot a worker pool instance that isn't connecting properly:
Wait for the Auto Scaling Group (ASG) to launch a new instance
Navigate to the ASG in your AWS Console and select the "Instances" tab
Detach the instance you want to investigate from the ASG
Note: The instance will still shut down but won't terminate since it's no longer managed by the ASG
Restart the detached instance
Use AWS Systems Manager Session Manager to connect to the instance and view logs
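The detach-and-restart steps above can also be scripted. The following is a minimal sketch, not an official Spacelift tool: the function name detach_and_restart and its parameters are hypothetical, and it assumes you pass in boto3 "autoscaling" and "ec2" clients and that the detached instance reaches the stopped state as described in the note above.

```python
# Sketch only: asg_name and instance_id are placeholders you supply, and the
# two client arguments are assumed to be boto3 "autoscaling" and "ec2" clients.

def detach_and_restart(autoscaling, ec2, asg_name, instance_id,
                       decrement_capacity=False):
    """Detach an instance from its ASG, wait for it to stop, then start it."""
    # Once detached, the ASG no longer manages (or terminates) the instance.
    # With decrement_capacity=False the ASG launches a replacement instance.
    autoscaling.detach_instances(
        InstanceIds=[instance_id],
        AutoScalingGroupName=asg_name,
        ShouldDecrementDesiredCapacity=decrement_capacity,
    )
    # The instance still shuts itself down; wait until it is fully stopped.
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    # Start it again so that Session Manager can connect.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    return instance_id
```

With boto3 installed and AWS credentials configured, the clients can be created with boto3.client("autoscaling") and boto3.client("ec2"). The waiters poll for a limited time and raise an error if the instance never reaches the expected state, so a timeout here usually means the instance did not shut down on its own.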
SSM is typically the simplest way to access a worker pool instance that uses the default Spacelift-supplied AMI. However, you can generally review the logs in CloudWatch without directly connecting to the EC2 instance. Look for the log groups named spacelift-errors.log and spacelift-info.log to find relevant information.
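If you prefer to pull these logs programmatically, a rough sketch along the following lines may help. The helper name recent_log_messages and its parameters are hypothetical, it assumes a boto3 "logs" client is passed in, and it reads only the first page of results rather than following pagination.

```python
import time

def recent_log_messages(logs, log_group, minutes=30, limit=50):
    """Return messages from the last `minutes` minutes of a log group.

    `logs` is assumed to be a boto3 CloudWatch Logs client. Only the first
    page of results is read; pagination is left out to keep the sketch short.
    """
    # CloudWatch Logs expects timestamps in milliseconds since the epoch.
    start_ms = int((time.time() - minutes * 60) * 1000)
    response = logs.filter_log_events(
        logGroupName=log_group,
        startTime=start_ms,
        limit=limit,
    )
    return [event["message"] for event in response.get("events", [])]
```

For example, recent_log_messages(boto3.client("logs"), "spacelift-errors.log") would return the most recent error messages, assuming the log group exists under that name in the client's region.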
Common Issues
Instances can terminate continuously when the worker pool token or private key is incorrect (e.g., not properly base64-encoded), or when network connectivity problems prevent the instance from connecting to the IoT broker.
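One quick way to rule out the encoding problem is to check a value locally before supplying it to the worker pool. The sketch below uses only the Python standard library; the function name is_valid_base64 is hypothetical, and the check catches formatting problems only. It cannot tell whether the decoded token or key is the correct one.

```python
import base64
import binascii

def is_valid_base64(value: str) -> bool:
    """Return True if `value` is non-empty, strictly valid base64."""
    # Tolerate line wraps and surrounding whitespace from copy and paste.
    stripped = "".join(value.split())
    if not stripped:
        return False
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        base64.b64decode(stripped, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True
```

A raw PEM private key, for instance, fails this check because its header line contains dashes, which suggests it still needs to be base64-encoded.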
We recommend using SSM for instance access rather than SSH keys, as it's more secure and is already configured in our Terraform module.