Troubleshooting Worker Crashing Issues
Last updated: November 7, 2024
Issue: Worker Crashing During Run
When a private worker crashes in Spacelift, it’s often due to insufficient memory to handle the stack being processed. This article provides guidance on how to diagnose and resolve worker crashes.
For Private Workers
If your private worker is crashing, the most common reason is memory exhaustion, especially with large or complex stacks.
Steps to Diagnose and Resolve
Monitor CPU and Memory Usage
Check the CPU and memory usage of your EC2 instance (or equivalent infrastructure) to determine if resource limits are being exceeded. For large stacks, consider using a larger instance type with more memory and processing power.
Enable Debug Logging for Terraform Stacks
If you're using Terraform, set the environment variable TF_LOG=DEBUG to enable detailed logging. This can provide additional insights and clues about which processes are consuming memory.
How to Set TF_LOG:
Go to your stack's environment tab
Add:
Name: TF_LOG
Value: DEBUG
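For the resource-monitoring step above, a quick spot check on the worker host can confirm whether memory is close to exhausted. The following is a minimal sketch, not a Spacelift tool: it assumes a Linux host that exposes /proc/meminfo, and the function names and the 90% warning threshold are hypothetical choices for illustration.

```python
from pathlib import Path

# Hypothetical threshold for illustration; tune it to your own workloads.
WARN_PERCENT = 90.0


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: integer} dict.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_used_percent(meminfo):
    """Return the share of memory in use, based on MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return 100.0 * (total - available) / total


if __name__ == "__main__":
    meminfo_path = Path("/proc/meminfo")
    # Only run the live check where /proc/meminfo exists (Linux hosts).
    if meminfo_path.exists():
        used = memory_used_percent(parse_meminfo(meminfo_path.read_text()))
        status = "WARNING: memory nearly exhausted" if used >= WARN_PERCENT else "OK"
        print(f"Memory used: {used:.1f}% ({status})")
```

Running this on the worker host while a large stack is being processed gives a rough reading of memory pressure. For trends over time, the metrics from your cloud provider or monitoring agent are a better source.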
For Public Workers
If you encounter crashing issues with a public worker, please reach out to the Spacelift support team. Be sure to provide your Run ID so they can assist you more effectively.
Summary
Worker crashes are commonly caused by memory limitations, especially on private workers handling large stacks. Monitoring resource usage and enabling TF_LOG=DEBUG can help identify the root cause. For public worker issues, contact Spacelift Support with the relevant run information.