Automating Python Package Installation for Air-Gapped Environments with AWS S3
Introduction
Managing software dependencies in air-gapped environments—where direct internet access is restricted—is a significant challenge for organizations that require secure and controlled deployments. This is especially true for Python package installation, where developers rely on pip and PyPI (Python Package Index) for dependency management.
In environments such as government, defense, and industrial control systems, security policies prohibit direct downloads from external repositories. To overcome this, organizations can leverage AWS S3 as a private package repository, enabling a secure, automated, and scalable solution for Python package deployment.
This article explores best practices for automating Python package installation in air-gapped environments using AWS S3, pip, and dependency management tools like pip-compile.
Challenges of Installing Python Packages in Air-Gapped Environments
1. No Direct Access to PyPI
Most Python applications depend on external libraries from PyPI, but air-gapped environments lack internet access to download them directly.
Solution: A pre-packaged repository in AWS S3 that serves as a local package source.
2. Managing Package Dependencies Manually is Error-Prone
Python applications often require dozens of dependencies with complex version constraints. Manually gathering these can lead to:
✅ Version mismatches
✅ Missing dependencies
✅ Installation failures
Solution: Automating dependency resolution with pip-compile and ensuring packages are correctly bundled before deployment.
3. OS-Specific Package Variations
Python dependencies often ship OS- and architecture-specific binaries (e.g., a Windows win_amd64 wheel vs. a manylinux wheel, or a source .tar.gz that must be compiled on the target). A single set of downloaded packages won't work across multiple environments.
Solution: A structured multi-OS packaging strategy that considers different formats and dependency trees.
Solution: Automating Python Package Installation Using AWS S3
Step 1: Preparing Python Packages for Offline Deployment
To automate dependency resolution, we use pip-compile from pip-tools. This ensures we capture all transitive dependencies while locking versions for stability.
Generating a Locked Dependency List
Run the following command to generate a fully resolved list of dependencies:
```bash
pip install pip-tools
pip-compile --generate-hashes -o requirements.lock requirements.txt
```
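The resulting requirements.lock pins every transitive dependency to an exact version with content hashes. An illustrative excerpt (package names, versions, and hashes are placeholders):

```text
# autogenerated by pip-compile
certifi==2024.2.2 \
    --hash=sha256:<hash>
    # via requests
requests==2.31.0 \
    --hash=sha256:<hash>
```

The `--hash` lines let pip verify each downloaded file at install time, which is valuable when packages pass through intermediate storage such as S3.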
Downloading Dependencies for Offline Use
Once dependencies are resolved, download them locally:
```bash
pip download -r requirements.lock -d ./offline_packages
```
This creates an offline_packages/ directory with all required .whl and .tar.gz files. Note that pip download resolves packages for the platform and Python version of the machine running it; cross-platform options are covered in the multi-OS section below.
Step 2: Uploading Python Packages to AWS S3
To make these packages accessible within an air-gapped network, upload them to AWS S3 as a private repository.
Uploading Packages to S3
```bash
aws s3 cp --recursive ./offline_packages s3://your-bucket-name/python-packages/
```
This stores all dependencies in an S3 bucket, enabling instances without internet access to retrieve them securely.
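Access can be locked down so that only traffic arriving through the VPC's S3 gateway endpoint reaches the bucket. A sketch of such a bucket policy (the bucket name and endpoint ID are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowOnlyFromVpcEndpoint",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ],
      "Condition": {
        "StringNotEquals": {
          "aws:SourceVpce": "vpce-0123456789abcdef0"
        }
      }
    }
  ]
}
```

Combine this with IAM instance-profile policies that grant s3:GetObject on the package prefix only to the EC2 instances that need it.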
Step 3: Configuring pip to Use S3 as a Package Repository
Since air-gapped environments cannot access PyPI, we configure pip to install packages directly from AWS S3.
Using AWS S3 as a Trusted Package Source
Modify pip.conf to point to your S3 bucket. Note that a flat S3 prefix is not a PEP 503 "simple" index, so pip's index-url cannot point at it directly; instead, disable the default index and use find-links:

```ini
[global]
no-index = true
find-links = https://your-bucket-name.s3.amazonaws.com/python-packages/
trusted-host = your-bucket-name.s3.amazonaws.com
```

For the HTTP form to work, the bucket prefix must serve an HTML page linking to the package files (S3 does not generate directory listings on its own); alternatively, sync the packages down and point find-links at the local directory, as the SSM example later in this article does. Either way, this ensures pip install pulls packages only from the private S3 repository.
Installing Packages from the S3 Repository
```bash
pip install --no-index --find-links=https://your-bucket-name.s3.amazonaws.com/python-packages/ -r requirements.lock
```
Optimizing Deployment for Multi-OS Environments
For organizations supporting multiple operating systems (Windows, RHEL, SUSE, Ubuntu, etc.), maintain OS-specific package directories in S3:
```text
s3://your-bucket-name/python-packages/
├── windows/
├── rhel9/
├── suse15/
└── ubuntu22/
```
Use the correct package source depending on the OS type:
```bash
pip install --no-index --find-links=https://your-bucket-name.s3.amazonaws.com/python-packages/rhel9/ -r requirements.lock
```
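Each OS directory can be populated from a single internet-connected build host using pip's cross-platform download flags. A sketch, where the platform tag, Python version, and directory name are assumptions for a RHEL 9 x86_64 target:

```bash
# Fetch wheels for a Linux x86_64 / Python 3.11 target from any host.
# --only-binary=:all: is mandatory whenever --platform is specified,
# since source distributions cannot be cross-downloaded reliably.
pip download -r requirements.lock \
    --platform manylinux2014_x86_64 \
    --python-version 3.11 \
    --only-binary=:all: \
    -d ./offline_packages/rhel9
```

Repeat the command once per target platform, then upload each directory to its matching S3 prefix.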
Automating Deployment with AWS Systems Manager
To streamline installation across multiple air-gapped EC2 instances, use AWS Systems Manager Run Command.
Example: Automating Python Installation with SSM
Create an SSM Command document (schemaVersion 2.2) that copies the packages from S3 and installs them on each target instance:
SSM Document Example (YAML)
```yaml
schemaVersion: '2.2'
description: "Install Python packages from S3"
mainSteps:
  - action: aws:runShellScript
    name: InstallPythonPackages
    inputs:
      runCommand:
        - "aws s3 cp --recursive s3://your-bucket-name/python-packages/ /tmp/python-packages/"
        - "pip install --no-index --find-links=file:///tmp/python-packages/ -r /tmp/python-packages/requirements.lock"
```
This ensures instances retrieve and install packages securely from AWS S3.
Lessons Learned from Implementing This Solution
✔ Pre-compile dependencies with pip-compile – Avoid missing packages and ensure version consistency.
✔ Use S3 as an internal PyPI alternative – Enables scalable, air-gapped installations.
✔ Automate with AWS Systems Manager – Eliminates manual intervention and enforces consistency.
✔ Structure packages by OS type – Avoids compatibility issues with OS-specific dependencies.
✔ Harden security by limiting access – Use IAM policies to restrict package access to authorized EC2 instances.
Conclusion
Automating Python package installation in air-gapped environments doesn’t have to be a manual process. By leveraging:
✅ pip-compile for dependency resolution
✅ AWS S3 for private package storage
✅ Custom pip configuration so installs come only from the private repository
✅ AWS Systems Manager for deployment automation
Organizations can efficiently manage Python dependencies without requiring internet access.
🔹 Key Takeaways:
✅ Pre-download & package dependencies
✅ Host them in a secure AWS S3 bucket
✅ Configure pip to install only from S3
✅ Automate deployments using AWS Systems Manager
By implementing this approach, teams can ensure compliance, security, and automation in air-gapped environments.