Experimental Implementation of an AWS Lambda-Style Serverless Infrastructure
How to build a serverless platform
The core of the serverless platform is the Lambda function.
A Lambda function is a piece of code that runs in the cloud and can be triggered by an event.
For now, we will only support Python.
- A Go agent will run inside the Firecracker VM.
- The agent will handle incoming requests and pass them to the correct runtime (Python in this case).
- The agent will be written in Go, and only the compiled binary will be copied into the VM.
Here's a step-by-step plan to build Skyscale:
- Core Features:
- Support Python functions with optimized invocation.
- CLI to scaffold and manage functions.
- Deployment and execution in isolated micro-VMs using Firecracker.
- Logging and monitoring for executed functions.
- Additional Goals:
- Minimize cold start times.
- Provide pre-warmed micro-VMs for faster execution.
- Components:
- CLI Tool:
- Scaffolds new functions.
- Deploys and manages function configurations.
- Manages local testing.
- Function Runtime:
- Isolated micro-VMs with Firecracker.
- Python server (e.g., FastAPI) inside the micro-VM to execute functions.
- Control Plane:
- Schedules function invocations.
- Manages VM lifecycles.
- Allocates resources dynamically.
- Data Plane:
- Handles communication between the host and micro-VMs.
- Transfers inputs/outputs efficiently.
- Pre-Warm Pool:
- Maintains a pool of pre-warmed micro-VMs to reduce cold starts.
- Optimize Firecracker Image:
- Use a minimal custom kernel and filesystem to reduce image size.
- Include only the required Python runtime and dependencies.
- Automate Image Creation:
- Use tools like Packer or custom scripts to build VM images.
- Add Networking:
- Implement virtual network interfaces for micro-VMs (fcnet).
- CLI Tool:
- Features:
- Scaffold Python functions with a predefined template.
- Support function metadata (e.g., runtime, memory, timeout).
- Deploy functions using the CLI (packaging and uploading code to a server).
- Technologies:
- Use Python's `argparse` or `click` for CLI development; a minimal `argparse`-based sketch of `skyscale init` follows this subsection.
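As a rough illustration of the scaffolding flow, here is a minimal `argparse`-based sketch of `skyscale init`. The directory layout and the fields in `skyscale_config.json` are assumptions about Skyscale's conventions rather than a fixed format.

```python
#!/usr/bin/env python3
"""Minimal sketch of `skyscale init` using argparse (layout and metadata fields are assumed)."""
import argparse
import json
from pathlib import Path

HANDLER_TEMPLATE = 'def handler():\n    print("hello fire")\n'

def cmd_init(args: argparse.Namespace) -> None:
    # Scaffold <name>/ with handler.py, requirements.txt, and metadata.
    root = Path(args.name)
    root.mkdir(parents=True, exist_ok=False)
    (root / "handler.py").write_text(HANDLER_TEMPLATE)
    (root / "requirements.txt").write_text("")
    metadata = {"name": args.name, "runtime": "python3.11",
                "memory_mb": 128, "timeout_s": 30}  # assumed config fields
    (root / "skyscale_config.json").write_text(json.dumps(metadata, indent=2))
    print(f"Scaffolded function in ./{args.name}/")

def main() -> None:
    parser = argparse.ArgumentParser(prog="skyscale")
    sub = parser.add_subparsers(dest="command", required=True)
    init = sub.add_parser("init", help="Scaffold a new function")
    init.add_argument("--name", required=True)
    init.set_defaults(func=cmd_init)
    args = parser.parse_args()
    args.func(args)

if __name__ == "__main__":
    main()
```

Running `python skyscale.py init --name hello_fire` would produce the directory layout described in the lifecycle walkthrough below.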
- Create a base VM image with:
- A lightweight Python server (e.g., FastAPI or Flask) to handle HTTP requests.
- Dependency installation support (`requirements.txt`).
- Implement a script for bootstrapping user-defined functions; a sketch of the in-VM server is shown after this list.
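One possible shape for the lightweight server baked into the base image, assuming Flask; the function path `/home/hello_fire`, the `/invoke` route, and the response fields are illustrative assumptions.

```python
"""Sketch of the in-VM function server (Flask assumed; paths and routes are illustrative)."""
import contextlib
import importlib.util
import io

from flask import Flask, jsonify, request

app = Flask(__name__)
FUNCTION_DIR = "/home/hello_fire"  # assumed injection path for the function package

def load_handler(path: str):
    # Load handler.py as a module and return its handler() callable.
    spec = importlib.util.spec_from_file_location("handler", f"{path}/handler.py")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.handler

@app.route("/invoke", methods=["POST"])
def invoke():
    payload = request.get_json(silent=True) or {}
    buffer = io.StringIO()
    try:
        handler = load_handler(FUNCTION_DIR)
        with contextlib.redirect_stdout(buffer):  # capture print() output
            result = handler(**payload) if payload else handler()
        return jsonify({"stdout": buffer.getvalue(), "result": result})
    except Exception as exc:  # report failures back to the control plane
        return jsonify({"stdout": buffer.getvalue(), "error": str(exc)}), 500

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```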
- VM Orchestrator:
- Automate the lifecycle of Firecracker VMs (start, stop, terminate); a sketch using Firecracker's API socket follows.
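A rough sketch of booting one micro-VM through Firecracker's API socket. It assumes the `firecracker` binary is installed on the host and uses the third-party `requests-unixsocket` package; the kernel and rootfs paths are placeholders, and the request payloads should be checked against the Firecracker API spec for the version you run.

```python
"""Sketch: boot a single Firecracker micro-VM via its API socket (paths are placeholders)."""
import subprocess
import time
from urllib.parse import quote

import requests_unixsocket  # pip install requests-unixsocket

SOCKET = "/tmp/skyscale-vm0.socket"
BASE = f"http+unix://{quote(SOCKET, safe='')}"

def start_vm(kernel: str, rootfs: str) -> subprocess.Popen:
    proc = subprocess.Popen(["firecracker", "--api-sock", SOCKET])
    time.sleep(0.1)  # crude wait for the API socket to appear; a real orchestrator should poll
    s = requests_unixsocket.Session()
    s.put(f"{BASE}/machine-config", json={"vcpu_count": 1, "mem_size_mib": 128})
    s.put(f"{BASE}/boot-source", json={
        "kernel_image_path": kernel,
        "boot_args": "console=ttyS0 reboot=k panic=1",
    })
    s.put(f"{BASE}/drives/rootfs", json={
        "drive_id": "rootfs",
        "path_on_host": rootfs,
        "is_root_device": True,
        "is_read_only": False,
    })
    s.put(f"{BASE}/actions", json={"action_type": "InstanceStart"})
    return proc  # keep the handle so the orchestrator can stop/terminate the VM later

if __name__ == "__main__":
    vm = start_vm("/images/vmlinux.bin", "/images/skyscale-rootfs.ext4")
```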
- Scheduler:
- Distribute incoming function requests to appropriate VMs.
- Use a lightweight queue (e.g., RabbitMQ or Redis); a Redis-backed sketch follows this subsection.
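If Redis is the queue of choice, the scheduling hand-off might look roughly like this; the queue name, payload shape, and the `dispatch` callback are assumptions.

```python
"""Sketch of a Redis-backed invocation queue (queue name and payload shape are assumed)."""
import json

import redis  # pip install redis

QUEUE = "skyscale:invocations"
r = redis.Redis(host="localhost", port=6379)

def enqueue_invocation(function_name: str, payload: dict) -> None:
    # Called by the API front door when `skyscale invoke` arrives.
    r.lpush(QUEUE, json.dumps({"function": function_name, "payload": payload}))

def scheduler_loop(dispatch) -> None:
    # Blocking pop; `dispatch` picks a pre-warmed VM or boots a new one.
    while True:
        _, raw = r.brpop(QUEUE)
        job = json.loads(raw)
        dispatch(job["function"], job["payload"])
```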
- Reduce Cold Starts:
- Maintain pre-warmed VMs ready to execute functions (see the pool sketch below).
- Implement lazy loading for rarely-used functions.
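A simple way to hold warm VMs, using the standard-library `queue`; `boot_vm` and `terminate_vm` are hypothetical stand-ins for the orchestrator calls above.

```python
"""Sketch of a pre-warm pool; boot_vm/terminate_vm are stand-ins for orchestrator calls."""
import queue

class PreWarmPool:
    def __init__(self, boot_vm, terminate_vm, size: int = 4):
        self._boot_vm = boot_vm
        self._terminate_vm = terminate_vm
        self._pool: queue.Queue = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(boot_vm())  # warm VMs booted ahead of demand

    def acquire(self, timeout: float = 0.05):
        # Hand out a warm VM if one is ready; otherwise boot on demand (cold start).
        try:
            return self._pool.get(timeout=timeout)
        except queue.Empty:
            return self._boot_vm()

    def release(self, vm) -> None:
        # Return the VM to the pool, or terminate it if the pool is already full.
        try:
            self._pool.put_nowait(vm)
        except queue.Full:
            self._terminate_vm(vm)
```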
- Efficient Data Transfer:
- Use shared memory or sockets for communication between the host and micro-VMs.
- Request Routing:
- Proxy requests through a lightweight service (e.g., Nginx or HAProxy).
- Add logging for:
- Function execution times.
- Resource usage (CPU, memory).
- Errors and exceptions (a logging-decorator sketch follows this list).
- Implement a lightweight monitoring dashboard.
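A small decorator the control plane could wrap around each invocation to record duration, peak memory, and errors; `resource.getrusage` is Unix-only and the log format is illustrative.

```python
"""Sketch of per-invocation logging (Unix-only ru_maxrss; log format is illustrative)."""
import functools
import logging
import resource
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("skyscale.invocations")

def logged_invocation(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            log.exception("invocation failed")
            raise
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            log.info("duration=%.1fms peak_rss=%dkB", elapsed_ms, peak_kb)
    return wrapper
```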
- Package and deploy functions using the CLI.
- Store function metadata in a lightweight database (e.g., SQLite, Postgres); a SQLite sketch follows.
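With SQLite, metadata storage can stay very small; the schema below is only an assumed example of the fields worth keeping.

```python
"""Sketch of a SQLite-backed function-metadata store (schema is an assumption)."""
import json
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS functions (
    name       TEXT PRIMARY KEY,
    runtime    TEXT NOT NULL,
    memory_mb  INTEGER NOT NULL,
    timeout_s  INTEGER NOT NULL,
    package    TEXT NOT NULL          -- path or object-store key of the uploaded archive
);
"""

def save_function(db_path: str, config: dict, package_key: str) -> None:
    with sqlite3.connect(db_path) as conn:
        conn.executescript(SCHEMA)
        conn.execute(
            "INSERT OR REPLACE INTO functions VALUES (?, ?, ?, ?, ?)",
            (config["name"], config["runtime"], config["memory_mb"],
             config["timeout_s"], package_key),
        )

# Example:
# save_function("skyscale.db", json.load(open("hello_fire/skyscale_config.json")),
#               "packages/hello_fire.zip")
```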
- Benchmark cold start times and function invocation latency.
- Experiment with:
- Snapshotting VMs after the Python server initializes.
- Kernel and filesystem optimization for Firecracker.
- Multi-threading in the control plane for parallel invocations.
- Allow developers to test their functions locally before deployment.
- Use Docker to mimic the runtime environment (see the sketch below).
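A sketch of what `skyscale test` could do with the Docker SDK for Python: mount the function directory into a stock Python image and run the handler. This approximates, but does not exactly reproduce, the Firecracker runtime; the image and mount path are assumptions.

```python
"""Sketch of `skyscale test`: run the handler in a Docker container that mimics the runtime."""
from pathlib import Path

import docker  # pip install docker

def test_locally(function_dir: str) -> str:
    client = docker.from_env()
    logs = client.containers.run(
        image="python:3.11-slim",  # stand-in for the real runtime image
        command=["python", "-c", "import handler; handler.handler()"],
        volumes={str(Path(function_dir).resolve()): {"bind": "/app", "mode": "ro"}},
        working_dir="/app",
        remove=True,
    )
    return logs.decode()  # the handler's stdout, e.g. "hello fire\n"

# print(test_locally("hello_fire"))
```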
- Month 1:
- Setup Firecracker-based VM with Python runtime.
- Implement the CLI tool for scaffolding functions.
- Month 2:
- Build and test the control plane (VM lifecycle and function scheduling).
- Develop logging and monitoring capabilities.
- Month 3:
- Optimize invocation flow with pre-warmed VMs.
- Add local testing capabilities.
- Use Docker or Kubernetes to manage Skyscale services.
- Automate deployments with CI/CD pipelines (GitHub Actions, GitLab CI).
- Provide a self-hosted option for early users.
- CLI Commands:
- `skyscale init` - Scaffold a new function.
- `skyscale deploy` - Deploy a function to the platform.
- `skyscale logs` - View function logs.
- `skyscale test` - Test the function locally.
- Templates:
- Provide common function templates for tasks like API calls or file processing.
- Write unit and integration tests for each component.
- Set up a benchmark suite to test performance under load.
- Gather feedback from early adopters to refine the platform.
Let's break down the lifecycle and internal working of Skyscale using the example function `print("hello fire")`. I'll explain each step in detail, from function creation to execution and cleanup.
- The user runs the CLI command: `skyscale init --name hello_fire`
- What Happens Internally:
- The CLI creates a directory structure like:
    hello_fire/
    ├── handler.py            # User's function
    ├── requirements.txt      # Dependencies
    └── skyscale_config.json  # Metadata for the function
- `handler.py` contains a scaffolded template for a Python function: `def handler(): print("hello fire")`
- The user deploys the function using: `skyscale deploy --name hello_fire`
- What Happens Internally:
- Step 1: Package Function:
- The function code, dependencies, and metadata are bundled into a `.zip` or `.tar.gz` file (a packaging sketch follows these steps).
- Step 2: Upload to Skyscale Control Plane:
- The CLI sends the package to a server hosting the control plane.
- The control plane stores the package in a central repository (e.g., an S3 bucket or a file system).
- Step 3: VM Image Preparation:
- A base Firecracker VM image is cloned.
- The function package is injected into the VM image at a predefined path (e.g., `/home/hello_fire`).
- A lightweight Python server is pre-installed in the VM to execute the function.
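Step 1 can be as simple as zipping the scaffolded directory; the archive naming below is an assumption.

```python
"""Sketch of Step 1 packaging: zip the scaffolded function directory."""
import zipfile
from pathlib import Path

def package_function(function_dir: str) -> Path:
    src = Path(function_dir)
    archive = src.with_suffix(".zip")  # e.g. hello_fire.zip
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in src.rglob("*"):
            zf.write(path, path.relative_to(src))
    return archive

# package_function("hello_fire") -> hello_fire.zip, ready to upload to the control plane
```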
- The user invokes the function: `skyscale invoke --name hello_fire`
- What Happens Internally:
- Step 1: Request Received:
- The control plane receives the request and checks:
- If a pre-warmed VM is available for the function.
- If not, it starts a new Firecracker micro-VM using the prepared VM image.
- Step 2: VM Initialization:
- The VM boots up with a minimal Linux kernel and a root filesystem.
- The Python server inside the VM starts listening for requests.
- Step 3: Function Execution:
- The request payload (if any) is sent to the Python server inside the VM via an HTTP POST request.
- The server dynamically loads and executes `handler.py` using Python's `exec()` function.
- `print("hello fire")` writes its output to the VM's standard output (stdout); the snippet after this step shows how the host-side POST and response handling might look.
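On the host side, forwarding the payload and reading back the captured stdout might look like this, assuming the in-VM server from the runtime section listens on port 8080 and is reachable at a guest IP assigned by fcnet; the IP and response shape are illustrative.

```python
"""Sketch of the host-side invocation call (guest IP/port and response shape are assumed)."""
from typing import Optional

import requests

def invoke_in_vm(guest_ip: str, payload: Optional[dict] = None) -> str:
    resp = requests.post(f"http://{guest_ip}:8080/invoke", json=payload or {}, timeout=30)
    resp.raise_for_status()
    return resp.json()["stdout"]  # e.g. "hello fire\n"

# print(invoke_in_vm("172.16.0.2"))  # guest IP assigned by fcnet (illustrative)
```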
- Step 4: Capture Output:
- The server captures the stdout of the function execution.
- The output is sent back to the control plane, which relays it to the CLI.
- Step 5: VM Management:
- If the VM is pre-warmed, it is returned to the pool for reuse.
- If the VM was created for a single invocation, it is terminated.
- Logs are generated at each step:
- Invocation Logs:
- Function output: "hello fire".
- Execution time, resource usage, and errors (if any).
- Control Plane Logs:
- Function scheduling details.
- VM lifecycle events.
- These logs are stored in a central logging system and accessible via `skyscale logs --name hello_fire`.
- After execution, the VM is either:
- Returned to the Pre-Warm Pool for future invocations.
- Terminated if it’s no longer needed (e.g., after idle timeout).
- Temporary files and data inside the VM are cleared to maintain security and optimize resource usage.
- Bootstrapping:
- A Firecracker VM boots in milliseconds with a minimal kernel and root filesystem.
- Execution Environment:
- The VM contains only the required Python runtime and dependencies.
- The function is executed inside an isolated Python server running in the VM.
- Cold Start Optimization:
- Pre-warmed VMs are kept ready with Python initialized and the server running.
- This avoids the overhead of booting a VM from scratch for each invocation.
- Efficient Communication:
- Input/output data is passed between the host and VM using Unix sockets or shared memory for low latency (a socket sketch follows).
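For the socket option, a plain Unix-domain-socket exchange on the host side could look like this; in practice a Firecracker guest is usually reached through a vsock device that Firecracker exposes as a Unix socket on the host, so the path and framing here are assumptions.

```python
"""Minimal Unix-domain-socket exchange (socket path and close-based framing are assumptions)."""
import socket

SOCKET_PATH = "/tmp/skyscale-vm0.vsock"  # illustrative host-side socket path

def send_payload(payload: bytes) -> bytes:
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(SOCKET_PATH)
        sock.sendall(payload)
        sock.shutdown(socket.SHUT_WR)    # signal end of request
        chunks = []
        while chunk := sock.recv(4096):  # read until the peer closes the connection
            chunks.append(chunk)
        return b"".join(chunks)
```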
- Control Plane:
- Handles scheduling and lifecycle management of VMs.
- Allocates resources dynamically based on function demand.
- Manages the pre-warm pool to minimize cold starts.
- CLI Tool:
- Provides a seamless interface for scaffolding, deploying, and managing functions.
- Relays user commands to the control plane via REST or gRPC APIs.
Let's assume the user invokes the function `skyscale invoke --name hello_fire`:
- T=0 ms: Request is received by the control plane.
- T=50 ms: A pre-warmed Firecracker VM is assigned to the request.
- T=100 ms: The function package is loaded, and `handler.py` is executed.
- T=200 ms: Output (`hello fire`) is captured and returned to the user.
- T=250 ms: The VM is returned to the pre-warm pool.
This lifecycle ensures:
- Fast Invocations: Pre-warmed VMs reduce cold starts.
- Isolation: Each function runs in a secure, isolated micro-VM.
- Scalability: The system can dynamically scale by spinning up new Firecracker VMs based on demand.
Does this address your question, or would you like a deep dive into a specific component?