Project: Woodpecker CI Autoscaler — Timeweb Cloud Provider
Goal
Add a Timeweb Cloud provider to the Woodpecker CI autoscaler so that:
- The Woodpecker server runs permanently on one VDS.
- When a CI job appears, the autoscaler dynamically creates a new VDS on Timeweb Cloud.
- The VDS is bootstrapped via cloud-init, connects to the server as an agent, and runs the job.
- After the job finishes and the idle timeout expires, the VDS is destroyed.
Background
Current Setup
- Woodpecker server and agent run permanently on a single VDS via Docker Compose.
- The goal is to move to a dynamic model where agents are created on demand.
Woodpecker CI Autoscaler Architecture
- Repository: `woodpecker-ci/autoscaler` (separate from the main `woodpecker-ci/woodpecker` repo).
- Language: Go.
- Provider Interface (3 methods):

      type Provider interface {
          DeployAgent(context.Context, *woodpecker.Agent) error
          RemoveAgent(context.Context, *woodpecker.Agent) error
          ListDeployedAgentNames(context.Context) ([]string, error)
      }

- Provisioning Flow:
  - Autoscaler monitors the Woodpecker queue.
  - When pending tasks exceed capacity, it calls `AgentCreate()` to get a token, then `DeployAgent()`.
  - `DeployAgent` creates a VM and passes cloud-init user-data.
  - The VM boots, installs Docker, and runs the Woodpecker agent container via docker compose.
  - The agent connects to the server via gRPC using the provided token.
  - On scale-down, `RemoveAgent()` terminates the VM, and the agent is deleted from Woodpecker.
- Cloud-init: The autoscaler generates a cloud-init YAML that installs Docker and starts the agent. Custom templates are supported via `WOODPECKER_PROVIDER_USERDATA` / `WOODPECKER_PROVIDER_USERDATA_FILE`.
- Agent Environment Variables (set in cloud-init):
  - `WOODPECKER_SERVER`: gRPC address of the server.
  - `WOODPECKER_AGENT_SECRET`: token generated by `AgentCreate()`.
  - `WOODPECKER_MAX_WORKFLOWS`: parallelism per agent.
  - `WOODPECKER_GRPC_SECURE`: TLS flag.
- Configuration: The autoscaler uses `urfave/cli` for CLI flags. Providers define their own flags (e.g., `--hetznercloud-api-token`).
- Registration: To add a new provider, you must:
  - Implement the `Provider` interface in a new package under `providers/<name>/`.
  - Create a `flags.go` file with CLI flags.
  - Import the package and add a case in `cmd/woodpecker-autoscaler/main.go`.
  - Append the provider's flags to the global app flags.
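The three-method contract described above can be exercised without any cloud account. The following is a minimal sketch under stated assumptions: `Agent` is a local stand-in for `woodpecker.Agent` (only the name is modeled), and `memoryProvider` is a hypothetical in-memory implementation that is not part of the autoscaler. It only illustrates what each method is expected to do.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"sync"
)

// Agent is a local stand-in for woodpecker.Agent (assumption: only Name matters here).
type Agent struct {
	Name string
}

// Provider mirrors the three-method interface described in this document.
type Provider interface {
	DeployAgent(ctx context.Context, agent *Agent) error
	RemoveAgent(ctx context.Context, agent *Agent) error
	ListDeployedAgentNames(ctx context.Context) ([]string, error)
}

// memoryProvider is a hypothetical provider that records agents in a map
// instead of creating real servers.
type memoryProvider struct {
	mu     sync.Mutex
	agents map[string]struct{}
}

func newMemoryProvider() *memoryProvider {
	return &memoryProvider{agents: make(map[string]struct{})}
}

// DeployAgent registers the agent and rejects duplicate names.
func (p *memoryProvider) DeployAgent(_ context.Context, agent *Agent) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if _, exists := p.agents[agent.Name]; exists {
		return fmt.Errorf("agent %q is already deployed", agent.Name)
	}
	p.agents[agent.Name] = struct{}{}
	return nil
}

// RemoveAgent forgets the agent; removing an unknown agent is a no-op.
func (p *memoryProvider) RemoveAgent(_ context.Context, agent *Agent) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	delete(p.agents, agent.Name)
	return nil
}

// ListDeployedAgentNames returns the known agent names in sorted order.
func (p *memoryProvider) ListDeployedAgentNames(_ context.Context) ([]string, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	names := make([]string, 0, len(p.agents))
	for name := range p.agents {
		names = append(names, name)
	}
	sort.Strings(names)
	return names, nil
}

func main() {
	var p Provider = newMemoryProvider()
	ctx := context.Background()
	if err := p.DeployAgent(ctx, &Agent{Name: "agent-1"}); err != nil {
		panic(err)
	}
	names, _ := p.ListDeployedAgentNames(ctx)
	fmt.Println(names)
}
```

A fake like this may also be useful for testing the engine's scaling logic before the real Timeweb provider exists.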
Timeweb Cloud API
- Public API: Yes, at `https://api.timeweb.cloud`.
- Official Go SDK: `github.com/timeweb-cloud/sdk-go` (OpenAPI-generated).
- Authentication: JWT Bearer token (`Authorization: Bearer <token>`).
- VDS Lifecycle Endpoints:
  - Create: `POST /api/v1/servers`
  - Delete: `DELETE /api/v1/servers/{server_id}`
  - Get: `GET /api/v1/servers/{server_id}`
  - List: `GET /api/v1/servers`
  - Start: `POST /api/v1/servers/{server_id}/start`
  - Shutdown: `POST /api/v1/servers/{server_id}/shutdown`
  - Clone: `POST /api/v1/servers/{server_id}/clone`
- Create Server Parameters:
  - `name` (required)
  - `os_id` or `image_id`
  - `preset_id` or `configuration` (CPU, RAM, disk)
  - `ssh_keys_ids`
  - `cloud_init`: this is critical for passing user-data.
  - `availability_zone`
  - `hostname`
- Rate Limit: 20 requests per second per endpoint.
- Tags/Labels: The API does not seem to have a native "label" or "tag" system for servers. We may need to track pool association by server name prefix or by storing state locally. This is an open question.
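To make the create call concrete, here is a sketch of building the `POST /api/v1/servers` request with only the standard library. The JSON field names come from the parameter list above. The Go types, the helper name `newCreateServerRequest`, and the placeholder IDs are assumptions and should be checked against the API docs before use.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// createServerRequest holds a subset of the create-server parameters.
// Field names follow the list above; the Go types are assumptions.
type createServerRequest struct {
	Name             string `json:"name"`
	OSID             int    `json:"os_id,omitempty"`
	PresetID         int    `json:"preset_id,omitempty"`
	SSHKeysIDs       []int  `json:"ssh_keys_ids,omitempty"`
	CloudInit        string `json:"cloud_init,omitempty"`
	AvailabilityZone string `json:"availability_zone,omitempty"`
}

// newCreateServerRequest (hypothetical helper) builds an authenticated
// request for the create endpoint without sending it.
func newCreateServerRequest(baseURL, token string, params createServerRequest) (*http.Request, error) {
	body, err := json.Marshal(params)
	if err != nil {
		return nil, fmt.Errorf("encode create-server body: %w", err)
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/v1/servers", bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("build create-server request: %w", err)
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// The IDs and token below are placeholders, not verified Timeweb values.
	req, err := newCreateServerRequest("https://api.timeweb.cloud", "example-token", createServerRequest{
		Name:      "wp-pool1-agent-1",
		OSID:      1,
		PresetID:  1,
		CloudInit: "#cloud-config\n",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Keeping request construction separate from sending makes it easy to unit-test headers and payloads, and leaves room to add throttling later for the 20 requests per second limit.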
Implementation Plan
Phase 1: Project Setup
- Fork / vendor `woodpecker-ci/autoscaler` as the base.
- Add `github.com/timeweb-cloud/sdk-go` as a dependency.
- Create the provider package: `providers/timewebcloud/`.
Phase 2: Provider Implementation
- Struct & Constructor (`provider.go`):
  - Fields: API client, config, pool ID, default image/preset/zone.
  - `New(ctx, cli.Command, *config.Config) (types.Provider, error)`.
- Flags (`flags.go`):
  - `--timewebcloud-api-token` (env: `WOODPECKER_TIMEWEBCLOUD_API_TOKEN`)
  - `--timewebcloud-os-id` / `--timewebcloud-image-id`
  - `--timewebcloud-preset-id` / `--timewebcloud-configuration`
  - `--timewebcloud-availability-zone`
  - `--timewebcloud-ssh-key-id`
  - `--timewebcloud-hostname-prefix`
- DeployAgent:
  - Generate cloud-init user-data via `cloudinit.RenderUserDataTemplate()`.
  - Call `CreateServer` with the agent name and user-data.
  - Store the mapping `agent.Name -> server_id` (in memory or via naming convention).
- RemoveAgent:
  - Find server by agent name (list all servers and filter by name, or use a stored mapping).
  - Call `DeleteServer`.
  - Handle "not found" gracefully.
- ListDeployedAgentNames:
  - List all servers.
  - Filter by name prefix (e.g., `pool-<pool-id>-agent-`).
  - Return matching names.
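The `RemoveAgent` steps above (find by name, delete, tolerate "not found") could look roughly like the sketch below. The `serverAPI` interface, the `errNotFound` sentinel, and `fakeAPI` are hypothetical names introduced for illustration. They are not the Timeweb SDK's types, and the sketch assumes the server name equals the agent name.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// server is a minimal view of a cloud server (assumed fields).
type server struct {
	ID   int
	Name string
}

// serverAPI is a hypothetical abstraction over the list and delete endpoints.
type serverAPI interface {
	ListServers(ctx context.Context) ([]server, error)
	DeleteServer(ctx context.Context, id int) error
}

// errNotFound is a hypothetical sentinel a client could return for HTTP 404.
var errNotFound = errors.New("server not found")

// removeAgentServer deletes the server whose name equals agentName.
// A missing server counts as success, so repeated scale-down calls are safe.
func removeAgentServer(ctx context.Context, api serverAPI, agentName string) error {
	servers, err := api.ListServers(ctx)
	if err != nil {
		return fmt.Errorf("list servers: %w", err)
	}
	for _, s := range servers {
		if s.Name != agentName {
			continue
		}
		// The server may vanish between list and delete; ignore that case.
		if err := api.DeleteServer(ctx, s.ID); err != nil && !errors.Is(err, errNotFound) {
			return fmt.Errorf("delete server %d: %w", s.ID, err)
		}
		return nil
	}
	return nil // no matching server: nothing to delete
}

// fakeAPI is an in-memory serverAPI used only to demonstrate the flow.
type fakeAPI struct {
	servers map[int]string // server ID -> server name
}

func (f *fakeAPI) ListServers(context.Context) ([]server, error) {
	out := make([]server, 0, len(f.servers))
	for id, name := range f.servers {
		out = append(out, server{ID: id, Name: name})
	}
	return out, nil
}

func (f *fakeAPI) DeleteServer(_ context.Context, id int) error {
	if _, ok := f.servers[id]; !ok {
		return errNotFound
	}
	delete(f.servers, id)
	return nil
}

func main() {
	api := &fakeAPI{servers: map[int]string{1: "wp-pool1-agent-a"}}
	ctx := context.Background()
	fmt.Println(removeAgentServer(ctx, api, "wp-pool1-agent-a")) // deletes server 1
	fmt.Println(removeAgentServer(ctx, api, "wp-pool1-agent-a")) // already gone, still no error
}
```

Treating a missing server as success keeps `RemoveAgent` idempotent, which matters if the autoscaler retries after a partial failure.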
Phase 3: Integration
- Import the provider in `main.go`.
- Add `case "timewebcloud":` to `setupProvider()`.
- Append `timewebcloud.ProviderFlags` to the global flags.
Phase 4: Testing & Deployment
- Build the binary.
- Test locally or on a staging VDS:
  - Start the autoscaler with `--provider=timewebcloud`.
  - Trigger a CI job.
  - Verify VDS creation, agent connection, job execution, and cleanup.
- Update Docker Compose / deployment docs.
Key Technical Decisions
1. How to Track Agent-to-Server Mapping?
Options:
- A. Name Prefix Convention: Name servers as `wp-<pool>-<agent-name>`. `ListDeployedAgentNames` filters by prefix. Simple, no state needed.
- B. In-Memory Map: Store `map[string]int` (agent name -> server ID) in the provider struct. Lost on restart.
- C. Local State File: Persist the map to disk. Survives restart.
- D. API Metadata: If Timeweb API supports tags/labels, use them. (Currently unclear.)
Recommendation: Start with A (name prefix) as the simplest and most robust approach. If Timeweb adds tags later, migrate to D.
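Option A can be sketched with two small helpers: one that builds the server name and one that recovers agent names from a server listing. The helper names are hypothetical. The sketch also assumes that `ListDeployedAgentNames` should return the agent name with the prefix stripped, which should be confirmed against what the engine expects.

```go
package main

import (
	"fmt"
	"strings"
)

// serverPrefix returns the name prefix shared by all servers of a pool.
func serverPrefix(pool string) string {
	return "wp-" + pool + "-"
}

// serverName builds the server name for an agent: wp-<pool>-<agent-name>.
func serverName(pool, agentName string) string {
	return serverPrefix(pool) + agentName
}

// agentNamesForPool keeps only servers that belong to the pool and
// returns their agent names (the part after the prefix).
func agentNamesForPool(serverNames []string, pool string) []string {
	prefix := serverPrefix(pool)
	var agents []string
	for _, name := range serverNames {
		if !strings.HasPrefix(name, prefix) {
			continue // unrelated server or another pool
		}
		if agent := strings.TrimPrefix(name, prefix); agent != "" {
			agents = append(agents, agent)
		}
	}
	return agents
}

func main() {
	all := []string{serverName("1", "agent-a"), "db-main", serverName("2", "agent-b")}
	fmt.Println(agentNamesForPool(all, "1"))
}
```

One caveat of pure prefix matching: if pool IDs can contain hyphens, pool `1` would also match servers of a pool named `1-x`, so pool IDs probably need to be restricted or the separator chosen more carefully.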
2. How to Handle Server Readiness?
Question: After CreateServer, the server may take time to boot. Does DeployAgent need to wait?
Answer: No. The autoscaler engine only requires that the VM creation is initiated. The agent will connect when ready. The engine has AgentInactivityTimeout (default 10m) to clean up agents that never connect.
3. OS Image Selection
Question: What base image should be used for the agent VMs?
Answer: Ubuntu 22.04 LTS or Debian 12 (stable, good Docker support). The os_id must be fetched from Timeweb's API (GetOsList). Alternatively, a custom image with Docker pre-installed could speed up boot time.
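Once the OS list has been fetched, picking the `os_id` is a simple lookup. The sketch below assumes a simplified entry shape (`ID`, `Name`, `Version`). The real `GetOsList` response may use different fields, and the IDs shown are placeholders rather than real Timeweb values.

```go
package main

import (
	"fmt"
	"strings"
)

// osImage is a hypothetical, simplified OS list entry; the real API
// response may be shaped differently.
type osImage struct {
	ID      int
	Name    string
	Version string
}

// findOSID returns the ID of the first entry whose name matches
// case-insensitively and whose version matches exactly.
func findOSID(images []osImage, name, version string) (int, bool) {
	for _, img := range images {
		if strings.EqualFold(img.Name, name) && img.Version == version {
			return img.ID, true
		}
	}
	return 0, false
}

func main() {
	// Placeholder data; real IDs must come from the API.
	images := []osImage{
		{ID: 101, Name: "ubuntu", Version: "22.04"},
		{ID: 102, Name: "debian", Version: "12"},
	}
	if id, ok := findOSID(images, "Ubuntu", "22.04"); ok {
		fmt.Println("os_id:", id)
	}
}
```

Resolving the ID once and passing it via `--timewebcloud-os-id` avoids an extra API call on every deployment.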
4. SSH Keys
Question: Are SSH keys needed if we use cloud-init?
Answer: Cloud-init handles everything. SSH keys are optional but useful for debugging. The provider should allow configuring ssh_keys_ids.
Open Questions
- Does Timeweb Cloud API support assigning custom tags/labels to servers? (Affects `ListDeployedAgentNames` implementation.)
- What is the typical boot time for a new VDS? (Affects `AgentInactivityTimeout` tuning.)
- Does the `cloud_init` field in `CreateServer` accept standard cloud-init YAML? (Needs testing.)
- Is there a way to use a custom image (snapshot) to pre-install Docker and reduce boot time?
- What are the `os_id` values for Ubuntu/Debian? (Need to call `GetOsList`.)
- Does Timeweb charge for stopped (but not deleted) servers? (Affects whether we should stop vs. delete.)
References
- Woodpecker Autoscaler Repo: https://github.com/woodpecker-ci/autoscaler
- Provider Interface: `engine/types/provider.go`
- Hetzner Provider (reference): `providers/hetznercloud/`
- Cloud-init Render: `engine/inits/cloudinit/cloudinit.go`
- Timeweb Cloud Go SDK: https://github.com/timeweb-cloud/sdk-go
- Timeweb Cloud API Docs: https://timeweb.cloud/api-docs