twcloud-scaler/agents.md
Sergey Vanyushkin 191cdd108f feat: add Timeweb Cloud provider for Woodpecker CI autoscaler
- Implement timewebcloud provider with DeployAgent, RemoveAgent, ListDeployedAgentNames
- Add minimal HTTP API client for Timeweb Cloud (create/list/delete servers)
- Register provider in main.go with CLI flags
- Add timeweb-list and timeweb-tester utilities
- Include Dockerfile and docker-compose.yml for deployment
- Update DEPLOY.md with verified OS/preset IDs
2026-05-16 13:09:07 +03:00


Project: Woodpecker CI Autoscaler — Timeweb Cloud Provider

Goal

Add a Timeweb Cloud provider to the Woodpecker CI autoscaler so that:

  1. The Woodpecker server runs permanently on one VDS.
  2. When a CI job appears, the autoscaler dynamically creates a new VDS on Timeweb Cloud.
  3. The VDS is bootstrapped via cloud-init, connects to the server as an agent, and runs the job.
  4. After the job finishes and the idle timeout expires, the VDS is destroyed.

Background

Current Setup

  • Woodpecker server and agent run permanently on a single VDS via Docker Compose.
  • The goal is to move to a dynamic model where agents are created on demand.

Woodpecker CI Autoscaler Architecture

  • Repository: woodpecker-ci/autoscaler (separate from the main woodpecker-ci/woodpecker repo).
  • Language: Go.
  • Provider Interface (3 methods):
    type Provider interface {
        DeployAgent(context.Context, *woodpecker.Agent) error
        RemoveAgent(context.Context, *woodpecker.Agent) error
        ListDeployedAgentNames(context.Context) ([]string, error)
    }
    
  • Provisioning Flow:
    1. Autoscaler monitors the Woodpecker queue.
    2. When pending tasks exceed capacity, it calls AgentCreate() to get a token, then DeployAgent().
    3. DeployAgent creates a VM and passes cloud-init user-data.
    4. The VM boots, installs Docker, and runs the Woodpecker agent container via docker compose.
    5. The agent connects to the server via gRPC using the provided token.
    6. On scale-down, RemoveAgent() terminates the VM, and the agent is deleted from Woodpecker.
  • Cloud-init: The autoscaler generates a cloud-init YAML that installs Docker and starts the agent. Custom templates are supported via WOODPECKER_PROVIDER_USERDATA / WOODPECKER_PROVIDER_USERDATA_FILE.
  • Agent Environment Variables (set in cloud-init):
    • WOODPECKER_SERVER — gRPC address of the server.
    • WOODPECKER_AGENT_SECRET — token generated by AgentCreate().
    • WOODPECKER_MAX_WORKFLOWS — parallelism per agent.
    • WOODPECKER_GRPC_SECURE — TLS flag.
  • Configuration: The autoscaler uses urfave/cli for CLI flags. Providers define their own flags (e.g., --hetznercloud-api-token).
  • Registration: To add a new provider, you must:
    1. Implement the Provider interface in a new package under providers/<name>/.
    2. Create a flags.go file with CLI flags.
    3. Import the package and add a case in cmd/woodpecker-autoscaler/main.go.
    4. Append the provider's flags to the global app flags.
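
To make the agent environment variables listed above concrete, here is a minimal sketch of rendering them into cloud-init user-data with Go's text/template. The template text, the /root/.env path, the agentEnv type, and renderUserData are all hypothetical names chosen for illustration; this is not the autoscaler's built-in template, which also installs Docker and starts the agent container.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// agentEnv holds the four agent settings named in this document.
// The struct itself is an assumption made for this sketch.
type agentEnv struct {
	Server       string // WOODPECKER_SERVER: gRPC address of the server
	AgentSecret  string // WOODPECKER_AGENT_SECRET: token from AgentCreate()
	MaxWorkflows int    // WOODPECKER_MAX_WORKFLOWS: parallelism per agent
	GRPCSecure   bool   // WOODPECKER_GRPC_SECURE: TLS flag
}

// userDataTmpl is a simplified, hypothetical cloud-init document that only
// writes an env file; the file path is an assumption.
const userDataTmpl = `#cloud-config
write_files:
  - path: /root/.env
    content: |
      WOODPECKER_SERVER={{ .Server }}
      WOODPECKER_AGENT_SECRET={{ .AgentSecret }}
      WOODPECKER_MAX_WORKFLOWS={{ .MaxWorkflows }}
      WOODPECKER_GRPC_SECURE={{ .GRPCSecure }}
`

// renderUserData fills the template with the given agent settings.
func renderUserData(env agentEnv) (string, error) {
	tmpl, err := template.New("user-data").Parse(userDataTmpl)
	if err != nil {
		return "", fmt.Errorf("parse template: %w", err)
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, env); err != nil {
		return "", fmt.Errorf("render template: %w", err)
	}
	return buf.String(), nil
}

func main() {
	out, err := renderUserData(agentEnv{
		Server:       "ci.example.com:9000", // placeholder address
		AgentSecret:  "example-token",       // placeholder token
		MaxWorkflows: 2,
		GRPCSecure:   true,
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

A custom template supplied via WOODPECKER_PROVIDER_USERDATA or WOODPECKER_PROVIDER_USERDATA_FILE would take the place of the constant in this sketch.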

Timeweb Cloud API

  • Public API: Yes — https://api.timeweb.cloud.
  • Official Go SDK: github.com/timeweb-cloud/sdk-go (OpenAPI-generated).
  • Authentication: JWT Bearer token (Authorization: Bearer <token>).
  • VDS Lifecycle Endpoints:
    • Create: POST /api/v1/servers
    • Delete: DELETE /api/v1/servers/{server_id}
    • Get: GET /api/v1/servers/{server_id}
    • List: GET /api/v1/servers
    • Start: POST /api/v1/servers/{server_id}/start
    • Shutdown: POST /api/v1/servers/{server_id}/shutdown
    • Clone: POST /api/v1/servers/{server_id}/clone
  • Create Server Parameters:
    • name (required)
    • os_id or image_id
    • preset_id or configuration (CPU, RAM, disk)
    • ssh_keys_ids
    • cloud_init: this is critical for passing user-data.
    • availability_zone
    • hostname
  • Rate Limit: 20 requests per second per endpoint.
  • Tags/Labels: The API does not seem to have a native "label" or "tag" system for servers. We may need to track pool association by server name prefix or by storing state locally. This is an open question.

Implementation Plan

Phase 1: Project Setup

  1. Fork / vendor woodpecker-ci/autoscaler as the base.
  2. Add github.com/timeweb-cloud/sdk-go as a dependency.
  3. Create the provider package: providers/timewebcloud/.

Phase 2: Provider Implementation

  1. Struct & Constructor (provider.go):
    • Fields: API client, config, pool ID, default image/preset/zone.
    • New(ctx context.Context, c *cli.Command, cfg *config.Config) (types.Provider, error).
  2. Flags (flags.go):
    • --timewebcloud-api-token (env: WOODPECKER_TIMEWEBCLOUD_API_TOKEN)
    • --timewebcloud-os-id / --timewebcloud-image-id
    • --timewebcloud-preset-id / --timewebcloud-configuration
    • --timewebcloud-availability-zone
    • --timewebcloud-ssh-key-id
    • --timewebcloud-hostname-prefix
  3. DeployAgent:
    • Generate cloud-init user-data via cloudinit.RenderUserDataTemplate().
    • Call CreateServer with the agent name and user-data.
    • Store the mapping agent.Name -> server_id (in memory or via naming convention).
  4. RemoveAgent:
    • Find server by agent name (list all servers and filter by name, or use a stored mapping).
    • Call DeleteServer.
    • Handle "not found" gracefully.
  5. ListDeployedAgentNames:
    • List all servers.
    • Filter by name prefix (e.g., pool-<pool-id>-agent-).
    • Return matching names.

Phase 3: Integration

  1. Import the provider in main.go.
  2. Add case "timewebcloud": to setupProvider().
  3. Append timewebcloud.ProviderFlags to the global flags.
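
The shape of the registration step can be sketched as a plain switch on the provider name. Everything below uses local stand-in types so that it compiles on its own; the real setupProvider in main.go takes the autoscaler's own arguments and returns its own Provider type, so treat the signature here as hypothetical.

```go
package main

import "fmt"

// provider is a stand-in for the autoscaler's Provider interface.
type provider interface{ Name() string }

type hetznerProvider struct{}

func (hetznerProvider) Name() string { return "hetznercloud" }

type timewebProvider struct{}

func (timewebProvider) Name() string { return "timewebcloud" }

// setupProvider maps the --provider value to a constructor.
// Adding a provider means adding one case here.
func setupProvider(name string) (provider, error) {
	switch name {
	case "hetznercloud":
		return hetznerProvider{}, nil
	case "timewebcloud":
		return timewebProvider{}, nil
	default:
		return nil, fmt.Errorf("unknown provider %q", name)
	}
}

func main() {
	p, err := setupProvider("timewebcloud")
	if err != nil {
		panic(err)
	}
	fmt.Println(p.Name())
}
```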

Phase 4: Testing & Deployment

  1. Build the binary.
  2. Test locally or on a staging VDS:
    • Start the autoscaler with --provider=timewebcloud.
    • Trigger a CI job.
    • Verify VDS creation, agent connection, job execution, and cleanup.
  3. Update Docker Compose / deployment docs.

Key Technical Decisions

1. How to Track Agent-to-Server Mapping?

Options:

  • A. Name Prefix Convention: Name servers as wp-<pool>-<agent-name>. ListDeployedAgentNames filters by prefix. Simple, no state needed.
  • B. In-Memory Map: Store map[string]int (agent name -> server ID) in the provider struct. Lost on restart.
  • C. Local State File: Persist the map to disk. Survives restart.
  • D. API Metadata: If Timeweb API supports tags/labels, use them. (Currently unclear.)

Recommendation: Start with A (name prefix) as the simplest and most robust approach. If Timeweb adds tags later, migrate to D.
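
A minimal sketch of option A follows, assuming the wp-<pool>-<agent-name> layout from the list above. The helper names serverName and agentNamesInPool are hypothetical. Note the trailing hyphen in the prefix: without it, pool 1 would also match servers belonging to pool 12.

```go
package main

import (
	"fmt"
	"strings"
)

// namePrefix marks servers managed by the autoscaler (assumed value).
const namePrefix = "wp-"

// serverName builds the server name for an agent in a pool.
func serverName(pool, agentName string) string {
	return namePrefix + pool + "-" + agentName
}

// agentNamesInPool returns the agent names of all servers that belong to
// the given pool, recovered by stripping the pool prefix from server names.
func agentNamesInPool(serverNames []string, pool string) []string {
	prefix := namePrefix + pool + "-"
	agents := make([]string, 0, len(serverNames))
	for _, name := range serverNames {
		if strings.HasPrefix(name, prefix) {
			agents = append(agents, strings.TrimPrefix(name, prefix))
		}
	}
	return agents
}

func main() {
	names := []string{
		serverName("1", "agent-abc"),
		serverName("12", "agent-def"), // different pool, must not match pool 1
		"database-primary",            // unmanaged server
	}
	fmt.Println(agentNamesInPool(names, "1")) // prints [agent-abc]
}
```

Because the mapping lives entirely in the server name, it survives an autoscaler restart without any local state.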

2. How to Handle Server Readiness?

Question: After CreateServer, the server may take time to boot. Does DeployAgent need to wait?

Answer: No. The autoscaler engine only requires that the VM creation is initiated. The agent will connect when ready. The engine has AgentInactivityTimeout (default 10m) to clean up agents that never connect.

3. OS Image Selection

Question: What base image should be used for the agent VMs?

Answer: Ubuntu 22.04 LTS or Debian 12 (stable, good Docker support). The os_id must be fetched from Timeweb's API (GetOsList). Alternatively, a custom image with Docker pre-installed could speed up boot time.

4. SSH Keys

Question: Are SSH keys needed if we use cloud-init?

Answer: No. Cloud-init handles all provisioning, so SSH keys are optional, but they are useful for debugging. The provider should allow configuring ssh_keys_ids.

Open Questions

  1. Does Timeweb Cloud API support assigning custom tags/labels to servers? (Affects ListDeployedAgentNames implementation.)
  2. What is the typical boot time for a new VDS? (Affects AgentInactivityTimeout tuning.)
  3. Does the cloud_init field in CreateServer accept standard cloud-init YAML? (Needs testing.)
  4. Is there a way to use a custom image (snapshot) to pre-install Docker and reduce boot time?
  5. What are the os_id values for Ubuntu/Debian? (Need to call GetOsList.)
  6. Does Timeweb charge for stopped (but not deleted) servers? (Affects whether we should stop vs. delete.)

References

  • Woodpecker Autoscaler Repo: https://github.com/woodpecker-ci/autoscaler
  • Provider Interface: engine/types/provider.go
  • Hetzner Provider (reference): providers/hetznercloud/
  • Cloud-init Render: engine/inits/cloudinit/cloudinit.go
  • Timeweb Cloud Go SDK: https://github.com/timeweb-cloud/sdk-go
  • Timeweb Cloud API Docs: https://timeweb.cloud/api-docs