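Auto Scaling

What Is Auto Scaling

Auto scaling means the system automatically adjusts the number of container replicas based on real-time load.

Low load: 3 replicas
High load: 10 replicas
After load drops: back to 3 replicas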
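
The behaviour above can be sketched as a small decision function. This is only an illustration, not the algorithm of any particular orchestrator: the function name `next_replicas` and the 70% / 30% load thresholds are assumptions chosen to match the 3-to-10 replica example.

```python
def next_replicas(current: int, load_percent: float,
                  low: int = 3, high: int = 10,
                  scale_up_at: float = 70.0,
                  scale_down_at: float = 30.0) -> int:
    """Pick a replica count from the current load (hypothetical thresholds)."""
    if load_percent >= scale_up_at:
        return high          # high load: scale out
    if load_percent <= scale_down_at:
        return low           # load has dropped: scale back in
    return current           # in between: keep the current count


print(next_replicas(3, 85))   # high load
print(next_replicas(10, 20))  # load has dropped again
```

Keeping a gap between the two thresholds means a load that hovers around a single value does not flip the replica count back and forth.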

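Docker Swarm Auto Scaling

Swarm has no built-in autoscaler; automatic scaling requires an external tool that monitors load and adjusts replica counts.

Manual Scaling

# Scale a single service
docker service scale web=5

# Scale several services at once
docker service scale web=5 api=3 db=1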

Swarm + Prometheus Auto Scaling

# docker-compose.yml
version: '3.8'
services:
  web:
    image: myapp:latest
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
    ports:
      - "80:80"

  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml

  auto-scaler:
    # Placeholder image name; a real scaler must reach the Docker API on a manager node
    image: docker-swarm-auto-scaler
    environment:
      - TARGET_SERVICE=web
      - MIN_REPLICAS=2
      - MAX_REPLICAS=10
      - CPU_THRESHOLD=70
      - MEMORY_THRESHOLD=80
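
What a scaler container like this might do with those settings can be sketched as follows. This is a sketch under stated assumptions, not the behaviour of the `docker-swarm-auto-scaler` image: the `ScalerConfig` and `decide` names, the one-replica step, and the "below half the threshold" scale-down rule are all hypothetical, and querying Prometheus for the CPU and memory figures is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScalerConfig:
    # Mirrors the environment variables in the compose file above
    target_service: str = "web"
    min_replicas: int = 2
    max_replicas: int = 10
    cpu_threshold: float = 70.0
    memory_threshold: float = 80.0


def decide(cfg: ScalerConfig, current: int,
           cpu_percent: float, mem_percent: float) -> int:
    """Return the desired replica count, moving one replica at a time."""
    if cpu_percent > cfg.cpu_threshold or mem_percent > cfg.memory_threshold:
        desired = current + 1
    elif (cpu_percent < cfg.cpu_threshold / 2
          and mem_percent < cfg.memory_threshold / 2):
        # Assumed rule: scale in only when both metrics are well below threshold
        desired = current - 1
    else:
        desired = current
    # Never leave the configured range
    return max(cfg.min_replicas, min(cfg.max_replicas, desired))


def scale_command(cfg: ScalerConfig, replicas: int) -> List[str]:
    """Build the docker CLI call that applies the decision."""
    return ["docker", "service", "scale", f"{cfg.target_service}={replicas}"]


cfg = ScalerConfig()
print(scale_command(cfg, decide(cfg, current=3, cpu_percent=85, mem_percent=40)))
```

The command list can be handed to `subprocess.run` on a manager node; the decision function itself stays free of side effects, which makes it easy to test.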

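Kubernetes Auto Scaling

Kubernetes has autoscaling built in: the Horizontal Pod Autoscaler (HPA) is part of the core API, while the Vertical Pod Autoscaler (VPA) is installed as an add-on.

HPA (Horizontal Pod Autoscaler)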

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
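
The replica count the HPA aims for follows the documented rule desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), with the largest result winning when several metrics are configured. The sketch below reproduces that rule for the CPU and memory targets above; it is a simplification that ignores not-ready pods and missing metrics, and the function names are illustrative only.

```python
import math


def desired_for_metric(current_replicas: int, current_value: float,
                       target_value: float, tolerance: float = 0.1) -> int:
    """Apply ceil(current * ratio), skipping changes within the tolerance band."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no change
    return math.ceil(current_replicas * ratio)


def hpa_desired(current_replicas: int, metrics,
                min_replicas: int = 3, max_replicas: int = 20) -> int:
    """metrics is a list of (current_utilization, target_utilization) pairs."""
    proposals = [desired_for_metric(current_replicas, cur, tgt)
                 for cur, tgt in metrics]
    # The largest proposal wins, then minReplicas/maxReplicas clamp it
    return max(min_replicas, min(max_replicas, max(proposals)))


# CPU at 140% of requests (target 70), memory at 40% (target 80)
print(hpa_desired(3, [(140, 70), (40, 80)]))
```

Taking the maximum across metrics means a single saturated resource is enough to trigger a scale-out, even if the other metric would suggest scaling in.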

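HPA Based on Custom Metrics

Custom metrics such as the ones below are only available if a metrics adapter (for example Prometheus Adapter) serves them through the custom metrics API.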

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-custom-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 1000
  - type: Object
    object:
      metric:
        name: requests_per_second
      describedObject:
        apiVersion: v1
        kind: Service
        name: web
      target:
        type: Value
        value: 10k
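
The targets `1000` and `10k` are Kubernetes quantities, and the two metric types are evaluated differently: a Pods metric compares the per-pod average with `averageValue`, while an Object metric compares a single observed value with `value`. The sketch below shows both calculations under simplifying assumptions: `parse_quantity` handles only a handful of suffixes, the tolerance band is ignored, and all function names are hypothetical.

```python
import math
from fractions import Fraction

# Only a subset of the quantity suffixes, enough for this illustration
_SUFFIXES = {
    "Ki": Fraction(2**10), "Mi": Fraction(2**20), "Gi": Fraction(2**30),
    "m": Fraction(1, 1000), "k": Fraction(10**3),
    "M": Fraction(10**6), "G": Fraction(10**9),
}


def parse_quantity(text) -> Fraction:
    """Convert a quantity such as '10k' or '100m' into an exact number."""
    text = str(text)
    # Try two-character suffixes first so 'Mi' is not mistaken for 'M'
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Fraction(text[:-len(suffix)]) * _SUFFIXES[suffix]
    return Fraction(text)


def replicas_for_pods_metric(per_pod_values, target_average) -> int:
    """Pods metric: enough replicas that the per-pod average meets the target."""
    total = sum(Fraction(v) for v in per_pod_values)
    return math.ceil(total / parse_quantity(target_average))


def replicas_for_object_metric(current_replicas: int, observed, target_value) -> int:
    """Object metric: scale the replica count by observed / target."""
    ratio = parse_quantity(observed) / parse_quantity(target_value)
    return math.ceil(current_replicas * ratio)


print(replicas_for_pods_metric([1500, 1500, 1500], 1000))
print(replicas_for_object_metric(3, "15k", "10k"))
```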

VPA (Vertical Pod Autoscaler)

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  updatePolicy:
    updateMode: Auto
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2
        memory: 4Gi
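
The `minAllowed` and `maxAllowed` bounds cap whatever the VPA recommends. The sketch below shows only that capping step for CPU, using the bounds from the manifest above; how the real recommender derives its value from usage history is not modelled, and the helper names are hypothetical.

```python
from fractions import Fraction


def cpu_to_millicores(value) -> int:
    """Convert a CPU quantity such as '100m', '0.5' or 2 into millicores."""
    text = str(value)
    if text.endswith("m"):
        return int(text[:-1])
    return int(Fraction(text) * 1000)


def clamp_cpu(recommended, min_allowed="100m", max_allowed="2") -> str:
    """Keep a CPU recommendation inside the minAllowed/maxAllowed range."""
    lo = cpu_to_millicores(min_allowed)
    hi = cpu_to_millicores(max_allowed)
    milli = max(lo, min(hi, cpu_to_millicores(recommended)))
    return f"{milli}m"


print(clamp_cpu("50m"))   # raised to the lower bound
print(clamp_cpu("3"))     # lowered to the upper bound
print(clamp_cpu("750m"))  # already inside the range
```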

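Auto Scaling at the Docker Level

Based on docker events

# Watch for container OOM events
docker events --filter 'event=oom'

# Scale the service up by one replica on each OOM event
docker events --filter 'event=oom' --format '{{json .}}' | while read -r event; do
  # Read the desired replica count from the service spec
  current=$(docker service inspect web --format '{{.Spec.Mode.Replicated.Replicas}}')
  docker service scale web=$((current + 1))
done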
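
An alternative to matching text in the shell is to parse each JSON event and check its fields, which avoids false positives from names that merely contain the letters "oom". The sketch below assumes the events are read line by line from `docker events --format '{{json .}}'` and that each event carries `Type` and `Action` fields; treat those field names as assumptions to verify against your Docker version.

```python
import json


def is_oom_event(line: str) -> bool:
    """Return True only for a container event whose action is 'oom'."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return False  # ignore anything that is not valid JSON
    if not isinstance(event, dict):
        return False
    return event.get("Type") == "container" and event.get("Action") == "oom"


def count_ooms(lines) -> int:
    """Count OOM events in an iterable of JSON lines."""
    return sum(1 for line in lines if is_oom_event(line))


sample = [
    '{"Type": "container", "Action": "oom", "Actor": {"ID": "abc"}}',
    '{"Type": "container", "Action": "start", "Actor": {"Attributes": {"name": "room"}}}',
    'not json',
]
print(count_ooms(sample))
```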

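Scaling Strategy

Choosing Metrics

Metric	Use case	Notes
CPU	Compute-intensive workloads	Most common, but does not always reflect real load
Memory	Memory-intensive workloads	Scales out under memory pressure; a memory leak will also trigger it
Request rate	Web applications	A more accurate indicator of business load
Queue length	Asynchronous processing	Scale out when messages back up
Custom metrics	Specific scenarios	For example, database connection count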
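
As an example of a business metric, queue-length scaling can be reduced to one division: how many workers are needed so that each handles at most a chosen backlog. The sketch below is a minimal illustration; the function name, the per-worker backlog of 100 messages, and the replica limits are all assumptions.

```python
def replicas_for_queue(queue_length: int, per_worker_backlog: int,
                       min_replicas: int = 1, max_replicas: int = 20) -> int:
    """Workers needed so each one handles at most per_worker_backlog messages."""
    if per_worker_backlog <= 0:
        raise ValueError("per_worker_backlog must be positive")
    # Integer ceiling division avoids floating-point rounding
    needed = (queue_length + per_worker_backlog - 1) // per_worker_backlog
    return max(min_replicas, min(max_replicas, needed))


print(replicas_for_queue(950, 100))   # backlog building up
print(replicas_for_queue(0, 100))     # empty queue falls back to the minimum
```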

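Cooldown

Scale-up cooldown: avoids scaling up too often
Scale-down cooldown: avoids flapping back and forth (thrashing)

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300  # act on the highest recommendation from the last 5 minutes
    policies:
    - type: Percent
      value: 10          # remove at most 10% of the current replicas
      periodSeconds: 60  # within any 60-second period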
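
The two mechanisms in this snippet can be illustrated separately: the stabilization window picks the highest recommendation seen recently, and the Percent policy limits how far one step may go below the current count. The sketch below is a simplification with hypothetical function names, and the rounding direction in `limit_scale_down` is an assumption rather than a statement about the controller's exact arithmetic.

```python
def stabilized_recommendation(recent_recommendations) -> int:
    """Scale-down stabilization: use the highest recommendation in the window."""
    return max(recent_recommendations)


def limit_scale_down(current: int, desired: int, percent: int = 10) -> int:
    """Percent policy: do not drop below (100 - percent)% of current replicas."""
    if desired >= current:
        return desired  # the policy only restricts scale-down
    # Integer ceiling of current * (100 - percent) / 100
    floor_allowed = -(-(current * (100 - percent)) // 100)
    return max(desired, floor_allowed)


# Recommendations collected over the last 5 minutes
target = stabilized_recommendation([8, 12, 6])
print(target)                       # the brief dip to 6 is ignored
print(limit_scale_down(20, 5))      # one period removes at most 10% of 20
```

With both in place, a short dip in load changes nothing, and a sustained drop shrinks the deployment gradually over several periods.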

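Interview Key Points

The Kubernetes HPA is the most commonly used autoscaling mechanism in production
Do not scale on CPU and memory alone; business metrics are more accurate
The cooldown (stabilization window) prevents "thrashing"
Horizontal scaling (HPA) is usually preferred over vertical scaling (VPA) because it is more flexible
Docker Swarm needs an additional tool to scale automatically

Interviewers often ask: What scaling strategy do you use? How do you prevent the thrashing caused by frequent scaling?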

