Troubleshooting Docker Client Errors

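Common client error types

The Docker client can fail in many ways when it talks to the daemon. This article collects the most common errors and how to fix them.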

1. Cannot connect to Docker daemon

Symptom

Cannot connect to the Docker daemon at unix:///var/run/docker.sock. 
Is the docker daemon running?
ERROR: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/json": dial unix /var/run/docker.sock: connect: connection refused

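Diagnosis

# Check whether the daemon is running
systemctl status docker

# Check that the socket exists and inspect its owner and mode
ls -la /var/run/docker.sock
# srw-rw---- 1 root docker 0 ... /var/run/docker.sock

# Check whether the current user is in the docker group
groups
# If not, add the user (takes effect after logging out and back in)
sudo usermod -aG docker $USER
# Confirm the daemon itself works by going through sudo
sudo docker ps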
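
The socket checks above can be folded into one small helper. This is only a sketch: the function name classify_socket and its four result labels are made up for illustration, and it only inspects the file, so it cannot tell whether a daemon is actually listening behind it.

```shell
# classify_socket [PATH]: print why a client could or could not open PATH.
# The name and the result labels are hypothetical, not part of Docker.
classify_socket() {
  path=${1:-/var/run/docker.sock}
  if [ ! -e "$path" ]; then
    echo "missing"        # daemon not started, or it listens somewhere else
  elif [ ! -S "$path" ]; then
    echo "not-a-socket"   # something other than a socket sits at that path
  elif [ ! -r "$path" ] || [ ! -w "$path" ]; then
    echo "no-permission"  # socket exists but this user cannot open it
  else
    echo "accessible"     # file is fine; next check that dockerd is alive
  fi
}
```

Running classify_socket with no argument checks the default path. An "accessible" result combined with "connection refused" suggests the socket file is stale and the daemon is not running.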

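Fix

# Start the daemon
sudo systemctl start docker

# Or point the client at another host (works only if the daemon listens on TCP;
# port 2375 is unencrypted and unauthenticated)
docker -H tcp://127.0.0.1:2375 ps

# Or set the host through the environment
export DOCKER_HOST=tcp://127.0.0.1:2375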

2. Permission denied

Symptom

permission denied while trying to connect to the Docker daemon socket
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock

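Fix

# Option 1: use sudo
sudo docker ps

# Option 2: add the user to the docker group (security risk: members are effectively root)
sudo usermod -aG docker $USER
# Then log out and back in

# Option 3: loosen the socket permissions (temporary: the mode is reset when the
# socket is recreated; gives every local user root-equivalent access)
sudo chmod 666 /var/run/docker.sock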
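
A frequent source of confusion with option 2 is that the group change does not reach shells that are already open. A minimal sketch for checking membership follows; the helper name has_group is hypothetical.

```shell
# has_group LIST NAME: succeed if NAME is one of the space-separated groups in LIST.
# Hypothetical helper; it compares whole words, so "dockerx" does not match "docker".
has_group() {
  for g in $1; do
    [ "$g" = "$2" ] && return 0
  done
  return 1
}
```

For example, has_group "$(id -nG)" docker tests the groups of the current shell process, while has_group "$(id -nG "$USER")" docker reads the group database. If the second succeeds and the first fails, the user has been added but has not logged in again yet.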

3. Image not found / Pull access denied

Symptom

Unable to find image '<name>:latest' locally
docker: Error response from daemon: pull access denied for <name>, repository does not exist or may require 'docker login': denied.

Fix

# Log in to Docker Hub
docker login

# Check that the image name is spelled correctly
docker search <name>

# For a private registry
docker login registry.example.com
docker pull registry.example.com/myapp:latest
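
Many "repository does not exist" errors come from the client expanding a short name differently than expected. The sketch below approximates that expansion. It is a simplification, not Docker's actual parser: the function name normalize_image is hypothetical, and digests (@sha256:...) and uppercase host names are not handled.

```shell
# normalize_image REF: print an approximation of the fully qualified name
# that a short image reference expands to. Hypothetical helper.
normalize_image() {
  ref=$1
  first=${ref%%/*}
  case $ref in
    */*)
      # The first component is a registry host only if it looks like one.
      case $first in
        *.*|*:*|localhost) registry=$first; rest=${ref#*/} ;;
        *) registry=docker.io; rest=$ref ;;
      esac ;;
    *)
      # A bare name is an official image on Docker Hub.
      registry=docker.io; rest=library/$ref ;;
  esac
  # A tag is a ':' in the last path component; default to latest.
  last=${rest##*/}
  case $last in
    *:*) ;;
    *) rest=$rest:latest ;;
  esac
  printf '%s/%s\n' "$registry" "$rest"
}
```

Under these assumptions, myapp expands to docker.io/library/myapp:latest, which explains why pulling a private image without its registry prefix ends in "pull access denied".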

4. Port already allocated

Symptom

docker: Error response from daemon: driver failed programming external connectivity on endpoint <name>:
Error starting userland proxy: listen tcp4 0.0.0.0:8080: bind: address already in use

Fix

# Find the process holding the port
lsof -i :8080
netstat -tulpn | grep 8080

# Stop the container that holds the port
docker stop <container>

# Or publish on a different host port
docker run -p 8081:80 ...
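
When the port is held by another container rather than a host process, it helps to map the port back to a container name. A minimal sketch, assuming input lines of the form name, tab, port list; the function name containers_on_port is hypothetical.

```shell
# containers_on_port PORT: read "name<TAB>ports" lines on stdin and print the
# names whose port list publishes host port PORT. Hypothetical helper.
containers_on_port() {
  # Match ":PORT->" literally so that 8080 does not also match 18080.
  awk -F '\t' -v needle=":$1->" 'index($2, needle) { print $1 }'
}
```

It is meant to be fed from docker ps --format '{{.Names}}\t{{.Ports}}' | containers_on_port 8080. The built-in docker ps --filter publish=8080 should give a similar answer without any parsing.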

5. Context deadline exceeded

Symptom

docker: Error response from daemon: context deadline exceeded

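Fix

# Usually the daemon took too long to respond
# Check the daemon status
systemctl status docker

# daemon.json has no general response-timeout key. If parallel pulls are
# overloading the daemon, cap concurrent layer downloads in /etc/docker/daemon.json
# (3 is the default; use a lower value on a slow link or a busy host)
{
  "max-concurrent-downloads": 3
}

# Restart the daemon
sudo systemctl restart docker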
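
To tell a slow daemon from a dead one, it can be useful to probe it with a hard time limit. A small sketch, assuming the coreutils timeout command is available; the function name responds_within is hypothetical.

```shell
# responds_within SECONDS CMD...: succeed only if CMD exits successfully
# before SECONDS elapse. Relies on coreutils timeout. Hypothetical helper.
responds_within() {
  secs=$1
  shift
  timeout "$secs" "$@" >/dev/null 2>&1
}
```

A check such as responds_within 5 docker info || echo "daemon is slow or down" can run before and after the restart to see whether it made a difference.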

6. Docker CLI version incompatibility

Symptom

Error response from daemon: client version 1.40 is too new. Maximum supported API version is 1.39
client version 1.24 is too old

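Fix

# Show client and server versions
docker version

# Pin the client to an API version the daemon supports
export DOCKER_API_VERSION=1.39
docker ps

# Upgrade Docker (Debian/Ubuntu)
sudo apt-get update && sudo apt-get install --only-upgrade docker-ce docker-ce-cli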
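
The value to pin is already in the "too new" error message, so it can be extracted instead of guessed. A minimal sketch that assumes the message wording shown in the symptom above; the function name max_supported_api is hypothetical.

```shell
# max_supported_api: read an error message on stdin and print the API version
# from "Maximum supported API version is X.Y", if present. Hypothetical helper.
max_supported_api() {
  sed -n 's/.*Maximum supported API version is \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}
```

One possible use is export DOCKER_API_VERSION=$(docker ps 2>&1 | max_supported_api). The output is empty when the error does not occur, so check it before exporting.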

7. Disk space errors

Symptom

write /var/lib/docker/tmp/GetImageBlob: no space left on device
Error processing tar file: write /var/lib/docker/overlay2/...: no space left on device

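Fix

# Clean up right away (destructive: removes all stopped containers, unused
# networks, build cache, and every image not used by a container)
docker system prune -a -f
# Removes unused volumes together with the data in them
docker volume prune -f

# Check disk space
df -h
sudo du -sh /var/lib/docker/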
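
Since pruning is destructive, it is worth measuring before deleting anything. A minimal sketch that reads the usage of the filesystem holding a path; the function name usage_percent and the 90 percent threshold in the usage note are assumptions.

```shell
# usage_percent PATH: print how full the filesystem holding PATH is, as an
# integer percentage. Hypothetical helper.
usage_percent() {
  # -P forces the one-line-per-filesystem POSIX format; column 5 is capacity.
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

For instance, [ "$(usage_percent /var/lib/docker)" -ge 90 ] && docker system df shows what Docker itself is using only when the disk is nearly full, which helps decide whether pruning images, containers, or volumes would free the most space.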

8. TLS/certificate errors

Symptom

Error response from daemon: Client sent an HTTP request to an HTTPS server.
x509: certificate is valid for ..., not ...

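Fix

# The first error means the client speaks plain HTTP to a TLS-enabled daemon.
# Enable TLS on the client (TLS daemons conventionally listen on port 2376)
export DOCKER_TLS_VERIFY=1
export DOCKER_CERT_PATH=/path/to/certs
# For the x509 error, connect using a host name that the certificate lists

# Or, if the daemon really serves plain HTTP
export DOCKER_HOST=tcp://host:2375
# Any non-empty value, including 0, turns TLS verification on, so unset it
unset DOCKER_TLS_VERIFY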
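
The Docker CLI checks only whether DOCKER_TLS_VERIFY is set to a non-empty string, not what the string says, so DOCKER_TLS_VERIFY=0 still enables TLS. The sketch below mirrors that rule. The function name tls_mode and its labels are hypothetical, and it ignores command-line flags and contexts, which can also enable TLS.

```shell
# tls_mode VALUE: report how a given DOCKER_TLS_VERIFY value is interpreted,
# considering the environment variable alone. Hypothetical helper.
tls_mode() {
  if [ -n "${1-}" ]; then
    echo "tls-verify"   # any non-empty value, even "0" or "false"
  else
    echo "plain"        # unset or empty
  fi
}
```

Calling tls_mode "${DOCKER_TLS_VERIFY-}" in the current shell shows which mode the environment selects.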

Client configuration

// ~/.docker/config.json
{
  "auths": {
    "https://index.docker.io/v1/": {
      "auth": "base64_encoded_credentials"
    }
  },
  "currentContext": "default",
  "cliPluginsExtraDirs": ["/usr/local/lib/docker/cli-plugins"]
}
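
The auth field is only base64 of user:password, not encryption, which makes it easy to check which account a login stored. A minimal sketch, assuming GNU base64 (the -d flag); the function name auth_username is hypothetical.

```shell
# auth_username ENCODED: decode an "auth" value from config.json and print
# the user name part (everything before the first colon). Hypothetical helper.
auth_username() {
  printf '%s' "$1" | base64 -d | cut -d: -f1
}
```

With jq installed, auth_username "$(jq -r '.auths["https://index.docker.io/v1/"].auth' ~/.docker/config.json)" prints the stored Docker Hub user. When a credential store is configured, the auth field is absent and the secret lives outside this file.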

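Debugging tips

# Run the daemon with debug logging
dockerd --debug

# Show debug output for a client command
docker -D ps

# Query the daemon API directly with curl
curl --unix-socket /var/run/docker.sock http://localhost/containers/json | jq

# Show daemon-wide information (version, storage driver, network plugins, ...)
docker system info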

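Interview takeaways

  1. The most common client errors are socket permission problems and a daemon that is not running.
  2. Troubleshooting path for Cannot connect: daemon status → socket file → user group → DOCKER_HOST
  3. For Permission denied, prefer sudo over changing the socket permissions.
  4. context deadline exceeded usually means the daemon is overloaded.
  5. Use the -D option to turn on debug logging.

A common interview question: docker ps returns a Cannot connect error. How do you troubleshoot it?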
