Docker Network not Found

In our team, we are currently transitioning to Docker to deploy everything on our server.

We are using Docker Swarm and multiple (10+) compose files defining plenty (20+) of services. Everything works beautifully so far, except when we take down our stack using docker stack rm <name> (and redeploy using docker stack deploy <options> <name>): about every second time, we get the following error:

Failed to remove network <id>: Error response from daemon: network <id> not found
Failed to remove some resources from stack: <name>

docker network ls shows that the network has indeed not been removed. However, docker network rm <id> always results in the following:

Error response from daemon: network <id> not found

What makes that even more strange is the fact that docker network inspect <id> returns a normal output. The networks are always overlay networks that are created with the compose files used to deploy our stack. Currently, we only have a single node in our Swarm.

Our current "workaround" is to restart Docker (which resolves the issue), but that is not a viable solution in a production environment. Leaving the swarm and joining it again does not resolve the issue either.

At first, we thought that this issue was related to Docker for Mac only (as we first encountered it on local machines). However, the same issue arises on Debian Stretch. In both cases, we use the latest Docker distribution available.

I would really appreciate any help!


That sounds exactly like this issue.

Stack rm followed "too fast" by stack deploy would race for the creation/removal of networks, possibly other stack resources.

The issue is still open as of today (docker/cli), but you could try the workaround suggested:

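# Wait until the stack's services are gone, giving up after a bounded number of tries.
# The starting value of 15 is an assumption; adjust it to your environment.
limit=15
until [ -z "$(docker service ls --filter "label=com.docker.stack.namespace=$COMPOSE_PROJECT_NAME" -q)" ] || [ "$limit" -lt 0 ]; do
  sleep 1
  limit=$((limit - 1))
done

# Then wait the same way until the stack's networks are gone.
limit=15
until [ -z "$(docker network ls --filter "label=com.docker.stack.namespace=$COMPOSE_PROJECT_NAME" -q)" ] || [ "$limit" -lt 0 ]; do
  sleep 1
  limit=$((limit - 1))
done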
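The two waiting loops can also be folded into one reusable function. This is only a minimal sketch: the function name wait_for_stack_cleanup, its argument order, and the default of 30 attempts are assumptions made for illustration and are not part of the workaround quoted above.

```shell
#!/bin/sh
# Hypothetical helper (name and defaults are assumptions):
# poll until no resource of the given kind still carries the stack's
# namespace label. Returns 0 once the list is empty, 1 if attempts run out.
wait_for_stack_cleanup() {
    kind=$1              # "service" or "network"
    stack=$2             # stack name as passed to `docker stack deploy`
    attempts=${3:-30}    # assumed default: 30 polls
    delay=${4:-1}        # assumed default: 1 second between polls

    while [ "$attempts" -gt 0 ]; do
        remaining=$(docker "$kind" ls --filter "label=com.docker.stack.namespace=$stack" -q)
        if [ -z "$remaining" ]; then
            return 0
        fi
        sleep "$delay"
        attempts=$((attempts - 1))
    done
    return 1
}

# Possible usage (stack and file names are placeholders):
#   docker stack rm mystack
#   wait_for_stack_cleanup service mystack &&
#   wait_for_stack_cleanup network mystack &&
#   docker stack deploy -c docker-compose.yml mystack
```

Returning a non-zero status on timeout lets the redeploy step be skipped instead of racing against a removal that is still in progress.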

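You can always use docker system prune -a to get rid of the old network. Without the --volumes flag this will not delete your volumes, but it does remove all stopped containers, unused networks, and unused images.
Because the images are gone, the next docker-compose up --build -d will take longer, but it will get you past your current problem.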

If you are attempting to add a container to an existing network that no longer exists, then you can use docker-compose up --force-recreate. I found this GitHub issues comment to be a helpful overview.

Old containers are still using the old network. You probably removed the networks but forgot to remove the old containers. Just remove the old containers, create your network, and build again.

After using the docker prune command, I was unable to launch the docker container on a network.

The following errors were reported:

ERROR: for jekyll-serve Cannot start service jekyll-serve: network b52287167caf352c7a03c4e924aaf7d78e2bc372c703560c003acc758c013432 not found
ERROR: Encountered errors while bringing up the project.

docker system prune enabled me to begin using docker-compose up again.

More info here: https://docs.docker.com/config/pruning/

Deleting a network that reports "network not found" in Docker

Inspect the network which we are unable to delete

docker network inspect <id> or <name>

Force-disconnect every endpoint that is still attached to the network

docker network disconnect -f <networkID> <endpointName> or <endpointId>

Delete unused networks

docker network prune
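The three steps above could be scripted roughly as follows. This is a sketch under assumptions: the function name detach_endpoints_and_prune is made up for illustration, and it assumes that the inspect output lists the attached endpoints under .Containers, which may not hold for every network type.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): force-disconnect every
# endpoint that `docker network inspect` reports for the network,
# then prune unused networks.
detach_endpoints_and_prune() {
    net=$1   # network ID or name

    # Print one attached endpoint name per line.
    docker network inspect -f '{{range .Containers}}{{println .Name}}{{end}}' "$net" |
    while read -r endpoint; do
        # Skip blank lines in the template output.
        [ -n "$endpoint" ] || continue
        docker network disconnect -f "$net" "$endpoint" < /dev/null
    done

    # Note: this removes ALL unused networks, not only "$net".
    # The -f flag skips the confirmation prompt.
    docker network prune -f
}

# Possible usage (network name is a placeholder):
#   detach_endpoints_and_prune mystack_default
```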

I could not get rid of the networks by any of the methods in previous answers.

This is what worked for me.

systemctl restart docker

This is my experience, and I think it might help. Docker networks support bridging. In short, a container can disconnect from one network and connect to another. If a container disconnects from its current network and connects to another one, and the current network then disappears because of a shutdown or a network prune, the container loses its connection. Later, when you try to start it, you only get the "network not found" error.

The solution is to start the swarm/cluster (in my case with docker-compose up) and force-disconnect the container from that network with -f, whether or not the container is running. Then connect it back to the network, which now has the same name but a different ID. After that, the container starts without the "network not found" error. The point is that you can end up with a network that has the same name but a different ID.
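As a rough sketch of that sequence, assuming the network has already been recreated under the same name (for example by docker-compose up) and that it allows containers to be attached manually. The function name reattach_to_network is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): drop the container's stale
# attachment, attach it to the recreated network of the same name, and
# start it. Stops at the first step that fails.
reattach_to_network() {
    net=$1         # network name (the recreated network has a new ID)
    container=$2   # container name or ID

    docker network disconnect -f "$net" "$container" &&
    docker network connect "$net" "$container" &&
    docker start "$container"
}

# Possible usage (names are placeholders):
#   reattach_to_network myproject_default myproject_web_1
```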