🛠️ Dockerfile From Image (dfimage)
📑 Table of Contents
- 🎯 Purpose
- ⚡ Usage
- 🧪 Test
- 🐋 Docker Example
- 🔍 How Does It Work
- ⚠️ Limitations
- 🐛 Bugs
- 📝 License
- 💰 Donate
🎯 Purpose
Reverse-engineers a Dockerfile from a Docker image.
See my Inspiration and Container Source for more information.
Similar to how the docker history command works, the Python script re-creates (approximately) the Dockerfile that was used to build an image, using the metadata that Docker stores alongside each image layer.
⚡ Usage
The Python script is itself packaged as a Docker image, so it can be run with the docker run command:
docker run -v /var/run/docker.sock:/var/run/docker.sock dfimage ruby:latest
The ruby:latest parameter is the image name & tag (either the truncated form or the complete image name & tag).
Since the script queries the Docker API for the metadata of the various image layers, it needs access to the Docker API socket. The -v flag shown above makes the Docker socket available inside the container running the script.
Note that the script only works against images that exist in your local image repository (the ones listed when you run docker images). If you want to generate a Dockerfile for an image that doesn’t exist in your local repo, you’ll first need to docker pull it.
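The check described above can be sketched in Python. This is a minimal illustration rather than part of dfimage itself: the helper names image_is_local and ensure_local are hypothetical, and the sketch assumes the docker CLI is on your PATH and relies on docker image inspect exiting with a non-zero status when the image is not present locally.

```python
import subprocess


def image_is_local(image, run=subprocess.run):
    """Return True if `docker image inspect` succeeds for the image.

    `run` is injectable (hypothetical design choice) so the logic can be
    exercised without a Docker daemon.
    """
    result = run(
        ["docker", "image", "inspect", image],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0


def ensure_local(image, run=subprocess.run):
    """Pull the image only when it is missing from the local repository."""
    if not image_is_local(image, run=run):
        run(["docker", "pull", image], check=True)
```

Calling ensure_local("ruby:latest") before running dfimage would pull the image only when it is missing.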
You can find more usage examples in the Wiki.
🐍 Alternative: pip install dfimage
To install the dfimage package with pipx, run the following command:
pipx install dfimage
Note that Docker must be installed on your system for the dfimage package to work correctly.
🧪 Test
dfimage --help
🐋 Docker Example
Here’s an example that shows the official Docker ruby image being pulled and the Dockerfile for that image being generated. Note: the image must be referenced with a tag (for example ruby:latest) for the script to work correctly.
docker pull ruby:latest
docker pull ghcr.io/laniksj/dfimage
alias dfimage="docker run -v /var/run/docker.sock:/var/run/docker.sock --rm ghcr.io/laniksj/dfimage"
dfimage ruby:latest
🔍 How Does It Work
When an image is constructed from a Dockerfile, each instruction in the Dockerfile results in a new layer. You can see all of the image layers by using the docker images command with the (now deprecated) --tree flag.
docker images --tree
Each one of these layers is the result of executing an instruction in a Dockerfile. In fact, if you do a docker inspect on any one of these layers you can see the instruction that was used to generate that layer.
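As a rough illustration of turning that stored instruction back into a Dockerfile line, here is a small Python sketch. It is not the dfimage implementation. It assumes the classic (pre-BuildKit) metadata format, in which non-executing instructions are recorded with a /bin/sh -c #(nop) prefix and RUN instructions with a plain /bin/sh -c prefix; the function name to_directive is hypothetical.

```python
# Assumed prefixes from the classic builder's layer metadata.
NOP_PREFIX = "/bin/sh -c #(nop) "
SHELL_PREFIX = "/bin/sh -c "


def to_directive(created_by):
    """Convert a layer's recorded command into a Dockerfile-style line."""
    if created_by.startswith(NOP_PREFIX):
        # Metadata-only instruction (CMD, ENV, EXPOSE, ...): drop the prefix.
        return created_by[len(NOP_PREFIX):].strip()
    if created_by.startswith(SHELL_PREFIX):
        # A command executed in a shell corresponds to a RUN instruction.
        return "RUN " + created_by[len(SHELL_PREFIX):].strip()
    # Any other format is passed through unchanged (whitespace trimmed).
    return created_by.strip()
```

For example, to_directive('/bin/sh -c #(nop)  CMD ["irb"]') yields CMD ["irb"], and to_directive("/bin/sh -c apt-get update") yields RUN apt-get update.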
⚠️ Limitations
As the Python script walks the list of layers contained in the image, it stops when it reaches the first tagged layer. It is assumed that a layer which has been tagged represents a distinct image with its own Dockerfile, so the script will output a FROM directive with the tag name.
In the example above, the ruby image contained a layer in the local image repository which had been tagged with buildpack-deps (though it wasn’t shown in the example, this likely means that buildpack-deps:latest was also pulled at some point). If the buildpack-deps layer had not been tagged, the Python script would have continued outputting Dockerfile directives until it reached the root layer.
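The stopping rule can be sketched as follows. This is a simplified model, not the script's actual code: the layer dictionaries with "instruction" and "tags" keys are a hypothetical structure, the list is assumed to be ordered newest first, and the top layer is assumed to be exempt from the rule because it carries the tag of the image being inspected.

```python
def rebuild(layers):
    """Walk layers newest-first and return Dockerfile lines oldest-first.

    Each layer is a dict with an 'instruction' string and a 'tags' list
    (hypothetical shape used only for this sketch).
    """
    lines = []
    for index, layer in enumerate(layers):
        # A tagged layer below the top is treated as a separate base image.
        if index > 0 and layer["tags"]:
            lines.append("FROM " + layer["tags"][0])
            break
        lines.append(layer["instruction"])
    # Layers were visited newest-first; a Dockerfile reads oldest-first.
    lines.reverse()
    return lines
```

If no lower layer is tagged, the loop runs to the root layer and every instruction is emitted, which matches the behavior described above.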
Also note that the output generated by the script won’t exactly match the original Dockerfile if either the COPY or ADD directives (like the example above) are used. Since we no longer have access to the build context that was present when the original docker build command was executed, all we can see is that some directory or file was copied to the image’s filesystem (you’ll see the file/directory checksum and the destination it was copied to).
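To make the checksum-and-destination form concrete, the following Python sketch splits such a directive into its parts. It assumes the recorded text looks like COPY file:<checksum> in <destination> (with dir: or multi: as alternative source kinds), which is how the classic builder typically records it; the CopiedPath type and parse_copy function are hypothetical names for illustration.

```python
import re
from typing import NamedTuple, Optional


class CopiedPath(NamedTuple):
    directive: str    # "ADD" or "COPY"
    kind: str         # "file", "dir", or "multi"
    checksum: str     # hex checksum recorded in place of the source path
    destination: str  # path inside the image


# Assumed layout: <ADD|COPY> <kind>:<hex checksum> in <destination>
_PATTERN = re.compile(
    r"^(ADD|COPY)\s+(file|dir|multi):([0-9a-f]+)\s+in\s+(.+?)\s*$"
)


def parse_copy(directive: str) -> Optional[CopiedPath]:
    """Return the parts of an ADD/COPY line, or None if it does not match."""
    match = _PATTERN.match(directive.strip())
    if match is None:
        return None
    return CopiedPath(*match.groups())
```

The original source path is not recoverable from this metadata, so the checksum is the only hint about what was copied.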
🐛 Bugs
Please report any bugs or issues you find. Thanks!