Managing Data in Containers

We have code and its environment (e.g. Node dependencies) on our local machine. We write a Dockerfile whose instructions tell Docker how to build an image. Images are read-only and capture a snapshot of the code as it was when the image was built. To actually run that code we need containers. Always remember that a container does not copy the code from the image it was created from; it adds an extra layer on top of the image.



  1. Once we create images and containers, they are isolated from the local machine

  2. Images are read-only - we can't rewrite the code in an image

  3. Containers add a read-write layer on top of the image

  4. But wait! Where does the data written to that layer come from?


The answer: from the running code itself. Take an application with a login form and a registration form. When we register on the website, the details we enter are sent to a database and saved. The application may then generate a custom profile URL, as Facebook does after registration.


In essence,

  • Read-only data: code and environment --> stored in images

  • Temporary data, like the browser history stack in React: held in memory or in the container's read-write layer --> in-memory data is gone once the container stops, and the read-write layer is deleted when the container is removed

  • Permanent data, like the account details from a registration --> needs a way to be stored permanently

The problem arises with permanent data. So far we have only seen how temporary data is stored, which happens by default when a container is created. For permanent data we need another tool that keeps the data from being lost.


This tool has to connect the container to the local machine (the host), because storage on the host outlives any single container. In Docker, these tools are called Volumes and Bind Mounts.


A volume is a folder on the host machine's hard drive that is mapped to a folder inside a Docker container. Because it lives on the host machine, it is permanent, and the mapping lets data flow between the host folder and the container.

The problem we encountered is called Data Persistence. Some data needs to remain available even after the container is removed. This is achieved by connecting a folder in the container to a folder on the local machine.


When you write the Dockerfile for an image, you can add the instruction VOLUME ["folder inside the container"]. Docker then connects that container folder to a folder on the local machine whose location is known only to Docker. Such volumes are therefore managed fully by Docker and are not meant to be touched directly by other processes on the local machine.
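
As a rough sketch of where the VOLUME instruction sits, the script below writes a small Dockerfile into a scratch folder. The folder name volume-demo, the node base image, the /app/feedback path, and server.js are all assumptions made for illustration; they do not come from the text above.

```shell
# Write an illustrative Dockerfile into a scratch folder.
# Every name and path here is an assumption for the example.
mkdir -p volume-demo
cat > volume-demo/Dockerfile <<'EOF'
FROM node
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .
# Anything the app writes to /app/feedback goes into a Docker-managed volume
VOLUME ["/app/feedback"]
CMD ["node", "server.js"]
EOF
```

Copied next to a real project's package.json and source files, such a file could be built with docker build -t feedback-node . (the tag feedback-node is again just an example).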


When you run a container based on an image that has a VOLUME instruction, Docker creates a volume and gives it a random name. This is called an Anonymous Volume. Because Docker creates it automatically for one specific container, an anonymous volume is tied to that container: if the container was started with the --rm flag, the volume is removed automatically together with the container. Without --rm, the volume is left behind unless the container is removed with docker rm -v.
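
An anonymous volume can also be requested at run time by passing -v with only a container folder. The helper below is a minimal sketch of that; the function name and its arguments are hypothetical, and it assumes an image that already exists locally.

```shell
# Hypothetical helper: start a container with an anonymous volume.
#   $1 = image name, $2 = folder inside the container
run_with_anonymous_volume() {
  # Only a container folder follows -v, so Docker picks a random volume name.
  # Because of --rm, that volume is deleted together with the container.
  docker run -d --rm -v "$2" "$1"
}
```

For example, run_with_anonymous_volume feedback-node /app/feedback would start such a container, and docker volume ls should then list a volume with a long random name.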


Since we want data to persist across containers, we need volumes that are not removed when the container is removed. This is what Named Volumes are for. When creating the container, add the following option; the volume will then survive even after the container is removed:



-v [volume name]:[container folder that needs to be connected to the local machine]
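
Filled in with concrete values, the option might be used as in the sketch below. The helper name and the values in the usage note (feedback, /app/feedback, feedback-node) are hypothetical and only serve the example.

```shell
# Hypothetical helper: start a container with a named volume.
#   $1 = volume name, $2 = folder inside the container, $3 = image name
run_with_named_volume() {
  # The name before the colon lets Docker find the same volume again later,
  # even after this container has been removed by --rm.
  docker run -d --rm -v "$1:$2" "$3"
}
```

Calling run_with_named_volume feedback /app/feedback feedback-node, stopping the container, and calling it again should give the second container the files the first one wrote. A named volume stays until it is deleted explicitly, for instance with docker volume rm feedback.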



Note: Both named and anonymous volumes have their use cases.


Since volumes are wholly controlled by Docker, changes we make locally are not reflected in the container's file system, and that is exactly the isolation we wanted. During development, however, we can't build a new image every time we change the source code. Bind Mounts solve this. Instead of a volume name, give an absolute path on the local machine in the option:


-v [local machine folder]:[container folder that needs to be connected to the local machine]

Bind mounts are not fully controlled by Docker, because we deliberately attach a local machine folder of our own choosing to the container.
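
A minimal sketch of a bind mount during development could look like this. The helper name is hypothetical, and it assumes the command is run from the project folder that should appear inside the container.

```shell
# Hypothetical helper: start a container with the current folder bind-mounted.
#   $1 = folder inside the container, $2 = image name
run_with_bind_mount() {
  # $(pwd) expands to the absolute path of the current host folder,
  # so source edits on the host are visible in the container without a rebuild.
  docker run -d --rm -v "$(pwd):$1" "$2"
}
```

For example, run_with_bind_mount /app feedback-node would map the current folder onto /app. Note that the mounted folder hides whatever the image had at that path, including anything installed there during the build.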



Source: Docker site

Recap:

Images are read-only

Containers are read-write - they add a thin read-write layer on top of the image

Container data can't persist after the container is removed - a volume is the solution

A container can't see changes in the host file system by default - a bind mount is the solution
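
The three forms of the -v option covered above differ only in what comes before the colon. The function below is a simplified sketch of that rule, not Docker's actual parsing logic; it assumes Unix-style absolute host paths and ignores special cases such as Windows paths.

```shell
# Simplified illustration of how the three -v forms differ.
# This is an assumption-laden sketch, not how Docker itself parses the option.
mount_kind() {
  case "$1" in
    /*:*) echo "bind mount" ;;        # absolute host path before the colon
    *:*)  echo "named volume" ;;      # a plain name before the colon
    *)    echo "anonymous volume" ;;  # container folder only, no colon
  esac
}
```

With this sketch, mount_kind feedback:/app/feedback reports a named volume, mount_kind /home/me/project:/app a bind mount, and mount_kind /app/feedback an anonymous volume.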