Raphael (PH) De Lio

Software Engineer

Understanding Redis Persistence

If you read my story on 10 things you didn’t know about Redis — From a Message Broker to a Graph Database, and already have your own server running locally, as explained in How to run Redis locally in a Docker Container and manage it with Redis Insight and Redis CLI, you are ready to enable Redis to persist its data to disk.

Redis has two mechanisms to persist its data. Let’s go through each one of them and understand how they work, what their advantages and disadvantages are, and how we can use them.

Redis Database

Redis Database, or RDB, is a persistence mechanism in which the database writes its data to disk as snapshots.

If the server instance goes down, these snapshots can be used to restore a previous database state. The interval in which the snapshots are taken can be configured. For example, you can configure your database to take a snapshot every 1 minute if 10 changes have happened in the dataset or every 5 minutes if 1000 changes have happened in the dataset.

How it works

The snapshots work like a time machine. You can take as many snapshots as you wish, as frequently as you wish, and, as long as you copy each file somewhere safe, keep them for as long as you wish. Then you can use these snapshots to restore the database to an earlier point in time in case of disaster.

By default, Redis stores these snapshots in a binary file named dump.rdb. And this RDB file is replaced whenever a new snapshot is created.

Redis takes snapshots by forking its process into a parent and child process. Then, the child process starts writing a new RDB file. And when it’s done writing the new RDB file, it replaces the old one.
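To make the fork-and-replace idea more concrete, here is a minimal Python sketch of the same pattern. It is not how Redis is implemented: the background_save function, the JSON file format and the temporary file name are all assumptions made for illustration, and it only runs on Unix-like systems where os.fork() is available.

```python
import json
import os
import tempfile


def background_save(dataset: dict, path: str) -> int:
    """Fork; the child writes `dataset` to `path`, the parent returns at once.

    Returns the child's PID so the caller can wait for it.
    """
    pid = os.fork()  # Unix only
    if pid != 0:
        return pid  # parent: free to keep serving clients

    # Child: its memory is a copy of the parent's at the moment of the fork.
    status = 1
    try:
        tmp_path = f"{path}.tmp-{os.getpid()}"  # hypothetical temp file name
        with open(tmp_path, "w") as f:
            json.dump(dataset, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # swap the old snapshot for the new one
        status = 0
    finally:
        os._exit(status)  # never fall back into the parent's code path


if __name__ == "__main__":
    data = {"firstKey": "I'm number one"}
    target = os.path.join(tempfile.mkdtemp(), "dump.json")
    child = background_save(data, target)
    data["firstKey"] = "changed after the fork"  # not visible to the child
    os.waitpid(child, 0)
    with open(target) as f:
        print(json.load(f))
```

The file ends up with the value the key had at the moment of the fork, even though the parent changed it right afterwards. That is the point-in-time behaviour described above.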

An illustration representing Redis snapshots. At the center, there is a large Redis logo (a red stack with a white star). Above it, multiple snapshots of the Redis logo are displayed on a rope, each labeled with a different date (e.g., “SNAPSHOT 1995-06-08,” “SNAPSHOT 1995-06-09,” etc.), hung with clothespins. This visual signifies periodic data snapshots in Redis. The credit “@RaphaelDeLio” appears at the bottom right corner.

Advantages

  • RDB has little impact on the performance of the server. The main process only has to fork, and the child process takes care of all the writing to disk, so the parent keeps serving clients. However, the fork itself can briefly slow the server down when the dataset is large.
  • Restarting the database is faster. RDB is faster when restoring large datasets in comparison to AOF, which is another mechanism of persistence we’ll cover later in this story.
  • Compact backup files. The content of the dump file is very compact and can be transferred to other storage such as Google Cloud Storage or Amazon S3.

Disadvantages

  • You can lose minutes of data. Although you can configure how often snapshots are taken, you usually don’t want an interval shorter than 5 minutes. That’s because when the dataset is relatively large or the CPU is slow, the fork() operation can be time-consuming, which can make Redis stop serving clients for anywhere from a few milliseconds to a second. Any change made after the last snapshot is lost if the server goes down.

Working with RDB

To take a snapshot, you can either configure Redis to take one automatically after N seconds if at least M changes happened in the dataset, or take a manual snapshot yourself. Let’s go through a few commands:

Automatic snapshots

Unless they are disabled in the redis.conf file, automatic snapshots are enabled by default. To check the configuration, you can run:

CONFIG GET save

Which should return:

1) "save"
2) "3600 1 300 100 60 10000"

The default configuration will create a new snapshot:

  • After 3600 seconds (an hour) if at least 1 change was performed.
  • After 300 seconds (5 minutes) if at least 100 changes were performed.
  • After 60 seconds if at least 10000 changes were performed.

To override this configuration, you can either change the config file or run the CONFIG SET command, as in:

CONFIG SET save "120 1"

The example above will ask Redis to create a new snapshot every 2 minutes if at least 1 record has been changed in the dataset.

You can also turn it off by running:

CONFIG SET save ""
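If it helps to see how such a value can be read, here is a small Python sketch that parses a save string into (seconds, changes) pairs and checks whether any rule is satisfied. The function names are hypothetical, and the check is a simplified model of the rules described above, not Redis’s actual code.

```python
from typing import List, Tuple


def parse_save_rules(value: str) -> List[Tuple[int, int]]:
    """Turn "3600 1 300 100" into [(3600, 1), (300, 100)]."""
    numbers = [int(token) for token in value.split()]
    if len(numbers) % 2 != 0:
        raise ValueError("save expects pairs of <seconds> <changes>")
    return list(zip(numbers[0::2], numbers[1::2]))


def should_snapshot(
    rules: List[Tuple[int, int]], seconds_since_last_save: int, changes: int
) -> bool:
    """True if any rule has both its time window and change count satisfied."""
    return any(
        seconds_since_last_save >= seconds and changes >= min_changes
        for seconds, min_changes in rules
    )


if __name__ == "__main__":
    rules = parse_save_rules("3600 1 300 100 60 10000")
    print(rules)
    print(should_snapshot(rules, seconds_since_last_save=300, changes=100))
```

An empty string parses to an empty list of rules, so no automatic snapshot is ever triggered, which matches turning the feature off.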

Manual Snapshot: Save Command

The SAVE command performs a synchronous save of the dataset producing a point-in-time snapshot of all the data inside the Redis instance.

According to the documentation, you almost never want to call the SAVE command in production because it will block all the other clients. Instead, they recommend using the BGSAVE command. However, if for any reason, there’s an issue preventing Redis from doing the fork(), the SAVE command may be the option to dump the latest dataset.

When you run SAVE, you should see OK as a response. The time to respond is O(N), where N is the total number of keys in the database.

Manual Snapshot: BGSAVE Command

This operation is asynchronous, which means that the BGSAVE command returns immediately, normally with the status reply Background saving started.

Redis will fork its process, the parent will continue to serve the clients and the child will dump the RDB file.

However, an error is returned if there is already a background save running or if there is another non-background-save process running, specifically an in-progress AOF rewrite.

In this case, you would prefer to use BGSAVE SCHEDULE. This command also returns immediately. If there’s an AOF rewrite in progress, it replies Background saving scheduled and the snapshot is created at the next opportunity.
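The decision described above can be sketched as a tiny state machine. This is only a model of the behaviour as I understand it: ServerState, its fields and the returned strings are hypothetical labels, not Redis’s real replies or internals.

```python
from dataclasses import dataclass


@dataclass
class ServerState:
    rdb_save_in_progress: bool = False
    aof_rewrite_in_progress: bool = False
    bgsave_scheduled: bool = False


def bgsave(state: ServerState, schedule: bool = False) -> str:
    """Model what happens when BGSAVE [SCHEDULE] is requested."""
    if state.rdb_save_in_progress:
        # A background save is already running: always an error.
        return "error: background save already in progress"
    if state.aof_rewrite_in_progress:
        if schedule:
            # Remember the request and run it at the next opportunity.
            state.bgsave_scheduled = True
            return "scheduled"
        return "error: AOF rewrite in progress"
    state.rdb_save_in_progress = True
    return "started"


if __name__ == "__main__":
    busy = ServerState(aof_rewrite_in_progress=True)
    print(bgsave(busy))
    print(bgsave(busy, schedule=True))
```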

Last Save command

This command returns the Unix timestamp of the last snapshot that was created successfully.

You can run it with: LASTSAVE.

And it will return: (integer) 1660310189
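That integer is a number of seconds since the Unix epoch, so it’s easy to turn into something readable. Here is a short Python sketch; the helper names are my own.

```python
from datetime import datetime, timezone


def lastsave_to_datetime(unix_seconds: int) -> datetime:
    """Convert the integer returned by LASTSAVE into a UTC datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


def seconds_since_last_save(unix_seconds: int, now: datetime) -> float:
    """How long ago the last successful snapshot happened."""
    return (now - lastsave_to_datetime(unix_seconds)).total_seconds()


if __name__ == "__main__":
    saved_at = lastsave_to_datetime(1660310189)
    print(saved_at.isoformat())  # 2022-08-12T13:16:29+00:00
    print(seconds_since_last_save(1660310189, datetime.now(timezone.utc)))
```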

AOF

Append-only file is another mechanism of persistence that will log every write operation received by the server. These logs can then be replayed at server start-up and reconstruct the original dataset.

The commands are logged using the same format as the Redis protocol.

An illustration showing a concept of Redis data storage. At the top, there is a series of red blocks symbolizing data or storage. Below, on the left, a silhouette of a person sitting at a desk with a laptop represents a user or developer. In the center, the Redis logo (a red stack with a white star) symbolizes the Redis database. This image likely conveys data being interacted with, stored, or processed in Redis. The credit “@RaphaelDeLio” is visible in the bottom right corner.

How it works

When Redis finishes executing a write command, it appends the command to the end of the server’s aof_buf buffer in protocol format (the language a client and the server use to talk to each other over the network).

The flushing of the buffer will be determined by the setting appendfsync, which can be:

  • always (safest, but poor performance)
  • everysec (safe, better performance) (Default)
  • no (flushing is left to the operating system, typically every ~30 seconds; unsafe, best performance)

This setting will then be used by the flushAppendOnlyFile(), a function that will write the contents of the aof_buf to the AOF file.
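To get a feel for how the three policies differ, here is a simplified Python model of a buffer that is written to a file and synced according to appendfsync. The AofBuffer class and its fsync_count counter are hypothetical, and real Redis handles the everysec case differently under the hood; treat this as a sketch of the idea only.

```python
import os
import tempfile
import time


class AofBuffer:
    """Toy model of aof_buf plus an appendfsync policy."""

    def __init__(self, path: str, appendfsync: str = "everysec"):
        if appendfsync not in ("always", "everysec", "no"):
            raise ValueError(f"unknown appendfsync value: {appendfsync}")
        self.appendfsync = appendfsync
        self.buffer = bytearray()
        self.file = open(path, "ab")
        self.last_fsync = time.monotonic()
        self.fsync_count = 0  # only here so the effect can be observed

    def append(self, command: bytes) -> None:
        """Queue an already encoded command in memory."""
        self.buffer += command

    def flush(self) -> None:
        """Write the buffer to the file, then fsync according to the policy."""
        if self.buffer:
            self.file.write(self.buffer)
            self.file.flush()  # hand the bytes to the operating system
            self.buffer.clear()
        now = time.monotonic()
        if self.appendfsync == "always" or (
            self.appendfsync == "everysec" and now - self.last_fsync >= 1.0
        ):
            os.fsync(self.file.fileno())  # force the bytes onto the disk
            self.last_fsync = now
            self.fsync_count += 1
        # "no": never fsync here, the operating system decides when

    def close(self) -> None:
        self.file.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "appendonly.aof")
    aof = AofBuffer(path, appendfsync="always")
    aof.append(b"*3\r\n$3\r\nSET\r\n$1\r\nk\r\n$1\r\nv\r\n")
    aof.flush()
    aof.close()
    print(os.path.getsize(path), "bytes written,", aof.fsync_count, "fsync call(s)")
```

The trade-off is visible in the flush method: always pays for an fsync on every flush, everysec pays at most once per second, and no never pays but leaves the timing to the operating system.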

An illustration representing Redis Append-Only File (AOF) buffering. On the left, there’s a black silhouette of a person working on a laptop. In the center, the Redis logo (a red stack with a white star) symbolizes the Redis database. On the right, there is a stacked icon labeled “AOF_BUF,” representing the AOF buffer. The image visually connects the user, Redis, and AOF buffering for data persistence. The credit “@RaphaelDeLio” is displayed in the bottom right corner.

Advantages

  • It’s durable. Since every write operation is appended to the file, you are unlikely to lose data.
  • It’s reliable. Even if the log ends with a half-written command for some reason (disk full or other reasons) the redis-check-aof tool is able to fix it easily.
  • It’s flexible. Even if you trigger the FLUSHALL command, which deletes all keys from the database, you can still stop Redis, edit the file, remove this command and restart your server, as long as the file hasn’t been rewritten.

Disadvantages

  • Size of the file. AOF files are usually bigger than the equivalent RDB for the same dataset.
  • Performance. AOF can be slower than RDB, depending on the fsync setting.

Enabling

You can turn on the AOF in your redis.conf file by setting:

appendonly yes

Or by running the command:

CONFIG SET appendonly yes

By running the command above, the file will be generated automatically. However, for the file to be replayed on server startup, the setting must also be present in the configuration file, because a value changed with CONFIG SET is not written to redis.conf by itself.

And that’s it. Whenever you restart your database, it will replay the commands on the file automatically and recreate its original state.
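To illustrate what replaying means, here is a minimal Python sketch that parses commands in protocol format and applies them to an in-memory dictionary. It assumes a plain AOF made only of commands, and it understands just SELECT, SET and FLUSHALL; the function names are hypothetical and this is not Redis’s loader.

```python
def parse_commands(data: bytes):
    """Yield each command in `data` as a list of byte-string arguments."""
    pos = 0
    while pos < len(data):
        line_end = data.index(b"\r\n", pos)
        if data[pos:pos + 1] != b"*":
            raise ValueError("expected '*<number of arguments>'")
        argc = int(data[pos + 1:line_end])
        pos = line_end + 2
        args = []
        for _ in range(argc):
            line_end = data.index(b"\r\n", pos)
            if data[pos:pos + 1] != b"$":
                raise ValueError("expected '$<argument length>'")
            length = int(data[pos + 1:line_end])
            start = line_end + 2
            args.append(data[start:start + length])
            pos = start + length + 2  # skip the argument and its trailing CRLF
        yield args


def replay(data: bytes) -> dict:
    """Rebuild {database index: {key: value}} by applying every command in order."""
    databases = {}
    current = 0
    for args in parse_commands(data):
        name = args[0].upper()
        if name == b"SELECT":
            current = int(args[1])
        elif name == b"SET":
            databases.setdefault(current, {})[args[1]] = args[2]
        elif name == b"FLUSHALL":
            databases.clear()
        else:
            raise ValueError(f"command not handled by this sketch: {name!r}")
    return databases


if __name__ == "__main__":
    log = (
        b"*2\r\n$6\r\nSELECT\r\n$1\r\n0\r\n"
        b"*3\r\n$3\r\nSET\r\n$8\r\nfirstKey\r\n$14\r\nI'm number one\r\n"
    )
    print(replay(log))
```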

AOF Rewrite

All operations that modify the dataset of your database will be appended to the AOF file. This means that the AOF file is an always-growing file.

When your file gets too big, Redis will rewrite it from scratch into a new file. This operation is done by reading the data in memory, not the old file, which guarantees that the new file is created with the smallest number of commands possible.

Once the rewrite is finished, the old file will be overwritten by the new file.
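Here is a small Python sketch of why rebuilding from memory gives a shorter log. Commands are modelled as plain tuples and only SET and DEL are handled; both functions are hypothetical and ignore details such as the on-disk format.

```python
def apply_log(log):
    """Replay a list of ("SET", key, value) / ("DEL", key) tuples into a dict."""
    state = {}
    for command, *args in log:
        if command == "SET":
            key, value = args
            state[key] = value
        elif command == "DEL":
            state.pop(args[0], None)
        else:
            raise ValueError(f"command not handled by this sketch: {command}")
    return state


def rewrite(state):
    """Build a fresh log from the current state: one SET per live key."""
    return [("SET", key, value) for key, value in state.items()]


if __name__ == "__main__":
    old_log = [
        ("SET", "firstKey", "I'm number one"),
        ("SET", "secondKey", "I'm number two"),
        ("SET", "firstKey", "I'm still number one"),
    ]
    new_log = rewrite(apply_log(old_log))
    print(len(old_log), "commands before,", len(new_log), "after")
```

Both logs rebuild the same dataset, but the rewritten one no longer carries the value that was overwritten.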

Manual Rewrite with BGREWRITEAOF

Redis will automatically trigger the rewrite process. However, you can also trigger it manually if you wish by running the following command:

BGREWRITEAOF

If there’s another persistence operation running in the background, the Rewrite operation will be scheduled for a later time.

Digging into the file

Let’s start by editing our redis.conf file, enabling AOF (appendonly yes) and disabling RDB (save "").

Then, let’s connect to Redis Insight and use the Workbench to set two keys:

A screenshot of Redis commands executed in a terminal or interface. The SET command is used twice:
	1.	SET secondKey "I'm number two" returns “OK” at 11:41:39.
	2.	SET firstKey "I'm number one" returns “OK” at 11:41:28.
The commands are displayed with a dark background, and timestamps are visible on the right. A green “Play” button is present at the top-right corner, suggesting the commands were run or can be executed again.

Now, let’s do a cat on our file:

A terminal window on a MacBook showing the contents of an appendonly.aof file in Redis. The commands are displayed in Redis’ RESP protocol format:
	1.	SELECT 0 switches to database 0.
	2.	SET firstKey "I'm number one" stores the string “I’m number one” in firstKey.
	3.	SET secondKey "I'm number two" stores the string “I’m number two” in secondKey.
The file content is formatted with protocol symbols (*, $) and lengths of arguments. The terminal prompt indicates the user raphaeldelio at the end of the output.

*N is the number of arguments of the command, and $M is the length, i.e. the number of bytes, of each argument.
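As a quick check of that format, here is a short Python sketch that encodes a command the same way. The encode_command helper is my own name for it.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as *<argc>, then $<length> and the bytes of each argument."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        raw = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(raw), raw))  # length is in bytes
    return b"".join(parts)


if __name__ == "__main__":
    print(encode_command("SET", "firstKey", "I'm number one").decode())
```

For SET firstKey "I'm number one" this gives *3, because there are three arguments, followed by $3 for SET, $8 for firstKey and $14 for the value.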

In our case, Redis executed:

  • SELECT 0
  • SET firstKey “I’m number one”
  • SET secondKey “I’m number two”

Now, let’s edit the first key:

A screenshot showing Redis commands being executed in a terminal or interface. The command SET firstKey "I'm still number one" is at the top, ready to be executed. Below are two previously executed commands:
	1.	SET secondKey "I'm number two" returns “OK” at 11:45:22.
	2.	SET firstKey "I'm number one" returns “OK” at 11:45:20.
The interface has a dark background with a green play button on the right, indicating commands can be run or replayed. Timestamps are visible on the right side for each command.

And let’s check the file again:

A terminal window displaying the contents of a Redis appendonly.aof file. It shows the Redis commands executed in RESP protocol format:
	1.	SELECT 0 to switch to database 0.
	2.	SET firstKey "I'm number one" stores the string “I’m number one” in firstKey.
	3.	SET secondKey "I'm number two" stores the string “I’m number two” in secondKey.
	4.	SET firstKey "I'm still number one" updates the value of firstKey to “I’m still number one.”
The terminal ends with the command prompt, indicating successful execution.

We can see that Redis logged all our commands:

  • SELECT 0
  • SET firstKey “I’m number one”
  • SET secondKey “I’m number two”
  • SET firstKey “I’m still number one”

However, is the first SET command still required to rebuild the database? Not really: it has already been overwritten by the third SET command. Before this file gets very big, let’s ask Redis to rewrite it:

A Redis command interface showing the execution of commands. At the top, the BGREWRITEAOF command is typed, which triggers a background rewrite of the AOF file. Below, the previous commands are listed:
	1.	SET firstKey "I'm still number one" executed at 11:50:38.
	2.	SET secondKey "I'm number two" executed at 11:45:22.
	3.	SET firstKey "I'm number one" executed at 11:45:20.
All commands return “OK”. The interface has a dark background with timestamps visible on the right and a green play button indicating the ability to execute commands.

And let’s read the file again:

A terminal window displaying corrupted or compacted contents of a Redis appendonly.aof file. The file begins with a Redis signature (REDIS0009), version (redis-ver6.2.7), and other metadata like redis-bits. It then shows scattered readable text such as firstKey I'm still number one and secondKey I'm number two interspersed with unreadable or corrupted characters (?????, @?ctimek, etc.). The terminal prompt at the bottom indicates the user raphaeldelio has completed the cat command.

Now, at first, this was weird for me. I was expecting the AOF file to be something like:

*2
$6
SELECT
$1
0
*3
$3
SET
$9
secondKey
$14
I'm number two
*3
$3
SET
$8
firstKey
$20
I'm still number one

However, it became:

REDIS0009?	redis-ver6.2.7?
redis-bits?@?ctimežk?bused-mem˜??
aof-preamble???????֭h
????mʗ????~??ױ??firstKeyI'm still number one secondKeyI'm number two??????֭h
????mʗ??????!4d?

This header is the same header of an RDB file, so I did the following experiment:

  1. Stopped Redis Server
  2. Renamed appendonly.aof to dump.rdb
  3. Edited redis.conf, turned off AOF and turned on RDB again
  4. Started the server again
  5. The server was able to use the dump.rdb, which contained only the RDB-formatted content, to recreate the database.

This made me believe that:

  1. The AOF file is a mix of AOF and RDB
  2. BGREWRITEAOF doesn’t actually rewrite the AOF file but takes a snapshot instead

And thanks to Lior Kongo, who answered my question on Stack Overflow, I was able to confirm it.

According to the documentation:

When rewriting the AOF file, Redis is able to use an RDB preamble in the AOF file for faster rewrites and recoveries. When this option is turned on the rewritten AOF file is composed of two different stanzas:

[RDB file][AOF tail]

When loading, Redis recognizes that the AOF file starts with the “REDIS” string and loads the prefixed RDB file, then continues loading the AOF tail.

You can turn it off by setting the configuration on redis.conf:

aof-use-rdb-preamble no
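If you want to check which flavour a given file is, looking at its first bytes is enough. Here is a minimal Python sketch; the function names and the returned labels are hypothetical.

```python
def describe_aof(data: bytes) -> str:
    """Guess the layout of an AOF file from its first bytes."""
    if not data:
        return "empty"
    if data.startswith(b"REDIS"):
        return "rdb-preamble"  # [RDB file][AOF tail]
    if data.startswith(b"*"):
        return "resp-commands"  # plain commands in protocol format
    return "unknown"


def rdb_version(data: bytes):
    """Return the number after the REDIS string, e.g. 9 for REDIS0009."""
    if data.startswith(b"REDIS") and data[5:9].isdigit():
        return int(data[5:9])
    return None


if __name__ == "__main__":
    head = b"REDIS0009\xfa\tredis-ver\x056.2.7"
    print(describe_aof(head), rdb_version(head))
```

In practice you would pass in the first few bytes read from the file opened in binary mode.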

Editing the AOF file

The AOF file is also flexible and easily editable before the BGREWRITEAOF command is triggered.

Let’s see if we can recover our data after flushing it with FLUSHALL by:

  1. Stopping the database
  2. Editing the AOF file and removing the FLUSHALL command
  3. Restarting the database

I will start by adding two keys again:

A Redis command interface displaying two executed SET commands:
	1.	SET secondKey "I'm number two" executed at 15:55:46.
	2.	SET firstKey "I'm number one" executed at 15:55:43.
Both commands returned “OK” as responses. The interface has a dark background with timestamps on the right and a green play button at the top-right corner, indicating that commands can be executed or replayed.

I can see both of them are in my database:

A screenshot of a Redis Stack Database interface displaying two keys:
	1.	firstKey with data type STRING and size 72 B.
	2.	secondKey with data type STRING and size 72 B.
The interface includes a search bar labeled “Filter by Key Name or Pattern” and a “+ Key” button for adding new keys. The left sidebar contains icons for navigating different sections of the Redis database, with the key icon highlighted. The display indicates “Results: 2 keys” scanned, and the last refresh occurred in less than 1 minute.

Now, I’m gonna run the FLUSHALL command:

A Redis command interface showing the FLUSHALL command typed in the input line, which clears all keys from all databases. Below are two previously executed commands:
	1.	SET secondKey "I'm number two" executed at 15:55:46.
	2.	SET firstKey "I'm number one" executed at 15:55:43.
The interface has a dark background with timestamps on the right side and a green play button indicating the ability to execute or replay commands.

And now, all of my keys are gone:

A Redis Stack Database interface showing an empty key list after executing a FLUSHALL command. The search bar at the top is labeled “Filter by Key Name or Pattern,” and no keys are visible in the display. The left sidebar has a key icon highlighted, indicating the current view. The interface includes a “Last refresh: < 1 min” indicator and a “+ Key” button for adding new keys.

You can see all commands are still in my AOF file, so let’s edit it and remove the last one:

 A terminal window showing the contents of a Redis appendonly.aof file opened in Vim. The file logs Redis commands in RESP protocol format:
	1.	SELECT 0 to switch to database 0.
	2.	SET firstKey "I'm number one" to set the firstKey value.
	3.	SET secondKey "I'm number two" to set the secondKey value.
	4.	FLUSHALL appears at the end, indicating a command to delete all keys in the database.
The bottom of the window shows Vim in INSERT mode.

And after restarting the server, it’s like the FLUSHALL command never ran:

A Redis Stack Database interface displaying two keys:
	1.	secondKey with data type STRING and size 72 B.
	2.	firstKey with data type STRING and size 72 B.
The interface includes a search bar labeled “Filter by Key Name or Pattern” and shows “Results: 2 keys. Scanned 2 / 2 keys.” The left sidebar highlights the key icon, indicating the current view. A “Last refresh: < 1 min” message appears at the top right, alongside a + Key button for adding new keys.
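The manual edit above could also be scripted. Here is a minimal Python sketch that removes a FLUSHALL from the end of the file contents. It assumes the server is stopped, the file holds plain commands in protocol format, and FLUSHALL really is the last entry; the function name is hypothetical.

```python
FLUSHALL = b"*1\r\n$8\r\nFLUSHALL\r\n"


def drop_trailing_flushall(data: bytes) -> bytes:
    """Return `data` without a FLUSHALL command at the very end, if there is one."""
    tail = data[-len(FLUSHALL):]
    # Compare case-insensitively, since the command may be logged in lowercase.
    if tail.upper() == FLUSHALL:
        return data[:-len(FLUSHALL)]
    return data


if __name__ == "__main__":
    aof = (
        b"*3\r\n$3\r\nSET\r\n$8\r\nfirstKey\r\n$14\r\nI'm number one\r\n"
        + FLUSHALL
    )
    print(drop_trailing_flushall(aof).decode())
```

I would still keep a copy of the original file before overwriting it with the result.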

Should I use AOF or RDB?

If you want a degree of durability comparable to PostgreSQL or other on-disk databases, you should have both mechanisms of persistence turned on.

AOF will make sure your data is durable and safe while RDB will allow you to keep a smaller file and restart your database faster.

However, it all depends on your use case. If you can tolerate a few minutes of data loss, you can turn off AOF and rely on RDB alone. If you can tolerate losing all of your data, you can turn off both AOF and RDB.

Enabling persistence from Docker

If you have been following my previous articles, you have seen that I have been doing my experiments by running Redis Server and Redis Insight from within a Docker container.

Docker containers are ephemeral by default, which means that you will lose all your data once the container is removed and recreated.

To enable persistence for your container, you need to configure a volume. To do that, first create a local directory at /tmp/local-redis/data. Then add the following option to your docker run command:

-v /tmp/local-redis/data:/data

To load a custom redis.conf file into my container, I placed the file at /tmp/local-redis/redis.conf and added the following option to my docker command:

-v /tmp/local-redis/redis.conf:/redis-stack.conf

In the end, you will have a command like:

docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 -v /tmp/local-redis/data:/data -v /tmp/local-redis/redis.conf:/redis-stack.conf redis/redis-stack:latest

This will spin up your container with one volume that keeps your server data stored on disk and another that injects the configuration file into the server.
