Hi all,
just a quick follow-up: the backup process described below seems to
work. The gap of about 100 files is still roughly the same, and I
investigated the cause further. It comes from metadata such as index
and cache files, which are present in some folders but not in all of
them. Counting only files whose names contain the sequence ,S= and
summing their sizes gave the same number of files and exactly the same
total size of raw mail data in both directories.
I also haven't received any notifications about backups that really
failed, so I believe the backup works correctly.
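For anyone who wants to repeat the comparison, here is a minimal sketch
of the count described above. The function name maildir_stats and the
two directory arguments are placeholders of mine, not part of the backup
script, and it assumes GNU find (for -printf), which the script below
already relies on.

```bash
#!/bin/bash
# Count message files (names containing ",S=") below a directory and sum
# their sizes in bytes. Index and cache files are ignored because their
# names do not contain ",S=".
# Prints "<count> <total_bytes>".
maildir_stats() {
    local dir="$1"
    find "$dir" -type f -name '*,S=*' -printf '%s\n' \
        | awk '{ n++; bytes += $1 } END { printf "%d %d\n", n, bytes }'
}

# Usage (both paths are supplied by the caller):
#   ./maildir_stats.sh /path/to/live/Maildir /path/to/backup/Maildir
if [ $# -eq 2 ] ; then
    live=$(maildir_stats "$1")
    copy=$(maildir_stats "$2")
    echo "live:   $live"
    echo "backup: $copy"
    [ "$live" = "$copy" ] && echo "match" || echo "MISMATCH"
fi
```

Mail that arrives between the last sync and the count will show up as
a small mismatch, so it makes sense to run this right after a sync.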
Regards
Christian
On 09.01.2022 21:57, Christian wrote:
Hi all,
first of all: I'm using version 2.3.4.1.
I manage some rather large IMAP mailboxes which I want to back up on a
regular basis. Some of them have relatively heavy traffic, and one of
them is larger than 30 GB.
I studied the docs for doveadm backup
(https://wiki2.dovecot.org/Tools/Doveadm/Sync) and even did some code
research to better understand the process.
The docs state that using stateful synchronization is the most
efficient way to synchronize mailboxes, therefore I chose this approach.
High-level overview:
- store a copy of the whole maildir in a separate directory
(/var/vmail/backup)
- back up to this directory once a minute (trying to make the most of
the transaction logs) using the last state, which is stored in a file
- create a backup once a day using tar (full, differential and
incremental ones), which blocks the sync from the previous step while
it runs
I quite often receive notifications that doveadm backup returned an
exit code of 2, which should be quite normal. These notifications look
like this:
dsync(another_address@my.domain): Warning: Failed to do incremental
sync for mailbox INBOX, retry with a full sync (Modseq 171631 no
longer in transaction log (highest=177818, last_common_uid=177308,
nextuid=177309))
dsync(another_address@my.domain): Warning: Mailbox changes caused a
desync. You may want to run dsync again: Remote lost mailbox GUID
e9149d0ae4e02d532505000026ca4352 (maybe it was just deleted?)
Synced another_address@my.domain successfully but missing some
changes. Took 3 seconds. Starting retry 1...
The first message seems to indicate that the transaction log was
rotated and no longer contains the changes since the state of the
backup dir, right? I thought about setting
mail_index_log_rotate_min_age to 1hour to prevent the transaction logs
from being rotated too often, but abandoned this idea and instead
increased the backup frequency to once a minute. The warnings still
appear, so maybe my assumptions about the transaction logs are wrong.
The second message seems less alarming to me.
How does doveadm backup behave in such situations? Does it directly
fall back to a less efficient way of syncing mails? Or does the state
store the information "retry with a full sync" so that the next run
uses this mode? To investigate this I simply measured runtimes and saw
that the second (retry) run takes a bit longer (up to about 15 seconds)
to sync the dir.
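For reference, this is how I understand the exit codes documented in
the doveadm-sync man page: 0 means success, 2 means the sync finished
without errors but some changes could not be applied (running it again
usually fixes this), and anything else is a failure. A minimal sketch of
telling them apart follows. The function names are hypothetical, and it
says nothing about what dsync does internally on the retry.

```bash
#!/bin/bash
# Map a doveadm backup/sync exit status to a short keyword.
#   0     -> ok      fully synchronized
#   2     -> retry   done without errors, but some changes are missing
#   other -> failed  the command failed
classify_dsync_exit() {
    case "$1" in
        0) echo ok ;;
        2) echo retry ;;
        *) echo failed ;;
    esac
}

# Run a command up to MAX_TRIES times, repeating only while it reports
# "retry". Unlike my backup script, this stops at the first hard failure.
# Usage: sync_with_retries MAX_TRIES COMMAND [ARGS...]
sync_with_retries() {
    local max_tries="$1"; shift
    local try=0 rc=0
    while [ "$try" -lt "$max_tries" ] ; do
        try=$((try + 1))
        rc=0
        "$@" || rc=$?
        case "$(classify_dsync_exit "$rc")" in
            ok|failed) break ;;
        esac
    done
    return "$rc"
}
```

Measuring the duration of each attempt, as I did, would go around the
"$@" call.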
I'm afraid of losing messages with my approach. Is it safe to always
use doveadm backup -s $state? Simply counting the files of one maildir
in the live directory and in the backup copy shows about 100 fewer
files in the backup dir, although the script has only been running for
a few days.
For reference, see my backup script below.
Regards
Christian
#!/bin/bash
# * * * * * /root/bin/backup.sh --sync-only
# 12 2 1-7 * * test $(date +\%u) -eq 6 && /root/bin/backup.sh --full
# 12 2 8-31 * * test $(date +\%u) -eq 6 && /root/bin/backup.sh --differential
# 12 2 * * * test $(date +\%u) -ne 6 && /root/bin/backup.sh
synconly=0
differential=0
fullbackup=0
if [ $# -gt 0 ] ; then
    if [ "$1" == "--sync-only" ] ; then
        synconly=1
    elif [ "$1" == "--differential" ] ; then
        differential=1
    elif [ "$1" == "--full" ] ; then
        fullbackup=1
    fi
fi
basedir="/var/vmail/backup"
targetdir="/var/vmail/backup/done"
mailaddresses="one_address@my.domain another_address@my.domain yet_another@my.domain"
if [ ! -d "$basedir" ] ; then
    mkdir -p "$basedir"
    chown vmail:vmail "$basedir"
fi
if [ ! -d "$targetdir" ] ; then
    mkdir -p "$targetdir"
    chown vmail:vmail "$targetdir"
fi
for mailaddr in ${mailaddresses} ; do
    #echo "Creating backup for $mailaddr."
    domainpart=${mailaddr#*@}
    localpart=${mailaddr%%@*}
    lockfile="$basedir/$mailaddr.lock"
    statefile="$basedir/$mailaddr.state"
    backupdir="$domainpart/$localpart/Maildir"
    snapshotfile_full="$basedir/$mailaddr.full.snar"
    snapshotfile="$basedir/$mailaddr.snar"
    backup_basename="$basedir/${mailaddr}_$(date '+%Y%m%d_%H%M%S')"
    (
        if [ $synconly -eq 1 ] ; then
            flock -xn 200
            if [ $? -eq 1 ] ; then
                # failed to acquire lock. Skip mailbox silently.
                exit
            fi
        fi
        # try to acquire exclusive lock for one minute
        flock -xw 60 200
        if [ $? -eq 1 ] ; then
            echo "Failed to acquire write lock within 60 seconds. Skipping $mailaddr."
            exit
        fi
        retries=0
        retval=1
        until [ $retval -eq 0 ] || [ $retries -ge 3 ] ; do
            let 'retries++'
            if [ -f "$statefile" ] ; then
                oldstate=$(head -1 "$statefile")
            else
                oldstate=""
            fi
            start_time=$(date +%s)
            # stderr is captured in $ERROR, stdout (the new state) goes to the state file
            ERROR=$( (doveadm backup -u "$mailaddr" -s "$oldstate" "maildir:$basedir/$backupdir") 2>&1 > "$statefile")
            retval=$?
            end_time=$(date +%s)
            let 'duration=end_time-start_time'
            if [ $retval -eq 2 ] ; then
                #if [ $retries -gt 1 ] ; then
                echo "$ERROR"
                echo "Synced $mailaddr successfully but missing some changes. Took $duration seconds. Starting retry $retries..."
                #fi
            elif [ $retval -ne 0 ] ; then
                echo "$ERROR"
                echo "Syncing $mailaddr failed. Return code $retval. Took $duration seconds. Removing backup directory and starting retry $retries..."
                rm -rf "$basedir/$backupdir"
                rm -f "$statefile" "$snapshotfile"
            elif [ $retries -gt 1 ] ; then
                echo "Successful sync took $duration seconds."
            fi
        done
        # downgrade lock to shared lock
        flock -sn 200
        [ $synconly -eq 1 ] && exit
        if [ $retval -ne 0 ] ; then
            echo "Too many retries. Aborting backup of $mailaddr."
            exit
        fi
        cd "$basedir"
        if [ $fullbackup -eq 1 ] || [ ! -f "$snapshotfile_full" ] ; then
            tar -cpzf "${backup_basename}_full.tar.gz" --level=0 -g "$snapshotfile_full" "$backupdir"
            cp -f "$snapshotfile_full" "$snapshotfile"
        else
            suffix=""
            if [ $differential -eq 1 ] ; then
                cp -f "$snapshotfile_full" "$snapshotfile"
                suffix="_diff"
            fi
            tar -cpzf "${backup_basename}${suffix}.tar.gz" -g "$snapshotfile" "$backupdir"
        fi
        cd - > /dev/null
        mv "${basedir}/"*.tar.gz "$targetdir"
    ) 200>"$lockfile"
    [ $synconly -eq 1 ] && continue
    # housekeeping
    newest_full=$(ls -1 "${targetdir}/${mailaddr}_"*_full.tar.gz 2>/dev/null | sort | tail -1)
    if [ -n "$newest_full" ] ; then
        #echo "Cleaning up files older than $newest_full..."
        find "$targetdir" -depth -maxdepth 1 -name "${mailaddr}_*" ! -newer "$newest_full" ! -samefile "$newest_full" -printf 'Deleting %p...\n' -delete
    fi
done
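
One thing I am not sure about in the script above is that the state
file is overwritten directly by the redirect, even if doveadm is
interrupted halfway. A possible variation, only a sketch: write the new
state to a temporary file and move it into place only if the command
exited with 0 or 2 and actually printed something. The function name
update_state is hypothetical, and treating exit code 2 as "state is
usable" is my assumption, not something I found in the docs.

```bash
#!/bin/bash
# Run a command that prints the new state on stdout. The state file is
# replaced only if the command exits with 0 or 2 AND printed a non-empty
# state; otherwise the old state file is left untouched.
# Returns the exit status of the command.
# Usage: update_state STATEFILE COMMAND [ARGS...]
update_state() {
    local statefile="$1"; shift
    local tmp rc=0
    # temp file next to the state file, so mv is a rename on the same filesystem
    tmp=$(mktemp "${statefile}.XXXXXX") || return 1
    "$@" > "$tmp" || rc=$?
    if { [ "$rc" -eq 0 ] || [ "$rc" -eq 2 ]; } && [ -s "$tmp" ] ; then
        mv -f "$tmp" "$statefile"
    else
        rm -f "$tmp"
    fi
    return "$rc"
}
```

In the script this would be called roughly as
update_state "$statefile" doveadm backup -u "$mailaddr" -s "$oldstate" "maildir:$basedir/$backupdir"
with stderr still to be captured separately for the notification.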