Archiving / Transferring Email with imapsync

One increasingly common task here involves transferring email between two accounts, either to archive a departing employee’s mail or to migrate an employee’s email between services. There are three possible mail back ends at the UW: Deskmail (in-house IMAP), Microsoft (Office 365), and Google (G Suite). We might be called on to migrate mail between accounts on any combination of those.

One way to do this would be to set up both accounts in a desktop email client such as Thunderbird or Outlook and drag and drop between the two. For small amounts of mail, this works fine. But once you’ve got mailboxes where folder organization matters, or if you’re going from Google (which has labels, not folders) to anything else, you’re going to want something more robust.

I’ve been using a command line tool named imapsync. I installed it on my desktop Mac using Homebrew, but it’s available for Linux and Windows as well. You can do pretty much anything with imapsync; it’s just a matter of finding the right options.

IMPORTANT: If this is a departing employee situation, make sure they know that this process can copy their entire mail archive, not just work-related mail. Many employees are or have been students, and it may not be appropriate to copy their entire mail archive. If they spend some time creating folders for messages they want to hand off upon departure, you can use the “--folder” or “--folderrec” options to select just those folders for transfer.
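As a rough sketch of that folder selection, the snippet below assembles an imapsync command that would copy only two folders, then prints it for review rather than running it. The folder names “Handoff” and “Projects-To-Keep” are hypothetical, as are the placeholder accounts; --folder takes a single folder, --folderrec takes a folder plus everything beneath it, and --dry keeps it a dry run.

```shell
# Hypothetical folder names; substitute the ones the employee actually created.
set -- \
    --host1 outlook.office365.com --user1 ACCOUNT1@uw.edu --password1 'PASSWORD1' \
    --host2 outlook.office365.com --user2 ACCOUNT2@uw.edu --password2 'PASSWORD2' \
    --ssl1 --ssl2 \
    --folder 'Handoff' \
    --folderrec 'Projects-To-Keep' \
    --noexpunge \
    --dry

# Print the assembled command instead of running it.
preview="imapsync $*"
echo "$preview"

# Once the printed command looks right, run it for real with:
#   imapsync "$@"
```

Both options can be repeated if there are more folders to hand off.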

NB: Test any of these recipes first, by adding the --dry flag to make it a dry run. I notice I’ve been using single quotes in some places and double quotes in others; I should probably see if there’s a functional reason for that or if it’s a mistake.
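On the quoting question, a quick check suggests the difference is functional for the regex arguments, at least in a POSIX shell on a Mac or Linux. The sketch below only uses printf to show what the shell would hand to a program in each case; it doesn’t run imapsync, and the variable names are just for illustration.

```shell
# Clear the positional parameters so $1 is empty, as in a fresh interactive shell.
set --

# Double quotes: the shell expands $1 and turns \\ into \ before the program sees the text.
double_mess=$(printf '%s' "s,(.{10500}),$1\r\n,g")
double_flag=$(printf '%s' "s/\\Flagged//g")

# Single quotes: the text is passed through exactly as typed.
single_mess=$(printf '%s' 's,(.{10500}),$1\r\n,g')
single_flag=$(printf '%s' 's/\\Flagged//g')

printf 'double quotes: %s   %s\n' "$double_mess" "$double_flag"
printf 'single quotes: %s   %s\n' "$single_mess" "$single_flag"
```

With double quotes the $1 backreference disappears, so the --regexmess substitution would replace each long run of characters with a bare line break instead of appending one. Single quotes look like the safer choice for these arguments on Unix-like systems.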

Here are some recipes I’ve been using:

Simple transfer between two Office365 accounts

This one just copies everything from ACCOUNT1 to ACCOUNT2, without deleting anything which already exists in ACCOUNT2.

imapsync \
    --host1 outlook.office365.com \
    --user1 ACCOUNT1@uw.edu \
    --password1 'PASSWORD1' \
    --host2 outlook.office365.com \
    --user2 ACCOUNT2@uw.edu \
    --password2 'PASSWORD2' \
    --ssl1 \
    --ssl2 \
    --maxsize 45000000 \
    --maxmessagespersecond 4 \
    --regexflag 's/\\Flagged//g' \
    --disarmreadreceipts \
    --regexmess 's,(.{10500}),$1\r\n,g' \
    --noexpunge

From Gmail to Office 365, a full user archive

This one copies everything from a Gmail account to an Office 365 account, ignoring all Gmail labels and just dumping everything from ACCOUNT1 into a folder named “Archive-User” under ACCOUNT2. By specifying the “[Gmail]/All Mail” folder, we should get all the messages, and we don’t get duplicates from messages that have multiple labels. Google has a transfer limit of 2GB/day over IMAP; the maxbytespersecond option is set to a value a little below that, to allow the user to keep using their account during the transfer, which will probably take a couple of days if they have a large mail archive.
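To sanity-check that rate, here is the arithmetic in shell. The 20000 figure is the --maxbytespersecond value used in the recipe; the 5 GB mailbox size is a made-up example, not a measurement.

```shell
rate=20000                       # bytes per second, as passed to --maxbytespersecond
per_day=$(( rate * 86400 ))      # 86400 seconds in a day
echo "bytes per day: $per_day"   # 1728000000, roughly 1.7 GB

# Hypothetical 5 GB mailbox, purely for illustration.
mailbox=5000000000
days=$(( (mailbox + per_day - 1) / per_day ))   # round up to whole days
echo "days needed: $days"
```

So at this rate a mailbox of that size would need about three days. With that in mind, here is the recipe: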

imapsync \
    --host1 imap.gmail.com \
    --user1 ACCOUNT1@uw.edu \
    --password1 'PASSWORD1' \
    --host2 outlook.office365.com \
    --user2 ACCOUNT2@uw.edu \
    --password2 'PASSWORD2' \
    --ssl1 \
    --ssl2 \
    --maxsize 45000000 \
    --maxmessagespersecond 4 \
    --regexflag 's/\\Flagged//g' \
    --disarmreadreceipts \
    --regexmess 's,(.{10500}),$1\r\n,g' \
    --maxbytespersecond 20000 \
    --useheader="X-Gmail-Received" \
    --useheader "Message-Id" \
    --regextrans2 "s,\[Gmail\].,," \
    --f1f2 "[Gmail]/All Mail"="Archive-User" \
    --folder "[Gmail]/All Mail" \
    --noexpunge

Since this eliminates the folder structure, it’s not a great recipe for migrating an active user from Gmail to Office 365. I don’t have a good recipe for that scenario yet, due to the problem of duplicate messages created by multiple Gmail labels. Imapsync provides a workaround for that, but I haven’t had an opportunity to try it yet (from the Gmail FAQ):

--skipcrossduplicates is optional but it can save Gigabytes of hard
disk memory. Within imap protocol, Gmail presents Gmail labels as
folders, so a message labeled "Work" "ProjectX" "Urgent" ends up
in three different imap folders "Work" "ProjectX" and "Urgent"
after an imap sync. --skipcrossduplicates prevent this behavior.

An issue with --skipcrossduplicates is that the first label synced
by imapsync goes to its corresponding folder and other labels are
ignored. This way, at least you can choose what labels have the
priority by using the --folderfirst option. For example
--folderfirst "Work" will sync messages labeled "Work" before
messages labeled "CanWait" or "Urgent". By default imapsync
syncs folders (Gmail labels) using the classical alphanumeric order.

Deskmail to Office 365

This copies a Deskmail account to an Office 365 account, maintaining the folder structure as much as possible. NB: it’s possible to have mail outside of the “mail” folder in Deskmail, so do check for that first, and if it’s the case, leave out the two lines before “automap”.

imapsync \
    --host1 NETID1.deskmail.washington.edu \
    --user1 NETID1 \
    --password1 'PASSWORD1' \
    --host2 outlook.office365.com \
    --user2 NETID2@uw.edu \
    --password2 'PASSWORD2' \
    --ssl1 \
    --ssl2 \
    --maxsize 45000000 \
    --maxlinelength 10500 \
    --maxmessagespersecond 4 \
    --regexflag 's/\\Flagged//g' \
    --disarmreadreceipts \
    --regexmess 's,(.{10500}),$1\r\n,g' \
    --noexpunge \
    --folderrec 'mail' \
    --regextrans2 's,^(.*?)/mail/(.+),$1/$2,' \
    --automap 

If you want to copy everything in a Deskmail account to Office 365 but put it under a subfolder, you can add the following line:

    --subfolder2 'Archive-Folder' \

That’ll preserve the original folder structure, but stick it all under the “Archive-Folder” folder on the Office 365 side.

All these options were gleaned from the documentation at https://imapsync.lamiral.info/FAQ.d/, particularly FAQ.Gmail.txt and FAQ.Exchange.txt — there are good starting recipes in both, with explanations.

If there’s a particular scenario you’d like help with (at the UW), let me know and I’ll see if I can put together a working recipe.

Update, 2018.02.06:

Deskmail to Gmail

imapsync --host1 NETID.deskmail.washington.edu \
    --user1 NETID \
    --password1 'DESKMAIL PASSWORD' \
    --ssl1 \
    --expunge1 \
    --host2 imap.gmail.com \
    --user2 username@gmail.com \
    --password2 'GMAIL PASSWORD' \
    --ssl2 \
    --maxbytespersecond 20000 \
    --maxsize 25000000 \
    --addheader \
    --exclude "\[Gmail\]$" \
    --regextrans2 "s/[ ]+/_/g" \
    --regextrans2 "s/['\^\"\\\\]/_/g" \
    --folder INBOX \
    --folderrec "mail" \
    --regextrans2 's,^(.*?)/mail/(.+),$1/$2,' \
    --subfolder2 "Deskmail-Archive" 

I haven’t tried this one yet. You might wonder why even bother, since UW IT provides a Deskmail to UW Gmail migration tool. Well, now that they’re discontinuing the Deskmail service, people such as faculty emeriti who are not eligible for UW Gmail will still need to be able to move their Deskmail somewhere. It should also be useful for staff who retire and get to keep their email address (via forwarding) but not the service. I’ll give it a try soon.

This one assumes that all mail is in the INBOX and mail/ tree, and transfers it all to a Deskmail-Archive folder in Gmail. If you don’t want to use the “Deskmail-Archive” folder to store things under, and want to duplicate the folder structure, replace the last two lines with these lines:

    --regextrans2 's,^mail/(.+),$1,' \
    --automap

It might also be worth grabbing a copy of everything right off the Deskmail server for safekeeping:

mkdir deskmail-archive
rsync -Pav NETID@NETID.deskmail.washington.edu: deskmail-archive
