Converting nnml to Maildir

Volume 4, Issue 8; 25 Jan 2020

Converting “nnml” formatted messages to Maildir formatted messages. Crudely.

I’m feeling stressed and frustrated, so naturally my brain is inventing all kinds of displacement activities to keep me distracted. I’ll have to force myself to confront the real issues eventually, but first, let me do just this one little thing…

I wanted to play with a bit of software that needs a corpus of email messages in Maildir format. Trouble is, all my email is stored in nnml format, an eccentricity of Gnus.

No problem, I figured, I’ll just convert them. Except, I couldn’t find any tools that would do that. I glanced at the Maildir spec and decided that writing one was not a simple enough task to be a displacement activity!

I’d resigned myself to tinkering with a little trickle of messages that I could get over a few days (or maybe downloading all of my messages from Gmail) when I happened randomly across the Python reference page for mailbox.

Could I just brute force it in Python, I wondered?

Yes, yes, I could.

import os
import sys
import re
from mailbox import Maildir
from email.parser import BytesParser

ROOT = "Mail"

total = 0
parser = BytesParser()
for root, dirs, files in os.walk(ROOT):
    if root != ROOT:
        # Strip the leading "Mail/" and flatten nested directories with dots
        folder = root[len(ROOT) + 1:].replace("/", ".")

        maildir = None
        count = 0
        for name in files:
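            if re.match(r"^\d+$", name):
                # Compare against None: an empty Maildir has len() 0 and is falsy
                if maildir is None:
                    # makedirs also creates /tmp/x/Maildir itself if it is missing
                    os.makedirs("/tmp/x/Maildir/%s" % folder, exist_ok=True)
                    os.makedirs("/tmp/x/Maildir/%s/cur" % folder, exist_ok=True)
                    os.makedirs("/tmp/x/Maildir/%s/new" % folder, exist_ok=True)
                    os.makedirs("/tmp/x/Maildir/%s/tmp" % folder, exist_ok=True)
                    maildir = Maildir("/tmp/x/Maildir/%s" % folder)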
                fn = os.path.join(root, name)
                if count != 0 and count % 100 == 0:
                    print("Added %d to %s" % (count, folder))
                with open(fn, "rb") as mail:
                    message = parser.parse(mail)
                # Actually store the parsed message in the Maildir folder
                maildir.add(message)
                count = count + 1
        print("Added %d to %s" % (count, folder))
        total = total + count

print("Total messages: %d" % total)

The nnml format is just directories full of messages in individual files named “1”, “2”, “3”, etc. There may be a few other files in there so I ignore any with names that don’t match ^\d+$. I’m making no effort to preserve the status of messages (read, unread, marked, etc.).
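The filename filter is easy to try on its own. This is just a sketch: the helper name is_nnml_article is mine, and the non-numeric file names in the list are made up for illustration, not a definitive list of what Gnus leaves lying around.

```python
import re

# Message files in nnml have names made up entirely of digits.
ARTICLE_NAME = re.compile(r"^\d+$")


def is_nnml_article(name):
    """Return True if `name` looks like an nnml message file (hypothetical helper)."""
    return ARTICLE_NAME.match(name) is not None


# Illustrative directory listing; only the all-digit names should survive.
names = ["1", "2", "317", ".overview", "notes.txt", "12~"]
articles = [n for n in names if is_nnml_article(n)]
print(articles)
```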

In nnml, directories can be nested, and I don’t know if Maildir supports that, so I turn Mail/one/two/three into Maildir/one.two.three. It took all of about ten minutes to bang together and a good bit longer to process 150k+ messages. You’ll have noticed that I’m shoving them all into /tmp/x/Maildir, well out of harm’s way, so that I can sanity check them before dropping them into ~/Maildir.
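The folder-name flattening can be sketched as a little function. The name maildir_folder_name and its top parameter are my own inventions for illustration. I believe some Maildir readers (Python’s mailbox module included) expect subfolder names to start with a dot; this sketch doesn’t handle that.

```python
import os


def maildir_folder_name(root, top="Mail"):
    """Flatten a nested nnml directory into a dotted folder name (hypothetical helper)."""
    # Path of `root` relative to the top-level mail directory, e.g. "one/two/three"
    rel = os.path.relpath(root, top)
    # Replace the platform's path separator with dots
    return rel.replace(os.sep, ".")


nested = os.path.join("Mail", "one", "two", "three")
flat = maildir_folder_name(nested)
print(flat)
```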

I had no idea if that would be sufficient, but it appears to work, so: win!
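One crude way to sanity check the result is to ask Python how many messages it sees in a converted folder and compare that with what the script reported. A minimal sketch, assuming a helper I’ve called count_messages; the demo runs against a throwaway Maildir in a temporary directory rather than anything real.

```python
import os
import tempfile
from mailbox import Maildir


def count_messages(path):
    """Return the number of messages in the Maildir at `path` (hypothetical helper)."""
    # create=False raises an error instead of silently making an empty mailbox
    box = Maildir(path, create=False)
    try:
        return len(box)
    finally:
        box.close()


# Demo: build a tiny Maildir, add one message, and count it.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo")
    demo = Maildir(demo_path)  # creates demo/ with cur, new, and tmp
    demo.add(b"From: a@example.com\nSubject: test\n\nHello\n")
    demo.close()
    demo_count = count_messages(demo_path)
    print(demo_count)
```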

It’s crude and it’s ugly but if you, random reader from the future, happened to type “convert nnml into Maildir” into your favorite search engine and it got you here, “you’re welcome.” I hope it works for you!