Converting nnml to Maildir
Converting “nnml” formatted messages to Maildir formatted messages. Crudely.
I’m feeling stressed and frustrated, so naturally my brain is inventing all kinds of displacement activities to keep me distracted. I’ll have to force myself to confront the real issues eventually, but first, let me do just this one little thing…
I wanted to play with a bit of software that needs a corpus of email messages in Maildir format. Trouble is, all my email is stored in nnml format, an eccentricity of Gnus.
No problem, I figured, I’ll just convert them. Except, I couldn’t find any tools that would do that. I glanced at the Maildir spec and decided that writing one was not a simple enough task to be a displacement activity!
I’d resigned myself to tinkering with a little trickle of messages that I could get over a few days (or maybe downloading all of my messages from GMail) when I happened randomly across the Python reference page for mailbox.
Could I just brute force it in Python, I wondered?
Yes, yes, I could.
import os
import re
from mailbox import Maildir
from email.parser import BytesParser

ROOT = "Mail"
total = 0
parser = BytesParser()
for root, dirs, files in os.walk(ROOT):
    # Skip the top-level directory itself; only its subdirectories are folders.
    if root != ROOT:
        # Mail/one/two/three becomes one.two.three
        folder = root[len(ROOT) + 1:].replace("/", ".")
        maildir = None
        count = 0
        for name in files:
            # nnml messages are files named 1, 2, 3, ...; ignore everything else.
            if re.match(r"^\d+$", name):
                # Test against None: a Maildir with no messages counts as false.
                if maildir is None:
                    # /tmp/x/Maildir must already exist.
                    os.mkdir("/tmp/x/Maildir/%s" % folder)
                    os.mkdir("/tmp/x/Maildir/%s/cur" % folder)
                    os.mkdir("/tmp/x/Maildir/%s/new" % folder)
                    os.mkdir("/tmp/x/Maildir/%s/tmp" % folder)
                    maildir = Maildir("/tmp/x/Maildir/%s" % folder)
                fn = os.path.join(root, name)
                # Progress report every 100 messages.
                if count != 0 and count % 100 == 0:
                    print("Added %d to %s" % (count, folder))
                with open(fn, "rb") as mail:
                    message = parser.parse(mail)
                maildir.add(message)
                count = count + 1
        print("Added %d to %s" % (count, folder))
        total = total + count
print("Total messages: %d" % total)
The nnml format is just directories full of messages in individual files named “1”, “2”, “3”, etc. There may be a few other files in there so I ignore any with names that don’t match ^\d+$. I’m making no effort to preserve the status of messages (read, unread, marked, etc.).
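To make the file-name filter concrete, here is a tiny sketch of it in isolation. The helper name is_nnml_message is made up for illustration, and the non-message names in the list are just plausible examples of “other files”, not an inventory of what Gnus actually writes. It uses re.fullmatch instead of the ^ and $ anchors, which should behave the same for ordinary file names.

```python
import re

# Hypothetical helper; the pattern is the same digits-only test the script uses.
NNML_MESSAGE = re.compile(r"\d+")

def is_nnml_message(name):
    """Return True if a file name consists only of digits ("1", "2", ...)."""
    return NNML_MESSAGE.fullmatch(name) is not None

# Illustrative directory listing (assumed names, not taken from a real tree).
names = ["1", "2", "42", ".overview", ".marks", "12~", "active"]
messages = [n for n in names if is_nnml_message(n)]
print(messages)
```

Anything with a stray character in the name, such as an editor backup like “12~”, is skipped rather than imported as a message.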
In nnml, directories can be nested and I don’t know if Maildir supports that so I turn Mail/one/two/three into Maildir/one.two.three. It took all of about ten minutes to bang together and a good bit longer to process 150k+ messages. You’ll have noticed that I’m shoving them all into /tmp/x/Maildir, well out of harm’s way so that I can sanity check them before dropping them into ~/Maildir.
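For the sanity check, one rough approach (a sketch, not what I actually ran) is to count the numerically named files in each nnml directory and compare that with the number of messages Python sees in the matching Maildir. The function names count_nnml, count_maildir and check are invented for this example, and it assumes the same Mail and /tmp/x/Maildir locations and the same slash-to-dot folder naming as the script above.

```python
import os
import re
from mailbox import Maildir

def count_nnml(folder_path):
    """Count the digit-named regular files directly inside one nnml directory."""
    return sum(
        1 for n in os.listdir(folder_path)
        if re.fullmatch(r"\d+", n) and os.path.isfile(os.path.join(folder_path, n))
    )

def count_maildir(maildir_path):
    """Count messages in an existing Maildir without creating anything."""
    return len(Maildir(maildir_path, create=False))

def check(nnml_root, maildir_root):
    """Yield (folder, nnml_count, maildir_count) for folders whose counts differ."""
    for root, dirs, files in os.walk(nnml_root):
        rel = os.path.relpath(root, nnml_root)
        if rel == ".":
            continue  # the top-level directory is not a folder
        expected = count_nnml(root)
        if expected == 0:
            continue  # the converter never creates a Maildir for these
        folder = rel.replace(os.sep, ".")
        target = os.path.join(maildir_root, folder)
        actual = count_maildir(target) if os.path.isdir(target) else 0
        if actual != expected:
            yield folder, expected, actual

if __name__ == "__main__":
    # Assumed paths, matching the conversion script.
    for folder, expected, actual in check("Mail", "/tmp/x/Maildir"):
        print("%s: %d in nnml, %d in Maildir" % (folder, expected, actual))
```

No output means every folder has the same number of messages on both sides. It only compares counts, so it won’t notice a message that was mangled on the way through.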
I had no idea if that would be sufficient, but it appears to work, so: win!
It’s crude and it’s ugly but if you, random reader from the future, happened to type “convert nnml into Maildir” into your favorite search engine and it got you here, “you’re welcome.” I hope it works for you!