Volume 2, Issue 22; 23 Aug 2018

Building bridges between pages. Oh, and spam, of course.

Recently, my attention was drawn to Webmention. The description on A List Apart provides a good summary of its features, so I won’t recapitulate them here.

I’m interested in improving the web and I’m happy to use this weblog as a playground to experiment with technologies that might improve it. If we’re serious about “fixing the web,” it’s going to take a whole lot more than a few new bits of technology to do so, and I hope I have the courage and stamina to participate fully in those other activities as well, but that’s not what this post is about.

This post is about webmention, which I have about ¾ implemented in about 500 lines of XQuery. (Yay XQuery, also not what this post is about.)

I was thinking about how I might carve out another couple of hours to finish my implementation this weekend when I realized that I’d left out the entire moderation infrastructure, the whole problem of moderation, really.

On this site, which is low volume, way down in the long tail, and built with entirely custom software, there’s still a steady (thankfully small) volume of spam to address. I don’t believe anyone has automated spamming this weblog, so the spam comments must be typed in by actual human beings (which is sad and a little bit horrifying). Automating spam comments looks like a tractable machine learning problem to my naïve eyes, so it’s probably just a matter of time. Despite the fact that there is no chance that I will approve a comment that points to drugz or warez or whatever they’re selling, some still get typed in.

In principle, webmention is fine. You mention me, I point to you. I mention you, you point to me. But webmention is only a very tiny bit harder to abuse for spam purposes than comments.

In fact, it’s arguably easier. It’s definitely harder in the sense that there’s a little more infrastructure to set up: the spam page must contain a link back to the page you’re mentioning, for example. But it’s automatable and it’s going to be harder to moderate. Much harder.
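To make the "it's automatable" point concrete, here is a minimal Python sketch of the link-back check a receiver might run. It is not my XQuery implementation; the names `LinkCollector` and `source_links_to_target` are made up for illustration, and the exact string comparison is a simplifying assumption (a real receiver would probably normalize URLs first).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the absolute form of every <a href> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the source page's URL.
                    self.links.add(urljoin(self.base_url, value))


def source_links_to_target(source_html, source_url, target_url):
    """Return True if the source page contains a link to the target."""
    collector = LinkCollector(source_url)
    collector.feed(source_html)
    return target_url in collector.links
```

A spam page passes this check just by including one anchor element, which is why the check raises the bar only slightly.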

If you type in a comment, I look at your text and decide if you’re spamming. Your text is presented to me as text; there’s no scripting, there’s no editing it after the fact, there’s nowhere for you to conceal evil intent.

To moderate your webmention, I’m going to have to follow the link back to your site. Now we’re in a whole other world where even the act of moderation exposes me to a small risk in terms of malware delivered by the site. Do I believe what I see on your page? If I believe it today, is there any reason to believe it won’t be different tomorrow? How often do I have to re-moderate each mention?
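One partial answer to the "different tomorrow" question is to remember a fingerprint of the page as it looked when I approved it and flag the mention again if the fingerprint changes. The sketch below assumes I already have the page bytes in hand; the function names are hypothetical. It will also flag pages that change for innocent reasons (timestamps, ads), and it does nothing about a site that serves different content to different visitors.

```python
import hashlib


def fingerprint(page_bytes):
    """Return a SHA-256 hex digest of the page as fetched."""
    return hashlib.sha256(page_bytes).hexdigest()


def needs_remoderation(approved_fingerprint, current_page_bytes):
    """True if the page no longer matches what was approved."""
    return fingerprint(current_page_bytes) != approved_fingerprint
```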

A system like Vouch would help, but it really increases the usability and implementation challenge for participating in webmention. In principle, that “Vouch” link earlier in this paragraph is a webmention. My blogging software can automatically determine if webmention is supported and automatically establish the mention. But if I need to provide a pair of URIs for each mention, the link URI and the vouching URI, then I, the author, need somewhere to type in the vouching URI and some way to associate it with the link. Suddenly, it isn’t just automatic and it’s a lot harder.
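For the sake of illustration, here is roughly what the extra URI does to the request. A webmention is a form-encoded POST carrying `source` and `target`; my understanding of the Vouch proposal is that it adds a third parameter, `vouch`, so treat that name as an assumption. The function `webmention_body` is hypothetical, and the sketch only builds the body; it does not send anything.

```python
from urllib.parse import urlencode


def webmention_body(source, target, vouch=None):
    """Build the form-encoded body for a webmention request.

    `vouch` is optional; the parameter name is assumed from the
    Vouch proposal rather than taken from the Webmention spec itself.
    """
    params = {"source": source, "target": target}
    if vouch is not None:
        params["vouch"] = vouch
    return urlencode(params)
```

Building the body is the easy part. The hard part is that nothing in the post itself tells the software which vouching URI goes with which link, so the author has to supply it.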

Adapting my webmention implementation to support moderation is going to make it hugely more complicated (there will have to be unmoderated and moderated mentions, a system for identifying the unmoderated ones, a form for approving or rejecting them, the back end to support that form, etc.). And nothing about writing the moderation system is going to be fun or interesting.
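Just to show the shape of the bookkeeping, here is a minimal in-memory sketch in Python. It is not what I'd actually build (that would be XQuery against a database, plus the form and its back end); every name here is hypothetical, and it assumes a mention is identified by its source and target pair.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Mention:
    source: str
    target: str
    status: Status = Status.PENDING


class ModerationQueue:
    """Tracks mentions by (source, target) and their moderation status."""

    def __init__(self):
        self._mentions = {}

    def receive(self, source, target):
        # A new or re-sent mention always starts out unmoderated.
        mention = Mention(source, target)
        self._mentions[(source, target)] = mention
        return mention

    def pending(self):
        """Mentions still waiting for a decision."""
        return [m for m in self._mentions.values()
                if m.status is Status.PENDING]

    def decide(self, source, target, approve):
        """Record the moderator's decision for one mention."""
        mention = self._mentions[(source, target)]
        mention.status = Status.APPROVED if approve else Status.REJECTED
        return mention

    def published(self, target):
        """Approved mentions to display on the target page."""
        return [m for m in self._mentions.values()
                if m.target == target and m.status is Status.APPROVED]
```

Even this toy version has three states and four operations, and it hasn't touched persistence, authentication, or the approval form.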

The whole thing is depressing. Human beings suck. And don’t get me started about the use of microformats in pages to make webmentions more presentable.

Never mind. I should go back to working on my XProc implementation anyway, I guess.