Recently there seems to have been a swell of interest around personal websites and blogs, and along with that I've noticed more discussion about embeddable comment engines and open source alternatives. A project I spent a few hours on early last year seems to approach this task in a different way, so I thought I'd tell you a little about it.
Comment boxes are found all over the web. Disqus is one of the more popular solutions for adding one to your own website, but it brings along ads and all the tracking that comes with ads on the web. There are open source comment engines as well, but all the ones I'm familiar with require a long-running web service, which can be more effort than it's worth for smaller sites or sites that are statically generated. Ideally what I want is something I can set and forget, that requires no maintenance and doesn't cost me anything if it doesn't get used much.
So now there's chatter, a statically generated comment engine. OK, "engine" might be overselling it; it's more of a comment "box". Regardless, static generation means we can serve our comments directly from S3, which pairs nicely with AWS Lambda to allow us to regenerate our comments on demand without the need to maintain a server. The pricing of these two services scales with usage, which in many cases means our maintenance and running costs round down to 0. As always, these benefits come with tradeoffs, the biggest of which is probably S3's eventual consistency.
We're using S3 as our datastore, which does not guarantee strong consistency. Instead, S3 is an "eventually consistent" system - a change we make may not be visible to us for a short time, until it has propagated throughout the rest of the system. (I think DNS is probably the most widely known example of an eventually consistent system?) In practical terms, the issue this poses for us is that we might miss recent comments when we're generating our comment index. To better illustrate the issue and the tradeoffs of our solution, the way chatter works with S3 is like this:
The issue rears its head around the second step. If our indexing function runs before the comment has fully propagated, the list operation it executes on our bucket may not include the latest files, which means the generated index will not contain the latest comments either. We could try to grab the latest version of the index file and add our comment to the end of it, but that would introduce a bigger issue: the risk of losing comments completely. If two comments were submitted in quick succession, the second Lambda function would likely get a stale copy of the index file and overwrite the changes from the first Lambda function. Instead, chatter does the simple thing and delays the generation of the index file, which has tradeoffs of its own.
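To make the difference between those two approaches concrete, here is a rough sketch in plain Python. It is not chatter's actual code: the bucket is just a dict standing in for S3, and the key layout (comments/0001.json), the comment fields, and the function names are all made up for illustration.

```python
import json

# In-memory stand-in for a bucket (key -> body). This is NOT a real S3
# client; the key layout and comment fields are assumptions for the demo.
bucket = {}


def append_to_index(index_copy, comment):
    """Read-modify-write: extend whatever copy of the index we were handed."""
    return index_copy + [comment]


def rebuild_index(store, prefix="comments/"):
    """Regenerate the index from every comment object currently listed."""
    keys = sorted(key for key in store if key.startswith(prefix))
    return [json.loads(store[key]) for key in keys]


# Two comments arrive in quick succession; each is stored as its own object.
first = {"id": "0001", "text": "first!"}
second = {"id": "0002", "text": "me too"}
bucket["comments/0001.json"] = json.dumps(first)
bucket["comments/0002.json"] = json.dumps(second)

# Approach 1: both handlers read the same (still empty) index before either
# one writes back, so the later write silently discards the earlier comment.
stale_copy = []
written_by_first = append_to_index(stale_copy, first)
written_by_second = append_to_index(stale_copy, second)
lost_update_index = written_by_second  # last write wins

# Approach 2: ignore the old index and rebuild from the comment objects.
# Nothing is lost permanently, because every comment still exists as a file.
rebuilt_index = rebuild_index(bucket)

print(len(lost_update_index), len(rebuilt_index))
```

The rebuild can only include what the listing returns, which is why a delay is still needed. But a comment that is missed by one run gets picked up by the next one, whereas the append approach can drop it for good.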
We could avoid this issue entirely if we used a datastore with stronger consistency guarantees. I'm not sure what the landscape looks like for pay-by-the-minute data stores, but I imagine most would fail the price requirement.
Getting chatter running is fairly simple thanks to zappa: install the few dependencies, tell it your S3 bucket, and you should only need to run
zappa deploy prod
to be up and running. The problem right now is that chatter doesn't come with a frontend, so integrating it into your site is a custom job.
You can see it working below, or find it on GitHub.