GIF Monger

Behold my genius as I unfold my journey of GIF slingin’ in Slack!

Steve Hersch

Feb 11, 2024 • 14 min read

We are gathered here today to discuss GIF Monger, my app for Slack (Slack gets mad if you call it a “Slack app”) that has been invading Slack workspaces from here to Hanoi. Please marvel at my Homeric prose as I discuss the rollercoaster ride that led to the foundation of my sprawling GIF empire.

In the beginning...

The idea to create GIF Monger was born from the absence that the Tenor GIF Keyboard left when it disappeared from the Slack app directory back in December of 2021. I later heard from someone at Tenor that they think it was pulled because Slack created a native integration for Giphy. We’ll dig more into that and why that sucks later on, but for now, the relevant point is this: I liked Tenor’s keyboard more. A lot of people I know liked Tenor’s keyboard more.

I had gotten pretty handy with Python in the couple of years before that, so I figured “how hard can it be to re-create it?” Now, mind you, I was not, and am not, a developer. I can write some code, but I’m not setting the world on fire with it (unless it’s due to a bug). And let me tell you about what I actually thought this app was going to be like.

User types /gifs
GET request to Tenor
User chooses GIF to submit
???
GIF is posted to channel

Here’s the final product to give you an idea:

Nice and simple, right? A couple of simple API calls and I’ll call it a day.

…

The joke was on me, and you'll realize why in a minute. I created a new app in Slack's API website. I think I just called it "GIF Keyboard" at the time. Powering through the seemingly endless options of Slack's developer portal, I pretty much disregarded the "basic information" section entirely. The Interactivity & Shortcuts section is where things started to click. I clicked the button to enable interactivity, which is required if you want your Slack app to have buttons or menus. It then expects you to enter a request URL. And then it clicked. This wasn't going to be some little serverless script that I was going to dump into Slack's portal. I was going to have to build a server application. Oh man.

As I've said before, I'm not a developer. However, I have developed a couple of things before. At this point, the flagship example of my coding prowess was on display in the form of https://wgetweather.com, a very basic Flask application that will return to you a PrettyTable of basic weather information based on your geographic location (using your public IP address). It was running in a Docker container on my Synology DS918+. I was using the Linuxserver.io SWAG container to manage proxying and SSL termination. Well, I wasn't about to spend real money on a domain for this thing quite yet, so gifs.stevenhersch.com was going to have to suffice as the request URL!

To the terminal we go. I created my Python virtual environment, installed Flask, and started reading documentation.

It took a couple of days, but I eventually got /gifs to actually return a hello world!, and for that I was proud. At least at this point, the communication was now there between Slack and my little server that could. It was a couple of years ago at this point, so I won't pretend to remember the details. Tenor's API documentation was solid though; I remember that. And Slack's blocks formatting for messaging... well, the documentation wasn't super clear, but I actually like how they make you format the messages. I didn't really understand the value of it at the time, but now that GIF Monger has had numerous makeovers, it is pretty clear why it's like that.

It took me maybe a week before I was able to completely re-create the user experience of the Tenor GIF Keyboard. In my own private Slack workspace, I could now type /gifs <word/phrase>, get a list of 4 GIFs, and then I could submit one. Cool; time to submit the app for review.
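For a rough idea of what that picker looks like in Slack's blocks format, here is a minimal sketch. It is not the actual GIF Monger code: the function names, the `send_gif` action ID, and the shape of the `gifs` array (a stand-in for whatever the Tenor search returns) are all assumptions made for illustration.

```javascript
// Hypothetical helper: turns a list of GIF results into Slack blocks.
// Each GIF gets an image block followed by an actions block with a "Send" button.
function buildGifPickerBlocks(searchTerm, gifs) {
    const blocks = [];
    for (const gif of gifs.slice(0, 4)) { // the picker shows 4 GIFs
        blocks.push({
            type: 'image',
            image_url: gif.url,
            alt_text: gif.title || searchTerm,
        });
        blocks.push({
            type: 'actions',
            elements: [{
                type: 'button',
                text: { type: 'plain_text', text: 'Send' },
                action_id: 'send_gif', // assumed name, not from the real app
                value: gif.url,        // handed back in the interactivity payload
            }],
        });
    }
    return blocks;
}

// Ephemeral, so only the person who typed /gifs sees the picker.
function buildPickerMessage(searchTerm, gifs) {
    return {
        response_type: 'ephemeral',
        text: `GIF results for "${searchTerm}"`,
        blocks: buildGifPickerBlocks(searchTerm, gifs),
    };
}
```

When a button is clicked, Slack sends the button's `value` to the interactivity request URL, which is enough to know which GIF to post.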

Or not.

OAuth. I want to say "I forgot about OAuth", but I didn't. I was just wholeheartedly unaware of any need for it until I discovered in Slack's dev portal that an app can't be distributed without it.

One more time for the folks in the back: I am not a developer.

I had absolutely no idea how to make this work. I could argue that even today, my understanding is quite limited. But back in December of 2021, I was in over my skis.

Between my ignorance toward OAuth and my infant keeping my wife and me up all night every night, I did not have the capacity or the time to try to solve this problem. The final nail in the coffin came in May 2022 when I moved on to a new job that used MS Teams.

A monger is born...

After a whirlwind of a year, I started a new job in June of 2023 that had me Back in ~~Black~~ Slack! And this time, I was there to stay. I joined some other Slack workspaces (Pulumi, Grafana, etc.) as well, which really cemented Slack into my life. I was actually feeling smarter and more energized. I don't know what made me do it, but in early October, I decided to re-visit this GIF Keyboard code. It was totally foreign to me, and was poorly coded. I had learned a lot since my last foray into the GIF distribution industry, and I was ready to give it another go.

Instead of just skimming Slack's documentation, I actually read it. Turns out the installation and OAuth redirect weren't as hard as I thought. To be perfectly honest though, I was still unsure of myself, so I bounced back and forth with ChatGPT about it. Turns out I was overcomplicating the whole process. The whole thing ended up being two Flask endpoints: /install, and /oauth-redirect, each with only a few lines of code. I was actually a bit disappointed in myself for not realizing how straight-forward it actually was. But oh well, we're here now, and we have OAuth. I wasn't at the finish line yet, but I was now able to install it in a couple of different workspaces. My friend installed it in her personal workspace, and I had it in mine, where another friend and I were able to use it as well. It was slow...
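To show how little is involved, here is a sketch of the two steps behind /install and /oauth-redirect. It is written in JavaScript rather than the Flask of the time, and it is an illustration under assumptions, not the real code: the function names, client ID, scopes, and redirect URI are placeholders, and it assumes Node 18+ for the built-in `fetch`.

```javascript
const AUTHORIZE_URL = 'https://slack.com/oauth/v2/authorize';
const TOKEN_URL = 'https://slack.com/api/oauth.v2.access';

// Step 1 (/install): build the URL that sends the browser to Slack's consent page.
function buildInstallUrl({ clientId, scopes, redirectUri }) {
    const params = new URLSearchParams({
        client_id: clientId,
        scope: scopes.join(','),
        redirect_uri: redirectUri,
    });
    return `${AUTHORIZE_URL}?${params.toString()}`;
}

// Step 2 (/oauth-redirect): Slack redirects back with ?code=...,
// which gets exchanged for an access token.
async function exchangeCodeForToken({ clientId, clientSecret, code, redirectUri }) {
    const response = await fetch(TOKEN_URL, {
        method: 'POST',
        headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
        body: new URLSearchParams({
            client_id: clientId,
            client_secret: clientSecret,
            code,
            redirect_uri: redirectUri,
        }),
    });
    const data = await response.json();
    if (!data.ok) {
        throw new Error(`Slack OAuth failed: ${data.error}`);
    }
    return data; // contains the access token and workspace details on success
}
```

The /install endpoint only needs to redirect to the result of `buildInstallUrl`, and /oauth-redirect only needs to call `exchangeCodeForToken` with the `code` query parameter.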

Very

slow...

I was getting a lot of operation_timeout errors. This was due to Slack's limitations on their response URLs. When the user makes a request to your app server, part of the payload is a response URL. I saw this as a very easy way to communicate back and forth. It made perfect sense to leverage these response URLs. I didn't know what to do. I thought maybe it was something to do with my home network. At this point, I was no longer using SWAG; I was using a Cloudflare tunnel, so I was wondering if I was working my Synology too hard. I decided to move the app. In hindsight, this was dumb, but it's not the dumbest part; that has yet to come.

I rented a VPS through OVH, and I got Docker and the Cloudflare tunnel set up with my fancy new domain https://gifmonger.xyz, and I got the app running on there. Problem still not solved. Hm...

Determined to see victory, I chose the nuclear option. In late 2022 and early 2023, I learned Javascript and a little bit of Swift. At one point, out of curiosity, I wrote a really small, simple script in all three languages. Javascript and Swift performed pretty much exactly the same. Python though... ouch. After seeing the performance of the other two, I felt like I could freaking walk faster than Python. So... I re-wrote the entire app as an Express.js app.

I was still new to Javascript so this took a bit of work, but once I figured out the basics of Express and Axios, it was pretty easy to convert my functions and conditionals and other basic code to js.

BUT IT DIDN'T STOP THE TIMEOUTS!

My friend, who I will affectionately refer to as the Keyboard Sloth unless he later explicitly allows me to name him, suggested I actually learn about async. So I did. I'm definitely not a pro, but after I converted some functions to async, I no longer got my timeouts! To this day, I'm technically not out of the woods on this, but we'll get into that shortly. For now though, my app was running, and was ready for Slack to review! So I whipped up a quick static website, I word-vomited my way through a privacy policy and ToS that I had my lawyer friend skim and approve, and then I plugged in all the information into Slack's portal. It was supposed to take weeks for them to review... and it freaking did... AND IT WAS DENIED.

Turns out the signing secret wasn't just a suggestion; oops. This is a good opportunity for me to pause and discuss Slack's javascript SDK. It's supposed to make light work of things like this request verification and OAuth and whatnot... but no matter what I did, I could never get anything to work. The entirety of GIF Monger does not use Slack's SDK. I still have this line in my code: const { WebClient } = require('@slack/web-api'); but it doesn't do anything. When it came to this request signature verification, I REALLY wanted to use it because this was a serious brain teaser to do manually, but it failed me at every turn. I'd be more than willing to accept that it's just my own ignorance that caused those failures, but I couldn't get past it, so I had to do it manually.

If you're curious what Slack's signature verification looks like, here's their doc: https://api.slack.com/authentication/verifying-requests-from-slack

Here's my description though: do a bunch of angry math on the request's timestamp and raw body, using the signing secret that Slack provides you in the dev portal; if the result of that angry math matches the signature that Slack sends in the request headers, and the timestamp of the request is within the allotted amount of time, then the request is legitimate. Doing it without the Slack SDK was very difficult for me, and I'm not ashamed to admit it. Here is my less-than-stellar code that deals with this problem.

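const crypto = require('crypto');

// signingSecret is the app's signing secret from Slack's dev portal, loaded elsewhere.
// req.rawBody must be the raw, unparsed request body exactly as Slack sent it.
function verifySlackRequest(req) {
    const timestamp = req.headers['x-slack-request-timestamp'];
    const slackSignature = req.headers['x-slack-signature'];

    if (!slackSignature) {
        console.error('Verification failed: slackSignature is undefined.');
        return false;
    }

    // Reject requests more than five minutes old to guard against replays.
    if (Math.abs(Math.floor(Date.now() / 1000) - parseInt(timestamp, 10)) > 300) {
        console.error('Verification failed: Request timestamp is too old.');
        return false;
    }

    // Recompute the signature from the timestamp and raw body.
    const sigBasestring = 'v0:' + timestamp + ':' + req.rawBody;
    const mySignature = 'v0=' + crypto.createHmac('sha256', signingSecret)
        .update(sigBasestring)
        .digest('hex');

    // timingSafeEqual throws if the buffers differ in length, so check that first.
    const mine = Buffer.from(mySignature);
    const theirs = Buffer.from(slackSignature);
    if (mine.length !== theirs.length || !crypto.timingSafeEqual(mine, theirs)) {
        console.error('Verification failed: Signature mismatch.');
        return false;
    }
    return true;
}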

As I look at it again, I can't help but think that a couple of OR operators could make that much cleaner, but I suppose I'd have to forgo the very simple error logging I have in there. Anyway, I'm sure this is an easy hurdle for some people, but I'm not one of those people, and I'm proud that I was able to get it working. For the last time, I'm not a developer! This was the final, and most difficult, step before the GIF Monger was ready for his global debut.
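For the curious, the OR-operator version might look something like the sketch below. It runs the same three checks as the function above, minus the logging. The function name and the explicit parameters (including `nowSeconds`, passed in so it is easy to test) are assumptions for illustration and not what actually runs in GIF Monger.

```javascript
const crypto = require('crypto');

// Hypothetical compact variant: returns true only if every check passes.
function isValidSlackRequest({ timestamp, rawBody, slackSignature, signingSecret, nowSeconds }) {
    const expected = 'v0=' + crypto.createHmac('sha256', signingSecret)
        .update(`v0:${timestamp}:${rawBody}`)
        .digest('hex');
    const mine = Buffer.from(expected);
    const theirs = Buffer.from(slackSignature || '');
    const age = Math.abs(nowSeconds - Number(timestamp));

    return !(
        !slackSignature ||                   // header missing
        !(age <= 300) ||                     // too old, or timestamp not a number
        mine.length !== theirs.length ||     // timingSafeEqual throws on a length mismatch
        !crypto.timingSafeEqual(mine, theirs)
    );
}
```

The trade-off is exactly the one mentioned: a single boolean comes back, so there is no way to tell which check failed without adding the logging back in.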

ThanksGIFing

November 23rd, 2023. Thanksgiving. A Thanksgiving like any other, but with one small difference: Slack approved GIF Monger for the app directory! I hit the publish button, and now I had a production application... in production.

I think I reached almost 100 installs in the first week. I even got an e-mail from a guy telling me how excited his team is to have it. He made a couple of requests though.

He wanted the GIF to be posted as the user. When I built the app, I did it the same way the Tenor GIF Keyboard originally worked: it posted as the bot user, and then it tagged the person that invoked it, and it showed their search term. I actually ended up having a number of people request this because they wanted to be able to delete/edit the post. Others also suggested not posting the search term.
He requested captioning. This is a feature that was in the original Tenor GIF Keyboard that I had not implemented. It is an expensive feature as far as compute resources go, and I was not generating income from this, so I was not in a hurry to look at this feature.

Posting as the user seemed doable, so I decided to look into it. It turned out to be more complicated than I wanted it to be though. It required something that I very much did not want to deal with: a database.
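Why a database? Posting as the user means calling Slack's chat.postMessage with that user's own token instead of the bot token, and that token has to live somewhere between the moment they authorize and the moment they send a GIF. A minimal sketch of the posting half, assuming Node 18+ for the built-in `fetch`; the function names are made up for illustration and this is not the actual GIF Monger code.

```javascript
// Build the message body: a single image block, plus fallback text for notifications.
function buildGifMessage(channelId, gifUrl, altText) {
    return {
        channel: channelId,
        text: altText,
        blocks: [{ type: 'image', image_url: gifUrl, alt_text: altText }],
    };
}

// Send the message with a stored user token so it is posted as that user,
// which is what lets them edit or delete it afterwards.
async function postAsUser(userToken, message) {
    const response = await fetch('https://slack.com/api/chat.postMessage', {
        method: 'POST',
        headers: {
            'Authorization': `Bearer ${userToken}`,
            'Content-Type': 'application/json; charset=utf-8',
        },
        body: JSON.stringify(message),
    });
    const data = await response.json();
    if (!data.ok) {
        throw new Error(`chat.postMessage failed: ${data.error}`);
    }
    return data;
}
```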

All your database are belong to us.

Databases are something I never really messed with. I didn't want to. I don't want to be responsible for data. Security implications aside, it's just annoying, and I didn't really know anything about SQL other than SELECT * FROM table, so I wasn't even sure where to begin. But, the Keyboard Sloth had my back. I had heard of ORMs before, but had never looked at one. He recommended Prisma. If I had known about this previously, I probably would have been a lot less apprehensive about working with databases. Better late than never. So I got Prisma and SQLite going. I was then able to start collecting user tokens, which is what is needed in order to be able to post as the user. But this wasn't really a complete solution. Unlike the app itself, which is built and deployed via a GitLab CI/CD pipeline from a repository in GitLab Cloud, the database was not redundant, so I had to do something to protect it. SaaS databases weren't an option due to cost. Litestream was going to save the day. I set it up to replicate the database back to my Cloudflare R2 object storage. Brilliant. Deployed to production.

Bad idea.

For some reason, I was unable to get it to restore when I updated the app. So it would just create a new database every time and start all over again. I'm sure if I spent enough time on it, I could have found a way to make it work, but I was feeling pressured at this point because I wanted to keep working on the app, but any further deployments would reset all of the user tokens I'd collected, forcing the users to re-authorize the app.

Vercel has a managed Postgres database. The free tier allows for 60 hours of compute per month. This seemed like plenty. During all my shenanigans with Litestream, the database got corrupted, so I had no way to bring over all the user authorizations I already had. I wasn't happy about that, but I definitely learned something from it: don't use janky, undocumented solutions to manage user data.

Moving to Vercel was what made me realize the value of Prisma. I changed one line in the schema.prisma file, and one environment variable to the new URL, and that was it. Nice and simple.

One day after moving to Vercel, I checked the stats, and I had already burned 5 hours of compute time. Five hours per day was going to be more than double what the free tier allowed. Time to get clever. And I know what you're thinking: Redis! But you forget: I'm on a shoestring budget! So I really simply created a function that would copy the entire database to an object. Any time a new user would authorize, it would write to Vercel, and to the cache object. Any time an existing user used the app, it would only look at the cache object. There were way more existing users than new users, so this seemed like a simple way to resolve the problem.
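A minimal sketch of that cache idea, under stated assumptions: `TokenCache` is a made-up name, and `db` is any object with async `findAll()` and `create(record)` methods, which are hypothetical stand-ins for the real Prisma calls rather than Prisma's actual API.

```javascript
class TokenCache {
    constructor(db) {
        this.db = db;
        this.tokens = new Map(); // userId -> token
    }

    // Run once at startup: copy the whole table into memory.
    async load() {
        const rows = await this.db.findAll();
        for (const row of rows) {
            this.tokens.set(row.userId, row.token);
        }
    }

    // New authorization: write to the database and to the cache.
    async save(userId, token) {
        await this.db.create({ userId, token });
        this.tokens.set(userId, token);
    }

    // Existing user: read from memory only, so the database stays asleep.
    get(userId) {
        return this.tokens.get(userId);
    }
}
```

The database is touched once at startup and once per new authorization; every lookup for an existing user is just a `Map` read.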

I was wrong though. The number of new users was increasing every day. Every time a new user would authorize, it would wake up the Vercel instance. Vercel had a timeout of 5 minutes from the last request. So after that, I was down to three hours, which was still far more than what I was allowed on free tier. It was time to explore other options.

The Keyboard Sloth strikes again: he suggested PlanetScale. PlanetScale is MySQL, not Postgres, which does not matter to me and my simple app that has a single table. The one thing that's great about PlanetScale is that they don't have a limitation on compute time. Their free tier is quite usable. It allots 1 billion reads per month and 10 million writes, with 5GB of storage. I make about 100,000 reads per month, and as of right now, a few hundred writes per month, and my database is a whopping 280 kilobytes!

The caveats

If you recall earlier in this adventure, there were a couple of items that I said I’d discuss later. This is that moment. Let’s dig into them.

Slack’s native Giphy integration. Needless to say, it is hard to compete with a native integration. The number one headache that this has caused me is the number of requests I get to enable GIF Monger to work in Slack threads. I wish it were that simple. As of right now, Slack does not allow developer-created slash commands to work in threads. There is no practical method for me to make it happen. I have contacted Slack about this, and they “may” do something about this in the future. Either way, it feels unfair to me that Giphy can work in threads but GIF Monger cannot, but there is nothing I can do at the moment. I will be keeping an eye on changes to their documentation, and I await the day there is a way for me to get the Monger into threads.
Remember when I said I implemented some async/await magic, and that fixed all my issues? Well… that’s half true. The reality is that I’m still not addressing the responses correctly. Javascript being that much faster than Python fixed it for now, but what I believe I should be doing is immediately responding to the request with some sort of acknowledgement, THEN I can go about doing the rest of the work. This would have worked perfectly fine in Python too. So… moving to JS and implementing async fixed the symptom, not the problem. I expect that if GIF Monger ever reaches a level of traffic that has it constantly mongering GIFs, we may see the timeouts come up again.
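The "acknowledge first, work later" pattern described above could look roughly like this. It is a sketch of the idea, not something GIF Monger does today: `handleSlashCommand` is a made-up name, `res` is an Express-style response object, and `searchGifs` and `postToResponseUrl` are hypothetical helpers passed in so the flow is easy to follow and test.

```javascript
function handleSlashCommand(payload, res, { searchGifs, postToResponseUrl }) {
    // 1. Acknowledge immediately with an empty 200. Slack expects a reply to a
    //    slash command within about 3 seconds, or the user sees a timeout.
    res.status(200).end();

    // 2. Do the slow part afterwards and deliver the result through the
    //    response URL that came with the original request.
    return searchGifs(payload.text)
        .then((gifs) => postToResponseUrl(payload.response_url, {
            response_type: 'ephemeral',
            text: `Found ${gifs.length} GIFs for "${payload.text}"`,
        }))
        .catch((err) => console.error('Deferred work failed:', err));
}
```

Because the acknowledgement no longer waits on the GIF search, how fast the search runs stops mattering for the timeout, regardless of language.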

I forgot to talk about the fun stuff!!

Let’s talk about infrastructure for a minute! Bearing in mind that GIF Monger is not profitable, I cut some corners here, but I still think my configuration is pretty nifty, and I’d like to shine a light on it.

As previously discussed, I started off on a Synology and moved to a VM in OVH. I’m still rocking the OVH VM, but I actually wanted to be a bit more automated in my deployments to minimize downtime. In my home lab, I’m a pretty big Portainer fan. I actually run two GIF Monger containers; one for production and one for staging. I created a Slack app for the staging version, and I use it in my personal Slack workspace now. Anything that survives there goes to production. Anyway, as I mentioned before, I use GitLab to store my code and for CI/CD. Each GIF Monger environment has its own Docker Compose file in Portainer, along with the appropriate environment variables for each. The one little bonus feature here is that each compose stack in Portainer has a webhook that triggers a pull and update. When I merge into Staging, it creates a Docker image, and pushes it to my private container repo that’s on my Synology with the :staging tag, and then when I merge to Main, it does the same but tags it with :latest. After that, the pipeline triggers the webhook for the specified environment, and bam, app update! I’m considering bumping things up to Docker Swarm, but I haven’t had a genuine need for it yet, so we’ll see. Another fun fact, although not exactly relevant to GIF Monger, is that I’ve been testing out using Gitpod to work on GIF Monger. I actually don’t own a personal computer right now, so my iPad Air has been my workhorse for anything not relating to work. Gitpod has been a great addition to my arsenal since it gives me a proper IDE instead of just a text editor (not that Working Copy and Textastic haven’t been great).

Where do we go now?

So it’s February 11th, 2024. There are almost 500 active workspaces using GIF Monger now. That’s about 475 more than what it was going to take to make me proud. I’m ecstatic that it has been well received and is being used on a daily basis. It continues to grow, and I’m hoping to keep thinking of creative new things for it. I am currently considering ways to monetize it in order to not only cover the current costs, but to cover the costs of a little bit more robust infrastructure like a small ECS cluster and maybe switching to AWS RDS. Obviously, captioning is still on the table; we’ll have to see how that tanks my CPU though. Anyway, that’s the state of GIF Monger. It has been an immense learning experience for me.