It started out innocently with a well-known meme, but in the end this whole thing turned out to be genuinely useful for my work, so I thought it'd be great to share it here.
I'll spare you the pain of hearing the backstory of meme exchanges between me and my friends; instead, I'll tell you about the conditions under which this code finally became useful.
So basically, one of the simulation services I had was running really well. It ran inside a VM that was nested inside another VM that was nested inside a big server. In my line of work, simulations mean calculations, and for my part, those calculations can range from light structural calculations to long-running, parallel, asynchronous finite element simulations.
The simulation services were spread across several instances on several VMs. We used pretty well-established and battle-tested frameworks. Still, from time to time, there were cases where, because of such extensive and demanding calculations, the app would run out of memory and practically crash. It was tolerable at first, until one day the whole container froze and took the VM down with it. I couldn't stop, pause, restart, or do anything to fix it. I had to turn off the whole VM and restart it. Luckily all the containers were set to restart automatically, but it still sucked losing precious hours struggling with this.
During a little discussion with my colleague, I offered: “Why don't we crash it before it runs out of memory?” We laughed at that idea for a while, but almost immediately we started thinking seriously about it.
/meirl
We tried some things for a while, and I finally found out that it was a bit easier than I thought.
Taking advantage of goroutines, I simply made a function that runs continuously alongside the main function, monitoring memory consumption the whole time. The function runs every second, and when it detects that memory usage has passed a certain limit, it shuts the app down. The limit is defined in the environment variable APP_MEMORY_LIMIT; if that variable isn't defined, it falls back to a default value defined in the code.
The function’s definition:
// app/internal/watcher/memwatcher.go
package memwatch

import (
    "log"
    "os"
    "runtime"
    "strconv"
    "time"
)

// CheckMemoryUsage polls the Go runtime's memory stats every second
// and exits the process once allocated heap memory passes the limit.
func CheckMemoryUsage() {
    var memStats runtime.MemStats

    // Read the limit (in MB) from APP_MEMORY_LIMIT, with a fallback default.
    limitStr := os.Getenv("APP_MEMORY_LIMIT")
    var maxMemory uint64
    if limitStr != "" {
        limit, err := strconv.ParseUint(limitStr, 10, 64)
        if err != nil {
            log.Printf("Invalid APP_MEMORY_LIMIT value, defaulting to 5MB: %v\n", err)
            maxMemory = 5 * 1024 * 1024 // Default to 5 MB for this example.
        } else {
            maxMemory = limit * 1024 * 1024 // Convert MB to bytes.
        }
    } else {
        maxMemory = 5 * 1024 * 1024 // Default to 5 MB for this example.
    }

    for {
        // Alloc counts bytes of allocated heap objects, not the process's total RSS.
        runtime.ReadMemStats(&memStats)
        if memStats.Alloc > maxMemory {
            log.Printf("Memory usage exceeded %v bytes, shutting down\n", maxMemory)
            os.Exit(1)
        }
        time.Sleep(1 * time.Second)
    }
}
You can see and test this function in action in the example repo that I've put on my GitHub. In short, though, the function is called from main() as follows:
package main

import (
    // The app name is 'crashing-this-fibo' btw
    "github.com/ahmad-alkadri/crashing-this-fibo/internal/api"
    memwatch "github.com/ahmad-alkadri/crashing-this-fibo/internal/watcher"

    "log"
    "net/http"
)

func main() {
    go memwatch.CheckMemoryUsage() // Start memory monitoring
    mux := api.NewRouter()         // Setup routes using the built-in HTTP routing
    log.Println("Server is running on port 8080...")
    log.Fatal(http.ListenAndServe(":8080", mux))
}
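Since the watcher reads its limit from APP_MEMORY_LIMIT in megabytes, running the app with a custom limit would look something like the line below (assuming, on my part, that the main package sits at the repo root):

    APP_MEMORY_LIMIT=64 go run .   # watcher exits the process past ~64 MB of heap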
The example application I use here is a simple REST API app that accepts requests to the /fib endpoint with a query parameter 'n', where 'n' is an integer; it calculates, and returns, the Fibonacci number for that 'n'.
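For context, here's a minimal sketch of what such a router and handler could look like. The naive recursive fib is an assumption on my part (it conveniently burns CPU and memory under load); the actual implementation in the repo may differ:

    package api

    import (
        "fmt"
        "net/http"
        "strconv"
    )

    // fib computes the nth Fibonacci number the naive recursive way,
    // which gets expensive fast: exactly what we want for a load test.
    func fib(n int) int {
        if n < 2 {
            return n
        }
        return fib(n-1) + fib(n-2)
    }

    // NewRouter wires up the /fib endpoint on a standard library mux.
    func NewRouter() *http.ServeMux {
        mux := http.NewServeMux()
        mux.HandleFunc("/fib", func(w http.ResponseWriter, r *http.Request) {
            n, err := strconv.Atoi(r.URL.Query().Get("n"))
            if err != nil || n < 0 {
                http.Error(w, "query parameter 'n' must be a non-negative integer", http.StatusBadRequest)
                return
            }
            fmt.Fprintf(w, "%d\n", fib(n))
        })
        return mux
    }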
Also, for demo purposes, I've included in the repo a bash script that basically calls the vegeta tool to self-DDoS (jk, it's a heavy-duty load-testing tool, pretty cool because of its capabilities) and hit the endpoint above with around 200 requests per second (or even more). With that script you can flood the endpoint of the Go REST API app above with requests, and if things work correctly, the memory watcher will terminate the app once the load pushes memory past the limit.
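If you'd rather not use the script, a vegeta invocation along these lines should reproduce the load (the URL and the numbers here are just an example, not the exact values from the repo's script):

    echo "GET http://localhost:8080/fib?n=40" | vegeta attack -rate=200 -duration=30s | vegeta report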
Remember Though
One thing to remember from all of the above, though (in case it wasn't clear): this method is not fit for long-term use. At best it's a band-aid. If your app could run out of memory in production because of too many requests, then instead of purposefully crashing the app:
- Implement some kind of queue system
- Implement some kind of rate limiter (see the sketch after this list)
- etc.
Especially if many people (customers, clients, etc.) depend on your app, treat the crash-on-purpose trick as tech debt to be paid.
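To make the rate-limiter idea concrete, here's a minimal sketch using only the standard library: a buffered channel acts as a semaphore capping how many requests are processed at once, rejecting the rest with a 429 instead of letting memory balloon. The handler names and the limit of 10 are illustrative assumptions, not code from the repo:

    package main

    import (
        "fmt"
        "log"
        "net/http"
    )

    // limitConcurrency caps how many requests run at once by using a buffered
    // channel as a semaphore. Requests that find no free slot get a 429
    // instead of piling up and exhausting memory.
    func limitConcurrency(max int, next http.Handler) http.Handler {
        sem := make(chan struct{}, max)
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            select {
            case sem <- struct{}{}: // acquire a slot
                defer func() { <-sem }() // release the slot when done
                next.ServeHTTP(w, r)
            default: // all slots busy: shed load early
                http.Error(w, "too many requests", http.StatusTooManyRequests)
            }
        })
    }

    func main() {
        heavy := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            fmt.Fprintln(w, "expensive work done")
        })
        http.Handle("/work", limitConcurrency(10, heavy))
        log.Fatal(http.ListenAndServe(":8080", nil))
    }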
And yes, from my side, the simulation app was eventually properly fixed, and of course the big guy function above is no longer used. Still, there could be some cases or situations where it comes in handy.
So, to close this short blog: You’re welcome to clone the repo, launch the app, test it, modify it as you wish, take or use the memory watcher function, etc.
If you have any questions, feel free to ask!