Managing all processes with Supervisord to prepare for chaos engineering
Hello, I'm Munou.
How is everyone doing?
Yesterday, I went cycling for a total of 6 hours, meeting up with friends along the way.


It was a jazz festival, so I walked around the city listening to music, and the outfits of the elderly people were often interesting and inspiring.
The reality of people out and about in the city, seen directly, is more truthful than street snaps and the like.
Then, my friend was going to make bolognese pasta at their house, so we went to a business supermarket and decided to try making it with super thick noodles called "PRAI Nest Pasta". We made it together and it was incredibly delicious.
So, it was a day where we went to a vintage clothing store, talked about our impressions of Project Mirai, learned about optimal transport and natural language processing until around 2 AM, and then I rode my fixed-gear bike dozens of kilometers back home.
Introduction
The reason I've been re-organizing my home infrastructure environment recently is that I want to use it as a test environment for chaos engineering, which I've been interested in for a while.
Netflix originally released a tool called Chaos Monkey, but its forked monkeys now seem to be an endangered species.
They are disappearing at a speed comparable to cryptocurrencies, but amid all that there is an OSS project called ChaosBlade that is still being actively updated. This is what I want to run.
It has been adopted by Alibaba Cloud in China, and where it is actually used is summarized in the Issues:
Who is using ChaosBlade
There's almost no information about it in Japanese, but the documentation is quite solid and commits land constantly, so it should be usable.
I plan to delve into the details once I actually start working on it.
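For reference, this is the kind of interface ChaosBlade exposes, going by its documentation; I haven't actually run these yet, so treat the exact flags as illustrative:

```shell
# Inject 60% CPU load for 120 seconds (auto-recovers after the timeout).
blade create cpu fullload --cpu-percent 60 --timeout 120

# The create command prints an experiment uid; to stop an experiment early:
# blade destroy <uid>
```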
Daemon startup before Supervisord migration
Honestly, I never thought I'd keep doing it this way for as long as I did.
I had actually been starting everything by writing a script, calling it from the server's rc.local, and letting that wrapper launch things at boot.
Specifically,
rc.local
$ sudo cat /etc/rc.local
[sudo] password for haturatu
#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.
#echo never > /sys/kernel/mm/transparent_hugepage/enabled
[ -d /etc/boot.d ] && run-parts /etc/boot.d
/root/serverboot.sh &
In this way, each application was launched from serverboot.sh, a script I had roughly thrown together in root's home directory.
Honestly, it was just running scripts, so whether "wrapper" is even the right word for it is beside the point.
However, as you may have noticed, with this method a crashed process stays dead, and logs aren't written out per application.
So I moved everything under Supervisord.
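For reference, one Supervisord program definition looks roughly like this; the program name, paths, and user below are placeholders, not my actual config:

```ini
; /etc/supervisor/conf.d/myapp.conf -- minimal sketch with placeholder names
[program:myapp]
command=/home/haturatu/bin/myapp      ; the daemon to run (foreground mode)
directory=/home/haturatu
user=haturatu
autostart=true                        ; start when supervisord starts
autorestart=true                      ; restart automatically if it dies
stdout_logfile=/var/log/supervisor/myapp.out.log
stderr_logfile=/var/log/supervisor/myapp.err.log
```

After dropping a file like this in place, `sudo supervisorctl reread && sudo supervisorctl update` picks it up, and `supervisorctl status` shows each process's state.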
Afterwards
With this, even if a process dies for some reason, Supervisor automatically restarts it, and each one writes logs as configured.
This amounted to a total of 8 processes.
It didn't even take an hour to do, so I thought I should have done it sooner, but it can't be helped. Because I'm human.
Other things to do
Things like setting up a load balancer with HAProxy. For that I need another server, but a second machine in the same house defeats the purpose, so I need to find a cheap VPS somewhere.
The reason is that a home environment carries high risks such as power outages, so the second server is better off in a completely different location.
This isn't just a home-server concern: I recall the large-scale AWS US-region outage around 2021. When I was following exchanges like Binance going down at the time, the cause turned out to be AWS.
A redundant configuration within the same region is ultimately meaningless.
The same goes for power outages: I remember you could tour the inside of a Google DC in the US via YouTube or maybe Google Maps, and while a failure there seems almost impossible, it isn't actually impossible.
One might ask, "Why not use Nginx for load balancing instead of HAProxy?" but I remember reading this a while back:
Why you should never use nginx for load balancing
Besides, if the web server and the load balancer run on the same software, things get muddled, so using HAProxy feels more explicit and easier to reason about.
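As a sketch of what that split would look like, here is a minimal round-robin HAProxy config; the backend names and addresses are made up for illustration (a home server plus a hypothetical VPS):

```
# /etc/haproxy/haproxy.cfg -- minimal sketch, placeholder addresses
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend www
    bind *:80
    default_backend web_servers

backend web_servers
    balance roundrobin
    server home 192.168.1.10:8080 check   # home server
    server vps  203.0.113.20:8080 check   # cheap VPS, different location
```

The `check` keyword makes HAProxy health-check each backend, so traffic shifts to the surviving server when one goes down.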
That said, if you ask whether a redundant configuration is really worth it at this stage, the answer is a hard no. Burning wasted cloud resources with something like Chaos Monkey seems like asking for trouble, and doing failover at the DNS level is a whole separate problem.
After thinking about it a bit more, I plan to run ChaosBlade.
I haven't even set things up to collect information from the servers during chaos experiments, so in a way this will continue as a sequel.
That's all for now. See you again.