Web and backend services are critical to keeping any business up and operational 24/7/365. It’s a new world. If one of your web services or sites goes down for even a minute, that could cost you money. Big money. The reputation damage alone could cost you millions. When I am in the middle of a critical business deal and an app I use (e.g. Slack, Atlassian Jira) goes down, I get very, very angry – like the Incredible Hulk!!!
To keep your business up and your customers happy, you need to invest in good development practices that include:
- Automated testing
- Continuous Integration / Continuous Deployment
- Rollback technology when you make a mistake
- Single-button push deployment to Production
- Automated SSL cert renewals
- Site Reliability Engineering
- Database Migration for app upgrades
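
To make the rollback and single-button-push items above a little more concrete, here is a minimal Python sketch of a deploy step that falls back to the previous release when health checks fail. It is only an illustration under stated assumptions, not the pipeline used at CardBoard: `deploy_with_rollback`, the `deploy` and `is_healthy` callables, and the release names are all hypothetical stand-ins for whatever your hosting platform actually provides.

```python
from typing import Callable, List


def deploy_with_rollback(
    release: str,
    previous: str,
    deploy: Callable[[str], None],   # hypothetical: pushes a release to Production
    is_healthy: Callable[[], bool],  # hypothetical: e.g. pings a health endpoint
    checks: int = 3,
) -> str:
    """Deploy `release`; if no health check passes, redeploy `previous`.

    Returns the name of the release that ends up live.
    """
    deploy(release)
    # Give the new release up to `checks` tries to report healthy.
    # A real pipeline would likely wait between tries; omitted here for brevity.
    for _ in range(checks):
        if is_healthy():
            return release
    # Every check failed: roll back to the last known good release.
    deploy(previous)
    return previous


if __name__ == "__main__":
    log: List[str] = []
    # Simulate a bad release: the health check never passes.
    live = deploy_with_rollback("v42", "v41", log.append, lambda: False)
    print(live, log)
```

The point of the design is that rollback is part of the same button push: whoever deploys never has to decide under pressure whether to revert, because the deploy step does it when the health check fails.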
I have done this successfully many times. When I was with CardBoard, my team and I had better uptime than Slack. The world went crazy when Slack went down. It was awful. I rarely got complaints about downtime at CardBoard. In fact, I only had one issue, at the start of the war between Russia and Ukraine in 2022. It was a terrible feeling; I dropped everything to figure out what the problem was. I never want to feel that way again.
So, how did I have better uptime for CardBoard than Slack in 2020? Here are the keys:
- I have a strong background in Cloud Architecture and Site Reliability
- I love the Heroku platform and what it provides for hosting, scalability, and PostgreSQL support
- GitHub from Microsoft is fire 🔥 🔥 🔥! It’s another great Ruby on Rails app like CardBoard!
Some people use the term “DevOps” to mean all of this. It’s actually a lot more than just “DevOps”. Solution architecture, good engineering/development practices, and, last but not least, good people are needed to make all of this happen. I once heard a quote from Brad Markison from Roche, one of the best Engineer/Physicist/Coder/Leaders I have ever worked with: “A fool with a tool is still a fool”.