Why we insist on hosting our applications on our own cloud

We are a SaaS company with an extremely experienced team. Our five-person team has over 70 years of combined experience; two of us have been programming since before first grade and account for over 40 of those years on our own. Quality is our #1 goal. If something isn’t working after we’ve said it’s ready, we take it as a failure, no excuses: we analyze what went wrong and make changes so it doesn’t happen again. To achieve this level of quality, we’ve found that we need a controlled environment. We’ve also found that customers who insist on hosting our applications themselves are usually not as good at it as we are (sorry, we don’t want to sugar-coat it; that’s just usually the way it is). Every time we’ve set up our applications on someone else’s servers, it has cost us extra time, money, and, most importantly to us, quality.

So aside from the quality our cloud lets us achieve, here are some more reasons it’s the way we do business.

  • We value experts and recognize that even though we have a high degree of confidence in our own skills and abilities, there is a lot we don’t know, and there are always people who know more than we do. So even though we have three experienced Linux system administrators (two of whom managed all the servers for a dial-up ISP back in the day), we use server setups and configurations that are standard in the industry, with no weird hacks, tweaks, or funky setups. These are server images and configurations used by thousands of other companies.
  • Our servers are secure: the only services exposed are HTTP, HTTPS, and SSH, and SSH access requires a private key. No application we host has ever been hacked or compromised.
  • We make nightly backups and ship them off to Amazon Glacier (similar to Iron Mountain, but for data), and we keep the last week of backups on site so we can restore rapidly if we ever need to (we never have).
  • We have server monitoring in place, so if a server does go down, it wakes whoever is on call and they get it fixed. This happened a half-dozen times on one server in February 2012 while we were experimenting with server configurations, and once in March 2013 when we forgot to renew our SSL certificate (how embarrassing… it’s now good until 2017, and we have multiple calendar and email reminders in place so it won’t happen again).
  • We have extensive logging and automatic error notifications on our servers, so if something goes wrong at the application level, we get an email and fix it before you or your clients even know there is a problem (and that is a good feeling, let me tell you).
  • Because our servers are in the Amazon EC2 cloud, network availability is extremely high and it’s easy to adjust our capacity when needed. So if you have a product launch and all of a sudden our integration is getting slammed, we can easily move our application to a dedicated server or a larger shared one.
  • Our servers are equipped with an automated deployment system that lets us make and deploy changes and fixes very rapidly when needed, while minimizing the risk of human error in the deployment process.
  • When something does go wrong, we have forensic-level logging and tools in place that allow us to do an autopsy on the problem and prevent it from happening again, rather than just waiting to see if it recurs.
  • We’re also working on a high-availability strategy that will let us rapidly switch EC2 regions and guarantee uptime even in the event of a complete failure of an Amazon EC2 region. Right now we can switch regions in a few hours, but we want to get that down to the point where the limit is set by DNS TTLs rather than by the time it takes us to set up servers.
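To make the second bullet above concrete: the post doesn’t say which firewall the team actually uses, but one common way to express a “nothing but HTTP, HTTPS, and key-based SSH” policy on an Ubuntu-style server is with ufw. This is an illustrative sketch, not their actual configuration:

```shell
# Illustrative only -- the post doesn't name the firewall in use.
# Default-deny inbound, then open only the three services mentioned.
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow 80/tcp    # HTTP
sudo ufw allow 443/tcp   # HTTPS
sudo ufw allow 22/tcp    # SSH
sudo ufw enable

# To require a private key for SSH, in /etc/ssh/sshd_config:
#   PasswordAuthentication no
#   PubkeyAuthentication yes
# then reload the SSH daemon, e.g.: sudo service ssh reload
```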
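The backup bullet above describes a simple retention policy: keep the last week of nightly backups on site, and ship older ones off to Amazon Glacier before pruning them locally. A minimal sketch of that rotation logic (the function name, the seven-day cutoff constant, and the date handling are our own illustration, not the team’s actual tooling):

```python
from datetime import date, timedelta

# Hypothetical sketch of the retention policy described above:
# keep the last week of nightly backups locally; anything older is a
# candidate for off-site archival (e.g. Amazon Glacier) and local removal.
KEEP_DAYS = 7

def split_backups(backup_dates, today):
    """Partition backup dates into (keep_locally, archive_and_prune)."""
    cutoff = today - timedelta(days=KEEP_DAYS)
    keep = [d for d in backup_dates if d > cutoff]
    prune = [d for d in backup_dates if d <= cutoff]
    return keep, prune

if __name__ == "__main__":
    today = date(2013, 4, 10)
    backups = [today - timedelta(days=n) for n in range(10)]  # ten nightlies
    keep, prune = split_backups(backups, today)
    print(len(keep), len(prune))  # 7 stay on site, 3 are archived and pruned
```

The point of keeping a week locally is restore speed: pulling an archive back out of cold storage like Glacier can take hours, while a local copy restores immediately.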
