A Hypothetical Web Service Startup Stack

Lately I’ve been wondering:

If I were to build a startup whose main product was a web service (i.e. where the website was an integral part of the product, not just an elaborate ad), what setup would I choose?

(That situation might actually not be so unlikely.) This article is my personal semi-educated answer to this question.

This setup assumes a small team (1–4 engineers) with a mostly monolithic app (except for a split between UI and API, see below) and only a few dozen QPS. Larger services have somewhat different demands that I think most people don’t need to plan for from the outset. Incorporating microservices should not be too problematic, though (up to the point where you need a more sophisticated system for deploying and managing all of these jobs).

Of course, having a great product is way more important than a “good” stack, and there are many, many ways to build a great product. But there are still a few (or a lot of?) foundational picks to make. The ones in this article are one place to start from.

What would be your picks?

Hosting

  • One main SSD-equipped dedicated root server, hosted either with Hetzner or OVH. This is quite cheap (less than € 100 per month) and provides lots of power.
  • A second (smaller) root server or VPS that only does load balancing via nginx or HAProxy. This way, the main server does not need to be exposed to the public web at all, hopefully reducing the surface area for attacks (especially database hacks).
  • Alternative: Put everything on AWS. This is better for horizontal scaling, but a few beefy servers can actually get you quite far, as Stack Overflow demonstrates. It might also let us avoid having to set up and run our own database (e.g. with Amazon RDS or DynamoDB).
    I don’t have enough experience with AWS to tell whether their services make deployment so much easier that they already warrant the somewhat higher cost. If there are other important things that you get “for free” with AWS, let me know. I’m probably missing something.

Backups

Every day, a script should back up the service’s database(s) and upload the dump to a remote storage service (e.g. Amazon S3). Other data on the servers should not need to be backed up, as we should have scripts that can set up a server with a production configuration from scratch. These setup scripts should also run daily to provision an additional server, which then fetches and restores the latest database backup. This verifies that we can quickly recover from an outage. Plus, this server provides us with a system that we can fail over to quickly in case something goes wrong with the main machine.
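To make the daily backup step a bit more concrete, here is a minimal sketch in Python. It assumes PostgreSQL (picked further down) and Amazon S3 via the third-party boto3 library; the key layout, the /tmp path and all function names are made up for illustration.

```python
import datetime
import subprocess


def backup_key(db_name, when):
    """Return a dated object key, one backup per database per day."""
    return f"backups/{db_name}/{when:%Y-%m-%d}.dump"


def pg_dump_command(db_name, out_path):
    """Build the pg_dump call; the custom format is what pg_restore reads."""
    return ["pg_dump", "--format=custom", f"--file={out_path}", db_name]


def run_backup(db_name, bucket, today=None):
    """Dump the database to a local file, then upload it to S3."""
    import boto3  # third-party; imported here so the helpers work without it

    today = today or datetime.date.today()
    out_path = f"/tmp/{db_name}.dump"  # hypothetical scratch location
    # check=True raises if pg_dump fails, so a broken dump is never uploaded.
    subprocess.run(pg_dump_command(db_name, out_path), check=True)
    key = backup_key(db_name, today)
    boto3.client("s3").upload_file(out_path, bucket, key)
    return key
```

A cron job (or similar) would call something like run_backup("appdb", "my-backup-bucket") once a day, where both names are placeholders. The restore job on the additional server would do the reverse: download the newest key and feed it to pg_restore.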

(We also need a service that monitors the successful completion of these scripts so that we know if anything is not in order.)
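One simple way to approach this monitoring is a freshness check: each script records a timestamp when it finishes successfully, and a separate job complains if any timestamp is too old. The sketch below assumes exactly that; the 26-hour threshold (a day plus some slack) and the job names are my own assumptions, not part of any particular monitoring product.

```python
import datetime

# Assumed threshold: daily jobs get two hours of slack before we alert.
MAX_AGE = datetime.timedelta(hours=26)


def is_fresh(last_success, now, max_age=MAX_AGE):
    """True if the job has succeeded at least once and recently enough."""
    return last_success is not None and now - last_success <= max_age


def stale_jobs(last_runs, now, max_age=MAX_AGE):
    """Return the names of all jobs whose last success is missing or too old."""
    return sorted(
        name for name, last_success in last_runs.items()
        if not is_fresh(last_success, now, max_age)
    )
```

Anything returned by stale_jobs would then be sent to whatever alerting channel the team uses. Checking for the absence of a recent success (rather than waiting for an error report) also catches the case where the script never ran at all.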

Deployment

In order to make the setup of new production, staging and development machines easier (and more reproducible), the servers themselves should need as little individual configuration and state as possible. This can be achieved by running them on CoreOS with Docker for containerization.

For our purposes, the Docker-provided services for orchestration (Docker Swarm) and composition (Docker Compose) should be sufficient.

This also lets us create a development environment that mirrors production, which avoids subtle bugs due to inconsistencies between the dev and prod environments.

Serving

(Especially for this part, there are a gazillion alternatives.)

  • Frontend: If possible, a static site that uses React to communicate with the backend and display data. I’d probably use either TypeScript or CoffeeScript instead of writing plain JavaScript.
  • Backend: A RESTful API hosted with Django served by nginx. I would actually prefer a compiled, statically-typed language, but Python+Django lets you get started extremely quickly and there are tons of Python libraries for any imaginable purpose (although Python’s package managers are a bit weird).
    Now that Swift is open source, it will hopefully be possible to write solid web services with it soon. But the current state of Swift web frameworks (e.g. Perfect) is not very impressive.
    Alternatives:
    Protocol Buffers for data transfers, but I have no experience using them with JavaScript. Node.js or Ruby on Rails for the actual code. Or a million other options.
  • Database: PostgreSQL for a relational database. I have little experience with non-proprietary key-value stores, so no opinion there. Protocol Buffers are also great for data storage (especially if you need to extend your schema later), but it seems like only Riak supports them out of the box.
    (This article is not concerned with datasets large enough to warrant the use of e.g. Redis and Hadoop yet.)
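To illustrate the UI/API split, here is a tiny JSON endpoint of the kind the static frontend would call. Note that this is deliberately not Django: it is a plain WSGI app using only the Python standard library, so it runs without any setup. The /api/items path and the in-memory ITEMS list are placeholders; a real backend would use Django views and PostgreSQL instead.

```python
import json

# Placeholder data; in the real setup this would come from the database.
ITEMS = [{"id": 1, "name": "example"}]


def api(environ, start_response):
    """Answer GET /api/items with JSON; everything else gets a JSON 404."""
    if environ["REQUEST_METHOD"] == "GET" and environ["PATH_INFO"] == "/api/items":
        status, payload = "200 OK", {"items": ITEMS}
    else:
        status, payload = "404 Not Found", {"error": "not found"}
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

For local experiments, the standard library can serve this with wsgiref.simple_server.make_server("127.0.0.1", 8000, api).serve_forever(). Django exposes the same WSGI interface, so swapping in a real Django project later should not change how nginx or the frontend talks to the backend.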

Collaboration

Again, lots of options here (and you can be happy with any of them):

Communication

That’s it for now, but I intend to update and extend this list from time to time. So if you think there’s something missing, let me know.