Engineering · 5 min read

Choosing the right cloud provider for a startup

AWS, GCP, or somewhere else? The honest answer depends less on the providers and more on three questions about your team and your product.

The most common question we get from founders building their first serious infrastructure is some version of: AWS, GCP, or somewhere else? Usually they want a recommendation. Usually they're hoping the answer is "GCP," because they've been told it's nicer to work with.

The honest answer is that for most startups, the choice between the major cloud providers matters less than three questions about your team and your product. Get those questions right and any of the majors will work. Get them wrong and the prettiest console in the world won't save you.

Question one: what does your team know?

The single biggest predictor of cloud success isn't the provider's feature set. It's whether your engineers have shipped serious infrastructure on that provider before.

If your two senior engineers have collectively run twelve production systems on AWS, you're going to AWS. The opportunity cost of relearning IAM, networking, and the dozen weird edge cases of a new provider is enormous, and the upside of "GCP is nicer" doesn't pay for it. The same logic applies in reverse: if your team has GCP experience, go GCP.

This sounds obvious. It isn't followed half as often as it should be. Founders read a thoughtful post about Google Cloud Run and decide their team should "give it a try" even though every engineer they have has spent years on ECS. Six months later, they've sunk roughly half a quarter's worth of engineering time into learning a new platform that doesn't materially change their product.

Question two: what's your data gravity?

Once you have customers, your data has gravity. Moving 50TB of customer data from one cloud to another is a project, not a Saturday afternoon.

If your product processes data that originates in a specific cloud (say, your customers send you data from their Snowflake warehouses), there's gravitational pull toward the cloud and region where those warehouses are hosted, because moving data across clouds adds egress cost and latency. If you're integrating heavily with services that have a clear home (BigQuery, S3, Cosmos DB), don't fight that gravity for marginal reasons.

For early-stage products, the gravity is usually small. Customer data is small. Logs are small. The cost to switch is mostly engineering attention, not data movement. As you scale, the calculus inverts. The decision you make at 10 customers is much easier to revisit than the same decision at 1,000.
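To make the gravity argument concrete, here is a rough back-of-envelope sketch. The sustained throughput (1 Gbit/s) and the egress price ($0.09 per GB) are placeholder assumptions, not quotes from any provider; swap in your own measured bandwidth and your provider's published pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferEstimate:
    hours: float        # wall-clock time at the sustained rate
    egress_usd: float   # egress charge at the assumed per-GB price


def estimate_transfer(size_tb: float, gbit_per_s: float, usd_per_gb: float) -> TransferEstimate:
    """Back-of-envelope cost of moving data out of a cloud.

    Uses decimal units (1 TB = 1000 GB). All three inputs are assumptions
    to replace with your own numbers.
    """
    size_gb = size_tb * 1000
    seconds = (size_gb * 8) / gbit_per_s  # gigabytes -> gigabits, then divide by rate
    return TransferEstimate(hours=seconds / 3600, egress_usd=size_gb * usd_per_gb)


# Hypothetical inputs: 1 Gbit/s sustained, $0.09 per GB egress.
early = estimate_transfer(size_tb=0.05, gbit_per_s=1.0, usd_per_gb=0.09)
later = estimate_transfer(size_tb=50, gbit_per_s=1.0, usd_per_gb=0.09)

print(f"50 GB: {early.hours:.2f} h, ${early.egress_usd:,.2f}")
print(f"50 TB: {later.hours:.1f} h, ${later.egress_usd:,.2f}")
```

Under these placeholder inputs, 50 GB moves in a few minutes for a few dollars, while 50 TB needs more than four days of sustained transfer and thousands of dollars in egress. That is before any of the engineering work of cutting over, which is usually the larger cost.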

Question three: how good is your team at saying no?

AWS and GCP each list well over a hundred services. You will use about eight of them. The discipline of saying "no, we don't need that" to most of those services is what separates a clean infrastructure from one that takes six months to onboard a new engineer.

Strong teams pick a small core: compute, storage, a managed database, a queue, a load balancer. They use those services well, instrument them well, and add new services only when the case is clear. Weak teams accumulate services like decorative pillows. Every service is a context the next engineer has to learn.
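One lightweight way to enforce that discipline is a check that flags any service in use that is neither part of the agreed core nor backed by a written reason. The sketch below is illustrative only: the category labels, the `APPROVED_CORE` set, and the `unapproved_services` helper are hypothetical names, and where the inventory comes from (a Terraform state file, a billing export) is left as an assumption.

```python
# Hypothetical approved core, mirroring the five categories named above.
APPROVED_CORE = {"compute", "storage", "managed-database", "queue", "load-balancer"}


def unapproved_services(in_use: set[str], justified: dict[str, str]) -> list[str]:
    """Return services that are neither in the core nor backed by a written reason."""
    extras = in_use - APPROVED_CORE
    # A blank or whitespace-only justification does not count as a reason.
    return sorted(s for s in extras if not justified.get(s, "").strip())


# Example inventory; the names are illustrative labels, not provider product names.
in_use = {"compute", "storage", "managed-database", "queue", "graph-database", "ml-notebooks"}
justified = {"graph-database": "Recommendation feature needs multi-hop traversals."}

print(unapproved_services(in_use, justified))
```

Here `graph-database` passes because someone wrote down why it exists, and `ml-notebooks` is flagged. Run in CI, a check like this turns "say no by default" from a habit into a gate.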

Both clouds will let you build a fine system. Both clouds will let you build a sprawling, unsupportable mess. The provider isn't what determines which one you build.

The actual recommendation

For most startups we work with, the honest recommendation is:

  1. Pick the cloud your senior engineers have shipped production on before. If that's AWS, you're on AWS.
  2. Use the boring services. EC2 or Compute Engine, RDS or Cloud SQL, S3 or GCS, a managed Kubernetes if you really need it, the basic load balancer. Stop there until you have a documented reason to add more.
  3. Write your infrastructure as code from day one. Terraform or Pulumi. Not because you'll use it for blue-green deploys; because you'll use it to recreate your staging environment when the intern accidentally deletes it.
  4. Spend the saved budget on observability. OpenTelemetry traces, structured logs, a serious metrics stack. The first time something breaks at 2am, you will be grateful.

That's the recommendation. The provider matters less than the discipline.
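As a small illustration of the "structured logs" part of item 4, here is a minimal sketch using only Python's standard `logging` module. The `JsonFormatter` class, the `context` field convention, and the `checkout` logger name are assumptions made for this example; traces and metrics would need additional tooling that is not shown here.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via extra={"context": {...}} are merged into the output.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload)


def build_logger(name: str) -> logging.Logger:
    """Create a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]   # replace any handlers from earlier setup
    logger.setLevel(logging.INFO)
    logger.propagate = False      # avoid duplicate output via the root logger
    return logger


log = build_logger("checkout")
log.info("payment failed", extra={"context": {"order_id": "ord_123", "attempt": 2}})
```

Because every line is a JSON object with consistent keys, the 2am search becomes a filter on `order_id` rather than a regex over free text.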

When to seriously consider Cloudflare, Fly, or Vercel

There's another option worth considering for a specific shape of product: full-stack web applications where the bulk of your compute is HTTP request handling, the database can fit in a managed Postgres, and you don't need long-running background jobs.

For that shape — which is most B2B SaaS in its first two years — Cloudflare Workers + D1, Fly.io, or Vercel + a managed Postgres provider can ship you faster than AWS. The trade-off is that you'll outgrow them at some point, and the migration to a major cloud is real work. If you're confident about your product shape and you don't have AWS expertise on the team, this can be the right call.

The catch: the day you need to run a 2-hour batch job, ingest a million events a second, or run a workload that demands precise networking control, you're back to needing a major cloud. Plan for that day.

The point you came here for

There's no provider that will save a team that can't say no to complexity. There's no provider that will sink a team that picks five services and uses them well.

Pick the cloud your engineers know. Pick the smallest set of services you can ship on. Write it down in code. Move on to the actual product.

Author

Hannah Müller

Principal Engineer
