The 10-Minute Infrastructure Scaling Health Check

Use this checklist to quickly assess whether your infrastructure is ready to scale from £1M to £10M ARR.

Each “No” answer is a scaling bottleneck that will bite you in the next 6 months.

1. Database Performance ✓

□ Can your database handle 10x current traffic?

Quick test:

Check current read/write IOPS against database limits
If you’re above 60% capacity → you need to scale soon
Review slow query log – anything taking >100ms under load?
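The 60% rule above is simple arithmetic. A minimal sketch follows, assuming you can read current and maximum IOPS from your database provider's metrics; the function names and sample figures are illustrative, not part of any particular tool.

```python
def capacity_utilization(current_iops: float, max_iops: float) -> float:
    """Return current IOPS as a fraction of the provisioned limit."""
    if max_iops <= 0:
        raise ValueError("max_iops must be positive")
    return current_iops / max_iops


def needs_scaling(current_iops: float, max_iops: float,
                  threshold: float = 0.60) -> bool:
    """True when utilization is above the checklist's 60% threshold."""
    return capacity_utilization(current_iops, max_iops) > threshold
```

For example, needs_scaling(2100, 3000) is True because 2,100 of 3,000 IOPS is 70% utilization.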

Red flags:

No read replicas configured
No connection pooling (seeing “too many connections” errors)
Still running on the default database instance you provisioned 2 years ago
No query performance monitoring

□ Are critical queries indexed properly?

Run this on PostgreSQL to list your 10 largest tables:

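-- Sizes include indexes and TOAST data.
-- quote_ident() keeps mixed-case or unusual table names from breaking the lookup.
SELECT schemaname, tablename,
       pg_size_pretty(pg_total_relation_size(quote_ident(schemaname)||'.'||quote_ident(tablename))) AS size
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(quote_ident(schemaname)||'.'||quote_ident(tablename)) DESC
LIMIT 10;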

The query only tells you which tables are largest. For each one, check that the columns you filter, join, and sort on most often are indexed.
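One way to narrow the search is to compare sequential scans with index scans per table. The sketch below assumes you have already exported relname, seq_scan and idx_scan from PostgreSQL's pg_stat_user_tables view; the TableStats type, the function name, and both thresholds are assumptions chosen for illustration, so tune them to your workload.

```python
from typing import Iterable, List, NamedTuple, Optional


class TableStats(NamedTuple):
    relname: str
    seq_scan: int
    idx_scan: Optional[int]  # NULL in PostgreSQL when the table has no indexes


def likely_missing_indexes(stats: Iterable[TableStats],
                           min_scans: int = 1000,
                           max_index_ratio: float = 0.5) -> List[str]:
    """Flag busy tables where fewer than half of all scans use an index."""
    flagged = []
    for table in stats:
        idx_scans = table.idx_scan or 0
        total = table.seq_scan + idx_scans
        if total < min_scans:
            continue  # too little traffic to draw a conclusion
        if idx_scans / total < max_index_ratio:
            flagged.append(table.relname)
    return flagged
```

A flagged table is a prompt to inspect its queries with EXPLAIN, not proof that an index is missing; small tables are often scanned sequentially on purpose.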

□ Do you have automated backups with tested restores?

Not just “backups enabled” – when did you last restore from backup to verify it works?
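To keep restore tests from quietly lapsing, you could record the date of the last successful restore and check it on a schedule. This is a sketch only: the 90-day limit and the function name are assumptions, not a standard.

```python
from datetime import date
from typing import Optional


def restore_test_overdue(last_restore_test: Optional[date],
                         today: date,
                         max_age_days: int = 90) -> bool:
    """True when no restore test is on record or the last one is too old."""
    if last_restore_test is None:
        return True
    return (today - last_restore_test).days > max_age_days
```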

2. Application Architecture ✓

□ Can you scale horizontally by adding more servers?

Warning signs you can’t:

In-memory session storage (not Redis/database)
File uploads stored on local disk (not S3/object storage)
Background jobs running on same server as web app
Hard-coded server IPs anywhere in config
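Hard-coded IPs are the easiest of these warning signs to check automatically. A rough sketch, assuming your config is plain text; the allow-list and function name are illustrative, and four-part version strings such as 1.2.3.4 will show up as false positives.

```python
import re
from typing import List, Tuple

# Matches dotted-quad IPv4 literals with each octet in the range 0-255.
IPV4 = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)
# Addresses that are normally fine to hard-code (an assumption; adjust as needed).
ALLOWED = {"127.0.0.1", "0.0.0.0"}


def find_hardcoded_ips(config_text: str) -> List[Tuple[int, str]]:
    """Return (line number, address) for each IPv4 literal outside the allow-list."""
    hits = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        for match in IPV4.finditer(line):
            if match.group() not in ALLOWED:
                hits.append((lineno, match.group()))
    return hits
```

Each hit is a candidate for replacement with a DNS name or service-discovery lookup.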

□ Are static assets served via CDN?

Check: View source on your app. Are images/CSS/JS served from your domain or a CDN?

If from your domain → your servers are wasting resources serving static files.

□ Is there a queue for background jobs?

Red flags:

Email sending blocks HTTP requests
Report generation happens in web requests
No job queue system (Sidekiq, Celery, BullMQ, SQS)
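The pattern behind every job queue is the same: the web request enqueues the work and returns, and a separate worker does the slow part. The sketch below illustrates that with Python's standard-library queue and a thread. It is an in-process stand-in only; in production you would use one of the systems named above, and send_email, handle_signup and start_worker are hypothetical names.

```python
import queue
import threading

jobs: queue.Queue = queue.Queue()
sent = []  # records delivered emails so the example is observable


def send_email(recipient: str) -> None:
    """Stand-in for a slow SMTP or API call."""
    sent.append(recipient)


def worker() -> None:
    """Pull jobs off the queue until a None sentinel arrives."""
    while True:
        job = jobs.get()
        try:
            if job is None:
                break
            func, args = job
            func(*args)
        finally:
            jobs.task_done()


def handle_signup(email: str) -> str:
    """Web handler: enqueue the email and return without waiting for it."""
    jobs.put((send_email, (email,)))
    return "202 Accepted"


def start_worker() -> threading.Thread:
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The handler returns as soon as the job is queued, so a slow mail provider no longer ties up an HTTP request.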

3. Deployment Pipeline ✓

□ Can you deploy without downtime?

Test: Run a deployment during business hours. Do users see errors or connection drops?

Blue-green or rolling deployments are essential at scale.

□ Can you roll back a bad deployment in under 5 minutes?

Do you have:

Automated rollback command/button?
Process to identify and revert bad deploys quickly?
Documented rollback playbook?

□ Does deployment take under 15 minutes?

If deploys take 30+ minutes, engineers batch changes instead of shipping continuously.

This slows innovation and increases risk per deployment.

4. Monitoring & Observability ✓

□ Do you have alerts for critical failures?

Minimum required alerts:

Server/container health checks failing
Database connection pool exhausted
Error rate above threshold (5xx responses)
Response time P95 above SLA
SSL certificate expiring soon
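Two of these alerts, error rate and P95 response time, reduce to small calculations over recent requests. A minimal sketch, assuming you have a window of latencies and status codes to hand; the 500 ms SLA and 1% error threshold are placeholder values, and your monitoring tool will normally do this for you.

```python
from typing import List, Sequence


def p95(latencies_ms: Sequence[float]) -> float:
    """95th percentile using the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def error_rate(status_codes: Sequence[int]) -> float:
    """Fraction of responses with a 5xx status."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return errors / len(status_codes)


def firing_alerts(latencies_ms: Sequence[float],
                  status_codes: Sequence[int],
                  p95_sla_ms: float = 500.0,
                  max_error_rate: float = 0.01) -> List[str]:
    """Return the names of the alerts that should fire for this window."""
    alerts = []
    if error_rate(status_codes) > max_error_rate:
        alerts.append("error_rate")
    if p95(latencies_ms) > p95_sla_ms:
        alerts.append("p95_latency")
    return alerts
```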

□ Can you trace a slow request from user → database?

Do you have:

Application Performance Monitoring (APM)?
Distributed tracing across microservices?
Ability to see full request lifecycle?

□ Are you monitoring business metrics, not just technical metrics?

Track:

Sign-ups per hour
Failed payment attempts
Feature usage rates
Customer-facing transaction success rates

If a deployment breaks sign-ups but servers are “healthy,” technical monitoring alone won’t catch it.
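A business-metric alert can be as simple as comparing the current hour with a recent baseline. The sketch below assumes hourly sign-up counts are available; the 50% drop threshold and the function name are illustrative choices, not recommendations.

```python
from typing import Sequence


def signup_drop_alert(current_hour: int,
                      baseline_hours: Sequence[int],
                      max_drop: float = 0.5) -> bool:
    """True when sign-ups fall more than max_drop below the baseline average."""
    if not baseline_hours:
        return False  # nothing to compare against yet
    baseline = sum(baseline_hours) / len(baseline_hours)
    if baseline == 0:
        return False
    return current_hour < baseline * (1 - max_drop)
```

With a baseline of 20 sign-ups per hour, an hour with only 3 would fire the alert even if every server reports healthy.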

5. Security & Compliance ✓

□ Are secrets managed properly (not in code)?

Check:

No API keys or passwords committed to your Git repositories (GitHub or elsewhere)
Using secret management (AWS Secrets Manager, Vault, etc.)
Environment variables properly injected at runtime
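On the application side, runtime injection usually means reading secrets from the environment and failing fast when one is missing. A small sketch under that assumption; require_secret, MissingSecretError and the PAYMENTS_API_KEY name in the usage note are hypothetical.

```python
import os
from typing import Mapping, Optional


class MissingSecretError(RuntimeError):
    """Raised at startup when a required secret was not injected."""


def require_secret(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Read a secret from the environment, refusing to start without it."""
    source = os.environ if env is None else env
    value = source.get(name)
    if not value:
        raise MissingSecretError(f"required secret {name!r} is not set")
    return value
```

Calling require_secret("PAYMENTS_API_KEY") at startup surfaces a missing secret immediately instead of at the first payment attempt.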

□ Is production access restricted and audited?

Required:

Production SSH/database access requires MFA
Audit log of who accessed what and when
Principle of least privilege (developers don’t have production DB passwords)

□ Are dependencies kept up to date?

Run: npm audit or pip-audit

If you see critical vulnerabilities → attackers can too.

6. Infrastructure as Code ✓

□ Can you recreate your entire infrastructure from code?

Test: If AWS eu-west-1 went down tomorrow, how long would it take to rebuild in us-east-1?

If the answer is “no idea” or “weeks” → you need IaC.

□ Is your infrastructure version controlled?

Check:

All infrastructure defined in Terraform/CloudFormation/Pulumi?
Changes go through Git and code review?
Can roll back infrastructure changes like code?

□ Do you have separate environments (dev/staging/prod)?

Red flags:

Testing in production because staging doesn’t exist
Staging shares database with production
Can’t test infrastructure changes safely

7. Cost Management ✓

□ Do you know where your cloud spend goes?

Can you answer these in 60 seconds:

What’s your biggest AWS cost category? (Compute, storage, data transfer?)
Which service/product line costs most to run?
What’s your cost per customer/transaction?

□ Have you optimized cloud costs in the last 6 months?

Quick wins often available:

Reserved instances / Savings Plans for predictable workloads
Rightsizing over-provisioned resources
Deleting orphaned volumes/snapshots
Shutting down non-production environments overnight

□ Do you have cost alerts configured?

Set up:

Alert when monthly spend exceeds budget by 20%
Alert on unusual spend spikes
Per-service budgets for high-cost items
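The first two alerts come down to two comparisons. A sketch, assuming you can pull spend figures from your cloud provider's billing export; the function names and the 2x spike factor are assumptions, while the 20% overrun matches the rule above.

```python
from typing import Sequence


def budget_alert(monthly_spend: float, monthly_budget: float,
                 overrun: float = 0.20) -> bool:
    """True when spend exceeds the budget by more than the allowed overrun."""
    return monthly_spend > monthly_budget * (1 + overrun)


def spend_spike(today: float, recent_days: Sequence[float],
                factor: float = 2.0) -> bool:
    """True when today's spend is more than `factor` times the recent daily average."""
    if not recent_days:
        return False
    average = sum(recent_days) / len(recent_days)
    return today > average * factor
```

With a £10,000 budget, the alert fires once spend passes £12,000.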

Scoring Your Infrastructure

Count one checkmark for every “Yes” answer (21 possible).

18-21 checkmarks: You’re in great shape. Minor optimizations only.

14-17 checkmarks: Some gaps. Prioritize the missing ones before scaling aggressively.

10-13 checkmarks: Significant scaling risks. You’ll hit major issues in the next 6 months.

Below 10: Critical infrastructure debt. Scaling will be painful and expensive.
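If you track the score over time, the bands above map directly to a small lookup. This sketch assumes 21 checkboxes in total (three per section); the function name and the short band labels are illustrative.

```python
def score_band(checkmarks: int) -> str:
    """Map a checkmark count (0-21) to the scoring band it falls in."""
    if not 0 <= checkmarks <= 21:
        raise ValueError("checkmarks must be between 0 and 21")
    if checkmarks >= 18:
        return "great shape"
    if checkmarks >= 14:
        return "some gaps"
    if checkmarks >= 10:
        return "significant scaling risks"
    return "critical infrastructure debt"
```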

What To Do If You Scored Low

Don’t panic. Most teams at £1-3M ARR have 8-12 checkmarks.

The key is fixing this before you hit scaling problems, not after.

Our Infrastructure Audit (£3,000, 2 weeks)

We complete this checklist in detail, then deliver:

Scored assessment of each area
Prioritized roadmap of fixes (critical → nice-to-have)
Specific implementation recommendations
Cost-benefit analysis for each improvement
We implement the top 5 quick wins for you

Platform Acceleration Programme (£23,000, 10 weeks)

For teams ready to fix everything:

Weeks 1-2: Complete infrastructure audit
Weeks 3-8: Implement all critical improvements
Weeks 9-10: Team training and documentation

Guaranteed outcomes:

30-40% cost reduction
50%+ faster deployments
Production-ready security posture
Infrastructure ready for 10x growth

Book a Free Discovery Call

We’ll go through this checklist together and identify your biggest risks.