Day 3: Architecture sketch on a napkin

Day 3 of 30. Today we're drawing boxes and arrows before writing our code.

#api-design #planning

As software developers, it’s tempting to just start coding. But we’ve been burned before by diving in without a clear picture of how the pieces connect. A few hours of sketching and brainstorming now could save us days (or more!) of rework later, so that’s exactly what we’ll do in this session.

Just a small note: this isn’t a “formal” architecture document. It’s a napkin sketch that we’ll use for discussions and some of our decision-making, with the full understanding that our end solution will most likely deviate from it.

Two API modes: sync and async

Before we’ve even started, we’ve already introduced some feature creep: we’re offering two ways to capture screenshots. We thought long and hard about whether we could get away with a single mode for simplicity’s sake, but given that capturing screenshots is our core focus, we’ve decided we need to offer both.

API mode comparison

[✓] Synchronous API (v1 focus) - CHOSEN

Pros:

  • Simple: request comes in, screenshot returns directly
  • Covers 80% of use cases
  • Faster time to market
  • Easier to explain to customers

Cons:

  • Client waits for full capture (2-5 seconds)
  • Timeouts for very slow sites

Verdict: If your screenshot takes 3 seconds, waiting 3 seconds is fine. You don't need async complexity.

[ ] Asynchronous API (v2)

Pros:

  • Better for high-volume batch processing
  • Handles very slow sites (30+ seconds)
  • Supports webhooks for notifications
  • Returns immediately (50ms)

Cons:

  • Requires job queue and polling logic
  • More complex client integration
  • Webhook infrastructure needed

Verdict: We'll add this when customers ask for batch processing.

The sync happy path

Here’s what happens when someone requests a screenshot synchronously:

Simple flow:

  1. Client sends URL
  2. We validate (API key, quota, URL format)
  3. We capture the screenshot
  4. We upload to storage
  5. We return the image URL

Total time: 2-5 seconds depending on the target site.
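
The five steps above can be sketched as one function. This is a rough illustration in Python, not the actual Spring Boot code: the in-memory `ACCOUNTS` and `STORAGE` dictionaries stand in for Postgres and R2, `capture` is a placeholder for the browser call, and all names, error codes, and the `storage.example` host are assumptions made up for the example.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


class ValidationError(Exception):
    """Raised when a request fails step 2 (validation)."""


@dataclass
class Account:
    api_key: str
    quota_left: int


# Hypothetical in-memory stand-ins for Postgres and R2.
ACCOUNTS = {"key_demo": Account(api_key="key_demo", quota_left=2)}
STORAGE = {}


def validate(api_key: str, url: str) -> Account:
    """Step 2: check the API key, the remaining quota, and the URL format."""
    account = ACCOUNTS.get(api_key)
    if account is None:
        raise ValidationError("invalid_api_key")
    if account.quota_left <= 0:
        raise ValidationError("quota_exceeded")
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValidationError("invalid_url")
    return account


def capture(url: str) -> bytes:
    """Step 3: placeholder; the real service would drive a pooled browser."""
    return b"fake-png-bytes-for-" + url.encode()


def upload(screenshot_id: str, image: bytes) -> str:
    """Step 4: store the image and return where it can be fetched."""
    STORAGE[screenshot_id] = image
    return f"https://storage.example/{screenshot_id}.png"


def handle_sync_request(api_key: str, url: str) -> dict:
    """Steps 2-5, run in order within a single request."""
    account = validate(api_key, url)
    image = capture(url)
    screenshot_id = f"scr_{len(STORAGE) + 1}"
    image_url = upload(screenshot_id, image)
    account.quota_left -= 1  # only charge quota once the capture succeeded
    return {"id": screenshot_id, "url": url, "image_url": image_url}
```

Because every step runs inline, any failure surfaces as an error on the same request, which is a large part of why the sync mode is easy to explain.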

The components

Here’s what we’re actually building:

System architecture
+------------------------------------------------------------------+
|                           VPS (Hetzner)                          |
|  +------------------------------------------------------------+  |
|  |                    Spring Boot API                         |  |
|  |  +----------------+  +----------------+  +---------------+ |  |
|  |  |   REST API     |->|  Auth & Quota  |->|  Screenshot   | |  |
|  |  |   Endpoints    |  |   Middleware   |  |   Service     | |  |
|  |  +----------------+  +----------------+  +---------------+ |  |
|  +------------------------------------------------------------+  |
|           |                                      |               |
|  +--------v--------+                   +---------v---------+     |
|  |    Postgres     |                   |   Browser Pool    |     |
|  |   (Docker)      |                   |   (Playwright)    |     |
|  +-----------------+                   +-------------------+     |
+------------------------------------------------------------------+
           |
   +-------v-------+
   | Cloudflare R2 |
   | (Screenshots) |
   +---------------+

    
Spring Boot API - The main application, handling REST endpoints, auth, and orchestration. For sync mode, everything happens in the request thread or in coroutines.

Browser Pool - Reusable Playwright browser instances. Creating a browser is slow (~500ms), so we keep a few warm.

Postgres - Users, API keys, usage tracking. Runs in Docker on the same VPS. We most likely won’t need it for job queuing in sync mode.

R2 Storage - Screenshots are uploaded here. Depending on the configuration, we return signed URLs that expire after a period of time.

React Frontend - Landing page and dashboard. Static files served by the API.
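
To make the “keep a few warm” idea behind the browser pool concrete, here is a minimal sketch in Python. It is not the real pool: `FakeBrowser` is a hypothetical stand-in for a Playwright browser handle, and the class and method names are invented for illustration. The point is only that the launch cost is paid once at startup and each request borrows an existing instance.

```python
import queue
from contextlib import contextmanager


class FakeBrowser:
    """Hypothetical stand-in for a browser handle; launching is the slow part."""

    launched = 0

    def __init__(self):
        FakeBrowser.launched += 1
        self.id = FakeBrowser.launched


class BrowserPool:
    def __init__(self, size, factory=FakeBrowser):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())  # pay the launch cost once, up front

    @contextmanager
    def borrow(self, timeout=5.0):
        # Blocks while every browser is busy; raises queue.Empty on timeout.
        browser = self._idle.get(timeout=timeout)
        try:
            yield browser
        finally:
            self._idle.put(browser)  # always hand the browser back
```

Usage would look like `with pool.borrow() as browser: ...`. The pool size then also acts as a cap on how many captures run at once.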

Sync API design

Create screenshot (sync)

The current design is just a proposal. There’s a good chance we’ll make this a GET endpoint instead of POST, so it’s easier to call the API from an image tag.
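
If the endpoint does become a GET, the request fields would move into the query string. A minimal sketch of what building such a URL could look like, assuming the same field names as the POST body; the `api.example.com` host is a placeholder, and how authentication would work from an image tag is left open here.

```python
from urllib.parse import urlencode

# Hypothetical host; only the /api/v1/screenshots path comes from the proposal.
API_BASE = "https://api.example.com/api/v1/screenshots"


def screenshot_get_url(url, device="desktop", full_page=False, fmt="png"):
    """Build a GET URL that could be dropped into an <img src="..."> attribute."""
    params = {
        "url": url,
        "device": device,
        "full_page": str(full_page).lower(),  # booleans as "true"/"false"
        "format": fmt,
    }
    return f"{API_BASE}?{urlencode(params)}"
```

The target URL has to be percent-encoded, which `urlencode` takes care of.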

POST /api/v1/screenshots

Request:

{
  "url": "https://example.com",
  "device": "desktop",
  "full_page": false,
  "format": "png"
}

Response (200 OK):

{
  "id": "scr_abc123def456",
  "url": "https://example.com",
  "image_url": "https://storage.../scr_abc123.png?signature=...",
  "created_at": "2024-01-15T10:30:00Z",
  "metadata": {
    "width": 1920,
    "height": 1080,
    "file_size": 245678,
    "format": "png",
    "capture_time_ms": 2340
  }
}

Timeout handling

Sync requests have a configurable timeout. If the page doesn’t load within that time, we return an error:

{
  "error": {
    "code": "timeout",
    "message": "Page did not load within 30 seconds"
  }
}
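
One way to enforce such a timeout is to run the capture in a worker and stop waiting after the limit. A minimal sketch in Python, assuming a hypothetical `capture` callable; the real service would more likely lean on the browser’s own navigation timeout, so treat this as an illustration of the error shape rather than the implementation.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

_executor = ThreadPoolExecutor(max_workers=4)


def capture_with_timeout(capture, url, timeout_seconds):
    """Run capture(url) and map a timeout to the error body shown above."""
    future = _executor.submit(capture, url)
    try:
        image = future.result(timeout=timeout_seconds)
    except FutureTimeout:
        # Best effort only: a capture that is already running keeps going.
        future.cancel()
        return {
            "error": {
                "code": "timeout",
                "message": f"Page did not load within {timeout_seconds:g} seconds",
            }
        }
    return {"image_bytes": len(image)}
```

Note that giving up on the wait does not free the worker, so a real implementation also needs to abort the page load itself.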

For sites that regularly take longer, or when multiple pages need to be captured, async mode will be the answer.

What we’re deliberately not building yet

Multiple browser types - Chromium only for now. We’ll add Firefox, WebKit, and other browsers later.

Screenshot caching - We want to offer some form of configurable caching, but for now every request takes a fresh screenshot.

Horizontal scaling - For now, we’ll run on a single VPS. We can scale horizontally or vertically when we need to.

What we did today

Mostly thinking and sketching:

  • Drew the architecture diagrams
  • Designed both sync and async APIs
  • Decided to build sync first
  • Identified what we’re not building yet

We haven’t pushed any code today, on purpose. Our focus is on getting the foundation right.

Tomorrow: CI/CD on day one

Tomorrow we’re getting our hands dirty and finally writing some code. We’re also setting up the deployment pipeline, so that every successful build goes straight to production.

Book of the day

Designing Data-Intensive Applications by Martin Kleppmann

This is the book we wish we’d read earlier in our careers. It fundamentally changed how we think about systems.

Kleppmann covers databases, distributed systems, batch processing, stream processing - all the building blocks of modern applications. But more importantly, he explains why things work the way they do.

The chapter on message queues vs. request/response patterns directly informed our sync-first decision.


Day 3 stats

Hours: 5h
Code: 50
Revenue: $0
Customers: 0

Achievements: [✓] Architecture documented [✓] API design (sync) [✓] API design (async)
Erik

Building Allscreenshots. Writes code, takes screenshots, goes diving.
