Day 25: Performance optimization - screenshots in under 2 seconds

Day 25 of 30. Today we make things faster.

#optimization #speed

Our customer is happy, but they mentioned: “Some screenshots take 5+ seconds. Any way to speed that up?”

We said we wouldn’t build new features and would focus on traction instead, but improving an existing feature sits in a bit of a gray zone, right? Let’s see if we can optimize things a bit.

Performance optimization results

The before and after results of our optimizations:

Phase              Before    After    Change
Context acquire    180ms     5ms      97% faster
Page create        45ms      45ms     unchanged
Navigation         1.2s      1.1s     8% faster
Wait               2.1s      650ms    69% faster
Screenshot         340ms     340ms    unchanged
Upload             280ms     120ms    57% faster

Where does time go?

We instrumented everything:

fun captureWithTiming(request: ScreenshotRequest): TimedResult {
    val timings = mutableMapOf<String, Long>()

    // Runs one phase, records its duration in milliseconds, and returns its result
    fun <T> timed(phase: String, block: () -> T): T {
        val start = System.nanoTime()
        return block().also { timings[phase] = (System.nanoTime() - start) / 1_000_000 }
    }

    val context = timed("context_create") { browser.newContext(options) }
    val page = timed("page_create") { context.newPage() }
    timed("navigation") { page.navigate(request.url) }
    timed("wait") { page.waitForLoadState(LoadState.NETWORKIDLE) }
    val bytes = timed("screenshot") { page.screenshot() }
    // The uploaded URL is not part of TimedResult, so only the duration is kept
    timed("upload") { storage.upload(bytes) }

    return TimedResult(bytes, timings)
}

Results for a typical site (example.com):

Phase                    Time
Context create           180ms
Page create              45ms
Navigation               1,200ms
Wait for network idle    2,100ms
Screenshot capture       340ms
Upload to R2             280ms
Total                    4,145ms

As the table shows, the wait phase is what slows us down the most. Network idle only fires after 500ms without any network activity. On sites with analytics, chat widgets, and lazy loading, that quiet period can take a long time to arrive.
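To make the "where does time go" question easy to answer for any capture, the timings map can be turned into a ranked list of shares. This is a minimal sketch, not our production code: `PhaseShare` and `phaseShares` are hypothetical names, and the numbers are simply the ones from the table above.

```kotlin
// Hypothetical helper types and names, used for illustration only.
data class PhaseShare(val phase: String, val millis: Long, val percentOfTotal: Double)

// Ranks phases from slowest to fastest and computes each phase's share of the total.
fun phaseShares(timings: Map<String, Long>): List<PhaseShare> {
    val total = timings.values.sum()
    if (total == 0L) return emptyList()
    return timings
        .map { (phase, millis) -> PhaseShare(phase, millis, millis * 100.0 / total) }
        .sortedByDescending { it.millis }
}

// Timings copied from the table above (milliseconds).
val baseline = mapOf(
    "context_create" to 180L,
    "page_create" to 45L,
    "navigation" to 1200L,
    "wait" to 2100L,
    "screenshot" to 340L,
    "upload" to 280L
)
```

Calling `phaseShares(baseline).first()` returns the wait phase, which accounts for roughly half of the total. That is why we started there.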

Optimization 1: Smarter wait strategies

Instead of always waiting for network idle, we now offer more wait options:

enum class WaitStrategy {
    LOAD,             // load event: page and its resources have loaded (fast)
    DOMCONTENTLOADED, // DOMContentLoaded event: HTML parsed, fires before load (faster)
    NETWORKIDLE,      // No network activity for 500ms (slow)
    NONE              // Don't wait at all (fastest)
}

For most sites, LOAD + small delay is enough:

when (request.waitStrategy) {
    WaitStrategy.LOAD -> {
        page.waitForLoadState(LoadState.LOAD)
        page.waitForTimeout(500.0)  // Small buffer for JS, will be configurable
    }
    WaitStrategy.DOMCONTENTLOADED -> {
        page.waitForLoadState(LoadState.DOMCONTENTLOADED)
    }
    WaitStrategy.NETWORKIDLE -> {
        page.waitForLoadState(LoadState.NETWORKIDLE)
    }
    WaitStrategy.NONE -> {
        // No extra wait beyond the navigation itself
    }
}

Same site with LOAD strategy:

Phase    Time
Wait     650ms (was 2,100ms)
Total    2,695ms (was 4,145ms)

35% faster by changing one parameter.

Optimization 2: Block unnecessary resources

Analytics, ads, and tracking pixels usually don’t affect how a page looks, or at least they shouldn’t. So we block some of the most common ones, and later we will offer a way to be more selective about what gets blocked:

val blockedDomains = listOf(
    "google-analytics.com",
    "googletagmanager.com",
    "facebook.net",
    "doubleclick.net",
    "hotjar.com",
    "intercom.io",
    "segment.com",
    "mixpanel.com",
    "amplitude.com"
)

page.route("**/*") { route ->
    val url = route.request().url()
    if (blockedDomains.any { url.contains(it) }) {
        route.abort()
    } else {
        route.resume()
    }
}

This saves 200-500ms on sites heavy with trackers, which frankly seems to be most sites these days.
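One thing to watch with the substring check above: `url.contains(...)` also matches when a blocked domain merely appears in a path or query string, or as the tail of a longer domain name. A stricter variant compares the request's host instead. The sketch below is an assumption about how that could look, not what we shipped; `isBlockedHost` and the shortened `blockedHosts` set are hypothetical names.

```kotlin
import java.net.URI

// Shortened list for the example; the full list would be the one shown above.
val blockedHosts = setOf("google-analytics.com", "segment.com", "doubleclick.net")

// Returns true when the URL's host is a blocked domain or one of its subdomains.
fun isBlockedHost(url: String, blocked: Set<String> = blockedHosts): Boolean {
    // Malformed URLs (or URLs without a host) are treated as not blocked.
    val host = runCatching { URI(url).host }.getOrNull()?.lowercase() ?: return false
    return blocked.any { domain -> host == domain || host.endsWith(".$domain") }
}
```

Inside the route handler, the condition would then become `isBlockedHost(route.request().url())`. The trade-off is a URL parse per request, which should be small compared to the network time saved.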

Optimization 3: Browser reuse

We were creating a new browser context for each request. Creating a context takes 150-200ms, and most of that can be avoided.

So, instead of creating a fresh context per request, we maintain a pool of warm contexts, which matters most for our sync API:

@Service
class BrowserPool(
    private val poolSize: Int = 3
) {
    private val browser = Playwright.create().chromium().launch()
    private val contexts = ConcurrentLinkedQueue<BrowserContext>()

    init {
        repeat(poolSize) {
            contexts.offer(createWarmContext())
        }
    }

    fun acquire(): BrowserContext {
        return contexts.poll() ?: createWarmContext()
    }

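    fun release(context: BrowserContext) {
        // Close leftover pages and clear cookies so state does not leak between requests.
        // Note: clearCookies() does not clear localStorage.
        context.pages().forEach { it.close() }
        context.clearCookies()
        contexts.offer(context)
    }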

    private fun createWarmContext(): BrowserContext {
        return browser.newContext(defaultOptions)
    }
}

With this change, context acquisition dropped from 180ms to ~5ms. Every bit helps, and percentage-wise it’s a very significant improvement.
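One detail the pool above leaves open is its upper bound: every released context goes back into the queue, so a burst of requests can leave more idle contexts around than `poolSize`. Below is a generic sketch of a capped pool with a `use` helper that always releases. `WarmPool`, `maxIdle`, and the three callbacks are hypothetical names, and the pool is deliberately independent of Playwright so the idea stands on its own.

```kotlin
import java.util.concurrent.ConcurrentLinkedQueue
import java.util.concurrent.atomic.AtomicInteger

// Generic warm pool: keeps at most maxIdle idle items, creates extras on demand.
class WarmPool<T : Any>(
    private val maxIdle: Int,
    private val create: () -> T,
    private val reset: (T) -> Unit,
    private val destroy: (T) -> Unit
) {
    private val idle = ConcurrentLinkedQueue<T>()
    private val idleCount = AtomicInteger(0)

    init {
        repeat(maxIdle) { offerIdle(create()) }
    }

    // Takes a warm item if one is available, otherwise creates a new one.
    fun acquire(): T {
        val item = idle.poll() ?: return create()
        idleCount.decrementAndGet()
        return item
    }

    // Resets the item and returns it to the pool; destroys it if reset fails or the pool is full.
    fun release(item: T) {
        val resetOk = runCatching { reset(item) }.isSuccess
        if (!resetOk || !offerIdle(item)) destroy(item)
    }

    // Runs block with a pooled item and releases it even if block throws.
    fun <R> use(block: (T) -> R): R {
        val item = acquire()
        try {
            return block(item)
        } finally {
            release(item)
        }
    }

    private fun offerIdle(item: T): Boolean {
        // Reserve a slot first so concurrent releases cannot exceed maxIdle.
        if (idleCount.incrementAndGet() > maxIdle) {
            idleCount.decrementAndGet()
            return false
        }
        idle.offer(item)
        return true
    }
}
```

For browser contexts, `create` would wrap `browser.newContext(...)`, `reset` would do the cleanup from `release` above, and `destroy` would close the context. Whether a hard cap is worth the extra bookkeeping depends on how spiky the traffic is.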

Optimization 4: Parallel upload

We were capturing each screenshot and then uploading it, one step after the other. The upload could instead run in the background while the next request starts:

// Start upload in background
val uploadFuture = CompletableFuture.supplyAsync {
    storage.upload(screenshotBytes)
}

// Return result (upload continues async if needed)
// Actually, we need the URL... so this doesn't help much for sync API

For the sync API, this doesn’t help. But when we build async mode, we can return the job ID before the upload completes, which gives faster response times.
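Async mode doesn’t exist yet, so the following is only a sketch of the idea: hand back a job ID right away and let the upload finish in the background. All names here (`AsyncUploads`, `submit`, `resultUrl`, `awaitUrl`) are hypothetical, and the in-memory map stands in for whatever job store we end up using.

```kotlin
import java.util.UUID
import java.util.concurrent.CompletableFuture
import java.util.concurrent.ConcurrentHashMap

// Hypothetical in-memory job registry; upload is any function that stores bytes and returns a URL.
class AsyncUploads(private val upload: (ByteArray) -> String) {
    private val jobs = ConcurrentHashMap<String, CompletableFuture<String>>()

    // Starts the upload in the background and returns a job ID immediately.
    fun submit(bytes: ByteArray): String {
        val jobId = UUID.randomUUID().toString()
        jobs[jobId] = CompletableFuture.supplyAsync { upload(bytes) }
        return jobId
    }

    // Non-blocking poll: the URL if the upload finished successfully, otherwise null.
    fun resultUrl(jobId: String): String? {
        val future = jobs[jobId] ?: return null
        return if (future.isDone && !future.isCompletedExceptionally) future.getNow(null) else null
    }

    // Blocks until the upload finishes; throws if the upload failed. Returns null for unknown jobs.
    fun awaitUrl(jobId: String): String? = jobs[jobId]?.get()
}
```

A client would call `submit`, get the job ID back without waiting for the upload, and poll `resultUrl` later. A real version would also need job expiry and error reporting, which this sketch leaves out.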

Optimization 5: Image compression

We noticed that large PNGs are sometimes slow to upload, especially for full-page screenshots. To address this, we added a JPEG option with quality control:

val isJpeg = request.format == "jpeg"
val screenshotOptions = Page.ScreenshotOptions()
    .setType(if (isJpeg) ScreenshotType.JPEG else ScreenshotType.PNG)
// Quality applies only to JPEG, so it is set only in that case
if (isJpeg) screenshotOptions.setQuality(request.quality ?: 80)

JPEG at 80% quality is typically 60-70% smaller than PNG while still looking good. When a pixel-perfect screenshot isn’t the top priority, clients can now trade some quality for size, and the upload time drops roughly in proportion.

Results

After all optimizations, these are our numbers when capturing the same site:

Phase              Before     After
Context acquire    180ms      5ms
Page create        45ms       45ms
Navigation         1,200ms    1,100ms
Wait               2,100ms    650ms
Screenshot         340ms      340ms
Upload             280ms      120ms (JPEG)
Total              4,145ms    2,260ms

45% faster. We’re now under 2.5 seconds on average for our list of benchmark sites!
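For anyone who wants to check the arithmetic, the percentages follow directly from the before and after timings. The helper below is a small sketch with a hypothetical name (`percentFaster`); the numbers are the ones from the results table.

```kotlin
import kotlin.math.roundToInt

// Percentage of time saved, rounded to the nearest whole percent.
fun percentFaster(beforeMs: Long, afterMs: Long): Int {
    require(beforeMs > 0) { "beforeMs must be positive" }
    return ((beforeMs - afterMs) * 100.0 / beforeMs).roundToInt()
}

// Timings from the results table (milliseconds).
val before = mapOf(
    "context" to 180L, "page" to 45L, "navigation" to 1200L,
    "wait" to 2100L, "screenshot" to 340L, "upload" to 280L
)
val after = mapOf(
    "context" to 5L, "page" to 45L, "navigation" to 1100L,
    "wait" to 650L, "screenshot" to 340L, "upload" to 120L
)
```

`percentFaster(before.values.sum(), after.values.sum())` gives 45 for the totals, and `percentFaster(2100, 650)` gives 69 for the wait phase alone.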

Several more complex sites (SPAs, heavy JS) still take 3-4 seconds to capture, while simple sites can come in under 1.5 seconds. Some of the slowness could come from our own network location, or the target site may simply be less responsive. We will need to analyze this further in the future.

API changes

We introduced new parameters:

{
  "url": "https://example.com",
  "wait_strategy": "load",
  "format": "jpeg",
  "quality": 80,
  "block_trackers": true
}

Documented with recommendations:

  • Use wait_strategy: "load" for most sites
  • Use format: "jpeg" when file size matters
  • Use block_trackers: true for faster captures
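On the server side, these parameters need parsing and validation. The sketch below shows one way that could look. It is not our actual request handling: `CaptureParams` and `parseParams` are hypothetical names, and the defaults (`load`, `png`, quality 80 for JPEG, trackers not blocked) are assumptions made for the example.

```kotlin
enum class WaitStrategy { LOAD, DOMCONTENTLOADED, NETWORKIDLE, NONE }

// Hypothetical request model; the default values are assumptions for this sketch.
data class CaptureParams(
    val url: String,
    val waitStrategy: WaitStrategy = WaitStrategy.LOAD,
    val format: String = "png",
    val quality: Int? = null,
    val blockTrackers: Boolean = false
)

// Parses an already-decoded JSON object (as a map) and rejects inconsistent combinations.
fun parseParams(raw: Map<String, Any?>): CaptureParams {
    val url = raw["url"] as? String ?: throw IllegalArgumentException("url is required")

    val strategyName = (raw["wait_strategy"] as? String ?: "load").uppercase()
    val strategy = WaitStrategy.values().firstOrNull { it.name == strategyName }
        ?: throw IllegalArgumentException("unknown wait_strategy: $strategyName")

    val format = (raw["format"] as? String ?: "png").lowercase()
    require(format == "png" || format == "jpeg") { "format must be png or jpeg" }

    val quality = (raw["quality"] as? Number)?.toInt()
    if (quality != null) {
        require(format == "jpeg") { "quality only applies to jpeg" }
        require(quality in 0..100) { "quality must be between 0 and 100" }
    }

    return CaptureParams(
        url = url,
        waitStrategy = strategy,
        format = format,
        // Assumed default of 80 for JPEG when the client sends no quality
        quality = quality ?: if (format == "jpeg") 80 else null,
        blockTrackers = raw["block_trackers"] == true
    )
}
```

Rejecting `quality` on PNG requests up front gives the client a clear error instead of a silently ignored parameter.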

Our client’s response

We pushed the update and emailed our client:

We optimized screenshot capture: it should be 40-50% faster for most sites. There are a few new parameters available: wait_strategy, format, quality, block_trackers.

Their reply:

Just tested - you’re right, noticeably faster. The JPEG option is great for our thumbnails. Thanks!

We love this kind of communication. We have a happy customer and a faster product, so everybody wins!

Tomorrow: adding more customers

Day 26. We have one customer. Let’s try to find more! Perhaps a small promotion could help?

Book of the day

High Performance Browser Networking by Ilya Grigorik

Ilya Grigorik (from Google, now at Shopify) wrote the definitive guide to web performance. It covers everything from TCP/IP to browser rendering.

The chapter on “Optimizing for Mobile Networks” is particularly relevant - understanding why pages load slowly helps you optimize screenshot capture. Latency, bandwidth, connection reuse - all concepts that apply.

The book is available for free online at hpbn.co. Worth reading for anyone building web tools.


Day 25 stats

Hours: 73h
Code: 4,900
Revenue: $45
Customers: 1
Hosting: $5.5/mo

Achievements:
[✓] 45% performance improvement
[✓] Browser pooling added
[✓] JPEG compression option
Erik

Building Allscreenshots. Writes code, takes screenshots, goes diving.
