Performance & Latency

What to expect when using ScreenMCP for phone automation.

Overview

ScreenMCP commands travel through a multi-hop relay: your AI client asks the ScreenMCP API for a worker, then sends each command over a WebSocket to that worker, which relays it to the phone. The phone executes the action and sends the result back along the same path. Most commands (tap, type, back, home) complete in under 500ms. Screenshots take longer because of image capture, compression, and transfer.
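To compare these figures with your own setup, one option is to time each call on the client side. The following is a minimal sketch that assumes a synchronous client. `fake_send_command` is a hypothetical stand-in used only so the example runs, not a ScreenMCP function.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


# Hypothetical stand-in for a real client call; swap in your own client method.
def fake_send_command(name: str) -> dict:
    time.sleep(0.05)  # simulate a 50 ms round trip
    return {"command": name, "ok": True}


result, ms = timed_call(fake_send_command, "tap")
print(f"{result['command']} took {ms:.0f} ms")
```

Measuring on the client side captures the full round trip, including the relay hops, which is the number the tables in this document describe.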

Connection Setup

The first command in a session requires connection setup. Subsequent commands reuse the existing connection and are much faster.

| Step | What happens | Typical time |
|---|---|---|
| Discover | API finds the best worker and notifies your phone | 0.6–1.0s |
| WebSocket + Auth | Client connects to worker and authenticates | 0.8–1.2s |
| Phone wake-up | Phone receives push notification and connects to worker | 0.3–2.0s |
| Total (first command) | | 2–4s |

If the phone is already connected (e.g. you sent a command recently), the wake-up step is skipped. The phone stays connected for 5 minutes after the last command.
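The setup budget can be worked out by adding the step ranges, with the wake-up step dropped when the phone is still inside its 5-minute window. This is a rough sketch, not ScreenMCP code. The function names are hypothetical, and treating exactly 300 seconds as "disconnected" is an assumption.

```python
from typing import Dict, Optional, Tuple

# Typical step times in seconds (low, high), copied from the connection setup table.
SETUP_STEPS: Dict[str, Tuple[float, float]] = {
    "discover": (0.6, 1.0),
    "websocket_auth": (0.8, 1.2),
    "phone_wake_up": (0.3, 2.0),
}

IDLE_TIMEOUT_S = 5 * 60  # phone stays connected for 5 minutes after the last command


def phone_still_connected(seconds_since_last_command: Optional[float]) -> bool:
    """None means no command has been sent to this phone yet."""
    if seconds_since_last_command is None:
        return False
    return seconds_since_last_command < IDLE_TIMEOUT_S


def setup_budget(seconds_since_last_command: Optional[float]) -> Tuple[float, float]:
    """Return the (low, high) setup time in seconds for a new session."""
    skip_wake_up = phone_still_connected(seconds_since_last_command)
    low = high = 0.0
    for step, (step_low, step_high) in SETUP_STEPS.items():
        if step == "phone_wake_up" and skip_wake_up:
            continue  # phone is already connected to a worker
        low += step_low
        high += step_high
    return round(low, 2), round(high, 2)


print(setup_budget(None))  # cold start: all three steps
print(setup_budget(60))    # phone used a minute ago: wake-up skipped
```

Summing the table rows gives 1.7–4.2s for a cold start, in line with the 2–4s total, and 1.4–2.2s when the wake-up step is skipped.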

Screenshot Performance

Screenshots are the slowest command. The phone captures the screen, compresses it to WebP, base64-encodes it, and sends it back through the relay.

| Step | Details | Typical time |
|---|---|---|
| Screen capture | Android AccessibilityService captures pixels | 60–150ms |
| Compress + encode | WebP compression and base64 encoding | 150–300ms |
| Network transfer | Phone → worker → client (~80KB) | 200–500ms |
| Total per screenshot | | 0.5–1.5s |

  • Image size: 60–90 KB (WebP @ quality 80, 720px wide)
  • Phone processing: ~280 ms (capture + compress + encode)
  • Avg round trip: ~900 ms (end-to-end with network)
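Two pieces of arithmetic behind these numbers can be checked directly: the size increase from base64 encoding, and the sum of the per-step times. The sketch below assumes the ~80KB figure is the WebP size before base64 encoding, which the text does not state. The variable names are illustrative.

```python
import base64
import math


def base64_len(raw_bytes: int) -> int:
    """Length of standard padded base64 output for raw_bytes of input."""
    return 4 * math.ceil(raw_bytes / 3)


WEBP_BYTES = 80_000  # assumption: "~80KB" is the WebP size before base64
encoded_bytes = base64_len(WEBP_BYTES)

# Cross-check the formula against the standard library on dummy data.
assert encoded_bytes == len(base64.b64encode(bytes(WEBP_BYTES)))

overhead_pct = 100 * (encoded_bytes - WEBP_BYTES) / WEBP_BYTES

# Typical step times in milliseconds (low, high), copied from the screenshot table.
STEPS_MS = {
    "screen_capture": (60, 150),
    "compress_encode": (150, 300),
    "network_transfer": (200, 500),
}

total_ms = (
    sum(low for low, _ in STEPS_MS.values()),
    sum(high for _, high in STEPS_MS.values()),
)
phone_ms = (
    STEPS_MS["screen_capture"][0] + STEPS_MS["compress_encode"][0],
    STEPS_MS["screen_capture"][1] + STEPS_MS["compress_encode"][1],
)

print(f"base64 size: {encoded_bytes} bytes (+{overhead_pct:.0f}%)")
print(f"phone processing: {phone_ms[0]}-{phone_ms[1]} ms")
print(f"sum of steps: {total_ms[0]}-{total_ms[1]} ms")
```

Base64 adds about a third to the payload, so a smaller image helps twice: less to compress and less to transfer. The step ranges sum to 410–950 ms, so the quoted 0.5–1.5s total presumably leaves headroom for slower networks.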

Other Commands

Non-screenshot commands are faster because they don't involve image compression or large data transfer.

| Command | Typical round trip |
|---|---|
| click, long_click | 200–400ms |
| type, get_text | 200–400ms |
| back, home, recents | 150–300ms |
| drag, scroll | 300–600ms |
| ui_tree | 300–800ms |
| screenshot | 500ms–1.5s |
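These ranges can be added up to estimate how long a multi-step automation might take. The sketch below assumes commands run one after another on a connection that is already set up. `estimate_sequence` is a hypothetical helper, not part of any ScreenMCP SDK.

```python
from typing import Dict, Iterable, Tuple

# Typical round trips in milliseconds (low, high), copied from the command table.
ROUND_TRIP_MS: Dict[str, Tuple[int, int]] = {
    "click": (200, 400),
    "long_click": (200, 400),
    "type": (200, 400),
    "get_text": (200, 400),
    "back": (150, 300),
    "home": (150, 300),
    "recents": (150, 300),
    "drag": (300, 600),
    "scroll": (300, 600),
    "ui_tree": (300, 800),
    "screenshot": (500, 1500),
}


def estimate_sequence(commands: Iterable[str]) -> Tuple[int, int]:
    """Return the (low, high) total in ms for commands run sequentially."""
    low = high = 0
    for name in commands:
        step_low, step_high = ROUND_TRIP_MS[name]  # KeyError on unknown commands
        low += step_low
        high += step_high
    return low, high


with_screenshots = estimate_sequence(["screenshot", "click", "type", "screenshot"])
with_ui_tree = estimate_sequence(["ui_tree", "click", "type", "ui_tree"])
print(with_screenshots)  # (1400, 3800)
print(with_ui_tree)      # (1000, 2400)
```

In this example, replacing the two screenshots with ui_tree calls lowers the estimate from 1.4–3.8s to 1.0–2.4s.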

Tips for Best Performance

  1. Use lower screenshot quality. Set quality: 60 and max_width: 720 for faster compression and smaller transfer. The AI can still read the screen clearly.
  2. Keep sessions active. The phone disconnects after 5 minutes of idle time. Sending commands regularly keeps the connection alive and avoids the 1–2 second reconnection delay.
  3. Use ui_tree for navigation. Instead of taking a screenshot to find a button, use ui_tree to get element positions. It returns text data which is faster to transfer than images.
  4. Good network matters. The phone needs a reliable internet connection (Wi-Fi or strong cellular). Latency depends on the phone's network speed and distance to the relay server.
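As an illustration of the ui_tree tip, the sketch below finds an element by its text and computes a tap point at its center. The node shape (`text`, `bounds`, `children`) is an assumption made for this example, and the real ui_tree output may be structured differently. Only the `quality` and `max_width` option names come from the tips above.

```python
from typing import Any, Dict, Iterator, Optional, Tuple

Node = Dict[str, Any]

# Screenshot options suggested in the tips above.
FAST_SCREENSHOT_OPTIONS = {"quality": 60, "max_width": 720}


def walk(node: Node) -> Iterator[Node]:
    """Yield a node and all of its descendants, depth first."""
    yield node
    for child in node.get("children", []):
        yield from walk(child)


def find_center(tree: Node, text: str) -> Optional[Tuple[int, int]]:
    """Return the center (x, y) of the first node whose text matches, else None."""
    for node in walk(tree):
        if node.get("text") == text and "bounds" in node:
            b = node["bounds"]
            return (b["left"] + b["right"]) // 2, (b["top"] + b["bottom"]) // 2
    return None


# Hypothetical tree, shaped for this example only.
sample_tree: Node = {
    "text": "",
    "children": [
        {"text": "Wi-Fi", "bounds": {"left": 40, "top": 100, "right": 680, "bottom": 180}},
        {"text": "Settings", "bounds": {"left": 40, "top": 200, "right": 680, "bottom": 280}},
    ],
}

print(find_center(sample_tree, "Settings"))  # (360, 240)
```

The returned coordinates can then be passed to a click command, which avoids a screenshot round trip when the target has a known label.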

How It Works

AI Client (Claude, Cursor, SDK)
    │
    │  HTTPS ── discover worker, get wsUrl
    ▼
ScreenMCP API  ──push notification──▶  Phone
    │                                    │
    │                                    │ connects to worker
    ▼                                    ▼
Worker (WebSocket relay) ◀────WSS────▶ Phone
    │                                    │
    │  command ──────────────────────▶   │ execute
    │  ◀──────────────── response ──    │
    ▼
AI Client receives result

All communication is encrypted (TLS). The worker is a stateless relay — it does not store screenshots or commands. Data flows through in real time and is not persisted.