Performance & Latency
What to expect when using ScreenMCP for phone automation.
Overview
ScreenMCP commands travel through a multi-hop relay: your AI client sends a command to the ScreenMCP API, which routes it through a WebSocket worker to the phone. The phone executes the action and sends the result back through the same path. Most commands (tap, type, back, home) complete in under 500ms. Screenshots take longer because of image capture, compression, and transfer.
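To compare these figures with your own setup, one option is to time each call on the client side. The sketch below is a minimal example under assumptions: `fake_send` is a hypothetical stand-in for whatever function your client or SDK uses to send a command, and is not part of ScreenMCP.

```python
import time


def timed_ms(send, *args, **kwargs):
    """Call send(...) and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = send(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def fake_send(command):
    # Hypothetical placeholder: sleeps briefly instead of contacting a phone.
    time.sleep(0.01)
    return {"command": command, "ok": True}


result, elapsed_ms = timed_ms(fake_send, "home")
print(f"{result['command']} took {elapsed_ms:.0f} ms")
```

Measuring on the client captures the full round trip, including every hop through the relay, which is the number the tables on this page describe.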
Connection Setup
The first command in a session requires connection setup. Subsequent commands reuse the existing connection and are much faster.
| Step | What happens | Typical time |
|---|---|---|
| Discover | API finds the best worker and notifies your phone | 0.6–1.0s |
| WebSocket + Auth | Client connects to worker and authenticates | 0.8–1.2s |
| Phone wake-up | Phone receives push notification and connects to worker | 0.3–2.0s |
| Total (first command) | | 2–4s |
If the phone is already connected (e.g. you sent a command recently), the wake-up step is skipped. The phone stays connected for 5 minutes after the last command.
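The setup budget can be worked through as simple arithmetic. The sketch below sums the typical ranges from the table, assuming the steps run one after another (the page does not say whether the phone wake-up overlaps with the client's WebSocket connection, so treat the result as an upper-bound style estimate). The step names are illustrative labels, not API identifiers.

```python
# Typical per-step ranges in milliseconds, copied from the table above.
SETUP_STEPS_MS = {
    "discover": (600, 1000),
    "websocket_auth": (800, 1200),
    "phone_wake_up": (300, 2000),
}


def setup_estimate_ms(phone_already_connected):
    """Return (best, worst) setup time, skipping wake-up if the phone is connected."""
    best = worst = 0
    for step, (low, high) in SETUP_STEPS_MS.items():
        if step == "phone_wake_up" and phone_already_connected:
            continue  # the wake-up step is skipped for a connected phone
        best += low
        worst += high
    return best, worst


print(setup_estimate_ms(False))  # (1700, 4200)
print(setup_estimate_ms(True))   # (1400, 2200)
```

The sequential sum of 1.7–4.2s is roughly in line with the 2–4s total quoted in the table.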
Screenshot Performance
Screenshots are the slowest command. The phone captures the screen, compresses it to WebP, base64-encodes it, and sends it back through the relay.
| Step | Details | Typical time |
|---|---|---|
| Screen capture | Android AccessibilityService captures pixels | 60–150ms |
| Compress + encode | WebP compression and base64 encoding | 150–300ms |
| Network transfer | Phone → worker → client (~80KB) | 200–500ms |
| Total per screenshot | | 0.5–1.5s |
| Metric | Typical value | Details |
|---|---|---|
| Image size | 60–90 KB | WebP @ quality 80, 720px wide |
| Phone processing | ~280 ms | Capture + compress + encode |
| Avg round trip | ~900 ms | End-to-end with network |
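Base64 encoding adds about one third to the payload, which is part of why screenshots cost more to transfer than other commands. The sketch below works through that arithmetic, assuming the 60–90 KB figure refers to the WebP bytes before encoding (the page does not say which side of the encoding step it measures) and taking 1 KB as 1000 bytes.

```python
import base64


def base64_size(raw_bytes):
    """Length of standard padded base64 output for raw_bytes input bytes."""
    return 4 * ((raw_bytes + 2) // 3)


for kb in (60, 90):
    raw = kb * 1000
    encoded = base64_size(raw)
    # Cross-check the formula against the standard library encoder.
    assert encoded == len(base64.b64encode(bytes(raw)))
    print(f"{kb} KB WebP -> {encoded / 1000:.0f} KB base64")
```

Under that assumption, a 60 KB image becomes 80 KB on the wire and a 90 KB image becomes 120 KB.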
Other Commands
Non-screenshot commands are faster because they don't involve image compression or large data transfer.
| Command | Typical round trip |
|---|---|
| click, long_click | 200–400ms |
| type, get_text | 200–400ms |
| back, home, recents | 150–300ms |
| drag, scroll | 300–600ms |
| ui_tree | 300–800ms |
| screenshot | 500ms–1.5s |
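These ranges can be combined to budget a multi-step automation. The sketch below is a rough estimate only: it assumes commands run one at a time over an already-established connection and simply adds the typical ranges from the table. The `estimate_ms` helper is hypothetical.

```python
# Typical round-trip ranges in milliseconds, copied from the table above.
ROUND_TRIP_MS = {
    "click": (200, 400),
    "long_click": (200, 400),
    "type": (200, 400),
    "get_text": (200, 400),
    "back": (150, 300),
    "home": (150, 300),
    "recents": (150, 300),
    "drag": (300, 600),
    "scroll": (300, 600),
    "ui_tree": (300, 800),
    "screenshot": (500, 1500),
}


def estimate_ms(commands):
    """Return (best, worst) total time for a sequence of command names."""
    best = sum(ROUND_TRIP_MS[name][0] for name in commands)
    worst = sum(ROUND_TRIP_MS[name][1] for name in commands)
    return best, worst


workflow = ["ui_tree", "click", "type", "screenshot"]
print(estimate_ms(workflow))  # (1200, 3100)
```

Replacing the final `screenshot` with a second `ui_tree` call in this example lowers the estimate to (1000, 2400).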
Tips for Best Performance
1. Use lower screenshot quality. Set `quality: 60` and `max_width: 720` for faster compression and smaller transfer. The AI can still read the screen clearly.
2. Keep sessions active. The phone disconnects after 5 minutes of idle time. Sending commands regularly keeps the connection alive and avoids the 1–2 second reconnection delay.
3. Use ui_tree for navigation. Instead of taking a screenshot to find a button, use `ui_tree` to get element positions. It returns text data, which is faster to transfer than images.
4. Good network matters. The phone needs a reliable internet connection (Wi-Fi or strong cellular). Latency depends on the phone's network speed and distance to the relay server.
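The "keep sessions active" tip can be sketched as a small timer on the client side. This is a minimal illustration, not part of ScreenMCP: the `KeepAlive` class and the 30-second safety margin are assumptions, and only the 5-minute idle timeout comes from this page. Which lightweight command to send when a keep-alive is due is left to the caller.

```python
import time

IDLE_TIMEOUT_S = 5 * 60  # the phone disconnects after 5 minutes of idle time
SAFETY_MARGIN_S = 30     # assumption: act a little before the timeout


class KeepAlive:
    """Tracks the last command time and reports when a keep-alive is due."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last = None

    def record_command(self):
        # Call this after every command sent to the phone.
        self._last = self._clock()

    def ping_due(self):
        if self._last is None:
            return False  # no session yet, nothing to keep alive
        idle = self._clock() - self._last
        return idle >= IDLE_TIMEOUT_S - SAFETY_MARGIN_S


keep_alive = KeepAlive()
keep_alive.record_command()
print(keep_alive.ping_due())  # False right after a command
```

Injecting the clock keeps the class easy to test without waiting for real time to pass.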
How It Works
AI Client (Claude, Cursor, SDK)
    │
    │  HTTPS ── discover worker, get wsUrl
    ▼
ScreenMCP API ──push notification──▶ Phone
    │                                  │
    │                                  │ connects to worker
    ▼                                  ▼
Worker (WebSocket relay) ◀────WSS────▶ Phone
    │                                  │
    │  command ──────────────────────▶ │ execute
    │  ◀──────────────── response ──   │
    ▼
AI Client receives result

All communication is encrypted (TLS). The worker is a stateless relay: it does not store screenshots or commands. Data flows through in real time and is not persisted.