We Let Maestro Handle the Boring Parts

We’re building a new SDK. Part of that process is benchmarking: running predefined flows on a real device and measuring RAM, CPU, and network usage. We were going to do it manually. Then we thought: what about the next run? And the one after that?

Why Not Appium?

Appium is the obvious answer. It’s also massive: server setup, desired capabilities, a test runner, Java or Python glue code. For a harness that taps five things and calls it a day, that’s scaffolding I’d spend more time maintaining than running. Overkill.

We asked an AI for alternatives and landed on Maestro in a few minutes.

What Maestro Is

YAML-based mobile UI automation. Write what you want the device to do, run it, and Maestro handles the timing by waiting for the UI to settle on its own. No hand-tuned sleeps, no boilerplate. It drives Android and iOS apps whether they are built natively (SwiftUI, Jetpack Compose) or with React Native or Flutter, with one syntax for all of them.

Install

curl -Ls "https://get.maestro.mobile.dev" | bash

You’ll need Java 17+ and the Android SDK (or Xcode for iOS). Installing the CLI itself is that one command.
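
A quick way to confirm the prerequisites are on your PATH before the first run. This is a minimal sketch, not part of Maestro: check_tools is a helper name made up for illustration, and the tool list in the usage line assumes an Android setup.

```shell
# Hypothetical helper (not a Maestro command): report which tools are on PATH.
check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  # Non-zero exit status if anything was missing.
  return "$missing"
}
```

Call it as check_tools java adb maestro. On an iOS setup, swap adb for xcrun.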

A Flow

appId: com.example.sdk.demo
---
- launchApp:
    clearState: true
- tapOn: "Start Benchmark"
- waitForAnimationToEnd
- tapOn: "Run Flow A"
- waitForAnimationToEnd
- takeScreenshot: flow-a-complete

tapOn with a plain string matches visible text or an accessibility label; to match by ID, use the id: selector form instead. The screenshots give us timestamp anchors to match against profiler data.

Run it:

maestro test benchmark-flow.yaml

For live reload while editing the flow:

maestro test --continuous benchmark-flow.yaml

If you need to find element names, maestro studio opens a browser-based inspector that can record taps and generate YAML for you. Full command syntax is in the commands reference.

What We Actually Get Out of This

The same flow runs on Android and iOS; just swap the connected device. Maestro drives the taps while a separate monitoring script captures the metrics in parallel. I’ll do a dedicated post on that setup, since there’s enough to say about it.
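
One rough shape the "in parallel" part could take, as a sketch under assumptions rather than the actual monitoring script: run_with_sampler, the one-second interval, and the log format are all hypothetical. It starts a background sampler, runs the flow in the foreground, then stops the sampler and passes the flow's exit code through.

```shell
# Hypothetical sketch: sample a metric in the background while a flow runs.
# Usage: run_with_sampler <log-file> <sample-command-string> <flow-command...>
run_with_sampler() {
  local log_file="$1" sample_cmd="$2"
  shift 2

  # Background loop: append one timestamped sample roughly every second.
  (
    while true; do
      printf '%s %s\n' "$(date +%s)" \
        "$(bash -c "$sample_cmd" 2>/dev/null | tr '\n' ' ')" >> "$log_file"
      sleep 1
    done
  ) &
  local sampler_pid=$!

  # Run the flow in the foreground and remember its exit code.
  local status=0
  "$@" || status=$?

  # Stop the sampler whether or not the flow passed.
  kill "$sampler_pid" 2>/dev/null || true
  wait "$sampler_pid" 2>/dev/null || true
  return "$status"
}
```

For example, with adb's dumpsys meminfo as a stand-in sampler (our choice for illustration, any command that prints a metric would do):

run_with_sampler metrics.log "adb shell dumpsys meminfo com.example.sdk.demo" maestro test benchmark-flow.yaml

Each log line starts with a Unix timestamp, which is what lets the samples be lined up against the screenshot times.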

What Maestro eliminates is the human variable: slightly different timing each run, missed steps, 10 minutes on a task that should be 10 seconds.

Also — we didn’t write the YAML by hand. We built a Claude skill that takes the app’s codebase and test scenario, then generates the script. So the full pipeline is: describe what to test → Claude writes the YAML → Maestro runs it. At this point we’re mostly just watching things happen and taking credit.

References