agent-qa

Install agent-qa, prepare web and mobile runtimes, connect an LLM, and inspect your first run from the dashboard or CLI.

agent-qa ships as an npm package. Add it to an existing codebase when you already have an app repository, or start a small JavaScript workspace when you want to try agent-qa beside a non-JavaScript project first.

Prerequisites

  • A JavaScript runtime such as Node.js or Bun. agent-qa is written in JavaScript and needs a local runtime for the CLI and dashboard.
  • Access to an LLM for inference. You can use a remote API endpoint, a local model served by tools such as Ollama or LM Studio, or subscription auth.
    • OpenAI-compatible API endpoints
    • Anthropic-compatible API endpoints
    • Gemini models
    • Codex or Claude Code subscriptions through the optional subscription auth plugin

Info

Use a multimodal model for the normal quickstart. Web and mobile runs inspect screenshots, so text-only models are not a good fit for visual QA workflows.

  • Docker is optional, but recommended. agent-qa can run JavaScript, Python, Bash, and Bun hooks inside an isolated Docker runtime.

Install agent-qa and prepare the environment

agent-qa runs independently from the application under test, so you can install it in JavaScript, Rails, Django, Laravel, Go, Java, Swift, Kotlin, web, or mobile repositories. You only need enough Node.js tooling to install and run the CLI.

node --version
npm --version
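The version check above can also be wrapped in a small preflight script. This is a minimal sketch rather than part of agent-qa: `check_tool` is a hypothetical helper name, and the script only reports whether each tool is on PATH, since no minimum version is stated here.

```shell
#!/bin/sh
# Hypothetical helper: report whether a tool is on PATH and print the
# first line of its --version output if it is.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: $("$1" --version 2>&1 | head -n 1)"
  else
    echo "$1: not found on PATH"
    return 1
  fi
}

missing=0
for tool in node npm; do
  check_tool "$tool" || missing=1
done
[ "$missing" -eq 0 ] || echo "Install the missing tools before continuing."
```

The same helper can be reused for other command-line tools that respond to `--version`.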

Skip this step when your repository already has a package.json. Otherwise, create one before installing agent-qa:

npm init -y
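If you script the setup, the "skip when package.json exists" rule can be expressed as a guard. The sketch below assumes a hypothetical helper named `ensure_package_json`; it only wraps the `npm init -y` command shown above.

```shell
#!/bin/sh
# Hypothetical helper: create a package.json only when the directory
# does not already have one. Defaults to the current directory.
ensure_package_json() {
  dir="${1:-.}"
  if [ -f "$dir/package.json" ]; then
    echo "package.json already exists in $dir; skipping npm init"
  else
    (cd "$dir" && npm init -y)
  fi
}
```

Call `ensure_package_json` from the project root, or pass a directory as the first argument.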

Install agent-qa as a dev dependency so every teammate and CI job can run the same version.

npm install -D agent-qa

If you want to use Codex or Claude Code subscription auth instead of provider API keys, install the optional subscription auth package.

npm install -D @vostride/agent-qa-subscription-auth

Set up the testing environment

Let's set up the test environment by installing browser runtimes for web and the relevant platform tools for mobile.

Web browsers

Install browser runtimes for web tests. These agent-qa-managed browsers do not replace or interfere with browsers you already have installed.

npx agent-qa install-browsers --all

Mobile drivers

For Android or iOS tests, install the Appium runtime first:

npm install -g appium
appium --version

Then install the relevant mobile drivers:

npx agent-qa install-mobile-drivers --all

You also need the developer platform tools for each mobile platform you plan to test.

Hook runtime

Hooks run in an isolated Docker environment. Install Docker from Docker's Get Started page, start Docker Desktop or the Docker daemon, then confirm the CLI can reach it:

docker --version
docker info

You only need Docker for tests or suites that use hooks. If your first run does not use hooks, you can set it up later.
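To tell "Docker is not installed" apart from "Docker is installed but the daemon is not running", the two commands above can be combined into one check. This is a sketch under stated assumptions: `docker_ready` is a hypothetical helper name, and it relies only on `docker info` returning a non-zero status when the daemon cannot be reached.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when the docker CLI exists and can
# reach a running daemon.
docker_ready() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker CLI not found"
    return 1
  fi
  if docker info >/dev/null 2>&1; then
    echo "docker daemon is reachable"
  else
    echo "docker CLI found, but the daemon is not reachable"
    return 1
  fi
}

docker_ready || echo "Continue without hooks, or set up Docker and re-run this check."
```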

Initialize agent-qa

Run the init command to scaffold the config files, local settings, sample tests, and hook examples.

npx agent-qa init

Verify the environment

Run the doctor command after initialization to validate the local runtime pieces before your first test run.

npx agent-qa doctor

The generated workspace usually looks like this. Exact sample file names can vary by version, but the shape is stable: project config at the root, optional hook scripts, suites, and tests.

agent-qa.config.yaml

# agent-qa Configuration

# File discovery and project settings
workspace:
  testMatch:
    - tests/**/*.yaml
  suiteMatch:
    - suites/**/*.suite.yaml
  hooksFile: hooks.yaml
  agentRules: ./agent-rules.md
  envFile: .env
  secretsFile: .env.secrets.local
  # Optional: ignore archived or generated tests.
  # testPathIgnore:
  #   - tests/archive/**/*.yaml

# Infrastructure services (dashboard, MCP, cache, auth state, logging)
services:
  dashboard:
    port: 3100
    artifactsDir: .agent-qa/artifacts
    # Optional: persist dashboard state to a custom SQLite path.
    # dbPath: .agent-qa/dashboard.sqlite
  mcp:
    enabled: true
    transport: http
    host: 127.0.0.1
    port: 3471
    path: /mcp
  cache:
    dir: .agent-qa/cache
    ttl: 7d
  # Accessibility checks power the W3C BAD demo test.
  authState:
    dir: .agent-qa/auth-states
  accessibility:
    enabled: true
    standard: wcag2aa
    runAfter: every-step
    failOnViolation: false
  recording:
    enabled: true
  memory:
    enabled: true
    provider: local
    dir: agent-qa-memory
  logging:
    level: warn

# Named resource definitions (LLM configs, app targets)
registry:
  # Optional: local and cloud device/provider profiles.
  # devices:
  #   android-emu:
  #     platform: android
  #     transport: local
  # providers:
  #   browserstack:
  #     username: ${BROWSERSTACK_USERNAME}
  #     accessKey: ${BROWSERSTACK_ACCESS_KEY}
  llms:
    - name: codex
      provider: openai-subscription
      model: gpt-5.5
      screenshotSize: 50kb
      effectiveResolution: 500
  targets:
    # Optional: add more web or mobile targets here.
    # my-mobile:
    #   platform: android
    #   appPackage: com.example.app
    #   appActivity: .MainActivity
    #   app:
    #     path: apps/example.apk
    example-web:
      platform: web
      url: https://example.com
    automation-exercise:
      platform: web
      url: https://automationexercise.com
    wai-bad:
      platform: web
      url: https://www.w3.org/WAI/demos/bad/before/home.html
    example-android:
      platform: android
    example-ios:
      platform: ios

# Optional subscription auth plugin declarations
plugins:
  auth:
    - package: "@vostride/agent-qa-subscription-auth"

# Execution settings (cascades: global -> suite -> test -> CLI flags)
use:
  browser:
    name: chromium
    headless: true
    viewport:
      width: 1280
      height: 720
  mobile:
    appState: preserve
  timeout:
    step: 5m
    test: 30m
    navigation: 1m
  healing:
    maxAttempts: 3
  planner:
    maxSubActions: 10
    previousStepCount: 5
  logCapture:
    console: true
    network: true
  parallel: false
  llm: codex
  # Optional: bind mobile runs to a configured device profile.
  # device: android-emu

Keep generated run artifacts out of commits unless your team intentionally stores them.
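One way to do this is to add the generated directories to .gitignore. The sketch below takes its paths from the sample config (artifactsDir, cache.dir, authState.dir, secretsFile). Ignoring the cache, auth-state, and secrets paths is an assumption about what your team wants, so trim the list as needed. `add_ignore` is a hypothetical helper that appends an entry only if it is not already present.

```shell
#!/bin/sh
# Hypothetical helper: append an entry to an ignore file unless an
# identical line already exists, so the script is safe to re-run.
add_ignore() {
  file="$1"
  entry="$2"
  touch "$file"
  grep -qxF -- "$entry" "$file" || printf '%s\n' "$entry" >> "$file"
}

# Paths assume the default locations from the sample config.
for entry in .agent-qa/artifacts/ .agent-qa/cache/ .agent-qa/auth-states/ .env.secrets.local; do
  add_ignore .gitignore "$entry"
done
```

If you changed any of these locations in agent-qa.config.yaml, use your own paths instead.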

Open the dashboard

Start the local dashboard from the project root.

npx agent-qa dashboard --open

The --open flag opens the agent-qa dashboard in your default browser. Before running your first test, connect the LLM that agent-qa should use. On the dashboard:

  1. Go to Config > LLM.
  2. Add an LLM configuration.
  3. Choose the provider or subscription auth mode.
  4. Test the connection.
  5. Go to Config > Execution Defaults and select the new LLM configuration.
  6. Save the config.

The dashboard writes back to your local config files, so review the diff the same way you would review any other project configuration change. Model secrets, such as API keys or auth tokens, are stored in ~/.agent-qa/auth.json.
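If the project is tracked in git, reviewing the dashboard's changes can be as simple as the sketch below. It uses only standard git commands. `review_config_changes` is a hypothetical helper, and which files the dashboard touches is an assumption, so pass whichever config files your project uses.

```shell
#!/bin/sh
# Hypothetical helper: show a short status and the full unstaged diff
# for the given config files.
review_config_changes() {
  git status --short -- "$@"
  git diff -- "$@"
}
```

For example, run `review_config_changes agent-qa.config.yaml hooks.yaml` from the project root after saving in the dashboard.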

Run your first test from the dashboard

Use the generated sample before writing a custom test.

  1. Go to Tests.
  2. Select Example passing test.
  3. Click Run or press R.

By default, runs execute in headless mode. Disable headless mode in the execution settings when you want to watch the browser or mobile session directly.

While the test is running, the live view shows the active execution. After the run completes, open the run view and inspect the full timeline:

  • what the agent observed before each step
  • how it planned the next action
  • what it actually executed, such as clicking a button or filling an input
  • how it verified the result
  • how each assertion was evaluated, including the reasoning behind the pass or failure

This view is the fastest way to learn whether a failure came from product behavior, test wording, environment setup, or model interpretation.

Run your first test from the CLI

agent-qa is designed for team-wide use, not only local debugging. Run the same tests from CI, release jobs, or post-deploy checks to catch regressions before they reach users.

For CI, start with a narrow command that targets the tests you trust, then expand to suites as coverage grows:

npx agent-qa run tests/example-pass.yaml
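In a CI job, the command above can sit behind a small wrapper that validates the environment first. This is a sketch under two assumptions that are not confirmed here: that agent-qa exits with a non-zero status when doctor or a test run fails, and that running doctor in CI is useful for your setup. `run_ci_checks` is a hypothetical name, and it uses only the doctor and run commands already shown.

```shell
#!/bin/sh
# Hypothetical CI wrapper. Assumes agent-qa commands exit non-zero on
# failure, so the job fails when the environment check or a test fails.
run_ci_checks() {
  npx agent-qa doctor || {
    echo "agent-qa doctor reported a problem" >&2
    return 1
  }
  # Pass through the test paths given by the caller.
  npx agent-qa run "$@"
}
```

Call it as `run_ci_checks tests/example-pass.yaml`, then add more test or suite paths as coverage grows.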

The dashboard and CLI share the same file-backed definitions and run artifact storage. Use the dashboard for rich local debugging and the CLI for repeatable automation.

Feel free to reach out if you face any issues.