Run Nightly E2E Tests on Staging

Set up a service user for API access

Scheduled sessions created through the API need a service user with the right permissions. You’ll set one up once, then use its API key in all the calls below.

Go to app.devin.ai > Settings > Service Users and click Create Service User
Assign a role that includes the ManageOrgSchedules permission
Save the API key shown after creation — it’s only displayed once and you’ll use it as your Bearer token

To get your organization ID, call the List Organizations endpoint with your service user token:

curl "https://api.devin.ai/v3/enterprise/organizations" \
  -H "Authorization: Bearer $DEVIN_API_KEY"

Export both values so the commands in this guide work as-is:

export DEVIN_API_KEY="sk-your-service-user-key"
export ORG_ID="your-org-id"
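
Because every later command relies on these two variables, a small guard can catch a missing export before a request is sent with an empty token or a malformed URL. This is a minimal sketch; the function name `require_devin_env` is hypothetical and not part of any Devin tooling.

```shell
# Return non-zero with a clear message if either variable is unset or empty.
# require_devin_env is an illustrative helper name, not an official command.
require_devin_env() {
  if [ -z "${DEVIN_API_KEY:-}" ]; then
    echo "DEVIN_API_KEY is not set" >&2
    return 1
  fi
  if [ -z "${ORG_ID:-}" ]; then
    echo "ORG_ID is not set" >&2
    return 1
  fi
}
```

Calling `require_devin_env || exit 1` at the top of a script stops it before any curl call runs.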

See the API authentication docs for more on service users and permissions.

Write a playbook for the test run

Before creating the schedule, write a playbook that tells Devin exactly how to run your E2E suite and what to do with the results. Go to Settings > Playbooks and create a new playbook, or use Advanced Devin to generate one from a description of your test workflow. Here’s an example for a Playwright suite:

Note the playbook ID after saving; you’ll need it for the API call. You can find it in the URL when viewing the playbook (app.devin.ai/.../playbooks/{playbook_id}).

Install the Linear integration so Devin can create tickets as part of the playbook. When creating the schedule, you can also set a Slack channel (e.g., #qa-results) so your team gets notified automatically. Give Devin read-only access to your staging environment’s secrets (database URLs, API keys) via organization secrets if your tests need them.

Create the nightly schedule via the API

Now use the POST /v3/organizations/{org_id}/schedules endpoint to register the schedule. This example runs every night at 2 AM UTC:

curl -X POST "https://api.devin.ai/v3/organizations/$ORG_ID/schedules" \
  -H "Authorization: Bearer $DEVIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Nightly E2E — staging",
    "prompt": "Run the nightly E2E test suite against staging. Follow the playbook exactly.",
    "schedule_type": "recurring",
    "frequency": "0 2 * * *",
    "playbook_id": "your-playbook-id"
  }'

The response includes a schedule_id you’ll use to manage this schedule later. Save it:

export SCHEDULE_ID="a1b2c3d4-e5f6-7890-abcd-ef1234567890"  # from the response
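
Rather than copying the ID by hand, you could capture it from the response. The sketch below assumes `jq` is installed and that `schedule_id` is a top-level field of the response body; the guide only says the response includes it, so adjust the filter if the field is nested. The helper name `extract_schedule_id` and the file `schedule.json` are hypothetical.

```shell
# Print the schedule_id from a JSON response read on stdin.
# Assumption: schedule_id is a top-level field of the response body.
# Prints nothing and exits non-zero if the field is missing.
extract_schedule_id() {
  jq -er '.schedule_id // empty'
}

# Usage (not run here; schedule.json is a hypothetical file holding the payload):
#   SCHEDULE_ID=$(curl -sS -X POST "https://api.devin.ai/v3/organizations/$ORG_ID/schedules" \
#     -H "Authorization: Bearer $DEVIN_API_KEY" \
#     -H "Content-Type: application/json" \
#     -d @schedule.json | extract_schedule_id)
#   export SCHEDULE_ID
```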

The frequency field uses standard cron syntax. Some useful alternatives:

Pattern	When it runs
`0 2 * * *`	Every night at 2 AM UTC
`0 2 * * 1-5`	Weeknights only (Mon-Fri)
`0 6 * * 1`	Every Monday at 6 AM UTC
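
Standard cron expressions have exactly five space-separated fields (minute, hour, day of month, month, day of week). A quick local check, sketched below, can catch a stray sixth "seconds" field before you send the request. It only counts fields and does not validate the values; the function name `is_five_field_cron` is hypothetical.

```shell
# Succeed only if the argument splits into exactly five whitespace-separated fields.
# Runs in a subshell so that disabling globbing (needed to keep "*" literal)
# does not leak into the calling shell.
is_five_field_cron() (
  set -f
  set -- $1
  [ "$#" -eq 5 ]
)
```

For example, `is_five_field_cron "0 2 * * 1-5"` succeeds, while a six-field expression such as `"0 0 2 * * *"` fails.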

Why 2 AM? You want tests to run after the last deploy of the day has settled on staging, but early enough that failures are visible when engineers start work. Adjust to match your team’s timezone and deploy cadence.

See the Create schedule endpoint docs for all available fields.

Verify the first run and tune the prompt

After the schedule fires for the first time, check the session to make sure Devin ran the tests correctly and the output matches what you expect.

Open the Devin dashboard and find the session under Past Sessions — it will be tagged with the schedule name
Confirm that the Playwright suite executed and that Linear tickets were created only for real failures, not flaky tests
Check the #qa-results Slack channel for the summary message

Common issues on the first run and how to fix them:

Devin can’t access staging: Add your staging environment variables (like STAGING_API_KEY or DATABASE_URL) as organization secrets so they’re available in every scheduled session
Too many tickets from flaky tests: Add a retry to your playbook: “Re-run any failing test once before filing a ticket. Only file tickets for tests that fail twice.”
Tests take too long: Scope the suite — e.g., “Only run tests in tests/critical/ and tests/smoke/” — or increase the session timeout

Manage schedules as code

Once your nightly run is stable, you’ll want to manage it alongside your other schedules: pausing during deploy freezes, updating the prompt when your test suite changes, or spinning up a second schedule for a different environment.

Pause the schedule during a deploy freeze or maintenance window:

curl -X PATCH "https://api.devin.ai/v3/organizations/$ORG_ID/schedules/$SCHEDULE_ID" \
  -H "Authorization: Bearer $DEVIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"enabled": false}'

Re-enable it when the freeze ends:

curl -X PATCH "https://api.devin.ai/v3/organizations/$ORG_ID/schedules/$SCHEDULE_ID" \
  -H "Authorization: Bearer $DEVIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"enabled": true}'
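
Since the pause and resume calls differ only in the boolean, they can be folded into one helper. This is a sketch that reuses the PATCH request shown above; the function name `set_schedule_enabled` is hypothetical, and `-sS --fail` are standard curl flags added here so HTTP errors produce a non-zero exit status.

```shell
# Pause or resume the schedule identified by $SCHEDULE_ID.
# Accepts exactly "true" or "false"; anything else is rejected before any request is made.
set_schedule_enabled() {
  case "$1" in
    true|false) ;;
    *)
      echo "usage: set_schedule_enabled true|false" >&2
      return 2
      ;;
  esac
  curl -sS --fail -X PATCH \
    "https://api.devin.ai/v3/organizations/$ORG_ID/schedules/$SCHEDULE_ID" \
    -H "Authorization: Bearer $DEVIN_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"enabled\": $1}"
}
```

With this in place, `set_schedule_enabled false` pauses the schedule at the start of a freeze and `set_schedule_enabled true` resumes it afterwards.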

List all schedules to audit what’s running:

curl "https://api.devin.ai/v3/organizations/$ORG_ID/schedules" \
  -H "Authorization: Bearer $DEVIN_API_KEY"

For teams managing multiple schedules, ask Devin to build a CLI that syncs schedule definitions from a YAML config file, so you can version-control your schedules alongside your test configuration.
