
Run Nightly E2E Tests on Staging

Schedule Devin to run your end-to-end test suite against staging every night and file tickets for failures.
Author: Cognition
Category: Automations
Features: API, Schedules, Playbooks
This guide walks through managing schedules via the v3 API, which is useful for automation and infrastructure-as-code workflows. You can also create and manage schedules directly in the Devin UI without any API setup.
1. Set up a service user for API access

Scheduled sessions created through the API need a service user with the right permissions. You’ll set one up once, then use its API key in all the calls below.
  1. Go to app.devin.ai > Settings > Service Users and click Create Service User
  2. Assign a role that includes the ManageOrgSchedules permission
  3. Save the API key shown after creation — it’s only displayed once and you’ll use it as your Bearer token
To get your organization ID, call the List Organizations endpoint with your service user token:
curl "https://api.devin.ai/v3/enterprise/organizations" \
  -H "Authorization: Bearer $DEVIN_API_KEY"
Export both values so the commands in this guide work as-is:
export DEVIN_API_KEY="sk-your-service-user-key"
export ORG_ID="your-org-id"
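Every later command depends on these two variables, so a small guard can catch a missing export before any request is sent. The following is a minimal sketch in POSIX shell; `require_env` is a hypothetical helper name chosen for illustration, not part of any Devin tooling.

```shell
# Hypothetical helper: report every named variable that is unset or empty.
# Returns 0 only when all of them have a value.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "error: $name is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `require_env DEVIN_API_KEY ORG_ID || exit 1` at the top of a script stops it early with a readable message instead of a failed request.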
See the API authentication docs for more on service users and permissions.
2. Write a playbook for the test run

Before creating the schedule, write a playbook that tells Devin exactly how to run your E2E suite and what to do with the results. Go to Settings > Playbooks and create a new playbook, or use Advanced Devin to generate one for you from a description of your test workflow. The example in this guide assumes a Playwright suite.
Note the playbook ID after saving; you’ll need it for the API call. You can find it in the URL when viewing the playbook (app.devin.ai/.../playbooks/{playbook_id}).
Install the Linear integration so Devin can create tickets as part of the playbook. When creating the schedule, you can also set a Slack channel (e.g., #qa-results) so your team gets notified automatically. Give Devin read-only access to your staging environment’s secrets (database URLs, API keys) via organization secrets if your tests need them.
3. Create the nightly schedule via the API

Now use the POST /v3/organizations/{org_id}/schedules endpoint to register the schedule. This example runs every night at 2 AM UTC:
curl -X POST "https://api.devin.ai/v3/organizations/$ORG_ID/schedules" \
  -H "Authorization: Bearer $DEVIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Nightly E2E — staging",
    "prompt": "Run the nightly E2E test suite against staging. Follow the playbook exactly.",
    "schedule_type": "recurring",
    "frequency": "0 2 * * *",
    "playbook_id": "your-playbook-id"
  }'
The response includes a schedule_id you’ll use to manage this schedule later. Save it:
export SCHEDULE_ID="a1b2c3d4-e5f6-7890-abcd-ef1234567890"  # from the response
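Rather than copying the ID by hand, a script can pull it out of the response. The sketch below assumes the response is JSON with `schedule_id` as a plain string field; the exact response shape is an assumption, and `json_string_field` is a hypothetical helper. It uses `sed` to stay dependency-free, which is not a real JSON parser, so a tool such as `jq` is likely the sturdier choice where it is installed.

```shell
# Hypothetical helper: print the first string value of a JSON field.
# $1 = field name; JSON is read from stdin.
# Good enough for flat string fields; it does not handle escaped quotes.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Made-up response body for illustration; the real response has more fields.
response='{"schedule_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"}'
SCHEDULE_ID=$(printf '%s\n' "$response" | json_string_field schedule_id)
export SCHEDULE_ID
echo "$SCHEDULE_ID"
```

In a real script, `response` would hold the output of the `curl -X POST` call rather than a literal string.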
The frequency field uses standard cron syntax. Some useful alternatives:
Pattern        When it runs
0 2 * * *      Every night at 2 AM UTC
0 2 * * 1-5    Weeknights only (Mon-Fri)
0 6 * * 1      Every Monday at 6 AM UTC
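Each pattern above has five whitespace-separated fields: minute, hour, day of month, month, and day of week. A dropped field is an easy mistake to make when editing by hand, so a quick local check can help. This sketch only counts fields; it does not validate ranges, and `is_five_field_cron` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the expression has exactly five fields.
# The body runs in a subshell so the option change does not leak to the caller.
is_five_field_cron() (
  set -f            # disable globbing so "*" stays a literal field
  set -- $1         # split the expression on whitespace
  [ "$#" -eq 5 ]
)
```

For example, `is_five_field_cron "0 2 * * 1-5" || echo "bad pattern"` prints nothing, while passing `"0 2 * *"` prints the warning.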
Why 2 AM? You want tests to run after the last deploy of the day has settled on staging, but early enough that failures are visible when engineers start work. Adjust to match your team’s timezone and deploy cadence.
See the Create schedule endpoint docs for all available fields.
4. Verify the first run and tune the prompt

After the schedule fires for the first time, check the session to make sure Devin ran the tests correctly and the output matches what you expect.
  1. Open the Devin dashboard and find the session under Past Sessions — it will be tagged with the schedule name
  2. Confirm that the Playwright suite executed and that Linear tickets were created only for real failures, not flaky tests
  3. Check the #qa-results Slack channel for the summary message
Common issues on the first run and how to fix them:
  • Devin can’t access staging: Add your staging environment variables (like STAGING_API_KEY or DATABASE_URL) as organization secrets so they’re available in every scheduled session
  • Too many tickets from flaky tests: Add a retry to your playbook: “Re-run any failing test once before filing a ticket. Only file tickets for tests that fail twice.”
  • Tests take too long: Scope the suite — e.g., “Only run tests in tests/critical/ and tests/smoke/” — or increase the session timeout
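The "fail twice before filing" rule from the flaky-test fix can also be expressed as a small wrapper, which may be useful if the playbook calls a script instead of describing the retry in prose. This is a sketch of the idea, not something the playbook requires; `run_with_one_retry` is a hypothetical helper name.

```shell
# Hypothetical helper: run a command, and retry exactly once if it fails.
# The exit status is that of the last attempt, so a non-zero result means
# the command failed twice in a row.
run_with_one_retry() {
  "$@" && return 0
  echo "first attempt failed, retrying once: $*" >&2
  "$@"
}
```

A scoped run might then look like `run_with_one_retry npx playwright test tests/critical/`. Playwright also has a built-in `--retries` option, which retries per test rather than re-running the whole command.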
5. Manage schedules as code

Once your nightly run is stable, you’ll want to manage it alongside your other schedules: pausing during deploy freezes, updating the prompt when your test suite changes, or spinning up a second schedule for a different environment.
Pause the schedule during a deploy freeze or maintenance window:
curl -X PATCH "https://api.devin.ai/v3/organizations/$ORG_ID/schedules/$SCHEDULE_ID" \
  -H "Authorization: Bearer $DEVIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"enabled": false}'
Re-enable it when the freeze ends:
curl -X PATCH "https://api.devin.ai/v3/organizations/$ORG_ID/schedules/$SCHEDULE_ID" \
  -H "Authorization: Bearer $DEVIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"enabled": true}'
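Since the pause and resume calls differ only in the boolean, they can be folded into one function. The sketch below wraps the same PATCH request and rejects any argument other than `true` or `false`; both function names are hypothetical, and it assumes `DEVIN_API_KEY`, `ORG_ID`, and `SCHEDULE_ID` are already exported.

```shell
# Hypothetical helper: build the PATCH body, rejecting anything but true/false.
enabled_payload() {
  case "$1" in
    true|false) printf '{"enabled": %s}\n' "$1" ;;
    *) echo "usage: enabled_payload true|false" >&2; return 2 ;;
  esac
}

# Hypothetical wrapper around the PATCH request used to pause and resume.
# -f makes curl exit non-zero on HTTP errors; -sS hides progress but shows errors.
set_schedule_enabled() {
  payload=$(enabled_payload "$1") || return
  curl -fsS -X PATCH \
    "https://api.devin.ai/v3/organizations/$ORG_ID/schedules/$SCHEDULE_ID" \
    -H "Authorization: Bearer $DEVIN_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$payload"
}
```

With this in place, `set_schedule_enabled false` pauses the schedule and `set_schedule_enabled true` resumes it.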
List all schedules to audit what’s running:
curl "https://api.devin.ai/v3/organizations/$ORG_ID/schedules" \
  -H "Authorization: Bearer $DEVIN_API_KEY"
For teams managing multiple schedules, ask Devin to build a CLI that syncs schedule definitions from a YAML config file, so you can version-control your schedules alongside your test configuration.
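To give a sense of what such a tool does at its core, here is a deliberately small sketch. It departs from the YAML idea in two ways that should be read as assumptions: it reads a tab-separated file (title, frequency, playbook ID, prompt) to avoid needing a YAML parser, and it only creates schedules, whereas a real sync would also compare against the list endpoint and update existing entries. All function names are hypothetical, and the JSON fields are the ones used in the create call earlier in this guide.

```shell
# Escape backslashes and double quotes so a value is safe inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build a create-schedule body. $1 title, $2 frequency, $3 playbook ID, $4 prompt.
schedule_payload() {
  printf '{"title": "%s", "prompt": "%s", "schedule_type": "recurring", "frequency": "%s", "playbook_id": "%s"}\n' \
    "$(json_escape "$1")" "$(json_escape "$4")" \
    "$(json_escape "$2")" "$(json_escape "$3")"
}

# Read definitions from the file in $1 and create one schedule per line.
# DRY_RUN defaults to 1, which prints the payloads instead of sending them.
sync_schedules() {
  while IFS="$(printf '\t')" read -r title frequency playbook_id prompt; do
    [ -n "$title" ] || continue
    payload=$(schedule_payload "$title" "$frequency" "$playbook_id" "$prompt")
    if [ "${DRY_RUN:-1}" = "1" ]; then
      printf '%s\n' "$payload"
    else
      curl -fsS -X POST "https://api.devin.ai/v3/organizations/$ORG_ID/schedules" \
        -H "Authorization: Bearer $DEVIN_API_KEY" \
        -H "Content-Type: application/json" \
        -d "$payload"
    fi
  done < "$1"
}
```

Running `sync_schedules schedules.tsv` prints the request bodies for review, and `DRY_RUN=0 sync_schedules schedules.tsv` sends them. The file name `schedules.tsv` is only an example.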