Run Nightly E2E Tests on Staging
Schedule Devin to run your end-to-end test suite against staging every night and file tickets for failures.

This guide walks through managing schedules via the v3 API, which is useful for automation and infrastructure-as-code workflows. You can also create and manage schedules directly in the Devin UI without any API setup.
Set up a service user for API access
Scheduled sessions created through the API need a service user with the right permissions. You’ll set one up once, then use its API key in all the calls below.

- Go to app.devin.ai > Settings > Service Users and click Create Service User
- Assign a role that includes the `ManageOrgSchedules` permission
- Save the API key shown after creation; it’s only displayed once, and you’ll use it as your `Bearer` token

Export the API key and your organization ID so the commands in this guide work as-is.

See the API authentication docs for more on service users and permissions.
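As a sketch of that export step, the snippet below sets both values as environment variables. The variable names `DEVIN_API_KEY` and `DEVIN_ORG_ID` are assumptions chosen for this example, not names the API requires, and the values are placeholders to replace with your own.

```shell
# Placeholder values: replace them before running any API call.
# The variable names are assumptions made for this guide's examples.
export DEVIN_API_KEY="your-service-user-api-key"
export DEVIN_ORG_ID="your-org-id"
```

Keeping the key in an environment variable, rather than pasting it into each command, also keeps it out of scripts you might commit.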
Write a playbook for the test run
Before creating the schedule, write a playbook that tells Devin exactly how to run your E2E suite and what to do with the results. Go to Settings > Playbooks and create a new playbook, or use Advanced Devin to generate one for you from a description of your test workflow. The rest of this guide assumes a Playwright suite, with failures filed as Linear tickets and a summary posted to the `#qa-results` Slack channel.

Note the playbook ID after saving; you’ll need it for the API call. You can find it in the URL when viewing the playbook (`app.devin.ai/.../playbooks/{playbook_id}`).

Create the nightly schedule via the API
Now use the `POST /v3/organizations/{org_id}/schedules` endpoint to register the schedule. The schedule in this guide runs every night at 2 AM UTC. The response includes a `schedule_id` you’ll use to manage this schedule later, so save it.

The `frequency` field uses standard cron syntax. Some useful alternatives:

| Pattern | When it runs |
|---|---|
| `0 2 * * *` | Every night at 2 AM UTC |
| `0 2 * * 1-5` | Weeknights only (Mon-Fri) |
| `0 6 * * 1` | Every Monday at 6 AM UTC |

Why 2 AM? You want tests to run after the last deploy of the day has settled on staging, but early enough that failures are visible when engineers start work. Adjust to match your team’s timezone and deploy cadence.

See the Create schedule endpoint docs for all available fields.
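A minimal sketch of the create call, written as shell functions so nothing is sent until you invoke them. Only the endpoint path, the `frequency` field, and the `schedule_id` in the response come from this guide. The base URL `https://api.devin.ai`, the `name`, `prompt`, and `playbook_id` field names, and the `DEVIN_*` variable names are assumptions to check against the Create schedule endpoint docs.

```shell
# Assumed base URL; confirm it against the API reference.
DEVIN_API_BASE="${DEVIN_API_BASE:-https://api.devin.ai}"

# Build the JSON request body.
# Only "frequency" is named in this guide; "name", "prompt", and
# "playbook_id" are assumed field names.
schedule_payload() {
  # $1 = cron expression, $2 = playbook ID
  printf '{"name":"nightly-e2e-staging","frequency":"%s","playbook_id":"%s","prompt":"Run the nightly E2E suite against staging and file tickets for failures."}\n' "$1" "$2"
}

# Send the request. Expects DEVIN_API_KEY and DEVIN_ORG_ID to be exported.
create_schedule() {
  curl -sS -X POST "$DEVIN_API_BASE/v3/organizations/$DEVIN_ORG_ID/schedules" \
    -H "Authorization: Bearer $DEVIN_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(schedule_payload "$1" "$2")"
}

# Preview the body without sending anything.
schedule_payload "0 2 * * *" "your-playbook-id"
```

To send the request and keep the ID, something like `SCHEDULE_ID=$(create_schedule "0 2 * * *" "$PLAYBOOK_ID" | jq -r '.schedule_id')` should work, assuming `jq` is installed and the response carries `schedule_id` at the top level.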
Verify the first run and tune the prompt
After the schedule fires for the first time, check the session to make sure Devin ran the tests correctly and the output matches what you expect.
- Open the Devin dashboard and find the session under Past Sessions — it will be tagged with the schedule name
- Did the Playwright suite execute? Were Linear tickets created for real failures (not flaky tests)?
- Check the `#qa-results` Slack channel for the summary message
If the first run needs adjusting, these are common fixes:
- Devin can’t access staging: Add your staging environment variables (like `STAGING_API_KEY` or `DATABASE_URL`) as organization secrets so they’re available in every scheduled session
- Too many tickets from flaky tests: Add a retry to your playbook: “Re-run any failing test once before filing a ticket. Only file tickets for tests that fail twice.”
- Tests take too long: Scope the suite (e.g., “Only run tests in `tests/critical/` and `tests/smoke/`”) or increase the session timeout
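If the playbook ends up invoking the Playwright CLI directly, the retry and scoping advice above could translate into a command like the one this sketch prints. `--retries` and directory arguments are standard Playwright CLI options; the `e2e_command` helper is hypothetical and only echoes the command so you can paste it into the playbook.

```shell
# Hypothetical helper: prints (does not run) a scoped Playwright command
# that retries each failing test once.
e2e_command() {
  # "$*" joins the directory arguments with spaces.
  printf 'npx playwright test %s --retries=1\n' "$*"
}

e2e_command tests/critical/ tests/smoke/
```

With retries enabled, Playwright reports a test that fails and then passes as flaky rather than failed, which matches the rule of only filing tickets for tests that fail twice.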
Manage schedules as code
Once your nightly run is stable, you’ll want to manage it alongside your other schedules: pausing during deploy freezes, updating the prompt when your test suite changes, or spinning up a second schedule for a different environment.

- Pause the schedule during a deploy freeze or maintenance window.
- Re-enable it when the freeze ends.
- List all schedules to audit what’s running.

For teams managing multiple schedules, ask Devin to build a CLI that syncs schedule definitions from a YAML config file, so you can version-control your schedules alongside your test configuration.
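The pause, re-enable, and list operations could be wrapped as shell functions along these lines. This is a sketch under several assumptions: the base URL, the use of `PATCH` on `/schedules/{schedule_id}` with an `enabled` flag, and `GET` on the collection path are guesses modeled on common REST conventions, not confirmed details of the v3 API. Only the `/v3/organizations/{org_id}/schedules` path comes from this guide.

```shell
# Assumed base URL and placeholder org ID; both are overridable.
DEVIN_API_BASE="${DEVIN_API_BASE:-https://api.devin.ai}"
: "${DEVIN_ORG_ID:=your-org-id}"

# Build the collection URL, or a single-schedule URL when an ID is given.
schedules_url() {
  # $1 = optional schedule ID
  printf '%s/v3/organizations/%s/schedules%s\n' \
    "$DEVIN_API_BASE" "$DEVIN_ORG_ID" "${1:+/$1}"
}

# Thin curl wrapper. $1 = HTTP method, $2 = URL, $3 = optional JSON body.
devin_api() {
  if [ -n "${3:-}" ]; then
    curl -sS -X "$1" "$2" \
      -H "Authorization: Bearer $DEVIN_API_KEY" \
      -H "Content-Type: application/json" \
      -d "$3"
  else
    curl -sS -X "$1" "$2" -H "Authorization: Bearer $DEVIN_API_KEY"
  fi
}

# Assumed request shapes: verify the method and the "enabled" field
# against the API reference before relying on them.
pause_schedule()  { devin_api PATCH "$(schedules_url "$1")" '{"enabled": false}'; }
resume_schedule() { devin_api PATCH "$(schedules_url "$1")" '{"enabled": true}'; }
list_schedules()  { devin_api GET "$(schedules_url)"; }
```

With the functions loaded, `pause_schedule "$SCHEDULE_ID"` before a freeze and `resume_schedule "$SCHEDULE_ID"` afterwards would cover the first two cases, and `list_schedules` the audit.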
