Best Practices
How to get the most out of Devin
Deciding on a candidate use case
There are a couple of ways to identify the best use case for your organization. Read the guidance below or skip to the Examples section.
Best Enterprise Use Cases |
---|
Large, high-business value projects that can be sliced into isolated & repetitive subtasks. |
Scoped tasks that take less than 90 minutes of engineering time. |
Backwards Compatible Tasks |
Devin’s Ideal Requirements
Requirement | ✅ |
---|---|
High volume of repetitive subtasks (slices) | [ ] |
Subtasks of junior engineer-level difficulty | [ ] |
Isolated & Incremental subtasks | [ ] |
Objective & Verifiable subtasks | [ ] |
(Recommended) Minimal project dependencies | [ ] |
Crafting Devin’s work
Example Scenario | Reliability Concern | Task Type |
---|---|---|
Asking Devin for complex, net-new features (even if the work is repetitive!) | Unlikely to be sufficiently reliable at scale. | Tall & Deep |
Asking Devin to handle simple, well-defined tasks | Reliable and effective. | Wide & Shallow |
A large backlog of simple, horizontal changes (e.g. SonarQube issues) is much more likely to generate a high ROI when the work is scaled out across many sessions.
⚠️ What to Slice (Critical!)
Migrations, refactors, modernizations, or technical debt backlogs are all great use cases for Devin.
For the rest of this section, let’s assume we’re working on a code migration.
We must be able to split the migration into isolated slices, where each task gets tackled by an individual Devin session.
Verification
A slice should be the smallest atomic unit of the project.
Example Slices |
---|
File |
Notebook |
Module |
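As a rough illustration of file-level slicing, the sketch below lists one slice per matching file under a project root. The function name `plan_slices` and the default `*.py` pattern are assumptions made for this example; a notebook- or module-level migration would use a different pattern or grouping.

```python
from pathlib import Path
from typing import List


def plan_slices(root: str, pattern: str = "*.py") -> List[str]:
    """Return one slice per matching file, as sorted paths relative to root."""
    base = Path(root)
    return sorted(
        p.relative_to(base).as_posix()  # stable, platform-independent slice IDs
        for p in base.rglob(pattern)
        if p.is_file()
    )
```

Each returned path can then be handed to its own session, which keeps every slice small and independent of the others.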
Requirement | Details |
---|---|
⏳ Time Limit | Each slice must take under 90 minutes of human engineering work. |
✅ Verification | Must have a way to verify code changes, such as running tests, building the code, CI checks, or a custom verification script. |
Devin must have a way to verify whether it has succeeded at a task.
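A custom verification script can be as simple as running a list of checks and reporting a single pass/fail result. The following is a minimal sketch, not a prescribed format: the function name `verify_slice` is hypothetical, and the commands passed to it should be the project's real test and build commands.

```python
import subprocess
import sys
from typing import Optional, Sequence


def verify_slice(commands: Sequence[Sequence[str]], cwd: Optional[str] = None) -> bool:
    """Run each check in order; succeed only if every command exits with 0."""
    for cmd in commands:
        result = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
        if result.returncode != 0:
            # Surface the failing command and its stderr so the failure is actionable.
            print(f"FAILED: {' '.join(cmd)}\n{result.stderr}", file=sys.stderr)
            return False
    return True
```

For example, `verify_slice([["pytest", "tests/test_module.py"]])` would return `True` only if that test run exits cleanly, assuming `pytest` is installed and the path exists in the project.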
Avoid having too many dependencies or platforms to interact with. Devin is best at coding!
Requirement | Description |
---|---|
✅ Isolation | Each slice must be isolated and backwards compatible. |
⚡ Parallel Execution | Leverage Devin’s parallelism to work on many slices at once, so the migration as a whole is completed incrementally, slice by slice. |
🔍 Human Review | After human review, each PR is merged into main in turn. |
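The fan-out pattern described in the table can be sketched as follows. This does not use any real Devin API: `worker` is a hypothetical stand-in for whatever starts a session on one slice and returns a reference to the resulting PR, and `run_slices` is a name chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def run_slices(
    slices: Iterable[str],
    worker: Callable[[str], str],
    max_workers: int = 8,
) -> Dict[str, str]:
    """Run one worker call per slice in parallel; map each slice to its result.

    A failing slice is recorded and does not stop the others, which is only
    safe because slices are assumed to be isolated from one another.
    """
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, s): s for s in slices}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:  # keep going; report the failure per slice
                results[name] = f"error: {exc}"
    return results
```

The returned mapping gives reviewers a per-slice list of PRs to check and merge, plus any slices that need to be retried.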
Overall Model
Principle | Description |
---|---|
🎯 Slice-Level Reliability | Devin is trained to be maximally reliable at the individual slice level for every use case. |
⚡ Scaling Consideration | When parallelizing over thousands of slices, maintaining high reliability is critical. |
⚠️ Error Impact | Even a small margin of error can lead to many incorrect changes at scale. |
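To make the error-impact point concrete, the arithmetic can be sketched as below. The reliability figures are illustrative inputs, not measured values, and the model assumes every slice succeeds or fails independently. Under that assumption, 5,000 slices at 99% per-slice reliability yield about 50 incorrect changes in expectation.

```python
def expected_incorrect(n_slices: int, reliability: float) -> float:
    """Expected number of incorrect slices, assuming independent slices."""
    return n_slices * (1.0 - reliability)


def prob_all_correct(n_slices: int, reliability: float) -> float:
    """Probability that every slice is correct, assuming independence."""
    return reliability ** n_slices


# Illustrative per-slice reliabilities only; substitute observed rates.
for r in (0.99, 0.999):
    print(r, round(expected_incorrect(5000, r), 1), prob_all_correct(5000, r))
```

The second function shows why human review remains necessary at scale: the chance that every one of thousands of slices is correct shrinks quickly even when per-slice reliability is high.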
Overall Model
✅ Requirement | Description |
---|---|
Clear step details | Provide clear detail for every step in each slice. |
End-to-end reference | A detailed write-up or video of the full process is highly effective. |
Before/After examples | Share several examples of before/after code changes (i.e., input/output pairs). |
Dependency access | Ensure Devin has access to all required dependencies for each slice. |
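One way to keep these requirements consistent across slices is to generate each slice's instructions from a small template. The sketch below is purely illustrative: `SliceBrief` and its fields are hypothetical names, and the rendered text layout is an assumption rather than a required format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SliceBrief:
    """Hypothetical container for the instructions given for one slice."""
    target: str
    steps: List[str]
    examples: List[Tuple[str, str]] = field(default_factory=list)  # (before, after)
    verify_command: str = ""

    def render(self) -> str:
        """Render the brief as plain text: target, numbered steps, examples, check."""
        lines = [f"Target: {self.target}", "Steps:"]
        lines += [f"  {i}. {step}" for i, step in enumerate(self.steps, start=1)]
        for n, (before, after) in enumerate(self.examples, start=1):
            lines += [f"Example {n} (before):", before, f"Example {n} (after):", after]
        if self.verify_command:
            lines.append(f"Verify with: {self.verify_command}")
        return "\n".join(lines)
```

Rendering every slice from the same template makes it harder to forget a step, a before/after example, or the verification command for any individual slice.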
Examples
Devin excels at ongoing technical debt tasks (e.g. PR review or QA testing), provided they can be split into slices.
Migrations, modernizations, and refactors are often good use cases for Devin, so long as they are incremental. For example, a migration that requires upgrading the entire repository to the new system at the same moment, rather than one slice at a time, is not recommended.
Case Study: Nubank Migration