I built a Copilot agent that turns Jira tickets into PRs

Pointing an AI coding agent at a legacy PHP codebase with no guardrails is how you get a 3 AM incident. This post walks through exactly what I built — a GitHub Copilot coding agent that reads a Jira ticket, proposes a fix, waits for my sign-off, and only then opens a PR and eventually deploys to QA. Two human checkpoints. Zero surprises.

The codebase context: PHP 5.6 to 8.x migration

The application this runs against is large — hundreds of thousands of lines, born in the PHP 5.6 era, recently migrated to PHP 8.2. That migration introduced a long tail of subtle breakages: dynamic properties now emit deprecation notices, str_contains exists where we used to hand-roll it, and nullable type declarations are scattered inconsistently. Jira tracks every reported regression as a ticket. Before this agent, a developer would read the ticket, grep through the codebase, make a fix, push a branch, and open a PR manually. Repetitive, slow, and easy to skip steps under deadline pressure.

If you’ve gone through a similar migration, the PHP 7 migration notes are a useful reference for the class of issues that tend to survive version bumps and surface later.

What I actually built: the full stack

The setup has three moving parts:

1. Jira MCP server — I’m running the official Atlassian MCP server locally. It exposes Jira as a tool the agent can call: read a ticket, fetch its description, pull acceptance criteria, list linked issues. No hand-rolled REST wrappers. If you want to understand why MCP beats rolling your own integrations, building custom MCP servers for local development workflows covers the mechanics well.

2. GitHub Copilot coding agent (VS Code, agent mode, June 2026 build) — This is the brain. It can read files, run terminal commands, edit code, and call MCP tools. I trigger it in the chat panel.

3. gh CLI — The agent uses this for branch creation, PR creation, and triggering the QA deploy workflow. No GitHub REST API calls in custom code; gh handles auth and gives the agent a stable, predictable interface.

The configuration that makes it safe lives in two files:

.github/copilot-instructions.md   # custom instructions loaded automatically
AGENTS.md                          # project memory the agent MUST read first

AGENTS.md: the durable memory layer

This file is the most important part of the setup. It’s not documentation — it’s a curated briefing the agent reads before touching anything. Here’s an excerpt of what’s in mine:

# AGENTS.md — Read this before every task

## Architecture
- Entry point: public/index.php (legacy bootstrap, do NOT modify)
- Routing: src/Router.php — custom, non-Laravel
- DB layer: src/DB/LegacyPDO.php — wrapper around raw PDO, no ORM

## PHP 8.2 migration status
- Dynamic properties: patched in 80% of models, remainder tracked under the Jira label `php82-dynamic-prop`
- Nullable types: NOT uniformly applied — do not add `?` types speculatively

## Directories you must not touch
- /legacy_modules/billing/ — pending rewrite, frozen
- /legacy_modules/auth/   — security review in progress

## Testing
- Run: composer test -- --filter <ClassName>
- PHPUnit 10, config: phpunit.xml
- No tests = no PR. Add or update a test for every fix.

## Deployment
- QA deploy: gh workflow run deploy-qa.yml --ref <branch>
- Do NOT trigger staging or production deploys

I add to this file every time the agent makes a wrong assumption. It’s the feedback loop that makes the system smarter over time without retraining anything. Think of it as a project-specific system prompt I control completely.
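Written rules are easier to trust when something mechanical backs them up. The sketch below is not part of my setup as described; it shows one way the "Directories you must not touch" section could double as a pre-PR check. It assumes the heading text and the `- /path/ ...` bullet format from the excerpt above, that frozen paths contain no spaces, and that changed paths arrive relative to the repo root. The function names `frozen_dirs` and `check_frozen` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: fail if any changed file lives under a directory that AGENTS.md freezes.
# Assumes the "## Directories you must not touch" heading and "- /path/ ..." bullets.

# Print the frozen paths from AGENTS.md, one per line, without the leading slash.
frozen_dirs() {
  local agents_file="${1:-AGENTS.md}"
  awk '
    /^## /                { in_section = ($0 == "## Directories you must not touch"); next }
    in_section && /^- \// { sub(/^- \//, ""); print $1 }
  ' "$agents_file"
}

# Read changed file paths on stdin; print offenders and return 1 if any is frozen.
check_frozen() {
  local agents_file="${1:-AGENTS.md}"
  local dirs file dir status=0
  dirs="$(frozen_dirs "$agents_file")"
  while IFS= read -r file; do
    # Unquoted on purpose: one frozen path per word (paths assumed space-free).
    for dir in $dirs; do
      case "$file" in
        "$dir"*) echo "frozen: $file"; status=1 ;;
      esac
    done
  done
  return "$status"
}
```

A possible invocation, assuming `develop` is the base branch as in the PR example later in this post: `git diff --name-only develop...HEAD | check_frozen AGENTS.md`. Because the list is parsed from AGENTS.md itself, the agent's briefing and the check cannot drift apart.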

The custom instruction file

.github/copilot-instructions.md sets the agent’s behaviour contract:

## Workflow for every task

1. Read AGENTS.md fully before writing a single line of code.
2. Call the Jira MCP tool to fetch the ticket. Extract: summary, description, acceptance criteria, reporter.
3. Investigate the codebase. Identify the affected files. Do NOT edit yet.
4. Report your findings to the developer: what is broken, which files are involved, what you plan to change.
5. WAIT for explicit approval ("yes", "go ahead", "lgtm") before making edits.
6. Implement the fix. Run the relevant tests. Show the diff.
7. WAIT for a second approval before running any `gh` command.
8. On approval: create branch, commit, push, open draft PR with the Jira ticket number in the title.
9. After PR is human-reviewed and approved in GitHub: run the QA deploy workflow.

## Hard rules
- Never modify files listed under "Directories you must not touch" in AGENTS.md.
- Never open a PR to main or master directly.
- Never add speculative refactoring outside the ticket scope.
- If you are unsure whether a change is in scope, ask. Do not assume.

That “never add speculative refactoring” rule saved me twice in the first week. A large language model will happily “improve” five surrounding functions while fixing one bug. On a legacy codebase, that’s how you introduce regressions.
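Two of the written rules can also be checked by a script before any `gh` command runs: the ban on targeting main or master, and "No tests = no PR" from AGENTS.md. The following is a minimal sketch, not something the instruction file itself does. The function names are hypothetical, and the test-file pattern (a `tests/` directory with files ending in `Test.php`) is an assumption about the repo layout that you would adjust.

```shell
#!/usr/bin/env bash
# Sketch of a pre-PR check for two written rules.
# Function names and the test-file pattern are assumptions, not part of the setup.

# Return 0 if the base branch is allowed as a PR target.
base_allowed() {
  case "$1" in
    main|master) return 1 ;;
    *) return 0 ;;
  esac
}

# Read changed paths on stdin; return 0 if at least one looks like a PHPUnit test.
# Assumes tests live under tests/ and end in Test.php.
diff_has_test() {
  grep -Eq '(^|/)tests/.*Test\.php$'
}

# Combine both checks; changed paths are read from stdin, the base is argument 1.
pre_pr_check() {
  local base="$1"
  if ! base_allowed "$base"; then
    echo "refusing: PR base '$base' is protected" >&2
    return 1
  fi
  if ! diff_has_test; then
    echo "refusing: no test file in the diff" >&2
    return 1
  fi
}
```

Usage might look like `git diff --name-only develop...HEAD | pre_pr_check develop`. Scope creep is harder to detect mechanically, so the "no speculative refactoring" rule still depends on the diff review at the second gate.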

This mirrors the philosophy I wrote about in setting up Cursor rules for consistent AI-assisted coding — explicit do/don’t instructions pay off more than prompt engineering at call time.

A real example: ticket XYZ-891

Ticket summary: “Dynamic property warning on Order model causes 500 on order history page in PHP 8.2”

I typed into the Copilot chat panel:

Work on XYZ-891

The agent called the Jira MCP tool, read the ticket, then reported back:

Ticket XYZ-891 describes a PHP 8.2 dynamic property deprecation on src/Models/Order.php. The property $statusLabel is assigned in getStatusLabel() without being declared in the class definition. This will throw an ErrorException in strict mode. Acceptance criteria: no warnings on order history page, existing unit tests pass. I plan to add a typed property declaration public string $statusLabel = ''; to the class. No other files need changing. Ready to proceed?

I said “go ahead.” It made the edit, ran composer test -- --filter OrderTest (tests passed), showed me the diff, and asked for a second approval. I reviewed the diff (four lines, exactly what it described) and said “open the PR.”

gh pr create \
  --title "fix(XYZ-891): declare dynamic property on Order model for PHP 8.2 compat" \
  --body "Fixes XYZ-891. Declares \$statusLabel as typed property. Tests: OrderTest passes." \
  --draft \
  --base develop
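Step 8 of the workflow (branch, commit, push, draft PR) follows a fixed naming pattern, so it can be sketched as a small script. This is an illustration of the convention visible in the commands in this post, not the exact sequence the agent runs. The helper names, the ticket-key pattern, the `origin` remote, and the assumption that changes are already staged are all mine.

```shell
#!/usr/bin/env bash
# Sketch of the branch/commit/push/draft-PR step. Helper names are hypothetical.

# Build a branch name such as fix/XYZ-891-order-dynamic-property.
branch_name() {
  local key="$1" slug
  slug="$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' | sed 's/^-//; s/-$//')"
  printf 'fix/%s-%s\n' "$key" "$slug"
}

# Build a title such as "fix(XYZ-891): declare dynamic property ...".
pr_title() {
  printf 'fix(%s): %s\n' "$1" "$2"
}

# Rough check that the key looks like a Jira issue key (PROJECT-123); an approximation.
valid_ticket_key() {
  printf '%s\n' "$1" | grep -Eq '^[A-Z][A-Z0-9]+-[0-9]+$'
}

# Create the branch, commit staged changes, push, and open a draft PR against develop.
# Assumes the fix and its test are already staged and the remote is named origin.
open_draft_pr() {
  local key="$1" slug_source="$2" description="$3" branch title
  valid_ticket_key "$key" || { echo "bad ticket key: $key" >&2; return 1; }
  branch="$(branch_name "$key" "$slug_source")"
  title="$(pr_title "$key" "$description")"
  git switch -c "$branch" &&
    git commit -m "$title" &&
    git push -u origin "$branch" &&
    gh pr create --draft --base develop --title "$title" --body "Fixes $key."
}
```

For example, `branch_name XYZ-891 "Order dynamic property"` prints `fix/XYZ-891-order-dynamic-property`. Deriving both the branch and the title from the ticket key keeps the Jira link intact even when the agent paraphrases the summary.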

After a teammate approved the PR in GitHub, I told the agent “deploy to QA” and it ran:

gh workflow run deploy-qa.yml --ref fix/XYZ-891-order-dynamic-property
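In my flow the "PR approved, now deploy" condition is checked by me before I type the instruction. If you would rather have the command check it as well, one option is to ask GitHub for the PR's review decision first. The sketch below uses `gh pr view --json reviewDecision`; the wrapper name `deploy_qa_if_approved` is hypothetical, and it assumes the base branch requires reviews, since the field may come back empty otherwise.

```shell
#!/usr/bin/env bash
# Sketch: run the QA deploy only if GitHub reports the branch's PR as approved.
# The wrapper name is hypothetical; deploy-qa.yml is the workflow named in AGENTS.md.

deploy_qa_if_approved() {
  local branch="$1" decision
  # reviewDecision is typically APPROVED, CHANGES_REQUESTED, REVIEW_REQUIRED, or empty.
  decision="$(gh pr view "$branch" --json reviewDecision --jq '.reviewDecision')" || return 1
  if [ "$decision" != "APPROVED" ]; then
    echo "not deploying: review decision is '${decision:-none}'" >&2
    return 1
  fi
  gh workflow run deploy-qa.yml --ref "$branch"
}
```

Called as `deploy_qa_if_approved fix/XYZ-891-order-dynamic-property`, it refuses to dispatch the workflow unless the review has actually landed, which turns step 9 of the instruction file into something enforceable.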

Total time from “Work on XYZ-891” to QA deploy: 18 minutes. My active involvement: two approvals and a diff review.

Why this works on legacy code specifically

Scoped instructions beat general capability. The agent knows PHP 8.2 inside out, but it doesn’t know that our billing module is frozen. AGENTS.md is the delta between general knowledge and project-specific reality.

Two gates, not zero. Full autonomy on a legacy codebase is a liability. The pre-edit gate stops the agent from going down the wrong path entirely. The pre-PR gate is the last line of defence before code enters review. This is similar to the human-in-the-loop pattern I used in automating DevOps tasks with Claude Code Routines.

MCP + gh beats custom integrations. I’m not maintaining a Jira API wrapper or a GitHub REST client. The MCP server handles Jira auth and query formatting. The gh CLI handles GitHub auth and workflow dispatch. When Atlassian updates their API, the MCP server maintainer deals with it, not me.

AGENTS.md compounds. Every wrong assumption the agent makes becomes a new line in AGENTS.md. After two weeks and ~40 tickets, the agent almost never asks a clarifying question I’ve already answered in that file.
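Capturing a correction only works if it takes seconds. I edit the file by hand, but a tiny helper along these lines would do the same job. It is a sketch under assumptions: the "## Lessons learned" section name is hypothetical (my AGENTS.md excerpt has no such heading), the function name is made up, and notes are simply appended to the end of the file.

```shell
#!/usr/bin/env bash
# Sketch: append a dated note to AGENTS.md under a "## Lessons learned" heading.
# The heading and function name are hypothetical; notes always go at the end of the file.

add_agent_note() {
  local note="$1" agents_file="${2:-AGENTS.md}"
  # Create the section once, the first time a note is added.
  grep -qx '## Lessons learned' "$agents_file" 2>/dev/null ||
    printf '\n## Lessons learned\n' >> "$agents_file"
  printf -- '- %s: %s\n' "$(date +%Y-%m-%d)" "$note" >> "$agents_file"
}
```

For instance, `add_agent_note "Do not simplify shared helpers; check call sites first"` records the lesson as a dated bullet at the moment the agent gets it wrong.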

You can complement this with automated AI code review in GitHub pull requests — I run that on top of agent-generated PRs for a second automated opinion before human review.

Honest lessons

Where it exceeded expectations: Routine PHP 8.2 compatibility fixes — dynamic properties, deprecated functions, missing return types on obvious cases — are genuinely faster. The agent reads the ticket, finds the exact line, fixes it, writes a test. A task that took 25-35 minutes takes under 20, most of which is my review time.

Where the guardrails were essential: On the third ticket, the agent decided a helper function it was calling “could be simplified” and rewrote it. That helper was called from 34 other places. The “no speculative refactoring” rule now catches this — but without it in the instructions, it would have sailed into the PR.

What to tell another engineer before they start: Write AGENTS.md before you write a single prompt. List the frozen directories, the test commands, the deploy restrictions, the architectural oddities. The agent will follow written rules more reliably than conversational corrections. Start with tickets that have clear acceptance criteria — ambiguous tickets produce ambiguous fixes. And treat the first ten runs as calibration, not production velocity.