Stop using GitHub Copilot as a chatbot!
Part 0 showed why constant prompting, re-prompting, and steering GitHub Copilot feels fragile. Not because Copilot is unreliable, but because nothing in the interaction is stable. Every follow-up reshapes the problem again, and the system has no way to know what must remain invariant.
This post answers the only questions that matter after that: what exactly do you need to change, how do you do it in practice, and why does this suddenly make Copilot behave like a serious tool instead of a slot machine?
The core shift is simple, but uncomfortable. You stop putting decisions into chat, and you start putting them into files.
What you are actually changing
In most Copilot workflows today, decisions live in the worst possible places. Some are in your head. Some are half-expressed in a chat message you typed while thinking. Some are implicit corrections you made three prompts ago. Copilot only ever sees fragments of these decisions, and those fragments keep changing.
The fix is not better wording or a more advanced model, but relocating decisions into stable artifacts that survive longer than a single interaction.
The first file you create: Rules that never change
Before you generate anything, create a place in the repository where global rules live. This is not documentation for you or your teammates, but the constraint system for Copilot.
Create this folder if it does not exist:
.github/instructions/
Inside it, create a file called:
global.md
This file defines what must always be true, regardless of task.
For example:
# Global development rules
These rules apply to all code generated in this repository.
## Invariants
* Existing public APIs must not change.
* Naming conventions must be preserved.
* Errors must be handled explicitly.
* No commented-out code is allowed.
## Architecture
* Reuse existing patterns.
* No new abstractions without need.
* Async APIs only.
## Quality bar
* Readable over clever.
* No dead code.
* No speculative features.
## Testing
* New logic requires tests.
* Tests must be deterministic.
Nothing in this file is negotiable during generation. This single file already removes a surprising amount of randomness.
The second file you create: The task, frozen in time
Now you stop starting work in the chat. Before you ask Copilot to generate anything, you write down the task in a dedicated file. Not while thinking, not mid-conversation. Create this folder:
.github/specs/
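With both folders in place, the relevant part of your repository looks like this:

```
.github/
├── instructions/
│   └── global.md
└── specs/
```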
For each non-trivial task, create one spec file. For example:
user-export.md
# User export feature
## Intent
* Allow administrators to export users as CSV.
## Scope
* Backend logic only. No UI changes.
## Must do
* The export includes id, name, and email.
* Large datasets are handled through pagination.
## Must not do
* No UI changes.
* No new dependencies.
## Constraints
* Follow existing repository patterns.
* Async APIs only.
## Definition of done
* The code builds successfully.
* Tests are included.
* All global rules are respected.
Once generation starts, this file is frozen. If something is wrong, you edit the spec and run again. You never “fix it in chat”. This is where most people feel friction the first time. That friction is the old habit dying.
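To make the contract concrete, here is a minimal sketch, in TypeScript and purely illustrative, of the kind of code this spec constrains Copilot toward. The names `userRepository`, `listUsers`, and the `User` shape are hypothetical stand-ins for whatever your codebase already provides.

```typescript
// Hypothetical sketch of a spec-compliant export. All names are placeholders.

interface User {
  id: string;
  name: string;
  email: string;
}

interface Page<T> {
  items: T[];
  nextCursor: string | null;
}

// Assumed to already exist in the repository (per "no new dependencies"
// and "reuse existing patterns"); declared here only to keep the sketch
// self-contained.
declare const userRepository: {
  listUsers(cursor: string | null, pageSize: number): Promise<Page<User>>;
};

const PAGE_SIZE = 500;

// Quote every field so commas or quotes in names cannot corrupt the CSV.
function toCsvRow(user: User): string {
  const escape = (value: string) => `"${value.replace(/"/g, '""')}"`;
  return [user.id, user.name, user.email].map(escape).join(",");
}

export async function exportUsersAsCsv(): Promise<string> {
  const rows: string[] = ["id,name,email"];
  let cursor: string | null = null;

  try {
    // Fetch page by page so large datasets never hit the data source
    // in a single unbounded query.
    do {
      const page = await userRepository.listUsers(cursor, PAGE_SIZE);
      for (const user of page.items) {
        rows.push(toCsvRow(user));
      }
      cursor = page.nextCursor;
    } while (cursor !== null);
  } catch (error) {
    // Errors are surfaced explicitly rather than swallowed, per global rules.
    throw new Error(`User export failed: ${(error as Error).message}`);
  }

  return rows.join("\n");
}
```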
How you now use Copilot (exactly)
Only after those two files exist do you open Copilot chat. You do not paste the spec. You do not explain the task again. You do not add “clarifications”. You write exactly this:
Implement the feature described in
.github/specs/user-export.md
following all rules in .github/instructions/global.md.
Then you stop typing. If the result is wrong, the fix is not another prompt. The fix is correcting global.md or user-export.md.
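For example, if the export should also include a creation date, you do not type that into chat. You amend the spec (a hypothetical edit) and run again:

```diff
 ## Must do
 * The export includes id, name, and email.
+* The export includes the creation date.
 * Large datasets are handled through pagination.
```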
This is the first moment where Copilot stops needing to be babysat.
Why this already changes behavior
Copilot is extremely good at executing within boundaries and extremely bad at inferring them. When rules only exist in chat, they inevitably decay. When tasks only exist in conversation, they drift. When “done” only exists in your head, supervision becomes mandatory.
By moving rules and tasks into files, you remove ambiguity from the context window. Copilot no longer has to decide which instruction still matters. It can see, clearly and consistently, what must hold, which is why you no longer need to re-prompt over and over again.

Extending the same idea to review
At this point, code generation is stable, but review is still manual. That is the next bottleneck to tackle.
Review has rules, and those rules already exist in your standards and architecture; they are just not written down. So let’s codify them:
Create a review spec:
.github/specs/review-backend.md
# Backend review
## Goal
Evaluate generated code against global rules and the feature spec.
## Must verify
* No breaking API changes.
* Error handling is explicit.
* Async patterns are consistent.
* Architecture is respected.
## Must report
* Any deviation from the spec.
* Any assumptions made.
* Any potential risks.
## Output format
* Short summary.
* Explicit list of deviations.
* Explicit list of risks.
Run review in a separate Copilot interaction, with fresh context:
Review the generated code using
.github/specs/review-backend.md.
Copilot is dramatically more honest when it is not reviewing its own output in the same session. Your job shifts from checking every line to deciding whether the reported deviations are acceptable.
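The shape of the report might look something like this (hypothetical output, for illustration only):

```
Summary: Implementation matches the spec with one deviation.

Deviations:
* Page size is hardcoded instead of reusing an existing constant.

Risks:
* Very large exports are built fully in memory before being returned.
```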
Refactoring becomes routine
Once rules, specs, and review criteria exist, refactoring stops being dangerous. Create a refactor spec:
.github/specs/refactor.md
# Refactoring task
## Intent
Improve internal structure without changing behavior.
## Invariants
* External APIs must not change.
* Observable behavior must remain identical.
## Constraints
* No new features.
* No performance regressions.
## Definition of done
* Existing tests still pass.
* Review spec reports no new deviations.
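The prompt follows the same minimal pattern as before:
Refactor the code following
.github/specs/refactor.md
and all rules in .github/instructions/global.md.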
As a result: Copilot refactors and reviews; you approve exceptions.
How parallelism finally becomes real
Because work is fully specified in files, it no longer lives in your head. While Copilot implements one spec, it can review another and refactor a third. Meanwhile, you write the next spec.
This is not you multitasking or task-switching, but actual parallel progress. The productivity gain comes from removing the need to supervise.
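Concretely, that can mean three independent Copilot sessions, each driven by a single file-based prompt:
1. Implement the feature described in .github/specs/user-export.md following all rules in .github/instructions/global.md.
2. Review the generated code using .github/specs/review-backend.md.
3. Refactor the code following .github/specs/refactor.md.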
Never fix output before fixing the definition. If you violate this, the entire system collapses back into prompting.
The litmus test
After a Copilot session, ask yourself this:
If I deleted the entire chat history, could I reproduce this result using only the files in the repository?
If the answer is yes, you built a system. If the answer is no, you are still improvising.
At this point, you are no longer “using GitHub Copilot”. You are running an AI-assisted development process. Ready for the next part?