Luise Freese

You are holding GitHub Copilot Wrong!

Most developers think you can’t really use GitHub Copilot wrong. There is a chat interface that also lets you choose a model, so the obvious move is to pick a model (trust the LinkedIn bros on that), write prompts, get some code, adjust a few things, maybe ask a follow-up, rinse and repeat. Some say it’s like working with a junior dev who has amnesia; some say it feels productive, even efficient.

That feeling is exactly the problem.

What’s actually happening in most teams is not collaboration with Copilot, but a loop of prompting, re-prompting, and patching. You ask for something, the result is slightly off, so you add more detail. That fixes one thing and breaks another. So you clarify again, this time longer, more specific, more desperate. Eventually you either wrestle the output into shape or give up and finish it yourself.

From the outside, this looks like “using GitHub Copilot”. From the inside, it’s just interactive trial and error, with really nice autocomplete, too many rocket 🚀 emojis, and ridiculous assurances from Copilot that your system is now ✨ production ready ✨. Spoiler: it is not. Effectively, it means you are babysitting Copilot.

The common reaction is to assume the prompt wasn’t good enough. So we respond by adding more words. More context. More instructions. More constraints. The interaction grows longer, but not necessarily clearer. Each new prompt tries to compensate for the last one, without ever fixing the underlying issue.

The ugly truth is that this isn’t a prompting skill problem. It’s a structural one.

GitHub Copilot is not confused because it lacks intelligence or memory. It produces inconsistent results because we keep changing the shape of the problem while asking it to solve it. Intent, constraints, quality bars, and validation are all mixed together in free-form text, rewritten slightly differently every time. The model does exactly what it’s supposed to do in that situation; it adapts to the latest input and optimizes for that moment, without any stable frame of reference.

The problem with the context window

The context window makes this worse, not better. Copilot does not reason over “the conversation” in the way humans intuitively imagine. It sees a sliding snapshot of recent input, where newer instructions naturally outweigh older ones and subtle phrasing differences matter more than you think. Each follow-up prompt does not refine a shared understanding; it reshapes the problem again, often implicitly undoing assumptions you thought were still in place.

This is why re-prompting feels so fragile. You believe you are adding clarification, but from the model’s perspective you are changing the weighting of intent, constraints, and priorities inside a limited window of attention. Without a stable structure anchoring those elements, the model has no way to know what must remain invariant and what is allowed to change. So it does the only reasonable thing it can do: it treats the most recent input as the most important truth.

If you did this with humans, you would get the same outcome. Imagine changing the requirements, definition of done, and architectural constraints in every conversation, then being surprised that the result varies. We accept this behavior from ourselves, then blame the tool when it responds accordingly.

[Image: prompt meme]

This is why so many GitHub Copilot interactions feel oddly fragile. Small wording changes produce disproportionately different results. A follow-up meant to “just tweak one thing” suddenly derails the whole solution. You spend more time steering than building, and yet it still feels faster than starting from scratch, which makes the pattern hard to question.

Autocomplete reinforces this illusion. It works often enough to feel helpful, but not reliably enough to be trusted. For many developers, Copilot never graduates beyond that role: a slightly smarter snippet generator that occasionally needs correction. That’s not a failure of the technology, but the consequence of how it’s being used.

The real issue is that most GitHub Copilot usage is entirely ad-hoc (and chaotic as well). Every task starts from a blank chat, every prompt is handcrafted in the moment, and every success dies with the session. Nothing is reusable, nothing is stable, and nothing improves over time. The only learning that happens is in the developer’s head, not in the system. And with that, it contradicts all the good dev practices we established over the years.

Staying engaged doesn’t help

I even read on social media that you should stay actively engaged with GitHub Copilot while it is “thinking”, avoid doing anything distracting, and carefully watch how it generates code in real time. Oh boy, please just don’t.

This advice sounds productive, but it completely misses the point of what’s actually broken in most Copilot workflows. The problem is not that developers are insufficiently attentive during generation. The problem is that the interaction model itself is wrong.

Watching Copilot generate code line by line is just prompt-by-prompt coding with better animations. It optimizes for micro-level engagement while locking you into the same fragile loop: prompt, wait, inspect, fix, re-prompt. You might feel more “in control”, but you’re still reacting to output instead of designing the system that produces it.

Staying glued to the editor during those 30–60 seconds does not scale your thinking; it fragments it. You are forced to hold the entire problem in working memory while the model explores a solution space you did not properly constrain in the first place. The moment something goes slightly off, the only lever you have is another prompt. So you interrupt, redirect, clarify, and steer, over and over and over.

The idea that you should study the model’s intermediate steps also assumes those steps are stable, meaningful, and reusable. They are not. What you are watching is a super short-lived reasoning trace for a one-off interaction that will never happen in exactly the same way again. Any “intuition” you build there is about how to rescue a weak setup, not how to avoid it.

The cognitive drain

There is also a deeper cost that rarely gets mentioned. Prompt-by-prompt coding trains you to think in very small increments, because that’s the only scale the interaction supports: the next function, the next file, the next tweak. Architecture, constraints, invariants, and definitions of done all get deferred, because they don’t fit well into reactive steering. Over time, this erodes your ability to reason about systems at a higher level. That’s the brain-rot people are intuitively pointing at, even if they don’t always articulate it well.

A more mature approach flips this completely. Instead of babysitting the model while it types, you invest your attention upfront. You define specifications, constraints, and guardrails clearly enough that the model can work unattended for longer stretches. While it does that, you move on. You think about the next spec, the next boundary, the next decision. There is no “wait time” to fill with mindfulness exercises; waiting disappears as a concept.
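
To make that concrete: one way to move the guardrails out of the chat is a repository-level instruction file. The sketch below assumes your Copilot setup supports repository custom instructions (for example via a `.github/copilot-instructions.md` file); the constraints themselves are made-up placeholders, not recommendations for your project.

```markdown
# Copilot instructions for this repository (illustrative sketch)

## Architecture constraints
- All data access goes through the repository layer in `src/data`; no direct database calls from controllers.
- Public APIs are versioned; breaking changes get a new route instead of silently changing an existing one.

## Definition of done
- Every new function ships with unit tests and updated docs.
- No TODOs or commented-out code in committed changes.

## Validation
- Run the existing linter and test suite before declaring a change complete.
```

The file format matters less than the effect: intent, constraints, and quality bars live outside the conversation, stay identical across sessions, and can be reviewed and versioned like any other artifact instead of being re-typed into every prompt.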

Once you stop coding prompt by prompt, parallelism becomes possible. Not in the sense of typing faster, but in the sense of progressing multiple pieces of work at once without context loss. That is where the real productivity gains come from: not from watching tokens stream by, but from designing interactions that no longer require your constant supervision.

[Image: Prompt Gru meme]

To help you out, I put together a series: not about writing better prompts or finding clever phrasing, but about stepping out of the prompt-repair loop altogether. The shift is from reacting to outputs to designing interactions, and from improvising one-off prompts to engineering instructions.

In the next part, we’ll start by dismantling a surprisingly harmful assumption: the idea that GitHub Copilot is a chatbot you talk to, rather than a system you shape.
