Last month I wrote about my first experiences with Claude. I ended that post promising month two would get more interesting. Oh, it did.
This month I set myself a comparison activity. I’d already built a personal coaching companion as a Skill in Claude, just as a side project. A test case. Now I wanted to explore if I could build the same thing in Copilot, as an Agent, and what would that teach me about the tools. This would also help consider how the rest of us should think about all of this if we’re rolling it out to other humans? Some of what I learned was funny. Some was uncomfortable. And a key finding came from my husband, in five minutes, about a glass of water.
First, a confession about the word “agent”
I had been putting this off. I have been busy with work and family, and occasionally telling myself I’d get to it. That I needed time. That I should do some training first. But I think underneath all of that was something more honest: the word “agent” had been quietly putting me off. It sounded technical. I feel that “Skill” in Claude had sounded like something I could shape. But “Agent” in Copilot sounded like something I’d need to build.
The language we wrap around a feature changes whether people ever try it.
What got me going wasn’t training. It was a day off, a walk and some fresh air, and a today’s-the-day feeling. I opened Copilot, found New Agent, and decided to just give it a go.
The build itself was disarmingly easy
I deliberately didn’t watch any tutorials. I wanted to see how much of my Claude experience would transfer. A lot of it did. I described what I wanted – a coaching companion using a specific coach framework. An AI coach that is warm and curious, asks questions instead of giving advice – and Copilot translated that into structured instructions. I dragged in a Word document with the coaching method, added a few clarifying lines, then watched the agent build.
Note: This may have not been the ideal topic or ask for an Agent, but most of my work with Claude or Copilot is company Intellectual Property. My coaching training a few years ago, and the method, seemed like something I could explore as an interesting test case with AI to see how it could be setup and compared to the human experience. This isn’t something I intend to use or share more broadly.

Total time to a first version: less than an hour.
I’d delayed this for weeks because I thought it would be hard, and the build took less time than my walk that morning.
Then I tested it. Things got interesting.
I used the same opening prompt I’d tested with my Claude version: “I want to discuss an issue I have with my exercise”.
Claude had responded with something gentle and short. One question. One opening of the door. Copilot responded by explaining the coaching framework to me. Bolded heading. “C — Clarify the challenge.” Three questions in a bulleted list.

It wasn’t coaching me. It was telling me about coaching.
The patterns that emerged
Over the next hour I iterated. I was explicit – never name the framework, ask one question at a time, no bullets, no lists. Copilot kept finding new ways to default back to its natural self.
It loves to be thorough. Even with “one question per turn” locked down, it would write a long acknowledgement, a reframe, a meta-comment, and then the question.

Every response had the same shape. Acknowledgement, em-dash, empathetic reframe, question with an italicised emphasis word. Once you see the template, you can’t unsee it.

Bold text appeared whenever it wanted to seem clearer. When I said “what do you mean?”, it came back with “Thanks for sticking with it” and the next question in bold. The agent shouted at me when I asked for clarification! I genuinely laughed.

Every opening was the same scripted empathy line. “That sounds really frustrating”.
My family is largely neurodivergent and we have a running joke that this exact phrase is the autistic empathy script – the learned line you reach for when you want to show you care. AI empathy isn’t empathy; it’s pattern recognition wearing the costume. Sometimes you can see the seams.
Even good questions didn’t land. Later on I got a question a human coach might genuinely ask. But by then I’d watched the system process its way through every response, and seen “When you think about…” twice in a row. The good question arrived in a worn-out frame.

What I think is actually going on
Copilot is designed around the metaphor of a thorough, helpful, well-structured assistant. That is brilliant for so many things it does – drafting, summarising, briefing. Coaching needs the opposite. It’s sparse. It withholds. It varies. It pauses. A good coach has ten questions in their head and asks one, and the not-asking is half the skill.
What I kept bumping into wasn’t a prompting problem. It was a design philosophy problem. Every time I pushed Copilot toward less, it found a way to add more.
Copilot is more polished. Claude is more sophisticated. In coaching, those aren’t the same thing.
Then my husband had a glass of water
After some time testing the Agent, I asked my husband to try it. He picked his own topic: he doesn’t drink enough water across the day.
He lasted about three rounds, then came back to me baffled. “It made me do this whole long process. I just wanted to know what to try”. He hadn’t wanted to be coached. He’d wanted a suggestion. Put a bottle on your desk. Drink a glass with each meal. Done.
At this point the whole task or time on this reframed itself. I’d been testing whether the agent could deliver coaching. He’d exposed a bigger question: does the person on the receiving end even want to be coached? Coaching is a contract. The agent skips that contract because choosing to open it is treated as consent. He didn’t get bad coaching. He got coaching when he wanted advice. Perhaps I needed to be clearer what the Agent was for, and the purpose of kicking off the conversation or process.
What this exercise taught me
- Not every Skill should be rebuilt as an Agent. Coaching, with its dependence on pacing and withholding, doesn’t travel. It worked really well in one, not the other. I am curious if in the Microsoft world if Copilot Studio would result in a better experience, but not at this use level yet.
- The label changes everything. I delayed because “Agent” sounded harder than “Skill.” My husband bounced off because “coaching” wasn’t framed at the start. Same gap, twice, one afternoon.
- Onboarding matters more than the tool. Welcome screens, starter prompts, the line that says “this isn’t a place for quick answers” – that scaffolding is the change management. My Claude experience has been deliberate, considered, with lots of testing and exploring. I need to apply more time like this into Copilot.
- Building taught me more than using would have. Translating my Claude Skill into a Copilot Agent forced me to articulate what coaching actually is. No comparison article could have done that. This morning I wasn’t an Agent builder, this afternoon I can converse about the experience. Lots of learning!
I’m not retiring the Copilot Agent. There are categories of use case where its instincts toward thoroughness will be exactly right, and I want to test one of those next. But the coaching companion is staying in Claude. That’s where the shape of the conversation lives. I have found testing out this AI purpose in Claude was a smoother build, and more useful experience. As I said earlier, this isn’t a work solution or thing I intend to use, share, or benefit from. It doesn’t relate to my work, and the coaching method is the IP of the coach training provider. The purpose of this exercise was to push AI and explore how it would work with a process or method like this.
And the household joke about “that sounds really frustrating” is here to stay. I will never not hear it now.