2025-10-08
@hardin.bsky.social
How I use it:
I do not use it to mark your assignments.
The University position statement does not say much either way about AI use.
What should the policy be?
(from https://substack.com/home/post/p-160131730)
Emphasize craftsmanship. There is a difference between what is “simple” and what is “easy”.
Simple is hard work. Vibe coding is easy.
AI rewards having a structured, careful development process.
via https://www.arthropod.software
Use the AI to map the codebase, identify dependencies, help you understand the problem.
It is key to leverage your insights here to keep the AI on track.
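A prompt for this step might look something like the following (a sketch; the wording and the feature named are illustrative, not from the source):

```markdown
Read through this repository and give me a short overview:
- the main modules and what each one is responsible for
- the key dependencies between them
- anything relevant to the CSV import feature we are about to add
Do not change any code; this is read-only exploration.
```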
Get the AI to create a detailed implementation plan, perhaps as a Markdown checklist.
This separates the what from the how.
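Such a plan might look something like this (a sketch; the feature and steps are invented for illustration):

```markdown
# plan.md: CSV import feature
- [ ] 1. Add an `Importer` interface and a stub implementation
- [ ] 2. Parse and validate the uploaded CSV (unit tests first)
- [ ] 3. Map validated rows onto the existing `Order` model
- [ ] 4. Wire the importer into the upload endpoint
- [ ] 5. Surface row-level errors back to the user
```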
Humans now conduct a thorough design review of the plan.
Here we apply our design skills, looking for overly coupled components, poor modifiability, and obvious design smells.
The AI translates the plan into code. Kent Beck suggests this only works if you have a rigorous test suite (i.e., TDD). The tools will optimize, which includes lying to you!
via arthropod.software
You should create a project-wide context file. For Copilot, create .github/copilot-instructions.md in your project/repo root.
What goes into that file?
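A minimal sketch of what might go in it (the project details here are hypothetical):

```markdown
# Copilot instructions for this repository

- This is a Python 3.12 project; use type hints and follow the existing style.
- Run tests with `pytest`; every new module needs unit tests.
- Use the logging helpers in `app/logging.py` rather than print statements.
- Never edit files under `migrations/` by hand.
```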
Use the AI to brainstorm/delineate steps in the building process. Give it your project context AND your team’s context (skills, gaps). Work one step at a time.
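For example, the brainstorming might open with a prompt along these lines (a sketch, not the source's wording):

```markdown
We are adding a CSV import feature to our order-management app.
Project context: <paste the overview the AI produced earlier>.
Team context: strong on Python and testing, little front-end experience.
Ask me one question at a time so we can pin down the requirements step by step;
do not move on until I have answered.
```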
Then:
Now that we’ve wrapped up the brainstorming process, can you compile our findings into a comprehensive, developer-ready specification? Include all relevant requirements, architecture choices, data handling details, error handling strategies, and a testing plan so a developer can immediately begin implementation. Write this to the file spec.md.
Get your teammates to check the proposed spec. You can also potentially ask other LLMs to revise it for you.
Then ask the AI to create a detailed set of implementation steps. Save this as something like prompt-plan.md. Add a todo.md file to let the AI check steps off as it finishes them.
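A sketch of how these two files might be laid out (file names from the notes above; the contents are illustrative):

```markdown
# prompt-plan.md
## Prompt 1
Implement step 1 of the plan: the `Importer` interface and a stub.
Write the unit tests first, then the code. Stop when the tests pass.
## Prompt 2
Implement step 2: CSV parsing and validation against spec.md.
...

# todo.md
- [x] Step 1: Importer interface and stub
- [ ] Step 2: CSV parsing and validation
- [ ] Step 3: map rows onto the Order model
```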
Paste the prompts into Copilot CLI. It uses the Claude Sonnet 4.5 model currently, so you could also use Aider, Claude, etc.
Aside: model vs tool distinction.
You need a way to confirm the code does what you want/hope.
Get the AI - maybe a different one - to create some (integration) tests for what you are looking for.
Always ask for tests as part of the normal AI workflow, but keep the integration tests apart (read: no shared context).
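One way to keep the contexts separate (a sketch): start a fresh session, or use a different tool, and give it only the spec, not the implementation conversation:

```markdown
Attached is spec.md for a feature we have just implemented.
Write black-box integration tests for the behavior it describes.
You have not seen the implementation, so test only against the spec.
```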
What were the warning signs that told you the AI was going off track?
[Figure: context Venn diagram, via https://www.philschmid.de/context-engineering]
As Simon Willison writes,
The entire game when it comes to prompting LLMs is to carefully control their context—the inputs (and subsequent outputs) that make it into the current conversation with the model.
So how can we do that?
An LLM is trained on vast amounts of general data, like all of Wikipedia and all of GitHub.
But our problem is a particular one, and we don’t know if the LLM’s distribution matches ours.
So we need to steer it to the space of solutions that apply to us.
Another term for this is context engineering.
(derived from the article here)
Context is not free… every token influences the model’s behavior.
A lot of the challenge with software engineering is understanding what the current program is supposed to be doing—what Peter Naur called the “theory” of the program.
The LLM is no different. We need to tell it what to do.
RAG (retrieval-augmented generation): try to figure out the most relevant documents for the user’s question and stuff as many of them as possible into the prompt. – Simon Willison
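Concretely, a RAG-style prompt ends up looking something like this (a sketch; the retrieved snippets would come from your own documents or code):

```markdown
You answer questions about our billing service.
Use only the context below; say "I don't know" if the answer is not covered.

### Context (retrieved documents)
[1] docs/invoicing.md: invoices are generated nightly by the invoice job...
[2] docs/refunds.md: refunds over $500 require manual approval by a team lead...

### Question
How are refunds over $500 handled?
```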
An eval is a test of whether the AI’s output is what you had hoped for.
It isn’t a unit test, since unit tests are repeatable and deterministic (ideally). But an eval should give you a sense that the output is what you expect.
Evals seem to be like acceptance tests or integration tests. You can use them to check the prompt outputs directly (e.g., do all these files exist? Are there security bugs?)
This is an emerging space. One approach is described here.
It uses the PromptFoo tool to manage the workflow.
The idea is that you define a set of test cases and run your prompts against them.
Key to each eval is its assertion: what should we look for in the model’s output? In PromptFoo there are several types of assertions.
**capability**: It gives accurate advice about asset limits based on your state.
**question**: I am trying to figure out if I can get food stamps. I lost my job 2 months ago, so have not had any income. But I do have $10,000 in my bank account. I live in Texas. Can I be eligible for food stamps? Answer with only one of: YES, NO, REFUSE.
**__expected**: contains:NO (i.e., the assertion is that the model’s answer must contain the string NO)
Problem: I don’t want the AI to speculate on how to access my API - I have a precise set of calls it can use.
I don’t want it to reinvent regular expressions – just use sed, grep, awk etc.
How to tell the AI what is available? Need to connect it to the API so it can discover what is possible.
“NEVER try to edit a file by running terminal commands unless the user specifically asks for it.” (Copilot instructions)
“MCP solves this problem by providing a standardized way for AI models to discover what tools are available, understand how to use them correctly, and maintain conversation context while switching between different tools. It brings determinism and structure to agent-tool interactions, enabling reliable integration without custom code for each new tool.”

Neil Ernst ©️ 2024-5