
My Exact AI Coding Workflow, Step by Step

8 min read · AI, Workflow, Developer Tools, Productivity, Tutorial · Anmol Mahatpurkar

The earlier posts in this series were the theory.

This is the actual operating procedure.

If I am building a real feature with AI today, this is roughly the sequence I follow. Not every task uses every step with the same intensity, but this is the baseline workflow I trust.

The point of the workflow is not to make AI look magical.

It is to make the output predictable enough that I can move quickly without pretending first drafts are perfect.


Step 1: Pick a Bounded Task

The workflow starts before the prompt.

I want a task that is clear enough to hand off and review.

That means I avoid starting with something like:

"rethink the entire editor architecture"

Instead I want something more like:

"add slash-command insertion for image and code blocks to the editor"

If the task is too broad, the model improvises too much and I lose control of the review surface.

Bounded tasks make everything else easier:

  • the prompt is clearer
  • the scenarios are clearer
  • the diff is clearer
  • the review is faster

This sounds mundane, but bad task definition poisons the whole chain.


Step 2: Dump Context Fast

Once the task is scoped, I front-load context.

Usually I dictate this rather than typing it, because dictation makes it much easier to include all the details I would otherwise trim.

I will mention things like:

  • where the relevant files probably are
  • what the current UX does
  • what I want changed
  • what edge cases I am already worried about
  • what should definitely not regress

The goal is not elegance.

The goal is transmission.

If I know something that would help the model make a better decision, I want it in the prompt instead of in my head.


Step 3: Ask for Understanding Before Changes

This is still one of the most important steps in the whole process.

Before I let the model write code, I want it to inspect the relevant part of the codebase and explain it back to me.

A typical first prompt sounds like:

Go inspect the editor block system before making changes. Read the current block renderer, the slash command menu, and the existing block operations. Then explain back how insertion currently works, what files are involved, and where you think image/code block insertion should plug in. Feel free to ask clarifying questions before implementing anything.

This gives me two things:

  • the model builds context before acting
  • I can verify its understanding before trusting the implementation

If the explanation is wrong, I correct it early.

That is far cheaper than correcting a bad implementation later.


Step 4: Write a Thin Spec When the Task Deserves It

For small changes, I skip this.

For anything with moving parts, I usually write a short spec or structured note first.

Not a giant document.

Just enough to define:

  • files or modules
  • interfaces
  • key behaviors
  • non-negotiable constraints

For example:

## Slash Command Insertions

### Scope
- Add slash-command insertion for image and code blocks
- Reuse existing insertion flow for text and heading blocks

### Constraints
- Do not change persisted block schema
- Follow the current command menu component
- Keep keyboard flow identical to existing commands

### Scenarios
1. Typing `/image` inserts an image block below the current block
2. Typing `/code` inserts a code block and focuses it
3. Escape closes the command menu without changing content
4. Existing slash commands still behave exactly the same

This is usually enough to keep the model inside the right boundaries.
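To make that concrete, here is a minimal sketch of the kind of change a spec like this tends to produce. Everything in it is hypothetical — the `SlashCommand` shape, the trigger strings, and the `resolveCommand` helper are illustrations, not a real editor's API — but it shows the narrow, registry-style extension the constraints are pushing toward.

```typescript
// Hypothetical slash-command registry. Assumes the existing editor already
// routes text/heading insertions through a structure roughly like this.
type BlockType = "text" | "heading" | "image" | "code";

interface SlashCommand {
  trigger: string;          // what the user types, e.g. "/image"
  blockType: BlockType;     // block to insert below the current one
  focusAfterInsert: boolean;
}

const commands: SlashCommand[] = [
  // Existing commands stay untouched (constraint: keyboard flow identical).
  { trigger: "/text", blockType: "text", focusAfterInsert: true },
  { trigger: "/heading", blockType: "heading", focusAfterInsert: true },
  // New commands reuse the same insertion flow (the scope of the spec).
  { trigger: "/image", blockType: "image", focusAfterInsert: false },
  { trigger: "/code", blockType: "code", focusAfterInsert: true },
];

function resolveCommand(input: string): SlashCommand | undefined {
  return commands.find((c) => c.trigger === input);
}
```

The point is not this exact shape. The point is that a thin spec tends to produce a thin diff: two new entries in an existing table rather than a new subsystem.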


Step 5: Define the Scenarios Explicitly

I treat scenarios as the real contract.

If the scenarios are vague, the implementation will be vague.

If the scenarios are sharp, the model has a concrete target and I have a concrete review standard.

This is also where I often ask the model to help:

"List any edge cases or scenarios I am missing."

It is usually good at finding a few that matter.

By this point, the task has three layers of structure:

  • contextual prompt
  • current-state understanding
  • behavior contract

That is a much better starting point than "build feature X."
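One way I keep scenarios sharp is to treat them as data rather than loose prose, so the implementation prompt, the dry run, and the final review can all reference the same items by ID. A sketch, using the scenario list from the spec above — the `Scenario` structure is my own convention, not anything standard:

```typescript
// Scenarios as a checkable contract rather than loose prose.
interface Scenario {
  id: string;
  given: string;
  expect: string;
}

const scenarios: Scenario[] = [
  { id: "S1", given: "type /image", expect: "image block inserted below current block" },
  { id: "S2", given: "type /code", expect: "code block inserted and focused" },
  { id: "S3", given: "press Escape in the menu", expect: "menu closes, content unchanged" },
  { id: "S4", given: "use any existing slash command", expect: "behavior identical to before" },
];

// The same list can be turned into a dry-run prompt later in the workflow.
function dryRunPrompt(list: Scenario[]): string {
  return list
    .map((s) => `${s.id}: given "${s.given}", verify "${s.expect}"`)
    .join("\n");
}
```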


Step 6: Let It Implement

Only after the first five steps do I actually ask for code.

Then I get out of the way.

I do not like hovering while the model writes. That tends to make me overreact to implementation details before I have seen the result as a whole.

I would rather let it produce a coherent first pass and then review the output.

This is where a lot of the leverage comes from.

Once the model has enough context and enough structure, implementation becomes the comparatively easy part.


Step 7: Dry Run the Scenarios

Before runtime testing, I often ask for a dry run.

That means:

Walk through each scenario step by step in the code and tell me where anything breaks.

This catches things like:

  • missing rollback paths
  • stale state
  • bad focus handling
  • incomplete error handling
  • branch logic that never actually leads to the expected UI

It is cheap and it works.

The dry run is one of the best bug filters I have.


Step 8: Run Real Feedback Loops

After the dry run, I want actual feedback.

That means some combination of:

  • linter
  • type checker
  • tests
  • browser interaction
  • console inspection

This is where the workflow stops being about generation and becomes about convergence.

I will often give instructions like:

Run the relevant checks. Then test the scenarios in the browser. If anything fails, fix it and retest. Report back with what changed and anything you are still uncertain about.

This is the difference between "the model wrote code" and "the model participated in a development loop."

That second version is the one I care about.
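The convergence loop itself is simple enough to write down. This sketch is deliberately generic: `runChecks` stands in for whatever combination of linter, type checker, and tests a project actually has, and the retry cap is an arbitrary choice, not a magic number.

```typescript
// A generic convergence loop: run checks, attempt a fix on failure, retest.
type CheckResult = { passed: boolean; failures: string[] };

function converge(
  runChecks: () => CheckResult,
  attemptFix: (failures: string[]) => void,
  maxRounds = 3, // arbitrary cap; in practice I stop when fixes stop helping
): CheckResult {
  let result = runChecks();
  for (let round = 0; !result.passed && round < maxRounds; round++) {
    attemptFix(result.failures);
    result = runChecks();
  }
  return result;
}
```

In the real workflow the agent plays both roles: it runs the checks and proposes the fixes. The loop matters because it turns one-shot generation into an iteration with a stopping condition.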


Step 9: Review the Diff Like a Pull Request

Once the model has converged as much as it can, I review the result like I would review a PR from a teammate.

I am looking for:

  • architectural weirdness
  • missed edge cases
  • places where the behavior drifted from the spec
  • unnecessary scope expansion
  • anything that feels under-tested

I am not trying to prove I could have written it differently.

I am trying to decide whether this output is safe, correct, and aligned.

That is an important mindset shift.

You get much better results from AI when you treat review as review, not as a chance to reassert authorship over every line.


Step 10: Use a Second Model When It Matters

For higher-risk work, I sometimes ask another model to review the output.

That can be surprisingly effective.

Not because the second model is always better, but because it usually has different blind spots.

My prompt is usually direct:

Another model implemented this feature. Review the diff critically. Find bugs, edge cases, design issues, and anything that does not match the intended behavior.

Then I feed that review back into the original loop or handle the fixes myself.

I do not do this for every tiny change.

But for bigger features, it is a useful extra filter.


Step 11: Merge, Then Reset Context

After a feature is done, I prefer to reset cleanly.

Long-running sessions degrade.

Context drifts.

The model starts anchoring to earlier assumptions or irrelevant history.

So once a task is complete, I would usually rather:

  • merge it
  • start a fresh session
  • rebrief the next task cleanly

Fresh context with a good prompt beats stale context with accumulated noise surprisingly often.

That is especially true once you start running multiple agents in parallel.


What This Looks Like End to End

If I compress the workflow into one linear sequence, it is basically this:

  1. scope the task tightly
  2. dump the relevant context
  3. ask the model to understand first
  4. write a thin spec if needed
  5. define scenarios
  6. implement
  7. dry run
  8. run checks and runtime validation
  9. review the diff
  10. optionally cross-review with another model
  11. merge and reset

That is the workflow.

It is not complicated because each step is individually clever.

It works because each step reduces a different class of error.

Together, they create a system that is much more reliable than simple one-shot prompting.


Where the Leverage Really Comes From

The biggest misconception about this workflow is that the leverage comes from "AI writes code fast."

That is true, but it is not the whole story.

The bigger leverage comes from the fact that once the workflow is stable:

  • direction gets faster
  • iteration gets faster
  • validation gets faster
  • review gets more focused

That is a much bigger shift than autocomplete.

It changes what part of the job dominates your day.

I spend much less time typing implementation and much more time:

  • defining behavior
  • shaping tasks
  • reviewing outcomes
  • making architecture decisions

That is why this workflow feels like a genuine change in how software gets built, not just a new productivity trick.


The Most Important Part

If I had to collapse everything into one sentence, it would be this:

AI becomes reliable when you stop treating it like a generator and start treating it like a participant in a full engineering loop.

That is the whole system.

And once you feel how much better that works, it is very hard to go back.
