The fastest way to lose a team's trust in a tool is to be confidently wrong once. After that, every answer gets double-checked by a human, and the tool has made work, not saved it.
Cite the source
So we do not let the system be vague. Not "this area is risky." Instead: this file, this function, this line, this owner. Something a person can verify in thirty seconds.
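One way to enforce this is to make the citation part of the data model, so a finding without one cannot be surfaced. The sketch below is a minimal illustration, not a description of any particular system: the `Citation` and `Finding` types, the `is_grounded` check, and the example file, function, and owner names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Citation:
    """Where a claim can be checked. Field names are illustrative assumptions."""
    file: str
    function: str
    line: int
    owner: str


@dataclass(frozen=True)
class Finding:
    claim: str
    citation: Optional[Citation] = None


def is_grounded(finding: Finding) -> bool:
    """Accept a finding only if it points at a file, function, line, and owner."""
    c = finding.citation
    if c is None:
        return False
    has_text = bool(c.file.strip() and c.function.strip() and c.owner.strip())
    return has_text and c.line > 0


# Hypothetical examples: a vague claim and a specific, checkable one.
vague = Finding(claim="This area is risky.")
specific = Finding(
    claim="Retry loop has no upper bound.",
    citation=Citation(
        file="billing/retry.py",
        function="charge_with_retry",
        line=42,
        owner="payments-team",
    ),
)

print(is_grounded(vague))     # False
print(is_grounded(specific))  # True
```

This check only confirms that a citation is present and well formed. Whether the cited line actually supports the claim is a separate question, and still one a person can answer quickly.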
Check the checker
Where the stakes are high, one agent describes what the code does and a second, independent agent re-reads the code and judges that description. The second agent shares none of the first one's context, so it cannot inherit the first one's assumptions and is well placed to catch anything the first one made up.
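The shape of that two-agent check can be sketched in a few lines. Everything here is an assumption made for illustration: the agents are plain callables, `checked_description` is a hypothetical helper, and the stub verifier uses a toy rule (every underscore-style name in the description must appear in the code) in place of a real second model.

```python
from typing import Callable, Optional

Describer = Callable[[str], str]       # code -> description
Verifier = Callable[[str, str], bool]  # (code, description) -> is it supported?


def checked_description(
    code: str, describe: Describer, verify: Verifier
) -> Optional[str]:
    """Return the description only if the independent verifier accepts it."""
    description = describe(code)
    # The verifier receives only the code and the claim,
    # never the describer's prompt, history, or reasoning.
    if verify(code, description):
        return description
    return None


# Stub agents standing in for real model calls.
CODE = "def on_failure(err):\n    log_error(err)\n"


def honest_describer(code: str) -> str:
    return "Calls log_error with the error."


def made_up_describer(code: str) -> str:
    return "Calls send_email on every failure."


def stub_verifier(code: str, description: str) -> bool:
    # Toy rule: every underscore-style name in the description must be in the code.
    names = [word.strip(".,") for word in description.split() if "_" in word]
    return all(name in code for name in names)


print(checked_description(CODE, honest_describer, stub_verifier))
print(checked_description(CODE, made_up_describer, stub_verifier))  # None
```

The point of the design is the narrow interface: because the verifier is handed only the code and the description, it has no access to whatever led the first agent astray.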
Grounding is not a feature you add at the end. It is the difference between a demo and something a team will actually rely on.