So it didn't warn the user that the secrets are still visible in the repo history and have to be rotated? It only made that revert?
Yep. When the credentials were used earlier in the session, they'd been scrubbed from the logs - so there's some checking, just not on the code that gets committed.
How did you catch it: a scanner, code review, or did you just notice it manually? I treat agent-generated diffs as untrusted by default now.
I was manually reviewing when I saw it. I was looking through the PR more out of interest than because I was worried there'd be a problem, tbh.
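For anyone who'd rather not rely on spotting this by eye, here's a rough sketch of the kind of check a scanner could run over a diff before it's committed. The `scan_diff` name and the three patterns are made up for illustration and aren't taken from any particular tool; real scanners use much larger rule sets and will catch things this won't.

```python
import re

# Hypothetical, deliberately small rule set for illustration only.
SECRET_PATTERNS = {
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters/digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM-style private key header.
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    # Something like password = "..." or api_key: '...' with a quoted value of 8+ chars.
    "generic_assignment": re.compile(
        r"(?i)\b(?:password|passwd|secret|token|api[_-]?key)\b"
        r"\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def scan_diff(diff_text):
    """Return (line_number_in_diff, rule_name) for each added line matching a rule."""
    findings = []
    for lineno, line in enumerate(diff_text.splitlines(), start=1):
        # Only look at added lines; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line[1:]):
                findings.append((lineno, name))
    return findings
```

Feed it the output of something like `git diff --cached` and block the commit if it returns anything. It only helps before the commit lands, though: once a secret is in history, a revert doesn't remove it and the credential still has to be rotated.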
LLMs are not intelligent machines; they are lying engines that predict the next most likely thing to do or say. If publishing your credit card details, home address and blood type meshes with the last thing it ingested, it'll do it.
"… though to be fair, it did sincerely apologize and promised never to do it again."