Because it would end up favoring research that may or may not be better than the honestly submitted alternative that doesn't make the cut, thereby lowering the quality of published papers for everyone.
It ends up favoring research that may or may not be better than the honestly reviewed alternative, thereby lowering the quality of published papers in journals where reviewers tend to rely on AI.
That can't happen unless reviewers dishonestly base their reviews on AI slop. If they do, then it ends up favoring random papers regardless of quality. This is true whether or not authors add countermeasures against the slop.
Only reviewers, and no one else, can ensure that higher-quality papers get accepted.
I expect a reviewer using AI tools to query papers to do a half-decent job even if they don’t check the results… assuming the AI hasn’t been prompt-injected. They’re actually pretty good at this.
Which is to say, if four selections were to be made from ten submissions, I expect humans and AI reviewers to select the same winning 4 quite frequently. I share the outrage at reviewers deferring their expertise to AI, on grounds of dishonesty among other reasons. But I concur with the people who do it that it would work most of the time in selecting the best papers of the bunch.
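As a rough sanity check of that intuition, here's a toy Monte Carlo sketch (every parameter below is an assumption for illustration, not data from any real venue): each paper gets a true quality score, the human and the AI reviewer each see an independently noisy version, and we count how often their top-4 picks out of 10 agree.

```python
import random

# Toy model: how often do a human and an AI reviewer pick the same
# top 4 out of 10 papers, if both see noisy versions of true quality?
# All parameters are illustrative assumptions, not measurements.

def pick_top4(true_quality, noise_sd, rng):
    # A reviewer's score is true quality plus Gaussian noise.
    scores = [q + rng.gauss(0, noise_sd) for q in true_quality]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:4])

def simulate(trials=10_000, n_papers=10, noise_sd=0.2, seed=0):
    rng = random.Random(seed)
    overlap_total = 0
    exact_match = 0
    for _ in range(trials):
        # True quality drawn uniformly; what matters is the spacing
        # between papers relative to the reviewer noise.
        quality = [rng.random() for _ in range(n_papers)]
        human = pick_top4(quality, noise_sd, rng)
        ai = pick_top4(quality, noise_sd, rng)
        overlap = len(human & ai)
        overlap_total += overlap
        exact_match += (overlap == 4)
    print(f"mean overlap: {overlap_total / trials:.2f} of 4")
    print(f"identical top-4 sets: {exact_match / trials:.1%}")

if __name__ == "__main__":
    simulate()
```

The interesting knob is noise_sd relative to the spread of true quality: with small noise the two top-4 sets agree almost always, and as noise grows agreement decays toward the 1.6-of-4 expected overlap of two independent random picks of 4 from 10 (4 × 4/10). "Mostly works most of the time" is a claim that reviewer noise is small relative to the quality gaps between papers.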
I do not expect any positive correlation between papers that are important enough to publish and papers that embed prompt injections to pass review. If anything, I would expect a negative correlation: cheating papers are probably trash.