I reviewed a paper recently and I had to agree not to use AI in any aspect of the reviewing process. So I didn't, but it felt strange, like being told I couldn't use a calculator to check the calculations in a paper. Large language models aren't perfect, and we shouldn't rely on them to catch every issue in a paper, but they've gotten very good and are certainly worth listening to. What we shouldn't do is have AI just write the review with little or no human oversight, and the journal probably wanted me to check the box to ensure I wouldn't do just that, though I'm sure some reviewers do and check the box anyway.
I've been playing with OpenAI's o3 model and color me impressed, especially when it comes to complexity. It solves all my old homework problems and cuts through purported P v NP proofs like butter. I've tried it on some of my favorite open problems; it doesn't make new progress, but it doesn't invent fake proofs either, and it does a good job laying out the state of the art, some of which I didn't even know about beforehand.
We now have AI at the level of new graduate students, and we should treat it as such. Sometimes we give grad students papers to review for conferences, but we need to look over what they write afterwards, and we should treat these new AI systems the same way. Just because o3 can't find a bug doesn't mean there isn't one. The analogy isn't perfect: we give students papers to review so they can learn the state of the art and become better critical thinkers, in addition to giving us help with our reviews.
We do have a privacy issue. Most papers under review are not meant for public consumption, and if uploaded to a large language model they could become training data and be revealed when someone asks a relevant question. Ideally, if we use AI for reviewing, we should use a system that doesn't train on our inputs, but both the probability of leakage and the amount of damage are low, so I wouldn't worry too much about it.
If you are an author, have AI review your paper before you submit it. Make sure you ask it for a critical review and for concrete suggestions. Maybe in the future we'll require all submitted papers to be AI-certified. It would make conference reviewers' jobs less onerous.
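To make that concrete, here is a minimal sketch of what a pre-submission critical review might look like using OpenAI's Python SDK. The file name, prompt wording, and model name are placeholders for whatever you actually use, and as with any provider, check the policy on whether your inputs are used for training.

```python
# A minimal sketch of asking an LLM for a pre-submission critical review.
# Assumes the OpenAI Python SDK (pip install openai) and an OPENAI_API_KEY
# set in the environment; "o3" and "draft.tex" are placeholders.
from openai import OpenAI

client = OpenAI()

# Read the draft you plan to submit.
with open("draft.tex", "r", encoding="utf-8") as f:
    paper = f.read()

prompt = (
    "Act as a critical but constructive conference reviewer. "
    "List the paper's main weaknesses, any gaps or possible bugs in the proofs, "
    "and concrete suggestions for improvement.\n\n" + paper
)

response = client.chat.completions.create(
    model="o3",
    messages=[{"role": "user", "content": prompt}],
)

print(response.choices[0].message.content)
```

The point isn't the particular API call; it's that asking explicitly for weaknesses and possible bugs, rather than a summary, gets you something closer to a real referee report.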
Humans alone or AI alone is just not the best way to do conference reviews. For now, when you do a review, working with an AI system as an assistant will lead to a stronger one. I suspect in the future, perhaps not that far off, AI alone might do a better job. We're not there yet, but we're further along than you'd probably expect.