Originally published on ACM SIGPLAN PL Perspectives.
Peer review is an essential aspect of academic research: when it works well, it provides a feedback loop that stimulates and rewards high-quality research, helping the subject advance. But, as we all know, it doesn't always work well. There never was a Golden Age when it did, of course, but it's become harder to maintain a shared view of what constitutes good reviewing as the subject grows, with larger numbers of submissions, larger program committees (PCs), and the shift to purely on-line discussion. All these have weakened the feedback loop that promoted high-quality reviewing, as PC members no longer have to explain their assessments live in front of their peers, often see nothing of the discussion of papers they did not review, and may not even have read all the submitted titles and abstracts. Instead, we load all that responsibility for maintaining review quality and common standards on our PC chairs – a nigh-on impossible task across hundreds of on-line discussions.
This note tries to spell out some of what constitutes good reviewing, to refresh our consensus and provide a little push towards improving it. It's focused on PL-related research (programming languages, semantics, and verification), but much is more generally applicable. I started these lists as POPL PC Chair in 2014 and expanded them on Twitter; I'll update the version on GitHub from time to time. Comments are welcome, and many thanks to all those who have commented on previous versions.
As reviewers, what do we have to decide? Fundamentally, whether publishing the paper will advance the subject in some substantial way. In more detail:
Then, as our venues are typically competitive, we have to weigh the paper against other submissions (how competitive they should be is a question, but we won't go into that here) – so reviewers need some sense of the level of contribution appropriate to the venue, so that their scores are broadly comparable.
Bad Reasons to Reject Good Papers
Reviewing is essentially a judgement call. We've discussed our review processes at great length over the years, and those processes do matter – we've tuned them, and I think in many ways improved them – but, fundamentally, peer review relies on informed judgements from a suitably expert and sensible group of people. So this note is not about process. Instead, it identifies some of the bad forms of argument that one sees again and again. If one sees one of these, or (especially!) if one finds oneself writing one, an alarm bell should ring…
Many of these boil down to having due respect for the authors and the work they've put in – they've typically spent between one and ten person-years on the submitted work, while the reviewer has spent maybe a day. Reviewers have to form a judgement, and sometimes will understand things better than the authors despite the mismatch of investment, but one should be cautious of assuming that one's first reaction is necessarily correct. One should also be cautious of arguing that substantially different research or exposition would be needed. The authors may already have been there and tried that, and in any case one has to review the paper at hand, not some hypothetical other. And one should be cautious of confusing suggestions (or whims!) with requirements; in writing a review, it's important to distinguish the main points justifying your view from random thoughts and suggestions.
They also highlight the need for reviewers to be dispassionate and aware of their own biases: to assess as best they can whether the subject would be best served by accepting the paper, not how much they personally like it.
Of course, none of them are absolutes – even the last reason above can be a legitimate complaint in specific circumstances, e.g. if that uncited paper renders the submitted work moot.
Another bad reason arises during discussion, after the first reviews have been written. At the end of the process, one has to arrive at accept/reject decisions, but during the process it's all too easy to regard the current scores as an objective assessment, e.g. saying "this is a 'B' paper". The whole point of the discussion and author response is to consider whether reviews are wrong or miscalibrated – otherwise we'd just order papers by the original scores.
Good Reasons to Reject Bad PL Papers
On the other side, not all papers are good, unfortunately, and we shouldn't shy away from rejecting poor-quality work, lest the subject be contaminated with bogosity. Returning to the above list, in order of decreasing importance:
A clear "no" for any of these should rule the paper out from any serious venue. In more detail:
When arguing that a paper should be rejected, or summarising a PC decision for the authors, it may be useful to identify exactly which of these (or other) reasons justify that.
An aside on reviewer selection
The above is about how we review as individuals. Before that comes the selection of reviewers, which typically depends on the PC chair(s), on the process that they and the surrounding organisation set up, and on careful bidding by PC members. Finding enough reviewers with appropriate expertise and good judgement for each paper, e.g. aiming for two experts per paper, is the most important thing we can do to improve our decisions.
This note is not about process, but I do also want to remark that the ways we have typically implemented lightweight double-blind submission, good though that is for avoiding first-impression bias, have also made it harder to find such reviewers. Most reviewers are now taken from a relatively small PC or ERC pool, perhaps with automated assignment, rather than found by exploiting the knowledge and contacts of the whole PC to identify the best experts.
So what next? The review process will always be imperfect, but we might socialise these guidelines, discussing and improving them, to encourage more thoughtful reviewing. PC chairs (or SIGPLAN as a whole) might choose to incorporate some version of them into the guidance we give to reviewers – an edited version has recently been used by Amal Ahmed and Jan Vitek as part of the OOPSLA 2022 reviewing guidelines. And as individual reviewers, and in review discussions, we might thereby focus just a bit more clearly on the legitimate and useful reasons to accept and reject papers.
Peter Sewell is a professor of computer science at the University of Cambridge. He's reviewed a few papers, and had some of his own accepted and rejected.