How to Write a Methods Section That Survives Peer Review
Reviewers reject papers they cannot reproduce. Here's how to write a methods section that is detailed enough to rebuild the work from, ordered so a stranger could follow it, and defensible against the questions reviewers actually ask.
1. What the Methods Section Is Actually For
The methods section has exactly two jobs: let a competent stranger reproduce your work, and
convince a skeptical reviewer that your results mean what you claim. Everything else — elegant
prose, motivation, history — belongs in the introduction. If a sentence does not serve
reproduction or credibility, cut it.
Reviewers reach for "reject" when they cannot tell what you did. A vague methods section reads
as either carelessness or concealment, and reviewers assume the less charitable option. The bar
is concrete: could someone in your field, with your section and nothing else, rebuild the
experiment and expect the same outcome? If the honest answer is no, you have rewriting to do.
Two Tests Every Methods Section Must Pass
- Reproduction: a peer could rebuild it from your text alone.
- Credibility: a skeptic cannot find an obvious confound you ignored.
If either fails, the strongest results section will not save the paper.
2. Order It So a Stranger Could Follow
Write the methods in the order someone would execute them, not the order you discovered them.
A reliable structure: data or materials first, then preprocessing, then the model or procedure,
then the experimental setup (hardware, hyperparameters, splits), then evaluation metrics. Each
subsection should hand the reader cleanly to the next.
Use subsection headings generously. A reviewer skims first, reads second — clear headings let
them find the one detail they want to check without rereading the whole section. If your field
has a "Materials and Methods" convention, follow it exactly; reviewers notice when you deviate
from what they expect, and it costs you goodwill before they reach your results.
3. Justify Choices, Don't Just Describe Them
The weakest methods sections are pure description: "We used a learning rate of 0.001 and trained
for 100 epochs." The reviewer's immediate question is "why?" — and an unanswered why is an opening
for rejection. Strong methods sections describe and justify in the same breath: "We used a learning
rate of 0.001, selected by grid search on the validation set (Appendix B)."
You do not need a citation or experiment for every choice, but you do need to signal that the
choice was deliberate. "Following standard practice in [domain]" with a citation handles the
conventional decisions. Reserve real justification for the choices a reviewer could plausibly
attack: your baseline selection, your evaluation metric, your data split, anything non-standard.
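One way to make that justification cheap to write is to log every configuration you tried, so the appendix table falls out of the search itself. The sketch below is a minimal illustration, not a prescribed procedure: `validation_score` is a hypothetical stand-in for "train a model and score it on the validation set", and the grid values are placeholders.

```python
import itertools

def validation_score(lr: float, epochs: int) -> float:
    """Hypothetical stand-in: a real project would train a model here and
    return its score on the VALIDATION set (never the test set).
    This toy function simply peaks at lr=0.001, epochs=100."""
    return -abs(lr - 0.001) * 1000 - abs(epochs - 100) / 100

# Placeholder search grid; report the actual grid you searched.
grid = {"lr": [0.01, 0.001, 0.0001], "epochs": [50, 100]}

# Keep every configuration and its score, not just the winner,
# so the full search can be reported in an appendix.
results = []
for lr, epochs in itertools.product(grid["lr"], grid["epochs"]):
    results.append(
        {"lr": lr, "epochs": epochs, "val_score": validation_score(lr, epochs)}
    )

best = max(results, key=lambda r: r["val_score"])
print("selected:", best["lr"], best["epochs"])
print("configurations tried:", len(results))
```

Keeping `results` rather than only `best` is the design choice that matters: the sentence "selected by grid search on the validation set" is only checkable if the reader can see what the search covered.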
Choices Reviewers Will Question
- Why this baseline and not the obvious stronger one?
- Why this metric — does it actually measure what you claim?
- How were the train/validation/test sets split, and could there be leakage?
- Were hyperparameters tuned on the test set? (If yes, the paper is dead.)
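The split and leakage questions are easier to answer when the split itself is a small, seeded function you can cite. The following is a sketch under simple assumptions: examples are identified by unique string ids, and `split_ids` and `assert_no_leakage` are hypothetical helper names, not part of any library.

```python
import random

def split_ids(ids, seed, val_frac=0.1, test_frac=0.1):
    """Shuffle with a fixed seed, then cut into disjoint train/val/test lists."""
    ids = sorted(ids)                 # fixed starting order, independent of input order
    random.Random(seed).shuffle(ids)  # local RNG, so global random state is untouched
    n_test = int(len(ids) * test_frac)
    n_val = int(len(ids) * val_frac)
    test = ids[:n_test]
    val = ids[n_test:n_test + n_val]
    train = ids[n_test + n_val:]
    return train, val, test

def assert_no_leakage(train, val, test):
    """Fail loudly if any example id appears in more than one split."""
    a, b, c = set(train), set(val), set(test)
    assert not (a & b) and not (a & c) and not (b & c), "overlapping splits"

# Illustrative ids only; a real project would load them from the dataset.
example_ids = [f"doc-{i}" for i in range(100)]
train, val, test = split_ids(example_ids, seed=13)
assert_no_leakage(train, val, test)
print(len(train), len(val), len(test))
```

This only catches exact id overlap. If several examples come from the same underlying source (one patient, one document, one user), you would likely need to split on the source id instead, and say so in the text.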
4. Report Enough to Reproduce — The Checklist
Reproducibility is not a virtue you mention; it is a list of specifics you provide. The single
most common reason a result cannot be reproduced is a missing number the authors thought was
obvious. Assume nothing is obvious. Many top venues now ship a reproducibility checklist with the
submission form — read it before you write, not after, because it tells you exactly what reviewers
were told to look for.
Put exhaustive detail in an appendix or supplementary file so the main text stays readable, then
reference it explicitly. "Full hyperparameters are in Appendix C" is far stronger than burying
forty numbers in a paragraph nobody can parse. Release code and data when you can — a working
repository answers more reviewer questions than any amount of prose.
Reproducibility Minimums
- Data: source, version, size, and exact preprocessing steps.
- Splits: how train/val/test were created, with seeds if random.
- Model: architecture, all hyperparameters, and how they were chosen.
- Compute: hardware, runtime, and number of runs averaged.
- Code: a link, or a clear statement of why it is unavailable.
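The minimums above can also be kept as a machine-readable record alongside the paper. What follows is one possible sketch, not a standard format: the function name `build_manifest`, the field names, and every `"TODO"` value are placeholders chosen for illustration.

```python
import json
import platform
import sys

REQUIRED = ("data", "splits", "model", "compute", "code")

def build_manifest(**sections):
    """Collect the reproducibility minimums into one JSON-serializable record.
    Raises ValueError if any required section is missing or empty."""
    missing = [name for name in REQUIRED if not sections.get(name)]
    if missing:
        raise ValueError(f"manifest sections missing or empty: {missing}")
    manifest = {name: sections[name] for name in REQUIRED}
    # Record the software environment automatically rather than from memory.
    manifest["environment"] = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    return manifest

# Every value below is a placeholder; replace with what you actually ran.
manifest = build_manifest(
    data={"source": "TODO", "version": "TODO", "size": "TODO",
          "preprocessing": ["TODO"]},
    splits={"method": "TODO", "seed": "TODO"},
    model={"architecture": "TODO", "hyperparameters": {"TODO": "TODO"},
           "selection": "TODO"},
    compute={"hardware": "TODO", "runtime": "TODO", "runs_averaged": "TODO"},
    code={"url_or_reason_unavailable": "TODO"},
)
print(json.dumps(manifest, indent=2))
```

Because the function refuses to build a record with an empty section, a forgotten category fails at write-up time rather than in a review.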
5. Statistics and Baselines: The Reviewer's Attack Surface
Two things draw reviewer fire more than anything else: weak baselines and missing statistics.
Compare against the strongest method a reviewer would expect, not a straw man — if you skip the
obvious strong baseline, the first reviewer will name it and ask why it is absent, and you will
spend your rebuttal defending the omission instead of arguing your strengths. When you cannot beat a baseline, say so and
explain the tradeoff; honesty reads better than a suspicious omission.
Report variance, not just point estimates. A single run is an anecdote. Average over multiple
seeds, report standard deviation or confidence intervals, and state how many runs you used. If
you claim method A beats method B, a reviewer will ask whether the difference survives the noise —
have the answer in the paper, ideally with a significance test appropriate to your data.
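As a concrete illustration of reporting variance and checking a difference against noise, here is a small sketch using only the Python standard library. The scores are made-up numbers for demonstration, and the paired sign-flip permutation test is just one reasonable choice assumed here; the right test depends on your data and on whether runs are paired by seed.

```python
import random
import statistics

def summarize(scores):
    """Mean, sample standard deviation, and run count for one method."""
    return statistics.mean(scores), statistics.stdev(scores), len(scores)

def paired_permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided sign-flip test on the per-seed differences a[i] - b[i].
    Returns a Monte Carlo estimate of the p-value."""
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(statistics.mean(diffs))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis, each difference is equally likely
        # to have either sign.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(statistics.mean(flipped)) >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)

# Made-up scores: one entry per random seed, same seeds for both methods.
method_a = [0.81, 0.83, 0.80, 0.84, 0.82]
method_b = [0.78, 0.80, 0.79, 0.80, 0.79]

for name, scores in (("A", method_a), ("B", method_b)):
    mean, std, n = summarize(scores)
    print(f"method {name}: {mean:.3f} +/- {std:.3f} (n={n} runs)")

p_value = paired_permutation_test(method_a, method_b)
print(f"paired sign-flip test p-value: {p_value:.3f}")
```

Note the limit this exposes: with five paired runs, a two-sided sign-flip test cannot produce a p-value below 2/32 = 0.0625, even when A wins on every seed. If you want to claim significance at the 0.05 level with this kind of test, you need more runs, which is itself a detail worth stating in the paper.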
6. Common Mistakes and a Final Self-Check
The recurring failures are predictable: tuning on the test set, reporting one lucky run,
describing without justifying, and omitting the detail that turns out to be load-bearing. Each is
easy to fix before submission and expensive to fix during rebuttal, when a reviewer has already
formed a negative impression.
Before you submit, hand the methods section to a labmate who has not seen the project and ask
them to list everything they would need to reproduce it. The gaps they find are the gaps a
reviewer will find. Close them now, while it costs you an afternoon instead of a resubmission
cycle.
Methods Section Self-Check
- Could a stranger reproduce this from the text plus appendix alone?
- Is every non-standard choice justified, not just stated?
- Did you compare against the strongest expected baseline?
- Do results report variance over multiple runs, not a single number?
- Are you certain nothing was tuned on the test set?