Collaboration Tools Every Researcher Should Know (and How to Actually Use Them)
Most labs run on a chaotic mix of email threads, untracked Google Docs, and Slack DMs that get lost in a week. Here is the minimum collaboration stack that actually keeps a research group productive (for writing, code, data, communication, references, and meeting notes), with the trade-offs nobody mentions until you have already standardized on the wrong tool.
1. Pick a Stack Before You Pick Tools
The reason most lab tooling discussions go nowhere is that people argue
about Notion vs Obsidian or Zotero vs Mendeley as if the choice mattered
in isolation. It does not. What matters is whether the five layers of
research work — writing, code, data, communication, and references — have
one canonical tool each, and whether those tools survive the next student
rotation. A great tool nobody else in the lab uses is worse than a
mediocre one everyone defaults to.
Decide the stack first, then the specific tools. Write it down. One
clause per layer can be enough: "We write papers in Overleaf, code lives
on the lab GitHub org, datasets sit on the shared S3 bucket, async
communication in Slack, references in Zotero with the lab group library."
That paragraph is the single most useful artifact a research group can
produce. Without it, every new student rebuilds the stack from scratch.
The Five Layers (Pick One Per Layer)
- Writing: Overleaf, Google Docs, or LaTeX + GitHub
- Code: GitHub, GitLab, or self-hosted Gitea
- Data: shared bucket (S3/GCS), lab NAS, or DVC + remote
- Communication: Slack, Discord, Mattermost, or email-only (rare)
- References: Zotero (group library), Mendeley, or Paperpile
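One way to keep the written-down stack honest is to store it as data and check that every layer has exactly one entry. The sketch below is illustrative only: the layer names come from the list above, the tool choices are the example sentence from this section, and `LAB_STACK`, `validate_stack`, and `stack_summary` are hypothetical names.

```python
# Layer names come from the list above; everything else is a hypothetical sketch.
LAYERS = ("writing", "code", "data", "communication", "references")

# Example choices taken from the sample sentence in this section.
LAB_STACK = {
    "writing": "Overleaf",
    "code": "the lab GitHub org",
    "data": "the shared S3 bucket",
    "communication": "Slack",
    "references": "Zotero (lab group library)",
}


def validate_stack(stack):
    """Return a list of problems: layers without a tool, and unknown layers."""
    problems = []
    for layer in LAYERS:
        if not stack.get(layer, "").strip():
            problems.append(f"no canonical tool for layer: {layer}")
    for layer in stack:
        if layer not in LAYERS:
            problems.append(f"unknown layer: {layer}")
    return problems


def stack_summary(stack):
    """Render one line per layer, suitable for pasting into an onboarding doc."""
    return "\n".join(f"{layer}: {stack[layer]}" for layer in LAYERS)
```

Calling `validate_stack(LAB_STACK)` returns an empty list when every layer is covered, and `stack_summary(LAB_STACK)` produces the text a new student would read on day one.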
2. Writing: Overleaf for Submissions, Docs for Drafts
The split most productive labs land on: Google Docs (or Notion) for
early drafts, outlines, and brainstorms — Overleaf or local LaTeX +
GitHub for anything heading to a venue. The reason is that early-stage
writing is paragraph rewriting, comment threads, and reorganization,
which Google Docs does better than any LaTeX tool. Late-stage writing
is equations, references, and figure placement, which only LaTeX does
properly. Trying to do both in one tool means losing one of them.
If you are using Overleaf, pay for the institutional or Pro plan that
gives you Git integration. The free plan caps collaborators and makes
version history painful. Set up a convention for tracked changes:
either "use Overleaf's review mode" or "wrap revisions in
\todo{} and \edit{} macros". Pick one and stick with it. Mixed
conventions inside one document cost more time than the writing itself.
Writing Stack Defaults That Actually Work
- Brainstorm / outline: Google Docs or Notion (comments matter)
- Draft sections: Google Docs until the math density makes it painful
- Camera-ready: Overleaf or local LaTeX + git, with one tracking convention
- Figures: source files (Illustrator, Figma, matplotlib .py) checked into git
- Always: one canonical bib file in the repo, referenced everywhere
3. Code: GitHub Is the Default, but Discipline Is the Point
Use GitHub (or your institution's GitLab) and stop debating. The choice
of platform matters less than three discipline issues that most
research repos fail at: meaningful commits, a working README, and one
command that reproduces the main result. If your repo cannot be cloned
and run by a new student in under 30 minutes, it is not collaboration —
it is a backup of your laptop.
Two practices to standardize across the lab. First, a top-level
reproduce.sh (or Makefile target) that runs the pipeline from raw data
to main figure. Second, a CONTRIBUTING.md that says "before pushing,
run X" and is no longer than 10 lines. The point is not best-practice
theater; it is the next student inheriting the project. The single
best test of a research repo is whether someone who joined the lab last
week can reproduce Figure 3 without messaging you.
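For a Python project, the single entry point can also be a small script rather than a shell file. The following is a minimal sketch under stated assumptions: the three stage names and their placeholder commands are hypothetical, and a real project would substitute its own scripts (for instance something like `scripts/make_figure3.py`, which is an invented path).

```python
import subprocess
import sys
import time

# Hypothetical stages; each placeholder command just prints a message.
# A real project would list its own scripts here.
STEPS = [
    ("preprocess", [sys.executable, "-c", "print('raw data -> processed data')"]),
    ("analyze", [sys.executable, "-c", "print('processed data -> results')"]),
    ("figure", [sys.executable, "-c", "print('results -> main figure')"]),
]


def run_pipeline(steps):
    """Run each stage in order; raise CalledProcessError at the first failure."""
    timings = {}
    for name, command in steps:
        start = time.monotonic()
        subprocess.run(command, check=True)  # check=True stops on a non-zero exit
        timings[name] = time.monotonic() - start
    return timings
```

`run_pipeline(STEPS)` returns the wall-clock seconds per stage, which makes it easy to see whether the whole run still fits the 30-minute budget.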
Research Repo Minimums
- README with one-paragraph project description and entry-point command
- requirements.txt / environment.yml / pyproject.toml — pinned versions
- reproduce.sh that runs end-to-end on a fresh clone in under 30 min
- data/ directory with a README pointing to where the real data lives
- Tag the commit used for each paper submission (e.g., v1.0-neurips)
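Most of this checklist can be verified mechanically. The sketch below checks for the files named above; it assumes the README may carry any extension, and it does not check the submission tag or the 30-minute runtime. The function name `missing_repo_minimums` is hypothetical.

```python
from pathlib import Path

# Any one of these counts as a dependency specification (names from the list above).
DEPENDENCY_FILES = ("requirements.txt", "environment.yml", "pyproject.toml")


def missing_repo_minimums(repo_root):
    """Return the checklist items that the repo at repo_root does not satisfy."""
    root = Path(repo_root)
    missing = []
    # Assumption: README.md, README.rst, or a plain README all count.
    if not any(root.glob("README*")):
        missing.append("README")
    if not any((root / name).is_file() for name in DEPENDENCY_FILES):
        missing.append("dependency file")
    if not (root / "reproduce.sh").is_file():
        missing.append("reproduce.sh")
    data_dir = root / "data"
    if not data_dir.is_dir() or not any(data_dir.glob("README*")):
        missing.append("data/README")
    return missing
```

An empty return value means the file-level minimums are in place; whether the versions are actually pinned and the pipeline actually runs still needs a human (or a fresh clone) to confirm.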
4. Data: Stop Emailing CSVs
The single most common collaboration failure is data on someone's
laptop. The fix is one shared storage location and a naming convention.
It does not matter whether you use S3, Google Cloud Storage, an
institutional NAS, or Dropbox — pick one and make it the only place
raw and processed data lives. If your lab has compute clusters, the
storage layer is usually decided for you; just write down the path
conventions.
For datasets that change over time, DVC (Data Version Control) is
worth the setup cost. It works alongside git and lets you version
datasets the same way you version code, with the actual bytes living
in your shared bucket. The alternative is filename suffixes like
data_v2_final_FINAL.csv, which is how labs lose six months of analysis.
DVC has a learning curve of a few hours; the payoff is months of saved
"wait, which version of the data did I use" investigations.
Data Hygiene Rules That Save Your Future Self
- One canonical location for raw data — never email, never local-only
- Read-only permissions on raw data; processed data goes in a separate prefix
- Filename or path includes date and version (e.g., 2026-05-25/v3/)
- A data/README.md documenting columns, units, and provenance
- Consider DVC if datasets change weekly; skip it if they barely change
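The path convention is easier to follow when a helper builds the prefix instead of each person typing it. The sketch below also records a checksum, which is an assumption on top of the rules above: a lightweight way to answer "which version did I use" when DVC is not set up. The `processed` root and both function names are hypothetical.

```python
import hashlib
from datetime import date
from pathlib import PurePosixPath


def processed_prefix(run_date, version, root="processed"):
    """Build a dated, versioned prefix such as processed/2026-05-25/v3/."""
    if version < 1:
        raise ValueError("version must be a positive integer")
    return str(PurePosixPath(root) / run_date.isoformat() / f"v{version}") + "/"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `processed_prefix(date(2026, 5, 25), 3)` gives `processed/2026-05-25/v3/`, and the hash from `sha256_of_file` can be pasted into data/README.md next to the file it describes.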
5. Communication: Slack Is Fine, but Async Is the Skill
Most labs settle on Slack (or Discord for the cost-conscious; Mattermost
for the self-hosted-conscious). The platform matters far less than
whether the group has rules about what goes where. Without rules, the
DM volume explodes, public channels become noisy, decisions get lost,
and new members cannot find context. The fix is a one-page channel
guide: which channel is for what, and what does not belong in DMs.
The async skill is more important than the tool. A well-written
asynchronous message — context, the specific question, what you have
already tried, what response you need — gets a useful answer in hours.
A vague "got a sec?" DM costs the recipient a meeting. Teach this
pattern early to new lab members; it compounds. The same applies to
decisions: write them down in a channel, not a DM, so the next student
can search for them in a year.
Channel Setup That Scales
- #general: lab-wide announcements, low volume, no chit-chat
- #random: chit-chat, links, memes — opt-in
- #paper-{name}: one channel per active paper, archive when submitted
- #reading-group: weekly paper discussions, links, slides
- #help: technical questions; encourage public asks over DMs
6. References: Zotero Group Libraries Are Underrated
Pick one reference manager for the whole lab. Zotero is the safest
default in 2026: free, open-source, with group libraries that let
everyone share one bib file, and a browser connector that captures
papers in one click. Mendeley still works but Elsevier's ownership
makes long-term trust uncertain. Paperpile is excellent if your lab
lives in Google Docs but locks you in.
The trick that most labs miss: maintain a single lab group library
organized by project, and export the .bib file directly into your
paper repo with a script. That way, every paper draws from one
canonical reference list, citations stay consistent across drafts, and
a new student inherits five years of curated bibliography on day one.
Without this, every paper rebuilds its bibliography from scratch and
the same paper gets cited with three different formatting variations
across the lab.
Reference Workflow That Saves Hours per Paper
- One lab group library — every student joins on day one
- Subfolders per active project or paper, plus a 'shared core' folder
- Better BibTeX plugin for stable citation keys (e.g., a pattern built from author, year, and title)
- Export .bib into the paper repo via script, regenerate on commit
- Tag papers with reading-group date so you can find what was discussed when
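The export step can be sketched as a small copy-if-changed script. This assumes the group library has already been exported to a .bib file on disk (how that file gets produced depends on your reference manager setup); the function name `sync_bib` and any paths passed to it are hypothetical.

```python
import shutil
from pathlib import Path


def sync_bib(exported_bib, repo_bib):
    """Copy the exported .bib into the repo only if its contents changed.

    Returns True when the repo copy was updated, False when it was already current.
    """
    source = Path(exported_bib)
    target = Path(repo_bib)
    new_bytes = source.read_bytes()
    if target.is_file() and target.read_bytes() == new_bytes:
        return False
    # Create the destination folder on first use, then overwrite the old copy.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    return True
```

Running this before each commit keeps the repo copy identical to the lab library, and the boolean return value makes it easy to tell whether the bibliography needs to be committed again.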
7. Meetings and Notes: The Layer Most Labs Forget
Meetings without notes are work that did not happen. Pick a meeting
notes home — Notion, a shared Google Drive folder, a wiki, or even
one markdown file per meeting in the lab repo — and make it the
default. The format matters less than the habit. Three sections is
usually enough: agenda (sent the day before), decisions (written
during), action items (with owner and date).
Two specific patterns to copy. First, individual one-on-one notes:
each student keeps a single rolling doc with their advisor, organized
by date, that both can edit before meetings. This eliminates the "wait,
what did we agree last time?" loop. Second, a "decisions log" channel
or page where the lab records non-obvious choices ("we standardized on
PyTorch over JAX in March 2026 because…"). Future students will thank
you when they inherit a codebase and want to know why.
Meeting Notes Discipline
- Agenda sent the working day before, even for recurring meetings
- One rolling doc per advisor-student pair, ordered by date
- Action items: who, what, by when — written during the meeting, not after
- Lab-wide 'decisions log' for non-obvious technical or process calls
- Archive paper-specific notes into the paper repo when submitted
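For labs that keep one markdown file per meeting in the repo, the three-section format can be generated rather than retyped. This is a minimal sketch: the filename pattern, the placeholder action-item row, and the function name `meeting_note` are assumptions, not a prescribed format.

```python
from datetime import date


def meeting_note(meeting_date, title, agenda):
    """Return (filename, markdown_text) for one meeting with the three sections."""
    day = meeting_date.isoformat()
    slug = "-".join(title.lower().split())
    lines = [f"# {title} ({day})", "", "## Agenda"]
    lines += [f"- {item}" for item in agenda]
    lines += ["", "## Decisions", "", "## Action items"]
    # Placeholder row: owner, task, and due date are filled in during the meeting.
    lines += ["- [ ] OWNER: TASK (due YYYY-MM-DD)", ""]
    return f"{day}-{slug}.md", "\n".join(lines)
```

Date-first filenames sort chronologically in any file browser, which matches the "ordered by date" habit described above.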
PhD graduate who spent years tracking conference deadlines across computer science and engineering. Built ScholarDue after missing a submission window in the final year of candidacy and realizing no single tool tracked CFPs, extensions, and notification dates in one place.