Research | 12 min | May 25, 2026

Collaboration Tools Every Researcher Should Know (and How to Actually Use Them)

Most labs run on a chaotic mix of email threads, untracked Google Docs, and Slack DMs that get lost in a week. Here is the minimum collaboration stack that actually keeps a research group productive — for writing, code, data, meetings, and references — with the trade-offs nobody mentions until you have already standardized on the wrong tool.

Jin Park
Founder & Editorial Lead

1. Pick a Stack Before You Pick Tools

The reason most lab tooling discussions go nowhere is that people argue about Notion vs Obsidian or Zotero vs Mendeley as if the choice mattered in isolation. It does not. What matters is whether each of the five layers of research work (writing, code, data, communication, and references) has one canonical tool, and whether those tools survive the next student rotation. A great tool nobody else in the lab uses is worse than a mediocre one everyone defaults to.

Decide the stack first, then the specific tools. Write it down. A sentence or two per layer is enough, and a clause each will do: "We write papers in Overleaf, code lives on the lab GitHub org, datasets sit on the shared S3 bucket, async communication happens in Slack, and references live in Zotero in the lab group library." That paragraph is the single most useful artifact a research group can produce. Without it, every new student rebuilds the stack from scratch.

The Five Layers (Pick One Per Layer)

  • Writing: Overleaf, Google Docs, or LaTeX + GitHub
  • Code: GitHub, GitLab, or self-hosted Gitea
  • Data: shared bucket (S3/GCS), lab NAS, or DVC + remote
  • Communication: Slack, Discord, Mattermost, or email-only (rare)
  • References: Zotero (group library), Mendeley, or Paperpile

2. Writing: Overleaf for Submissions, Docs for Drafts

The split most productive labs land on: Google Docs (or Notion) for early drafts, outlines, and brainstorms; Overleaf or local LaTeX + GitHub for anything heading to a venue. The reason is that early-stage writing is paragraph rewriting, comment threads, and reorganization, which Google Docs does better than any LaTeX tool. Late-stage writing is equations, references, and figure placement, which only LaTeX does properly. Trying to do both in one tool means doing one of them badly.

If you are using Overleaf, pay for the institutional or Pro plan that gives you Git integration. The free plan caps collaborators and makes version history painful. Set up a convention for tracked changes: either "use Overleaf's review mode" or "wrap revisions in \todo{} and \edit{} macros". Pick one and stick with it. Mixed conventions inside one document cost more time than the writing itself.

Writing Stack Defaults That Actually Work

  • Brainstorm / outline: Google Docs or Notion (comments matter)
  • Draft sections: Google Docs until the math density makes it painful
  • Camera-ready: Overleaf or local LaTeX + git, with one tracking convention
  • Figures: source files (Illustrator, Figma, matplotlib .py) checked into git
  • Always: one canonical bib file in the repo, referenced everywhere

3. Code: GitHub Is the Default, but Discipline Is the Point

Use GitHub (or your institution's GitLab) and stop debating. The choice of platform matters less than three points of discipline that most research repos fail on: meaningful commits, a working README, and one command that reproduces the main result. If your repo cannot be cloned and run by a new student in under 30 minutes, it is not collaboration; it is a backup of your laptop.

Two practices to standardize across the lab. First, a top-level reproduce.sh (or Makefile target) that runs the pipeline from raw data to main figure. Second, a CONTRIBUTING.md that says "before pushing, run X" and really is only about 10 lines long. The point is not best-practice theater; it is the next student inheriting the project. The single best test of a research repo is whether someone who joined the lab last week can reproduce Figure 3 without messaging you.

Research Repo Minimums

  • README with one-paragraph project description and entry-point command
  • requirements.txt / environment.yml / pyproject.toml — pinned versions
  • reproduce.sh that runs end-to-end on a fresh clone in under 30 min
  • data/ directory with a README pointing to where the real data lives
  • Tag the commit used for each paper submission (e.g., v1.0-neurips)
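
The file-level items in this list are easy to check automatically. Below is a minimal sketch, assuming the file names used above; the function name and the exact checks are illustrative, not a standard tool. It does not verify pinned versions, runtime, or tags, which still need a human (or CI) to confirm.

```python
from pathlib import Path

# Dependency files named in the list above; any one of them counts.
ENV_FILES = ("requirements.txt", "environment.yml", "pyproject.toml")


def missing_repo_minimums(repo_root):
    """Return a list of problems; an empty list means every file check passed."""
    root = Path(repo_root)
    problems = []

    # Top-level README with any extension (README.md, README.rst, ...).
    if not any(p.is_file() for p in root.glob("README*")):
        problems.append("no top-level README")

    # At least one dependency file.
    if not any((root / name).is_file() for name in ENV_FILES):
        problems.append("no requirements.txt, environment.yml, or pyproject.toml")

    # The end-to-end entry point.
    if not (root / "reproduce.sh").is_file():
        problems.append("no reproduce.sh")

    # data/ must exist and contain a README pointing to the real data.
    data_dir = root / "data"
    if not (data_dir.is_dir() and any(p.is_file() for p in data_dir.glob("README*"))):
        problems.append("no data/README")

    return problems
```

Calling `missing_repo_minimums(".")` from the repo root returns the list of gaps, which makes it easy to wire into whatever "before pushing, run X" step your CONTRIBUTING.md names.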

4. Data: Stop Emailing CSVs

The single most common collaboration failure is data on someone's laptop. The fix is one shared storage location and a naming convention. It does not matter whether you use S3, Google Cloud Storage, an institutional NAS, or Dropbox. Pick one and make it the only place raw and processed data lives. If your lab has compute clusters, the storage layer is usually decided for you; just write down the path conventions.

For datasets that change over time, DVC (Data Version Control) is worth the setup cost. It works alongside git and lets you version datasets the same way you version code, with the actual bytes living in your shared bucket. The alternative is filename suffixes like data_v2_final_FINAL.csv, which is how labs lose six months of analysis. DVC has a learning curve of a few hours; the payoff is months of saved "wait, which version of the data did I use" investigations.

Data Hygiene Rules That Save Your Future Self

  • One canonical location for raw data — never email, never local-only
  • Read-only permissions on raw data; processed data goes in a separate prefix
  • Filename or path includes date and version (e.g., 2026-05-25/v3/)
  • A data/README.md documenting columns, units, and provenance
  • Consider DVC if datasets change weekly; skip it if they barely change
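
A path convention only helps if everyone builds and reads paths the same way. Here is a small sketch, assuming the date/version layout from the example above (2026-05-25/v3/); the function names and the "processed" prefix are hypothetical, so swap in whatever your lab writes down.

```python
import re
from datetime import date
from pathlib import PurePosixPath

# Assumed layout, following the example above: <YYYY-MM-DD>/v<N>
VERSIONED_DIR = re.compile(r"^(\d{4}-\d{2}-\d{2})/v(\d+)$")


def versioned_prefix(base, snapshot_date, version):
    """Build a path such as processed/2026-05-25/v3 under the base prefix."""
    if version < 1:
        raise ValueError("version must be 1 or greater")
    return PurePosixPath(base) / snapshot_date.isoformat() / f"v{version}"


def parse_versioned_dir(relative_path):
    """Return (date, version) for '2026-05-25/v3', or None if it breaks the convention."""
    match = VERSIONED_DIR.match(relative_path.strip("/"))
    if match is None:
        return None
    try:
        parsed = date.fromisoformat(match.group(1))
    except ValueError:
        # Matches the digit pattern but is not a real calendar date.
        return None
    return parsed, int(match.group(2))
```

`versioned_prefix("processed", date(2026, 5, 25), 3)` gives `processed/2026-05-25/v3`, and `parse_versioned_dir` returns `None` for anything like data_v2_final_FINAL.csv, so a periodic scan of the bucket can flag paths that drifted from the convention.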

5. Communication: Slack Is Fine, but Async Is the Skill

Most labs settle on Slack (or Discord for the cost-conscious, or Mattermost for those who want to self-host). The platform matters far less than whether the group has rules about what goes where. Without rules, the DM volume explodes, public channels become noisy, decisions get lost, and new members cannot find context. The fix is a one-page channel guide: which channel is for what, and what does not belong in DMs.

The async skill is more important than the tool. A well-written asynchronous message (context, the specific question, what you have already tried, what response you need) gets a useful answer in hours. A vague "got a sec?" DM costs the recipient a meeting. Teach this pattern early to new lab members; it compounds. The same applies to decisions: write them down in a channel, not a DM, so the next student can search for them in a year.

Channel Setup That Scales

  • #general: lab-wide announcements, low volume, no chit-chat
  • #random: chit-chat, links, memes — opt-in
  • #paper-{name}: one channel per active paper, archive when submitted
  • #reading-group: weekly paper discussions, links, slides
  • #help: technical questions; encourage public asks over DMs

6. References: Zotero Group Libraries Are Underrated

Pick one reference manager for the whole lab. Zotero is the safest default in 2026: free, open-source, with group libraries that let everyone share one reference collection, and a browser connector that captures papers in one click. Mendeley still works, but Elsevier's ownership makes long-term trust uncertain. Paperpile is excellent if your lab lives in Google Docs, but it locks you in.

The trick that most labs miss: maintain a single lab group library organized by project, and export the .bib file directly into your paper repo with a script. That way, every paper draws from one canonical reference list, citations stay consistent across drafts, and a new student inherits five years of curated bibliography on day one. Without this, every paper rebuilds its bibliography from scratch and the same paper gets cited with three different formatting variations across the lab.

Reference Workflow That Saves Hours per Paper

  • One lab group library — every student joins on day one
  • Subfolders per active project or paper, plus a 'shared core' folder
  • Better BibTeX plugin for stable citation keys (e.g., a pattern built from author, year, and first title word)
  • Export .bib into the paper repo via script, regenerate on commit
  • Tag papers with reading-group date so you can find what was discussed when
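
The "export via script" step can be very small. The sketch below assumes the lab library has already been exported to a .bib file on disk (by whichever export route you use in Zotero); the function names and paths are hypothetical. It copies the export into the paper repo only when the content changed, and refuses to copy if a citation key appears twice.

```python
import re
import shutil
from collections import Counter
from pathlib import Path

# Captures the entry type and key in lines such as "@article{park2026tools,".
ENTRY_KEY = re.compile(r"^\s*@(\w+)\s*\{\s*([^,\s]+)\s*,", re.MULTILINE)
NON_ENTRIES = {"comment", "string", "preamble"}


def duplicate_keys(bib_text):
    """Return citation keys that appear more than once in the .bib text."""
    keys = [key for kind, key in ENTRY_KEY.findall(bib_text)
            if kind.lower() not in NON_ENTRIES]
    return sorted(key for key, count in Counter(keys).items() if count > 1)


def sync_bib(exported_bib, repo_bib):
    """Copy the exported .bib into the repo. Returns True if the repo copy changed."""
    exported, target = Path(exported_bib), Path(repo_bib)
    text = exported.read_text(encoding="utf-8")
    dupes = duplicate_keys(text)
    if dupes:
        raise ValueError(f"duplicate citation keys: {', '.join(dupes)}")
    if target.exists() and target.read_text(encoding="utf-8") == text:
        return False  # repo copy is already up to date
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(exported, target)
    return True
```

Something like `sync_bib("exports/lab.bib", "paper/refs.bib")` can run from a pre-commit hook or the reproduce script, so the bib file in the repo never drifts from the group library.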

7. Meetings and Notes: The Layer Most Labs Forget

Meetings without notes are work that did not happen. Pick a meeting notes home (Notion, a shared Google Drive folder, a wiki, or even one markdown file per meeting in the lab repo) and make it the default. The format matters less than the habit. Three sections is usually enough: agenda (sent the day before), decisions (written during), action items (with owner and date).

Two specific patterns to copy. First, individual one-on-one notes: each student keeps a single rolling doc with their advisor, organized by date, that both can edit before meetings. This eliminates the "wait, what did we agree last time?" loop. Second, a "decisions log" channel or page where the lab records non-obvious choices ("we standardized on PyTorch over JAX in March 2026 because…"). Future students will thank you when they inherit a codebase and want to know why.

Meeting Notes Discipline

  • Agenda sent the working day before, even for recurring meetings
  • One rolling doc per advisor-student pair, ordered by date
  • Action items: who, what, by when — written during the meeting, not after
  • Lab-wide 'decisions log' for non-obvious technical or process calls
  • Archive paper-specific notes into the paper repo when submitted
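
If you go the "one markdown file per meeting" route, a tiny generator keeps every note in the same three-section shape. This is a sketch under assumptions: the file naming (date plus a slug of the title) and the table layout for action items are illustrative choices, not a required format.

```python
from datetime import date
from pathlib import Path


def meeting_note(title, meeting_date, agenda):
    """Render a markdown skeleton with agenda, decisions, and action items."""
    lines = [f"# {title} ({meeting_date.isoformat()})", "", "## Agenda"]
    lines += [f"- {item}" for item in agenda]
    lines += [
        "",
        "## Decisions",
        "- ",
        "",
        "## Action items",
        "| Who | What | By when |",
        "| --- | --- | --- |",
        "",
    ]
    return "\n".join(lines)


def create_note(notes_dir, title, meeting_date, agenda):
    """Write notes_dir/YYYY-MM-DD-<slug>.md and refuse to overwrite an existing note."""
    slug = "-".join(title.lower().split())
    path = Path(notes_dir) / f"{meeting_date.isoformat()}-{slug}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "x" raises FileExistsError instead of clobbering notes already taken.
    with path.open("x", encoding="utf-8") as handle:
        handle.write(meeting_note(title, meeting_date, agenda))
    return path
```

Running `create_note("notes", "Lab sync", date(2026, 5, 25), ["Figure 3 status"])` the working day before the meeting doubles as sending the agenda: the file exists, the agenda is filled in, and the decisions and action items sections are waiting to be written during the meeting.
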
About the author
Jin Park
Founder & Editorial Lead

PhD graduate who spent years tracking conference deadlines across computer science and engineering. Built ScholarDue after missing a submission window in the final year of candidacy and realizing no single tool tracked CFPs, extensions, and notification dates in one place.
