Writing a PhD thesis with Org Mode

This is a work-in-progress post. Expect updates.

TLDR: I started using Emacs about 3 years ago. I couldn't be more grateful to have seen the light, and to have been rescued from the darkness of Windoze, Goggle and/or friends. After enlightenment, I've taken upon myself the task of customising an environment to write my PhD thesis with Org Mode.


Post created in response to the current thread in r/emacs on thesis writing with Org Mode. I see most people's reason to avoid Org mode for scientific writing is the fact that supervisors or co-authors use Mic. Word. I'll try to argue that that's not enough reason to accept subpar tools.

What I'll talk about

I'll mention a few of my motivations, and then I'll discuss how to make use of (mostly) built-in Org functionality such as tagging, export, setupfiles and includes, reference management, keyboard shortcuts and advanced searching, all with the purpose of building a useful thesis writing environment. Readers should have a minimum knowledge of Org mode, the Org export system and LaTeX.

My requirements

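Here in the Netherlands, most PhD theses consist of an introduction, 3 to 4 research chapters (as submitted for publication), a summary, a bibliography and appendices. What this means for me is that my writing environment must satisfy the following minimum requirements: inserting (parts of) external files, keeping track of references, including and referencing figures, version controlling documents, and support for sharing with my supervisor.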

Failure to comply with any of these means the editor is unfit for purpose [1]. Unfortunately, this set of requirements is not seamlessly satisfied by the likes of Mic. Word or G. Docs. I reckon they can probably be configured to satisfy them, but why bother.

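Additionally, a PhD thesis writing environment should also provide the following features: extended search facilities, simple syntax for tables and equations, support within a proper text editor, and shortcuts to reach my files and build the thesis.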

To the best of my knowledge, only Emacs with Org Mode + ox-latex provides all of these out of the box.

Moulding Org Mode for thesis writing

Most of my inspiration comes from reading Kitchin's blogs and code, and reading the Org Mode documentation, mailing list and Emacs Stack Exchange. Here I'll go one by one through all of the requirements listed above, and how to deal with them.

Prelude: File structure

I have a main thesis.org document, with LaTeX header declarations and a commented setupfile. I also have research.org files, in different directories, each with its own LaTeX header declarations and commented setupfile.

The first lines of thesis.org look like the following:

#  -*- mode: org; org-latex-title-command: ""; org-latex-toc-command: "" -*-
#+TITLE: Thesis Title
#+LATEX_CLASS: mimosis
# Setupfile contains #+LATEX_HEADER and #+OPTIONS and their explanations.
#+SETUPFILE: thesis.setup
#+LATEX_HEADER: \KOMAoptions{fontsize=12pt,headings=small}
#+LATEX_HEADER: \bibliography{~/Papers/bibtex/Publications}
#+EXCLUDE_TAGS: journal noexport

And the first lines of the multiple research.org files:

#+TITLE: Research
#+LATEX_CLASS: elsarticle
#+LATEX_CLASS_OPTIONS: [authoryear,preprint,11pt]
#+SETUPFILE: paper.setup
#+EXCLUDE_TAGS: thesis noexport

Inserting (parts of) external files

I write my research chapters with LaTeX classes targeting the journal's format. That means that a research chapter may be written with the elsarticle class, whereas the thesis as a whole is written with the mimosis class, a derivative of KOMA scrbook. Here's the class configuration for both:

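(require 'ox-latex) ; defines org-latex-classes

;; Class names and \documentclass lines are reconstructed from the text
;; above (assumed minimal headers).
(add-to-list 'org-latex-classes
             '("elsarticle"
               "\\documentclass{elsarticle}"
               ("\\section{%s}" . "\\section*{%s}")
               ("\\subsection{%s}" . "\\subsection*{%s}")
               ("\\subsubsection{%s}" . "\\subsubsection*{%s}")
               ("\\paragraph{%s}" . "\\paragraph*{%s}")
               ("\\subparagraph{%s}" . "\\subparagraph*{%s}")))
;; \mboxparagraph and \mboxsubparagraph are not standard LaTeX commands;
;; they must be defined in the preamble (e.g. via thesis.setup).
(add-to-list 'org-latex-classes
             '("mimosis"
               "\\documentclass{mimosis}"
               ("\\chapter{%s}" . "\\chapter*{%s}")
               ("\\section{%s}" . "\\section*{%s}")
               ("\\subsection{%s}" . "\\subsection*{%s}")
               ("\\subsubsection{%s}" . "\\subsubsection*{%s}")
               ("\\mboxparagraph{%s}" . "\\mboxparagraph*{%s}")
               ("\\mboxsubparagraph{%s}" . "\\mboxsubparagraph*{%s}")))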

Research chapters print their own bibliography, and they may contain acknowledgements that shouldn't appear in the middle of the thesis, so these parts should be excluded. In order to insert research chapters into my thesis, I use Org's #+INCLUDE directive:

#+INCLUDE: file.org

In order to leave out some parts of the file, i.e., to exclude the title, setupfile and headers, I narrow down the lines:

# Include line 5 until the end of the file
#+INCLUDE: "file.org" :lines "5-"

In order to exclude other parts of the file, I tag research chapter headings that are only meant for publication (such as the bibliography or acknowledgements) with a :journal: tag. This way they are automatically excluded from the thesis (see the #+EXCLUDE_TAGS: directive in the thesis.org file). Likewise, I could tag thesis-specific content in the research.org document with :thesis:, and it would be excluded from the research.org export, but I currently don't have any.
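
A minimal sketch of how the tag exclusion behaves, assuming a recent Org with ox-latex available. The variable name my/demo-chapter and the headings inside it are made up for illustration; the string stands in for a research.org file exported with the EXCLUDE_TAGS line from thesis.org.

```elisp
(require 'ox-latex)

;; Hypothetical stand-in for a research chapter, combined with the
;; EXCLUDE_TAGS line used in thesis.org.
(defvar my/demo-chapter
  "#+EXCLUDE_TAGS: journal noexport
* Results
Text shared by the paper and the thesis.
* Acknowledgements :journal:
Only meant for the journal version.
")

;; Export the body only (third argument t) and return the LaTeX as a string.
(org-export-string-as my/demo-chapter 'latex t)
```

Evaluating the last form should return LaTeX that contains the Results section but nothing from the heading tagged :journal:.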

Now, the most important piece of advice I can give anyone is to learn how to use tags, EXCLUDE_TAGS and the ignore tag from org-plus-contrib. With the ignore tag we separate the structure of the text as a physical document from its structure as a semantic unit. This allows extremely fine control over which pieces of text to include in another document. For example, in a research chapter written with the elsarticle class, the abstract has to be included in the frontmatter. By tagging a headline as follows (without the #,):

#, * Abstract :ignore:

I can write the research abstract in its own heading, pretend that the heading itself does not exist (so it does not trigger \begin{document} [2]) and export only its contents, and then include the contents in the thesis in an arbitrary location:

# in thesis.org
#+INCLUDE: "research.org::*Abstract" :only-contents t

The :ignore: tag is one of the best Org mode features, in my opinion. It's key to my workflow, and it's a shame that it's not part of Org core, but rather a contribution to be found in ox-extra.el. To activate it, add the following to your init:

(require 'ox-extra)
(ox-extras-activate '(latex-header-blocks ignore-headlines))
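
To see what the tag does, here is a small sketch that exports a made-up two-heading document to a LaTeX string. It assumes ox-extra (from the contrib package) is installed; the heading names and text are placeholders, not taken from a real chapter.

```elisp
(require 'ox-latex)
(require 'ox-extra)

;; Only the headline-ignoring filter is needed for this demonstration.
(ox-extras-activate '(ignore-headlines))

;; The Abstract headline carries the :ignore: tag, so its headline should be
;; dropped while the text below it is kept. Third argument t = body only.
(org-export-string-as
 "* Abstract :ignore:
We measured things.
* Introduction
Some introductory text.
"
 'latex t)
```

The returned string should keep the abstract's sentence but have no sectioning command for Abstract, while Introduction is still exported as a regular section.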

The realisation that it's possible to have such fine control over where to include or exclude pieces of text opens the door to all sorts of interesting experiments: putting figures and captions directly into beamer or org-reveal presentations, creating conference posters, writing blog posts, etc.

Keep track of references

For backwards compatibility I still use Mendeley to track literature. I export a BibTeX file for each research project individually, and also a master BibTeX file for use in the thesis. To insert citations, I use org-ref. Its documentation says it all. After you set up your bibliography file, press C-c ] to see a list of publications and insert them in place.
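
A minimal sketch of the bibliography setup, assuming org-ref 2.x, where the two variables below exist (newer org-ref releases are configured through bibtex-completion-bibliography instead). The .bib path is an assumption derived from the \bibliography line in thesis.org; point it at your own exported file.

```elisp
(require 'org-ref)

;; Assumed location of the master BibTeX file exported from Mendeley.
(setq reftex-default-bibliography '("~/Papers/bibtex/Publications.bib")
      org-ref-default-bibliography '("~/Papers/bibtex/Publications.bib"))
```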

Include and reference figures

You can include figures in Org mode by using the following syntax:

#+NAME: figurename
#+CAPTION: This is a figure caption
[[file:path/to/figure.png]]

Currently there is a bug in the ELPA version of Org mode, such that relative paths to figures in #+INCLUDE'd files aren't adapted with respect to the including file, so the LaTeX export cannot find them. I've submitted a fix which should land in the next release of Org.

Version control documents

Magit. I don't do this yet, but I've thought about keeping the research chapters as git submodules in a thesis git project directory. This would allow me to always have the thesis code in a saved state, even if I keep working on my research chapters to answer reviewers' questions.

Support for sharing with my supervisor

Unfortunately, my supervisor likes to write comments in Mic. Word. I concede that sharing your writing with colleagues is a fundamental part of research. Fortunately, ox-word export via Pandoc & LaTeX is capable of creating nice-looking, structured Word files which I send to my supervisor. I then manually work through each comment step by step, though I'm looking for a way to improve this part of my workflow.
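
The general idea of going from Org through LaTeX to Word can be sketched as follows. This is not ox-word itself, just a rough approximation of the same route; the function name my/org-export-to-docx is hypothetical, and it assumes the pandoc executable is on your PATH.

```elisp
(require 'ox-latex)

(defun my/org-export-to-docx ()
  "Export the current Org buffer to LaTeX, then convert it to .docx with pandoc.
Hypothetical helper; assumes the pandoc executable is on PATH."
  (interactive)
  (let* ((tex (org-latex-export-to-latex)) ; returns the .tex file name
         (docx (concat (file-name-sans-extension tex) ".docx")))
    ;; Run: pandoc FILE.tex -o FILE.docx, collecting messages in *pandoc*.
    (unless (zerop (call-process "pandoc" nil "*pandoc*" nil tex "-o" docx))
      (error "pandoc failed; see the *pandoc* buffer"))
    (message "Wrote %s" docx)
    docx))
```

Call it with M-x my/org-export-to-docx from the Org buffer. Class-specific LaTeX (journal front matter, custom macros) may not survive the conversion, so expect to check the result by hand.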

I may update this post with more information later.

Extended search facilities

By extended search facilities I mean the ability to quickly search for information in references, and to keep notes linked to the literature. For searching I make use of org-ref + pdfgrep. For notes linked to documents I've recently started to use Org-noter.
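
As an illustration of the searching part only, here is a small sketch that runs pdfgrep through Emacs's grep interface so that hits show up in a clickable buffer. It does not use org-ref; the function name my/search-papers and the ~/Papers/ directory are assumptions, and the pdfgrep program must be installed.

```elisp
(require 'grep)

(defun my/search-papers (pattern)
  "Search all PDFs under ~/Papers/ for PATTERN using pdfgrep.
Hypothetical helper; the directory is an assumed location."
  (interactive "sSearch PDFs for: ")
  (let ((default-directory (expand-file-name "~/Papers/")))
    ;; -r: recurse, -n: print page numbers, -i: ignore case, -H: print file names
    (grep (format "pdfgrep -rniH %s ." (shell-quote-argument pattern)))))
```

The numbers shown in the results buffer are page numbers rather than line numbers.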

Simple syntax for tables and equations

Org tables are a pleasure to work with:

| a | b | c |
|---+---+---|
| 1 | 2 | 3 |

Equations can be written in LaTeX:

$\frac{d\vec{M}(t)}{dt} = \vec{M}(t) \times \gamma \vec{B}(t)$

Support within a proper text editor

No need to talk about the synergy of using Emacs to edit text.

Shortcuts to reach my files and build the thesis

I have a hydra (defined in Spacemacs as a transient-state) to move between my Thesis files:

    ;; Spacemacs hydra.
    (spacemacs|define-transient-state thesis-menu
      :title "PhD Thesis Menu"
      :doc "
^Main Files^       ^Chapters^       ^Actions^
_m_: Thesis        _1_: Research 1  _o_: Open Thesis.pdf externally
_t_: Title page    _2_: Research 2  _c_: Async compile file
_i_: Introduction  _3_: Research 3  _a_: things
_s_: thesis.setup  _4_: Research 4  ^ ^"
      :bindings
      ;; `things' is a placeholder for a command of your own.
      ("a" things :exit t)
      ("m" (find-file "~/thesis/thesis.org") :exit t)
      ("t" (find-file "~/thesis/titlepage.org") :exit t)
      ("s" (find-file "~/thesis/thesis.setup") :exit t)
      ("i" (find-file "~/thesis/intro/intro.org") :exit t)
      ("1" (find-file "~/thesis/ch1/research.org") :exit t)
      ("2" (find-file "~/thesis/ch2/research.org") :exit t)
      ("3" (find-file "~/thesis/ch3/research.org") :exit t)
      ("4" (find-file "~/thesis/ch4/research.org") :exit t)
      ;; `open' is the macOS opener; use xdg-open on GNU/Linux.
      ("o" (shell-command "open ~/thesis/thesis.pdf") :exit t)
      ;; A non-nil first argument makes the export asynchronous.
      ("c" (org-latex-export-to-pdf t) :exit t))

    (global-set-key (kbd "H-t") 'spacemacs/thesis-menu-transient-state/body)


I'm considering writing a thesis template repository. I might do it when I finish my PhD.

This post was generated with a library I've talked about in a previous post.


[1] See item 9 from this blogpost.

[2] Headlines tell ox-latex to start the document sectioning, and therefore trigger the beginning of the document environment.