How to make awesome PDFs with markdown using Eisvogel
Are you like me?
Do you hate WYSIWYG document editors like Word, Pages, or LibreOffice Writer?
Are you more of a markdown person?
Do you like text-based projects?
Being a Linux hardcore freak (or a wannabe Linux hardcore freak), are you unafraid of the command line?
Well, I’ve got just the workflow for you:
- Write in markdown
- Generate a PDF with black magic
- Conquer the world
Let’s dive into this together, in a drunken and self-deprecating mindset, otherwise it won’t be fun.
Install Pandoc
Pandoc is a crazy Swiss-army-knife tool, written in Haskell, that converts almost any markup format into another one. Pandoc converts HTML into JSON, man pages into HTML, EPUBs into DOCX, CSV into DokuWiki, aaaaannnd markdown into PDF!
Install pandoc using your favorite package manager: on Ubuntu or anything Debian-based, on Arch Linux with the pandoc package, on macOS if you’re brave.
According to some legends, you can even install it on windows.
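The exact command depends on your system. As a rough sketch, the snippet below only prints a plausible install command for whichever package manager it finds, and never runs it. The `suggest_install` helper is made up for this post, and the package name `pandoc` is an assumption that may differ on your distribution.

```shell
# Hypothetical helper: print (never run) a likely install command.
suggest_install() {
    pkg="$1"
    if command -v apt-get >/dev/null 2>&1; then
        echo "sudo apt-get install $pkg"      # Debian, Ubuntu and friends
    elif command -v pacman >/dev/null 2>&1; then
        echo "sudo pacman -S $pkg"            # Arch Linux
    elif command -v brew >/dev/null 2>&1; then
        echo "brew install $pkg"              # macOS with Homebrew
    else
        echo "No known package manager found for $pkg"
    fi
}

suggest_install pandoc
```

Read the suggestion, check it against your distribution's package list, then type it yourself.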
But that’s just the easy part.
Install LaTeX and join the real cult
Enter LaTeX, one of the oldest pieces of software still in use today. The story started a bit like the GNU project. Some nitpicking bloke with too much free time on his hands and a lot of brain cells to burn decided that his book was not properly typeset. Yes, a guy actually complained about spacing between letters, margins around text, the kind of thing ordinary folks never worry about. So this Donald (yes, that’s his name, Donald Knuth) did what every (in)sane computer freak does: he went on to invent his own thing to solve his problem.
His typesetting system is called TeX, pronounced “tek”, and went on to be wrapped and renamed and repackaged in an obscure way only encyclopedias and annoying know-it-alls understand.
What we talk about now is LaTeX, but you have to pronounce it “latek” or you can’t be part of the gang.
We’re in the freak zone.
LaTeX is the king of markup languages; theoretically it is even Turing-complete, with if statements and whatnot.
LaTeX is the assembly of document composing.
LaTeX is the language of the gods.
One example: a formula written in LaTeX markup becomes a properly typeset equation once compiled.
You can’t really beat that.
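As a purely illustrative pick (an assumed stand-in, chosen only because everyone has met it at school), this is how the quadratic formula is written in LaTeX markup:

```latex
% Markup for the quadratic formula, set as display math.
\[
  x = \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}
\]
```

Once compiled, that line of backslashes and braces turns into the familiar fraction with a square root in the numerator.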
We will need LaTeX. There’s no way around that if we want to brag in the classroom and impress… no one actually. No one will care. But let’s do it anyway.
Installing LaTeX, or how to download way too heavy packages
There’s no easy way around this. TeX and LaTeX are heavy.
- For Arch Linux fans:
sudo pacman -S texlive-most
- ubuntu instructions
Chances are, those commands won’t work and you’ll end up searching the internet like a lost soul, desperate for a way to understand how any of this is supposed to work.
As a rule of thumb, install anything that ends with ex, be it pdflatex or xelatex or texlive and so on.
You should get around 5 GB of packages to install.
Yes.
I told you we’re in the freak zone.
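Before downloading another gigabyte, it can help to see what is already there. A small sketch: the `check_engines` function and its list of engine names are assumptions made for this post, so add or remove names as you see fit.

```shell
# Hypothetical helper: report which TeX engines are already on the PATH.
check_engines() {
    for engine in latex pdflatex xelatex lualatex; do
        if command -v "$engine" >/dev/null 2>&1; then
            echo "found:   $engine"
        else
            echo "missing: $engine"
        fi
    done
}

check_engines
```

If at least one line says "found", you are probably good to go.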
What pandoc does is black magic
Please play along or it won’t stick in your brain. Open a terminal and retype these commands. Copy-pasting is for the weak.
Write in markdown
Start a new project, shoot a markdown-written readme.
mkdir my_awesome_pdf_project
cd my_awesome_pdf_project
touch README.md
Write in README.md:
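Any small markdown file will do. As a stand-in, here is a hypothetical README.md written straight from the shell with a heredoc; the title and text are invented for this sketch. The last line is left hanging on purpose, so that a name appended afterwards finishes the sentence.

```shell
# Hypothetical README.md content, written with a heredoc.
# The quoted 'EOF' keeps the shell from touching anything inside.
cat > README.md <<'EOF'
# My awesome PDF project

This is a *markdown* document, soon to be a **PDF**.

## Why

- Because plain text is forever
- Because the command line is fun

Written with love by
EOF
```

Markdown treats a single line break inside a paragraph as a space, so whatever gets appended on the next line joins that last sentence.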
And then customize it with your name:
echo "$USER." >> README.md
Get started with pandoc
To gain a good understanding of how pandoc works, let’s first convert this markdown to HTML. Everyone knows some HTML, even my 4-year-old nephew knows some HTML. You won’t have any excuse.
We will:
- ask pandoc to take README.md as an input
- churn out a README.html file as an output.
Type after me:
pandoc README.md --output README.html
Then open the html file
firefox README.html
or if you will
chrome README.html
or safari, or chromium. You get the idea. Here’s how the HTML displays in the browser:
What happened is, pandoc went through README.md and converted markdown markup into HTML markup. # became <h1>, etc., and voilà!
If you want a complete HTML page with a <head> and some default CSS, you can do:
pandoc README.md --output README.html --standalone
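To see the difference for yourself, here is a small sketch that builds both versions and counts the lines mentioning a `<head` tag in each. The file names fragment.html and page.html are made up for this example, and the whole thing quietly skips itself if pandoc or README.md is missing.

```shell
# Compare the bare fragment with the standalone page (a sketch; needs pandoc).
if command -v pandoc >/dev/null 2>&1 && [ -f README.md ]; then
    pandoc README.md --output fragment.html
    pandoc README.md --output page.html --standalone
    # Count how many lines mention a <head tag in each file.
    echo "fragment: $(grep -c '<head' fragment.html)"
    echo "page:     $(grep -c '<head' page.html)"
else
    echo "pandoc or README.md is missing, skipping" >&2
fi
```

Pandoc may grumble on stderr about a missing title in standalone mode; that is a warning, not a failure.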
That’s pretty neat, but how about actual PDFs?
How pandoc converts markdown to PDF
“I see where you’re going, Emmanuel. I just need to type pandoc README.md -o README.pdf and I will have my PDF.”
Well, yes, but also, no.
You may succeed at first and be joyful.
And you may just as well witness your enthusiasm crash splendidly, like a wave on the solid rock of reality.
If you think this whole markdown-to-html stuff was easy and you wish for some more serious headache, you’re about to be served.
Let’s go slowly about it. I have gained this knowledge with sweat and tears, it must be passed on. If you want to do it the easy way, you have to understand how it is done the hard way.
From markdown to LaTeX
Please type:
pandoc README.md -o README.tex
and have a look at README.tex.
This is what LaTeX markup looks like. A bit verbose, but it looks serious.
However, if you are even remotely familiar with LaTeX, you’ll notice some important markup is missing, like the very important \documentclass line that basically says: “This document is a book / a memoir / an article, please typeset it accordingly”.
If you’re clever enough, you’ll have the intuition of typing instead:
pandoc README.md -o README.tex --standalone
And boom! Your README.tex looks a whole lot different now, with a \documentclass{article} and a lot of \usepackage statements and so on.
That will make a neat PDF later on.
Pandoc uses templates
What happened here? Well, the mission of pandoc is to convert from one markup to another.
A markdown *word* becomes an HTML <em>word</em> or a TeX \emph{word}.
Pretty straightforward.
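You can watch this happen without creating any file. In the sketch below, pandoc reads one emphasized word from standard input, and the `--from` and `--to` options name the formats explicitly; it assumes pandoc is installed and skips itself otherwise.

```shell
# Feed one emphasized word to pandoc on stdin and print it in two target formats.
if command -v pandoc >/dev/null 2>&1; then
    printf '%s\n' '*word*' | pandoc --from markdown --to html
    printf '%s\n' '*word*' | pandoc --from markdown --to latex
else
    echo "pandoc is not installed, skipping" >&2
fi
```

The HTML version comes wrapped in a paragraph tag, because pandoc treats the lone word as a one-line paragraph.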
But when dealing with complex document formats like HTML or LaTeX, we need a bunch of lines that say <!DOCTYPE html> or \documentclass{article}, and that specify the fonts used, the CSS or JavaScript files attached (in the case of HTML) or the font size, the page format, the indentation of paragraphs (in the case of LaTeX).
So pandoc has this --standalone flag that summons default values.
In the pandoc world, these defaults are stored in templates.
Have a look at the HTML default template by doing:
pandoc -D html
You should recognize a lot of things. Head, meta, style, body… within some weird dollar-sign-surrounded if statements. This is pandoc’s own way of generating boilerplate HTML headers.
Same if you do:
pandoc -D latex
You will get a much more frightening blurb of interwoven LaTeX and pandoc-specific syntax.
Please know that this exists, and that it is used when converting markdown to LaTeX with the --standalone flag.
Please type this command if you haven’t yet:
pandoc README.md -o README.tex --standalone
and have a longer, curious look at the generated README.tex.
From LaTeX to PDF
From here on I’m not a specialist. The world of LaTeX is weird and intricate, full of weird and intricate history, populated by weird and intricate people (gosh, I am becoming one of them). The learning curve is hard and I feel like an impostor even writing about it.
Be kind to yourself. If anything fails, skip this paragraph and go on with the tutorial.
Let’s make an actual PDF. Try those commands randomly until one of them works:
latex README.tex
xelatex README.tex
pdflatex README.tex
Those are all TeX engines: xelatex and pdflatex write a PDF directly, while plain latex produces a DVI file instead. Try to install them, or at least one of them.
If the gods are on your side, one of those commands will succeed and produce a README.pdf.
Open it. Admire it. Be proud of yourself.
To sum up what pandoc does
When you type pandoc README.md -o README.pdf, pandoc does these things:
- Convert the markdown markup to LaTeX markup
- Summon a default template to create LaTeX-specific statements like \documentclass
- Merge everything in a neat .tex document
- Invoke a PDF engine necromancer, to raise the .tex file from the dead realm of text onto the battlefield of our paper-dominated world.
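The steps above can be sketched by hand, then collapsed into a single command. This assumes pandoc and xelatex are both installed (swap in pdflatex if that is the engine you have), and it skips itself when anything is missing. The `--pdf-engine` option tells pandoc which engine to call, and `-interaction=nonstopmode` keeps the engine from stopping to ask questions when it hits an error.

```shell
# The same journey in explicit steps, then all at once (a sketch).
if command -v pandoc >/dev/null 2>&1 && command -v xelatex >/dev/null 2>&1 && [ -f README.md ]; then
    # Steps 1 to 3: markdown -> LaTeX, wrapped in the default template.
    pandoc README.md --standalone --output README.tex
    # Step 4: let the engine turn the .tex file into README.pdf.
    xelatex -interaction=nonstopmode README.tex
    # All four steps in one go, choosing the engine explicitly.
    pandoc README.md --output README.pdf --pdf-engine=xelatex
else
    echo "pandoc, xelatex or README.md is missing, skipping" >&2
fi
```

The one-liner cleans up after itself, while the manual route leaves .aux and .log files lying around next to the PDF.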
Make it easy please
I have told you what I have learned. And for a while, what I have done was this:
- Write in markdown
- Store the default pandoc template within a template.tex file
- Edit template.tex to attain desired margins, fonts, etc. (difficult)
- Do everything detailed above, in a bash script, plus the pandoc option: --template=template.tex
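A minimal sketch of that older workflow, assuming pandoc and a LaTeX engine are installed. The guard around the first command is my own addition, so that an already edited template.tex is never overwritten.

```shell
# Dump the default template once, edit it by hand, then build with it (a sketch).
if command -v pandoc >/dev/null 2>&1; then
    # Save pandoc's default LaTeX template, unless an edited copy already exists.
    [ -f template.tex ] || pandoc -D latex > template.tex
    # ... edit template.tex by hand here (margins, fonts, and so on) ...
    # Build the PDF with the custom template instead of the built-in one.
    if [ -f README.md ]; then
        pandoc README.md --output README.pdf --template=template.tex
    fi
else
    echo "pandoc is not installed, skipping" >&2
fi
```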
This was great for learning purposes, but I now deserve simplicity, and so do you, dear reader.
Let us discover Eisvogel.
It boils down to replacing the default template used by pandoc when creating .tex files.
Step by step
Create another new project folder called use_eisvogel.
mkdir use_eisvogel
cd use_eisvogel
Download the eisvogel.tex template from the official repository into your project folder.
wget https://raw.githubusercontent.com/Wandmalfarbe/pandoc-latex-template/master/eisvogel.tex
(or with curl if you prefer)
curl https://raw.githubusercontent.com/Wandmalfarbe/pandoc-latex-template/master/eisvogel.tex > eisvogel.tex
Create a document.md, in which you will write:
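As a stand-in, here is a hypothetical document.md written with a heredoc: a YAML metadata block followed by ordinary markdown. `titlepage` is a variable read by the Eisvogel template; the title, author and body text are placeholders invented for this sketch.

```shell
# Hypothetical document.md: YAML metadata on top, markdown below.
cat > document.md <<'EOF'
---
title: "My awesome PDF"
author: "Your Name"
titlepage: true
---

# Introduction

Written in *markdown*, typeset by LaTeX.

## A list, because why not

- Write in markdown
- Generate a PDF with black magic
- Conquer the world
EOF
```

Pandoc hands the values in the metadata block to the template, which is how the title and author can end up on a title page.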
Create a bash script, make it executable:
touch build.sh
chmod +x build.sh
Write in it:
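A minimal sketch of such a script, written from the shell with a heredoc. It assumes eisvogel.tex and document.md sit in the current folder; the option set is the smallest one I would expect to work, not necessarily what the Eisvogel documentation recommends.

```shell
# Write a hypothetical build.sh with a heredoc, then make sure it is executable.
cat > build.sh <<'EOF'
#!/usr/bin/env bash
# Stop at the first error so a failed build is noticed.
set -e

# Convert document.md to document.pdf, using the Eisvogel template
# from the current folder instead of pandoc's default one.
pandoc document.md \
    --output document.pdf \
    --from markdown \
    --template=eisvogel.tex

echo "Built document.pdf"
EOF
chmod +x build.sh
```

Because `--template` is given a file name with its extension, pandoc looks for it relative to the folder you run the script from.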
Execute the script (make sure you made it executable):
./build.sh
If all went smoothly, you should have a nice document.pdf.
Go on your own journey
The Eisvogel repository is full of examples and of easy-to-use variables to play around with.
You’re on your own now.
Coding is hard, do only what you enjoy.
Be kind to yourself.