This tutorial aims to introduce RMarkdown, a working environment for creating documents in data science.
In a single RMarkdown file, it is possible to write code, execute it, and then produce reports (mixing text, code, and code evaluation results) to be shared. The code can be R, but other languages such as Python or SQL can also be evaluated. Many output formats are available. Among the most used: html, pdf, Word, notebook, ioslides.
The {rmarkdown} package can be installed using the following instruction.
install.packages("rmarkdown")
The main reference on which this guide is based is the R Markdown Cookbook, written by Yihui Xie, Christophe Dervieux and Emily Riederer (Chapman & Hall/CRC, 2020). An electronic version is available free of charge at the following address: https://bookdown.org/yihui/rmarkdown-cookbook/.
A two-page cheat sheet on R Markdown has been produced by RStudio: https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf.
In RStudio, it is good practice to work with projects. First, create an RStudio project, following the tree structure shown in the image below.
Basic structure for projects. Source: https://martinctc.github.io/.
Creating a project
In RStudio:
Click on the File menu, then on New Project...
Click on New Directory, then on New Project.
Choose the location of the project directory with the Browse... button.
Confirm with the Open button.
Click on the Create Project button to create the project.
This opens a new RStudio session. The current directory becomes the one in which you created the project. A file with the .Rproj extension has been created in this directory. Simply open this file in the future to launch RStudio and work on this project.
On the University computers, make sure you create the project in the C:/Workout folder. Compiling on VDI (virtual desktop infrastructure) seems to be impossible at the moment.
Please note: at the end of the session, remember to copy the entire directory containing your project into your Documents folder. The contents of the C:/Workout folder are deleted when you log out.
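To check from the R console that the session really runs in the project directory, something like the following can be used. This is only a minimal sketch in base R; the variable names are illustrative.

```r
# Print the current working directory; after opening the .Rproj file,
# this should be the project directory.
project_dir <- getwd()
print(project_dir)

# List any RStudio project files found at the top level of that directory.
rproj_files <- list.files(project_dir, pattern = "\\.Rproj$")
print(rproj_files)
```

If the second result is empty, the session was probably not started from the project file.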
Now that the project is created, it is time to create an R Markdown document.
Creating an R Markdown document
In RStudio:
Click on the File menu, then on New File...
Click on R Markdown...
In the window that appears:
Select the HTML option so that the report created afterwards is an html document (html is a language designed for presenting web pages).
Click on the OK button.
Save the file created by giving it a name of your choice (e.g. first_report.Rmd).
Window for creating a new RMarkdown document.
An RMarkdown document, with the .Rmd extension, is then created. This document consists of three parts: a YAML header, text written in markdown, and code chunks.
In the document you have just created, the YAML header indicates:
title: "Mon premier document R Markdown"
author: "Ewen Gallic"
date: "2/7/2022"
output: html_document
The document title, author and date are specified in this header. When the Rmd file is converted to an html file by Pandoc (a document conversion software), this information will be stored in variables and will appear in one or more places in the html file (depending on the template used). The output: html_document line indicates that the output document will be an html document. Other elements can be indicated, notably in the output part: presence of a table of contents, numbering of sections, addition of a style sheet, etc.
In a few words, the conversion steps are as follows:
The knit() function of the {knitr} package executes the code contained in the chunks and prepares a Markdown file (.md).
Pandoc then converts the .md file into the output format (html, for example).
If the output format is pdf, an additional step is added: the .md file is converted into a LaTeX file (.tex). The .tex file is then compiled by LaTeX to obtain the final pdf file. This requires LaTeX or TinyTeX to be installed on your system.
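The two main steps can be run by hand to see what each one produces. The following is a hedged sketch, not what RStudio does internally: it assumes {knitr}, {rmarkdown} and Pandoc are available (each step is skipped otherwise), and the throwaway document and file names are made up for the illustration.

```r
# Throwaway input document written to a temporary directory.
rmd_file <- tempfile(fileext = ".Rmd")
md_file <- sub("\\.Rmd$", ".md", rmd_file)
html_file <- sub("\\.Rmd$", ".html", rmd_file)

fence <- strrep("`", 3)  # three backticks, the chunk delimiter
writeLines(c(
  "Some text.",
  "",
  paste0(fence, "{r}"),
  "1 + 1",
  fence
), rmd_file)

# Step 1: {knitr} runs the chunks and writes a Markdown file (.md).
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::knit(input = rmd_file, output = md_file, quiet = TRUE)
}

# Step 2: Pandoc converts the .md file to the output format (html here).
if (file.exists(md_file) &&
    requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  rmarkdown::pandoc_convert(input = md_file, to = "html", output = html_file)
}
```

Opening the intermediate .md file shows the chunk together with its evaluated result, which is what Pandoc then converts.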
To add a table of contents to an html file, key-value pairs are added:
toc: yes requests the creation of a table of contents;
toc_depth: 3 sets the depth of the table of contents through the integer given as a value (1: only sections, 2: sections and sub-sections, 3: sections, sub-sections and sub-sub-sections, etc.);
toc_float: true inserts the table of contents as a floating object that remains visible throughout the document.
---
title: "Mon premier document R Markdown"
author: "Ewen Gallic"
date: "2/7/2022"
output:
  html_document:
    toc: yes
    toc_depth: 3
    toc_float: true
---
Please note that indentations must be respected, as in the previous example.
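The same table-of-contents settings can also be expressed from R rather than in the YAML header, which avoids indentation mistakes. This is a minimal sketch, assuming {rmarkdown} and Pandoc are available; the object name toc_format is illustrative.

```r
# Build an html output format with the same options as the YAML header:
# table of contents, depth 3, floating.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  toc_format <- rmarkdown::html_document(
    toc = TRUE,
    toc_depth = 3,
    toc_float = TRUE
  )
  # The format object could then be passed to render(), for instance:
  # rmarkdown::render("your_document_rmarkdown.Rmd", output_format = toc_format)
}
```

Options given this way take the place of those written under output in the header for that particular call.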
For a table of contents in a final pdf document, the YAML must contain the pairs toc: yes and toc_depth: 3.
---
title: "Mon premier document R Markdown"
author: "Ewen Gallic"
date: "2/7/2022"
output:
  pdf_document:
    toc: yes
    toc_depth: 3
---
The .Rmd file can contain text that is written in markdown. More information will be given later in this sheet about markdown syntax (which is very simple).
The code chunks contain two parts: a header, which gives the language and any chunk options, and the code to be evaluated.
To be executed, the code calls on an environment (in which variables can be created). This environment can be modified after the code is executed.
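The idea of an environment that is filled and modified as code runs can be illustrated in base R. This is only a sketch of the principle: the two eval() calls stand in for two successive chunks, and the variable names are made up.

```r
# A fresh environment standing in for the one used during compilation.
env <- new.env()

# "Chunk 1": evaluating code in the environment creates a variable there.
eval(quote(x <- 2), envir = env)

# "Chunk 2": later code sees that variable and modifies the environment again.
eval(quote(y <- x * 10), envir = env)

print(ls(env))                 # names defined so far
print(get("y", envir = env))   # value computed from x
```

This is why a chunk placed later in the document can reuse objects created in an earlier one.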
To compile an R Markdown document, once the YAML is correctly specified, you can either:
call the render() function of the {rmarkdown} package, giving the path to the Rmd file via the input argument: rmarkdown::render(input = "your_document_rmarkdown.Rmd");
click on the Knit button (easy to spot with its icon showing knitting needles and a ball of wool);
press the keyboard keys Ctrl / Cmd + Shift + K simultaneously.
The last two solutions lead to displaying the result in a window that opens at the end of the compilation.
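For the first way, render() can be tried on a throwaway document without touching the project. This is a hedged sketch, assuming {rmarkdown} and Pandoc are available (nothing is rendered otherwise); the document content and names are invented for the example.

```r
# Write a tiny R Markdown document to a temporary file.
rmd_file <- tempfile(fileext = ".Rmd")
fence <- strrep("`", 3)  # three backticks, the chunk delimiter
writeLines(c(
  "---",
  "title: \"A throwaway report\"",
  "output: html_document",
  "---",
  "",
  "Some text.",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd_file)

# Compile it; render() returns the path of the file it produced.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(input = rmd_file, quiet = TRUE)
  print(out_file)
}
```

Unlike the Knit button, this call does not open a preview window; the returned path has to be opened manually.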
Compile your first R Markdown file in one of the three ways shown, then look at the result.
The text parts that add narrative to the reports can be written in markdown. The syntax is very simple.
Simply write as in a notepad and the text will be displayed in the final report.
Ending a line with two spaces allows you to go to the next line.
Leaving a blank line between two texts creates a new paragraph.
Style | Syntax | Example | Rendering |
---|---|---|---|
Bold | ** ** or __ __ | **bold** text | bold text |
Italic | * * or _ _ | A word in *italic* | A word in italic |
Strikethrough text | ~~ ~~ | I ~~like~~ love R | I like love R ("like" struck through) |
A part in italics within bold | **_ _** | A **_very_ important text** | A very important text |
All in bold and italics | *** *** | ***新年快樂*** (Xin nian kuai le) | 新年快樂 (Xin nian kuai le) |
Superscript | ^ ^ | January 1^st^ | January 1st |
There are six levels of headings in R Markdown documents. The title text is preceded by as many hash signs (#) as the desired level.
# Level 1 title
## Level 2 title
### Level 3 title
#### Level 4 title
##### Level 5 title
###### Level 6 title
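The rule "as many # as the level" can be sketched with a small R helper, which may be handy when heading lines are produced by code. The function name md_heading is hypothetical and not part of any package.

```r
# Hypothetical helper: build a Markdown heading line for a level from 1 to 6.
md_heading <- function(text, level = 1) {
  stopifnot(level >= 1, level <= 6)
  paste(strrep("#", level), text)
}

print(md_heading("Level 3 title", level = 3))
```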
Notes :
To make a block quotation, the quotation must be preceded by the symbol >, which is placed at the beginning of the line. Example with > chúc mừng năm mới:
chúc mừng năm mới
To make a quotation contain several paragraphs, a chevron (>) must be added at the beginning of empty lines.
> “How can two people hate so much without knowing each other?”
>
> --- Alan Moore, _The Killing Joke_
“How can two people hate so much without knowing each other?”
— Alan Moore, The Killing Joke
To insert a long dash (em dash), three dashes are used: ---; for a short dash (en dash), two dashes are used: --.
Desired symbol | Syntax | Example | Rendering |
---|---|---|---|
Long dash (em dash) | --- | --- a line | — a line |
Short dash (en dash) | -- | The France--Italy border | The France–Italy border |
Typing three dashes (---) and immediately starting a new line inserts a horizontal line.
To write an ellipsis, just type three dots (...) in a row…
A hyperlink is created using two elements: the text to be clicked on, which must be enclosed in square brackets ([]), and the address to which the link points, which must be enclosed in parentheses (()).
Look at this [wonderful video](https://www.youtube.com/watch?v=dQw4w9WgXcQ).
Look at this wonderful video.
To create a link without defining specific text to replace the URL, it is possible to simply write the URL. However, it is preferable to enclose the URL in chevrons. The same applies to an email address.
<https://www.youtube.com/watch?v=oavMtUWDBTM>
<ewen.gallic@univ-amu.fr>
https://www.youtube.com/watch?v=oavMtUWDBTM
ewen.gallic@univ-amu.fr
To create an anchor (a link to a specific location on the page already displayed) to a title of the document, you need to know the reference to the anchor point. A simple way is to define it yourself, using the following syntax:
# Title {#name-of-the-ref}
The name of the reference must not contain spaces or underscores (_). It may, however, as in the example, contain hyphens (-).
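A reference name that respects this constraint can be derived from a title with a small R helper. This is a minimal sketch: the function name make_anchor is hypothetical, and converting to lower case is a choice made here, not a requirement stated above.

```r
# Hypothetical helper: turn a title into a reference name without
# spaces or underscores, using hyphens instead.
make_anchor <- function(title) {
  id <- tolower(trimws(title))
  gsub("[[:space:]_]+", "-", id)
}

title <- "Liens hypertextes"
# Build the full heading line with its anchor.
print(paste0("## ", title, " {#", make_anchor(title), "}"))
```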
In this document, the sub-section in which this text is included is defined as follows:
## Hyperlinks {#liens-hypertextes}
This makes it easy to link to this [section](#liens-hypertextes).
This makes it easy to link to this section.
Numbered footnotes are inserted using square brackets ([]) containing a caret (^) and a reference, which can be either a number or text (but without spaces or other whitespace characters).
The footnote number is a link to the footnote. When the document created is an html document, a return arrow allows the reader to go back to the text.
A simple footnote[^1] followed by a longer note[^long-note].

[^1]: the footnote.

[^long-note]: a longer footnote.

    In which a paragraph can be written.

    `{ some code }`

    Several paragraphs can even be written.
There are two types of lists: ordered and unordered.
To create an ordered list, a number is placed at the beginning of the line in front of each item in the list, followed immediately by a full stop and a space.
1. A first element
2. A second element
3. A third element.
It is not necessary to respect the numbering:
1. A first element
10. A second element
5. A third element.
The number of the first element in the ordered list defines the counter value:
4. A first element
10. A second element
5. A third element.
To insert an unordered list, precede each element with the - symbol or the * symbol.
A list including :
* A first element.
* A second element.
* A third element.
A list including :
To add a list within a list, either a tab stop or 4 spaces must be added before the dash or star.
- A first element.
- A second element:
    - Which contains a sub-element.
    - And a second sub-element.
    - And a third one.
- A third element.
To write a paragraph inside a list, a tab or 4 spaces must be added to maintain the continuity of the list. The paragraph must also be preceded by an empty line (an empty line can also be added after the paragraph, but this is optional).
- A first element.
- A second element:

    This element contains a paragraph.

- A third element.
A first element.
A second element:
This element contains a paragraph.
A third element.
It is perfectly possible to nest an ordered list in an unordered list and vice versa.
1. A first element:
    - With a sub-element.
2. A second element.
Adding an image is done by inserting an exclamation mark (!), followed by a title in square brackets ([]), and then the path to the image in parentheses (()). A description of the image can be added in double quotes ("") after the path, still within the parentheses (this description is visible when the mouse pointer hovers over the image for a few seconds, and can be read aloud by systems designed for people with disabilities). Finally, to specify image size parameters, it is possible to add information in curly braces ({}).
{width="200px"}