# R Markdown: what, why and how?

R Markdown lets you generate a report (most often in PDF, HTML, Word or as a beamer presentation) automatically from a file written in RStudio. The generated document can serve as a neat record of your analysis that can be shared and published as a detailed and complete report. Even if you never expect to present the results to anyone else, it can also serve as a personal notebook, so you can look back and see what you did at the time. An R Markdown file has the extension .Rmd, while an R script file has the extension .R.

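An R Markdown document brings together three elements:

• R code, to show how the analysis was done: for example, the data and the functions you used. This allows readers to follow your code and check that the analysis was performed correctly.
• The results of the code, that is, the output of your analysis: for example, the output of a linear model, plots, or the results of the hypothesis test you just coded. This allows readers to see the results of your analysis.
• Text, comments and interpretations of the results: for example, after computing the main descriptive statistics and drawing some plots, you can interpret them in the context of your problem and highlight important findings. This allows readers to understand your results through your interpretations and comments, as if you had written a document explaining your work.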

Another advantage of R Markdown is that the reports are dynamic and reproducible by anyone who has access to the .Rmd file (and to the data, if external data are used), which makes it well suited to collaboration and to the dissemination of results. By dynamic, we mean that if your data change, your results, and the values quoted in your interpretations, change accordingly the next time the document is knitted, without any extra work on your side.

Rendering the report happens in two steps:

1. The .Rmd file, which contains blocks of R code (called chunks) and text, is passed to the {knitr} package, which executes the R code to get the output and creates a document in markdown (.md) format. This document then contains the R code, the results (or outputs), and the text.
2. This .md file is then converted to the desired format (HTML, PDF or Word) by pandoc, a document conversion tool that the {rmarkdown} package calls for you.
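The two steps can be sketched by hand in the R console. This is only an illustration, assuming {knitr}, {rmarkdown} and pandoc (bundled with RStudio) are available; the demo file and its content are made up for the example, and in practice a single click on Knit does all of this for you.

```r
library(knitr)
library(rmarkdown)

# Write a tiny, hypothetical .Rmd file to a temporary location.
fence <- strrep("`", 3)  # three backticks, the chunk delimiter
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c(
  "# Demo",
  "",
  paste0(fence, "{r}"),
  "mean(c(1, 2, 3))",
  fence
), rmd_file)

# Step 1: knitr runs the chunks and writes a markdown (.md) file.
md_file <- knit(rmd_file, output = sub("\\.Rmd$", ".md", rmd_file), quiet = TRUE)

# Step 2: pandoc converts the markdown file to the desired format.
html_file <- sub("\\.md$", ".html", md_file)
pandoc_convert(md_file, to = "html", output = html_file)
```

Opening the intermediate .md file shows the code, its output and the text side by side, which is exactly what pandoc then turns into the final report.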

# Before you start

To create a new R Markdown document (.Rmd), you first need to install and load the following packages:

    install.packages(c("knitr", "rmarkdown", "markdown"))

    library(knitr)
    library(rmarkdown)
    library(markdown)

Next, in RStudio, go to File > New File > R Markdown..., fill in the title and author, and click on OK. A new .Rmd file that serves as an example is then created. We are going to use this file as the starting point for our more complex and more personalized file.

To compile your R Markdown document into an HTML document, click on the Knit button located at the top of the editor.
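The same compilation can also be triggered from the R console with rmarkdown::render(). The sketch below assumes {rmarkdown} and pandoc are installed; the demo file is a made-up stand-in for your own .Rmd file.

```r
library(rmarkdown)

# Create a small, hypothetical .Rmd file with a YAML header and one chunk.
fence <- strrep("`", 3)  # three backticks, the chunk delimiter
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Demo report\"",
  "output: html_document",
  "---",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd_file)

# render() knits the file and converts it; it returns the path of the report.
html_file <- render(rmd_file, quiet = TRUE)
html_file
```

For your own report, replace rmd_file with the path to your .Rmd file.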

# Components of a .Rmd file

## YAML header

A .Rmd file starts with the YAML header, enclosed between two lines of ---. By default, it includes the title, author, date and output format of the report. If you want to generate the report as a PDF document, replace output: html_document with output: pdf_document. The information from the YAML header appears at the top of the generated report after you compile it (i.e., after knitting the document).
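As a rough illustration, a default header could look like the following. The title, author and date are placeholder values, not taken from any real report.

```yaml
---
# Placeholder values: replace them with your own.
title: "My first report"
author: "Your Name"
date: "2024-01-15"
# Change to pdf_document to produce a PDF instead.
output: html_document
---
```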

To add a table of contents to your documents, replace output: html_document by

    output:
      html_document:
        toc: true

Here are my usual settings for the format of an HTML document (remove everything after number_sections: true if you render the document in PDF, as PDF documents do not accept these options in the YAML header):

    output:
      html_document:
        toc: true
        toc_depth: 6
        number_sections: true
        toc_float: true
        code_folding: hide
        theme: flatly
        code_download: true

In addition to adding a table of contents, these settings set its depth, number the sections, make the table of contents float while you scroll down the document, hide the code by default, apply the flatly theme and add the option to download the .Rmd document.
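Following the remark about PDF output, a PDF version of these settings could be sketched as below. It simply keeps the options up to number_sections: true and swaps the output format.

```yaml
output:
  pdf_document:
    toc: true              # add a table of contents
    toc_depth: 6           # depth of the table of contents
    number_sections: true  # number the sections
```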

You can view your table of contents even before knitting the document, or jump directly to a specific section, by clicking on the small icon in the top right corner of the editor. Your table of contents appears; click on a section to go to it in your .Rmd document.

## Code chunks

In the example file, you can see that the first R code chunk (apart from the setup code chunk) calls the function summary() on the preloaded dataset cars: summary(cars). If you look at the HTML document generated from this example file, you will see that the summary measures are displayed just after the code chunk.

The next code chunk in this example file is plot(pressure), which produces a plot. Try writing other R code and knit the document (i.e., compile it by clicking on the Knit button) to check that your code runs and that its output is displayed correctly.

As you can see, the code chunk of the plot has two additional arguments compared to a bare code chunk that contains only the letter r. The first argument following the letter r (with no comma between the two) sets the name of the chunk. In general, do not bother with this; it is mainly used to refer to a specific code chunk. You can remove the name of the chunk, but do not remove the letter r between the {}, as it tells R Markdown that the code that follows is R code (yes, you read that correctly: it also means you can include code from other programming languages, e.g., Python, SQL, etc.).

After the name of the chunk (after pressure in the example file), you can see that there is an additional argument: echo = FALSE. This argument, called an option, indicates that you want to hide the code, and display only the output of the code. Try removing it (or change it to echo = TRUE), and you will see that after knitting the document, both the code AND the output will appear, while only the results appeared previously.

By default, the only setup option when you open a new R Markdown file is knitr::opts_chunk$set(echo = TRUE), meaning that all outputs are accompanied by their corresponding code. If you want to display only the results without the code for the whole document, replace it with knitr::opts_chunk$set(echo = FALSE). Two other options often passed to this setup code chunk are warning = FALSE and message = FALSE, which prevent warnings and messages from being displayed in the report. If you want to pass several options, do not forget to separate them with commas.
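A call combining the three options mentioned above might look like this. In a report it would sit inside the setup chunk; the last line is only there to check the result in the console.

```r
library(knitr)

# Several chunk options are passed in one call, separated by commas.
opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)

# Inspect the current defaults to confirm they were registered.
opts_chunk$get(c("echo", "warning", "message"))
```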

You can also choose to display the code, but not the result. For this, pass the option results = "hide". Alternatively, with the option include = FALSE, you can prevent both the code and the results from appearing in the finished file while R still runs the code, so that it can be used at a later stage. If you do not want R to run the code at all, use eval = FALSE: the code is still displayed but produces no results, so combine it with echo = FALSE if you also want to hide the code. To edit the width and height of figures, use the options fig.width and fig.height. Another very useful option is tidy = 'styler', which automatically reformats the R code shown in the output.
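One way to see what these options do without creating a full report is to knit a one-line chunk from the console and look at the markdown that comes back. The helper knit_chunk() below is a hypothetical name introduced only for this sketch; it relies on knitr::knit(text = ...).

```r
library(knitr)

# Hypothetical helper: knit a chunk containing `1 + 1` with the given
# options (written as they would appear in the chunk header) and return
# the resulting markdown as a string.
knit_chunk <- function(options) {
  fence <- strrep("`", 3)  # three backticks, the chunk delimiter
  src <- c(paste0(fence, "{r, ", options, "}"), "1 + 1", fence)
  knit(text = src, quiet = TRUE)
}

code_only   <- knit_chunk('echo = TRUE, results = "hide"')  # code, no result
output_only <- knit_chunk("echo = FALSE")                   # result, no code
not_run     <- knit_chunk("echo = TRUE, eval = FALSE")      # code shown, not run
hidden      <- knit_chunk("include = FALSE")                # run, nothing shown

cat(output_only)
```

Printing each string with cat() shows which of the code and the result made it into the output.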

Enclose your LaTeX equation with a single $ on each side to display it inline: $A = \pi*r^{2}$

Enclose your LaTeX equation with two $$ to have it centered on a new line: $$A = \pi*r^{2}$$

$$e = mc^{2}$$

• Unordered list, item 1: `* Unordered list, item 1`
• Unordered list, item 2: `* Unordered list, item 2`
1. Ordered list, item 1: `1. Ordered list, item 1`
2. Ordered list, item 2: `2. Ordered list, item 2`

### Code inside text

Before going further, I would like to introduce an important feature of R Markdown. When writing interpretations or detailing an analysis, we often want to refer to a result directly in our text. For instance, suppose we work on the iris dataset (preloaded in R). We may want to explain in words that the mean of the length of the petal is a certain value, while the median is another value.

We can insert results directly in the interpretations (i.e., in the text) by typing a backtick, the letter r, a space, the code, and then closing it with another backtick.

This works, for instance, for the mean and median of the length of the sepal in the iris dataset, integrated in a sentence.
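A sentence with inline code for the sepal length could be written as follows in the .Rmd file. The wording of the sentence and the call to round() are choices made for this sketch.

```markdown
The mean of the sepal length is `r round(mean(iris$Sepal.Length), 2)` and the median is `r median(iris$Sepal.Length)`.
```

After knitting, the inline code is replaced by its value, so the sentence reads: The mean of the sepal length is 5.84 and the median is 5.8.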

### Highlight text as if it were code

For this, surround your text with backticks (the same character used for inline code) without the letter r. Writing the following:

For example, in this sentence I would like to highlight the variable name `Species` from the dataframe `iris` as if it is a piece of code.

makes the words "Species" and "iris" appear highlighted, as if they were pieces of code.

## Images

To insert an image, use the following syntax:

![](path_to_your_image.jpg)

Note that the file or URL path is NOT quoted. To add an alt text to your image, write it between the square brackets []:

![alt text here](path_to_your_image.jpg)

## Tables

Two functions can be used to display tables more neatly:

1. the kable() function from the {knitr} package
2. the pander() function from the {pander} package

Here is an example of a table without any formatting, followed by the same code with each of the two functions applied, on the iris dataset:

    # without formatting
    summary(iris)
    ##   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width
    ##  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100
    ##  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300
    ##  Median :5.800   Median :3.000   Median :4.350   Median :1.300
    ##  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
    ##  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
    ##  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
    ##        Species
    ##  setosa    :50
    ##  versicolor:50
    ##  virginica :50
    ##
    ##
    ##

    # with kable()
    library(knitr)
    kable(summary(iris))

    # with pander()
    library(pander)
    pander(summary(iris))

The advantage of pander() over kable() is that it can be used for many more types of output than tables. Try it on your own code, with the results of a linear regression or a simple vector for example.
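For comparison, kable() expects a table-like object such as a data frame or a matrix, so a regression has to be reduced to its coefficient matrix first. The model below is an arbitrary example on iris, chosen only for this sketch.

```r
library(knitr)

# Arbitrary illustrative model: sepal length explained by petal length.
fit <- lm(Sepal.Length ~ Petal.Length, data = iris)

# coef(summary(fit)) is a matrix of estimates, standard errors,
# t values and p-values, which kable() can format as a table.
coef_table <- kable(coef(summary(fit)), digits = 3)
coef_table
```

With pander(), the fitted model fit could be passed directly instead of the coefficient matrix.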