Pandoc Study Notes

ZhuYuanxiang 2020-05-01 00:00:00

Q&A

Installation

Output Chinese Document

PDF

pandoc test.md -o test.pdf --pdf-engine=xelatex -V mainfont=SimSun

Beamer

pandoc test.md -o test.pdf -t beamer --pdf-engine=xelatex -V mainfont=SimSun

DOCX

pandoc test.md -o test.docx

PPTX

pandoc test.md -o test.pptx

HTML

pandoc test.md -o test.html

LaTeX

pandoc test.md -o test.tex

Standalone Document

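When generating HTML or LaTeX, you will find that the output contains only the document body, without a complete header and footer.
To generate a complete document, add -s to the command.

Note: switching this option on and off as needed makes it easier to debug the generated document code.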
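The difference between fragment and standalone output can be checked with a small sketch like the one below. The sample file contents, the output names fragment.html and full.html, and the title passed through --metadata are illustrative assumptions, not part of the original notes.

```shell
# Hypothetical sample input; any Markdown file works.
printf '# Hello\n\nSome text.\n' > test.md

if command -v pandoc >/dev/null 2>&1; then
  # Fragment: only the body content is written.
  pandoc test.md -o fragment.html
  # Standalone: -s wraps the body in a complete HTML document.
  pandoc -s --metadata title=Demo test.md -o full.html
  # Count the lines containing an opening <html tag in each file.
  grep -c '<html' fragment.html full.html
else
  echo "pandoc not found; skipping conversion" >&2
fi
```

The fragment form is convenient when only the converted body needs to be inspected; the standalone form is the one to open in a browser or compile.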

Fetching from the Web

pandoc -f html -t markdown -o web.md --request-header=User-Agent:"Mozilla/5.0" https://zhuyuanxiang.github.io/

Code Highlight

pandoc test.md -o test-yahei.pdf --pdf-engine=xelatex -V mainfont="Microsoft YaHei" --highlight-style=breezeDark

Templates

LaTeX Template

pandoc test.md -o test.pdf --pdf-engine=xelatex -V mainfont=SimSun --template=template/zYxTom.tex

zYxTom.tex is the template file.

DOCX Reference

pandoc test.md -o test.docx --reference-doc=template/zYxTom.docx

zYxTom.docx is the document that supplies the formatting.

PPTX Reference

pandoc test.md -o test.pptx --reference-doc=template/zYxTom.pptx

zYxTom.pptx is the document that supplies the formatting.

Default Template

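When no template is supplied, pandoc uses its built-in default template. To consult a default template, print it with the -D FORMAT option (long form: --print-default-template=FORMAT).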
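As a minimal sketch, the built-in templates and reference documents can be dumped to files and used as a starting point for customization. The output file names below are arbitrary choices for illustration.

```shell
if command -v pandoc >/dev/null 2>&1; then
  # -D FORMAT writes the built-in template for that output format to stdout.
  pandoc -D latex > default.latex
  pandoc -D html > default.html
  # DOCX has no text template; the default reference document
  # can be exported in the same spirit.
  pandoc -o custom-reference.docx --print-default-data-file reference.docx
else
  echo "pandoc not found; skipping" >&2
fi
```

An edited copy of default.latex can then be passed back with --template, and an edited custom-reference.docx with --reference-doc.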

Metadata blocks

Extension: pandoc_title_block

A file may begin with a title block, which is parsed as bibliographic information.

% title
% author(s) (separated by semicolons)
% date

Any field you do not want to fill in can be left blank. See the translated documentation for details.
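A small sketch of a title block with the author line left blank is shown below. The file name titled.md and its contents are made up for illustration.

```shell
# Write a sample document whose title block omits the author:
# the second line keeps the leading % but is otherwise empty.
cat > titled.md <<'EOF'
% Pandoc Notes
%
% 2020-05-01

Body text starts after a blank line.
EOF

if command -v pandoc >/dev/null 2>&1; then
  # -s is needed for the title and date to appear in the output.
  pandoc -s titled.md -o titled.html
else
  echo "pandoc not found; skipping conversion" >&2
fi
```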

Extension: yaml_metadata_block

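A YAML metadata block is a valid YAML object, delimited by a line of three hyphens (---) at the top and a line of three hyphens (---) or three periods (...) at the bottom. A YAML metadata block may appear anywhere in the document, but if it is not at the very beginning, it must be preceded by a blank line. (Note: because pandoc concatenates its input files, the metadata block can also be kept in a separate YAML file, which is passed to pandoc as an argument after all the Markdown files.)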

pandoc chap1.md chap2.md chap3.md metadata.yaml -s -o book.html
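The following sketch shows what such a separate metadata file might contain. The title and the single chapter file are placeholders; because the file is read as one more Markdown input, it still needs the --- and ... delimiters.

```shell
# A stand-alone metadata file; delimiters are required because pandoc
# parses it as Markdown input, not as raw YAML.
cat > metadata.yaml <<'EOF'
---
title: Pandoc Notes
author:
- ZhuYuanxiang
date: 2020-05-01
...
EOF

# Placeholder chapter so the command has something to convert.
printf '# Chapter 1\n\nText.\n' > chap1.md

if command -v pandoc >/dev/null 2>&1; then
  pandoc chap1.md metadata.yaml -s -o book.html
else
  echo "pandoc not found; skipping conversion" >&2
fi
```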

Default files

The --defaults option specifies a bundle of options: the various options and variables are all defined in a single file.

pandoc --defaults=test.yaml

The contents of the file look like the following; the way it is used in PRML is another reference.

from: markdown+emoji
# reader: may be used instead of from:
to: html5
# writer: may be used instead of to:

# leave blank for output to stdout:
output-file:
# leave blank for input from stdin, use [] for no input:
input-files:
- preface.md
- content.md
# or you may use input-file: with a single value

template: letter
standalone: true
self-contained: false

# note that structured variables may be specified:
variables:
  documentclass: book
  classoption:
    - twosides
    - draft

# metadata values specified here are parsed as literal
# string text, not markdown:
metadata:
  author:
  - Sam Smith
  - Julie Liu
metadata-files:
- boilerplate.yaml
# or you may use metadata-file: with a single value

# Note that these take files, not their contents:
include-before-body: []
include-after-body: []
include-in-header: []
resource-path: [.]

# filters will be assumed to be Lua filters if they have
# the .lua extension, and json filters otherwise. But
# the filter type can also be specified explicitly, as shown:
filters:
- pandoc-citeproc
- wordcount.lua
- type: json
  path: foo.lua

file-scope: false

data-dir:

# ERROR, WARNING, or INFO
verbosity: INFO
log-file: log.json

# citeproc, natbib, or biblatex
cite-method: citeproc
# part, chapter, section, or default:
top-level-division: chapter
abbreviations:

pdf-engine: pdflatex
pdf-engine-opts:
- -shell-escape
# you may also use pdf-engine-opt: with a single option
# pdf-engine-opt: "-shell-escape"

# auto, preserve, or none
wrap: auto
columns: 78
dpi: 72

extract-media: mediadir

table-of-contents: true
toc-depth: 2
number-sections: false
# a list of offsets at each heading level
number-offset: [0, 0, 0, 0, 0, 0]
# toc: may also be used instead of table-of-contents:
shift-heading-level-by: 1
section-divs: true
identifier-prefix: foo
title-prefix: ''
strip-empty-paragraphs: true
# lf, crlf, or native
eol: lf
strip-comments: false
indented-code-classes: []
ascii: true
default-image-extension: .jpg

# either a style name or a style definition file:
highlight-style: pygments
syntax-definitions:
- c.xml
# or you may use syntax-definition: with a single value
listings: false

reference-doc: myref.docx

# method is plain, webtex, gladtex, mathml, mathjax, katex
# you may specify a url with webtex, mathjax, katex
html-math-method:
  method: mathjax
  url: https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js
# none, references, or javascript
email-obfuscation: javascript

tab-stop: 8
preserve-tabs: true

incremental: false
slide-level: 2

epub-subdirectory: EPUB
epub-metadata: meta.xml
epub-fonts:
- foobar.otf
epub-chapter-level: 1
epub-cover-image: cover.jpg

reference-links: true
# block, section, or document
reference-location: block
atx-headers: false

# accept, reject, or all
track-changes: accept

html-q-tags: false
css:
- site.css

# none, all, or best
ipynb-output: best

# A list of two-element lists
request-headers:
- [User-Agent, Mozilla/5.0]

fail-if-warnings: false
dump-args: false
ignore-args: false
trace: false
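
Most of the keys above are optional. As a rough sketch, a defaults file equivalent to the Chinese PDF command used earlier in these notes could be as short as the one below. The file name pdf-zh.yaml is an assumption, and the conversion step only runs if pandoc and test.md are present.

```shell
# Minimal defaults file mirroring:
#   pandoc test.md -o test.pdf --pdf-engine=xelatex -V mainfont=SimSun
cat > pdf-zh.yaml <<'EOF'
from: markdown
input-files:
- test.md
output-file: test.pdf
pdf-engine: xelatex
variables:
  mainfont: SimSun
EOF

if command -v pandoc >/dev/null 2>&1 && [ -f test.md ]; then
  # Fails if xelatex or the SimSun font is not installed.
  pandoc --defaults=pdf-zh.yaml || echo "conversion failed" >&2
fi
```

Options given on the command line can still be combined with --defaults, so a shared file like this can hold the settings common to every document.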

Debug

Debug Output

pandoc test.md -o test.pdf --pdf-engine=xelatex -V mainfont=SimSun --verbose

Gives verbose debugging output. This only takes effect for PDF output.

Quiet Mode

pandoc test.md -o test.pdf --pdf-engine=xelatex -V mainfont=SimSun --verbose --quiet

Quiet mode. Suppresses all messages.

Note: even when debug output is turned on, no messages are produced.

Fail if Warnings

pandoc test.md -o test.pdf --pdf-engine=xelatex -V mainfont=SimSun --fail-if-warnings

Exits the conversion with an error status as soon as a warning is encountered.
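In a script, the exit status is what makes this option useful. The sketch below assumes a hypothetical input file warn.md and uses standalone HTML without a title, which may trigger a warning depending on the pandoc version.

```shell
# Hypothetical input with no title metadata.
printf '# Hello\n' > warn.md

if command -v pandoc >/dev/null 2>&1; then
  if pandoc -s warn.md -o warn.html --fail-if-warnings; then
    echo "converted without warnings"
  else
    # $? here still holds the exit status of the pandoc command above.
    echo "pandoc exited with status $?" >&2
  fi
else
  echo "pandoc not found; skipping conversion" >&2
fi
```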

Log File

pandoc test.md -o test.pdf --pdf-engine=xelatex -V mainfont=SimSun --log=log.json

Note: generating a PDF can run into many unforeseen problems, so saving a log makes debugging easier.

References

pandoc provides three methods: