Converting Markdown to Word

1. Download and install pandoc

Choose the installer that matches your system and install it. Version 3.8 or later is recommended: https://github.com/jgm/pandoc/releases

After installation, check the version number:

pandoc --version

pandoc 3.8: converts both \( ... \) and $...$ math
pandoc 1.9: converts only $...$ math
Note: there must be no space between the $ delimiters and the formula
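
The version check above can also be scripted. The following is a minimal Python sketch, not part of the original guide: the helper names `parse_pandoc_version`, `meets_minimum` and `installed_pandoc_version` are hypothetical, and it assumes the first line of the `pandoc --version` output contains the version number (for example `pandoc 3.8`).

```python
from __future__ import annotations

import re
import subprocess

MIN_VERSION = (3, 8)  # minimum version recommended in this guide


def parse_pandoc_version(version_output: str) -> tuple[int, ...]:
    """Extract the numeric version from the first line of `pandoc --version`."""
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.search(r"\d+(?:\.\d+)*", first_line)
    if match is None:
        raise ValueError(f"no version number found in: {first_line!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version: tuple[int, ...],
                  minimum: tuple[int, ...] = MIN_VERSION) -> bool:
    """Compare numerically, so that 3.10 counts as newer than 3.8."""
    return version >= minimum


def installed_pandoc_version() -> tuple[int, ...]:
    """Run `pandoc --version` (pandoc must be on PATH) and parse the result."""
    result = subprocess.run(["pandoc", "--version"],
                            capture_output=True, text=True, check=True)
    return parse_pandoc_version(result.stdout)
```

Comparing tuples of integers rather than strings avoids the trap where "3.10" sorts before "3.8" as text.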

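Tip:
For compatibility, pandoc 3.8 or later is required.
If the reported version is not what you expect, an older copy of pandoc is probably already installed on the computer. Run the following command to find where pandoc is located, then uninstall the old copy manually: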

where pandoc
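
As a rough cross-platform counterpart to `where pandoc`, the sketch below lists every pandoc executable found in the PATH directories, in search order, so that a stale copy shadowing the new one is easy to spot. The function name `find_on_path` is hypothetical, and the lookup is simplified: unlike `where`, it does not consult PATHEXT or the current directory.

```python
from __future__ import annotations

import os
from pathlib import Path


def find_on_path(name: str, path_value: str,
                 extensions: tuple[str, ...] = ("", ".exe")) -> list[str]:
    """Return every file named `name` (with each extension) in the PATH directories."""
    found: list[str] = []
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        for ext in extensions:
            candidate = Path(directory) / (name + ext)
            if candidate.is_file() and str(candidate) not in found:
                found.append(str(candidate))
    return found
```

Calling `find_on_path("pandoc", os.environ.get("PATH", ""))` returns the matches in the order the shell would search them; the first entry is the copy that actually runs.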

2. Convert a Markdown file

Open the folder that contains the Markdown file, type CMD in the Explorer address bar to open a command prompt window there, and enter:

pandoc example.md -f markdown+tex_math_single_backslash -o example.docx

Input: example.md
Output: example.docx

Tip: change the file names in the command to match your own files
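
When many files need converting, the same command can be assembled programmatically. This is a minimal sketch under stated assumptions: `build_docx_command` and `convert_to_docx` are hypothetical helper names, the output file is assumed to sit next to the input with a .docx extension, and pandoc is assumed to be on PATH. The pandoc arguments themselves are the ones shown in the command above.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def build_docx_command(input_file: str) -> list[str]:
    """Build the pandoc argument list for a Markdown -> docx conversion."""
    source = Path(input_file)
    output = source.with_suffix(".docx")  # example.md -> example.docx
    return [
        "pandoc", str(source),
        "-f", "markdown+tex_math_single_backslash",  # accept \( ... \) math as well
        "-o", str(output),
    ]


def convert_to_docx(input_file: str) -> None:
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_docx_command(input_file), check=True)
```

Passing the arguments as a list rather than one shell string means file names with spaces need no extra quoting.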

3. MD file converter (bat program):

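Create a new text file, for example md-converter.txt, paste in the code below, then rename it to md-converter.bat to get an easy-to-use bat program. Run it from the folder that contains the Markdown files, because it only looks at *.md files in the current directory: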

@echo off
setlocal enabledelayedexpansion

echo ========================================
echo Pandoc Document Converter
echo ========================================
echo.

:: ========================================
:: Strip ASCII parentheses from file names
:: ========================================
echo Cleaning up filenames...
for %%i in (*.md) do (
    set "oldname=%%i"
    set "newname=!oldname:(=!"
    set "newname=!newname:)=!"
    if not "!oldname!"=="!newname!" (
        echo Rename: "!oldname!" -^> "!newname!"
        ren "!oldname!" "!newname!" 2>nul
    )
)
echo.

:: ========================================
:: List the files after cleanup
:: ========================================
echo Markdown files in current directory:
echo ----------------------------------------
set file_count=0
for %%i in (*.md) do (
    set /a file_count+=1
    echo !file_count!. %%i
)
echo ----------------------------------------

if %file_count% equ 0 (
    echo [ERROR] No .md file found!
    pause
    exit /b 1
)

:: ========================================
:: Select a file
:: ========================================
:: Clear any inherited value: set /p leaves the variable unchanged on a bare Enter
set "INPUT_FILE="
set /p INPUT_FILE="Enter file name (number or full name, Enter for first one): "

if "%INPUT_FILE%"=="" (
    set file_num=1
    for %%i in (*.md) do (
        if !file_num! equ 1 set "INPUT_FILE=%%i"
        set /a file_num+=1
    )
    echo Auto-selected: !INPUT_FILE!
) else (
    echo %INPUT_FILE%|findstr /r "^[0-9]*$" >nul
    if not errorlevel 1 (
        set file_num=1
        for %%i in (*.md) do (
            if !file_num! equ %INPUT_FILE% set "INPUT_FILE=%%i"
            set /a file_num+=1
        )
        echo Selected: !INPUT_FILE!
    )
)

:: Stop if the selection is not an existing file (a wrong name or a number out of range)
if not exist "%INPUT_FILE%" (
    echo [ERROR] File "%INPUT_FILE%" not found!
    pause
    exit /b 1
)

:: Get the file name without its extension, taken from the selected file
:: (not from %1, which would ignore the selection made above)
for %%i in ("%INPUT_FILE%") do set "BASENAME=%%~ni"

:: ========================================
:: Choose the output format
:: ========================================
echo.
echo ========================================
echo Select Output Format
echo ========================================
echo [1] docx - Word document
echo [2] html - Web page
echo ========================================
echo [Enter] default: docx
echo ========================================
echo.

set /p FORMAT_CHOICE="Enter option [1/2] (Enter for docx): "

:: ========================================
:: Run the conversion (through PowerShell to avoid problems with parentheses)
:: ========================================
if "%FORMAT_CHOICE%"=="2" (
    echo.
    echo Converting to HTML...
    echo Input: %INPUT_FILE%
    echo Output: %BASENAME%.html
    echo.
    powershell -Command "& { pandoc '%INPUT_FILE%' -o '%BASENAME%.html' --standalone }"
) else (
    echo.
    echo Converting to DOCX...
    echo Input: %INPUT_FILE%
    echo Output: %BASENAME%.docx
    echo.
    powershell -Command "& { pandoc '%INPUT_FILE%' -f markdown+tex_math_single_backslash -o '%BASENAME%.docx' }"
)

if errorlevel 1 (
    echo.
    echo [ERROR] Conversion failed!
) else (
    set OUTPUT_EXT=docx
    if "%FORMAT_CHOICE%"=="2" set OUTPUT_EXT=html
    echo.
    rem OUTPUT_EXT is set inside this block, so it needs delayed expansion
    echo [SUCCESS] Output: %BASENAME%.!OUTPUT_EXT!
)

echo.
pause
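
The two pieces of logic in the script that are easiest to get wrong are the file-name cleanup and the interpretation of the user's answer (empty, a list number, or a full name). The sketch below restates them in Python so the rules can be read and tested in isolation. The names `strip_parentheses` and `resolve_selection` are hypothetical, and the sketch is stricter than the batch script in one respect: a full name must appear in the listing, whereas the script accepts any existing path.

```python
from __future__ import annotations


def strip_parentheses(name: str) -> str:
    """Remove ASCII parentheses from a file name, as the cleanup loop does."""
    return name.replace("(", "").replace(")", "")


def resolve_selection(files: list[str], answer: str) -> str:
    """Map the user's answer to a file name.

    files  -- the .md files in the order they were listed (numbered from 1)
    answer -- raw input: empty, a list number, or a full file name
    """
    if not files:
        raise ValueError("no .md file found")
    answer = answer.strip()
    if answer == "":
        return files[0]  # Enter selects the first file
    if answer.isascii() and answer.isdigit():
        index = int(answer)
        if 1 <= index <= len(files):
            return files[index - 1]
        raise ValueError(f"number out of range: {index}")
    if answer in files:
        return answer
    raise ValueError(f"file not found: {answer}")
```

Rejecting an out-of-range number explicitly gives a clear error at selection time instead of a failed conversion later.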

4. Post-processing in Word:

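The converted Word document has a few problems:

  1. Heading text is blue by default
  2. Text uses the DengXian (等线) font by default
  3. Tables have no borders. With many tables, adding the borders by hand is tedious
  4. Body text has no first-line indent. Setting the indent directly on the body style also indents the text inside tables, which then has to be fixed by hand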

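Open Word, press Alt+F11 to open the macro (VBA) editor, choose Insert -> Module, paste the following code into the editor window that appears, then press F5 to run it: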

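Sub 文档格式统一处理_兼容版()
    ' What this macro does:
    ' 1. Sets all text to black
    ' 2. Sets the font for all text: 宋体 (SimSun) for Chinese, Times New Roman for Latin text
    ' 3. Gives every table full borders (no diagonal lines), compatible version
    ' 4. Adds a first-line indent to body text (headings and tables excluded)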

    Dim tbl As Table
    Dim para As Paragraph
    Dim doc As Document
    Dim rng As Range

    Set doc = ActiveDocument

    ' Turn off screen updating while the macro runs
    Application.ScreenUpdating = False

    ' ==================== 1. Set all text to black ====================
    Set rng = doc.Range
    rng.Font.ColorIndex = wdBlack

    ' ==================== 2. Fonts ====================
    Set rng = doc.Range
    With rng.Font
        .NameFarEast = "宋体"
        .NameAscii = "Times New Roman"
        .NameOther = "Times New Roman"
    End With

    ' ==================== 3. Full table borders (compatible version) ====================
    For Each tbl In doc.Tables
        ' Set each border individually to avoid relying on constants that may not exist
        With tbl
            .Borders(wdBorderLeft).LineStyle = wdLineStyleSingle
            .Borders(wdBorderRight).LineStyle = wdLineStyleSingle
            .Borders(wdBorderTop).LineStyle = wdLineStyleSingle
            .Borders(wdBorderBottom).LineStyle = wdLineStyleSingle
            .Borders(wdBorderHorizontal).LineStyle = wdLineStyleSingle
            .Borders(wdBorderVertical).LineStyle = wdLineStyleSingle
            ' Turn off diagonal lines
            .Borders(wdBorderDiagonalDown).LineStyle = wdLineStyleNone
            .Borders(wdBorderDiagonalUp).LineStyle = wdLineStyleNone
        End With
    Next tbl

    ' ==================== 4. First-line indent for body text ====================
    For Each para In doc.Paragraphs
        If Not para.Range.Information(wdWithInTable) Then
            Select Case para.Style.NameLocal
                Case "标题 1", "标题 2", "标题 3", "标题 4", "标题 5", _
                     "标题 6", "标题 7", "标题 8", "标题 9", _
                     "Heading 1", "Heading 2", "Heading 3", "Heading 4", "Heading 5", _
                     "Heading 6", "Heading 7", "Heading 8", "Heading 9", _
                     "标题", "Heading", "Subtitle", "副标题", "Title"
                    ' Skip headings and titles
                Case Else
                    ' Indent by two character widths
                    para.Range.ParagraphFormat.CharacterUnitFirstLineIndent = 2
            End Select
        End If
    Next para

    ' Turn screen updating back on
    Application.ScreenUpdating = True

    MsgBox "Formatting complete! (compatible version)", vbInformation, "Done"
End Sub
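
The rule in step 4 of the macro (indent everything except headings, titles and table content) can be stated compactly. The sketch below is only an illustration of that decision in Python, not a replacement for the macro: `should_indent` and `SKIPPED_STYLES` are hypothetical names, and the style list simply mirrors the names in the macro's `Select Case`.

```python
from __future__ import annotations

# Style names the macro skips: "标题"/"Heading" with levels 1-9, plus the title styles.
SKIPPED_STYLES = (
    {f"{prefix} {level}" for prefix in ("标题", "Heading") for level in range(1, 10)}
    | {"标题", "Heading", "Title", "Subtitle", "副标题"}
)


def should_indent(style_name: str, in_table: bool) -> bool:
    """Return True when a paragraph should get the two-character first-line indent."""
    if in_table:
        return False  # table text is never indented
    return style_name not in SKIPPED_STYLES
```

Because the check is an exact match on the style name, any custom heading style that is not in the list is treated as body text and indented, which is also how the macro behaves.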