| author | John MacFarlane <jgm@berkeley.edu> | 2022-12-28 17:16:30 -0800 |
|---|---|---|
| committer | John MacFarlane <jgm@berkeley.edu> | 2022-12-28 17:16:30 -0800 |
| commit | ce7d1d1c2029d7a248c1a84958d464dd45f332a2 (patch) | |
| tree | 60038da09005f02a3dd4dc74d36faded729f4c49 | |
| parent | 6c96340bf63df36c91d11d405af96da8b736eb56 (diff) | |
Man writer: use UTF-8 by default for non-ascii characters.
Only use groff escapes if `--ascii` has been specified on the
command line (`writerPreferAscii`).
Closes #8507.
| -rw-r--r-- | MANUAL.txt | 4 |
| -rw-r--r-- | src/Text/Pandoc/Writers/Man.hs | 4 |
| -rw-r--r-- | test/writer.man | 18 |
3 files changed, 14 insertions, 12 deletions
diff --git a/MANUAL.txt b/MANUAL.txt
index c8d949364..330faff37 100644
--- a/MANUAL.txt
+++ b/MANUAL.txt
@@ -990,9 +990,9 @@ header when requesting a document from a URL:
 :   Use only ASCII characters in output. Currently supported for XML
     and HTML formats (which use entities instead of UTF-8 when this
     option is selected), CommonMark, gfm, and Markdown (which use
-    entities), roff ms (which use hexadecimal escapes), and to a
+    entities), roff man and ms (which use hexadecimal escapes), and to a
     limited degree LaTeX (which uses standard commands for accented
-    characters when possible). roff man output uses ASCII by default.
+    characters when possible).
 
 `--reference-links`
 
diff --git a/src/Text/Pandoc/Writers/Man.hs b/src/Text/Pandoc/Writers/Man.hs
index 4e1651e53..859378dce 100644
--- a/src/Text/Pandoc/Writers/Man.hs
+++ b/src/Text/Pandoc/Writers/Man.hs
@@ -83,7 +83,9 @@ pandocToMan opts (Pandoc meta blocks) = do
        Just tpl -> renderTemplate tpl context
 
 escString :: WriterOptions -> Text -> Text
-escString _ = escapeString AsciiOnly -- for better portability
+escString opts = escapeString (if writerPreferAscii opts
+                                  then AsciiOnly
+                                  else AllowUTF8)
 
 -- | Return man representation of notes.
 notesToMan :: PandocMonad m => WriterOptions -> [[Block]] -> StateT WriterState m (Doc Text)
diff --git a/test/writer.man b/test/writer.man
index 752852322..c476c35aa 100644
--- a/test/writer.man
+++ b/test/writer.man
@@ -541,11 +541,11 @@ Ellipses\&...and\&...and\&....
 .SH LaTeX
 .IP \[bu] 2
 .IP \[bu] 2
-2\[u2005]+\[u2005]2\[u2004]=\[u2004]4
+2 + 2 = 4
 .IP \[bu] 2
-\f[I]x\f[R]\[u2004]\[mo]\[u2004]\f[I]y\f[R]
+\f[I]x\f[R] ∈ \f[I]y\f[R]
 .IP \[bu] 2
-\f[I]\[*a]\f[R]\[u2005]\[AN]\[u2005]\f[I]\[*w]\f[R]
+\f[I]α\f[R] ∧ \f[I]ω\f[R]
 .IP \[bu] 2
 223
 .IP \[bu] 2
@@ -557,7 +557,7 @@ $$\[rs]frac{d}{dx}f(x)=\[rs]lim_{h\[rs]to 0}\[rs]frac{f(x+h)-f(x)}{h}$$
 .RE
 .IP \[bu] 2
 Here\[cq]s one that has a line break in it:
-\f[I]\[*a]\f[R]\[u2005]+\[u2005]\f[I]\[*w]\f[R]\[u2005]\[tmu]\[u2005]\f[I]x\f[R]^2^.
+\f[I]α\f[R] + \f[I]ω\f[R] × \f[I]x\f[R]^2^.
 .PP
 These shouldn\[cq]t be math:
 .IP \[bu] 2
@@ -578,15 +578,15 @@ Here\[cq]s a LaTeX table:
 .PP
 Here is some unicode:
 .IP \[bu] 2
-I hat: \[^I]
+I hat: Î
 .IP \[bu] 2
-o umlaut: \[:o]
+o umlaut: ö
 .IP \[bu] 2
-section: \[sc]
+section: §
 .IP \[bu] 2
-set membership: \[mo]
+set membership: ∈
 .IP \[bu] 2
-copyright: \[co]
+copyright: ©
 .PP
 AT&T has an ampersand in their name.
 .PP
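
The behavior change in `escString` can be illustrated with a minimal, self-contained sketch. It is not pandoc's implementation: `Opts`, `preferAscii`, `namedEscapes`, `escapeChar`, and `hex4` are hypothetical names invented here, the escape table holds only the five characters from the "Here is some unicode" test lines, and the escaping pandoc still applies in both modes (for example `\[cq]` and `\[rs]` in the unchanged test lines) is not modeled. Only the mode names `AsciiOnly` and `AllowUTF8` come from the diff.

```haskell
module ManEscapeSketch where

import Data.Char (isAscii, ord, toUpper)
import Numeric (showHex)

-- | The two modes named in the diff; pandoc defines its own type for this.
data EscapeMode = AsciiOnly | AllowUTF8
  deriving (Eq, Show)

-- | Hypothetical stand-in for pandoc's WriterOptions, reduced to the one
-- field that matters here (the counterpart of 'writerPreferAscii').
newtype Opts = Opts { preferAscii :: Bool }

-- | A few named groff escapes, copied from the removed test lines above.
-- The real table is much larger; this subset is an assumption of the sketch.
namedEscapes :: [(Char, String)]
namedEscapes =
  [ ('Î', "\\[^I]")
  , ('ö', "\\[:o]")
  , ('§', "\\[sc]")
  , ('∈', "\\[mo]")
  , ('©', "\\[co]")
  ]

-- | Code point as uppercase hex, zero-padded to at least four digits.
hex4 :: Char -> String
hex4 c = replicate (4 - length h) '0' ++ h
  where h = map toUpper (showHex (ord c) "")

-- | Escape one character according to the chosen mode.
escapeChar :: EscapeMode -> Char -> String
escapeChar AllowUTF8 c = [c]                 -- pass non-ASCII through unchanged
escapeChar AsciiOnly c
  | isAscii c = [c]
  | otherwise = case lookup c namedEscapes of
      Just e  -> e                           -- named escape, e.g. \[sc]
      Nothing -> "\\[u" ++ hex4 c ++ "]"     -- fallback, e.g. \[u2005]

-- | Same shape as the patched function: the mode is chosen from the options
-- instead of being hard-coded to AsciiOnly.
escString :: Opts -> String -> String
escString opts = concatMap (escapeChar mode)
  where mode = if preferAscii opts then AsciiOnly else AllowUTF8
```

Loaded in GHCi, `escString (Opts False) "section: §"` leaves the string unchanged, while `escString (Opts True) "section: §"` yields the text `section: \[sc]`, which matches the before and after lines in `test/writer.man`.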
