1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
|
= Optimize the PDF
:url-hexapdf: https://hexapdf.gettalong.org/
By default, Asciidoctor PDF does not optimize the PDF it generates or compress its streams.
This page covers several approaches you can take to optimize your PDF.
IMPORTANT: If you're creating a PDF for Amazon's Kindle Direct Publishing (KDP), GitLab repository preview, or other online publishers, you'll likely need to optimize the file before uploading.
In their words, you must tidy up the reference tree and https://kdp.amazon.com/en_US/help/topic/G201953020#check[flatten all transparencies^] (mostly likely referring to images).
If you don't do this step, the platform may reject your upload or fail to display it properly.
(For KDP, `-a optimize` works best.
For GitLab repository preview, either `-a optimize` or `hexapdf optimize` will do the trick.)
== Enable stream compression
A simple way to reduce the size of the PDF file is to enable stream compression (using the FlateDecode method).
You can enable this feature by setting the `compress` attribute on the document:
$ asciidoctor-pdf -a compress filename.adoc
For a more thorough optimization, you can use the <<rghost,integrated optimizer>> or <<hexapdf>>.
[#rghost]
== rghost
Asciidoctor PDF also provides a flag (and bin script) that uses GhostScript (via rghost) to optimize and compress the generated PDF with minimal impact on its quality.
You must have Ghostscript (command: `gs`) and the `rghost` gem installed to use it.
=== Install rghost
To install the rghost gem, open a terminal and type the following command.
$ gem install rghost
=== Convert and optimize
Here's an example usage that converts your document and optimizes it:
$ asciidoctor-pdf -a optimize filename.adoc
The command will generate an optimized PDF 1.4 file.
In addition to optimizing the PDF file, it also converts it from plain PDF to a PDF/X-1a.
If this command fails because the `gs` command cannot be found, you'll need to set it using the `GS` environment variable.
On Windows, this step is almost always required since the Ghostscript installer does not install the `gs` command into a standard location.
Here's an example that shows how you can override the `gs` command path:
$ GS=/path/to/gs asciidoctor-pdf -a optimize filename.adoc
You'll need to use the technique for assigning an environment variable that's relevant for your system.
The one limitation of generating a PDF/X-1a file is that it does not allow non-ASCII characters in the document metadata fields (i.e., title, author, subject, etc.).
To work around this limitation, you can force Ghostscript to generate a PDF 1.3 file using the `pdf-version` attribute:
$ asciidoctor-pdf -a optimize -a pdf-version=1.3 filename.adoc
CAUTION: Downgrading the PDF version may break the PDF if it contains an image that uses color blending or transparency.
Specifically, the text on the page can become rasterized, which causes links to stop working and prevents text from being selected.
If you're in this situation, it might be best to try <<hexapdf>> instead.
If you're looking for a smaller file size, you can try reducing the quality of the output file by passing a quality keyword to the `optimize` attribute (e.g., `--optimize=screen`).
The `optimize` attribute accepts the following keywords: `default` (default, same if value is empty), `screen`, `ebook`, `printer`, and `prepress`.
Refer to the https://www.ghostscript.com/doc/current/VectorDevices.htm#PSPDF_IN[Ghostscript documentation^] to learn what settings these presets affect.
If you've already generated the PDF, and want to optimize it directly, you can use the bin script:
$ asciidoctor-pdf-optimize filename.pdf
The command will overwrite the PDF file with an optimized version.
You can also try reducing the quality of the output file using the `--quality` flag (e.g., `--quality screen`).
The `--quality` flag accepts the following keywords: `default` (default), `screen`, `ebook`, `printer`, and `prepress`.
In both cases, if a file is found with the extension `.pdfmark` and the same rootname as the input file, it will be used to add metadata to the generated PDF document.
This file is necessary when using versions of Ghostscript < 8.54, which did not automatically preserve this metadata.
You can instruct the converter to automatically generate a pdfmark file by setting the `pdfmark` attribute (i.e., `-a pdfmark`)
When using a more recent version of Ghostscript, you do not need to generate a `.pdfmark` file for this purpose.
IMPORTANT: The `asciidoctor-pdf-optimize` is not guaranteed to reduce the size of the PDF file.
It may actually make the PDF larger.
You should probably only consider using it if the file size of the original PDF is several megabytes.
If you have difficulty getting the `rghost` gem installed, or you aren't getting the results you expect, you can try the optimizer provided by hexapdf instead.
[#hexapdf]
== hexapdf
Another option to optimize the PDF is {url-hexapdf}[hexapdf^] (gem: hexapdf, command: hexapdf).
Before introducing it, though, it's important to point out that its license is AGPL.
If that's okay with you, read on to learn how to use it.
=== Install hexapdf
First, install the hexapdf gem using the following command:
$ gem install hexapdf
=== Compress a PDF
You can then use it to optimize your PDF as follows:
$ hexapdf optimize --compress-pages --force filename.pdf filename.pdf
This command does not manipulate the images in any way.
It merely compresses the objects in the PDF and prunes any unreachable references.
But given how much waste Prawn leaves behind, this turns out to reduce the file size substantially.
You can hook this command directly into the converter by providing your own implementation of the `Optimizer` class.
Start by creating a Ruby file named [.path]_optimizer-hexapdf.rb_, then populate it with the following code:
.optimizer-hexapdf.rb
[source,ruby]
----
require 'hexapdf/cli'
class Asciidoctor::PDF::Optimizer
def initialize(*)
app = HexaPDF::CLI::Application.new
app.instance_variable_set :@force, true
@optimize = app.main_command.commands['optimize']
end
def optimize_file path
options = @optimize.instance_variable_get :@out_options
options.compress_pages = true
#options.object_streams = :preserve
#options.xref_streams = :preserve
#options.streams = :preserve # or :uncompress
@optimize.execute path, path
nil
rescue
# retry without page compression, which can sometimes fail
options.compress_pages = false
@optimize.execute path, path
nil
end
end
----
To activate your custom optimizer, load this file when invoking the `asciidoctor-pdf` using the `-r` flag and set the `optimize` attribute as well using the `-a` flag.
$ asciidoctor-pdf -r ./optimizer-hexapdf.rb -a optimize filename.adoc
Now you can convert and optimize all in one go.
To see more options that `hexapdf optimize` offers, run:
$ hexapdf help optimize
For example, to make the source of the PDF a bit more readable (though less optimized), set the stream-related options to `preserve` (e.g.,, `--streams preserve` from the CLI or `options.streams = :preserve` from the API).
You can also disable page compression (e.g., `--no-compress-pages` from the CLI or `options.compress_pages = false` from the API).
hexapdf also allows you to add password protection to your PDF, if that's something you're interested in doing.
== Rasterizing the PDF
Instead of optimizing the objects in the vector PDF, you may want to rasterize the PDF instead.
Rasterizing the PDF prevents any of the text or other objects from being selected, similar to a scanned document.
Asciidoctor PDF doesn't provide built-in support for rasterizing the generated PDF.
However, you can use Ghostscript to flatten all the text in the PDF, thus preventing it from being selected.
$ gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -dNoOutputFonts -r300 -o output.pdf input.pdf
You can adjust the value of the `-r` option (the density) to get a higher or lower quality result.
Alternately, you can use the `convert` command from ImageMagick to convert each page in the PDF to an image.
$ convert -density 300 -quality 100 input.pdf output.pdf
Yet another option is to combine Ghostscript and ImageMagick to produce a PDF with pages converted to images.
$ gs -dBATCH -dNOPAUSE -sDEVICE=png16m -o /tmp/tmp-%02d.png -r300 input.pdf
convert /tmp/tmp-*.png output.pdf
rm -f /tmp/tmp-*.png
Using Ghostscript to handle the rasterization produces a much smaller output file.
The drawback of using Ghostscript in this way is that it has to use intermediate files.
|