| Age | Commit message | Author |
|
With the release of TLS 2.0.0, the TLS library started requiring
Extended Main Secret for the TLS handshake. This caused problems
connecting to Zotero's server and others that do not support TLS 1.3.
This commit relaxes this requirement.
Closes #9483.
|
|
|
|
This reverts commit 6625e9655ed2bb0c4bd4dd91b5959a103deab1cb.
base64 is currently buggy on 32-bit systems. Closes #9233.
|
|
- Only treat as base64 if ';base64' is present.
- Otherwise treat as UTF-8 (not 100% reliable but should cover most
other cases).
- Strip off ';base64' (or ';charset=...' or whatever) from mime type.
This last change addresses #9195 (problems with data URIs in
conversion to docx).
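
The three rules above can be sketched as follows. This is a minimal Python illustration, not pandoc's Haskell code; the helper name `parse_data_uri` is hypothetical.

```python
import base64
from urllib.parse import unquote_to_bytes


def parse_data_uri(uri: str) -> tuple[str, bytes]:
    """Split a data URI into (mime type, raw bytes). Hypothetical helper."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    # The first comma separates the header from the payload.
    header, _, payload = uri[len("data:"):].partition(",")
    # Everything after the first ';' is a parameter, not part of the mime type.
    mime, *params = header.split(";")
    raw = unquote_to_bytes(payload)
    if "base64" in params:
        # Decode as base64 only when ';base64' is explicitly present.
        data = base64.b64decode(raw)
    else:
        # Otherwise keep the percent-decoded bytes (UTF-8 for text payloads).
        data = raw
    return mime, data
```

Note that the returned mime type carries neither `;base64` nor `;charset=...`, which is the part relevant to #9195.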
|
|
Guilhem Moulin noticed that the fix to CVE-2023-35936 was incomplete.
An attacker could get around it by double-encoding the malicious
extension to create or override arbitrary files.
    $ echo '' >b.md
    $ .cabal/bin/pandoc b.md --extract-media=bar
    <p><img
    src="bar/2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua+%2f%2e%2e%2f%2e%2e%2fb%2elua" /></p>
    $ cat b.lua
    print "hello"
    $ find bar
    bar/
    bar/2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua+
This commit adds a test case for this more complex attack and fixes
the vulnerability. (The fix is quite simple: if the URL-unescaped
filename or extension contains a '%', we just use the sha1 hash of the
contents as the canonical name, just as we do if the filename contains
'..'.)
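
The idea behind the fix can be sketched in Python as below. This is an illustration only, not the actual Haskell change: the function name `canonical_media_name` is hypothetical, and keeping only a plain alphanumeric extension on the hashed name is an assumption of this sketch.

```python
import hashlib
import posixpath
from urllib.parse import unquote


def canonical_media_name(path: str, contents: bytes) -> str:
    """Pick a safe relative name for an extracted media file (hypothetical)."""
    unescaped = unquote(path)  # one round of URL-unescaping
    suspicious = (
        posixpath.isabs(unescaped)
        or ".." in unescaped.split("/")
        or "%" in unescaped  # a '%' survives only if the input was double-encoded
    )
    if not suspicious:
        return unescaped
    ext = posixpath.splitext(unescaped)[1]
    # Assumption: reuse the extension only if it is plain alphanumeric.
    safe_ext = ext if ext[1:].isalnum() else ""
    return hashlib.sha1(contents).hexdigest() + safe_ext
```

With a double-encoded input such as `a.lua+%252f%252e%252e%252fb%252elua`, the single unescape leaves `%2f%2e%2e...` in place, so the name falls back to the bare hash instead of a path that a later unescape could turn into `/../`.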
|
|
|
|
|
|
We forgot to filter out CRs as we do in toText.
|
|
This is like `Text.Pandoc.UTF8.toText`, except:
- it takes a file path as first argument, in addition to
bytestring contents
- it raises an informative error with source position if
the contents are not UTF-8-encoded
[API change]
This replaces `utf8ToText` in `Text.Pandoc.App.Input`.
See #8884.
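
As a rough illustration of the behavior described (a Python sketch, not the Haskell function itself), the names `to_text_with_position` and `NotUtf8Error` below are hypothetical, and the column is counted in bytes for simplicity.

```python
class NotUtf8Error(ValueError):
    """Raised when input bytes are not valid UTF-8 (hypothetical error type)."""


def to_text_with_position(path: str, contents: bytes) -> str:
    """Decode UTF-8 bytes, reporting file, line and column on failure."""
    try:
        text = contents.decode("utf-8")
    except UnicodeDecodeError as err:
        # Translate the failing byte offset into a 1-based line and column.
        before = contents[: err.start]
        line = before.count(b"\n") + 1
        column = err.start - (before.rfind(b"\n") + 1) + 1
        raise NotUtf8Error(f"{path}:{line}:{column}: invalid UTF-8 byte") from err
    # Drop carriage returns so that CRLF input reads like LF input.
    return text.replace("\r", "")
```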
|
|
These changes recognize that parseURI does not unescape the path.
Another change is that the canonical form of the path used as the
MediaBag key retains percent-encoding, if present; we only unescape
the string when writing to a file.
See #8918.
Some tests are needed before the issue can be closed.
|
|
In the new code a comma mysteriously turned into a period.
This would have prevented proper separation of the mime type and
content in data URIs. Thanks to @hseg for catching this.
|
|
This vulnerability, discovered by Entroy C, allows users to write
arbitrary files to any location by feeding pandoc a specially crafted
URL in an image element. The vulnerability is serious for anyone
using pandoc to process untrusted input. The vulnerability does
not affect pandoc when run with the `--sandbox` flag.
|
|
This message will also be triggered when media is being
extracted to a temporary location, e.g. in PDF production.
|
|
This is useful for the `pandoc.mediabag` module.
|
|
|
|
|
|
|
|
This will no doubt produce a bunch of warnings and hence CI
failures, which we'll need to work around with explicit imports.
|
|
`SOURCE_DATE_EPOCH` environment variable if set. (`getTimestamp` was
already sensitive.) This ensures that EPUB builds are reproducible.
Closes #7093.
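
The general pattern of honoring `SOURCE_DATE_EPOCH` might look like the Python sketch below. The helper name `build_time` is hypothetical, and falling back to the real clock on a malformed value is an assumption of this sketch, not a statement about pandoc's behavior.

```python
import os
from datetime import datetime, timezone


def build_time(environ=os.environ) -> datetime:
    """Return the timestamp to embed in generated files (hypothetical helper)."""
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is not None:
        try:
            # The variable holds seconds since the Unix epoch, in UTC.
            return datetime.fromtimestamp(int(raw), tz=timezone.utc)
        except ValueError:
            pass  # assumption: ignore a malformed value and use the clock
    return datetime.now(tz=timezone.utc)
```

Because every timestamp written into the output comes from one such function, two builds with the same `SOURCE_DATE_EPOCH` embed identical times.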
|
|
This is still an unexported internal module.
Export `urlEncode`, `escapeURI`, `isURI`, `schemes`, `uriPathToPath`.
Re-export `escapeURI` and `isURI` from T.P.Shared (as they
were exported before); drop exports of `schemes` and `uriPathToPath`
[API change].
With this change, T.P.Class no longer depends on T.P.Shared.
|
|
This makes T.P.Class more self-contained, and suitable for extraction
into a separate package if desired.
[API changes]
- T.P.Data is now an exported module, providing `readDataFile`,
`readDefaultDataFile` (both formerly provided by T.P.Class),
and also `getDataFileNames` (formerly unexported in
T.P.App.CommandLineOptions).
- T.P.Translations is now an exported module (along with
T.P.Translations.Types), providing `readTranslations`,
`getTranslations`, `setTranslations`, `translateTerm`,
`lookupTerm`, `Term(..)`, and `Translations`.
- T.P.Class: `readDataFile`, `readDefaultDataFile`, `setTranslations`,
and `translateTerm` are no longer exported.
`checkUserDataDir` is now exported.
- Text.Pandoc now exports Text.Pandoc.Data and `setTranslations`
and `translateTerm`.
|
|
|
|
In 2.19.1 we used the base64URL encoding rather than base64.
This works in Safari, apparently, but not in other browsers.
Closes #8239.
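
The difference between the two alphabets is easy to see on a payload whose encoding uses the last two symbols. A small Python illustration (the three-byte payload is chosen only to trigger those symbols):

```python
import base64

# These bytes encode to the sextets 62, 63, 63, 62.
payload = bytes([0xFB, 0xFF, 0xFE])

standard = base64.b64encode(payload).decode("ascii")         # uses '+' and '/'
url_safe = base64.urlsafe_b64encode(payload).decode("ascii")  # uses '-' and '_'

print(standard)  # +//+
print(url_safe)  # -__-
```

A `data:` URI is expected to carry the standard alphabet, so a decoder that rejects `-` and `_` will fail on the second form.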
|
|
It is supposed to be faster and more standards-compliant.
|
|
Images that cannot be fetched are replaced with a Span that contains the
image's description. The span now also inherits all attributes of the
original image. Furthermore, the classes `image` and `placeholder` are
added, and the path and title are stored in the attributes
`original-image-src` and `original-image-title`, respectively.
Closes: #8099
|
|
* PandocMonad: add new function `findFileWithDataFallback` [API Change]
* Custom readers: allow files to be placed in "readers" data dir
* Custom writers: allow files to be placed in "writers" data dir
|
|
Previously we used System.FilePath's isRelative to
determine when paths are relative (since absolute
paths need to get a new name based on the sha1 hash).
But this has an OS-specific behavior and actually
returns True on Windows for paths like `/media/file.png`.
This ought to fix #7881.
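
Python's `pathlib` shows the same platform dependence and can serve as an analogy (it is not the Haskell code involved). The helper `is_relative_portable` is a hypothetical, OS-independent check and only one possible way to avoid the problem.

```python
from pathlib import PurePosixPath, PureWindowsPath

path = "/media/file.png"

# Under POSIX rules a leading slash makes the path absolute.
posix_abs = PurePosixPath(path).is_absolute()
# Under Windows rules the same string has a root but no drive,
# so it does not count as absolute.
windows_abs = PureWindowsPath(path).is_absolute()


def is_relative_portable(p: str) -> bool:
    """Treat any path with a leading slash, a root, or a drive as non-relative."""
    win = PureWindowsPath(p)
    return not (p.startswith("/") or bool(win.drive) or bool(win.root))
```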
|
|
This reverts commit 3dcb526b9b084976bfb5ef2f02a6bf009fd78750.
|
|
Previously if the directory argument ended in slash,
we'd get a doubled slash in the path. This may help
with #7881.
|
|
If a file path does not exist relative to the working directory, but
it does exist relative to the user data directory, and the resulting
location lies outside of the user data directory, do not read it. This
applies to readDataFile
and readMetadataFile in PandocMonad and, by extension, any module that
uses these by passing them relative paths.
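
A containment check of this kind can be sketched lexically, as in the Python below. The name `stays_inside` is hypothetical, and the sketch looks only at the path string; it does not resolve symlinks.

```python
import posixpath


def stays_inside(base_dir: str, relative: str) -> bool:
    """True if base_dir/relative does not escape base_dir (lexical check)."""
    base = posixpath.normpath(base_dir)
    # join() discards base when `relative` is absolute, which is then rejected.
    target = posixpath.normpath(posixpath.join(base, relative))
    return target == base or target.startswith(base.rstrip("/") + "/")
```

For example, `stays_inside("/home/u/.pandoc", "../.ssh/id_rsa")` is false, so such a file would not be read through the data directory fallback.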
|
|
As an example, prior to this commit, "../../file" would evaluate to
"file", when it should be unchanged.
|
|
If files specified with `--metadata-file` are not found in the working
directory, look in `$DATADIR/metadata`.
Expose new `readMetadataFile` function from Text.Pandoc.Class
[API change].
Expose new `PandocCouldNotFindMetadataFileError` constructor for
`PandocError` from Text.Pandoc.Error [API change].
Closes #5876.
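
The lookup order can be sketched as below. This is a Python illustration with hypothetical names; the `exists` parameter is there only so the logic can be exercised without touching the file system.

```python
import os


def find_metadata_file(name, data_dir, cwd=".", exists=os.path.isfile):
    """Resolve a metadata file: working directory first, then DATADIR/metadata."""
    local = os.path.join(cwd, name)
    if exists(local):
        return local
    fallback = os.path.join(data_dir, "metadata", name)
    if exists(fallback):
        return fallback
    # Stands in for a dedicated "could not find metadata file" error.
    raise FileNotFoundError(f"could not find metadata file {name}")
```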
|
|
Closes #7819 (problem with spaces in image filenames when creating
PDFs).
|
|
|
|
This ensures that when `SOURCE_DATE_EPOCH` is set, the
modification times of files taken from the reference.docx will
be set deterministically, allowing for reproducible builds.
Closes #7654.
|
|
+ Add sandbox feature for readers. When this option is used,
readers and writers only have access to input files (and
other files specified directly on command line). This restriction
is enforced in the type system.
+ Filters, PDF production, custom writers are unaffected. This
feature only insulates the actual readers and writers, not
the pipeline around them in Text.Pandoc.App.
+ Note that when `--sandbox` is specified, readers won't have
access to the resource path, nor will anything have access to
the user data directory.
+ Add module Text.Pandoc.Class.Sandbox, defining
`sandbox`. Exported via Text.Pandoc.Class. [API change]
Closes #5045.
|
|
[API change]
|
|
It was uselessly restricted to PandocIO, instead of any
instance of PandocMonad and MonadIO.
[API change]
|
|
from PandocIO to any instance of MonadIO and PandocMonad.
[API change]
|
|
This will allow us to use withTempDir.
|
|
Even on Windows.
May help with #7431.
|
|
With the 2.14 release `--extract-media` stopped working as before;
there could be mismatches between the paths in the rendered document and
the extracted media.
This patch makes several changes (while keeping the same API).
The `mediaPath` in 2.14 was always constructed from the SHA1 hash of
the media contents. Now, we preserve the original path unless it's
an absolute path or contains `..` segments (in that case we use a path
based on the SHA1 hash of the contents).
When constructing a path from the SHA1 hash, we always use the
original extension, if there is one. Otherwise we look up an
appropriate extension for the mime type.
`mediaDirectory` and `mediaItems` now use the `mediaPath`, rather
than the mediabag key, for the first component of the tuple.
This makes more sense, I think, and fits with the documentation
of these functions; eventually, though, we should rework the API so that
`mediaItems` returns both the keys and the MediaItems.
Rewriting of source paths in `extractMedia` has been fixed.
`fillMediaBag` has been modified so that it doesn't modify
image paths (that was part of the problem in #7345).
We now do path normalization (e.g. `\` separators on Windows) only
in writing the media; the paths are left unchanged in the image
links (sensibly, since they might be URLs and not file paths).
These changes should restore the original behavior from before 2.14.
Closes #7345.
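
The path-selection rules described above can be summarized in a short Python sketch. It is not the Haskell implementation: the function name `media_path` is hypothetical, and Python's `mimetypes` table stands in for pandoc's own mime-type lookup.

```python
import hashlib
import mimetypes
import posixpath


def media_path(original: str, mime: str, contents: bytes) -> str:
    """Choose the path under which a media item is written (hypothetical)."""
    if not posixpath.isabs(original) and ".." not in original.split("/"):
        return original  # safe relative path: preserve it unchanged
    # Otherwise build a name from the SHA1 hash of the contents.
    ext = posixpath.splitext(original)[1]
    if not ext:
        # No original extension: look one up from the mime type.
        ext = mimetypes.guess_extension(mime) or ""
    return hashlib.sha1(contents).hexdigest() + ext
```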
|
|
This ensures that we get `\` separators on Windows.
|
|
The immediate reason for this is to allow the test output of #3752
to work on both Windows and Linux.
|
|
indicating what path local resources have been loaded from.
|
|
In the current dev version, we will sometimes add
a version of an image with a hashed name, keeping
the original version with the original name, which
would lead to undesirable duplication.
This change separates the media's filename from the
media's canonical name (which is the path of the link
in the document itself). Filenames are based on SHA1
hashes and assigned automatically.
In Text.Pandoc.MediaBag:
- Export MediaItem type [API change].
- Change MediaBag type to a map from Text to MediaItem [API change].
- `lookupMedia` now returns a `MediaItem` [API change].
- Change `insertMedia` so it sets the `mediaPath` to
a filename based on the SHA1 hash of the contents.
This will be used when contents are extracted.
In Text.Pandoc.Class.PandocMonad:
- Remove `fetchMediaResource` [API change].
Lua MediaBag module has been changed minimally. In the future
it would be better, probably, to give Lua access to the full
MediaItem type.
|
|
|
|
|
|
instead of `[FilePath]`.
We normalize the path and use `/` separators for consistency.
|
|
Update citeproc test.
|