delta/libgit2.git/src/patch_parse.c, branch ethomson/github_actions

tree-wide: mark local functions as static

2020-06-09T12:57:06+00:00

We've accumulated quite some functions which are never used outside of
their respective code unit, but which are lacking the `static` keyword.
Add it to reduce their linkage scope and allow the compiler to optimize
better.

patch: correctly handle mode changes for renames

2020-03-26T13:21:35+00:00

When generating a patch for a renamed file whose mode bits have changed
in addition to the rename, then we currently fail to parse the generated
patch. Furthermore, when generating a diff we output mode bits after the
similarity metric, which is different to how upstream git handles it.

Fix both issues by adding another state transition that allows
similarity indices after mode changes and by printing mode changes
before the similarity index.

patch_parse: fix undefined behaviour due to arithmetic on NULL pointers

2019-12-13T12:22:56+00:00

Doing arithmetic with NULL pointers is undefined behaviour in the C
standard. We do so regardless when parsing patches, as we happily add a
potential prefix length to prefixed paths. While this works out just
fine as the prefix length is always equal to zero in these cases, thus
resulting in another NULL pointer, it still is undefined behaviour and
was pointed out to us by OSSfuzz.

Fix the issue by checking whether paths are NULL, avoiding the
arithmetic if they are.

patch_parse: fix out-of-bounds reads caused by integer underflow

2019-11-28T14:26:36+00:00

The patch format for binary files is a simple Base85 encoding with a
length byte as prefix that encodes the current line's length. For each
line, we thus check whether the line's actual length matches its
expected length in order to not faultily apply a truncated patch. This
also acts as a check to verify that we're not reading outside of the
line's string:

	if (encoded_len > ctx->parse_ctx.line_len - 1) {
		error = git_parse_err(...);
		goto done;
	}

There is the possibility for an integer underflow, though. Given a line
with a single prefix byte, only, `line_len` will be zero when reaching
this check. As a result, subtracting one from that will result in an
integer underflow, causing us to assume that there's a wealth of bytes
available later on. Naturally, this may result in an out-of-bounds read.

Fix the issue by checking both `encoded_len` and `line_len` for a
non-zero value. The binary format doesn't make use of zero-length lines
anyway, so we need to know that there are both encoded bytes and
remaining characters available at all.

This patch also adds a test that works based on the last error message.
Checking error messages is usually too tightly coupled, but in fact
parsing the patch failed even before the change. Thus the only
possibility is to use e.g. Valgrind, but that'd result in us not
catching issues when run without Valgrind. As a result, using the error
message is considered a viable tradeoff as we know that we didn't start
decoding Base85 in the first place.

Merge pull request #5306 from herrerog/patchid

2019-11-28T13:41:58+00:00

diff: complete support for git patchid

odb: use `git_object_size_t` for object size

2019-11-22T04:14:13+00:00

Instead of using a signed type (`off_t`) use a new `git_object_size_t`
for the sizes of objects.

patch_parse: correct parsing of patch containing not shown binary data.

2019-11-19T08:33:12+00:00

When not shown binary data is added or removed in a patch, patch parser
is currently returning 'error -1 - corrupt git binary header at line 4'.
Fix it by correctly handling case where binary data is added/removed.

Signed-off-by: Gregory Herrero

patch_parse: use paths from "---"/"+++" lines for binary patches

2019-11-10T17:54:01+00:00

For some patches, it is not possible to derive the old and new file
paths from the patch header's first line, most importantly when they
contain spaces. In such a case, we derive both paths from the "---" and
"+++" lines, which allow for non-ambiguous parsing. We fail to use these
paths when parsing binary patches without data, though, as we always
expect the header paths to be filled in.

Fix this by using the "---"/"+++" paths by default and only fall back to
header paths if they aren't set. If neither of those paths are set, we
just return an error. Add two tests to verify this behaviour, one of
which would have previously caused a segfault.

patch_parse: fix segfault when header path contains whitespace only

2019-11-05T21:50:41+00:00

When parsing header paths from a patch, we reject any patches with empty
paths as malformed patches. We perform the check whether a path is empty
before sanitizing it, though, which may lead to a path becoming empty
after the check, e.g. if we have trimmed whitespace. This may lead to a
segfault later when any part of our patching logic actually references
such a path, which may then be a `NULL` pointer.

Fix the issue by performing the check after sanitizing. Add tests to
catch the issue as they would have produced a segfault previosuly.

patch_parse: detect overflow when calculating old/new line position

2019-10-21T18:07:42+00:00

When the patch contains lines close to INT_MAX, then it may happen that
we end up with an integer overflow when calculating the line of the
current diff hunk. Reject such patches as unreasonable to avoid the
integer overflow.

As the calculation is performed on integers, we introduce two new
helpers `git__add_int_overflow` and `git__sub_int_overflow` that perform
the integer overflow check in a generic way.