delta/libgit2.git/src/patch_parse.c, branch ethomson/winhttp

Explicitly mark fallthrough cases with comments

2018-02-16T11:42:28+00:00

A lot of compilers nowadays generate warnings when there are cases in a
switch statement which implicitly fall through to the next case. To
avoid this warning, the last line in the case that is falling through
can have a comment matching a regular expression, where one possible
comment body would be `/* fall through */`.

An alternative to the comment would be an explicit attribute like e.g.
`[[clang::fallthrough]` or `__attribute__ ((fallthrough))`. But GCC only
introduced support for such an attribute recently with GCC 7. Thus, and
also because the fallthrough comment is supported by most compilers, we
settle for using comments instead.

One shortcoming of that method is that compilers are very strict about
that. Most interestingly, that comment _really_ has to be the last line.
In case a closing brace follows the comment, the heuristic will fail.

Merge pull request #4285 from pks-t/pks/patches-with-whitespace

2017-12-23T23:30:29+00:00

patch_parse: fix parsing unquoted filenames with spaces

refcount: make refcounting conform to aliasing rules

2017-11-18T16:06:14+00:00

Strict aliasing rules dictate that for most data types, you are not
allowed to cast them to another data type and then access the casted
pointers. While this works just fine for most compilers, technically we
end up in undefined behaviour when we hurt that rule.

Our current refcounting code makes heavy use of casting and thus
violates that rule. While we didn't have any problems with that code,
Travis started spitting out a lot of warnings due to a change in their
toolchain. In the refcounting case, the code is also easy to fix:
as all refcounting-statements are actually macros, we can just access
the `rc` field directly instead of casting.

There are two outliers in our code where that doesn't work. Both the
`git_diff` and `git_patch` structures have specializations for generated
and parsed diffs/patches, which directly inherit from them. Because of
that, the refcounting code is only part of the base structure and not of
the children themselves. We can help that by instead passing their base
into `GIT_REFCOUNT_INC`, though.

patch_parse: allow parsing ambiguous patch headers

2017-11-11T20:29:28+00:00

The git patch format allows for having unquoted paths with whitespaces
inside. This format becomes ambiguous to parse, e.g. in the following
example:

    diff --git a/file b/with spaces.txt b/file b/with spaces.txt

While we cannot parse this in a correct way, we can instead use the
"---" and "+++" lines to retrieve the file names, as the path is not
followed by anything here but spans the complete remaining line. Because
of this, we can simply bail outwhen parsing the "diff --git" header here
without an actual error and then proceed to just take the paths from the
other headers.

patch_parse: treat complete line after "---"/"+++" as path

2017-11-11T20:29:28+00:00

When parsing the "---" and "+++" line, we stop after the first
whitespace inside of the filename. But as files containing whitespaces
do not need to be quoted, we should instead use the complete line here.

This fixes parsing patches with unquoted paths with whitespaces.

parse: always initialize line pointer

2017-11-11T17:06:23+00:00

Upon initializing the parser context, we do not currently initialize the
current line, line length and line number. Do so in order to make the
interface easier to use and more obvious for future consumers of the
parsing API.

parse: implement `git_parse_peek`

2017-11-11T17:06:23+00:00

Some code parts need to inspect the next few bytes without actually
consuming it yet, for example to examine what content it has to expect
next. Create a new function `git_parse_peek` which returns the next byte
without modifying the parsing context and use it at multiple call sites.

parse: implement and use `git_parse_advance_digit`

2017-11-11T17:06:23+00:00

The patch parsing code has multiple recurring patterns where we want to
parse an actual number. Create a new function `git_parse_advance_digit`
and use it to avoid code duplication.

patch_parse: use git_parse_contains_s

2017-11-11T17:06:23+00:00

Instead of manually checking the parsing context's remaining length and
comparing the leading bytes with a specific string, we can simply re-use
the function `git_parse_ctx_contains_s`. Do so to avoid code duplication
and to further decouple patch parsing from the parsing context's struct
members.

parse: extract parse module

2017-11-11T17:06:20+00:00

The `git_patch_parse_ctx` encapsulates both parser state as well as
options specific to patch parsing. To advance this state and keep it
consistent, we provide a few functions which handle advancing the
current position and accessing bytes of the patch contents. In fact,
these functions are quite generic and not related to patch-parsing by
themselves. Seeing that we have similar logic inside of other modules,
it becomes quite enticing to extract this functionality into its own
parser module.

To do so, we create a new module `parse` with a central struct called
`git_parse_ctx`. It encapsulates both the content that is to be parsed
as well as its lengths and the current position. `git_patch_parse_ctx`
now only contains this `parse_ctx` only, which is then accessed whenever
we need to touch the current parser. This is the first step towards
re-using this functionality across other modules which require parsing
functionality and remove code-duplication.