|  | Commit message | Author | Age | Files | Lines | 
|---|
| | | into imkaka-fix/deepsource-issues | 
| | The renamed_file function contains the following which ends up on readthedocs: 
  :note: This property is deprecated, please use ``renamed_file`` instead.
Removed the line | 
| | Also store the rename score | 
| | I kept some "bare" excepts as catch-all `except Exception:` clauses, and tried to
silence the flake8 complaints where all exceptions clearly have to be caught | 
| | | ankostis-cygwin | 
| | + No WindowsError exception.
+ Add `test_exc.py` for unicode issues.
+ Single-arg for decoding-streams in pump-func. | 
| | + CAUSE: In Windows, Diffs freeze while reading Popen streams,
probably because the buffers are smaller; a good-thing(TM) in this case, because
reading a Popen-proc from the launching thread freezes the GIL.  The alternative,
using `proc.communicate()`, also relies on big buffers.
+ SOLUTION: Use `cmd.handle_process_output()` to consume Diff-proc
streams.
+ Retrofitted `handle_process_output()` code to also support
byte-streams; both the Threading (Windows) and Select/Poll (Posix) paths
are updated.
- TODO: Unfortunately, `Diff._index_from_patch_format()` still slurps
its input; the header regexes need to be rephrased linewise to resolve it. | 
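
The approach described above (draining a child process's pipes from helper threads rather than from the launching thread) can be sketched as follows. This is a minimal illustration only, not the code of `cmd.handle_process_output()`; the names `pump` and `run_and_collect` are hypothetical.

```python
import subprocess
import sys
import threading


def pump(stream, sink):
    """Read a byte stream line by line into `sink` until EOF, then close it."""
    for line in stream:
        sink.append(line)
    stream.close()


def run_and_collect(argv):
    """Run `argv` and drain stdout and stderr concurrently, as raw bytes.

    Hypothetical helper: each pipe gets its own thread so that a full pipe
    buffer on one stream cannot block the child while we read the other.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = [], []
    threads = [
        threading.Thread(target=pump, args=(proc.stdout, out)),
        threading.Thread(target=pump, args=(proc.stderr, err)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return proc.wait(), b"".join(out), b"".join(err)
```

For instance, `run_and_collect([sys.executable, "-c", "print('hi')"])` returns the exit status together with the captured bytes of both streams.
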
| | That way, we do not have to figure the change type out by
examining the diff object.
It's implemented in a way that should yield more desirable results
as we keep the change-type that git is providing us with.
Fixes #493 | 
| | Previously, the following fields on Diff instances were assumed to be
passed in as unicode strings:
  - `a_path`
  - `b_path`
  - `rename_from`
  - `rename_to`
However, since Git natively records paths as bytes, these may
potentially not have a valid unicode representation.
This patch changes the Diff instance to instead take the following
equivalent fields that should be raw bytes instead:
  - `a_rawpath`
  - `b_rawpath`
  - `raw_rename_from`
  - `raw_rename_to`
NOTE ON BACKWARD COMPATIBILITY:
The original `a_path`, `b_path`, etc. fields are still available as
properties (rather than slots).  These properties now dynamically decode
the raw bytes into a unicode string (performing the potentially
destructive operation of replacing invalid unicode chars by "�"'s).
This means that all code using Diffs should remain backward compatible.
The only exception is when people would manually construct Diff
instances by calling the constructor directly, in which case they should
now pass in bytes rather than unicode strings.
See also the discussion on
https://github.com/gitpython-developers/GitPython/pull/467 | 
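
The backward-compatibility scheme described above (raw bytes stored in slots, text paths exposed as decoding properties) might look roughly like this. The class `RawPathDiff` is a hypothetical stand-in and deliberately much smaller than the real `Diff` class; only the field names `a_rawpath` and `a_path` come from the message.

```python
from typing import Optional


class RawPathDiff:
    """Hypothetical, reduced illustration of storing paths as bytes."""

    __slots__ = ("a_rawpath",)

    def __init__(self, a_rawpath: Optional[bytes]):
        self.a_rawpath = a_rawpath

    @property
    def a_path(self) -> Optional[str]:
        # Decode lazily; invalid UTF-8 sequences become U+FFFD, which is lossy.
        if self.a_rawpath is None:
            return None
        return self.a_rawpath.decode("utf-8", "replace")
```

Callers that only read `a_path` keep working, while callers that need the exact on-disk name can use `a_rawpath`.
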
| | The diff --patch parser was missing some edge case where Git would
encode non-ASCII chars in path names as octals, but these weren't
decoded properly.
    \360\237\222\251.txt
Decoded via utf-8, that will return:
    💩.txt | 
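
As an illustration of the decoding step described above, the following sketch turns git's three-digit octal escapes back into bytes and then decodes the result as UTF-8. The helper name `decode_octal_path` is hypothetical and this is not necessarily how the parser itself does it.

```python
import re

# Matches one escaped byte such as \360 (octal values 000 to 377 only).
_OCTAL_BYTE = re.compile(rb"\\([0-3][0-7]{2})")


def decode_octal_path(escaped: bytes) -> str:
    """Replace each octal escape with the byte it stands for, then decode as UTF-8."""
    raw = _OCTAL_BYTE.sub(lambda m: bytes([int(m.group(1), 8)]), escaped)
    return raw.decode("utf-8", errors="replace")


name = decode_octal_path(rb"\360\237\222\251.txt")
```

Here `name` is the one-character emoji file name from the example above, followed by `.txt`.
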
| | Fixes #426 | 
| | Specifically "string_escape" does not exist as an encoding anymore. | 
| | This specifically covers the cases where unsafe chars occur in path
names, and git-diff -p will escape those.
From the git-diff-tree manpage:
> 3. TAB, LF, double quote and backslash characters in pathnames are
>    represented as \t, \n, \" and \\, respectively. If there is need
>    for such substitution then the whole pathname is put in double
>    quotes.
This patch checks whether or not this has happened and will unescape
those paths accordingly.
One thing to note here is that, depending on the position in the patch
format, those paths may be prefixed with an a/ or b/.  I've specifically
made sure to never interpret a path that actually starts with a/ or b/
incorrectly.
Example of that subtlety below.  Here, the actual file path is
"b/normal".  On the diff file that gets encoded as "b/b/normal".
     diff --git a/b/normal b/b/normal
     new file mode 100644
     index 0000000000000000000000000000000000000000..eaf5f7510320b6a327fb308379de2f94d8859a54
     --- /dev/null
     +++ b/b/normal
     @@ -0,0 +1 @@
     +dummy content
Here, we prefer the "---" and "+++" lines' values.  Note that these
paths start with a/ or b/.  The only exception is the value "/dev/null",
which is handled as a special case.
Suppose now the file gets moved to "b/moved"; the output of that diff would
then be this:
     diff --git a/b/normal b/b/moved
     similarity index 100%
     rename from b/normal
     rename to b/moved
We prefer the "rename" lines' values in this case (the "diff" line is
always a last resort).  Take note that those lines are not prefixed with
a/ or b/, but the ones in the "diff" line are (just like the ones in
"---" or "+++" lines). | 
| 
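
The two steps discussed in the message above, unescaping a quoted pathname and removing exactly one `a/` or `b/` prefix, can be sketched as below. Both function names are hypothetical, and the sketch only covers the four escapes quoted from the manpage.

```python
from typing import Optional


def unquote_git_path(path: str) -> str:
    """Undo git's quoting of TAB, LF, double quote and backslash in a pathname."""
    if len(path) < 2 or not (path.startswith('"') and path.endswith('"')):
        return path  # unquoted pathnames are returned unchanged
    escapes = {"t": "\t", "n": "\n", '"': '"', "\\": "\\"}
    out = []
    chars = iter(path[1:-1])
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            # Unknown escapes are left as they were.
            out.append(escapes.get(nxt, "\\" + nxt))
        else:
            out.append(ch)
    return "".join(out)


def strip_side_prefix(path: str, side: str) -> Optional[str]:
    """Remove one leading 'a/' or 'b/' (given by `side`); '/dev/null' means no file."""
    if path == "/dev/null":
        return None
    prefix = side + "/"
    return path[len(prefix):] if path.startswith(prefix) else path
```

Stripping only a single prefix is what keeps a real path such as `b/normal` intact when it appears as `b/b/normal` on the `+++` line.
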
| | When both old/new mode and rename from/to lines are found, they will
appear in different order. | 
| | This makes sure we're not matching a \n here by accident.  It's now
almost the same as the original that used \S+, except that spaces are
not eaten at the end of the string (for files that end in a space). | 
| | The a_path and b_path cannot reliably be read from the first diff line
as it's ambiguous.  From the git-diff manpage:
  > The a/ and b/ filenames are the same unless rename/copy is involved.
  > Especially, **even for a creation or a deletion**, /dev/null is not
  > used in place of the a/ or b/ filenames.
This patch changes the a_path and b_path detection to read it from the
more reliable locations further down the diff headers.  Two use cases
are fixed by this:
  - As the man page snippet above states, for new/deleted files the a
    or b path will now be properly None.
  - File names with spaces in them are now properly parsed.
Working on this patch, I realized the --- and +++ lines really belong to
the diff header, not the diff contents.  This means that when parsing
the patch format, the --- and +++ will now be swallowed, and not end up
anymore as part of the diff contents.  The diff contents now always
start with an @@ line.
This may be a breaking change for some users that rely on this
behaviour.  However, those users can now access that information more
reliably via the normal Diff properties a_path and b_path. | 
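
A rough sketch of reading the paths from the `---` and `+++` header lines instead of the ambiguous `diff --git` line is shown below. The function names are hypothetical, and the sketch assumes the header lines are already split into plain strings.

```python
from typing import Iterable, Optional, Tuple


def _side_path(value: str) -> Optional[str]:
    """Map a '---'/'+++' value to a path, or None for '/dev/null'."""
    value = value.rstrip("\n")
    if value == "/dev/null":
        return None
    return value[2:] if value[:2] in ("a/", "b/") else value


def header_paths(lines: Iterable[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (a_path, b_path) taken from the '---' and '+++' header lines."""
    a_path = b_path = None
    for line in lines:
        if line.startswith("@@"):
            break  # the diff contents start here, so the header is over
        if line.startswith("--- "):
            a_path = _side_path(line[4:])
        elif line.startswith("+++ "):
            b_path = _side_path(line[4:])
    return a_path, b_path
```

Because everything after the four-character marker is taken as the value, a name containing spaces is not cut at the first space.
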
| | This alternative API does not prevent users from using the valid treeish
"root". | 
| | This enabled getting diff patches for root commits. | 
| | This protects against `.split(None)`, which treats consecutive whitespace
as a single separator and therefore overlooks paths where the filename is a
single space.
For example, in this diff line:
line = ':100644 000000 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0000000000000000000000000000000000000000 D       '
The deleted file is a file named ' ' (just one space).  It's entirely
possible to commit this, remove it, and produce the following output
from `git diff`:
git diff --name-status <SHA1> <SHA2>
D
M       path/to/another/file.py
...
This would cause the initial `.split(None, 5)` to fail as it will count
all consecutive whitespace as a separator, disregarding the ' ' (single
space)  filename. | 
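
The failure mode can be reproduced in a few lines. The sketch below assumes the raw diff line separates the status letter from the path with a tab, as git's raw output format does; `parse_raw_diff_line` is a hypothetical helper, not the function that was patched.

```python
RAW_LINE = (
    ":100644 000000 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 "
    "0000000000000000000000000000000000000000 D\t "
)

# Whitespace splitting swallows the one-space filename: only five fields remain.
naive_fields = RAW_LINE.split(None, 5)


def parse_raw_diff_line(line: str):
    """Split on the tab first, so a path made only of whitespace survives."""
    meta, path = line.split("\t", 1)
    old_mode, new_mode, old_sha, new_sha, status = meta[1:].split(" ")
    return old_mode, new_mode, old_sha, new_sha, status, path
```

Unpacking `naive_fields` into six names would raise a `ValueError`, whereas the tab-based variant returns `' '` as the path.
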
| | If the file was not present, the mode seen in a diff can be legally '0',
which previously caused an assertion to fail for no good reason.
Now the assertion tests for None instead.
Closes #323 | 
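
The difference between a truthiness check and a `None` check is easy to miss, so here is a small sketch. It assumes the mode string is converted to an integer before being checked; `parse_mode` and `has_mode` are hypothetical names.

```python
from typing import Optional


def parse_mode(field: Optional[str]) -> Optional[int]:
    """Convert an octal mode string from a diff line; None means 'not given'."""
    return None if field is None else int(field, 8)


def has_mode(mode: Optional[int]) -> bool:
    # Compare against None explicitly: a bare `if mode:` would treat 0 as missing.
    return mode is not None
```

With this check, a mode of `'0'` for an absent file is accepted, while a genuinely missing mode is still detected.
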
| | If a file in a commit contains no changes (for example, if only the
file mode is changed) there will be no blob attached.  This is
usually where the filename is stored, so without it, the calling
context cannot tell what file was changed.  Instead, always
store a_path and b_path on the Diff object so that information
is available. | 
| | That way they are protected from regression.
Fixes #239 | 
| | Additionally, unicode handling was improved to the point where we deal
with all diff(create_patch=True) data as binary.
Therefore we don't claim to know all encodings of all textfiles in the world,
even though we still assume that everything git throws at us is utf-8 encoded.
Fixes #113 | 
| | future
There is still some work to do in terms of how we handle the encoding | 
| | Related to #74 | 
| | Fixes #36 | 
| | Helps fix #35
Also, the production status was changed to 'stable', which should
have been done much earlier. | 
| | More to come, especially when it's about strings | 
| | There is more work to do though, as many imports are still incorrect.
Also, there are still print statements | 
| | Commandline was
autopep8 -j 8 --max-line-length 120 --in-place --recursive --exclude "*gitdb*,*async*" git/ |