path: root/src/pack.c
Commit message (Author, Age; Files, Lines)
* offmap: introduce high-level setter for key/value pairs (Patrick Steinhardt, 2019-02-15; 1 file, -5/+2)
| Currently, there is only one caller that adds entries into an offset map, and this caller first uses `git_offmap_put` to add a key and then sets the value at the returned index by using `git_offmap_set_value_at`. This is too tightly coupled with implementation details of the map, as it exposes the index of inserted entries, which we do not care about at all.
|
| Introduce a new function `git_offmap_set`, which takes as parameters the map, key and value and directly returns an error code. Convert the caller to make use of it instead.
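The difference between the two call shapes can be sketched with a toy map. Everything named `toy_*` below is hypothetical and only mirrors the idea described in the commit message; it is not libgit2's implementation, and the fixed-capacity array is an assumption made to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAP_CAPACITY 8

/* Hypothetical stand-in for an offset map (offset key -> pointer value). */
typedef struct {
	int64_t keys[TOY_MAP_CAPACITY];
	void *values[TOY_MAP_CAPACITY];
	size_t count;
} toy_offmap;

/* Low-level style: insert the key and hand the slot index back to the caller. */
static size_t toy_offmap_put(toy_offmap *map, int64_t key, int *err)
{
	size_t i;

	for (i = 0; i < map->count; i++) {
		if (map->keys[i] == key) {
			*err = 0;
			return i; /* key already present, reuse its slot */
		}
	}
	if (map->count == TOY_MAP_CAPACITY) {
		*err = -1; /* map is full */
		return map->count;
	}
	map->keys[map->count] = key;
	map->values[map->count] = NULL;
	*err = 0;
	return map->count++;
}

/* High-level style: one call, no index exposed, plain error code. */
int toy_offmap_set(toy_offmap *map, int64_t key, void *value)
{
	int err;
	size_t idx;

	assert(map != NULL);
	idx = toy_offmap_put(map, key, &err);
	if (err < 0)
		return -1;
	map->values[idx] = value;
	return 0;
}
```

With the high-level call, a caller never sees `idx`, so it cannot forget the error check between inserting the key and storing the value.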
* offmap: introduce high-level getter for values (Patrick Steinhardt, 2019-02-15; 1 file, -5/+2)
| The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than necessary. Furthermore, it invites errors if the correct error checking sequence is not followed.
|
| Introduce a new high-level function `git_offmap_get` that takes a map and a key and returns a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.
* oidmap: introduce high-level getter for values (Patrick Steinhardt, 2019-02-15; 1 file, -4/+3)
| The current way of looking up an entry from a map is tightly coupled with the map implementation, as one first has to look up the index of the key and then retrieve the associated value by using the index. As a caller, you usually do not care about any indices at all, though, so this is more complicated than necessary. Furthermore, it invites errors if the correct error checking sequence is not followed.
|
| Introduce a new high-level function `git_oidmap_get` that takes a map and a key and returns a pointer to the associated value if such a key exists. Otherwise, a `NULL` pointer is returned. Adjust all callers that can trivially be converted.
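A getter with "value pointer or `NULL`" semantics might look like the sketch below. The `toy_*` names, the 20-byte key size and the linear scan are assumptions for illustration only; the real maps are hash tables.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_OID_SIZE 20
#define TOY_MAP_CAPACITY 8

/* Hypothetical object id: a fixed-size byte array. */
typedef struct {
	unsigned char id[TOY_OID_SIZE];
} toy_oid;

/* Hypothetical oid-keyed map, stored as parallel arrays. */
typedef struct {
	toy_oid keys[TOY_MAP_CAPACITY];
	void *values[TOY_MAP_CAPACITY];
	size_t count;
} toy_oidmap;

/* Return the value stored for `key`, or NULL if the key is absent. */
void *toy_oidmap_get(const toy_oidmap *map, const toy_oid *key)
{
	size_t i;

	assert(map != NULL && key != NULL);
	for (i = 0; i < map->count; i++) {
		if (memcmp(map->keys[i].id, key->id, TOY_OID_SIZE) == 0)
			return map->values[i];
	}
	return NULL;
}
```

One consequence of this shape is that a stored `NULL` value cannot be told apart from a missing key, so it suits maps whose values are always valid pointers.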
* maps: use uniform lifecycle management functions (Patrick Steinhardt, 2019-02-15; 1 file, -2/+2)
| Currently, the lifecycle functions for maps (allocation, deallocation, resize) are not named in a uniform way and do not have a uniform function signature. Rename the functions to fix that, and stick to libgit2's naming scheme of saying `git_foo_new`. This results in the following new interface for allocation:
|
| - `int git_<t>map_new(git_<t>map **out)` to allocate a new map, returning an error code if we ran out of memory
| - `void git_<t>map_free(git_<t>map *map)` to free a map
| - `void git_<t>map_clear(git_<t>map *map)` to remove all entries from a map
|
| This commit also fixes all existing callers.
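The three signatures listed above can be mirrored on a toy container. This is a sketch under assumptions: `toy_map` and `toy_map_add` are hypothetical, and the "map" is a flat growable array so that only the lifecycle shape matters.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container; a flat array stands in for real map storage. */
typedef struct {
	int *entries;
	size_t count;
	size_t capacity;
} toy_map;

/* Allocate an empty map; returns -1 if memory is exhausted. */
int toy_map_new(toy_map **out)
{
	toy_map *map;

	assert(out != NULL);
	map = calloc(1, sizeof(*map));
	if (!map)
		return -1;
	*out = map;
	return 0;
}

/* Helper (not part of the lifecycle trio): append one entry. */
int toy_map_add(toy_map *map, int entry)
{
	if (map->count == map->capacity) {
		size_t new_capacity = map->capacity ? map->capacity * 2 : 4;
		int *grown = realloc(map->entries, new_capacity * sizeof(*grown));
		if (!grown)
			return -1;
		map->entries = grown;
		map->capacity = new_capacity;
	}
	map->entries[map->count++] = entry;
	return 0;
}

/* Remove all entries but keep the map (and its storage) usable. */
void toy_map_clear(toy_map *map)
{
	map->count = 0;
}

/* Release the storage and the map itself; NULL is accepted. */
void toy_map_free(toy_map *map)
{
	if (!map)
		return;
	free(map->entries);
	free(map);
}
```

Returning the error code and passing the new object through an out parameter keeps allocation failures visible at every call site.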
* Allow bypassing check '.keep' files using libgit2 option 'GIT_OPT_IGNORE_PACK_KEEP_FILE_CHECK' (Dhruva Krishnamurthy, 2019-02-02; 1 file, -3/+8)
|
* git_error: use new names in internal APIs and usage (Edward Thomson, 2019-01-22; 1 file, -24/+24)
| Move to the `git_error` name in the internal API for error-related functions.
* object_type: GIT_OBJECT_BAD is now GIT_OBJECT_INVALID (Edward Thomson, 2019-01-17; 1 file, -2/+2)
| We use the term "invalid" to refer to bad or malformed data, e.g. `GIT_REF_INVALID` and `GIT_EINVALIDSPEC`. Since we're changing the names of the `git_object_t`s in this release, update it to be `GIT_OBJECT_INVALID` instead of `BAD`.
* object_type: use new enumeration names [ethomson/index_fixes] (Edward Thomson, 2018-12-01; 1 file, -26/+26)
| Use the new object_type enumeration names within the codebase.
* khash: remove intricate knowledge of khash types (Patrick Steinhardt, 2018-11-28; 1 file, -3/+3)
| Instead of using the `khiter_t`, `git_strmap_iter` and `khint_t` types, simply use `size_t`. This decouples code from the khash internals and makes it possible to move the khash includes into the implementation files.
* Convert usage of `git_buf_free` to new `git_buf_dispose` (Patrick Steinhardt, 2018-06-10; 1 file, -3/+3)
|
* pack: rename `git_packfile_stream_free` (Patrick Steinhardt, 2018-06-10; 1 file, -2/+2)
| The function `git_packfile_stream_free` frees all state of the packfile stream without freeing the structure itself. This naming makes it hard to spot whether it will try to free the pointer itself or not, causing potential future errors. For this reason, we have decided to name a function that frees state without freeing the actual structure a "dispose" function. Rename `git_packfile_stream_free` to `git_packfile_stream_dispose` as a first example following this rule.
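The "dispose" convention can be illustrated with a toy stream. The `toy_stream` type and both functions are hypothetical; the sketch only shows a function that releases owned state while leaving the struct itself to its owner.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stream that owns one heap buffer. */
typedef struct {
	char *buffer;
	size_t size;
} toy_stream;

/* Initialize a caller-provided stream; returns -1 on allocation failure. */
int toy_stream_open(toy_stream *stream, size_t size)
{
	assert(stream != NULL && size > 0);
	stream->buffer = malloc(size);
	if (!stream->buffer)
		return -1;
	stream->size = size;
	return 0;
}

/*
 * "dispose": free what the stream owns and reset it, but do not free
 * the struct, which may live on the stack or inside another object.
 */
void toy_stream_dispose(toy_stream *stream)
{
	assert(stream != NULL);
	free(stream->buffer);
	memset(stream, 0, sizeof(*stream));
}
```

A caller can therefore declare `toy_stream stream;` on the stack, open it, and dispose of it without any pointer to the struct ever being passed to `free`.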
* Fix unpack double free (lhchavez, 2017-12-23; 1 file, -1/+4)
| If an element has been cached, but then the call to packfile_unpack_compressed() fails, the very next thing that happens is that its data is freed and then the element is not removed from the cache, which frees the data again.
|
| This change sets obj->data to NULL to avoid the double-free. It also stops trying to resolve deltas after two continuous failed rounds of resolution, and adds a test for this.
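The general pattern behind this kind of fix is to clear a pointer once its memory has been released, so that a second cleanup path becomes harmless. A minimal sketch, with a hypothetical `toy_rawobj` type standing in for the cached object:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object whose data may be released from two code paths. */
typedef struct {
	void *data;
} toy_rawobj;

/*
 * Release the data and forget the pointer. Because free(NULL) is a
 * no-op, calling this again (e.g. from cache cleanup) is safe.
 */
void toy_rawobj_release(toy_rawobj *obj)
{
	assert(obj != NULL);
	free(obj->data);
	obj->data = NULL;
}
```

Without the assignment to `NULL`, the second call would pass an already-freed pointer to `free`, which is undefined behaviour.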
* Simplified overflow condition (lhchavez, 2017-12-15; 1 file, -3/+1)
|
* Using unsigned instead (lhchavez, 2017-12-09; 1 file, -6/+8)
|
* libFuzzer: Prevent a potential shift overflow (lhchavez, 2017-12-08; 1 file, -1/+1)
| The type of |base_offset| in get_delta_base() is `git_off_t`, which is a signed `long`. That means that we need to make sure that the 8 most significant bits are zero (instead of 7) to avoid an overflow when it is shifted by 7 bits.
|
| Found using libFuzzer.
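Why eight bits rather than seven: a signed 64-bit value has 63 value bits, so a left shift by 7 moves bit 56 into the sign bit. The sketch below decodes a simplified 7-bits-per-byte number with that guard. It is an illustration under assumptions, not the exact offset encoding used by get_delta_base(), and `toy_decode_offset` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a most-significant-group-first, 7-bits-per-byte number into a
 * signed 64-bit result. A set high bit in a byte means "more follows".
 * Returns 0 on success, -1 on overflow or truncated input.
 */
int toy_decode_offset(const unsigned char *buf, size_t len, int64_t *out)
{
	uint64_t value = 0;
	size_t i;

	assert(buf != NULL && out != NULL);
	for (i = 0; i < len; i++) {
		/*
		 * Bits 56..63 must all be clear: after "<< 7", bit 56
		 * would land in bit 63, the sign bit of int64_t.
		 */
		if (value >> 56)
			return -1;
		value = (value << 7) | (uint64_t)(buf[i] & 0x7f);
		if (!(buf[i] & 0x80)) {
			*out = (int64_t)value;
			return 0;
		}
	}
	return -1; /* input ended before the terminating byte */
}
```

Accumulating in an unsigned type keeps the shift itself well defined; the guard then ensures the final value still fits the signed result.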
* Merge pull request #4288 from pks-t/pks/include-fixups (Edward Thomson, 2017-08-15; 1 file, -2/+2)
|\
| | Include fixups
| * Make sure to always include "common.h" first (Patrick Steinhardt, 2017-07-03; 1 file, -2/+2)
| | Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first.
| |
| | This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as their first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves.
| |
| | This fixes the outlined problems and will become our standard practice for header and source files inside of "src/" from now on.
* | sha1_lookup: drop sha1_entry_pos function [peff/drop-sha1-entry-pos] (Jeff King, 2017-08-09; 1 file, -4/+0)
|/
| This was pulled over from git.git, and is an experiment in making binary-searching lists of sha1s faster. It was never compiled by default (nor was it used upstream by default without a special environment variable).
|
| Unfortunately, it is actually slower in practice, and upstream is planning to drop it in git/git@f1068efefe6dd3beaa89484db5e2db730b094e0b (which has some timing results). It's worth doing the same here for simplicity.
* buffer: use `git_buf_init` with length (Patrick Steinhardt, 2017-06-08; 1 file, -2/+5)
| The `git_buf_init` function has an optional length parameter, which will cause the buffer to be initialized and allocated in one step. This can be used instead of static initialization with `GIT_BUF_INIT` followed by a `git_buf_grow`. This patch does so for two functions where it is applicable.
* buffer: rely on `GITERR_OOM` set by `git_buf_try_grow` (Patrick Steinhardt, 2017-06-08; 1 file, -1/+0)
| The function `git_buf_try_grow` consistently calls `giterr_set_oom` whenever growing the buffer fails due to insufficient memory being available. So in fact, we do not have to do this ourselves when a call to any buffer-growing function has failed due to an OOM situation. But we still do so in two functions, which this patch cleans up.
* pack: fix looping over cache entries (Jason Haslam, 2017-02-22; 1 file, -3/+3)
| Fixes a regression from #4092. This is a crash on 32-bit and I assume that it doesn't do the right thing on 64-bit either. MSVC emits a warning for this, but of course, it's easy to get lost among all of the similar 'possible loss of data' warnings.
* offmap: remove GIT__USE_OFFMAP macro (Patrick Steinhardt, 2017-02-17; 1 file, -2/+0)
|
* oidmap: remove GIT__USE_OIDMAP macro (Patrick Steinhardt, 2017-02-17; 1 file, -1/+0)
|
* khash: avoid using `kh_key`/`kh_val` as lvalue (Patrick Steinhardt, 2017-02-17; 1 file, -1/+1)
|
* khash: avoid using `kh_put` directly (Patrick Steinhardt, 2017-02-17; 1 file, -1/+1)
|
* khash: avoid using `kh_del` directly (Patrick Steinhardt, 2017-02-17; 1 file, -1/+1)
|
* khash: avoid using `kh_val`/`kh_value` directly (Patrick Steinhardt, 2017-02-17; 1 file, -3/+3)
|
* khash: avoid using `kh_get` directly (Patrick Steinhardt, 2017-02-17; 1 file, -2/+2)
|
* khash: avoid using `kh_end` directly (Patrick Steinhardt, 2017-02-17; 1 file, -2/+2)
|
* khash: use `git_map_exists` where applicable (Patrick Steinhardt, 2017-02-17; 1 file, -1/+1)
|
* khash: avoid using `kh_foreach`/`kh_foreach_value` directly (Patrick Steinhardt, 2017-02-17; 1 file, -12/+6)
|
* indexer: introduce `git_packfile_close` (Edward Thomson, 2017-01-21; 1 file, -4/+13)
| Encapsulation!
* giterr_set: consistent error messages (Edward Thomson, 2016-12-29; 1 file, -6/+6)
| Error messages should be sentence fragments, and therefore:
|
| 1. Should not begin with a capital letter,
| 2. Should not conclude with punctuation, and
| 3. Should not end a sentence and begin a new one
* Merge pull request #4027 from pks-t/pks/pack-deref-cache-on-error (Carlos Martín Nieto, 2016-12-19; 1 file, -1/+4)
|\
| | pack: dereference cached pack entry on error
| * pack: dereference cached pack entry on error (Patrick Steinhardt, 2016-12-12; 1 file, -1/+4)
| | When trying to uncompress deltas in a packfile's delta chain, we try to add object bases to the packfile cache, subsequently decrementing its reference count if it has been added successfully. This may lead to a mismatched reference count in the case where we exit the loop early due to an encountered error.
| |
| | Fix the issue by decrementing the reference count in error cleanup.
* | Fix potential use of uninitialized values (Patrick Steinhardt, 2016-12-12; 1 file, -1/+3)
|/
* pack: fix race in pack_entry_find_offset (Patrick Steinhardt, 2016-11-02; 1 file, -5/+5)
| In `pack_entry_find_offset`, we try to find the offset of a certain object in the pack file. To do so, we first check whether the packfile has already been opened and open it if not. Opening the packfile is guarded with a mutex, so concurrent access to this is in fact safe.
|
| What is not thread-safe though is our calculation of offsets inside the packfile. Assume two threads calling `pack_entry_find_offset` at the same time. We first calculate the offset and index location and only then determine if the pack has already been opened. If it has not, we open it and re-calculate the offset and index address.
|
| Now the case for two threads: thread 1 first calculates the addresses and is subsequently suspended. The second thread will now call `pack_index_open` and initialize the pack file, calculating its addresses correctly. When the first thread is resumed, it will see that the pack file has already been initialized and will happily proceed with the addresses it calculated before the check. As the pack file was not initialized at that point, these addresses are bogus.
|
| Fix the issue by only calculating the addresses after having checked if the pack file is open.
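The ordering that the fix establishes (make sure the pack is open first, derive addresses second) can be sketched like this. The `toy_pack` struct, `toy_index` data and function names are hypothetical, and POSIX threads are assumed for the mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-in for the mapped index file contents. */
static const unsigned char toy_index[64] = { 0 };

/* Hypothetical pack handle; index_data is valid only once opened. */
typedef struct {
	pthread_mutex_t lock;
	int opened;
	const unsigned char *index_data;
} toy_pack;

/* Open the index at most once; the mutex serializes initialization. */
static int toy_pack_open(toy_pack *p)
{
	pthread_mutex_lock(&p->lock);
	if (!p->opened) {
		p->index_data = toy_index;
		p->opened = 1;
	}
	pthread_mutex_unlock(&p->lock);
	return 0;
}

/* Return the address of the entry at byte position `pos`. */
const unsigned char *toy_find_entry(toy_pack *p, size_t pos)
{
	assert(p != NULL && pos < sizeof(toy_index));

	/* Step 1: ensure the pack is open before touching its fields. */
	if (toy_pack_open(p) < 0)
		return NULL;

	/* Step 2: only now compute addresses from the opened data. */
	return p->index_data + pos;
}
```

Computing `p->index_data + pos` before step 1 would reproduce the bug: a thread suspended between the two steps could resume with an address derived from an uninitialized pointer.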
* delta: move delta application to delta.c (Edward Thomson, 2016-05-26; 1 file, -3/+4)
| Move the delta application functions into `delta.c`, next to the similar delta creation functions. Make the `git__delta_apply` functions adhere to other naming and parameter style within the library.
* odb: avoid inflating the full delta to read the header [cmn/faster-header] (Carlos Martín Nieto, 2016-05-02; 1 file, -6/+5)
| When we read the header, we want to know the size and type of the object. We're currently inflating the full delta in order to read the first few bytes. This can mean hundreds of kB needlessly inflated for large objects.
|
| Instead use a packfile stream to read just enough so we can read the two varints in the header and avoid inflating most of the delta.
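Parsing two size varints needs only the first few bytes of the inflated delta. The sketch below assumes the usual 7-bits-per-byte, least-significant-group-first encoding of the base size followed by the result size. It works on an already-inflated prefix and does not model the packfile stream itself; the `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one little-endian base-128 varint; advance *cursor on success. */
static int toy_read_varint(const unsigned char **cursor,
	const unsigned char *end, uint64_t *out)
{
	const unsigned char *p = *cursor;
	uint64_t value = 0;
	unsigned int shift = 0;

	while (p < end && shift < 64) {
		unsigned char c = *p++;

		value |= (uint64_t)(c & 0x7f) << shift;
		shift += 7;
		if (!(c & 0x80)) {
			*cursor = p;
			*out = value;
			return 0;
		}
	}
	return -1; /* truncated input or too many continuation bytes */
}

/* Parse the base size and result size from the start of a delta. */
int toy_delta_header(const unsigned char *buf, size_t len,
	uint64_t *base_size, uint64_t *result_size)
{
	const unsigned char *cursor = buf;
	const unsigned char *end = buf + len;

	assert(buf != NULL && base_size != NULL && result_size != NULL);
	if (toy_read_varint(&cursor, end, base_size) < 0)
		return -1;
	return toy_read_varint(&cursor, end, result_size);
}
```

Since each varint occupies at most ten bytes for a 64-bit value, a reader that inflates a small fixed prefix has everything this parser needs.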
* Merge pull request #3575 from pmq20/master-13jan16 (Carlos Martín Nieto, 2016-03-31; 1 file, -3/+0)
|\
| | Remove duplicated calls to git_mwindow_close
| * Remove duplicated calls to git_mwindow_close (P.S.V.R, 2016-01-13; 1 file, -3/+0)
| |
* | odb: improved not found error messages (Edward Thomson, 2016-03-07; 1 file, -5/+5)
| | When looking up an abbreviated oid, show the actual (abbreviated) oid the caller passed instead of a full (but ambiguously truncated) oid.
* | pack: don't allow a negative offset [cmn/idx-extra-check] (Carlos Martín Nieto, 2016-02-25; 1 file, -0/+5)
| |
* | pack: make sure we don't go out of bounds for extended entries (Carlos Martín Nieto, 2016-02-25; 1 file, -1/+13)
| | A corrupt index might have data that tells us to go look past the end of the file for data. Catch these cases and return an appropriate error message.
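A bounds check of this kind can be sketched on a simplified model of a two-level offset table: a 32-bit entry either holds the offset directly or, when its top bit is set, holds an index into a separate table of 64-bit offsets. The function name, the host-byte-order arrays and the parameter layout are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Look up the offset for entry `pos`. `n64` is the number of entries
 * that actually exist in the 64-bit table. Returns -1 if the 32-bit
 * entry refers to an extended slot beyond the table.
 */
int toy_entry_offset(const uint32_t *offsets32, const uint64_t *offsets64,
	size_t n64, size_t pos, uint64_t *out)
{
	uint32_t raw;
	size_t ext;

	assert(offsets32 != NULL && out != NULL);
	raw = offsets32[pos];

	if (!(raw & 0x80000000u)) {
		*out = raw; /* small offset stored inline */
		return 0;
	}

	ext = raw & 0x7fffffffu;
	if (ext >= n64)
		return -1; /* corrupt index: points past the table */

	*out = offsets64[ext];
	return 0;
}
```

The index value comes straight from file data, so it has to be validated against the table size before it is used as an array subscript.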
* | pack: do not free passed in pointer on error (Patrick Steinhardt, 2016-02-09; 1 file, -1/+0)
| | The function `git_packfile_stream_open` tries to free the passed in stream when an error occurs. The only call site is `git_indexer_append`, though, which passes in the address of a stream struct which has not been allocated on the heap.
| |
| | Fix the issue by simply removing the call to free. In case of an error we did not allocate any memory yet and otherwise it should be the caller's responsibility to manage its object's lifetime.
* | Make packfile_unpack_compressed a private API (P.S.V.R, 2016-01-13; 1 file, -2/+2)
|/
* Remove extra semicolon outside of a function (Stefan Widgren, 2015-07-31; 1 file, -2/+2)
| Without this change, compiling with gcc and pedantic generates a warning: ISO C does not allow extra ‘;’ outside of a function.
* pack: use git_buf when building the index name (Carlos Martín Nieto, 2015-06-10; 1 file, -10/+11)
| The way we currently do it depends on the subtlety of strlen vs sizeof and the fact that .pack is one longer than .idx. Let's use a git_buf so we can express the manipulation we want much more clearly.
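The intended manipulation, replacing a trailing `.pack` with `.idx`, can be written without relying on the two extensions' relative lengths. The sketch below uses standard `snprintf` in place of `git_buf`, which is a swapped-in technique; `toy_index_name` is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the index file name for `pack_path` into `out`.
 * Returns -1 if the path does not end in ".pack" or `out` is too small.
 */
int toy_index_name(char *out, size_t out_size, const char *pack_path)
{
	static const char pack_ext[] = ".pack";
	size_t len, ext_len;
	int written;

	assert(out != NULL && pack_path != NULL);
	len = strlen(pack_path);
	ext_len = strlen(pack_ext);

	if (len < ext_len || strcmp(pack_path + len - ext_len, pack_ext) != 0)
		return -1;

	/* Copy everything before the extension, then append ".idx". */
	written = snprintf(out, out_size, "%.*s.idx",
		(int)(len - ext_len), pack_path);
	if (written < 0 || (size_t)written >= out_size)
		return -1; /* output truncated or encoding error */
	return 0;
}
```

Because the stem length is computed explicitly and the result is length-checked, nothing here depends on `.pack` being one character longer than `.idx`.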
* indexer: don't look for the index we're creating (Edward Thomson, 2015-05-22; 1 file, -0/+7)
| When creating an index, know that we do not have an index for our own packfile, preventing some unnecessary file opens and error reporting.
* Reorder some khash declarations (Carlos Martín Nieto, 2015-03-11; 1 file, -0/+3)
| Keep the definitions in the headers, while putting the declarations in the C files. Putting the function definitions in headers causes them to be duplicated if you include two headers with them.