path: root/src/odb_loose.c
Commit message (Author, Age, Files, Lines)
* path: separate git-specific path functions from util (Edward Thomson, 2021-11-09, 1 file, -10/+10)
| | | | | | Introduce `git_fs_path`, which operates on generic filesystem paths. `git_path` will be kept for only git-specific path functionality (for example, checking for `.git` in a path).
* str: introduce `git_str` for internal, `git_buf` is external [ethomson/gitstr] (Edward Thomson, 2021-10-17, 1 file, -58/+58)
| libgit2 has two distinct requirements that were previously solved by `git_buf`. We require:
|
| 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc).
| 2. A structure that we can use to return strings to callers that they can take ownership of.
|
| By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult.
|
| Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr").
|
| The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.)
|
| Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
* hash: accept the algorithm in inputs (Edward Thomson, 2021-10-01, 1 file, -1/+1)
|
* odb_loose: use GIT_ASSERT (Edward Thomson, 2020-11-27, 1 file, -18/+28)
|
* Make the tests pass cleanly with MemorySanitizer (lhchavez, 2020-06-30, 1 file, -3/+3)
| This change:
|
|   * Initializes a few variables that were being read before being initialized.
|   * Includes https://github.com/madler/zlib/pull/393. As such, it only works reliably with `-DUSE_BUNDLED_ZLIB=ON`.
* odb: use `git_object_size_t` for object size (Edward Thomson, 2019-11-22, 1 file, -2/+2)
| | | | | Instead of using a signed type (`off_t`) use a new `git_object_size_t` for the sizes of objects.
* fileops: rename to "futils.h" to match function signatures (Patrick Steinhardt, 2019-07-20, 1 file, -1/+1)
| | | | | | | | | Our file utils functions all have a "futils" prefix, e.g. `git_futils_touch`. One would thus naturally guess that their definitions and implementation would live in files "futils.h" and "futils.c", respectively, but in fact they live in "fileops.h". Rename the files to match expectations.
* odb loose: only read at most INT_MAX (Edward Thomson, 2019-06-24, 1 file, -7/+14)
|
* odb_loose: explicitly cast to size_t (Edward Thomson, 2019-01-25, 1 file, -1/+1)
| | | | | | Quiet down a warning from MSVC about how we're potentially losing data. This is safe since we've explicitly tested that it's positive and less than SIZE_MAX.
* git_error: use new names in internal APIs and usage (Edward Thomson, 2019-01-22, 1 file, -22/+22)
| | | | | Move to the `git_error` name in the internal API for error-related functions.
* object_type: GIT_OBJECT_BAD is now GIT_OBJECT_INVALID (Edward Thomson, 2019-01-17, 1 file, -3/+3)
| | | | | | | We use the term "invalid" to refer to bad or malformed data, eg `GIT_REF_INVALID` and `GIT_EINVALIDSPEC`. Since we're changing the names of the `git_object_t`s in this release, update it to be `GIT_OBJECT_INVALID` instead of `BAD`.
* Merge pull request #4906 from QBobWatson/bugfix (Edward Thomson, 2018-12-06, 1 file, -5/+9)
|\ | | | | Fix segfault in loose_backend__readstream
| * Typesetting conventions (Joe Rabinoff, 2018-12-06, 1 file, -9/+9)
| |
| * Removed one null check (Joe Rabinoff, 2018-12-04, 1 file, -3/+2)
| |
| * Fix segfault in loose_backend__readstream (Joe Rabinoff, 2018-12-04, 1 file, -5/+10)
| | | | | | | | | | If the routine exits with error before stream or hash_ctx is initialized, the program will segfault when trying to free them.
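The failure mode described in this commit can be sketched as follows. This is a minimal illustration, not the libgit2 code: `demo_stream`, `demo_open_stream`, and `demo_close_stream` are hypothetical names, and the `fail_early` flag merely simulates an error that happens before anything is allocated. The point is that every resource starts as NULL and the shared cleanup path checks before dereferencing.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a read stream that owns a buffer. */
typedef struct {
	char *buffer;
} demo_stream;

/*
 * Hypothetical open routine. Every pointer starts out NULL so that the
 * shared cleanup path is safe no matter where the failure happened.
 */
int demo_open_stream(demo_stream **out, int fail_early)
{
	demo_stream *stream = NULL;

	if (fail_early)
		goto on_error;          /* nothing has been allocated yet */

	stream = calloc(1, sizeof(*stream));
	if (stream == NULL)
		goto on_error;

	stream->buffer = malloc(64);
	if (stream->buffer == NULL)
		goto on_error;

	*out = stream;
	return 0;

on_error:
	if (stream != NULL) {       /* guard: stream may never have been created */
		free(stream->buffer);   /* free(NULL) is a no-op */
		free(stream);
	}
	return -1;
}

/* Release a stream created by demo_open_stream; accepts NULL. */
void demo_close_stream(demo_stream *stream)
{
	if (stream == NULL)
		return;
	free(stream->buffer);
	free(stream);
}
```

Without the NULL guard in the error path, the early failure would dereference an uninitialized pointer, which is the kind of crash the commit message describes.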
* | object_type: use new enumeration names [ethomson/index_fixes] (Edward Thomson, 2018-12-01, 1 file, -10/+10)
|/ | | | Use the new object_type enumeration names within the codebase.
* Convert usage of `git_buf_free` to new `git_buf_dispose` (Patrick Steinhardt, 2018-06-10, 1 file, -13/+13)
|
* odb_loose: only close file descriptor if it was opened successfully (Patrick Steinhardt, 2018-02-09, 1 file, -1/+2)
|
* odb: fix memory leaks due to not freeing hash context (Patrick Steinhardt, 2018-02-09, 1 file, -0/+1)
|
* odb: error when we can't create object header (Edward Thomson, 2018-02-09, 1 file, -4/+10)
| | | | | Return an error to the caller when we can't create an object header for some reason (printf failure) instead of simply asserting.
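As an illustration of returning an error instead of asserting, here is a minimal sketch that formats a loose object header of the form `<type> <size>` followed by a NUL byte. The function name `demo_format_header` and its signature are assumptions made for this example; only the idea of checking the `snprintf` result comes from the commit message.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "<type> <size>\0" into hdr.
 * Returns 0 and stores the header length (including the NUL) in out_len,
 * or -1 if formatting failed or the header did not fit.
 */
int demo_format_header(size_t *out_len, char *hdr, size_t hdr_size,
                       const char *type, unsigned long long size)
{
	int len = snprintf(hdr, hdr_size, "%s %llu", type, size);

	/* A negative result is an output error; len >= hdr_size is truncation. */
	if (len < 0 || (size_t)len >= hdr_size)
		return -1;

	*out_len = (size_t)len + 1;
	return 0;
}
```

The caller can then propagate the `-1` upward rather than aborting the process.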
* odb_loose: HEADER_LEN -> MAX_HEADER_LEN [ethomson/odb_loose_readstream] (Edward Thomson, 2018-02-01, 1 file, -7/+7)
| | | | `MAX_HEADER_LEN` is a more descriptive constant name.
* odb_loose: validate length when checking for zlib content (Edward Thomson, 2018-02-01, 1 file, -4/+7)
| | | | | When checking to see if a file has zlib deflate content, make sure that we actually have read at least two bytes before examining the array.
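A length-checked test for a zlib header might look like the sketch below. The name `demo_is_zlib_compressed` is hypothetical. The test itself follows the zlib stream header layout: the first byte carries the deflate method (8) in its low nibble with the top bit clear, and the first two bytes read as a big-endian 16-bit value are a multiple of 31.

```c
#include <stddef.h>

/* Hypothetical helper: nonzero if data starts with a plausible zlib header. */
int demo_is_zlib_compressed(const unsigned char *data, size_t len)
{
	unsigned int w;

	/* Never look at data[0] or data[1] unless both bytes were read. */
	if (len < 2)
		return 0;

	w = ((unsigned int)data[0] << 8) + data[1];
	return (data[0] & 0x8f) == 0x08 && (w % 31) == 0;
}
```

For instance, the common header bytes `0x78 0x9c` pass the check, while a single byte or plain text such as `bl` does not.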
* odb_loose: `read_header` for packlike loose objects (Edward Thomson, 2018-02-01, 1 file, -20/+46)
| | | | | | | | | | | Support `read_header` for "packlike loose objects", which were a temporarily and uncommonly used loose object format that encodes the header before the zlib deflate data. This will never actually be seen in the wild, but add support for it for completeness and (more importantly) because our corpus of test data has objects in this format, so it's easier to support it than to try to special case it.
* odb_loose: read_header should use zstream (Edward Thomson, 2018-02-01, 1 file, -85/+24)
| | | | | Make `read_header` use the common zstream implementation. Remove the now unnecessary zlib wrapper in odb_loose.
* odb_loose: packlike loose objects use `git_zstream` (Edward Thomson, 2018-02-01, 1 file, -88/+71)
| | | | | Refactor packlike loose object reads to use `git_zstream` for simplification.
* odb: loose object streaming for packlike loose objects (Edward Thomson, 2018-02-01, 1 file, -37/+84)
| | | | | | | A "packlike" loose object was a briefly lived loose object format where the type and size were encoded in uncompressed space at the beginning of the file, followed by the compressed object contents. Handle these in a streaming manner as well.
* odb: introduce streaming loose object reader (Edward Thomson, 2018-02-01, 1 file, -7/+148)
| | | | Provide a streaming loose object reader.
* odb_loose: stream -> writestream (Edward Thomson, 2018-02-01, 1 file, -8/+8)
| | | | | | | There are two streaming functions; one for reading, one for writing. Disambiguate function names between `stream` and `writestream` to make allowances for a read stream.
* odb_loose: reject objects that cannot fit in memory (Edward Thomson, 2017-12-20, 1 file, -0/+5)
| | | | | | | Check the size of objects being read from the loose odb backend and reject those that would not fit in memory with an error message that reflects the actual problem, instead of error'ing later with an unintuitive error message regarding truncation or invalid hashes.
* odb: support large loose objects (Edward Thomson, 2017-12-20, 1 file, -98/+92)
| | | | | | | | zlib will only inflate/deflate an `int`'s worth of data at a time. We need to loop through large files in order to ensure that we inflate the entire file, not just an `int`'s worth of data. Thankfully, we already have this loop in our `git_zstream` layer. Handle large objects using the `git_zstream`.
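The looping idea can be sketched without zlib itself. The code below is only an illustration of the chunking arithmetic, using hypothetical names (`demo_next_chunk`, `demo_count_chunks`) and assuming `INT_MAX` as the per-call cap; it is not the `git_zstream` implementation.

```c
#include <limits.h>
#include <stdint.h>

/* Largest slice to hand to an API whose byte counter is int-sized. */
unsigned int demo_next_chunk(uint64_t remaining)
{
	if (remaining > (uint64_t)INT_MAX)
		return (unsigned int)INT_MAX;
	return (unsigned int)remaining;
}

/* Walk `total` bytes in capped slices and return how many slices it took. */
uint64_t demo_count_chunks(uint64_t total)
{
	uint64_t chunks = 0;

	while (total > 0) {
		unsigned int n = demo_next_chunk(total);

		/* A real loop would feed n bytes to the compressor here. */
		total -= n;
		chunks++;
	}

	return chunks;
}
```

An input of exactly `INT_MAX` bytes is handled in one pass, while one byte more needs a second pass, which is the case a single inflate call would silently miss.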
* Make sure to always include "common.h" first (Patrick Steinhardt, 2017-07-03, 1 file, -0/+1)
| | | | | | | | | | | | | | | | | | | | | | Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first. This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves. This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
* settings: rename `GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION` (Patrick Steinhardt, 2017-06-08, 1 file, -1/+1)
| | | | | | | | | | | Initially, the setting has been solely used to enable the use of `fsync()` when creating objects. Since then, the use has been extended to also cover references and index files. As the option is not yet part of any release, we can still correct this by renaming the option to something more sensible, indicating not only correlation to objects. This commit renames the option to `GIT_OPT_ENABLE_FSYNC_GITDIR`. We also move the variable from the object to repository source code.
* object validation: free some memleaks [ethomson/memleak] (Edward Thomson, 2017-05-01, 1 file, -0/+6)
|
* fsync: call it "synchronous" object writing (Edward Thomson, 2017-02-28, 1 file, -1/+1)
| | | | | Rename `GIT_OPT_ENABLE_SYNCHRONIZED_OBJECT_CREATION` -> `GIT_OPT_ENABLE_SYNCHRONOUS_OBJECT_CREATION`.
* Add `ENABLE_SYNCHRONIZED_OBJECT_CREATION` option (Edward Thomson, 2017-02-28, 1 file, -1/+2)
| | | | Allow users to enable `SYNCHRONIZED_OBJECT_CREATION` with a setting.
* odb_loose: actually honor the fsync option (Edward Thomson, 2017-02-28, 1 file, -6/+13)
| | | | | We've had an fsync option for a long time, but it was "ignored". Stop ignoring it.
* giterr_set: consistent error messages (Edward Thomson, 2016-12-29, 1 file, -6/+6)
| Error messages should be sentence fragments, and therefore:
|
| 1. Should not begin with a capital letter,
| 2. Should not conclude with punctuation, and
| 3. Should not end a sentence and begin a new one
* odb: only freshen pack files every 2 seconds [ethomson/refresh_objects] (Edward Thomson, 2016-08-04, 1 file, -1/+1)
| | | | | | Since multiple objects being written may all already exist in a single packfile, avoid freshening that packfile repeatedly in a tight loop. Instead, only freshen pack files every 2 seconds.
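The throttling rule can be expressed as a small predicate. This is a sketch under assumptions: `demo_should_freshen` and `DEMO_FRESHEN_INTERVAL` are hypothetical names, and only the two-second interval is taken from the commit message.

```c
#include <time.h>

/* Interval in seconds; the value comes from the commit message. */
#define DEMO_FRESHEN_INTERVAL 2.0

/* Hypothetical throttle: nonzero if enough time has passed to touch the file again. */
int demo_should_freshen(time_t last_freshen, time_t now)
{
	return difftime(now, last_freshen) >= DEMO_FRESHEN_INTERVAL;
}
```

A caller writing many objects in a tight loop would record the time of the last freshen per packfile and skip the filesystem update whenever this predicate returns zero.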
* odb: freshen existing objects when writing (Edward Thomson, 2016-08-04, 1 file, -0/+18)
| | | | | | When writing an object, we calculate its OID and see if it exists in the object database. If it does, we need to freshen the file that contains it.
* delta: move delta application to delta.c (Edward Thomson, 2016-05-26, 1 file, -1/+1)
| | | | | | | Move the delta application functions into `delta.c`, next to the similar delta creation functions. Make the `git__delta_apply` functions adhere to other naming and parameter style within the library.
* odb_loose: fix undefined behavior when computing size (Patrick Steinhardt, 2016-05-02, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | An object's size is computed by reading the object header's size field until the most significant bit is not set anymore. To get the total size, we increase the shift on each iteration and add the shifted value to the total size. We read the current value into a variable of type `unsigned char`, from which we then take all bits except the most significant bit and shift the result. We will end up with a maximum shift of 60, but this exceeds the width of the value's type, resulting in undefined behavior. Fix the issue by instead reading the values into a variable of type `unsigned long`, which matches the required width. This is equivalent to git.git, which uses an `unsigned long` as well.
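The size computation described here can be sketched as below. This is an illustrative decoder, not the libgit2 function: `demo_parse_size` is a hypothetical name, and the layout (four size bits in the first byte, seven in each continuation byte) is an assumption based on the pack-style header this format borrows. The relevant detail is that each byte is stored in an `unsigned long` before it is shifted.

```c
#include <stddef.h>

/*
 * Hypothetical helper: decode the variable-length size field.
 * Byte 0 holds the low 4 size bits; each later byte contributes 7 more.
 * Returns the number of bytes consumed, or 0 on truncated or oversized input.
 */
size_t demo_parse_size(unsigned long *out, const unsigned char *data, size_t len)
{
	unsigned long c, size;
	unsigned int shift = 4;
	size_t used = 0;

	if (len == 0)
		return 0;

	c = data[used++];
	size = c & 0x0f;

	while (c & 0x80) {
		/* Stop on truncation, and never shift by the full width or more. */
		if (used >= len || shift >= sizeof(unsigned long) * 8)
			return 0;

		c = data[used++];   /* widened to unsigned long before the shift */
		size |= (c & 0x7f) << shift;
		shift += 7;
	}

	*out = size;
	return used;
}
```

Had `c` been an `unsigned char`, the shift would be performed at `int` width after promotion, so the larger shift counts would be undefined behavior.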
* odb: improved not found error messages (Edward Thomson, 2016-03-07, 1 file, -8/+12)
| | | | | When looking up an abbreviated oid, show the actual (abbreviated) oid the caller passed instead of a full (but ambiguously truncated) oid.
* git_futils_mkdir_*: make a relative-to-base mkdir (Edward Thomson, 2015-09-17, 1 file, -2/+2)
| | | | | | | | | | | | Untangle git_futils_mkdir from git_futils_mkdir_ext - the latter assumes that we own everything beneath the base, as if it were being called with a base of the repository or working directory, and is tailored towards checkout and ensuring that there is no bogosity beneath the base that must be cleaned up. This is (at best) slow and (at worst) unsafe in the larger context of a filesystem where we do not own things and cannot do things like unlink symlinks that are in our way.
* odb: make the writestream's size a git_off_t [cmn/stream-size] (Carlos Martín Nieto, 2015-05-13, 1 file, -2/+2)
| | | | | | | | | | Restricting files to size_t is a silly limitation. The loose backend writes to a file directly, so there is no issue in using 63 bits for the size. We still assume that the header is going to fit in 64 bytes, which does mean quite a bit smaller files due to the run-length encoding, but it's still a much larger size than you would want Git to handle.
* Make our overflow check look more like gcc/clang's (Edward Thomson, 2015-02-13, 1 file, -20/+22)
| | | | | | | | | Make our overflow checking look more like gcc and clang's, so that we can substitute it out with the compiler intrinsics on platforms that support it. This means dropping the ability to pass `NULL` as an out parameter. As a result, the macros are updated to reflect this as well.
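A portable fallback with that shape might look like the following sketch. The function names are hypothetical; the convention (return nonzero on overflow, always write through a non-NULL out pointer) mirrors what the GCC and Clang overflow builtins do, which is what makes the substitution possible.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fallback: returns 1 on overflow, else stores a + b in *out. */
int demo_add_sizet_overflow(size_t *out, size_t a, size_t b)
{
	if (a > SIZE_MAX - b)
		return 1;
	*out = a + b;
	return 0;
}

/* Hypothetical fallback: returns 1 on overflow, else stores a * b in *out. */
int demo_multiply_sizet_overflow(size_t *out, size_t a, size_t b)
{
	if (b != 0 && a > SIZE_MAX / b)
		return 1;
	*out = a * b;
	return 0;
}
```

Because `out` is always dereferenced on success, callers can no longer pass `NULL` just to test for overflow, which matches the restriction the commit message mentions.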
* allocations: test for overflow of requested size (Edward Thomson, 2015-02-12, 1 file, -1/+11)
| | | | | Introduce some helper macros to test integer overflow from arithmetic and set error message appropriately.
* Spelling fixes (Will Stamper, 2014-12-04, 1 file, -1/+1)
|
* Factor 40 and 41 constants from source. (Ciro Santilli, 2014-09-16, 1 file, -1/+1)
|
* odb: ignore files in the objects dir [cmn/file-in-objects-dir] (Carlos Martín Nieto, 2014-05-05, 1 file, -0/+4)
| | | | | | | | We assume that everything under GIT_DIR/objects/ is a directory. This is not necessarily the case if some process left a stray file in there. Check beforehand if we do have a directory and ignore the entry otherwise.
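The directory check can be sketched with POSIX `stat`. The helper name `demo_is_directory` is hypothetical, and the example assumes a POSIX system; it is not the libgit2 implementation.

```c
#include <sys/stat.h>

/* Hypothetical helper: nonzero only if path exists and is a directory. */
int demo_is_directory(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return 0;

	return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

A loop over the entries of the objects directory would call such a check first and skip any entry for which it returns zero, so a stray file is ignored instead of being treated as a fan-out directory.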
* Check short OID len in odb, not in backends (Russell Belfer, 2014-03-05, 1 file, -9/+3)
|