From d55fc5128b26a64c2e7b6612d0442c9e924696e8 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Thu, 1 Feb 2024 08:51:56 +0100 Subject: [PATCH 01/51] reftable/reader: be more careful about errors in indexed seeks When doing an indexed seek we first need to do a linear seek in order to find the index block for our wanted key. We do not check the returned error of the linear seek though. This is likely not an issue because the next call to `table_iter_next()` would return error, too. But it very much is a code smell when an error variable is being assigned to without actually checking it. Safeguard the code by checking for errors. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- reftable/reader.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/reftable/reader.c b/reftable/reader.c index 64dc366fb15935..278f727a3dfe33 100644 --- a/reftable/reader.c +++ b/reftable/reader.c @@ -509,6 +509,9 @@ static int reader_seek_indexed(struct reftable_reader *r, goto done; err = reader_seek_linear(&index_iter, &want_index); + if (err < 0) + goto done; + while (1) { err = table_iter_next(&index_iter, &index_result); table_iter_block_done(&index_iter); From 9ebb2d7b08c9f17a846b7c90082c9d15b9f6c9d2 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Thu, 1 Feb 2024 08:52:00 +0100 Subject: [PATCH 02/51] reftable/writer: use correct type to iterate through index entries The reftable writer is tracking the number of blocks it has to index via the `index_len` variable. But while this variable is of type `size_t`, some sites use an `int` to loop through the index entries. Convert the code to consistently use `size_t`. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- reftable/writer.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/reftable/writer.c b/reftable/writer.c index 92935baa703692..5a0b87b40665a6 100644 --- a/reftable/writer.c +++ b/reftable/writer.c @@ -379,20 +379,21 @@ int reftable_writer_add_logs(struct reftable_writer *w, static int writer_finish_section(struct reftable_writer *w) { + struct reftable_block_stats *bstats = NULL; uint8_t typ = block_writer_type(w->block_writer); uint64_t index_start = 0; int max_level = 0; - int threshold = w->opts.unpadded ? 1 : 3; + size_t threshold = w->opts.unpadded ? 1 : 3; int before_blocks = w->stats.idx_stats.blocks; - int err = writer_flush_block(w); - int i = 0; - struct reftable_block_stats *bstats = NULL; + int err; + + err = writer_flush_block(w); if (err < 0) return err; while (w->index_len > threshold) { struct reftable_index_record *idx = NULL; - int idx_len = 0; + size_t i, idx_len; max_level++; index_start = w->next; @@ -630,11 +631,8 @@ int reftable_writer_close(struct reftable_writer *w) static void writer_clear_index(struct reftable_writer *w) { - int i = 0; - for (i = 0; i < w->index_len; i++) { + for (size_t i = 0; i < w->index_len; i++) strbuf_release(&w->index[i].last_key); - } - FREE_AND_NULL(w->index); w->index_len = 0; w->index_cap = 0; From b66e006ff534c72c70132105fff06601aa60a625 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Thu, 1 Feb 2024 08:52:04 +0100 Subject: [PATCH 03/51] reftable/writer: simplify writing index records When finishing the current section some index records might be written for the section to the table. The logic that adds these records to the writer duplicates what we already have in `writer_add_record()`, making this more complicated than it really has to be. Simplify the code by using `writer_add_record()` instead. While at it, drop the unneeded braces around a loop to make the code conform to our code style better. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- reftable/writer.c | 18 +++--------------- 1 file changed, 3 insertions(+), 15 deletions(-) diff --git a/reftable/writer.c b/reftable/writer.c index 5a0b87b40665a6..b5bcce029225d4 100644 --- a/reftable/writer.c +++ b/reftable/writer.c @@ -412,26 +412,14 @@ static int writer_finish_section(struct reftable_writer *w) .idx = idx[i], }, }; - if (block_writer_add(w->block_writer, &rec) == 0) { - continue; - } - err = writer_flush_block(w); + err = writer_add_record(w, &rec); if (err < 0) return err; - - writer_reinit_block_writer(w, BLOCK_TYPE_INDEX); - - err = block_writer_add(w->block_writer, &rec); - if (err != 0) { - /* write into fresh block should always succeed - */ - abort(); - } } - for (i = 0; i < idx_len; i++) { + + for (i = 0; i < idx_len; i++) strbuf_release(&idx[i].last_key); - } reftable_free(idx); } From e7485601ca4da6a09c6488add181609cffec5799 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Thu, 1 Feb 2024 08:52:08 +0100 Subject: [PATCH 04/51] reftable/writer: fix writing multi-level indices When finishing a section we will potentially write an index that makes it more efficient to look up relevant blocks. The index records written will encode, for each block of the indexed section, what the offset of that block is as well as the last key of that block. Thus, the reader would iterate through the index records to find the first key larger or equal to the wanted key and then use the encoded offset to look up the desired block. When there are a lot of blocks to index though we may end up writing multiple index blocks, too. To not require a linear search across all index blocks we instead end up writing a multi-level index. Instead of referring to the block we are after, an index record may point to another index block. The reader will then access the highest-level index and follow down the chain of index blocks until it hits the sought-after block. It has been observed though that it is impossible to seek ref records of the last ref block when using a multi-level index. While the multi-level index exists and looks fine for most of the part, the highest-level index was missing an index record pointing to the last block of the next index. Thus, every additional level made more refs become unseekable at the end of the ref section. The root cause is that we are not flushing the last block of the current level once done writing the level. Consequently, it wasn't recorded in the blocks that need to be indexed by the next-higher level and thus we forgot about it. Fix this bug by flushing blocks after we have written all index records. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- reftable/readwrite_test.c | 56 +++++++++++++++++++++++++++++++++++++++ reftable/writer.c | 8 +++--- 2 files changed, 60 insertions(+), 4 deletions(-) diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c index 6b99daeaf2a9b9..853923397e5a4b 100644 --- a/reftable/readwrite_test.c +++ b/reftable/readwrite_test.c @@ -866,6 +866,61 @@ static void test_write_multiple_indices(void) strbuf_release(&buf); } +static void test_write_multi_level_index(void) +{ + struct reftable_write_options opts = { + .block_size = 100, + }; + struct strbuf writer_buf = STRBUF_INIT, buf = STRBUF_INIT; + struct reftable_block_source source = { 0 }; + struct reftable_iterator it = { 0 }; + const struct reftable_stats *stats; + struct reftable_writer *writer; + struct reftable_reader *reader; + int err; + + writer = reftable_new_writer(&strbuf_add_void, &noop_flush, &writer_buf, &opts); + reftable_writer_set_limits(writer, 1, 1); + for (size_t i = 0; i < 200; i++) { + struct reftable_ref_record ref = { + .update_index = 1, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = {i}, + }; + + strbuf_reset(&buf); + strbuf_addf(&buf, "refs/heads/%03" PRIuMAX, (uintmax_t)i); + ref.refname = buf.buf, + + err = reftable_writer_add_ref(writer, &ref); + EXPECT_ERR(err); + } + reftable_writer_close(writer); + + /* + * The written refs should be sufficiently large to result in a + * multi-level index. + */ + stats = reftable_writer_stats(writer); + EXPECT(stats->ref_stats.max_index_level == 2); + + block_source_from_strbuf(&source, &writer_buf); + err = reftable_new_reader(&reader, &source, "filename"); + EXPECT_ERR(err); + + /* + * Seeking the last ref should work as expected. + */ + err = reftable_reader_seek_ref(reader, &it, "refs/heads/199"); + EXPECT_ERR(err); + + reftable_iterator_destroy(&it); + reftable_writer_free(writer); + reftable_reader_free(reader); + strbuf_release(&writer_buf); + strbuf_release(&buf); +} + static void test_corrupt_table_empty(void) { struct strbuf buf = STRBUF_INIT; @@ -916,5 +971,6 @@ int readwrite_test_main(int argc, const char *argv[]) RUN_TEST(test_write_object_id_length); RUN_TEST(test_write_object_id_min_length); RUN_TEST(test_write_multiple_indices); + RUN_TEST(test_write_multi_level_index); return 0; } diff --git a/reftable/writer.c b/reftable/writer.c index b5bcce029225d4..0349094d299d6c 100644 --- a/reftable/writer.c +++ b/reftable/writer.c @@ -418,15 +418,15 @@ static int writer_finish_section(struct reftable_writer *w) return err; } + err = writer_flush_block(w); + if (err < 0) + return err; + for (i = 0; i < idx_len; i++) strbuf_release(&idx[i].last_key); reftable_free(idx); } - err = writer_flush_block(w); - if (err < 0) - return err; - writer_clear_index(w); bstats = writer_reftable_block_stats(w, typ); From 4950acae7d0db40c327003eff2621aaa2172322c Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Thu, 1 Feb 2024 08:52:12 +0100 Subject: [PATCH 05/51] reftable: document reading and writing indices The way the index gets written and read is not trivial at all and requires the reader to piece together a bunch of parts to figure out how it works. Add some documentation to hopefully make this easier to understand for the next reader. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- reftable/reader.c | 27 +++++++++++++++++++++++++++ reftable/writer.c | 23 +++++++++++++++++++++++ 2 files changed, 50 insertions(+) diff --git a/reftable/reader.c b/reftable/reader.c index 278f727a3dfe33..6011d0aa04370c 100644 --- a/reftable/reader.c +++ b/reftable/reader.c @@ -508,11 +508,38 @@ static int reader_seek_indexed(struct reftable_reader *r, if (err < 0) goto done; + /* + * The index may consist of multiple levels, where each level may have + * multiple index blocks. We start by doing a linear search in the + * highest layer that identifies the relevant index block as well as + * the record inside that block that corresponds to our wanted key. + */ err = reader_seek_linear(&index_iter, &want_index); if (err < 0) goto done; + /* + * Traverse down the levels until we find a non-index entry. + */ while (1) { + /* + * In case we seek a record that does not exist the index iter + * will tell us that the iterator is over. This works because + * the last index entry of the current level will contain the + * last key it knows about. So in case our seeked key is larger + * than the last indexed key we know that it won't exist. + * + * There is one subtlety in the layout of the index section + * that makes this work as expected: the highest-level index is + * at end of the section and will point backwards and thus we + * start reading from the end of the index section, not the + * beginning. + * + * If that wasn't the case and the order was reversed then the + * linear seek would seek into the lower levels and traverse + * all levels of the index only to find out that the key does + * not exist. + */ err = table_iter_next(&index_iter, &index_result); table_iter_block_done(&index_iter); if (err != 0) diff --git a/reftable/writer.c b/reftable/writer.c index 0349094d299d6c..e23953e498a86b 100644 --- a/reftable/writer.c +++ b/reftable/writer.c @@ -391,6 +391,24 @@ static int writer_finish_section(struct reftable_writer *w) if (err < 0) return err; + /* + * When the section we are about to index has a lot of blocks then the + * index itself may span across multiple blocks, as well. This would + * require a linear scan over index blocks only to find the desired + * indexed block, which is inefficient. Instead, we write a multi-level + * index where index records of level N+1 will refer to index blocks of + * level N. This isn't constant time, either, but at least logarithmic. + * + * This loop handles writing this multi-level index. Note that we write + * the lowest-level index pointing to the indexed blocks first. We then + * continue writing additional index levels until the current level has + * less blocks than the threshold so that the highest level will be at + * the end of the index section. + * + * Readers are thus required to start reading the index section from + * its end, which is why we set `index_start` to the beginning of the + * last index section. + */ while (w->index_len > threshold) { struct reftable_index_record *idx = NULL; size_t i, idx_len; @@ -427,6 +445,11 @@ static int writer_finish_section(struct reftable_writer *w) reftable_free(idx); } + /* + * The index may still contain a number of index blocks lower than the + * threshold. Clear it so that these entries don't leak into the next + * index section. + */ writer_clear_index(w); bstats = writer_reftable_block_stats(w, typ); From e94dec0c1d7709282c2bff168bdad5485a95baa8 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Fri, 2 Feb 2024 12:39:34 -0800 Subject: [PATCH 06/51] GitHub Actions: update to checkout@v4 We seem to be getting "Node.js 16 actions are deprecated." warnings for jobs that use checkout@v3. Except for the i686 containers job that is kept at checkout@v1 [*], update to checkout@v4, which is said to use Node.js 20. [*] 6cf4d908 (ci(main): upgrade actions/checkout to v3, 2022-12-05) refers to https://github.com/actions/runner/issues/2115 and explains why container jobs are kept at checkout@v1. We may want to check the current status of the issue and move it to the same version as other jobs, but that is outside the scope of this step. Signed-off-by: Junio C Hamano --- .github/workflows/check-whitespace.yml | 2 +- .github/workflows/coverity.yml | 2 +- .github/workflows/main.yml | 16 ++++++++-------- 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/.github/workflows/check-whitespace.yml b/.github/workflows/check-whitespace.yml index a58e2dc8ad6279..a241a63428bd20 100644 --- a/.github/workflows/check-whitespace.yml +++ b/.github/workflows/check-whitespace.yml @@ -19,7 +19,7 @@ jobs: check-whitespace: runs-on: ubuntu-latest steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 with: fetch-depth: 0 diff --git a/.github/workflows/coverity.yml b/.github/workflows/coverity.yml index e5532d381bcb15..a81a7566d10735 100644 --- a/.github/workflows/coverity.yml +++ b/.github/workflows/coverity.yml @@ -38,7 +38,7 @@ jobs: COVERITY_LANGUAGE: cxx COVERITY_PLATFORM: overridden-below steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 - name: install minimal Git for Windows SDK if: contains(matrix.os, 'windows') uses: git-for-windows/setup-git-for-windows-sdk@v1 diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 9fdbd5402898bf..d119a63b93b445 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -112,7 +112,7 @@ jobs: group: windows-build-${{ github.ref }} cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 - uses: git-for-windows/setup-git-for-windows-sdk@v1 - name: build shell: bash @@ -173,10 +173,10 @@ jobs: group: vs-build-${{ github.ref }} cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 - uses: git-for-windows/setup-git-for-windows-sdk@v1 - name: initialize vcpkg - uses: actions/checkout@v3 + uses: actions/checkout@v4 with: repository: 'microsoft/vcpkg' path: 'compat/vcbuild/vcpkg' @@ -297,7 +297,7 @@ jobs: runs_on_pool: ${{matrix.vector.pool}} runs-on: ${{matrix.vector.pool}} steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 - run: ci/install-dependencies.sh - run: ci/run-build-and-tests.sh - name: print test failures @@ -331,7 +331,7 @@ jobs: runs-on: ubuntu-latest container: ${{matrix.vector.image}} steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 if: matrix.vector.jobname != 'linux32' - uses: actions/checkout@v1 if: matrix.vector.jobname == 'linux32' @@ -362,7 +362,7 @@ jobs: group: static-analysis-${{ github.ref }} cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 - run: ci/install-dependencies.sh - run: ci/run-static-analysis.sh - run: ci/check-directional-formatting.bash @@ -385,7 +385,7 @@ jobs: artifact: sparse-20.04 - name: Install the current `sparse` package run: sudo dpkg -i sparse-20.04/sparse_*.deb - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 - name: Install other dependencies run: ci/install-dependencies.sh - run: make sparse @@ -400,6 +400,6 @@ jobs: jobname: Documentation runs-on: ubuntu-latest steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 - run: ci/install-dependencies.sh - run: ci/test-documentation.sh From c4ddbe043ebfecba68943e1b38b9d6c179e734da Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Fri, 2 Feb 2024 12:39:35 -0800 Subject: [PATCH 07/51] GitHub Actions: update to github-script@v7 We seem to be getting "Node.js 16 actions are deprecated." warnings for jobs that use github-script@v6. Update to github-script@v7, which is said to use Node.js 20. Signed-off-by: Junio C Hamano --- .github/workflows/main.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index d119a63b93b445..c7b04a521b7b2d 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -63,7 +63,7 @@ jobs: echo "skip_concurrent=$skip_concurrent" >>$GITHUB_OUTPUT - name: skip if the commit or tree was already tested id: skip-if-redundant - uses: actions/github-script@v6 + uses: actions/github-script@v7 if: steps.check-ref.outputs.enabled == 'yes' with: github-token: ${{secrets.GITHUB_TOKEN}} From 7c01878eeb15e8dd75f0262bdfb3249c85a30a4a Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Mon, 5 Feb 2024 17:50:19 -0500 Subject: [PATCH 08/51] t5332-multi-pack-reuse.sh: extract pack-objects helper functions Most of the tests in t5332 perform some setup before repeating a common refrain that looks like: : >trace2.txt && GIT_TRACE2_EVENT="$PWD/trace2.txt" \ git pack-objects --stdout --revs --all >/dev/null && test_pack_reused $objects_nr Signed-off-by: Junio C Hamano --- t/t5332-multi-pack-reuse.sh | 71 +++++++++++++++---------------------- 1 file changed, 29 insertions(+), 42 deletions(-) diff --git a/t/t5332-multi-pack-reuse.sh b/t/t5332-multi-pack-reuse.sh index 2ba788b0421b6a..d516062f84fb7b 100755 --- a/t/t5332-multi-pack-reuse.sh +++ b/t/t5332-multi-pack-reuse.sh @@ -23,6 +23,27 @@ pack_position () { grep "$1" objects | cut -d" " -f1 } +# test_pack_objects_reused_all +test_pack_objects_reused_all () { + : >trace2.txt && + GIT_TRACE2_EVENT="$PWD/trace2.txt" \ + git pack-objects --stdout --revs --all --delta-base-offset \ + >/dev/null && + + test_pack_reused "$1" +test_pack_objects_reused () { + : >trace2.txt && + GIT_TRACE2_EVENT="$PWD/trace2.txt" \ + git pack-objects --stdout --revs >/dev/null && + + test_pack_reused "$1" trace2.txt && - GIT_TRACE2_EVENT="$PWD/trace2.txt" \ - git pack-objects --stdout --revs --all >/dev/null && - - test_pack_reused 3 trace2.txt && - GIT_TRACE2_EVENT="$PWD/trace2.txt" \ - git pack-objects --stdout --revs /dev/null && - - test_pack_reused 6 trace2.txt && - GIT_TRACE2_EVENT="$PWD/trace2.txt" \ - git pack-objects --stdout --revs --all >/dev/null && - - test_pack_reused 9 trace2.txt && - GIT_TRACE2_EVENT="$PWD/trace2.txt" \ - git pack-objects --stdout --delta-base-offset --revs /dev/null && - - test_pack_reused 3 trace2.txt && - GIT_TRACE2_EVENT="$PWD/trace2.txt" \ - git pack-objects --stdout --delta-base-offset --revs /dev/null && - - test_pack_reused 3 trace2.txt && - GIT_TRACE2_EVENT="$PWD/trace2.txt" \ - git pack-objects --stdout --delta-base-offset --revs /dev/null && - # We can only reuse the 3 objects corresponding to "other" from # the latest pack. # @@ -175,8 +168,7 @@ test_expect_success 'omit delta with uninteresting base (same pack)' ' # The remaining objects from the other pack are similarly not # reused because their objects are on the uninteresting side of # the query. - test_pack_reused 3 trace2.txt && - GIT_TRACE2_EVENT="$PWD/trace2.txt" \ - git pack-objects --stdout --delta-base-offset --all >/dev/null && - packs_nr="$(find $packdir -type f -name "pack-*.pack" | wc -l)" && objects_nr="$(git rev-list --count --all --objects)" && - test_pack_reused $(($objects_nr - 1)) Date: Mon, 5 Feb 2024 17:50:23 -0500 Subject: [PATCH 09/51] pack-objects: enable multi-pack reuse via `feature.experimental` Now that multi-pack reuse is supported, enable it via the feature.experimental configuration in addition to the classic `pack.allowPackReuse`. This will allow more users to experiment with the new behavior who might not otherwise be aware of the existing `pack.allowPackReuse` configuration option. The enum with values NO_PACK_REUSE, SINGLE_PACK_REUSE, and MULTI_PACK_REUSE is defined statically in builtin/pack-objects.c's compilation unit. We could hoist that enum into a scope visible from the repository_settings struct, and then use that enum value in pack-objects. Instead, define a single int that indicates what pack-objects's default value should be to avoid additional unnecessary code movement. Though `feature.experimental` implies `pack.allowPackReuse=multi`, this can still be overridden by explicitly setting the latter configuration to either "single" or "false". Tests covering all of these cases are showin t5332. Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- Documentation/config/feature.txt | 3 +++ builtin/pack-objects.c | 2 ++ repo-settings.c | 1 + repository.h | 1 + t/t5332-multi-pack-reuse.sh | 16 ++++++++++++++++ 5 files changed, 23 insertions(+) diff --git a/Documentation/config/feature.txt b/Documentation/config/feature.txt index bf9546fca4f693..f061b64b748449 100644 --- a/Documentation/config/feature.txt +++ b/Documentation/config/feature.txt @@ -17,6 +17,9 @@ skipping more commits at a time, reducing the number of round trips. + * `pack.useBitmapBoundaryTraversal=true` may improve bitmap traversal times by walking fewer objects. ++ +* `pack.allowPackReuse=multi` may improve the time it takes to create a pack by +reusing objects from multiple packs instead of just one. feature.manyFiles:: Enable config options that optimize for repos with many files in the diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index d8c2128a979282..329aeac8043750 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -4396,6 +4396,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) prepare_repo_settings(the_repository); if (sparse < 0) sparse = the_repository->settings.pack_use_sparse; + if (the_repository->settings.pack_use_multi_pack_reuse) + allow_pack_reuse = MULTI_PACK_REUSE; } reset_pack_idx_option(&pack_idx_opts); diff --git a/repo-settings.c b/repo-settings.c index 30cd4787627b7c..a0b590bc6c0bb9 100644 --- a/repo-settings.c +++ b/repo-settings.c @@ -43,6 +43,7 @@ void prepare_repo_settings(struct repository *r) if (experimental) { r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_SKIPPING; r->settings.pack_use_bitmap_boundary_traversal = 1; + r->settings.pack_use_multi_pack_reuse = 1; } if (manyfiles) { r->settings.index_version = 4; diff --git a/repository.h b/repository.h index 7a250a6605cc8e..21949c5a17f68c 100644 --- a/repository.h +++ b/repository.h @@ -39,6 +39,7 @@ struct repo_settings { int sparse_index; int pack_read_reverse_index; int pack_use_bitmap_boundary_traversal; + int pack_use_multi_pack_reuse; /* * Does this repository have core.useReplaceRefs=true (on by diff --git a/t/t5332-multi-pack-reuse.sh b/t/t5332-multi-pack-reuse.sh index d516062f84fb7b..b925a81d3725d8 100755 --- a/t/t5332-multi-pack-reuse.sh +++ b/t/t5332-multi-pack-reuse.sh @@ -58,6 +58,22 @@ test_expect_success 'preferred pack is reused for single-pack reuse' ' test_pack_objects_reused_all 3 1 ' +test_expect_success 'multi-pack reuse is disabled by default' ' + test_pack_objects_reused_all 3 1 +' + +test_expect_success 'feature.experimental implies multi-pack reuse' ' + test_config feature.experimental true && + + test_pack_objects_reused_all 6 2 +' + +test_expect_success 'multi-pack reuse can be disabled with feature.experimental' ' + test_config feature.experimental true && + test_config pack.allowPackReuse single && + + test_pack_objects_reused_all 3 1 +' test_expect_success 'enable multi-pack reuse' ' git config pack.allowPackReuse multi From 568459bf5e97a4f61429e3bdd1f97b54b39a1383 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 5 Feb 2024 14:35:53 +0000 Subject: [PATCH 10/51] Always check the return value of `repo_read_object_file()` There are a couple of places in Git's source code where the return value is not checked. As a consequence, they are susceptible to segmentation faults. Signed-off-by: Johannes Schindelin Signed-off-by: Junio C Hamano --- bisect.c | 3 +++ builtin/cat-file.c | 10 ++++++++-- builtin/grep.c | 2 ++ builtin/notes.c | 6 ++++-- combine-diff.c | 2 ++ rerere.c | 3 +++ 6 files changed, 22 insertions(+), 4 deletions(-) diff --git a/bisect.c b/bisect.c index 1be8e0a2711df9..daa75a60655327 100644 --- a/bisect.c +++ b/bisect.c @@ -159,6 +159,9 @@ static void show_list(const char *debug, int counted, int nr, const char *subject_start; int subject_len; + if (!buf) + die(_("unable to read %s"), oid_to_hex(&commit->object.oid)); + fprintf(stderr, "%c%c%c ", (commit_flags & TREESAME) ? ' ' : 'T', (commit_flags & UNINTERESTING) ? 'U' : ' ', diff --git a/builtin/cat-file.c b/builtin/cat-file.c index ea8ad601ecc0b7..186c364277aea2 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -222,6 +222,10 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name, &type, &size); const char *target; + + if (!buffer) + die(_("unable to read %s"), oid_to_hex(&oid)); + if (!skip_prefix(buffer, "object ", &target) || get_oid_hex(target, &blob_oid)) die("%s not a valid tag", oid_to_hex(&oid)); @@ -417,6 +421,8 @@ static void print_object_or_die(struct batch_options *opt, struct expand_data *d contents = repo_read_object_file(the_repository, oid, &type, &size); + if (!contents) + die("object %s disappeared", oid_to_hex(oid)); if (use_mailmap) { size_t s = size; @@ -424,8 +430,6 @@ static void print_object_or_die(struct batch_options *opt, struct expand_data *d size = cast_size_t_to_ulong(s); } - if (!contents) - die("object %s disappeared", oid_to_hex(oid)); if (type != data->type) die("object %s changed type!?", oid_to_hex(oid)); if (data->info.sizep && size != data->size && !use_mailmap) @@ -482,6 +486,8 @@ static void batch_object_write(const char *obj_name, buf = repo_read_object_file(the_repository, &data->oid, &data->type, &data->size); + if (!buf) + die(_("unable to read %s"), oid_to_hex(&data->oid)); buf = replace_idents_using_mailmap(buf, &s); data->size = cast_size_t_to_ulong(s); diff --git a/builtin/grep.c b/builtin/grep.c index fe78d4c98b13b5..63d519c93346e2 100644 --- a/builtin/grep.c +++ b/builtin/grep.c @@ -575,6 +575,8 @@ static int grep_cache(struct grep_opt *opt, data = repo_read_object_file(the_repository, &ce->oid, &type, &size); + if (!data) + die(_("unable to read tree %s"), oid_to_hex(&ce->oid)); init_tree_desc(&tree, data, size); hit |= grep_tree(opt, pathspec, &tree, &name, 0, 0); diff --git a/builtin/notes.c b/builtin/notes.c index 9f38863dd507ff..40543862d3d38d 100644 --- a/builtin/notes.c +++ b/builtin/notes.c @@ -718,9 +718,11 @@ static int append_edit(int argc, const char **argv, const char *prefix) struct strbuf buf = STRBUF_INIT; char *prev_buf = repo_read_object_file(the_repository, note, &type, &size); - if (prev_buf && size) + if (!prev_buf) + die(_("unable to read %s"), oid_to_hex(note)); + if (size) strbuf_add(&buf, prev_buf, size); - if (d.buf.len && prev_buf && size) + if (d.buf.len && size) append_separator(&buf); strbuf_insert(&d.buf, 0, buf.buf, buf.len); diff --git a/combine-diff.c b/combine-diff.c index f90f442482932e..c5492809e9cbaf 100644 --- a/combine-diff.c +++ b/combine-diff.c @@ -338,6 +338,8 @@ static char *grab_blob(struct repository *r, free_filespec(df); } else { blob = repo_read_object_file(r, oid, &type, size); + if (!blob) + die(_("unable to read %s"), oid_to_hex(oid)); if (type != OBJ_BLOB) die("object '%s' is not a blob!", oid_to_hex(oid)); } diff --git a/rerere.c b/rerere.c index 09e19412859b3e..5d79cc42117c8c 100644 --- a/rerere.c +++ b/rerere.c @@ -975,6 +975,9 @@ static int handle_cache(struct index_state *istate, mmfile[i].ptr = repo_read_object_file(the_repository, &ce->oid, &type, &size); + if (!mmfile[i].ptr) + die(_("unable to read %s"), + oid_to_hex(&ce->oid)); mmfile[i].size = size; } } From d2058cb2f00ae4c174b44193b039790dc8015c40 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Tue, 6 Feb 2024 06:34:41 +0100 Subject: [PATCH 11/51] builtin/stash: report failure to write to index The git-stash(1) command needs to write to the index for many of its operations. When the index is locked by a concurrent writer it will thus fail to operate, which is expected. What is not expected though is that we do not print any error message at all in this case. The user can thus easily miss the fact that the command didn't do what they expected it to do and would be left wondering why that is. Fix this bug and report failures to write to the index. Add tests for the subcommands which hit the respective code paths. While at it, unify error messages when writing to the index fails. The chosen error message is already used in "builtin/stash.c". Reported-by: moti sd Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- builtin/stash.c | 8 ++++---- t/t3903-stash.sh | 52 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 56 insertions(+), 4 deletions(-) diff --git a/builtin/stash.c b/builtin/stash.c index 4a6771c9f4c493..5c6ed02313784b 100644 --- a/builtin/stash.c +++ b/builtin/stash.c @@ -521,7 +521,7 @@ static void unstage_changes_unless_new(struct object_id *orig_tree) repo_hold_locked_index(the_repository, &lock, LOCK_DIE_ON_ERROR); if (write_locked_index(&the_index, &lock, COMMIT_LOCK | SKIP_IF_UNCHANGED)) - die(_("Unable to write index.")); + die(_("could not write index")); } static int do_apply_stash(const char *prefix, struct stash_info *info, @@ -538,7 +538,7 @@ static int do_apply_stash(const char *prefix, struct stash_info *info, repo_read_index_preload(the_repository, NULL, 0); if (repo_refresh_and_write_index(the_repository, REFRESH_QUIET, 0, 0, NULL, NULL, NULL)) - return -1; + return error(_("could not write index")); if (write_index_as_tree(&c_tree, &the_index, get_index_file(), 0, NULL)) @@ -1365,7 +1365,7 @@ static int do_create_stash(const struct pathspec *ps, struct strbuf *stash_msg_b repo_read_index_preload(the_repository, NULL, 0); if (repo_refresh_and_write_index(the_repository, REFRESH_QUIET, 0, 0, NULL, NULL, NULL) < 0) { - ret = -1; + ret = error(_("could not write index")); goto done; } @@ -1556,7 +1556,7 @@ static int do_push_stash(const struct pathspec *ps, const char *stash_msg, int q if (repo_refresh_and_write_index(the_repository, REFRESH_QUIET, 0, 0, NULL, NULL, NULL)) { - ret = -1; + ret = error(_("could not write index")); goto done; } diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh index 34faeac3f1cde4..3caf490e3996ec 100755 --- a/t/t3903-stash.sh +++ b/t/t3903-stash.sh @@ -1516,4 +1516,56 @@ test_expect_success 'restore untracked files even when we hit conflicts' ' ) ' +test_expect_success 'stash create reports a locked index' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit A A.file && + echo change >A.file && + touch .git/index.lock && + + cat >expect <<-EOF && + error: could not write index + EOF + test_must_fail git stash create 2>err && + test_cmp expect err + ) +' + +test_expect_success 'stash push reports a locked index' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit A A.file && + echo change >A.file && + touch .git/index.lock && + + cat >expect <<-EOF && + error: could not write index + EOF + test_must_fail git stash push 2>err && + test_cmp expect err + ) +' + +test_expect_success 'stash apply reports a locked index' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit A A.file && + echo change >A.file && + git stash push && + touch .git/index.lock && + + cat >expect <<-EOF && + error: could not write index + EOF + test_must_fail git stash apply 2>err && + test_cmp expect err + ) +' + test_done From f6b58c1be40ba4bd6e7f2364acfe5fa34ce04120 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Tue, 6 Feb 2024 07:35:23 +0100 Subject: [PATCH 12/51] reftable: introduce macros to grow arrays Throughout the reftable library we have many cases where we need to grow arrays. In order to avoid too many reallocations, we roughly double the capacity of the array on each iteration. The resulting code pattern is duplicated across many sites. We have similar patterns in our main codebase, which is why we have eventually introduced an `ALLOC_GROW()` macro to abstract it away and avoid some code duplication. We cannot easily reuse this macro here though because `ALLOC_GROW()` uses `REALLOC_ARRAY()`, which in turn will call realloc(3P) to grow the array. The reftable code is structured as a library though (even if the boundaries are fuzzy), and one property this brings with it is that it is possible to plug in your own allocators. So instead of using realloc(3P), we need to use `reftable_realloc()` that knows to use the user-provided implementation. So let's introduce two new macros `REFTABLE_REALLOC_ARRAY()` and `REFTABLE_ALLOC_GROW()` that mirror what we do in our main codebase, with two modifications: - They use `reftable_realloc()`, as explained above. - They use a different growth factor of `2 * cap + 1` instead of `(cap + 16) * 3 / 2`. The second change is because we know a bit more about the allocation patterns in the reftable library. In most cases, we end up only having a handful of items in the array and don't end up growing them. The initial capacity that our normal growth factor uses (which is 24) would thus end up over-allocating in a lot of code paths. This effect is measurable: - Before change: HEAP SUMMARY: in use at exit: 671,983 bytes in 152 blocks total heap usage: 3,843,446 allocs, 3,843,294 frees, 223,761,402 bytes allocated - After change with a growth factor of `(2 * alloc + 1)`: HEAP SUMMARY: in use at exit: 671,983 bytes in 152 blocks total heap usage: 3,843,446 allocs, 3,843,294 frees, 223,761,410 bytes allocated - After change with a growth factor of `(alloc + 16)* 2 / 3`: HEAP SUMMARY: in use at exit: 671,983 bytes in 152 blocks total heap usage: 3,833,673 allocs, 3,833,521 frees, 4,728,251,742 bytes allocated While the total heap usage is roughly the same, we do end up allocating significantly more bytes with our usual growth factor (in fact, roughly 21 times as many). Convert the reftable library to use these new macros. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- reftable/basics.c | 8 ++------ reftable/basics.h | 11 +++++++++++ reftable/block.c | 7 +------ reftable/merged_test.c | 20 ++++++-------------- reftable/pq.c | 8 ++------ reftable/stack.c | 29 ++++++++++++----------------- reftable/writer.c | 14 ++------------ 7 files changed, 36 insertions(+), 61 deletions(-) diff --git a/reftable/basics.c b/reftable/basics.c index f761e48028c449..af9004cec24db5 100644 --- a/reftable/basics.c +++ b/reftable/basics.c @@ -89,17 +89,13 @@ void parse_names(char *buf, int size, char ***namesp) next = end; } if (p < next) { - if (names_len == names_cap) { - names_cap = 2 * names_cap + 1; - names = reftable_realloc( - names, names_cap * sizeof(*names)); - } + REFTABLE_ALLOC_GROW(names, names_len + 1, names_cap); names[names_len++] = xstrdup(p); } p = next + 1; } - names = reftable_realloc(names, (names_len + 1) * sizeof(*names)); + REFTABLE_REALLOC_ARRAY(names, names_len + 1); names[names_len] = NULL; *namesp = names; } diff --git a/reftable/basics.h b/reftable/basics.h index 096b36862b9f4e..2f855cd724b357 100644 --- a/reftable/basics.h +++ b/reftable/basics.h @@ -53,6 +53,17 @@ void *reftable_realloc(void *p, size_t sz); void reftable_free(void *p); void *reftable_calloc(size_t sz); +#define REFTABLE_REALLOC_ARRAY(x, alloc) (x) = reftable_realloc((x), st_mult(sizeof(*(x)), (alloc))) +#define REFTABLE_ALLOC_GROW(x, nr, alloc) \ + do { \ + if ((nr) > alloc) { \ + alloc = 2 * (alloc) + 1; \ + if (alloc < (nr)) \ + alloc = (nr); \ + REFTABLE_REALLOC_ARRAY(x, alloc); \ + } \ + } while (0) + /* Find the longest shared prefix size of `a` and `b` */ struct strbuf; int common_prefix_size(struct strbuf *a, struct strbuf *b); diff --git a/reftable/block.c b/reftable/block.c index 1df3d8a0f09671..6952d0facf5242 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -51,12 +51,7 @@ static int block_writer_register_restart(struct block_writer *w, int n, if (2 + 3 * rlen + n > w->block_size - w->next) return -1; if (is_restart) { - if (w->restart_len == w->restart_cap) { - w->restart_cap = w->restart_cap * 2 + 1; - w->restarts = reftable_realloc( - w->restarts, sizeof(uint32_t) * w->restart_cap); - } - + REFTABLE_ALLOC_GROW(w->restarts, w->restart_len + 1, w->restart_cap); w->restarts[w->restart_len++] = w->next; } diff --git a/reftable/merged_test.c b/reftable/merged_test.c index 46908f738f770f..e05351e035fc08 100644 --- a/reftable/merged_test.c +++ b/reftable/merged_test.c @@ -231,14 +231,10 @@ static void test_merged(void) while (len < 100) { /* cap loops/recursion. */ struct reftable_ref_record ref = { NULL }; int err = reftable_iterator_next_ref(&it, &ref); - if (err > 0) { + if (err > 0) break; - } - if (len == cap) { - cap = 2 * cap + 1; - out = reftable_realloc( - out, sizeof(struct reftable_ref_record) * cap); - } + + REFTABLE_ALLOC_GROW(out, len + 1, cap); out[len++] = ref; } reftable_iterator_destroy(&it); @@ -368,14 +364,10 @@ static void test_merged_logs(void) while (len < 100) { /* cap loops/recursion. */ struct reftable_log_record log = { NULL }; int err = reftable_iterator_next_log(&it, &log); - if (err > 0) { + if (err > 0) break; - } - if (len == cap) { - cap = 2 * cap + 1; - out = reftable_realloc( - out, sizeof(struct reftable_log_record) * cap); - } + + REFTABLE_ALLOC_GROW(out, len + 1, cap); out[len++] = log; } reftable_iterator_destroy(&it); diff --git a/reftable/pq.c b/reftable/pq.c index dcefeb793a9051..2461daf5ff49c9 100644 --- a/reftable/pq.c +++ b/reftable/pq.c @@ -75,13 +75,9 @@ void merged_iter_pqueue_add(struct merged_iter_pqueue *pq, const struct pq_entry { int i = 0; - if (pq->len == pq->cap) { - pq->cap = 2 * pq->cap + 1; - pq->heap = reftable_realloc(pq->heap, - pq->cap * sizeof(struct pq_entry)); - } - + REFTABLE_ALLOC_GROW(pq->heap, pq->len + 1, pq->cap); pq->heap[pq->len++] = *e; + i = pq->len - 1; while (i > 0) { int j = (i - 1) / 2; diff --git a/reftable/stack.c b/reftable/stack.c index bf3869ce70dc7a..1dfab99e96fffc 100644 --- a/reftable/stack.c +++ b/reftable/stack.c @@ -551,7 +551,7 @@ struct reftable_addition { struct reftable_stack *stack; char **new_tables; - int new_tables_len; + size_t new_tables_len, new_tables_cap; uint64_t next_update_index; }; @@ -602,8 +602,9 @@ static int reftable_stack_init_addition(struct reftable_addition *add, static void reftable_addition_close(struct reftable_addition *add) { - int i = 0; struct strbuf nm = STRBUF_INIT; + size_t i; + for (i = 0; i < add->new_tables_len; i++) { stack_filename(&nm, add->stack, add->new_tables[i]); unlink(nm.buf); @@ -613,6 +614,7 @@ static void reftable_addition_close(struct reftable_addition *add) reftable_free(add->new_tables); add->new_tables = NULL; add->new_tables_len = 0; + add->new_tables_cap = 0; delete_tempfile(&add->lock_file); strbuf_release(&nm); @@ -631,8 +633,8 @@ int reftable_addition_commit(struct reftable_addition *add) { struct strbuf table_list = STRBUF_INIT; int lock_file_fd = get_tempfile_fd(add->lock_file); - int i = 0; int err = 0; + size_t i; if (add->new_tables_len == 0) goto done; @@ -660,12 +662,12 @@ int reftable_addition_commit(struct reftable_addition *add) } /* success, no more state to clean up. */ - for (i = 0; i < add->new_tables_len; i++) { + for (i = 0; i < add->new_tables_len; i++) reftable_free(add->new_tables[i]); - } reftable_free(add->new_tables); add->new_tables = NULL; add->new_tables_len = 0; + add->new_tables_cap = 0; err = reftable_stack_reload_maybe_reuse(add->stack, 1); if (err) @@ -792,11 +794,9 @@ int reftable_addition_add(struct reftable_addition *add, goto done; } - add->new_tables = reftable_realloc(add->new_tables, - sizeof(*add->new_tables) * - (add->new_tables_len + 1)); - add->new_tables[add->new_tables_len] = strbuf_detach(&next_name, NULL); - add->new_tables_len++; + REFTABLE_ALLOC_GROW(add->new_tables, add->new_tables_len + 1, + add->new_tables_cap); + add->new_tables[add->new_tables_len++] = strbuf_detach(&next_name, NULL); done: if (tab_fd > 0) { close(tab_fd); @@ -1367,17 +1367,12 @@ static int stack_check_addition(struct reftable_stack *st, while (1) { struct reftable_ref_record ref = { NULL }; err = reftable_iterator_next_ref(&it, &ref); - if (err > 0) { + if (err > 0) break; - } if (err < 0) goto done; - if (len >= cap) { - cap = 2 * cap + 1; - refs = reftable_realloc(refs, cap * sizeof(refs[0])); - } - + REFTABLE_ALLOC_GROW(refs, len + 1, cap); refs[len++] = ref; } diff --git a/reftable/writer.c b/reftable/writer.c index ee4590e20f84dd..4483bb21c3d6a8 100644 --- a/reftable/writer.c +++ b/reftable/writer.c @@ -200,12 +200,7 @@ static void writer_index_hash(struct reftable_writer *w, struct strbuf *hash) return; } - if (key->offset_len == key->offset_cap) { - key->offset_cap = 2 * key->offset_cap + 1; - key->offsets = reftable_realloc( - key->offsets, sizeof(uint64_t) * key->offset_cap); - } - + REFTABLE_ALLOC_GROW(key->offsets, key->offset_len + 1, key->offset_cap); key->offsets[key->offset_len++] = off; } @@ -674,12 +669,7 @@ static int writer_flush_nonempty_block(struct reftable_writer *w) if (err < 0) return err; - if (w->index_cap == w->index_len) { - w->index_cap = 2 * w->index_cap + 1; - w->index = reftable_realloc( - w->index, - sizeof(struct reftable_index_record) * w->index_cap); - } + REFTABLE_ALLOC_GROW(w->index, w->index_len + 1, w->index_cap); ir.offset = w->next; strbuf_reset(&ir.last_key); From b4ff12c8eefff9cba73ba3cb7492111adfa31d87 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Tue, 6 Feb 2024 07:35:27 +0100 Subject: [PATCH 13/51] reftable: introduce macros to allocate arrays Similar to the preceding commit, let's carry over macros to allocate arrays with `REFTABLE_ALLOC_ARRAY()` and `REFTABLE_CALLOC_ARRAY()`. This requires us to change the signature of `reftable_calloc()`, which only takes a single argument right now and thus puts the burden on the caller to calculate the final array's size. This is a net improvement though as it means that we can now provide proper overflow checks when multiplying the array size with the member size. Convert callsites of `reftable_calloc()` to the new signature and start using the new macros where possible. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- reftable/basics.h | 4 +++- reftable/block.c | 10 ++++++---- reftable/block_test.c | 2 +- reftable/blocksource.c | 4 ++-- reftable/iter.c | 3 +-- reftable/merged.c | 4 ++-- reftable/merged_test.c | 22 +++++++++++++--------- reftable/publicbasics.c | 3 ++- reftable/reader.c | 8 +++----- reftable/readwrite_test.c | 8 +++++--- reftable/record.c | 14 ++++++++------ reftable/record_test.c | 4 ++-- reftable/refname.c | 4 ++-- reftable/stack.c | 28 +++++++++++++--------------- reftable/tree.c | 4 ++-- reftable/writer.c | 7 +++---- 16 files changed, 68 insertions(+), 61 deletions(-) diff --git a/reftable/basics.h b/reftable/basics.h index 2f855cd724b357..4c3ac963a35204 100644 --- a/reftable/basics.h +++ b/reftable/basics.h @@ -51,8 +51,10 @@ int names_length(char **names); void *reftable_malloc(size_t sz); void *reftable_realloc(void *p, size_t sz); void reftable_free(void *p); -void *reftable_calloc(size_t sz); +void *reftable_calloc(size_t nelem, size_t elsize); +#define REFTABLE_ALLOC_ARRAY(x, alloc) (x) = reftable_malloc(st_mult(sizeof(*(x)), (alloc))) +#define REFTABLE_CALLOC_ARRAY(x, alloc) (x) = reftable_calloc((alloc), sizeof(*(x))) #define REFTABLE_REALLOC_ARRAY(x, alloc) (x) = reftable_realloc((x), st_mult(sizeof(*(x)), (alloc))) #define REFTABLE_ALLOC_GROW(x, nr, alloc) \ do { \ diff --git a/reftable/block.c b/reftable/block.c index 6952d0facf5242..838759823a0473 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -143,8 +143,10 @@ int block_writer_finish(struct block_writer *w) int block_header_skip = 4 + w->header_off; uLongf src_len = w->next - block_header_skip; uLongf dest_cap = src_len * 1.001 + 12; + uint8_t *compressed; + + REFTABLE_ALLOC_ARRAY(compressed, dest_cap); - uint8_t *compressed = reftable_malloc(dest_cap); while (1) { uLongf out_dest_len = dest_cap; int zresult = compress2(compressed, &out_dest_len, @@ -201,9 +203,9 @@ int block_reader_init(struct block_reader *br, struct reftable_block *block, uLongf dst_len = sz - block_header_skip; /* total size of dest buffer. */ uLongf src_len = block->len - block_header_skip; - /* Log blocks specify the *uncompressed* size in their header. - */ - uncompressed = reftable_malloc(sz); + + /* Log blocks specify the *uncompressed* size in their header. */ + REFTABLE_ALLOC_ARRAY(uncompressed, sz); /* Copy over the block header verbatim. It's not compressed. */ memcpy(uncompressed, block->data, block_header_skip); diff --git a/reftable/block_test.c b/reftable/block_test.c index dedb05c7d8c056..e162c6e33fa00a 100644 --- a/reftable/block_test.c +++ b/reftable/block_test.c @@ -36,7 +36,7 @@ static void test_block_read_write(void) int j = 0; struct strbuf want = STRBUF_INIT; - block.data = reftable_calloc(block_size); + REFTABLE_CALLOC_ARRAY(block.data, block_size); block.len = block_size; block.source = malloc_block_source(); block_writer_init(&bw, BLOCK_TYPE_REF, block.data, block_size, diff --git a/reftable/blocksource.c b/reftable/blocksource.c index 8c41e3c70f36aa..eeed254ba9c2da 100644 --- a/reftable/blocksource.c +++ b/reftable/blocksource.c @@ -29,7 +29,7 @@ static int strbuf_read_block(void *v, struct reftable_block *dest, uint64_t off, { struct strbuf *b = v; assert(off + size <= b->len); - dest->data = reftable_calloc(size); + REFTABLE_CALLOC_ARRAY(dest->data, size); memcpy(dest->data, b->buf + off, size); dest->len = size; return size; @@ -132,7 +132,7 @@ int reftable_block_source_from_file(struct reftable_block_source *bs, return REFTABLE_IO_ERROR; } - p = reftable_calloc(sizeof(*p)); + REFTABLE_CALLOC_ARRAY(p, 1); p->size = st.st_size; p->data = xmmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0); close(fd); diff --git a/reftable/iter.c b/reftable/iter.c index a8d174c040658e..8b5ebf6183a97f 100644 --- a/reftable/iter.c +++ b/reftable/iter.c @@ -160,8 +160,7 @@ int new_indexed_table_ref_iter(struct indexed_table_ref_iter **dest, int oid_len, uint64_t *offsets, int offset_len) { struct indexed_table_ref_iter empty = INDEXED_TABLE_REF_ITER_INIT; - struct indexed_table_ref_iter *itr = - reftable_calloc(sizeof(struct indexed_table_ref_iter)); + struct indexed_table_ref_iter *itr = reftable_calloc(1, sizeof(*itr)); int err = 0; *itr = empty; diff --git a/reftable/merged.c b/reftable/merged.c index c258ce953e81d7..2031fd51b4426f 100644 --- a/reftable/merged.c +++ b/reftable/merged.c @@ -190,7 +190,7 @@ int reftable_new_merged_table(struct reftable_merged_table **dest, } } - m = reftable_calloc(sizeof(struct reftable_merged_table)); + REFTABLE_CALLOC_ARRAY(m, 1); m->stack = stack; m->stack_len = n; m->min = first_min; @@ -240,7 +240,7 @@ static int merged_table_seek_record(struct reftable_merged_table *mt, struct reftable_record *rec) { struct reftable_iterator *iters = reftable_calloc( - sizeof(struct reftable_iterator) * mt->stack_len); + mt->stack_len, sizeof(*iters)); struct merged_iter merged = { .stack = iters, .typ = reftable_record_type(rec), diff --git a/reftable/merged_test.c b/reftable/merged_test.c index e05351e035fc08..e233a9d581cff0 100644 --- a/reftable/merged_test.c +++ b/reftable/merged_test.c @@ -93,10 +93,12 @@ merged_table_from_records(struct reftable_ref_record **refs, int i = 0; struct reftable_merged_table *mt = NULL; int err; - struct reftable_table *tabs = - reftable_calloc(n * sizeof(struct reftable_table)); - *readers = reftable_calloc(n * sizeof(struct reftable_reader *)); - *source = reftable_calloc(n * sizeof(**source)); + struct reftable_table *tabs; + + REFTABLE_CALLOC_ARRAY(tabs, n); + REFTABLE_CALLOC_ARRAY(*readers, n); + REFTABLE_CALLOC_ARRAY(*source, n); + for (i = 0; i < n; i++) { write_test_table(&buf[i], refs[i], sizes[i]); block_source_from_strbuf(&(*source)[i], &buf[i]); @@ -266,10 +268,12 @@ merged_table_from_log_records(struct reftable_log_record **logs, int i = 0; struct reftable_merged_table *mt = NULL; int err; - struct reftable_table *tabs = - reftable_calloc(n * sizeof(struct reftable_table)); - *readers = reftable_calloc(n * sizeof(struct reftable_reader *)); - *source = reftable_calloc(n * sizeof(**source)); + struct reftable_table *tabs; + + REFTABLE_CALLOC_ARRAY(tabs, n); + REFTABLE_CALLOC_ARRAY(*readers, n); + REFTABLE_CALLOC_ARRAY(*source, n); + for (i = 0; i < n; i++) { write_test_log_table(&buf[i], logs[i], sizes[i], i + 1); block_source_from_strbuf(&(*source)[i], &buf[i]); @@ -412,7 +416,7 @@ static void test_default_write_opts(void) }; int err; struct reftable_block_source source = { NULL }; - struct reftable_table *tab = reftable_calloc(sizeof(*tab) * 1); + struct reftable_table *tab = reftable_calloc(1, sizeof(*tab)); uint32_t hash_id; struct reftable_reader *rd = NULL; struct reftable_merged_table *merged = NULL; diff --git a/reftable/publicbasics.c b/reftable/publicbasics.c index bcb82530d6ce35..44b84a125e43b3 100644 --- a/reftable/publicbasics.c +++ b/reftable/publicbasics.c @@ -37,8 +37,9 @@ void reftable_free(void *p) free(p); } -void *reftable_calloc(size_t sz) +void *reftable_calloc(size_t nelem, size_t elsize) { + size_t sz = st_mult(nelem, elsize); void *p = reftable_malloc(sz); memset(p, 0, sz); return p; diff --git a/reftable/reader.c b/reftable/reader.c index 64dc366fb15935..3e0de5e8ad8bb1 100644 --- a/reftable/reader.c +++ b/reftable/reader.c @@ -539,8 +539,7 @@ static int reader_seek_indexed(struct reftable_reader *r, if (err == 0) { struct table_iter empty = TABLE_ITER_INIT; - struct table_iter *malloced = - reftable_calloc(sizeof(struct table_iter)); + struct table_iter *malloced = reftable_calloc(1, sizeof(*malloced)); *malloced = empty; table_iter_copy_from(malloced, &next); iterator_from_table_iter(it, malloced); @@ -635,8 +634,7 @@ void reader_close(struct reftable_reader *r) int reftable_new_reader(struct reftable_reader **p, struct reftable_block_source *src, char const *name) { - struct reftable_reader *rd = - reftable_calloc(sizeof(struct reftable_reader)); + struct reftable_reader *rd = reftable_calloc(1, sizeof(*rd)); int err = init_reader(rd, src, name); if (err == 0) { *p = rd; @@ -711,7 +709,7 @@ static int reftable_reader_refs_for_unindexed(struct reftable_reader *r, uint8_t *oid) { struct table_iter ti_empty = TABLE_ITER_INIT; - struct table_iter *ti = reftable_calloc(sizeof(struct table_iter)); + struct table_iter *ti = reftable_calloc(1, sizeof(*ti)); struct filtering_ref_iterator *filter = NULL; struct filtering_ref_iterator empty = FILTERING_REF_ITERATOR_INIT; int oid_len = hash_size(r->hash_id); diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c index b8a32240164d6a..91ea7a10ef2924 100644 --- a/reftable/readwrite_test.c +++ b/reftable/readwrite_test.c @@ -56,7 +56,9 @@ static void write_table(char ***names, struct strbuf *buf, int N, int i = 0, n; struct reftable_log_record log = { NULL }; const struct reftable_stats *stats = NULL; - *names = reftable_calloc(sizeof(char *) * (N + 1)); + + REFTABLE_CALLOC_ARRAY(*names, N + 1); + reftable_writer_set_limits(w, update_index, update_index); for (i = 0; i < N; i++) { char name[100]; @@ -188,7 +190,7 @@ static void test_log_overflow(void) static void test_log_write_read(void) { int N = 2; - char **names = reftable_calloc(sizeof(char *) * (N + 1)); + char **names = reftable_calloc(N + 1, sizeof(*names)); int err; struct reftable_write_options opts = { .block_size = 256, @@ -519,7 +521,7 @@ static void test_table_read_write_seek_index(void) static void test_table_refs_for(int indexed) { int N = 50; - char **want_names = reftable_calloc(sizeof(char *) * (N + 1)); + char **want_names = reftable_calloc(N + 1, sizeof(*want_names)); int want_names_len = 0; uint8_t want_hash[GIT_SHA1_RAWSZ]; diff --git a/reftable/record.c b/reftable/record.c index 5c3fbb7b2a1e95..2f3cd6b62c0240 100644 --- a/reftable/record.c +++ b/reftable/record.c @@ -497,12 +497,13 @@ static void reftable_obj_record_copy_from(void *rec, const void *src_rec, (const struct reftable_obj_record *)src_rec; reftable_obj_record_release(obj); - obj->hash_prefix = reftable_malloc(src->hash_prefix_len); + + REFTABLE_ALLOC_ARRAY(obj->hash_prefix, src->hash_prefix_len); obj->hash_prefix_len = src->hash_prefix_len; if (src->hash_prefix_len) memcpy(obj->hash_prefix, src->hash_prefix, obj->hash_prefix_len); - obj->offsets = reftable_malloc(src->offset_len * sizeof(uint64_t)); + REFTABLE_ALLOC_ARRAY(obj->offsets, src->offset_len); obj->offset_len = src->offset_len; COPY_ARRAY(obj->offsets, src->offsets, src->offset_len); } @@ -559,7 +560,8 @@ static int reftable_obj_record_decode(void *rec, struct strbuf key, int n = 0; uint64_t last; int j; - r->hash_prefix = reftable_malloc(key.len); + + REFTABLE_ALLOC_ARRAY(r->hash_prefix, key.len); memcpy(r->hash_prefix, key.buf, key.len); r->hash_prefix_len = key.len; @@ -577,7 +579,7 @@ static int reftable_obj_record_decode(void *rec, struct strbuf key, if (count == 0) return start.len - in.len; - r->offsets = reftable_malloc(count * sizeof(uint64_t)); + REFTABLE_ALLOC_ARRAY(r->offsets, count); r->offset_len = count; n = get_var_int(&r->offsets[0], &in); @@ -715,12 +717,12 @@ static void reftable_log_record_copy_from(void *rec, const void *src_rec, } if (dst->value.update.new_hash) { - dst->value.update.new_hash = reftable_malloc(hash_size); + REFTABLE_ALLOC_ARRAY(dst->value.update.new_hash, hash_size); memcpy(dst->value.update.new_hash, src->value.update.new_hash, hash_size); } if (dst->value.update.old_hash) { - dst->value.update.old_hash = reftable_malloc(hash_size); + REFTABLE_ALLOC_ARRAY(dst->value.update.old_hash, hash_size); memcpy(dst->value.update.old_hash, src->value.update.old_hash, hash_size); } diff --git a/reftable/record_test.c b/reftable/record_test.c index 2876db7d2708aa..999366ad467496 100644 --- a/reftable/record_test.c +++ b/reftable/record_test.c @@ -231,8 +231,8 @@ static void test_reftable_log_record_roundtrip(void) .value_type = REFTABLE_LOG_UPDATE, .value = { .update = { - .new_hash = reftable_calloc(GIT_SHA1_RAWSZ), - .old_hash = reftable_calloc(GIT_SHA1_RAWSZ), + .new_hash = reftable_calloc(GIT_SHA1_RAWSZ, 1), + .old_hash = reftable_calloc(GIT_SHA1_RAWSZ, 1), .name = xstrdup("old name"), .email = xstrdup("old@email"), .message = xstrdup("old message"), diff --git a/reftable/refname.c b/reftable/refname.c index 957349693247ae..7570e4acf9eec7 100644 --- a/reftable/refname.c +++ b/reftable/refname.c @@ -140,8 +140,8 @@ int validate_ref_record_addition(struct reftable_table tab, { struct modification mod = { .tab = tab, - .add = reftable_calloc(sizeof(char *) * sz), - .del = reftable_calloc(sizeof(char *) * sz), + .add = reftable_calloc(sz, sizeof(*mod.add)), + .del = reftable_calloc(sz, sizeof(*mod.del)), }; int i = 0; int err = 0; diff --git a/reftable/stack.c b/reftable/stack.c index 1dfab99e96fffc..a7b2c610263b84 100644 --- a/reftable/stack.c +++ b/reftable/stack.c @@ -50,8 +50,7 @@ static ssize_t reftable_fd_write(void *arg, const void *data, size_t sz) int reftable_new_stack(struct reftable_stack **dest, const char *dir, struct reftable_write_options config) { - struct reftable_stack *p = - reftable_calloc(sizeof(struct reftable_stack)); + struct reftable_stack *p = reftable_calloc(1, sizeof(*p)); struct strbuf list_file_name = STRBUF_INIT; int err = 0; @@ -94,7 +93,7 @@ static int fd_read_lines(int fd, char ***namesp) goto done; } - buf = reftable_malloc(size + 1); + REFTABLE_ALLOC_ARRAY(buf, size + 1); if (read_in_full(fd, buf, size) != size) { err = REFTABLE_IO_ERROR; goto done; @@ -114,7 +113,7 @@ int read_lines(const char *filename, char ***namesp) int err = 0; if (fd < 0) { if (errno == ENOENT) { - *namesp = reftable_calloc(sizeof(char *)); + REFTABLE_CALLOC_ARRAY(*namesp, 1); return 0; } @@ -191,8 +190,7 @@ void reftable_stack_destroy(struct reftable_stack *st) static struct reftable_reader **stack_copy_readers(struct reftable_stack *st, int cur_len) { - struct reftable_reader **cur = - reftable_calloc(sizeof(struct reftable_reader *) * cur_len); + struct reftable_reader **cur = reftable_calloc(cur_len, sizeof(*cur)); int i = 0; for (i = 0; i < cur_len; i++) { cur[i] = st->readers[i]; @@ -208,9 +206,9 @@ static int reftable_stack_reload_once(struct reftable_stack *st, char **names, int err = 0; int names_len = names_length(names); struct reftable_reader **new_readers = - reftable_calloc(sizeof(struct reftable_reader *) * names_len); + reftable_calloc(names_len, sizeof(*new_readers)); struct reftable_table *new_tables = - reftable_calloc(sizeof(struct reftable_table) * names_len); + reftable_calloc(names_len, sizeof(*new_tables)); int new_readers_len = 0; struct reftable_merged_table *new_merged = NULL; struct strbuf table_path = STRBUF_INIT; @@ -344,7 +342,7 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st, goto out; } - names = reftable_calloc(sizeof(char *)); + REFTABLE_CALLOC_ARRAY(names, 1); } else { err = fd_read_lines(fd, &names); if (err < 0) @@ -686,7 +684,7 @@ int reftable_stack_new_addition(struct reftable_addition **dest, { int err = 0; struct reftable_addition empty = REFTABLE_ADDITION_INIT; - *dest = reftable_calloc(sizeof(**dest)); + REFTABLE_CALLOC_ARRAY(*dest, 1); **dest = empty; err = reftable_stack_init_addition(*dest, st); if (err) { @@ -871,7 +869,7 @@ static int stack_write_compact(struct reftable_stack *st, { int subtabs_len = last - first + 1; struct reftable_table *subtabs = reftable_calloc( - sizeof(struct reftable_table) * (last - first + 1)); + last - first + 1, sizeof(*subtabs)); struct reftable_merged_table *mt = NULL; int err = 0; struct reftable_iterator it = { NULL }; @@ -979,9 +977,9 @@ static int stack_compact_range(struct reftable_stack *st, int first, int last, int compact_count = last - first + 1; char **listp = NULL; char **delete_on_success = - reftable_calloc(sizeof(char *) * (compact_count + 1)); + reftable_calloc(compact_count + 1, sizeof(*delete_on_success)); char **subtable_locks = - reftable_calloc(sizeof(char *) * (compact_count + 1)); + reftable_calloc(compact_count + 1, sizeof(*subtable_locks)); int i = 0; int j = 0; int is_empty_table = 0; @@ -1204,7 +1202,7 @@ int fastlog2(uint64_t sz) struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n) { - struct segment *segs = reftable_calloc(sizeof(struct segment) * n); + struct segment *segs = reftable_calloc(n, sizeof(*segs)); int next = 0; struct segment cur = { 0 }; int i = 0; @@ -1268,7 +1266,7 @@ struct segment suggest_compaction_segment(uint64_t *sizes, int n) static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st) { uint64_t *sizes = - reftable_calloc(sizeof(uint64_t) * st->merged->stack_len); + reftable_calloc(st->merged->stack_len, sizeof(*sizes)); int version = (st->config.hash_id == GIT_SHA1_FORMAT_ID) ? 1 : 2; int overhead = header_size(version) - 1; int i = 0; diff --git a/reftable/tree.c b/reftable/tree.c index a5bf880985472d..528f33ae388d0a 100644 --- a/reftable/tree.c +++ b/reftable/tree.c @@ -20,8 +20,8 @@ struct tree_node *tree_search(void *key, struct tree_node **rootp, if (!insert) { return NULL; } else { - struct tree_node *n = - reftable_calloc(sizeof(struct tree_node)); + struct tree_node *n; + REFTABLE_CALLOC_ARRAY(n, 1); n->key = key; *rootp = n; return *rootp; diff --git a/reftable/writer.c b/reftable/writer.c index 4483bb21c3d6a8..47b344813264e7 100644 --- a/reftable/writer.c +++ b/reftable/writer.c @@ -49,7 +49,7 @@ static int padded_write(struct reftable_writer *w, uint8_t *data, size_t len, { int n = 0; if (w->pending_padding > 0) { - uint8_t *zeroed = reftable_calloc(w->pending_padding); + uint8_t *zeroed = reftable_calloc(w->pending_padding, sizeof(*zeroed)); int n = w->write(w->write_arg, zeroed, w->pending_padding); if (n < 0) return n; @@ -123,8 +123,7 @@ struct reftable_writer * reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t), void *writer_arg, struct reftable_write_options *opts) { - struct reftable_writer *wp = - reftable_calloc(sizeof(struct reftable_writer)); + struct reftable_writer *wp = reftable_calloc(1, sizeof(*wp)); strbuf_init(&wp->block_writer_data.last_key, 0); options_set_defaults(opts); if (opts->block_size >= (1 << 24)) { @@ -132,7 +131,7 @@ reftable_new_writer(ssize_t (*writer_func)(void *, const void *, size_t), abort(); } wp->last_key = reftable_empty_strbuf; - wp->block = reftable_calloc(opts->block_size); + REFTABLE_CALLOC_ARRAY(wp->block, opts->block_size); wp->write = writer_func; wp->write_arg = writer_arg; wp->opts = *opts; From ca63af0a248d31bf917d207722b284da41ffcee7 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Tue, 6 Feb 2024 07:35:31 +0100 Subject: [PATCH 14/51] reftable/stack: fix parameter validation when compacting range The `stack_compact_range()` function receives a "first" and "last" index that indicates which tables of the reftable stack should be compacted. Naturally, "first" must be smaller than "last" in order to identify a proper range of tables to compress, which we indeed also assert in the function. But the validations happens after we have already allocated arrays with a size of `last - first + 1`, leading to an underflow and thus an invalid allocation size. Fix this by reordering the array allocations to happen after we have validated parameters. While at it, convert the array allocations to use the newly introduced macros. Note that the relevant variables pointing into arrays should also be converted to use `size_t` instead of `int`. This is left for a later commit in this series. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- reftable/stack.c | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-) diff --git a/reftable/stack.c b/reftable/stack.c index a7b2c610263b84..1de2f6751c24ac 100644 --- a/reftable/stack.c +++ b/reftable/stack.c @@ -966,6 +966,7 @@ static int stack_write_compact(struct reftable_stack *st, static int stack_compact_range(struct reftable_stack *st, int first, int last, struct reftable_log_expiry_config *expiry) { + char **delete_on_success = NULL, **subtable_locks = NULL, **listp = NULL; struct strbuf temp_tab_file_name = STRBUF_INIT; struct strbuf new_table_name = STRBUF_INIT; struct strbuf lock_file_name = STRBUF_INIT; @@ -974,12 +975,7 @@ static int stack_compact_range(struct reftable_stack *st, int first, int last, int err = 0; int have_lock = 0; int lock_file_fd = -1; - int compact_count = last - first + 1; - char **listp = NULL; - char **delete_on_success = - reftable_calloc(compact_count + 1, sizeof(*delete_on_success)); - char **subtable_locks = - reftable_calloc(compact_count + 1, sizeof(*subtable_locks)); + int compact_count; int i = 0; int j = 0; int is_empty_table = 0; @@ -989,6 +985,10 @@ static int stack_compact_range(struct reftable_stack *st, int first, int last, goto done; } + compact_count = last - first + 1; + REFTABLE_CALLOC_ARRAY(delete_on_success, compact_count + 1); + REFTABLE_CALLOC_ARRAY(subtable_locks, compact_count + 1); + st->stats.attempts++; strbuf_reset(&lock_file_name); @@ -1146,12 +1146,14 @@ static int stack_compact_range(struct reftable_stack *st, int first, int last, done: free_names(delete_on_success); - listp = subtable_locks; - while (*listp) { - unlink(*listp); - listp++; + if (subtable_locks) { + listp = subtable_locks; + while (*listp) { + unlink(*listp); + listp++; + } + free_names(subtable_locks); } - free_names(subtable_locks); if (lock_file_fd >= 0) { close(lock_file_fd); lock_file_fd = -1; From 6d5e80fba2397708671cee6d9c5e394c4c187659 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Tue, 6 Feb 2024 07:35:35 +0100 Subject: [PATCH 15/51] reftable/stack: index segments with `size_t` We use `int`s to index into arrays of segments and track the length of them, which is considered to be a code smell in the Git project. Convert the code to use `size_t` instead. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- reftable/stack.c | 25 +++++++++++-------------- reftable/stack.h | 6 +++--- reftable/stack_test.c | 7 +++---- 3 files changed, 17 insertions(+), 21 deletions(-) diff --git a/reftable/stack.c b/reftable/stack.c index 1de2f6751c24ac..5da4ea81414eee 100644 --- a/reftable/stack.c +++ b/reftable/stack.c @@ -1202,12 +1202,11 @@ int fastlog2(uint64_t sz) return l - 1; } -struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n) +struct segment *sizes_to_segments(size_t *seglen, uint64_t *sizes, size_t n) { struct segment *segs = reftable_calloc(n, sizeof(*segs)); - int next = 0; struct segment cur = { 0 }; - int i = 0; + size_t next = 0, i; if (n == 0) { *seglen = 0; @@ -1233,29 +1232,27 @@ struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n) return segs; } -struct segment suggest_compaction_segment(uint64_t *sizes, int n) +struct segment suggest_compaction_segment(uint64_t *sizes, size_t n) { - int seglen = 0; - struct segment *segs = sizes_to_segments(&seglen, sizes, n); struct segment min_seg = { .log = 64, }; - int i = 0; + struct segment *segs; + size_t seglen = 0, i; + + segs = sizes_to_segments(&seglen, sizes, n); for (i = 0; i < seglen; i++) { - if (segment_size(&segs[i]) == 1) { + if (segment_size(&segs[i]) == 1) continue; - } - if (segs[i].log < min_seg.log) { + if (segs[i].log < min_seg.log) min_seg = segs[i]; - } } while (min_seg.start > 0) { - int prev = min_seg.start - 1; - if (fastlog2(min_seg.bytes) < fastlog2(sizes[prev])) { + size_t prev = min_seg.start - 1; + if (fastlog2(min_seg.bytes) < fastlog2(sizes[prev])) break; - } min_seg.start = prev; min_seg.bytes += sizes[prev]; diff --git a/reftable/stack.h b/reftable/stack.h index c1e3efa89960b1..d919455669ede0 100644 --- a/reftable/stack.h +++ b/reftable/stack.h @@ -32,13 +32,13 @@ struct reftable_stack { int read_lines(const char *filename, char ***lines); struct segment { - int start, end; + size_t start, end; int log; uint64_t bytes; }; int fastlog2(uint64_t sz); -struct segment *sizes_to_segments(int *seglen, uint64_t *sizes, int n); -struct segment suggest_compaction_segment(uint64_t *sizes, int n); +struct segment *sizes_to_segments(size_t *seglen, uint64_t *sizes, size_t n); +struct segment suggest_compaction_segment(uint64_t *sizes, size_t n); #endif diff --git a/reftable/stack_test.c b/reftable/stack_test.c index 289e9021464700..2d5b24e5c5d786 100644 --- a/reftable/stack_test.c +++ b/reftable/stack_test.c @@ -711,7 +711,7 @@ static void test_sizes_to_segments(void) uint64_t sizes[] = { 2, 3, 4, 5, 7, 9 }; /* .................0 1 2 3 4 5 */ - int seglen = 0; + size_t seglen = 0; struct segment *segs = sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes)); EXPECT(segs[2].log == 3); @@ -726,7 +726,7 @@ static void test_sizes_to_segments(void) static void test_sizes_to_segments_empty(void) { - int seglen = 0; + size_t seglen = 0; struct segment *segs = sizes_to_segments(&seglen, NULL, 0); EXPECT(seglen == 0); reftable_free(segs); @@ -735,8 +735,7 @@ static void test_sizes_to_segments_empty(void) static void test_sizes_to_segments_all_equal(void) { uint64_t sizes[] = { 5, 5 }; - - int seglen = 0; + size_t seglen = 0; struct segment *segs = sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes)); EXPECT(seglen == 1); From 47616c4399d958b8a9f40b1ad70071d2e3c56126 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Tue, 6 Feb 2024 07:35:41 +0100 Subject: [PATCH 16/51] reftable/stack: use `size_t` to track stack slices during compaction We use `int`s to track reftable slices when compacting the reftable stack, which is considered to be a code smell in the Git project. Convert the code to use `size_t` instead. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- reftable/stack.c | 32 ++++++++++++++++---------------- 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/reftable/stack.c b/reftable/stack.c index 5da4ea81414eee..a86481a9a6d514 100644 --- a/reftable/stack.c +++ b/reftable/stack.c @@ -24,7 +24,8 @@ static int stack_try_add(struct reftable_stack *st, void *arg), void *arg); static int stack_write_compact(struct reftable_stack *st, - struct reftable_writer *wr, int first, int last, + struct reftable_writer *wr, + size_t first, size_t last, struct reftable_log_expiry_config *config); static int stack_check_addition(struct reftable_stack *st, const char *new_tab_name); @@ -820,7 +821,8 @@ uint64_t reftable_stack_next_update_index(struct reftable_stack *st) return 1; } -static int stack_compact_locked(struct reftable_stack *st, int first, int last, +static int stack_compact_locked(struct reftable_stack *st, + size_t first, size_t last, struct strbuf *temp_tab, struct reftable_log_expiry_config *config) { @@ -864,22 +866,21 @@ static int stack_compact_locked(struct reftable_stack *st, int first, int last, } static int stack_write_compact(struct reftable_stack *st, - struct reftable_writer *wr, int first, int last, + struct reftable_writer *wr, + size_t first, size_t last, struct reftable_log_expiry_config *config) { int subtabs_len = last - first + 1; struct reftable_table *subtabs = reftable_calloc( last - first + 1, sizeof(*subtabs)); struct reftable_merged_table *mt = NULL; - int err = 0; struct reftable_iterator it = { NULL }; struct reftable_ref_record ref = { NULL }; struct reftable_log_record log = { NULL }; - uint64_t entries = 0; + int err = 0; - int i = 0, j = 0; - for (i = first, j = 0; i <= last; i++) { + for (size_t i = first, j = 0; i <= last; i++) { struct reftable_reader *t = st->readers[i]; reftable_table_from_reader(&subtabs[j++], t); st->stats.bytes += t->size; @@ -963,7 +964,8 @@ static int stack_write_compact(struct reftable_stack *st, } /* < 0: error. 0 == OK, > 0 attempt failed; could retry. */ -static int stack_compact_range(struct reftable_stack *st, int first, int last, +static int stack_compact_range(struct reftable_stack *st, + size_t first, size_t last, struct reftable_log_expiry_config *expiry) { char **delete_on_success = NULL, **subtable_locks = NULL, **listp = NULL; @@ -972,12 +974,10 @@ static int stack_compact_range(struct reftable_stack *st, int first, int last, struct strbuf lock_file_name = STRBUF_INIT; struct strbuf ref_list_contents = STRBUF_INIT; struct strbuf new_table_path = STRBUF_INIT; + size_t i, j, compact_count; int err = 0; int have_lock = 0; int lock_file_fd = -1; - int compact_count; - int i = 0; - int j = 0; int is_empty_table = 0; if (first > last || (!expiry && first == last)) { @@ -1172,17 +1172,17 @@ static int stack_compact_range(struct reftable_stack *st, int first, int last, int reftable_stack_compact_all(struct reftable_stack *st, struct reftable_log_expiry_config *config) { - return stack_compact_range(st, 0, st->merged->stack_len - 1, config); + return stack_compact_range(st, 0, st->merged->stack_len ? + st->merged->stack_len - 1 : 0, config); } -static int stack_compact_range_stats(struct reftable_stack *st, int first, - int last, +static int stack_compact_range_stats(struct reftable_stack *st, + size_t first, size_t last, struct reftable_log_expiry_config *config) { int err = stack_compact_range(st, first, last, config); - if (err > 0) { + if (err > 0) st->stats.failures++; - } return err; } From 81879123c32d06406de5bfaf68baf6029660ca2a Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Tue, 6 Feb 2024 07:35:46 +0100 Subject: [PATCH 17/51] reftable/stack: use `size_t` to track stack length While the stack length is already stored as `size_t`, we frequently use `int`s to refer to those stacks throughout the reftable library. Convert those cases to use `size_t` instead to make things consistent. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- reftable/basics.c | 7 +++---- reftable/basics.h | 2 +- reftable/merged.c | 11 +++++------ reftable/merged_test.c | 14 ++++++-------- reftable/reftable-merged.h | 2 +- reftable/stack.c | 21 ++++++++++----------- 6 files changed, 26 insertions(+), 31 deletions(-) diff --git a/reftable/basics.c b/reftable/basics.c index af9004cec24db5..0785aff941bdc2 100644 --- a/reftable/basics.c +++ b/reftable/basics.c @@ -64,12 +64,11 @@ void free_names(char **a) reftable_free(a); } -int names_length(char **names) +size_t names_length(char **names) { char **p = names; - for (; *p; p++) { - /* empty */ - } + while (*p) + p++; return p - names; } diff --git a/reftable/basics.h b/reftable/basics.h index 4c3ac963a35204..91f3533efee501 100644 --- a/reftable/basics.h +++ b/reftable/basics.h @@ -44,7 +44,7 @@ void parse_names(char *buf, int size, char ***namesp); int names_equal(char **a, char **b); /* returns the array size of a NULL-terminated array of strings. */ -int names_length(char **names); +size_t names_length(char **names); /* Allocation routines; they invoke the functions set through * reftable_set_alloc() */ diff --git a/reftable/merged.c b/reftable/merged.c index 2031fd51b4426f..e2c6253324f256 100644 --- a/reftable/merged.c +++ b/reftable/merged.c @@ -45,11 +45,10 @@ static int merged_iter_init(struct merged_iter *mi) static void merged_iter_close(void *p) { struct merged_iter *mi = p; - int i = 0; + merged_iter_pqueue_release(&mi->pq); - for (i = 0; i < mi->stack_len; i++) { + for (size_t i = 0; i < mi->stack_len; i++) reftable_iterator_destroy(&mi->stack[i]); - } reftable_free(mi->stack); strbuf_release(&mi->key); strbuf_release(&mi->entry_key); @@ -168,14 +167,14 @@ static void iterator_from_merged_iter(struct reftable_iterator *it, } int reftable_new_merged_table(struct reftable_merged_table **dest, - struct reftable_table *stack, int n, + struct reftable_table *stack, size_t n, uint32_t hash_id) { struct reftable_merged_table *m = NULL; uint64_t last_max = 0; uint64_t first_min = 0; - int i = 0; - for (i = 0; i < n; i++) { + + for (size_t i = 0; i < n; i++) { uint64_t min = reftable_table_min_update_index(&stack[i]); uint64_t max = reftable_table_max_update_index(&stack[i]); diff --git a/reftable/merged_test.c b/reftable/merged_test.c index e233a9d581cff0..442917cc83595e 100644 --- a/reftable/merged_test.c +++ b/reftable/merged_test.c @@ -88,18 +88,17 @@ static struct reftable_merged_table * merged_table_from_records(struct reftable_ref_record **refs, struct reftable_block_source **source, struct reftable_reader ***readers, int *sizes, - struct strbuf *buf, int n) + struct strbuf *buf, size_t n) { - int i = 0; struct reftable_merged_table *mt = NULL; - int err; struct reftable_table *tabs; + int err; REFTABLE_CALLOC_ARRAY(tabs, n); REFTABLE_CALLOC_ARRAY(*readers, n); REFTABLE_CALLOC_ARRAY(*source, n); - for (i = 0; i < n; i++) { + for (size_t i = 0; i < n; i++) { write_test_table(&buf[i], refs[i], sizes[i]); block_source_from_strbuf(&(*source)[i], &buf[i]); @@ -263,18 +262,17 @@ static struct reftable_merged_table * merged_table_from_log_records(struct reftable_log_record **logs, struct reftable_block_source **source, struct reftable_reader ***readers, int *sizes, - struct strbuf *buf, int n) + struct strbuf *buf, size_t n) { - int i = 0; struct reftable_merged_table *mt = NULL; - int err; struct reftable_table *tabs; + int err; REFTABLE_CALLOC_ARRAY(tabs, n); REFTABLE_CALLOC_ARRAY(*readers, n); REFTABLE_CALLOC_ARRAY(*source, n); - for (i = 0; i < n; i++) { + for (size_t i = 0; i < n; i++) { write_test_log_table(&buf[i], logs[i], sizes[i], i + 1); block_source_from_strbuf(&(*source)[i], &buf[i]); diff --git a/reftable/reftable-merged.h b/reftable/reftable-merged.h index 1a6d16915ab453..c91a2d83a2b8a9 100644 --- a/reftable/reftable-merged.h +++ b/reftable/reftable-merged.h @@ -33,7 +33,7 @@ struct reftable_table; the stack array. */ int reftable_new_merged_table(struct reftable_merged_table **dest, - struct reftable_table *stack, int n, + struct reftable_table *stack, size_t n, uint32_t hash_id); /* returns an iterator positioned just before 'name' */ diff --git a/reftable/stack.c b/reftable/stack.c index a86481a9a6d514..bb684a3dc1643e 100644 --- a/reftable/stack.c +++ b/reftable/stack.c @@ -202,18 +202,18 @@ static struct reftable_reader **stack_copy_readers(struct reftable_stack *st, static int reftable_stack_reload_once(struct reftable_stack *st, char **names, int reuse_open) { - int cur_len = !st->merged ? 0 : st->merged->stack_len; + size_t cur_len = !st->merged ? 0 : st->merged->stack_len; struct reftable_reader **cur = stack_copy_readers(st, cur_len); - int err = 0; - int names_len = names_length(names); + size_t names_len = names_length(names); struct reftable_reader **new_readers = reftable_calloc(names_len, sizeof(*new_readers)); struct reftable_table *new_tables = reftable_calloc(names_len, sizeof(*new_tables)); - int new_readers_len = 0; + size_t new_readers_len = 0; struct reftable_merged_table *new_merged = NULL; struct strbuf table_path = STRBUF_INIT; - int i; + int err = 0; + size_t i; while (*names) { struct reftable_reader *rd = NULL; @@ -221,11 +221,10 @@ static int reftable_stack_reload_once(struct reftable_stack *st, char **names, /* this is linear; we assume compaction keeps the number of tables under control so this is not quadratic. */ - int j = 0; - for (j = 0; reuse_open && j < cur_len; j++) { - if (cur[j] && 0 == strcmp(cur[j]->name, name)) { - rd = cur[j]; - cur[j] = NULL; + for (i = 0; reuse_open && i < cur_len; i++) { + if (cur[i] && 0 == strcmp(cur[i]->name, name)) { + rd = cur[i]; + cur[i] = NULL; break; } } @@ -870,7 +869,7 @@ static int stack_write_compact(struct reftable_stack *st, size_t first, size_t last, struct reftable_log_expiry_config *config) { - int subtabs_len = last - first + 1; + size_t subtabs_len = last - first + 1; struct reftable_table *subtabs = reftable_calloc( last - first + 1, sizeof(*subtabs)); struct reftable_merged_table *mt = NULL; From 59f302ca5abf3e8ec4f14f098b26adf786017fad Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Tue, 6 Feb 2024 07:35:50 +0100 Subject: [PATCH 18/51] reftable/merged: refactor seeking of records The code to seek reftable records in the merged table code is quite hard to read and does not conform to our coding style in multiple ways: - We have multiple exit paths where we release resources even though that is not really necessary. - We use a scoped error variable `e` which is hard to reason about. This variable is not required at all. - We allocate memory in the variable declarations, which is easy to miss. Refactor the function so that it becomes more maintainable in the future. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- reftable/merged.c | 54 ++++++++++++++++++----------------------------- 1 file changed, 21 insertions(+), 33 deletions(-) diff --git a/reftable/merged.c b/reftable/merged.c index e2c6253324f256..0abcda26e846e2 100644 --- a/reftable/merged.c +++ b/reftable/merged.c @@ -238,50 +238,38 @@ static int merged_table_seek_record(struct reftable_merged_table *mt, struct reftable_iterator *it, struct reftable_record *rec) { - struct reftable_iterator *iters = reftable_calloc( - mt->stack_len, sizeof(*iters)); struct merged_iter merged = { - .stack = iters, .typ = reftable_record_type(rec), .hash_id = mt->hash_id, .suppress_deletions = mt->suppress_deletions, .key = STRBUF_INIT, .entry_key = STRBUF_INIT, }; - int n = 0; - int err = 0; - int i = 0; - for (i = 0; i < mt->stack_len && err == 0; i++) { - int e = reftable_table_seek_record(&mt->stack[i], &iters[n], - rec); - if (e < 0) { - err = e; - } - if (e == 0) { - n++; - } - } - if (err < 0) { - int i = 0; - for (i = 0; i < n; i++) { - reftable_iterator_destroy(&iters[i]); - } - reftable_free(iters); - return err; + struct merged_iter *p; + int err; + + REFTABLE_CALLOC_ARRAY(merged.stack, mt->stack_len); + for (size_t i = 0; i < mt->stack_len; i++) { + err = reftable_table_seek_record(&mt->stack[i], + &merged.stack[merged.stack_len], rec); + if (err < 0) + goto out; + if (!err) + merged.stack_len++; } - merged.stack_len = n; err = merged_iter_init(&merged); - if (err < 0) { + if (err < 0) + goto out; + + p = reftable_malloc(sizeof(struct merged_iter)); + *p = merged; + iterator_from_merged_iter(it, p); + +out: + if (err < 0) merged_iter_close(&merged); - return err; - } else { - struct merged_iter *p = - reftable_malloc(sizeof(struct merged_iter)); - *p = merged; - iterator_from_merged_iter(it, p); - } - return 0; + return err; } int reftable_merged_table_seek_ref(struct reftable_merged_table *mt, From 62d3c8e8c8a3dc3113cead8d9dd36f7e59054670 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Tue, 6 Feb 2024 07:35:55 +0100 Subject: [PATCH 19/51] reftable/merged: refactor initialization of iterators Refactor the initialization of the merged iterator to fit our code style better. This refactoring prepares the code for a refactoring of how records are being initialized. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- reftable/merged.c | 27 +++++++++++++-------------- 1 file changed, 13 insertions(+), 14 deletions(-) diff --git a/reftable/merged.c b/reftable/merged.c index 0abcda26e846e2..0e60e2a39b1d81 100644 --- a/reftable/merged.c +++ b/reftable/merged.c @@ -19,24 +19,23 @@ license that can be found in the LICENSE file or at static int merged_iter_init(struct merged_iter *mi) { - int i = 0; - for (i = 0; i < mi->stack_len; i++) { - struct reftable_record rec = reftable_new_record(mi->typ); - int err = iterator_next(&mi->stack[i], &rec); - if (err < 0) { + for (size_t i = 0; i < mi->stack_len; i++) { + struct pq_entry e = { + .rec = reftable_new_record(mi->typ), + .index = i, + }; + int err; + + err = iterator_next(&mi->stack[i], &e.rec); + if (err < 0) return err; - } - if (err > 0) { reftable_iterator_destroy(&mi->stack[i]); - reftable_record_release(&rec); - } else { - struct pq_entry e = { - .rec = rec, - .index = i, - }; - merged_iter_pqueue_add(&mi->pq, &e); + reftable_record_release(&e.rec); + continue; } + + merged_iter_pqueue_add(&mi->pq, &e); } return 0; From 3ddef475d008b8a705a53e6bb1405bb773ffdc50 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Tue, 6 Feb 2024 07:35:59 +0100 Subject: [PATCH 20/51] reftable/record: improve semantics when initializing records According to our usual coding style, the `reftable_new_record()` function would indicate that it is allocating a new record. This is not the case though as the function merely initializes records without allocating any memory. Replace `reftable_new_record()` with a new `reftable_record_init()` function that takes a record pointer as input and initializes it accordingly. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- reftable/block.c | 18 +++++++++--------- reftable/merged.c | 8 +++++--- reftable/reader.c | 4 ++-- reftable/record.c | 43 ++++++++++-------------------------------- reftable/record.h | 10 +++++----- reftable/record_test.c | 4 ++-- 6 files changed, 33 insertions(+), 54 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index 838759823a0473..ce8a7d24f3d33e 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -382,23 +382,23 @@ int block_reader_seek(struct block_reader *br, struct block_iter *it, .key = *want, .r = br, }; - struct reftable_record rec = reftable_new_record(block_reader_type(br)); - int err = 0; struct block_iter next = BLOCK_ITER_INIT; + struct reftable_record rec; + int err = 0, i; - int i = binsearch(br->restart_count, &restart_key_less, &args); if (args.error) { err = REFTABLE_FORMAT_ERROR; goto done; } - it->br = br; - if (i > 0) { - i--; - it->next_off = block_reader_restart_offset(br, i); - } else { + i = binsearch(br->restart_count, &restart_key_less, &args); + if (i > 0) + it->next_off = block_reader_restart_offset(br, i - 1); + else it->next_off = br->header_off + 4; - } + it->br = br; + + reftable_record_init(&rec, block_reader_type(br)); /* We're looking for the last entry less/equal than the wanted key, so we have to go one entry too far and then back up. diff --git a/reftable/merged.c b/reftable/merged.c index 0e60e2a39b1d81..a0f222e07bdf75 100644 --- a/reftable/merged.c +++ b/reftable/merged.c @@ -21,11 +21,11 @@ static int merged_iter_init(struct merged_iter *mi) { for (size_t i = 0; i < mi->stack_len; i++) { struct pq_entry e = { - .rec = reftable_new_record(mi->typ), .index = i, }; int err; + reftable_record_init(&e.rec, mi->typ); err = iterator_next(&mi->stack[i], &e.rec); if (err < 0) return err; @@ -57,10 +57,12 @@ static int merged_iter_advance_nonnull_subiter(struct merged_iter *mi, size_t idx) { struct pq_entry e = { - .rec = reftable_new_record(mi->typ), .index = idx, }; - int err = iterator_next(&mi->stack[idx], &e.rec); + int err; + + reftable_record_init(&e.rec, mi->typ); + err = iterator_next(&mi->stack[idx], &e.rec); if (err < 0) return err; diff --git a/reftable/reader.c b/reftable/reader.c index 3e0de5e8ad8bb1..5e6c8f30a1685d 100644 --- a/reftable/reader.c +++ b/reftable/reader.c @@ -444,13 +444,13 @@ static int reader_start(struct reftable_reader *r, struct table_iter *ti, static int reader_seek_linear(struct table_iter *ti, struct reftable_record *want) { - struct reftable_record rec = - reftable_new_record(reftable_record_type(want)); struct strbuf want_key = STRBUF_INIT; struct strbuf got_key = STRBUF_INIT; struct table_iter next = TABLE_ITER_INIT; + struct reftable_record rec; int err = -1; + reftable_record_init(&rec, reftable_record_type(want)); reftable_record_key(want, &want_key); while (1) { diff --git a/reftable/record.c b/reftable/record.c index 2f3cd6b62c0240..26c5e43f9bfc3e 100644 --- a/reftable/record.c +++ b/reftable/record.c @@ -1259,45 +1259,22 @@ reftable_record_vtable(struct reftable_record *rec) abort(); } -struct reftable_record reftable_new_record(uint8_t typ) +void reftable_record_init(struct reftable_record *rec, uint8_t typ) { - struct reftable_record clean = { - .type = typ, - }; + memset(rec, 0, sizeof(*rec)); + rec->type = typ; - /* the following is involved, but the naive solution (just return - * `clean` as is, except for BLOCK_TYPE_INDEX), returns a garbage - * clean.u.obj.offsets pointer on Windows VS CI. Go figure. - */ switch (typ) { - case BLOCK_TYPE_OBJ: - { - struct reftable_obj_record obj = { 0 }; - clean.u.obj = obj; - break; - } - case BLOCK_TYPE_INDEX: - { - struct reftable_index_record idx = { - .last_key = STRBUF_INIT, - }; - clean.u.idx = idx; - break; - } case BLOCK_TYPE_REF: - { - struct reftable_ref_record ref = { 0 }; - clean.u.ref = ref; - break; - } case BLOCK_TYPE_LOG: - { - struct reftable_log_record log = { 0 }; - clean.u.log = log; - break; - } + case BLOCK_TYPE_OBJ: + return; + case BLOCK_TYPE_INDEX: + strbuf_init(&rec->u.idx.last_key, 0); + return; + default: + BUG("unhandled record type"); } - return clean; } void reftable_record_print(struct reftable_record *rec, int hash_size) diff --git a/reftable/record.h b/reftable/record.h index fd80cd451d5d4c..e64ed30c8058ca 100644 --- a/reftable/record.h +++ b/reftable/record.h @@ -69,9 +69,6 @@ struct reftable_record_vtable { /* returns true for recognized block types. Block start with the block type. */ int reftable_is_block_type(uint8_t typ); -/* return an initialized record for the given type */ -struct reftable_record reftable_new_record(uint8_t typ); - /* Encode `key` into `dest`. Sets `is_restart` to indicate a restart. Returns * number of bytes written. */ int reftable_encode_key(int *is_restart, struct string_view dest, @@ -100,8 +97,8 @@ struct reftable_obj_record { /* record is a generic wrapper for different types of records. It is normally * created on the stack, or embedded within another struct. If the type is * known, a fresh instance can be initialized explicitly. Otherwise, use - * reftable_new_record() to initialize generically (as the index_record is not - * valid as 0-initialized structure) + * `reftable_record_init()` to initialize generically (as the index_record is + * not valid as 0-initialized structure) */ struct reftable_record { uint8_t type; @@ -113,6 +110,9 @@ struct reftable_record { } u; }; +/* Initialize the reftable record for the given type */ +void reftable_record_init(struct reftable_record *rec, uint8_t typ); + /* see struct record_vtable */ int reftable_record_equal(struct reftable_record *a, struct reftable_record *b, int hash_size); void reftable_record_print(struct reftable_record *rec, int hash_size); diff --git a/reftable/record_test.c b/reftable/record_test.c index 999366ad467496..a86cff552661ba 100644 --- a/reftable/record_test.c +++ b/reftable/record_test.c @@ -16,11 +16,11 @@ static void test_copy(struct reftable_record *rec) { - struct reftable_record copy = { 0 }; + struct reftable_record copy; uint8_t typ; typ = reftable_record_type(rec); - copy = reftable_new_record(typ); + reftable_record_init(©, typ); reftable_record_copy_from(©, rec, GIT_SHA1_RAWSZ); /* do it twice to catch memory leaks */ reftable_record_copy_from(©, rec, GIT_SHA1_RAWSZ); From 78307f1a89dc94eaab7de40fe80b7594ed80225c Mon Sep 17 00:00:00 2001 From: Philippe Blain Date: Tue, 6 Feb 2024 13:20:12 +0000 Subject: [PATCH 21/51] .github/PULL_REQUEST_TEMPLATE.md: add a note about single-commit PRs Contributors using Gitgitgadget continue to send single-commit PRs with their commit message text duplicated below the three-dash line, increasing the signal-to-noise ratio for reviewers. This is because Gitgitgadget copies the pull request description as an in-patch commentary, for single-commit PRs, and _GitHub_ defaults to prefilling the pull request description with the commit message, for single-commit PRs (followed by the content of the pull request template). Add a note in the pull request template mentioning that for single-commit PRs, the PR description should thus be kept empty, in the hope that contributors read it and act on it. This partly addresses: https://github.com/gitgitgadget/gitgitgadget/issues/340 Signed-off-by: Philippe Blain Signed-off-by: Junio C Hamano --- .github/PULL_REQUEST_TEMPLATE.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index 952c7c3a2aa11e..37654cdfd7abcf 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -4,4 +4,7 @@ a mailing list (git@vger.kernel.org) for code submissions, code reviews, and bug reports. Nevertheless, you can use GitGitGadget (https://gitgitgadget.github.io/) to conveniently send your Pull Requests commits to our mailing list. +For a single-commit pull request, please *leave the pull request description +empty*: your commit message itself should describe your changes. + Please read the "guidelines for contributing" linked above! From db489ea4f368656d7b0d5702f0bcc06779ea89d0 Mon Sep 17 00:00:00 2001 From: Britton Leo Kerin Date: Tue, 6 Feb 2024 12:50:42 -0900 Subject: [PATCH 22/51] completion: tests: always use 'master' for default initial branch name The default initial branch name can normally be configured using the GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME environment variable. However, when testing e.g. completion it's convenient to know the exact initial branch name that will be used. To achieve that without too much trouble it is considered sufficient to force the default initial branch name to 'master' for all of t9902-completion.sh. Signed-off-by: Britton Leo Kerin Signed-off-by: Junio C Hamano --- t/t9902-completion.sh | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh index aa9a614de33a17..a5d4e900a26622 100755 --- a/t/t9902-completion.sh +++ b/t/t9902-completion.sh @@ -5,6 +5,11 @@ test_description='test bash completion' +# Override environment and always use master for the default initial branch +# name for these tests, so that rev completion candidates are as expected. +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME + . ./lib-bash.sh complete () From e1f74dd58b77fe9bc5ed196221642395cf8951d0 Mon Sep 17 00:00:00 2001 From: Britton Leo Kerin Date: Tue, 6 Feb 2024 12:50:43 -0900 Subject: [PATCH 23/51] completion: bisect: complete bad, new, old, and help subcommands The bad, new, old and help subcommands to git-bisect(1) are not completed. Add the bad, new, old, and help subcommands to the appropriate lists such that the commands and their possible ref arguments are completed. Add tests. Signed-off-by: Britton Leo Kerin Signed-off-by: Junio C Hamano --- contrib/completion/git-completion.bash | 4 +- t/t9902-completion.sh | 71 ++++++++++++++++++++++++++ 2 files changed, 73 insertions(+), 2 deletions(-) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 185b47d802f433..06d0b156e7b006 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -1449,7 +1449,7 @@ _git_bisect () { __git_has_doubledash && return - local subcommands="start bad good skip reset visualize replay log run" + local subcommands="start bad new good old skip reset visualize replay log run help" local subcommand="$(__git_find_on_cmdline "$subcommands")" if [ -z "$subcommand" ]; then __git_find_repo_path @@ -1462,7 +1462,7 @@ _git_bisect () fi case "$subcommand" in - bad|good|reset|skip|start) + bad|new|good|old|reset|skip|start) __git_complete_refs ;; *) diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh index a5d4e900a26622..7388c892cf51c5 100755 --- a/t/t9902-completion.sh +++ b/t/t9902-completion.sh @@ -1264,6 +1264,77 @@ test_expect_success 'git switch - with no options, complete local branches and u EOF ' +test_expect_success 'git bisect - when not bisecting, complete only replay and start subcommands' ' + test_completion "git bisect " <<-\EOF + replay Z + start Z + EOF +' + +test_expect_success 'setup for git-bisect tests requiring a repo' ' + git init git-bisect && + ( + cd git-bisect && + echo "initial contents" >file && + git add file && + git commit -am "Initial commit" && + git tag initial && + echo "new line" >>file && + git commit -am "First change" && + echo "another new line" >>file && + git commit -am "Second change" && + git tag final + ) +' + +test_expect_success 'git bisect - start subcommand arguments before double-dash are completed as revs' ' + ( + cd git-bisect && + test_completion "git bisect start " <<-\EOF + HEAD Z + final Z + initial Z + master Z + EOF + ) +' + +# Note that these arguments are s, which in practice the fallback +# completion (not the git completion) later ends up completing as paths. +test_expect_success 'git bisect - start subcommand arguments after double-dash are not completed' ' + ( + cd git-bisect && + test_completion "git bisect start final initial -- " "" + ) +' + +test_expect_success 'setup for git-bisect tests requiring ongoing bisection' ' + ( + cd git-bisect && + git bisect start --term-new=custom_new --term-old=custom_old final initial + ) +' + +test_expect_success 'git-bisect - when bisecting all subcommands are candidates' ' + ( + cd git-bisect && + test_completion "git bisect " <<-\EOF + start Z + bad Z + new Z + good Z + old Z + skip Z + reset Z + visualize Z + replay Z + log Z + run Z + help Z + EOF + ) +' + test_expect_success 'git checkout - completes refs and unique remote branches for DWIM' ' test_completion "git checkout " <<-\EOF HEAD Z From af8910a2d4cdc3452c4b48e073e18fc10ff76723 Mon Sep 17 00:00:00 2001 From: Britton Leo Kerin Date: Tue, 6 Feb 2024 12:50:44 -0900 Subject: [PATCH 24/51] completion: bisect: complete custom terms and related options git bisect supports the use of custom terms via the --term-(new|bad) and --term-(old|good) options, but the completion code doesn't know about these options or the new subcommands they define. Add support for these options and the custom subcommands by checking for BISECT_TERMS and adding them to the list of subcommands. Add tests. Signed-off-by: Britton Leo Kerin Signed-off-by: Junio C Hamano --- contrib/completion/git-completion.bash | 32 ++++++++++++++++++++++++-- t/t9902-completion.sh | 15 ++++++++++++ 2 files changed, 45 insertions(+), 2 deletions(-) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 06d0b156e7b006..6a3d9c7760039e 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -1449,7 +1449,20 @@ _git_bisect () { __git_has_doubledash && return - local subcommands="start bad new good old skip reset visualize replay log run help" + __git_find_repo_path + + # If a bisection is in progress get the terms being used. + local term_bad term_good + if [ -f "$__git_repo_path"/BISECT_TERMS ]; then + term_bad=$(__git bisect terms --term-bad) + term_good=$(__git bisect terms --term-good) + fi + + # We will complete any custom terms, but still always complete the + # more usual bad/new/good/old because git bisect gives a good error + # message if these are given when not in use, and that's better than + # silent refusal to complete if the user is confused. + local subcommands="start bad new $term_bad good old $term_good terms skip reset visualize replay log run help" local subcommand="$(__git_find_on_cmdline "$subcommands")" if [ -z "$subcommand" ]; then __git_find_repo_path @@ -1462,7 +1475,22 @@ _git_bisect () fi case "$subcommand" in - bad|new|good|old|reset|skip|start) + start) + case "$cur" in + --*) + __gitcomp "--term-new --term-bad --term-old --term-good" + return + ;; + *) + __git_complete_refs + ;; + esac + ;; + terms) + __gitcomp "--term-good --term-old --term-bad --term-new" + return + ;; + bad|new|"$term_bad"|good|old|"$term_good"|reset|skip) __git_complete_refs ;; *) diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh index 7388c892cf51c5..304903b1a73cc2 100755 --- a/t/t9902-completion.sh +++ b/t/t9902-completion.sh @@ -1321,9 +1321,12 @@ test_expect_success 'git-bisect - when bisecting all subcommands are candidates' test_completion "git bisect " <<-\EOF start Z bad Z + custom_new Z + custom_old Z new Z good Z old Z + terms Z skip Z reset Z visualize Z @@ -1335,6 +1338,18 @@ test_expect_success 'git-bisect - when bisecting all subcommands are candidates' ) ' +test_expect_success 'git-bisect - options to terms subcommand are candidates' ' + ( + cd git-bisect && + test_completion "git bisect terms --" <<-\EOF + --term-bad Z + --term-good Z + --term-new Z + --term-old Z + EOF + ) +' + test_expect_success 'git checkout - completes refs and unique remote branches for DWIM' ' test_completion "git checkout " <<-\EOF HEAD Z From 41928aeb45e70d4361c780cc69d3975faee5eec4 Mon Sep 17 00:00:00 2001 From: Britton Leo Kerin Date: Tue, 6 Feb 2024 12:50:45 -0900 Subject: [PATCH 25/51] completion: bisect: complete missing --first-parent and - -no-checkout options The --first-parent and --no-checkout options to the start subcommand of git-bisect(1) are not completed. Enable completion of the --first-parent and --no-checkout options to the start subcommand. Add test. Signed-off-by: Britton Leo Kerin Signed-off-by: Junio C Hamano --- contrib/completion/git-completion.bash | 2 +- t/t9902-completion.sh | 11 +++++++++++ 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 6a3d9c7760039e..57c6e09968d45e 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -1478,7 +1478,7 @@ _git_bisect () start) case "$cur" in --*) - __gitcomp "--term-new --term-bad --term-old --term-good" + __gitcomp "--first-parent --no-checkout --term-new --term-bad --term-old --term-good" return ;; *) diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh index 304903b1a73cc2..8fcd1cfa7ef8fc 100755 --- a/t/t9902-completion.sh +++ b/t/t9902-completion.sh @@ -1271,6 +1271,17 @@ test_expect_success 'git bisect - when not bisecting, complete only replay and s EOF ' +test_expect_success 'git bisect - complete options to start subcommand' ' + test_completion "git bisect start --" <<-\EOF + --term-new Z + --term-bad Z + --term-old Z + --term-good Z + --no-checkout Z + --first-parent Z + EOF +' + test_expect_success 'setup for git-bisect tests requiring a repo' ' git init git-bisect && ( From a9e5b7a76da5ceab772166c94830dd899cf55b88 Mon Sep 17 00:00:00 2001 From: Britton Leo Kerin Date: Tue, 6 Feb 2024 12:50:46 -0900 Subject: [PATCH 26/51] completion: new function __git_complete_log_opts The options accepted by git-log are also accepted by at least one other command (git-bisect). Factor the common option completion code into a new function and use it from _git_log. The new function leaves COMPREPLY empty if no option candidates are found, so that callers can safely check it to determine if completion for other arguments should be attempted. Signed-off-by: Britton Leo Kerin Signed-off-by: Junio C Hamano --- contrib/completion/git-completion.bash | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 57c6e09968d45e..b9ebd5e4095de2 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -2089,10 +2089,12 @@ __git_diff_merges_opts="off none on first-parent 1 separate m combined c dense-c __git_log_pretty_formats="oneline short medium full fuller reference email raw format: tformat: mboxrd" __git_log_date_formats="relative iso8601 iso8601-strict rfc2822 short local default human raw unix auto: format:" -_git_log () +# Complete porcelain (i.e. not git-rev-list) options and at least some +# option arguments accepted by git-log. Note that this same set of options +# are also accepted by some other git commands besides git-log. +__git_complete_log_opts () { - __git_has_doubledash && return - __git_find_repo_path + COMPREPLY=() local merge="" if [ -f "$__git_repo_path/MERGE_HEAD" ]; then @@ -2186,6 +2188,16 @@ _git_log () return ;; esac +} + +_git_log () +{ + __git_has_doubledash && return + __git_find_repo_path + + __git_complete_log_opts + [ ${#COMPREPLY[@]} -eq 0 ] || return + __git_complete_revlist } From d115b877879cc8d072971437395ea2b97d47a7d7 Mon Sep 17 00:00:00 2001 From: Britton Leo Kerin Date: Tue, 6 Feb 2024 12:50:47 -0900 Subject: [PATCH 27/51] completion: bisect: complete log opts for visualize subcommand Arguments passed to the "visualize" subcommand of git-bisect(1) get forwarded to git-log(1). It thus supports the same options as git-log(1) would, but our Bash completion script does not know to handle this. Make completion of porcelain git-log options and option arguments to the visualize subcommand work by calling __git_complete_log_opts when the start of an option to the subcommand is seen (visualize doesn't support any options besides the git-log options). Add test. Signed-off-by: Britton Leo Kerin Signed-off-by: Junio C Hamano --- contrib/completion/git-completion.bash | 4 ++++ t/t9902-completion.sh | 15 +++++++++++++++ 2 files changed, 19 insertions(+) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index b9ebd5e4095de2..5337ae4ce2b798 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -1490,6 +1490,10 @@ _git_bisect () __gitcomp "--term-good --term-old --term-bad --term-new" return ;; + visualize) + __git_complete_log_opts + return + ;; bad|new|"$term_bad"|good|old|"$term_good"|reset|skip) __git_complete_refs ;; diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh index 8fcd1cfa7ef8fc..b989388e7ea731 100755 --- a/t/t9902-completion.sh +++ b/t/t9902-completion.sh @@ -1361,6 +1361,21 @@ test_expect_success 'git-bisect - options to terms subcommand are candidates' ' ) ' +test_expect_success 'git-bisect - git-log options to visualize subcommand are candidates' ' + ( + cd git-bisect && + # The completion used for git-log and here does not complete + # every git-log option, so rather than hope to stay in sync + # with exactly what it does we will just spot-test here. + test_completion "git bisect visualize --sta" <<-\EOF && + --stat Z + EOF + test_completion "git bisect visualize --summar" <<-\EOF + --summary Z + EOF + ) +' + test_expect_success 'git checkout - completes refs and unique remote branches for DWIM' ' test_completion "git checkout " <<-\EOF HEAD Z From d8e08f0717c17b2ee629c50844c34adc83575ad0 Mon Sep 17 00:00:00 2001 From: Britton Leo Kerin Date: Tue, 6 Feb 2024 12:50:48 -0900 Subject: [PATCH 28/51] completion: bisect: recognize but do not complete view subcommand The "view" alias for the visualize subcommand is neither completed nor recognized. It's undesirable to complete it because it's first letters are the same as for visualize, making completion less rather than more efficient without adding much in the way of interface discovery. However, it needs to be recognized in order to enable log option completion for it. Recognize but do not complete the view command by creating and using separate lists of completable_subcommands and all_subcommands. Add tests. Signed-off-by: Britton Leo Kerin Signed-off-by: Junio C Hamano --- contrib/completion/git-completion.bash | 15 +++++++++++---- t/t9902-completion.sh | 24 ++++++++++++++++++++++++ 2 files changed, 35 insertions(+), 4 deletions(-) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 5337ae4ce2b798..0734debc11294a 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -1462,12 +1462,19 @@ _git_bisect () # more usual bad/new/good/old because git bisect gives a good error # message if these are given when not in use, and that's better than # silent refusal to complete if the user is confused. - local subcommands="start bad new $term_bad good old $term_good terms skip reset visualize replay log run help" - local subcommand="$(__git_find_on_cmdline "$subcommands")" + # + # We want to recognize 'view' but not complete it, because it overlaps + # with 'visualize' too much and is just an alias for it. + # + local completable_subcommands="start bad new $term_bad good old $term_good terms skip reset visualize replay log run help" + local all_subcommands="$completable_subcommands view" + + local subcommand="$(__git_find_on_cmdline "$all_subcommands")" + if [ -z "$subcommand" ]; then __git_find_repo_path if [ -f "$__git_repo_path"/BISECT_START ]; then - __gitcomp "$subcommands" + __gitcomp "$completable_subcommands" else __gitcomp "replay start" fi @@ -1490,7 +1497,7 @@ _git_bisect () __gitcomp "--term-good --term-old --term-bad --term-new" return ;; - visualize) + visualize|view) __git_complete_log_opts return ;; diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh index b989388e7ea731..70557eb6844259 100755 --- a/t/t9902-completion.sh +++ b/t/t9902-completion.sh @@ -1376,6 +1376,30 @@ test_expect_success 'git-bisect - git-log options to visualize subcommand are ca ) ' +test_expect_success 'git-bisect - view subcommand is not a candidate' ' + ( + cd git-bisect && + test_completion "git bisect vi" <<-\EOF + visualize Z + EOF + ) +' + +test_expect_success 'git-bisect - existing view subcommand is recognized and enables completion of git-log options' ' + ( + cd git-bisect && + # The completion used for git-log and here does not complete + # every git-log option, so rather than hope to stay in sync + # with exactly what it does we will just spot-test here. + test_completion "git bisect view --sta" <<-\EOF && + --stat Z + EOF + test_completion "git bisect view --summar" <<-\EOF + --summary Z + EOF + ) +' + test_expect_success 'git checkout - completes refs and unique remote branches for DWIM' ' test_completion "git checkout " <<-\EOF HEAD Z From 1dbe40156307763255d96653cc853ef4ff841920 Mon Sep 17 00:00:00 2001 From: Phillip Wood Date: Wed, 7 Feb 2024 16:44:35 +0000 Subject: [PATCH 29/51] show-ref --verify: accept pseudorefs "git show-ref --verify" is useful for scripts that want to look up a fully qualified refname without falling back to the DWIM rules used by "git rev-parse" rules when the ref does not exist. Currently it will only accept "HEAD" or a refname beginning with "refs/". Running git show-ref --verify CHERRY_PICK_HEAD will always result in fatal: 'CHERRY_PICK_HEAD' - not a valid ref even when CHERRY_PICK_HEAD exists. By calling refname_is_safe() instead of comparing the refname to "HEAD" we can accept all one-level refs that contain only uppercase ascii letters and underscores. Signed-off-by: Phillip Wood Signed-off-by: Junio C Hamano --- builtin/show-ref.c | 2 +- t/t1403-show-ref.sh | 8 ++++++++ 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/builtin/show-ref.c b/builtin/show-ref.c index 79955c2856eace..1c15421e6008e8 100644 --- a/builtin/show-ref.c +++ b/builtin/show-ref.c @@ -172,7 +172,7 @@ static int cmd_show_ref__verify(const struct show_one_options *show_one_opts, while (*refs) { struct object_id oid; - if ((starts_with(*refs, "refs/") || !strcmp(*refs, "HEAD")) && + if ((starts_with(*refs, "refs/") || refname_is_safe(*refs)) && !read_ref(*refs, &oid)) { show_one(show_one_opts, *refs, &oid); } diff --git a/t/t1403-show-ref.sh b/t/t1403-show-ref.sh index d0a8f7b121cd87..33fb7a38ffff84 100755 --- a/t/t1403-show-ref.sh +++ b/t/t1403-show-ref.sh @@ -174,6 +174,14 @@ test_expect_success 'show-ref --verify HEAD' ' test_must_be_empty actual ' +test_expect_success 'show-ref --verify pseudorefs' ' + git update-ref CHERRY_PICK_HEAD HEAD $ZERO_OID && + test_when_finished "git update-ref -d CHERRY_PICK_HEAD" && + git show-ref -s --verify HEAD >actual && + git show-ref -s --verify CHERRY_PICK_HEAD >expect && + test_cmp actual expect +' + test_expect_success 'show-ref --verify with dangling ref' ' sha1_file() { echo "$*" | sed "s#..#.git/objects/&/#" From 1af410d455c5cddae1cdea3e5333ffd6ab616045 Mon Sep 17 00:00:00 2001 From: Phillip Wood Date: Wed, 7 Feb 2024 16:44:36 +0000 Subject: [PATCH 30/51] t1400: use show-ref to check pseudorefs Now that "git show-ref --verify" accepts pseudorefs use that in preference to "git rev-parse" when checking pseudorefs as we do when checking branches etc. Signed-off-by: Phillip Wood Signed-off-by: Junio C Hamano --- t/t1400-update-ref.sh | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh index f18843bf7aa293..78a09abc35e8b5 100755 --- a/t/t1400-update-ref.sh +++ b/t/t1400-update-ref.sh @@ -524,51 +524,51 @@ test_expect_success 'given old value for missing pseudoref, do not create' ' test_expect_success 'create pseudoref' ' git update-ref PSEUDOREF $A && - test $A = $(git rev-parse PSEUDOREF) + test $A = $(git show-ref -s --verify PSEUDOREF) ' test_expect_success 'overwrite pseudoref with no old value given' ' git update-ref PSEUDOREF $B && - test $B = $(git rev-parse PSEUDOREF) + test $B = $(git show-ref -s --verify PSEUDOREF) ' test_expect_success 'overwrite pseudoref with correct old value' ' git update-ref PSEUDOREF $C $B && - test $C = $(git rev-parse PSEUDOREF) + test $C = $(git show-ref -s --verify PSEUDOREF) ' test_expect_success 'do not overwrite pseudoref with wrong old value' ' test_must_fail git update-ref PSEUDOREF $D $E 2>err && - test $C = $(git rev-parse PSEUDOREF) && + test $C = $(git show-ref -s --verify PSEUDOREF) && test_grep "cannot lock ref.*expected" err ' test_expect_success 'delete pseudoref' ' git update-ref -d PSEUDOREF && - test_must_fail git rev-parse PSEUDOREF + test_must_fail git show-ref -s --verify PSEUDOREF ' test_expect_success 'do not delete pseudoref with wrong old value' ' git update-ref PSEUDOREF $A && test_must_fail git update-ref -d PSEUDOREF $B 2>err && - test $A = $(git rev-parse PSEUDOREF) && + test $A = $(git show-ref -s --verify PSEUDOREF) && test_grep "cannot lock ref.*expected" err ' test_expect_success 'delete pseudoref with correct old value' ' git update-ref -d PSEUDOREF $A && - test_must_fail git rev-parse PSEUDOREF + test_must_fail git show-ref -s --verify PSEUDOREF ' test_expect_success 'create pseudoref with old OID zero' ' git update-ref PSEUDOREF $A $Z && - test $A = $(git rev-parse PSEUDOREF) + test $A = $(git show-ref -s --verify PSEUDOREF) ' test_expect_success 'do not overwrite pseudoref with old OID zero' ' test_when_finished git update-ref -d PSEUDOREF && test_must_fail git update-ref PSEUDOREF $B $Z 2>err && - test $A = $(git rev-parse PSEUDOREF) && + test $A = $(git show-ref -s --verify PSEUDOREF) && test_grep "already exists" err ' From abfbff61efc4d91dc964eb2360760fa640ad0f0f Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Wed, 7 Feb 2024 10:46:54 -0800 Subject: [PATCH 31/51] tag: fix sign_buffer() call to create a signed tag The command "git tag -s" internally calls sign_buffer() to make a cryptographic signature using the chosen backend like GPG and SSH. The internal helper functions used by "git tag" implementation seem to use a "negative return values are errors, zero or positive return values are not" convention, and there are places (e.g., verify_tag() that calls gpg_verify_tag()) that these internal helper functions translate return values that signal errors to conform to this convention, but do_sign() that calls sign_buffer() forgets to do so. Fix it, so that a failed call to sign_buffer() that can return the exit status from pipe_command() will not be overlooked. Reported-by: Sergey Kosukhin Signed-off-by: Junio C Hamano --- builtin/tag.c | 2 +- gpg-interface.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/builtin/tag.c b/builtin/tag.c index 3918eacbb57bd9..b28ead06eac90e 100644 --- a/builtin/tag.c +++ b/builtin/tag.c @@ -176,7 +176,7 @@ static int verify_tag(const char *name, const char *ref UNUSED, static int do_sign(struct strbuf *buffer) { - return sign_buffer(buffer, buffer, get_signing_key()); + return sign_buffer(buffer, buffer, get_signing_key()) ? -1 : 0; } static const char tag_template[] = diff --git a/gpg-interface.h b/gpg-interface.h index 143cdc1c02d4b3..7cd98161f7f21b 100644 --- a/gpg-interface.h +++ b/gpg-interface.h @@ -66,7 +66,7 @@ size_t parse_signed_buffer(const char *buf, size_t size); * Create a detached signature for the contents of "buffer" and append * it after "signature"; "buffer" and "signature" can be the same * strbuf instance, which would cause the detached signature appended - * at the end. + * at the end. Returns 0 on success, non-zero on failure. */ int sign_buffer(struct strbuf *buffer, struct strbuf *signature, const char *signing_key); From 47ac5f6e1a0125965f6907dc55c130615cf131e1 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Wed, 7 Feb 2024 13:44:35 -0800 Subject: [PATCH 32/51] bisect: document "terms" subcommand more fully The documentation for "git bisect terms", although it did not hide any information, was a bit incomplete and forced readers to fill in the blanks to get the complete picture. Acked-by: Matthieu Moy Signed-off-by: Junio C Hamano --- Documentation/git-bisect.txt | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/Documentation/git-bisect.txt b/Documentation/git-bisect.txt index fbb39fbdf5d62a..3d813f9c779b9b 100644 --- a/Documentation/git-bisect.txt +++ b/Documentation/git-bisect.txt @@ -20,7 +20,7 @@ on the subcommand: [--no-checkout] [--first-parent] [ [...]] [--] [...] git bisect (bad|new|) [] git bisect (good|old|) [...] - git bisect terms [--term-good | --term-bad] + git bisect terms [--term-(good|old) | --term-(bad|new)] git bisect skip [(|)...] git bisect reset [] git bisect (visualize|view) @@ -165,8 +165,10 @@ To get a reminder of the currently used terms, use git bisect terms ------------------------------------------------ -You can get just the old (respectively new) term with `git bisect terms ---term-old` or `git bisect terms --term-good`. +You can get just the old term with `git bisect terms --term-old` +or `git bisect terms --term-good`; `git bisect terms --term-new` +and `git bisect terms --term-bad` can be used to learn how to call +the commits more recent than the sought change. If you would like to use your own terms instead of "bad"/"good" or "new"/"old", you can choose any names you like (except existing bisect From 841dbd40a350b6bde402ca36a56c815f13f20480 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Wed, 7 Feb 2024 13:44:36 -0800 Subject: [PATCH 33/51] bisect: document command line arguments for "bisect start" The syntax commonly used for alternatives is --opt-(a|b), not --opt-{a,b}. List bad/new and good/old consistently in this order, to be consistent with the description for "git bisect terms". Clarify to either or to make them consistent with the description of "git bisect (good|bad)" subcommands. Suggested-by: Matthieu Moy Signed-off-by: Junio C Hamano --- Documentation/git-bisect.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-bisect.txt b/Documentation/git-bisect.txt index 3d813f9c779b9b..73f889b97b8b6c 100644 --- a/Documentation/git-bisect.txt +++ b/Documentation/git-bisect.txt @@ -16,7 +16,7 @@ DESCRIPTION The command takes various subcommands, and different options depending on the subcommand: - git bisect start [--term-{new,bad}= --term-{old,good}=] + git bisect start [--term-(bad|new)= --term-(good|old)=] [--no-checkout] [--first-parent] [ [...]] [--] [...] git bisect (bad|new|) [] git bisect (good|old|) [...] From 6931049c32ca0271ba95dcd0f197100d22c8d844 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Wed, 7 Feb 2024 21:29:00 -0800 Subject: [PATCH 34/51] ssh signing: signal an error with a negative return value The other backend for the sign_buffer() function followed our usual "an error is signalled with a negative return" convention, but the SSH signer did not. Even though we already fixed the caller that assumed only a negative return value is an error, tighten the callee to signal an error with a negative return as well. This way, the callees will be strict on what they produce, while the callers will be lenient in what they accept. Signed-off-by: Junio C Hamano --- gpg-interface.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gpg-interface.c b/gpg-interface.c index 48f43c5a21d345..e19a69c40032b2 100644 --- a/gpg-interface.c +++ b/gpg-interface.c @@ -1088,7 +1088,7 @@ static int sign_buffer_ssh(struct strbuf *buffer, struct strbuf *signature, if (strstr(signer_stderr.buf, "usage:")) error(_("ssh-keygen -Y sign is needed for ssh signing (available in openssh version 8.2p1+)")); - error("%s", signer_stderr.buf); + ret = error("%s", signer_stderr.buf); goto out; } From 46176d77c9e0b219a78d64cb99d8dbf4d9bcde2d Mon Sep 17 00:00:00 2001 From: Victoria Dye Date: Thu, 8 Feb 2024 01:57:19 +0000 Subject: [PATCH 35/51] ref-filter.c: sort formatted dates by byte value Update the ref sorting functions of 'ref-filter.c' so that when date fields are specified with a format string (such as in 'git for-each-ref --sort=creatordate:'), they are sorted by their formatted string value rather than by the underlying numeric timestamp. Currently, date fields are always sorted by timestamp, regardless of whether formatting information is included in the '--sort' key. Leaving the default (unformatted) date sorting unchanged, sorting by the formatted date string adds some flexibility to 'for-each-ref' by allowing for behavior like "sort by year, then by refname within each year" or "sort by time of day". Because the inclusion of a format string previously had no effect on sort behavior, this change likely will not affect existing usage of 'for-each-ref' or other ref listing commands. Additionally, update documentation & tests to document the new sorting mechanism. Signed-off-by: Victoria Dye Signed-off-by: Junio C Hamano --- Documentation/git-for-each-ref.txt | 8 ++++-- ref-filter.c | 6 ++++ t/t6300-for-each-ref.sh | 46 ++++++++++++++++++++++++++++++ 3 files changed, 57 insertions(+), 3 deletions(-) diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt index be9543f6840be3..3a9ad91b7af89a 100644 --- a/Documentation/git-for-each-ref.txt +++ b/Documentation/git-for-each-ref.txt @@ -359,9 +359,11 @@ In any case, a field name that refers to a field inapplicable to the object referred by the ref does not cause an error. It returns an empty string instead. -As a special case for the date-type fields, you may specify a format for -the date by adding `:` followed by date format name (see the -values the `--date` option to linkgit:git-rev-list[1] takes). +As a special case for the date-type fields, you may specify a format for the +date by adding `:` followed by date format name (see the values the `--date` +option to linkgit:git-rev-list[1] takes). If this formatting is provided in +a `--sort` key, references will be sorted according to the byte-value of the +formatted string rather than the numeric value of the underlying timestamp. Some atoms like %(align) and %(if) always require a matching %(end). We call them "opening atoms" and sometimes denote them as %($open). diff --git a/ref-filter.c b/ref-filter.c index 35b989e1dfe59e..be14b56e32489d 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -1611,6 +1611,12 @@ static void grab_date(const char *buf, struct atom_value *v, const char *atomnam if (formatp) { formatp++; parse_date_format(formatp, &date_mode); + + /* + * If this is a sort field and a format was specified, we'll + * want to compare formatted date by string value. + */ + v->atom->type = FIELD_STR; } if (!eoemail) diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh index 843a7fe14313b5..eb6c8204e8bd0e 100755 --- a/t/t6300-for-each-ref.sh +++ b/t/t6300-for-each-ref.sh @@ -1356,6 +1356,52 @@ test_expect_success '--no-sort without subsequent --sort prints expected refs' ' test_cmp expected actual ' +test_expect_success 'set up custom date sorting' ' + # Dates: + # - Wed Feb 07 2024 21:34:20 +0000 + # - Tue Dec 14 1999 00:05:22 +0000 + # - Fri Jun 04 2021 11:26:51 +0000 + # - Mon Jan 22 2007 16:44:01 GMT+0000 + i=1 && + for when in 1707341660 945129922 1622806011 1169484241 + do + GIT_COMMITTER_DATE="@$when +0000" \ + GIT_COMMITTER_EMAIL="user@example.com" \ + git tag -m "tag $when" custom-dates-$i && + i=$(($i+1)) || return 1 + done +' + +test_expect_success 'sort by date defaults to full timestamp' ' + cat >expected <<-\EOF && + 945129922 refs/tags/custom-dates-2 + 1169484241 refs/tags/custom-dates-4 + 1622806011 refs/tags/custom-dates-3 + 1707341660 refs/tags/custom-dates-1 + EOF + + git for-each-ref \ + --format="%(creatordate:unix) %(refname)" \ + --sort=creatordate \ + "refs/tags/custom-dates-*" >actual && + test_cmp expected actual +' + +test_expect_success 'sort by custom date format' ' + cat >expected <<-\EOF && + 00:05:22 refs/tags/custom-dates-2 + 11:26:51 refs/tags/custom-dates-3 + 16:44:01 refs/tags/custom-dates-4 + 21:34:20 refs/tags/custom-dates-1 + EOF + + git for-each-ref \ + --format="%(creatordate:format:%H:%M:%S) %(refname)" \ + --sort="creatordate:format:%H:%M:%S" \ + "refs/tags/custom-dates-*" >actual && + test_cmp expected actual +' + test_expect_success 'do not dereference NULL upon %(HEAD) on unborn branch' ' test_when_finished "git checkout main" && git for-each-ref --format="%(HEAD) %(refname:short)" refs/heads/ >actual && From e4301f73fffacf2ba9fb2042b95de32880381e72 Mon Sep 17 00:00:00 2001 From: Vegard Nossum Date: Fri, 2 Feb 2024 10:18:50 +0100 Subject: [PATCH 36/51] sequencer: unset GIT_CHERRY_PICK_HELP for 'exec' commands Running "git cherry-pick" as an x-command in the rebase plan loses the original authorship information. To fix this, unset GIT_CHERRY_PICK_HELP for 'exec' commands. Helped-by: Phillip Wood Signed-off-by: Vegard Nossum Signed-off-by: Junio C Hamano --- sequencer.c | 1 + t/t3404-rebase-interactive.sh | 12 ++++++++++++ 2 files changed, 13 insertions(+) diff --git a/sequencer.c b/sequencer.c index d584cac8ed9307..ed30ceaf8b9eaf 100644 --- a/sequencer.c +++ b/sequencer.c @@ -3647,6 +3647,7 @@ static int do_exec(struct repository *r, const char *command_line) fprintf(stderr, _("Executing: %s\n"), command_line); cmd.use_shell = 1; strvec_push(&cmd.args, command_line); + strvec_push(&cmd.env, "GIT_CHERRY_PICK_HELP"); status = run_command(&cmd); /* force re-reading of the cache */ diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh index c5f30554c6f356..84a92d6da08bf7 100755 --- a/t/t3404-rebase-interactive.sh +++ b/t/t3404-rebase-interactive.sh @@ -153,6 +153,18 @@ test_expect_success 'rebase -i with the exec command checks tree cleanness' ' git rebase --continue ' +test_expect_success 'cherry-pick works with rebase --exec' ' + test_when_finished "git cherry-pick --abort; \ + git rebase --abort; \ + git checkout primary" && + echo "exec git cherry-pick G" >todo && + ( + set_replace_editor todo && + test_must_fail git rebase -i D D + ) && + test_cmp_rev G CHERRY_PICK_HEAD +' + test_expect_success 'rebase -x with empty command fails' ' test_when_finished "git rebase --abort ||:" && test_must_fail env git rebase -x "" @ 2>actual && From bc7f5db896f59275fb0e4093dfd6891bfcece63d Mon Sep 17 00:00:00 2001 From: Phillip Wood Date: Fri, 9 Feb 2024 16:19:39 +0000 Subject: [PATCH 37/51] prune: mark rebase autostash and orig-head as reachable Rebase records the oid of HEAD before rebasing and the commit created by "--autostash" in files in the rebase state directory. This means that the autostash commit is never reachable from any ref or reflog and when rebasing a detached HEAD the original HEAD can become unreachable if the user expires HEAD's the reflog while the rebase is running. Fix this by reading the relevant files when marking reachable commits. Note that it is possible for the commit recorded in .git/rebase-merge/amend to be unreachable but pruning that object does not affect the operation of "git rebase --continue" as we're only interested in the object id, not in the object itself. Reported-by: Orgad Shaneh Signed-off-by: Phillip Wood Signed-off-by: Junio C Hamano --- reachable.c | 50 +++++++++++++++++++++++++++++++++++++ t/t3407-rebase-abort.sh | 17 ++++++++++++- t/t3420-rebase-autostash.sh | 10 ++++++++ 3 files changed, 76 insertions(+), 1 deletion(-) diff --git a/reachable.c b/reachable.c index f29b06a5d059e2..3b85add243ba79 100644 --- a/reachable.c +++ b/reachable.c @@ -17,6 +17,7 @@ #include "pack-mtimes.h" #include "config.h" #include "run-command.h" +#include "sequencer.h" struct connectivity_progress { struct progress *progress; @@ -30,6 +31,52 @@ static void update_progress(struct connectivity_progress *cp) display_progress(cp->progress, cp->count); } +static void add_one_file(const char *path, struct rev_info *revs) +{ + struct strbuf buf = STRBUF_INIT; + struct object_id oid; + struct object *object; + + if (!read_oneliner(&buf, path, READ_ONELINER_SKIP_IF_EMPTY)) { + strbuf_release(&buf); + return; + } + strbuf_trim(&buf); + if (!get_oid_hex(buf.buf, &oid)) { + object = parse_object_or_die(&oid, buf.buf); + add_pending_object(revs, object, ""); + } + strbuf_release(&buf); +} + +/* Mark objects recorded in rebase state files as reachable. */ +static void add_rebase_files(struct rev_info *revs) +{ + struct strbuf buf = STRBUF_INIT; + size_t len; + const char *path[] = { + "rebase-apply/autostash", + "rebase-apply/orig-head", + "rebase-merge/autostash", + "rebase-merge/orig-head", + }; + struct worktree **worktrees = get_worktrees(); + + for (struct worktree **wt = worktrees; *wt; wt++) { + strbuf_reset(&buf); + strbuf_addstr(&buf, get_worktree_git_dir(*wt)); + strbuf_complete(&buf, '/'); + len = buf.len; + for (size_t i = 0; i < ARRAY_SIZE(path); i++) { + strbuf_setlen(&buf, len); + strbuf_addstr(&buf, path[i]); + add_one_file(buf.buf, revs); + } + } + strbuf_release(&buf); + free_worktrees(worktrees); +} + static int add_one_ref(const char *path, const struct object_id *oid, int flag, void *cb_data) { @@ -322,6 +369,9 @@ void mark_reachable_objects(struct rev_info *revs, int mark_reflog, head_ref(add_one_ref, revs); other_head_refs(add_one_ref, revs); + /* rebase autostash and orig-head */ + add_rebase_files(revs); + /* Add all reflog info */ if (mark_reflog) add_reflogs_to_pending(revs, 0); diff --git a/t/t3407-rebase-abort.sh b/t/t3407-rebase-abort.sh index ebbaed147a6ce2..9f49c4228b6295 100755 --- a/t/t3407-rebase-abort.sh +++ b/t/t3407-rebase-abort.sh @@ -40,9 +40,24 @@ testrebase() { test_path_is_missing "$state_dir" ' + test_expect_success "pre rebase$type head is marked as reachable" ' + # Clean up the state from the previous one + git checkout -f --detach pre-rebase && + test_tick && + git commit --amend --only -m "reworded" && + orig_head=$(git rev-parse HEAD) && + test_must_fail git rebase$type main && + # Stop ORIG_HEAD marking $state_dir/orig-head as reachable + git update-ref -d ORIG_HEAD && + git reflog expire --expire="$GIT_COMMITTER_DATE" --all && + git prune --expire=now && + git rebase --abort && + test_cmp_rev $orig_head HEAD + ' + test_expect_success "rebase$type --abort after --skip" ' # Clean up the state from the previous one - git reset --hard pre-rebase && + git checkout -B to-rebase pre-rebase && test_must_fail git rebase$type main && test_path_is_dir "$state_dir" && test_must_fail git rebase --skip && diff --git a/t/t3420-rebase-autostash.sh b/t/t3420-rebase-autostash.sh index 693934ee8be960..1a820f148155cb 100755 --- a/t/t3420-rebase-autostash.sh +++ b/t/t3420-rebase-autostash.sh @@ -333,4 +333,14 @@ test_expect_success 'never change active branch' ' test_cmp_rev not-the-feature-branch unrelated-onto-branch ' +test_expect_success 'autostash commit is marked as reachable' ' + echo changed >file0 && + git rebase --autostash --exec "git prune --expire=now" \ + feature-branch^ feature-branch && + # git rebase succeeds if the stash cannot be applied so we need to check + # the contents of file0 + echo changed >expect && + test_cmp expect file0 +' + test_done From f51d790b67255dd40b55654527fc4f59a53f44b0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= Date: Fri, 9 Feb 2024 21:36:40 +0100 Subject: [PATCH 38/51] receive-pack: use find_commit_header() in check_cert_push_options() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Use the public function find_commit_header() instead of find_header() to simplify the code. This is possible and safe because we're operating on a strbuf, which is always NUL-terminated, so there is no risk of running over the end of the buffer. It cannot contain NUL within the buffer, as it is built using strbuf_addstr(), only. The string comparison becomes more complicated because we need to check for NUL explicitly after comparing the length-limited option, but on the flip side we don't need to clean up allocations or track the remaining buffer length. Signed-off-by: René Scharfe Signed-off-by: Junio C Hamano --- builtin/receive-pack.c | 24 +++++++++--------------- 1 file changed, 9 insertions(+), 15 deletions(-) diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index e36b1d619f5c07..8ebb3a872f1ca6 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -718,35 +718,29 @@ static const char *check_nonce(const char *buf, size_t len) static int check_cert_push_options(const struct string_list *push_options) { const char *buf = push_cert.buf; - int len = push_cert.len; - char *option; - const char *next_line; + const char *option; + size_t optionlen; int options_seen = 0; int retval = 1; - if (!len) + if (!*buf) return 1; - while ((option = find_header(buf, len, "push-option", &next_line))) { - len -= (next_line - buf); - buf = next_line; + while ((option = find_commit_header(buf, "push-option", &optionlen))) { + buf = option + optionlen + 1; options_seen++; if (options_seen > push_options->nr - || strcmp(option, - push_options->items[options_seen - 1].string)) { - retval = 0; - goto leave; - } - free(option); + || strncmp(push_options->items[options_seen - 1].string, + option, optionlen) + || push_options->items[options_seen - 1].string[optionlen]) + return 0; } if (options_seen != push_options->nr) retval = 0; -leave: - free(option); return retval; } From f66286364f7875f9b3e3ac63733dd98e396fb7b8 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Sun, 11 Feb 2024 07:58:04 -0800 Subject: [PATCH 39/51] unit-tests: do show relative file paths on non-Windows, too There are compilers other than Visual C that want to show absolute paths. Generalize the helper introduced by a2c5e294 (unit-tests: do show relative file paths, 2023-09-25) so that it can also work with a path that uses slash as the directory separator, and becomes almost no-op once one-time preparation finds out that we are using a compiler that already gives relative paths. Incidentally, this also should do the right thing on Windows with a compiler that shows relative paths but with backslash as the directory separator (if such a thing exists and is used to build git). Reported-by: Randall S. Becker Helped-by: Phillip Wood Signed-off-by: Junio C Hamano --- t/unit-tests/test-lib.c | 61 +++++++++++++++++++++++++++++++---------- 1 file changed, 47 insertions(+), 14 deletions(-) diff --git a/t/unit-tests/test-lib.c b/t/unit-tests/test-lib.c index 7bf9dfdb959551..66d6980ffbefd8 100644 --- a/t/unit-tests/test-lib.c +++ b/t/unit-tests/test-lib.c @@ -21,12 +21,11 @@ static struct { .result = RESULT_NONE, }; -#ifndef _MSC_VER -#define make_relative(location) location -#else /* * Visual C interpolates the absolute Windows path for `__FILE__`, * but we want to see relative paths, as verified by t0080. + * There are other compilers that do the same, and are not for + * Windows. */ #include "dir.h" @@ -34,32 +33,66 @@ static const char *make_relative(const char *location) { static char prefix[] = __FILE__, buf[PATH_MAX], *p; static size_t prefix_len; + static int need_bs_to_fs = -1; - if (!prefix_len) { + /* one-time preparation */ + if (need_bs_to_fs < 0) { size_t len = strlen(prefix); - const char *needle = "\\t\\unit-tests\\test-lib.c"; + char needle[] = "t\\unit-tests\\test-lib.c"; size_t needle_len = strlen(needle); - if (len < needle_len || strcmp(needle, prefix + len - needle_len)) - die("unexpected suffix of '%s'", prefix); + if (len < needle_len) + die("unexpected prefix '%s'", prefix); + + /* + * The path could be relative (t/unit-tests/test-lib.c) + * or full (/home/user/git/t/unit-tests/test-lib.c). + * Check the slash between "t" and "unit-tests". + */ + prefix_len = len - needle_len; + if (prefix[prefix_len + 1] == '/') { + /* Oh, we're not Windows */ + for (size_t i = 0; i < needle_len; i++) + if (needle[i] == '\\') + needle[i] = '/'; + need_bs_to_fs = 0; + } else { + need_bs_to_fs = 1; + } - /* let it end in a directory separator */ - prefix_len = len - needle_len + 1; + /* + * prefix_len == 0 if the compiler gives paths relative + * to the root of the working tree. Otherwise, we want + * to see that we did find the needle[] at a directory + * boundary. Again we rely on that needle[] begins with + * "t" followed by the directory separator. + */ + if (fspathcmp(needle, prefix + prefix_len) || + (prefix_len && prefix[prefix_len - 1] != needle[1])) + die("unexpected suffix of '%s'", prefix); } - /* Does it not start with the expected prefix? */ - if (fspathncmp(location, prefix, prefix_len)) + /* + * Does it not start with the expected prefix? + * Return it as-is without making it worse. + */ + if (prefix_len && fspathncmp(location, prefix, prefix_len)) return location; - strlcpy(buf, location + prefix_len, sizeof(buf)); + /* + * If we do not need to munge directory separator, we can return + * the substring at the tail of the location. + */ + if (!need_bs_to_fs) + return location + prefix_len; + /* convert backslashes to forward slashes */ + strlcpy(buf, location + prefix_len, sizeof(buf)); for (p = buf; *p; p++) if (*p == '\\') *p = '/'; - return buf; } -#endif static void msg_with_prefix(const char *prefix, const char *format, va_list ap) { From 820a3400851a018883a7396867aa679c3acb5568 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sun, 11 Feb 2024 12:11:28 +0000 Subject: [PATCH 40/51] ci: bump remaining outdated Actions versions MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit After activating automatic Dependabot updates in the git-for-windows/git repository, Dependabot noticed a couple of yet-unaddressed updates. They avoid "Node.js 16 Actions" deprecation messages by bumping the following Actions' versions: - actions/upload-artifact from 3 to 4 - actions/download-artifact from 3 to 4 - actions/cache from 3 to 4 Helped-by: Matthias Aßhauer Signed-off-by: Johannes Schindelin Signed-off-by: Junio C Hamano --- .github/workflows/coverity.yml | 4 ++-- .github/workflows/main.yml | 16 ++++++++-------- 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/.github/workflows/coverity.yml b/.github/workflows/coverity.yml index a81a7566d10735..53cf12fe044425 100644 --- a/.github/workflows/coverity.yml +++ b/.github/workflows/coverity.yml @@ -98,7 +98,7 @@ jobs: # A cache miss will add ~30s to create, but a cache hit will save minutes. - name: restore the Coverity Build Tool id: cache - uses: actions/cache/restore@v3 + uses: actions/cache/restore@v4 with: path: ${{ runner.temp }}/cov-analysis key: cov-build-${{ env.COVERITY_LANGUAGE }}-${{ env.COVERITY_PLATFORM }}-${{ steps.lookup.outputs.hash }} @@ -141,7 +141,7 @@ jobs: esac - name: cache the Coverity Build Tool if: steps.cache.outputs.cache-hit != 'true' - uses: actions/cache/save@v3 + uses: actions/cache/save@v4 with: path: ${{ runner.temp }}/cov-analysis key: cov-build-${{ env.COVERITY_LANGUAGE }}-${{ env.COVERITY_PLATFORM }}-${{ steps.lookup.outputs.hash }} diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index bb857bdaf08596..ec25f6f99deaf1 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -123,7 +123,7 @@ jobs: - name: zip up tracked files run: git archive -o artifacts/tracked.tar.gz HEAD - name: upload tracked files and build artifacts - uses: actions/upload-artifact@v3 + uses: actions/upload-artifact@v4 with: name: windows-artifacts path: artifacts @@ -140,7 +140,7 @@ jobs: cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - name: download tracked files and build artifacts - uses: actions/download-artifact@v3 + uses: actions/download-artifact@v4 with: name: windows-artifacts path: ${{github.workspace}} @@ -157,7 +157,7 @@ jobs: run: ci/print-test-failures.sh - name: Upload failed tests' directories if: failure() && env.FAILED_TEST_ARTIFACTS != '' - uses: actions/upload-artifact@v3 + uses: actions/upload-artifact@v4 with: name: failed-tests-windows path: ${{env.FAILED_TEST_ARTIFACTS}} @@ -212,7 +212,7 @@ jobs: - name: zip up tracked files run: git archive -o artifacts/tracked.tar.gz HEAD - name: upload tracked files and build artifacts - uses: actions/upload-artifact@v3 + uses: actions/upload-artifact@v4 with: name: vs-artifacts path: artifacts @@ -230,7 +230,7 @@ jobs: steps: - uses: git-for-windows/setup-git-for-windows-sdk@v1 - name: download tracked files and build artifacts - uses: actions/download-artifact@v3 + uses: actions/download-artifact@v4 with: name: vs-artifacts path: ${{github.workspace}} @@ -248,7 +248,7 @@ jobs: run: ci/print-test-failures.sh - name: Upload failed tests' directories if: failure() && env.FAILED_TEST_ARTIFACTS != '' - uses: actions/upload-artifact@v3 + uses: actions/upload-artifact@v4 with: name: failed-tests-windows path: ${{env.FAILED_TEST_ARTIFACTS}} @@ -305,7 +305,7 @@ jobs: run: ci/print-test-failures.sh - name: Upload failed tests' directories if: failure() && env.FAILED_TEST_ARTIFACTS != '' - uses: actions/upload-artifact@v3 + uses: actions/upload-artifact@v4 with: name: failed-tests-${{matrix.vector.jobname}} path: ${{env.FAILED_TEST_ARTIFACTS}} @@ -353,7 +353,7 @@ jobs: run: ci/print-test-failures.sh - name: Upload failed tests' directories if: failure() && env.FAILED_TEST_ARTIFACTS != '' && matrix.vector.jobname != 'linux32' - uses: actions/upload-artifact@v3 + uses: actions/upload-artifact@v4 with: name: failed-tests-${{matrix.vector.jobname}} path: ${{env.FAILED_TEST_ARTIFACTS}} From 20e0ff883514216f5737a721443ee0dd59dbf68f Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sun, 11 Feb 2024 12:11:29 +0000 Subject: [PATCH 41/51] ci(linux32): add a note about Actions that must not be updated The Docker container used by the `linux32` job comes without Node.js, and therefore the `actions/checkout` and `actions/upload-artifact` Actions cannot be upgraded to the latest versions (because they use Node.js). One time too many, I accidentally tried to update them, where `actions/checkout` at least fails immediately, but the `actions/upload-artifact` step is only used when any test fails, and therefore the CI run usually passes even though that Action was updated to a version that is incompatible with the Docker container in which this job runs. So let's add a big fat warning, mainly for my own benefit, to avoid running into the very same issue over and over again. Signed-off-by: Johannes Schindelin Signed-off-by: Junio C Hamano --- .github/workflows/main.yml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index ec25f6f99deaf1..7bacb322e4fa01 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -344,7 +344,7 @@ jobs: steps: - uses: actions/checkout@v4 if: matrix.vector.jobname != 'linux32' - - uses: actions/checkout@v1 + - uses: actions/checkout@v1 # cannot be upgraded because Node.js Actions aren't supported in this container if: matrix.vector.jobname == 'linux32' - run: ci/install-docker-dependencies.sh - run: ci/run-build-and-tests.sh @@ -359,7 +359,7 @@ jobs: path: ${{env.FAILED_TEST_ARTIFACTS}} - name: Upload failed tests' directories if: failure() && env.FAILED_TEST_ARTIFACTS != '' && matrix.vector.jobname == 'linux32' - uses: actions/upload-artifact@v1 + uses: actions/upload-artifact@v1 # cannot be upgraded because Node.js Actions aren't supported in this container with: name: failed-tests-${{matrix.vector.jobname}} path: ${{env.FAILED_TEST_ARTIFACTS}} From 020456cb744b16f686d9220eb4b95a8e157c3485 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= Date: Fri, 9 Feb 2024 21:41:47 +0100 Subject: [PATCH 42/51] receive-pack: use find_commit_header() in check_nonce() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Use the public function find_commit_header() and remove find_header(), as it becomes unused. This is safe and appropriate because we pass the NUL-terminated payload buffer to check_nonce() instead of its start and length. The underlying strbuf push_cert cannot contain NULs, as it is built using strbuf_addstr(), only. We no longer need to call strlen(), as find_commit_header() returns the length of nonce already. Signed-off-by: René Scharfe Signed-off-by: Junio C Hamano --- builtin/receive-pack.c | 30 +++++++----------------------- 1 file changed, 7 insertions(+), 23 deletions(-) diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index 8ebb3a872f1ca6..db656074857e76 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -593,21 +593,6 @@ static char *prepare_push_cert_nonce(const char *path, timestamp_t stamp) return strbuf_detach(&buf, NULL); } -static char *find_header(const char *msg, size_t len, const char *key, - const char **next_line) -{ - size_t out_len; - const char *val = find_header_mem(msg, len, key, &out_len); - - if (!val) - return NULL; - - if (next_line) - *next_line = val + out_len + 1; - - return xmemdupz(val, out_len); -} - /* * Return zero if a and b are equal up to n bytes and nonzero if they are not. * This operation is guaranteed to run in constant time to avoid leaking data. @@ -622,13 +607,14 @@ static int constant_memequal(const char *a, const char *b, size_t n) return res; } -static const char *check_nonce(const char *buf, size_t len) +static const char *check_nonce(const char *buf) { - char *nonce = find_header(buf, len, "nonce", NULL); + size_t noncelen; + const char *found = find_commit_header(buf, "nonce", &noncelen); + char *nonce = found ? xmemdupz(found, noncelen) : NULL; timestamp_t stamp, ostamp; char *bohmac, *expect = NULL; const char *retval = NONCE_BAD; - size_t noncelen; if (!nonce) { retval = NONCE_MISSING; @@ -670,7 +656,6 @@ static const char *check_nonce(const char *buf, size_t len) goto leave; } - noncelen = strlen(nonce); expect = prepare_push_cert_nonce(service_dir, stamp); if (noncelen != strlen(expect)) { /* This is not even the right size. */ @@ -732,9 +717,8 @@ static int check_cert_push_options(const struct string_list *push_options) buf = option + optionlen + 1; options_seen++; if (options_seen > push_options->nr - || strncmp(push_options->items[options_seen - 1].string, - option, optionlen) - || push_options->items[options_seen - 1].string[optionlen]) + || xstrncmpz(push_options->items[options_seen - 1].string, + option, optionlen)) return 0; } @@ -767,7 +751,7 @@ static void prepare_push_cert_sha1(struct child_process *proc) check_signature(&sigcheck, push_cert.buf + bogs, push_cert.len - bogs); - nonce_status = check_nonce(push_cert.buf, bogs); + nonce_status = check_nonce(sigcheck.payload); } if (!is_null_oid(&push_cert_oid)) { strvec_pushf(&proc->env, "GIT_PUSH_CERT=%s", From 30bd55f901c97c310e674c051f00920f38a66ee0 Mon Sep 17 00:00:00 2001 From: Philippe Blain Date: Sat, 10 Feb 2024 18:32:20 +0000 Subject: [PATCH 43/51] completion: add space after config variable names also in Bash 3 In be6444d1ca (completion: bash: add correct suffix in variables, 2021-08-16), __git_complete_config_variable_name was changed to use "${sfx- }" instead of "$sfx" as the fourth argument of _gitcomp_nl and _gitcomp_nl_append, such that this argument evaluates to a space if sfx is unset. This was to ensure that e.g. git config branch.autoSetupMe[TAB] correctly completes to 'branch.autoSetupMerge ' with the trailing space. This commits notes that the fix only works in Bash 4 because in Bash 3 the 'local sfx' construct at the beginning of __git_complete_config_variable_name creates an empty string. Make the fix also work for Bash 3 by using the "unset or null' parameter expansion syntax ("${sfx:- }"), such that the parameter is also expanded to a space if it is set but null, as is the behaviour of 'local sfx' in Bash 3. Signed-off-by: Philippe Blain Signed-off-by: Junio C Hamano --- contrib/completion/git-completion.bash | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 6662db221df9d5..159a4fd8add4ec 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -2750,7 +2750,7 @@ __git_complete_config_variable_name () local pfx="${cur_%.*}." cur_="${cur_#*.}" __gitcomp_direct "$(__git_heads "$pfx" "$cur_" ".")" - __gitcomp_nl_append $'autoSetupMerge\nautoSetupRebase\n' "$pfx" "$cur_" "${sfx- }" + __gitcomp_nl_append $'autoSetupMerge\nautoSetupRebase\n' "$pfx" "$cur_" "${sfx:- }" return ;; guitool.*.*) @@ -2784,7 +2784,7 @@ __git_complete_config_variable_name () local pfx="${cur_%.*}." cur_="${cur_#*.}" __git_compute_all_commands - __gitcomp_nl "$__git_all_commands" "$pfx" "$cur_" "${sfx- }" + __gitcomp_nl "$__git_all_commands" "$pfx" "$cur_" "${sfx:- }" return ;; remote.*.*) @@ -2800,7 +2800,7 @@ __git_complete_config_variable_name () local pfx="${cur_%.*}." cur_="${cur_#*.}" __gitcomp_nl "$(__git_remotes)" "$pfx" "$cur_" "." - __gitcomp_nl_append "pushDefault" "$pfx" "$cur_" "${sfx- }" + __gitcomp_nl_append "pushDefault" "$pfx" "$cur_" "${sfx:- }" return ;; url.*.*) From b1d0cc68d1921f8d26a947fe662697e42dffb730 Mon Sep 17 00:00:00 2001 From: Philippe Blain Date: Sat, 10 Feb 2024 18:32:21 +0000 Subject: [PATCH 44/51] completion: complete 'submodule.*' config variables In the Bash completion script, function __git_complete_config_variable_name completes config variables and has special logic to deal with config variables involving user-defined names, like branch..* and remote..*. This special logic is missing for submodule-related config variables. Add the appropriate branches to the case statement, making use of the in-tree '.gitmodules' to list relevant submodules. Add corresponding tests in t9902-completion.sh, making sure we complete both first level submodule config variables as well as second level variables involving submodule names. Signed-off-by: Philippe Blain Signed-off-by: Junio C Hamano --- contrib/completion/git-completion.bash | 13 ++++++++++++ t/t9902-completion.sh | 29 ++++++++++++++++++++++++++ 2 files changed, 42 insertions(+) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 159a4fd8add4ec..8af9bc3f4e136a 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -2803,6 +2803,19 @@ __git_complete_config_variable_name () __gitcomp_nl_append "pushDefault" "$pfx" "$cur_" "${sfx:- }" return ;; + submodule.*.*) + local pfx="${cur_%.*}." + cur_="${cur_##*.}" + __gitcomp "url update branch fetchRecurseSubmodules ignore active" "$pfx" "$cur_" "$sfx" + return + ;; + submodule.*) + local pfx="${cur_%.*}." + cur_="${cur_#*.}" + __gitcomp_nl "$(__git config -f "$(__git rev-parse --show-toplevel)/.gitmodules" --get-regexp 'submodule.*.path' | awk -F. '{print $2}')" "$pfx" "$cur_" "." + __gitcomp_nl_append $'alternateErrorStrategy\nfetchJobs\nactive\nalternateLocation\nrecurse\npropagateBranches' "$pfx" "$cur_" "${sfx:- }" + return + ;; url.*.*) local pfx="${cur_%.*}." cur_="${cur_##*.}" diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh index 35eb534fdda273..23d0e71324ccf5 100755 --- a/t/t9902-completion.sh +++ b/t/t9902-completion.sh @@ -2583,6 +2583,35 @@ test_expect_success 'git config - variable name include' ' EOF ' +test_expect_success 'setup for git config submodule tests' ' + test_create_repo sub && + test_commit -C sub initial && + git submodule add ./sub +' + +test_expect_success 'git config - variable name - submodule' ' + test_completion "git config submodule." <<-\EOF + submodule.active Z + submodule.alternateErrorStrategy Z + submodule.alternateLocation Z + submodule.fetchJobs Z + submodule.propagateBranches Z + submodule.recurse Z + submodule.sub.Z + EOF +' + +test_expect_success 'git config - variable name - submodule names' ' + test_completion "git config submodule.sub." <<-\EOF + submodule.sub.url Z + submodule.sub.update Z + submodule.sub.branch Z + submodule.sub.fetchRecurseSubmodules Z + submodule.sub.ignore Z + submodule.sub.active Z + EOF +' + test_expect_success 'git config - value' ' test_completion "git config color.pager " <<-\EOF false Z From 1e0ee4087e5e5f66882ce60dad522b4f49f79eb0 Mon Sep 17 00:00:00 2001 From: Philippe Blain Date: Sat, 10 Feb 2024 18:32:22 +0000 Subject: [PATCH 45/51] completion: add and use __git_compute_first_level_config_vars_for_section The function __git_complete_config_variable_name in the Bash completion script hardcodes several config variable names. These variables are those in config sections where user-defined names can appear, such as "branch.". These sections are treated first by the case statement, and the two last "catch all" cases are used for other sections, making use of the __git_compute_config_vars and __git_compute_config_sections function, which omit listing any variables containing wildcards or placeholders. Having hardcoded config variables introduces the risk of the completion code becoming out of sync with the actual config variables accepted by Git. To avoid these hardcoded config variables, introduce a new function, __git_compute_first_level_config_vars_for_section, making use of the existing __git_config_vars variable. This function takes as argument a config section name and computes the matching "first level" config variables for that section, i.e. those _not_ containing any placeholder, like 'branch.autoSetupMerge, 'remote.pushDefault', etc. Use this function and the variables it defines in the 'branch.*', 'remote.*' and 'submodule.*' switches of the case statement instead of hardcoding the corresponding config variables. Note that we use indirect expansion to create a variable for each section, instead of using a single associative array indexed by section names, because associative arrays are not supported in Bash 3, on which macOS is stuck for licensing reasons. Use the existing pattern in the completion script of using global variables to cache the list of config variables for each section. The rationale for such caching is explained in eaa4e6ee2a (Speed up bash completion loading, 2009-11-17), and the current approach to using and defining them via 'test -n' is explained in cf0ff02a38 (completion: work around zsh option propagation bug, 2012-02-02). Adjust the name of one of the tests added in the previous commit, reflecting that it now also tests the new function. Signed-off-by: Philippe Blain Signed-off-by: Junio C Hamano --- contrib/completion/git-completion.bash | 24 +++++++++++++++++++++--- t/t9902-completion.sh | 2 +- 2 files changed, 22 insertions(+), 4 deletions(-) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 8af9bc3f4e136a..57a8da7ca1ae1b 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -2596,6 +2596,15 @@ __git_compute_config_vars () __git_config_vars="$(git help --config-for-completion)" } +__git_compute_first_level_config_vars_for_section () +{ + local section="$1" + __git_compute_config_vars + local this_section="__git_first_level_config_vars_for_section_${section}" + test -n "${!this_section}" || + printf -v "__git_first_level_config_vars_for_section_${section}" %s "$(echo "$__git_config_vars" | grep -E "^${section}\.[a-z]" | awk -F. '{print $2}')" +} + __git_config_sections= __git_compute_config_sections () { @@ -2749,8 +2758,11 @@ __git_complete_config_variable_name () branch.*) local pfx="${cur_%.*}." cur_="${cur_#*.}" + local section="${pfx%.}" __gitcomp_direct "$(__git_heads "$pfx" "$cur_" ".")" - __gitcomp_nl_append $'autoSetupMerge\nautoSetupRebase\n' "$pfx" "$cur_" "${sfx:- }" + __git_compute_first_level_config_vars_for_section "${section}" + local this_section="__git_first_level_config_vars_for_section_${section}" + __gitcomp_nl_append "${!this_section}" "$pfx" "$cur_" "${sfx:- }" return ;; guitool.*.*) @@ -2799,8 +2811,11 @@ __git_complete_config_variable_name () remote.*) local pfx="${cur_%.*}." cur_="${cur_#*.}" + local section="${pfx%.}" __gitcomp_nl "$(__git_remotes)" "$pfx" "$cur_" "." - __gitcomp_nl_append "pushDefault" "$pfx" "$cur_" "${sfx:- }" + __git_compute_first_level_config_vars_for_section "${section}" + local this_section="__git_first_level_config_vars_for_section_${section}" + __gitcomp_nl_append "${!this_section}" "$pfx" "$cur_" "${sfx:- }" return ;; submodule.*.*) @@ -2812,8 +2827,11 @@ __git_complete_config_variable_name () submodule.*) local pfx="${cur_%.*}." cur_="${cur_#*.}" + local section="${pfx%.}" __gitcomp_nl "$(__git config -f "$(__git rev-parse --show-toplevel)/.gitmodules" --get-regexp 'submodule.*.path' | awk -F. '{print $2}')" "$pfx" "$cur_" "." - __gitcomp_nl_append $'alternateErrorStrategy\nfetchJobs\nactive\nalternateLocation\nrecurse\npropagateBranches' "$pfx" "$cur_" "${sfx:- }" + __git_compute_first_level_config_vars_for_section "${section}" + local this_section="__git_first_level_config_vars_for_section_${section}" + __gitcomp_nl_append "${!this_section}" "$pfx" "$cur_" "${sfx:- }" return ;; url.*.*) diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh index 23d0e71324ccf5..8600b9e0dd985b 100755 --- a/t/t9902-completion.sh +++ b/t/t9902-completion.sh @@ -2589,7 +2589,7 @@ test_expect_success 'setup for git config submodule tests' ' git submodule add ./sub ' -test_expect_success 'git config - variable name - submodule' ' +test_expect_success 'git config - variable name - submodule and __git_compute_first_level_config_vars_for_section' ' test_completion "git config submodule." <<-\EOF submodule.active Z submodule.alternateErrorStrategy Z From 6e32f718ffa2e1358f3c34dafb9ed0dc8e479781 Mon Sep 17 00:00:00 2001 From: Philippe Blain Date: Sat, 10 Feb 2024 18:32:23 +0000 Subject: [PATCH 46/51] completion: add and use __git_compute_second_level_config_vars_for_section In a previous commit we removed some hardcoded config variable names from function __git_complete_config_variable_name in the completion script by introducing a new function, __git_compute_first_level_config_vars_for_section. The remaining hardcoded config variables are "second level" configuration variables, meaning 'branch..upstream', 'remote..url', etc. where is a user-defined name. Making use of the new existing --config flag to 'git help', add a new function, __git_compute_second_level_config_vars_for_section. This function takes as argument a config section name and computes the corresponding second-level config variables, i.e. those that contain a '<' which indicates the start of a placeholder. Note that as in __git_compute_first_level_config_vars_for_section added previsouly, we use indirect expansion instead of associative arrays to stay compatible with Bash 3 on which macOS is stuck for licensing reasons. As explained in the previous commit, we use the existing pattern in the completion script of using global variables to cache the list of variables for each section. Use this new function and the variables it defines in __git_complete_config_variable_name to remove hardcoded config variables, and add a test to verify the new function. Use a single 'case' for all sections with second-level variables names, since the code for each of them is now exactly the same. Adjust the name of a test added in a previous commit to reflect that it now tests the added function. Signed-off-by: Philippe Blain Signed-off-by: Junio C Hamano --- contrib/completion/git-completion.bash | 71 ++++++++------------------ t/t9902-completion.sh | 2 +- 2 files changed, 22 insertions(+), 51 deletions(-) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 57a8da7ca1ae1b..87678a5bb36a2d 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -2596,6 +2596,13 @@ __git_compute_config_vars () __git_config_vars="$(git help --config-for-completion)" } +__git_config_vars_all= +__git_compute_config_vars_all () +{ + test -n "$__git_config_vars_all" || + __git_config_vars_all="$(git --no-pager help --config)" +} + __git_compute_first_level_config_vars_for_section () { local section="$1" @@ -2605,6 +2612,15 @@ __git_compute_first_level_config_vars_for_section () printf -v "__git_first_level_config_vars_for_section_${section}" %s "$(echo "$__git_config_vars" | grep -E "^${section}\.[a-z]" | awk -F. '{print $2}')" } +__git_compute_second_level_config_vars_for_section () +{ + local section="$1" + __git_compute_config_vars_all + local this_section="__git_second_level_config_vars_for_section_${section}" + test -n "${!this_section}" || + printf -v "__git_second_level_config_vars_for_section_${section}" %s "$(echo "$__git_config_vars_all" | grep -E "^${section}\.<" | awk -F. '{print $3}')" +} + __git_config_sections= __git_compute_config_sections () { @@ -2749,10 +2765,13 @@ __git_complete_config_variable_name () done case "$cur_" in - branch.*.*) + branch.*.*|guitool.*.*|difftool.*.*|man.*.*|mergetool.*.*|remote.*.*|submodule.*.*|url.*.*) local pfx="${cur_%.*}." cur_="${cur_##*.}" - __gitcomp "remote pushRemote merge mergeOptions rebase" "$pfx" "$cur_" "$sfx" + local section="${pfx%.*.}" + __git_compute_second_level_config_vars_for_section "${section}" + local this_section="__git_second_level_config_vars_for_section_${section}" + __gitcomp "${!this_section}" "$pfx" "$cur_" "$sfx" return ;; branch.*) @@ -2765,33 +2784,6 @@ __git_complete_config_variable_name () __gitcomp_nl_append "${!this_section}" "$pfx" "$cur_" "${sfx:- }" return ;; - guitool.*.*) - local pfx="${cur_%.*}." - cur_="${cur_##*.}" - __gitcomp " - argPrompt cmd confirm needsFile noConsole noRescan - prompt revPrompt revUnmerged title - " "$pfx" "$cur_" "$sfx" - return - ;; - difftool.*.*) - local pfx="${cur_%.*}." - cur_="${cur_##*.}" - __gitcomp "cmd path" "$pfx" "$cur_" "$sfx" - return - ;; - man.*.*) - local pfx="${cur_%.*}." - cur_="${cur_##*.}" - __gitcomp "cmd path" "$pfx" "$cur_" "$sfx" - return - ;; - mergetool.*.*) - local pfx="${cur_%.*}." - cur_="${cur_##*.}" - __gitcomp "cmd path trustExitCode" "$pfx" "$cur_" "$sfx" - return - ;; pager.*) local pfx="${cur_%.*}." cur_="${cur_#*.}" @@ -2799,15 +2791,6 @@ __git_complete_config_variable_name () __gitcomp_nl "$__git_all_commands" "$pfx" "$cur_" "${sfx:- }" return ;; - remote.*.*) - local pfx="${cur_%.*}." - cur_="${cur_##*.}" - __gitcomp " - url proxy fetch push mirror skipDefaultUpdate - receivepack uploadpack tagOpt pushurl - " "$pfx" "$cur_" "$sfx" - return - ;; remote.*) local pfx="${cur_%.*}." cur_="${cur_#*.}" @@ -2818,12 +2801,6 @@ __git_complete_config_variable_name () __gitcomp_nl_append "${!this_section}" "$pfx" "$cur_" "${sfx:- }" return ;; - submodule.*.*) - local pfx="${cur_%.*}." - cur_="${cur_##*.}" - __gitcomp "url update branch fetchRecurseSubmodules ignore active" "$pfx" "$cur_" "$sfx" - return - ;; submodule.*) local pfx="${cur_%.*}." cur_="${cur_#*.}" @@ -2834,12 +2811,6 @@ __git_complete_config_variable_name () __gitcomp_nl_append "${!this_section}" "$pfx" "$cur_" "${sfx:- }" return ;; - url.*.*) - local pfx="${cur_%.*}." - cur_="${cur_##*.}" - __gitcomp "insteadOf pushInsteadOf" "$pfx" "$cur_" "$sfx" - return - ;; *.*) __git_compute_config_vars __gitcomp "$__git_config_vars" "" "$cur_" "$sfx" diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh index 8600b9e0dd985b..64031a9eff86a3 100755 --- a/t/t9902-completion.sh +++ b/t/t9902-completion.sh @@ -2601,7 +2601,7 @@ test_expect_success 'git config - variable name - submodule and __git_compute_fi EOF ' -test_expect_success 'git config - variable name - submodule names' ' +test_expect_success 'git config - variable name - __git_compute_second_level_config_vars_for_section' ' test_completion "git config submodule.sub." <<-\EOF submodule.sub.url Z submodule.sub.update Z From ad1a6695454c6fd922617b03ebb5678841a17db5 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Mon, 12 Feb 2024 12:20:35 -0800 Subject: [PATCH 47/51] A few more topics before -rc1 Signed-off-by: Junio C Hamano --- Documentation/RelNotes/2.44.0.txt | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/Documentation/RelNotes/2.44.0.txt b/Documentation/RelNotes/2.44.0.txt index edeed71855e2db..bf8a9524a1f9a9 100644 --- a/Documentation/RelNotes/2.44.0.txt +++ b/Documentation/RelNotes/2.44.0.txt @@ -91,6 +91,14 @@ UI, Workflows & Features refresh token the same way as credential-cache and credential-libsecret backends. + * Command line completion support (in contrib/) has been + updated for "git bisect". + + * "git branch" and friends learned to use the formatted text as + sorting key, not the underlying timestamp value, when the --sort + option is used with author or committer timestamp with a format + specifier (e.g., "--sort=creatordate:format:%H:%M:%S"). + Performance, Internal Implementation, Development Support etc. @@ -151,6 +159,9 @@ Performance, Internal Implementation, Development Support etc. * The priority queue test has been migrated to the unit testing framework. + * Setting `feature.experimental` opts the user into multi-pack reuse + experiment + Fixes since v2.43 ----------------- @@ -279,6 +290,17 @@ Fixes since v2.43 as with a wrong length, which has been corrected. (merge 156e28b36d jh/sparse-index-expand-to-path-fix later to maint). + * A failed "git tag -s" did not necessarily result in an error + depending on the crypto backend, which has been corrected. + (merge 6931049c32 jc/sign-buffer-failure-propagation-fix later to maint). + + * "git stash" sometimes was silent even when it failed due to + unwritable index file, which has been corrected. + (merge d2058cb2f0 ps/report-failure-from-git-stash later to maint). + + * "git show-ref --verify" did not show things like "CHERRY_PICK_HEAD", + which has been corrected. + * Other code cleanup, docfix, build fix, etc. (merge 5aea3955bc rj/clarify-branch-doc-m later to maint). (merge 9cce3be2df bk/bisect-doc-fix later to maint). @@ -293,3 +315,5 @@ Fixes since v2.43 (merge 36c9c44fa4 tb/pack-bitmap-drop-unused-struct-member later to maint). (merge 19ed0dff8f js/win32-retry-pipe-write-on-enospc later to maint). (merge 3cb4384683 jc/t0091-with-unknown-git later to maint). + (merge 841dbd40a3 jc/bisect-doc later to maint). + (merge 78307f1a89 pb/template-for-single-commit-pr later to maint). From b40ba17e44cd2614d2f422a0225f6db49ca694a8 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Tue, 13 Feb 2024 11:48:15 -0800 Subject: [PATCH 48/51] write-or-die: fix the polarity of GIT_FLUSH environment variable When GIT_FLUSH is set to 1, true, on, yes, then we should disable skip_stdout_flush, but the conversion somehow did the opposite. With the understanding of the original motivation behind "skip" in 06f59e9f (Don't fflush(stdout) when it's not helpful, 2007-06-29), we can sympathize with the current naming (we wanted to avoid useless flushing of stdout by default, with an escape hatch to always flush), but it is still not a good excuse. Retire the "skip_stdout_flush" variable and replace it with "flush_stdout" that tells if we do or do not want to run fflush(). Reported-by: Xiaoguang WANG Helped-by: Phillip Wood Signed-off-by: Junio C Hamano --- write-or-die.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/write-or-die.c b/write-or-die.c index 39421528653e29..01a9a51fa2fcd7 100644 --- a/write-or-die.c +++ b/write-or-die.c @@ -18,20 +18,20 @@ */ void maybe_flush_or_die(FILE *f, const char *desc) { - static int skip_stdout_flush = -1; - if (f == stdout) { - if (skip_stdout_flush < 0) { - skip_stdout_flush = git_env_bool("GIT_FLUSH", -1); - if (skip_stdout_flush < 0) { + static int force_flush_stdout = -1; + + if (force_flush_stdout < 0) { + force_flush_stdout = git_env_bool("GIT_FLUSH", -1); + if (force_flush_stdout < 0) { struct stat st; if (fstat(fileno(stdout), &st)) - skip_stdout_flush = 0; + force_flush_stdout = 1; else - skip_stdout_flush = S_ISREG(st.st_mode); + force_flush_stdout = !S_ISREG(st.st_mode); } } - if (skip_stdout_flush && !ferror(f)) + if (!force_flush_stdout && !ferror(f)) return; } if (fflush(f)) { From 4cde9f0726e5f5d1e545a8a33db822ad6dcf5574 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Tue, 13 Feb 2024 14:25:40 -0800 Subject: [PATCH 49/51] A few more fixes before -rc1 Signed-off-by: Junio C Hamano --- Documentation/RelNotes/2.44.0.txt | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/Documentation/RelNotes/2.44.0.txt b/Documentation/RelNotes/2.44.0.txt index bf8a9524a1f9a9..1b823daa19427f 100644 --- a/Documentation/RelNotes/2.44.0.txt +++ b/Documentation/RelNotes/2.44.0.txt @@ -162,6 +162,15 @@ Performance, Internal Implementation, Development Support etc. * Setting `feature.experimental` opts the user into multi-pack reuse experiment + * Squelch node.js 16 deprecation warnings from GitHub Actions CI + by updating actions/github-script and actions/checkout that use + node.js 20. + + * The mechanism to report the filename in the source code, used by + the unit-test machinery, assumed that the compiler expanded __FILE__ + to the path to the source given to the $(CC), but some compilers + give full path, breaking the output. This has been corrected. + Fixes since v2.43 ----------------- @@ -301,6 +310,11 @@ Fixes since v2.43 * "git show-ref --verify" did not show things like "CHERRY_PICK_HEAD", which has been corrected. + * Recent conversion to allow more than 0/1 in GIT_FLUSH broke the + mechanism by flipping what yes/no means by mistake, which has been + corrected. + (merge b40ba17e44 cp/git-flush-is-an-env-bool later to maint). + * Other code cleanup, docfix, build fix, etc. (merge 5aea3955bc rj/clarify-branch-doc-m later to maint). (merge 9cce3be2df bk/bisect-doc-fix later to maint). From efb050becb6bc703f76382e1f1b6273100e6ace3 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Tue, 13 Feb 2024 14:44:20 -0800 Subject: [PATCH 50/51] Git 2.43.2 Signed-off-by: Junio C Hamano --- Documentation/RelNotes/2.43.2.txt | 37 +++++++++++++++++++++++++++++++ GIT-VERSION-GEN | 2 +- RelNotes | 2 +- 3 files changed, 39 insertions(+), 2 deletions(-) create mode 100644 Documentation/RelNotes/2.43.2.txt diff --git a/Documentation/RelNotes/2.43.2.txt b/Documentation/RelNotes/2.43.2.txt new file mode 100644 index 00000000000000..5895e23a54639e --- /dev/null +++ b/Documentation/RelNotes/2.43.2.txt @@ -0,0 +1,37 @@ +Git 2.43.2 Release Notes +======================== + +Relative to Git 2.43.1, this release has two important fixes to allow +"git imap-send" to be built with NO_CURL defined, and to restore the +forced flushing behaviour when GIT_FLUSH=1 is set. It also contains +other, unexciting, fixes that have already been merged to the 'master' +branch of the development towards the next major release. + +Fixes since Git 2.43.1 +---------------------- + + * Update to a new feature recently added, "git show-ref --exists". + + * Rename detection logic ignored the final line of a file if it is an + incomplete line. + + * "git diff --no-rename A B" did not disable rename detection but did + not trigger an error from the command line parser. + + * "git diff --no-index file1 file2" segfaulted while invoking the + external diff driver, which has been corrected. + + * Rewrite //-comments to /* comments */ in files whose comments + prevalently use the latter. + + * A failed "git tag -s" did not necessarily result in an error + depending on the crypto backend, which has been corrected. + + * "git stash" sometimes was silent even when it failed due to + unwritable index file, which has been corrected. + + * Recent conversion to allow more than 0/1 in GIT_FLUSH broke the + mechanism by flipping what yes/no means by mistake, which has been + corrected. + +Also contains documentation updates, code clean-ups and minor fixups. diff --git a/GIT-VERSION-GEN b/GIT-VERSION-GEN index 3d69127479cb4d..e1f223debbb684 100755 --- a/GIT-VERSION-GEN +++ b/GIT-VERSION-GEN @@ -1,7 +1,7 @@ #!/bin/sh GVF=GIT-VERSION-FILE -DEF_VER=v2.43.1 +DEF_VER=v2.43.2 LF=' ' diff --git a/RelNotes b/RelNotes index 93a7fcd2f249f8..9ceb8d9d2eeb19 120000 --- a/RelNotes +++ b/RelNotes @@ -1 +1 @@ -Documentation/RelNotes/2.43.1.txt \ No newline at end of file +Documentation/RelNotes/2.43.2.txt \ No newline at end of file From 4fc51f00ef18d2c0174ab2fd39d0ee473fd144bd Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Wed, 14 Feb 2024 13:49:09 -0800 Subject: [PATCH 51/51] Hopefully the last batch of fixes before 2.44 final Signed-off-by: Junio C Hamano --- Documentation/RelNotes/2.44.0.txt | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/Documentation/RelNotes/2.44.0.txt b/Documentation/RelNotes/2.44.0.txt index 30dbdcf8eee62e..7dd8d75844ea66 100644 --- a/Documentation/RelNotes/2.44.0.txt +++ b/Documentation/RelNotes/2.44.0.txt @@ -99,6 +99,9 @@ UI, Workflows & Features option is used with author or committer timestamp with a format specifier (e.g., "--sort=creatordate:format:%H:%M:%S"). + * The command line completion script (in contrib/) learned to + complete configuration variable names better. + Performance, Internal Implementation, Development Support etc. @@ -307,8 +310,24 @@ Fixes since v2.43 mechanism by flipping what yes/no means by mistake, which has been corrected. + * The sequencer machinery does not use the ref API and instead + records names of certain objects it needs for its correct operation + in temporary files, which makes these objects susceptible to loss + by garbage collection. These temporary files have been added as + starting points for reachability analysis to fix this. + (merge bc7f5db896 pw/gc-during-rebase later to maint). + + * "git cherry-pick" invoked during "git rebase -i" session lost + the authorship information, which has been corrected. + (merge e4301f73ff vn/rebase-with-cherry-pick-authorship later to maint). + + * The code paths that call repo_read_object_file() have been + tightened to react to errors. + (merge 568459bf5e js/check-null-from-read-object-file later to maint). + * Other code cleanup, docfix, build fix, etc. (merge 5aea3955bc rj/clarify-branch-doc-m later to maint). (merge 9cce3be2df bk/bisect-doc-fix later to maint). (merge 8430b438f6 vd/fsck-submodule-url-test later to maint). (merge 3cb4384683 jc/t0091-with-unknown-git later to maint). + (merge 020456cb74 rs/receive-pack-remove-find-header later to maint).