Skip to content

Optimize delta resolution from O(N^2) to O(N) in pack indexer.#7268

Open
sstepashka wants to merge 1 commit into
libgit2:mainfrom
sstepashka:main
Open

Optimize delta resolution from O(N^2) to O(N) in pack indexer.#7268
sstepashka wants to merge 1 commit into
libgit2:mainfrom
sstepashka:main

Conversation

@sstepashka

Copy link
Copy Markdown

The original git_indexer_commit resolved REF_DELTA objects by retrying the entire unresolved list on each pass — O(N^2) for packs with deep or wide delta chains.

There are 2 dependency maps:

  • desp_by_offset - base object pack offset to OFS_DELTA children
  • deps_by_oid - OID to its REF_DELTA children

Delta resolution then becomes a single BFS traversal: each resolved object immediately looks up and resolves its children in O(1), giving O(N) total work. Thin pack fixup (injecting missing bases from the ODB) was preserved and integrated into the same traversal.

New regression test (test_pack_indexer__delta_chain) was added to cover out-of-order delta resolution. Existing tests pass.

I also verified the change by cloning a giant repo (~5.7 Gb) before and after with some clone example of a bare mirror of a single branch and compared the resulting directories using diff -rq. They are identical.

The change doesn't make too much difference with default -DUSE_SHA1, but when I switch to use CommonCrypto on MacOS. I get:

8:57.53 total

instead of previous

17:12.05 total

The original git_indexer_commit resolved REF_DELTA objects by retrying the
entire unresolved list on each pass — O(N^2) for packs with deep or wide delta
chains.

There are 2 dependency maps:
 - `desp_by_offset` - base object pack offset to `OFS_DELTA` children
 - `deps_by_oid` - OID to its `REF_DELTA` children

Delta resolution then becomes a single BFS traversal: each resolved object
immediately looks up and resolves its children in O(1), giving O(N) total
work. Thin pack fixup (injecting missing bases from the ODB) was preserved and
integrated into the same traversal.

New regression test (`test_pack_indexer__delta_chain`) was added to cover
out-of-order delta resolution. Existing tests pass.

I also verified the change by cloning a giant repo (~5.7 Gb) before and after
with some clone example of a bare mirror of a single branch and compared the
resulting directories using `diff -rq`. They are identical.

The change doesn't make too much difference with default `-DUSE_SHA1`,
but when I switch to use `CommonCrypto` on MacOS. I get:
```
8:57.53 total
```
instead of previous
```
17:12.05 total
```
@sonarqubecloud

Copy link
Copy Markdown

Quality Gate Passed Quality Gate passed

Issues
10 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant