Prevent checking out discarded vault commit IDs with version
#252
Comments
Regarding GC of unreachable commits: this appears to require a reachability algorithm. A basic approach is to build a set of all commit IDs and remove elements from it while traversing from the canonical branch pointer; whatever remains in the set is unreachable. I'm wondering whether there's a faster way that's already implemented in git: https://stackoverflow.com/a/21157751. Also see https://en.m.wikipedia.org/wiki/Reachability. Additionally, because we update both HEAD and the branch pointer on every commit, we can also eliminate unreachable commits every time we do a commit operation, by walking from the branch pointer commit ID to the HEAD commit ID before switching the branch pointer. This is done online but is exposed to transitory failures, so it's still good to have a git gc implemented internally.
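The basic set-based approach described in this comment can be sketched as follows. This is a minimal illustration under stated assumptions, not the vault implementation: the `CommitGraph` type, the `findUnreachableCommits` function, and the example commit IDs are hypothetical, and the commit graph is assumed to be available in memory as a map from commit ID to parent commit IDs.

```typescript
// Hypothetical in-memory commit graph: commit ID -> parent commit IDs.
type CommitId = string;
type CommitGraph = Map<CommitId, CommitId[]>;

// Start with every known commit in the set, then remove each commit that is
// reached while walking parents from the canonical branch tip.
function findUnreachableCommits(
  graph: CommitGraph,
  branchTip: CommitId,
): Set<CommitId> {
  const unreachable = new Set<CommitId>(graph.keys());
  const stack: CommitId[] = [branchTip];
  while (stack.length > 0) {
    const id = stack.pop()!;
    // delete() returns false if the commit was already visited or is unknown
    if (!unreachable.delete(id)) continue;
    for (const parent of graph.get(id) ?? []) {
      stack.push(parent);
    }
  }
  return unreachable;
}

// Example: c2 and c3 were abandoned when c4 was committed on top of c1.
const exampleGraph: CommitGraph = new Map<CommitId, CommitId[]>([
  ['c1', []],
  ['c2', ['c1']],
  ['c3', ['c2']],
  ['c4', ['c1']],
]);
const garbage = findUnreachableCommits(exampleGraph, 'c4');
```

Each commit is visited at most once, so the walk is linear in the number of commits and parent links.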
After reviewing There are several ways to do this: According to this benchmark:
import b from 'benny';
import packageJson from '../package.json';

async function main() {
  // Candidate data structures for tracking marked/unmarked git objects
  let map = new Map<number, undefined>();
  let obj: Record<number, undefined> = {};
  let arr: Array<{ id: number; mark: boolean }> = [];
  let set = new Set<number>();
  const summary = await b.suite(
    'gitgc',
    // Map: insert 1000 keys, delete them, then iterate over what remains
    b.add('map', async () => {
      map = new Map();
      return async () => {
        for (let i = 0; i < 1000; i++) {
          map.set(i, undefined);
        }
        for (let i = 0; i < 1000; i++) {
          map.delete(i);
        }
        for (const i of map) {
          // NOOP
        }
      };
    }),
    // Plain object: same insert/delete/iterate pattern using properties
    b.add('obj', async () => {
      obj = {};
      return async () => {
        for (let i = 0; i < 1000; i++) {
          obj[i] = undefined;
        }
        for (let i = 0; i < 1000; i++) {
          delete obj[i];
        }
        for (const i in obj) {
          // NOOP
        }
      };
    }),
    // Array: push records, flip a mark flag, then scan for unmarked records.
    // Note: arr is only reset in the setup step, not inside the measured
    // function, so it can keep growing across measured runs.
    b.add('arr', async () => {
      arr = [];
      return async () => {
        for (let i = 0; i < 1000; i++) {
          arr.push({ id: i, mark: false });
        }
        for (let i = 0; i < 1000; i++) {
          arr[i].mark = true;
        }
        for (let i = 0; i < 1000; i++) {
          if (arr[i].mark === false) {
            // NOOP
          }
        }
      };
    }),
    // Set: insert 1000 values, delete them, then iterate over what remains
    b.add('set', async () => {
      set = new Set();
      return async () => {
        for (let i = 0; i < 1000; i++) {
          set.add(i);
        }
        for (let i = 0; i < 1000; i++) {
          set.delete(i);
        }
        for (const i of set) {
          // NOOP
        }
      };
    }),
    b.cycle(),
    b.complete(),
    b.save({
      file: 'gitgc',
      folder: 'benches/results',
      version: packageJson.version,
      details: true,
    }),
    b.save({
      file: 'gitgc',
      folder: 'benches/results',
      format: 'chart.html',
    }),
  );
  return summary;
}

// Run the suite only when this file is executed directly
if (require.main === module) {
  (async () => {
    await main();
  })();
}
export default main;
The fastest one is is
So we just get an
This would only need to be run if the regular
The array benchmark should be redone with a pre-declared array size.
After redoing it with
However, the problem with this is that you have to know how many git objects there are at any moment, which I guess requires a quick count initially. We could do something like a C++ vector, doubling the capacity whenever the array needs to grow. And resizing arrays in JS is easy by just doing
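The vector-style doubling idea could look roughly like the sketch below. Everything here is illustrative: the `GrowableMarks` class and its methods are hypothetical names, and growing the backing array by assigning to its `length` property is an assumption about one way to resize, not necessarily the approach intended in the comment above.

```typescript
// Record shape borrowed from the 'arr' benchmark case.
type Mark = { id: number; mark: boolean };

// Hypothetical pre-sized array that doubles its capacity when full,
// similar in spirit to a C++ vector.
class GrowableMarks {
  private slots: Array<Mark | undefined>;
  private count = 0;

  constructor(initialCapacity = 4) {
    this.slots = new Array<Mark | undefined>(initialCapacity);
  }

  push(id: number): void {
    if (this.count === this.slots.length) {
      // Assumption: resize in place by assigning to length, doubling capacity
      this.slots.length = Math.max(1, this.slots.length * 2);
    }
    this.slots[this.count++] = { id, mark: false };
  }

  mark(index: number): void {
    const slot = this.slots[index];
    if (slot !== undefined) slot.mark = true;
  }

  // IDs of entries that were never marked, i.e. the sweep candidates
  unmarkedIds(): number[] {
    const ids: number[] = [];
    for (let i = 0; i < this.count; i++) {
      const slot = this.slots[i];
      if (slot !== undefined && !slot.mark) ids.push(slot.id);
    }
    return ids;
  }

  get size(): number {
    return this.count;
  }

  get capacity(): number {
    return this.slots.length;
  }
}

// Pushing a fifth entry into a capacity-4 container triggers one doubling.
const marks = new GrowableMarks(4);
for (let i = 0; i < 5; i++) marks.push(i);
marks.mark(2);
```

With doubling, no up-front count of git objects is needed, at the cost of occasionally over-allocating up to twice the required capacity.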
One issue is that when iterating over the git object database, we have to keep track of objects that are not commits. It appears that isogit does have a function for this: https://isomorphic-git.org/docs/en/walk. However, the walkers are predefined, and one of the
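To illustrate why non-commit objects matter, here is a small mark-and-sweep sketch over a simplified object database. It does not use the isomorphic-git API: the `GitObject` type, the `sweepCandidates` function, and the object IDs are all hypothetical, and the database is assumed to be an in-memory map from object ID to object.

```typescript
// Hypothetical, simplified model of the three object kinds involved.
type GitObject =
  | { type: 'commit'; parents: string[]; tree: string }
  | { type: 'tree'; entries: string[] }
  | { type: 'blob' };

// Mark everything reachable from the branch tip (commits, their trees, and
// the blobs inside those trees), then report every object left unmarked.
function sweepCandidates(
  objects: Map<string, GitObject>,
  branchTip: string,
): string[] {
  const live = new Set<string>();
  const stack: string[] = [branchTip];
  while (stack.length > 0) {
    const id = stack.pop()!;
    if (live.has(id)) continue;
    const object = objects.get(id);
    if (object === undefined) continue;
    live.add(id);
    if (object.type === 'commit') {
      stack.push(object.tree, ...object.parents);
    } else if (object.type === 'tree') {
      stack.push(...object.entries);
    }
  }
  return Array.from(objects.keys()).filter((id) => !live.has(id));
}

// c2 was discarded; blob b1 is shared with live commits and must survive.
const objectDb = new Map<string, GitObject>([
  ['c1', { type: 'commit', parents: [], tree: 't1' }],
  ['t1', { type: 'tree', entries: ['b1'] }],
  ['b1', { type: 'blob' }],
  ['c2', { type: 'commit', parents: ['c1'], tree: 't2' }],
  ['t2', { type: 'tree', entries: ['b1', 'b2'] }],
  ['b2', { type: 'blob' }],
  ['c4', { type: 'commit', parents: ['c1'], tree: 't4' }],
  ['t4', { type: 'tree', entries: ['b1', 'b4'] }],
  ['b4', { type: 'blob' }],
]);
const deadObjects = sweepCandidates(objectDb, 'c4');
```

Deleting only the unreachable commits would leave `t2` and `b2` behind as orphans, which is why the walk has to follow trees and blobs as well.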
Example of walking the filesystem: https://gist.github.com/lovasoa/8691344. It can be used in case the
@tegefaulkes this is what we talked about regarding the garbage collection concept. Can you spec out an algorithm? You'd have to start with applying the
@tegefaulkes is there a test for this behaviour?
The tests that cover this are:
Describe the bug
The version command for the vaults can currently move to commit IDs that are expected to be discarded. When a vault is checked out at an earlier commit ID and a mutation is performed, we expect the later commit IDs (relative to the checked-out commit) to be discarded, so that they are no longer valid or accessible.
To Reproduce
c1 -> c2 -> c3
c3 version to move to c1: expected commit log now c1
c1 -> c4
c4 version to move to c3 (with its old commit ID): commit log now back to c1 -> c2 -> c3
c4: commit log now back to c1 -> c4
Expected behavior
In step 5, instead of reverting to the earlier commit history, we would expect this to be an invalid operation. We want to restrict the "branching" potential of our vault log (for simplicity), such that any mutation performed on a vault while it is checked out at an earlier commit discards all commits later than the current one.
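The expected behavior can be modelled with a small sketch of a strictly linear log. This is only an illustration of the rule, not the vault code: the `LinearVaultLog` class and its `commit`, `version`, and `history` methods are hypothetical names chosen for the example.

```typescript
// Hypothetical model of a vault log that never branches.
class LinearVaultLog {
  private log: string[] = [];
  private head = -1; // index of the currently checked-out commit

  // A mutation while checked out at an earlier commit discards all later ones.
  commit(id: string): void {
    this.log.length = this.head + 1;
    this.log.push(id);
    this.head = this.log.length - 1;
  }

  // Moving to a commit is only valid if it is still part of the linear log.
  version(id: string): void {
    const index = this.log.indexOf(id);
    if (index === -1) {
      throw new Error(`Commit ${id} is not part of the vault history`);
    }
    this.head = index;
  }

  // Commit log as seen from the current checkout
  history(): string[] {
    return this.log.slice(0, this.head + 1);
  }
}

// Replaying the scenario from the report: c3 must be rejected after c4 exists.
const vaultLog = new LinearVaultLog();
vaultLog.commit('c1');
vaultLog.commit('c2');
vaultLog.commit('c3');
vaultLog.version('c1');
vaultLog.commit('c4');
let rejected = false;
try {
  vaultLog.version('c3');
} catch (e) {
  rejected = true;
}
```

Because the log is truncated at commit time, a lookup of a discarded ID fails immediately, independent of whether a later garbage collection has physically removed the objects.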
Additional context
version command: https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/205#note_688569888