-
Notifications
You must be signed in to change notification settings - Fork 166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GC issue in LMDB sail compromising deletes #5254
Comments
@nguyenm100 Unfortunately, I don't have time this week to fix this. A good starting point is here: rdf4j/core/sail/lmdb/src/main/java/org/eclipse/rdf4j/sail/lmdb/LmdbSailStore.java Line 449 in cd6509b
Basically, the GC algorithm goes through removed IDs and looks if they are present in the store. If not then they are removed from the values database. Maybe the logic inside the triple store has errors:
If you don't find the cause then you could try to comment out the GC logic. Best regards, |
Thanks @kenwenzel if I disable the gc it works. What's the implications of disabling the gc during runtime? I see it gets called upon ValueStore creation, which may occur after a machine restart. if I leave that call but disable everything else, will it GC (correctly) just once on startup? i think what may be occuring is that once the gc marks something for deletion, it's not clear how it gets reinstated.. so if I update a set of triples based on a subj iri, that requires a delete/insert. the delete will remove that subj iri and mark it for deletion, but the insert should unmark again. does that make sense? |
If you diable GC then everything just works as with the NativeStore where values are never deleted. Could you try to create a minimal example that triggers the bug? |
Hello @kenwenzel , thank you for looking into it. We have figured out a minimal reproduction. cc @nguyenm100 To enforce the value eviction, we need to do The following example adds a statement and then deletes it. Repeat it until an exception occurs. package org.example;
import org.eclipse.rdf4j.common.transaction.IsolationLevels;
import org.eclipse.rdf4j.model.util.Values;
import org.eclipse.rdf4j.model.vocabulary.RDF;
import org.eclipse.rdf4j.repository.Repository;
import org.eclipse.rdf4j.repository.RepositoryConnection;
import org.eclipse.rdf4j.repository.sail.SailRepository;
import org.eclipse.rdf4j.sail.NotifyingSailConnection;
import org.eclipse.rdf4j.sail.lmdb.LmdbStore;
import java.time.Duration;
class Reproduce {
public static void main(final String[] args) throws InterruptedException {
final LmdbStore store = new LmdbStore();
final Repository repository = new SailRepository(store);
int count = 0;
while (true) {
try (final NotifyingSailConnection conn = store.getConnection()) {
conn.begin(IsolationLevels.SERIALIZABLE);
conn.addStatement(
Values.iri(RDF.NAMESPACE + "sub"), Values.iri(RDF.NAMESPACE + "pred"),
Values.iri(RDF.NAMESPACE + "obj")
);
conn.commit();
}
try (final RepositoryConnection conn = repository.getConnection()) {
conn.begin();
conn.prepareUpdate("DELETE { ?s ?p ?o } WHERE { ?s ?p ?o }").execute();
conn.commit();
}
System.out.println("Count: " + count++);
// https://github.com/eclipse-rdf4j/rdf4j/blob/main/core/sail/lmdb/src/main/java/org/eclipse/rdf4j/sail/lmdb/ValueStore.java#L97
// if you set the VALUE_EVICTION_INTERVAL to be 0, remove this Thread.sleep
Thread.sleep(Duration.ofMinutes(1).toMillis());
}
}
} On my laptop it failed on the 8th. Full exception
|
almost positive it has something to do with: https://github.com/eclipse-rdf4j/rdf4j/blob/main/core/sail/lmdb/src/main/java/org/eclipse/rdf4j/sail/lmdb/ValueStore.java#L1180 if I remove the cleaner callback and just immediately add it to the unusedRevisionIds list, it fails immediately. the pattern we're seeing is very indicative of some race condition occuring during the gc. |
OK, I found the problem: Afterwards, you can try the following unit test: package org.eclipse.rdf4j.sail.lmdb;
import java.io.File;
import org.eclipse.rdf4j.common.transaction.IsolationLevels;
import org.eclipse.rdf4j.model.ValueFactory;
import org.eclipse.rdf4j.model.impl.SimpleValueFactory;
import org.eclipse.rdf4j.model.util.Values;
import org.eclipse.rdf4j.model.vocabulary.RDF;
import org.eclipse.rdf4j.repository.Repository;
import org.eclipse.rdf4j.repository.RepositoryConnection;
import org.eclipse.rdf4j.repository.sail.SailRepository;
import org.eclipse.rdf4j.sail.lmdb.config.LmdbStoreConfig;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;
public class LmdbSailStoreGCTest {
protected final ValueFactory F = SimpleValueFactory.getInstance();
protected LmdbStore store;
protected Repository repo;
@BeforeEach
public void before(@TempDir File dataDir) {
store = new LmdbStore(dataDir, new LmdbStoreConfig("spoc,posc"));
repo = new SailRepository(store);
repo.init();
}
@Test
public void testAddAndDelete() {
int count = 0;
while (count++ < 100) {
try (final RepositoryConnection conn = repo.getConnection()) {
conn.begin(IsolationLevels.SERIALIZABLE);
conn.add(Values.iri(RDF.NAMESPACE + "sub"), Values.iri(RDF.NAMESPACE + "pred"), Values.iri(RDF.NAMESPACE + "obj"));
conn.add(Values.iri("x:sub"), Values.iri("x:pred"), Values.iri("x:obj"));
conn.add(Values.iri("x:sub"), RDF.TYPE, RDF.LIST);
conn.commit();
}
try (final RepositoryConnection conn = repo.getConnection()) {
conn.begin();
conn.prepareUpdate("DELETE { ?s ?p ?o } WHERE { ?s ?p ?o }").execute();
conn.commit();
}
ValueStore valueStore = (ValueStore) store.getBackingStore().getValueFactory();
valueStore.forceEvictionOfValues();
System.gc();
}
}
@AfterEach
public void after() {
repo.shutDown();
}
} |
Thanks @kenwenzel! btw- what are your thoughts for allowing for the configuration of lmdb to specify whether gc is used as well as gc VALUE_EVICTION_INTERVAL ? If you agree, we can help contribute to that if you're busy. |
@nguyenm100 Please go ahead and add configuration options for gc if you want. |
Hello @kenwenzel, thank you so much for investigation and fixing the issue. Regarding LMDB GC configuration options, we are thinking about 2 solutions:
Would like to hear your opinions. |
Current Behavior
Unfortunately I don't have a predictable repro to share at the moment but we're seeing an issue whereby a predicate is "lost" after a series of largish (10k triples) deletes and inserts occur and hoping @kenwenzel can help us identify what's going on. Effectively, the pattern is:
open a new transaction
insert 10k triples
read all triples
rollback the transaction
open a new transaction
insert 10k triples
commit transaction
open a new transaction
remove 10k triples
rollback the transaction
open a new transaction
remove 10k triples
commit transaction
repeat these steps a number of times and at some point, an expected predicate is lost from the database.. that is, lmdbsail goes to fetch it and mdb_get returns back MDB_NOTFOUND
Things we've tried:
the consequence is that ValueStore.resolveValue() returns false at some point.
@kenwenzel any pointers where to look and/or debugging advice? can we disable certain features (e.g multithreading/lazyfetching/caching/refcounting) to possibly make the issue more predictable to recreate?
what's bizarre is that it's ALWAYS the predicate that goes missing. subject and object are present.
Expected Behavior
everything to work
Steps To Reproduce
open a new transaction
insert 10k triples
read all triples
rollback the transaction
open a new transaction
insert 10k triples
commit transaction
open a new transaction
remove 10k triples
rollback the transaction
open a new transaction
remove 10k triples
commit transaction
Version
5.0.3
Are you interested in contributing a solution yourself?
Perhaps?
Anything else?
No response
The text was updated successfully, but these errors were encountered: