Make empty path return empty set when the graph is empty #1806

RobinTF · 2025-02-14T13:43:45Z

When the graph is empty, the empty path should return an empty set, even when one of the two sides has a binding. For example, consider the following query on an empty graph. So far, QLever would return the binding { "?a": 1, "?b": 1 } because the left side has the binding { "?a": 1 }. But the correct result is an empty solution set because the graph is empty. This is now fixed.

SELECT * WHERE {
  VALUES ?a { 1 }
  ?a a? ?b
}

hannahbast · 2025-02-14T13:49:00Z

@RobinTF Thanks, Robin, for looking into this. I don't quite understand the description. What exactly is this PR supposed to do?

RobinTF · 2025-02-14T13:59:45Z

@hannahbast It's a bit hard to describe because it's a very rare case. It should fix a compliance test (which we will hopefully see once the builds finish).
Anyway, imagine this query (which is more or less the same as the compliance test) being run on an empty graph:

SELECT * WHERE {
  VALUES ?a { 1 }
  ?a a? ?b
}

Because it is run on an empty graph, ?a a ?b doesn't match anything, so the result should be empty too. This isn't the case currently: because ?a is "bound", we use it as a starting point for TransitivePath without checking whether it is contained in the graph. This check is rather expensive, i.e. linear in time for every value of ?a (we could of course make it more efficient at the cost of higher memory usage), so I tried to make sure it is only run if we can't decide this based on any other information we have.

I hope this clears things up a bit.

EDIT:
Currently, QLever would return a result of 1 1; with this change, the result is really empty.
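
A minimal sketch of the trade-off described above, not QLever's actual code. It assumes that "contained in the graph" means the value occurs as subject or object of some triple; `Triple`, `containsNodeLinear`, `collectNodes` and `emptyPathMatches` are hypothetical names and the in-memory triple list only stands in for the real index.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical in-memory triple; the real index is organized differently.
struct Triple {
  std::string subject, predicate, object;
};

// Linear scan: costs O(|graph|) for every value that is checked.
bool containsNodeLinear(const std::vector<Triple>& graph,
                        const std::string& value) {
  return std::any_of(graph.begin(), graph.end(), [&](const Triple& t) {
    return t.subject == value || t.object == value;
  });
}

// Trade memory for time: collect all nodes once, then look values up in O(1).
std::unordered_set<std::string> collectNodes(const std::vector<Triple>& graph) {
  std::unordered_set<std::string> nodes;
  for (const auto& t : graph) {
    nodes.insert(t.subject);
    nodes.insert(t.object);
  }
  return nodes;
}

// Zero-length matches for already bound start values: keep (v, v) only if v
// actually occurs in the graph.
std::vector<std::pair<std::string, std::string>> emptyPathMatches(
    const std::vector<Triple>& graph, const std::vector<std::string>& bound) {
  const auto nodes = collectNodes(graph);
  std::vector<std::pair<std::string, std::string>> result;
  for (const auto& v : bound) {
    if (nodes.count(v) > 0) result.emplace_back(v, v);
  }
  return result;
}
```

With an empty `graph`, `emptyPathMatches` returns no pairs for any bound value, which is the behaviour this PR aims for.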

codecov · 2025-02-14T15:01:07Z

Codecov Report

Attention: Patch coverage is 96.96970% with 1 line in your changes missing coverage. Please review.

Project coverage is 90.06%. Comparing base (1570033) to head (418c92e).

Files with missing lines	Patch %	Lines
src/engine/TransitivePathImpl.h	93.75%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1806      +/-   ##
==========================================
+ Coverage   90.02%   90.06%   +0.04%     
==========================================
  Files         396      396              
  Lines       37974    37992      +18     
  Branches     4262     4267       +5     
==========================================
+ Hits        34185    34217      +32     
+ Misses       2493     2491       -2     
+ Partials     1296     1284      -12

hannahbast · 2025-02-14T17:26:12Z

@RobinTF Thanks for the clarification, I have revised the title and description accordingly. I have a follow-up question:

What if the graph is not empty but does not contain the binding (or bindings) for ?a in your example above. Should the result then still be empty? If yes (which I suppose), I should update the title and description accordingly.

And yet more specifically: If some of the bindings are in the graph and some are not, should the solution only contain those bindings that are in the graph?

RobinTF · 2025-02-14T17:31:19Z

Yes, that's how I see it too. When there is a statement like ?s <some-iri>? ?o or ?s <some-iri>* ?o, the result can only ever contain values for ?s and ?o that are part of some triple with <some-iri> as predicate. As I understand it, unrelated triples are not supposed to be included.

EDIT: To clarify, it is not important where in the triple (subject or object position) the value is found for the empty path, as long as it is somewhere.

RobinTF · 2025-02-14T17:33:32Z

src/engine/TransitivePathBase.cpp

-        maxDist_));
+    candidates.push_back(makeTransitivePath(getExecutionContext(),
+                                            alternativeSubtree, lhs, rhs,
+                                            minDist_, maxDist_, useBinSearch));


This fixes a bug in the unit tests: both implementations are supposed to be tested, but for bound variables the choice was reset to the configured default regardless of the test configuration.

hannahbast · 2025-02-14T17:35:30Z

@RobinTF OK, that's a completely new spin now. Are you saying that matches for the empty path have to be a subject or object of the underlying predicate or property path? That's not how Johannes and I understood this so far, but it's quite possible that we were wrong. Can you pinpoint from where exactly in the standard you are inferring this?

sparql-conformance · 2025-02-14T17:58:14Z

Conformance check passed ✅

Test Status Changes 📊

Number of Tests	Previous Status	Current Status
1	Failed	Passed

Details: https://qlever.cs.uni-freiburg.de/sparql-conformance-ui?cur=418c92eba59fd9f9c895ec83fb3ba40a5f174fdf&prev=1570033d07eb625dd3c2624c866eeb241f8639ef

sonarqubecloud · 2025-02-14T18:20:53Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

RobinTF · 2025-02-14T18:25:10Z

@hannahbast I had a look at some of the compliance tests and you're right. The values don't have to be related to the predicates. But this means that VALUES just has some arbitrary limitation in that it doesn't count for property paths. In that case, this has to be solved via query planning, by making sure the value is joined with the index at least once.
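
The idea of joining the value with the index can be sketched as follows. This is an illustration under assumptions, not the planned implementation: it assumes the SPARQL 1.1 reading in which a zero-or-one path between two variables matches every node of the graph (any subject or object, whatever the predicate) with itself, and that a VALUES binding only survives if it joins with that result. `RdfTriple`, `Binding`, `evalZeroOrOne` and `joinOnA` are hypothetical names.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory triple standing in for the index.
struct RdfTriple {
  std::string subject, predicate, object;
};

using Binding = std::pair<std::string, std::string>;  // (?a, ?b)

// Evaluate `?a <pred>? ?b` with both sides unbound.
std::set<Binding> evalZeroOrOne(const std::vector<RdfTriple>& graph,
                                const std::string& pred) {
  std::set<Binding> result;
  for (const auto& t : graph) {
    // Zero steps: every node of the graph matches itself.
    result.emplace(t.subject, t.subject);
    result.emplace(t.object, t.object);
    // One step: triples that use the predicate.
    if (t.predicate == pred) result.emplace(t.subject, t.object);
  }
  return result;
}

// Join with `VALUES ?a { ... }`: keep only rows whose ?a is among the values.
std::set<Binding> joinOnA(const std::set<Binding>& pathResult,
                          const std::set<std::string>& valuesForA) {
  std::set<Binding> joined;
  for (const auto& b : pathResult) {
    if (valuesForA.count(b.first) > 0) joined.insert(b);
  }
  return joined;
}
```

On an empty graph `evalZeroOrOne` yields nothing, so the join is empty. On a graph that mentions 1 only in a triple with an unrelated predicate, the value 1 still joins, because it is a node of the graph.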

hannahbast · 2025-02-14T19:36:12Z

@RobinTF I just added #1809 to document what needs to be done for a correct implementation of the empty path. Please have a look and let me know if you have questions or if you think that something is wrong.

RobinTF added 2 commits February 14, 2025 13:56

Simplify code and add new tests

ef831e3

Add new test and fix wrong behaviour for transitive path

0b1c80c

Change condition, so it becomes simpler

c1fa437

Propagate binaryness correctly for tests

a71e54d

hannahbast changed the title ~~Prevent empty path from matching when not contained in graph~~ Make empty path return empty set when the graph is empty Feb 14, 2025

Change code to better optimize it

6b1f09c

Clarify comment

418c92e

RobinTF closed this Feb 14, 2025

RobinTF deleted the fix-empty-path-pattern-match branch February 14, 2025 18:27
