-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Windows function inside of query that will be joined as a table leads to invalid SQL #5
Comments
Sorry, this is a bit hard to follow. Do you mind creating either a fiddle or a small project? Here is an example of a fiddle: https://dotnetfiddle.net/RtYk51 |
Yes, I understand. I have created an example using your fiddle. However, I'm having a hard time putting a query together that can't be optimized that matches my complex query above. If we were supporting CTEs, I'd have created a CTE to obtain the row numbers using your window functions and then I would have selected the CTE result constraining it by the row number. Without a CTE, this can be performed using a join to a table constructed by a query. While simple in SQL, it is more complex in LINQ. We have to join the query that would be in a CTE to itself. Under the circumstances in my original post above you actually see a SQL join with a table created from a query. This was my intented illustration. That query-generating table (I don't know the technical term for this) should be calculating the RowNumber column and some other columns that can be used to perform the join, etc. Then, I should be able to constrain my final results using the RowNumber in the where clause. The query SELECT RowNumber =... FROM TABLE WHERE RowNumber <= 2; is not valid, because you can't constrain on a window function or its result in the same query, which I suspect you know. So, you have to layer your query with a CTE or table-generating query. My simple example below is being optimized and the joined table-generating query is being simplified away. There were two problems with the example above: 1) the select clause of the inner table-generating query is missing the window function, i.e. it was not translated because it was in table-generated query that was part of a join and the translator didn't account for it and 2) the window function was placed into the where clause, which of course is not valid SQL. In the sample below, only #2 occurs because of optimizations. This is understandable because the translation engine is thinking that any expression representing a column value can be placed into the where clause, which is generally true, except for window functions. Of course, EF Core wasn't developed with window functions in mind. I'd be happy to work with you on trying to apply this fix, time permitting. However, being more familiar with the library, if you can think of how to quickly apply a fix or improve the outcome, maybe you are the better person to do so.
... produces...
What we really need to see is something like this:
|
I think the translator has to see the window function as an expression it can't optimize away, like it normally would with any other expression. By selecting the entire joined table row it selects the window function, which while basically valid, is of no use in this context and really must be abstracted into another the inner table.
... producers...
|
You can use Steps are: Add nuget package Thinktecture.EntityFrameworkCore.SqlServer The question is: should this library handle a case like this without forcing users to install another extension... |
I'm going to see if I can implement subquery-with-window-function detection after a join (inner) and then construct the SQL accordingly. It is already taking me some time. I'll let you know how it goes. |
Inner joins are created in the SelectExpression. It is not practical to try to supplement or replace the code in that class. I do think Pawel's approach is probably best. I was able to integrate Pawel's AsSubQuery pattern into your library so that you will not have any dependencies on his library. Is that ok with you? Maybe you can give him credit in your readme. Also, I think you should add to the readme how to use the method since it allows users who do not already know how to use a single query to constrain their window function and that should be more performant than two calls. I didn't create a proper unit test. The one I created was only for debugging. Some of the type and member names could be improved upon. |
I synced my fork and see that I'm having difficulty with the SqlServer assembly. My thoughts follow: Get rid of support for .net7.0, leave .net6.0, but upgrade the dependency to Microsoft.EntityFrameworkCore.SqlServer 7.0 and then upgrade your release to reflect the new version of the dependency. That way you are supporting .net6 and 7 target frameworks and the newer SqlServer library. Since you are upgrading your version to account for the breaking changes in EF Core Sql Server, you should be good. You should be able to do a minor version upgrade since no public APIs change, only internals/dependencies. I think. I'm not positive how versions are supposed to change when required dependencies change. I think you can upgrade your EF Core and Relational dependencies to 7 as well. I have been using them .net6 without an issue. You should only want to support .net7.0 when you are actually coding in C#11 or you want any SDK benefits of .net7. But, by doing so you may potentially require users to also upgrade their supported runtime, which may not be necessary at this time. The dependency of Microsoft.EntityFrameworkCore.SqlServer 6 does not correspond to .net 6 target framework and Microsoft.EntityFrameworkCore.SqlServer 7 does not correspond to .net 7 target framework. These target frameworks are not dependencies of each version of Microsoft.EntityFrameworkCore.SqlServer, but that is what is implied by your configuration. In fact, Microsoft.EntityFrameworkCore.SqlServer 6 and 7 are merely their own versions independent of the target framework. Microsoft.EntityFrameworkCore.SqlServer 7.0 is currently targeting .net6, so that is the target framework you should be supporting for best backward compatibility. Microsoft.EntityFrameworkCore.SqlServer is being upgraded with breaking changes and so your assembly should also be upgraded with this new dependency. I couldn't get your projects to compile properly with your configuration, otherwise I wouldn't mention it here. I think you have done an amazing job in developing this library. Before deciding upon your library I'd looked into linq2db.efcore and, like you, found their design problematic and, like you, I wanted an extension of EF Core. So, I'm grateful you created this library and that you are still servicing it. Semantic Versioning 2 Dependency Change Version Handling: |
I'm not convinced
Let me look into it a bit more and I agree with you, a use case like this should be in the README regardless of whether |
I didn't reuse the namespace. |
The truth is that your library is a good compliment, as is, to Thinktecture.EntityFrameworkCore and vice versa. So, if they are both dependencies of the same project then there would be no need to have the added feature. |
I should have said "name collision" which makes it more relevant since global usings has been introduced. |
Same problem here. Thinktecture is not available for PostgresSql, so in my opinion this library should not be related to others. |
I think I copied in the subqiuery logic from Thinktecture and submitted a PR, which was rejected by the library author because he considered it outside of the scope of the project. Take a look at my PR. |
@john-bartu, been working on a solution to have that working automatically, almost got it, but got stuck and then distracted. Will try to finish this week. Sql Server users had a workaround, so it wasn't overly urgent until now. If you can't wait, use @Ephron-WL's fork. |
If you found a way to implement the subquery without having to call a subquery function explicitly, then I salute you! 🏆🏆🏆 |
I might also add that I don't agree with developers shutting you down for suggesting the use of a subquery function. I mean, sure, we all want the system to do work for us without having to give it hints, e.g. there is implicit handling of the condition, but you and I both researched the possibilities and couldn't find any solution. The best was the use of the function, which I think is entirely legitimate and especially a minor implementation detail. There is the ideal, which many of us strive to obtain, and then there is what is practical and works now after an ideal was found not to be attainable. If you have figured out how to handle it implicitly in your library, then you are one major step ahead of my understanding and skillset in the matter and I commend that. |
@virzak In the meantime, I tried to write my own translation service to add one function |
Almost there guys. |
Ist there any progress on this? In my opinion the I would really need the |
@InspiringCode , this weekend. Promise. If I won't get it this weekend, I'll make the PR public and ask for help. It is almost there, but doesn't work for some reason. There was a lot of work this holiday break on this project, but didn't get to this issue. Feel terrible for feeding people promises, but we're almost there. |
Apologies to making you wait for a year, but this is now in a new beta. Usage: ctx.TestRows.Where(t => EF.Functions.DenseRank(EF.Functions.Over().OrderBy(t.Id / 10)) == 2) There are currently 3 basic tests that are passing, but I'm sure more work will be needed. This works for PostgreSQL, SQLite and SQL Server providers. These tests aren't enough to cover my client project which was the driver for creation of this library, so I'm excited to add more tests very soon. Feel free to submit PRs with failing basic tests of your own. The approach for this was to rewrite expression: From: {[Microsoft.EntityFrameworkCore.Query.EntityQueryRootExpression].Where(t => (__Functions_0.Max(t.Id, __Over_1.PartitionBy((t.Id / 10))) == Convert(23, Nullable`1)))} To: {[Microsoft.EntityFrameworkCore.Query.EntityQueryRootExpression].Select(o => new MyAnon() {Original = o, P0 = __Functions_0.Max(o.Id, __Over_1.PartitionBy((o.Id / 10)))}).AsSubQuery().Where(w => (w.P0 == Convert(23, Nullable`1))).Select(b => b.Original)} before query execution. The starting point is here: Lines 15 to 17 in ce835fb
The visitor:
If anyone has an opinion on the approach, I'd really appreciate. For example, it might be better to figure out if cc: @roji |
Thank you for your efforts @virzak! Please consider the following query (which is a very simplified version of my production query): public class Employee {
public Guid Id { get; set; }
public string Name { get; set; }
public string Department { get; set; }
public decimal Salary { get; set; }
public string Manager { get; set; }
} var employeesWithRank = db
.Set<Employee>()
.Select(e => new {
EmployeeId = e.Id,
Rank = EF.Functions.RowNumber(e.Department, EF.Functions.OrderByDescending(e.Salary))
});
var query = db.Set<Employee>()
.Where(x => x.Manager == "John")
.Join(employeesWithRank,
e => e.Id,
r => r.EmployeeId,
(e, r) => new { Id = e.Id, Name = e.Name, SalaryRank = r.Rank }); It should return all employees whos manager is John with their salary rank (but their rank within their WHOLE department and NOT the rank within all subordinates of Jonn!). So the expected query would be: SELECT [e].[Id], [e].[Name], [t].[Rank] AS [SalaryRank]
FROM [Employee] AS [e]
INNER JOIN (
SELECT [e0].[Id] AS [EmployeeId], DENSE_RANK() OVER(PARTITION BY [e0].[Department] ORDER BY [e0].[Salary] DESC) AS [Rank]
FROM [Employee] AS [e0]
) AS [t] ON [e].[Id] = [t].[EmployeeId]
WHERE [e].[Manager] = N'John' But the returned query is: SELECT [e].[Id], [e].[Name], DENSE_RANK() OVER(PARTITION BY [e0].[Department] ORDER BY [e0].[Salary] DESC) AS [SalaryRank]
FROM [Employee] AS [e]
INNER JOIN [Employee] AS [e0] ON [e].[Id] = [e0].[Id]
WHERE [e].[Manager] = N'John' Which is simply wrong. How can I solve this problem with your library? |
@virzak I'm interested in what you're doing here, but unfortunately I don't think I'll have time to look at the code in the near future. I will say that I'm currently working on various query architecture improvements that should hopefully make something like this much easier to do (making the SQL tree much more open to arbirtrary external manipulations, making it fully immutable...). CTE support is also high on my mental list, though this may not be done in time for 9.0 (we have a lot of planned work). |
Looks like that the work which was done for Where needs to be applied for Join as well. If a Window Function is detected inside a join - that's when The query you're currently getting can be available can be achieved using two methods (current 1 and current 2). One of them can be modified. |
@virzak But when I do it as you proposed in your dotnetfiddle, I would still have to add a reference to Thinktecture which I want to avoid. For this reason I proposed to make You would probably also have to perform var query = db.Set<Employee>()
.Where(e => e.Manager == "John")
.Select(e => new {
Id = e.Id,
Name = e.Name,
SalaryRank = db
.Set<Employee>()
.Select(r => EF.Functions.RowNumber(r.Department, EF.Functions.OrderByDescending(r.Salary)))
.Single(r => r.Id == e.Id) }) |
@InspiringCode, I didn't save the fiddle, that's the source of confusion. Please look again. We'll make current 1 to produce exact SQL of desired. |
@InspiringCode there you go: https://dotnetfiddle.net/32T5Ux No thinktecture and no .AsSubQuery(). Also desired Sql. Still lots of tests to run, but this is the concept. |
@InspiringCode, for your subselect could you provide a fiddle link? |
@john-bartu @Ephron-WL really sorry for making you wait guys. The fix is in in the latest beta along with window functions inside a Let me know if this works well for you. |
@virzak Thank you for your time. |
@virzak Thank you for your efforts! Regarding the subquery example: I thought about it in more detail and it is indeed hard to come up with a meaningful example. You would need to have double nested sub queries but in this case a Join is probably simpler and cleaner. So I guess we can forget about it for the moment. |
@InspiringCode There is a similar issue with a subquery in the where clause, but it can also be solved with a join. efcore-extensions/tests/Zomp.EFCore.WindowFunctions.Testing/SubQueryTests.cs Lines 159 to 164 in 9efbb28
|
Unfortunately for our production case, this still does not work. I will try to prepare a test case, or at least a fiddle showing my usage in which this does not work. In the meantime I'll just say that we are trying to use "EF.Functions.Count(EF.Functions.Over())" to determine the number of total rows before pagination but after filtering. We have something like: _context
.Where(.....) // huge filtering
.OrderByDescending(e => e.TimeModification)
.Select( new EventRes { Id = id, Count = EF.Functions.Count(EF.Functions.Over()))
.Skip(filter.Offset).Take(filter.ItemsPerPage)
.Join(...) // many joins The problem is, that the select containing The More or less, our logic is supposed to be this: We have groups, we filter them by members, order groups, paginate, and then take them with their members. |
I gave this a go with 8.0.19-beta but it didn't solve my case: SELECT [e].[CattleMustered], [e].[TimeTaken], ROW_NUMBER() OVER(ORDER BY [e].[CattleMustered] DESC, [e].[TimeTaken]) AS [c], [e].[Id], ROW_NUMBER() OVER(ORDER BY [e].[CattleMustered] DESC, [e].[TimeTaken]) AS [c0]
FROM [Entry1] AS [e]
WHERE EXISTS (
SELECT 1
FROM [EntryPlayer] AS [e0]
INNER JOIN [Players] AS [p] ON [e0].[SteamPlayerId] = [p].[Id]
WHERE [e].[Id] = [e0].[EntryId] AND [p].[Name] = @__steamName_2)
ORDER BY ROW_NUMBER() OVER(ORDER BY [e].[CattleMustered] DESC, [e].[TimeTaken])
OFFSET @__p_3 ROWS FETCH NEXT @__p_4 ROWS ONLY I have a where after the row_number but it's being hoisted inside (it's through a joiner table) |
This is slightly hard to follow. Can I ask to provide a fiddle with expected SQL? Thanks |
Hi @virzak !:) Any progression being made on this? If required I'm happy to show how it works for me, but I can't speak for how it's supposedly not working ref. @john-bartu, & @redwyre . /T |
This change is already in the master branch. Since it works for you, there should be no problem. When everyone else is satisfied and the test cases improve, I'll drop the prerelease tag. |
This is an excellent library. I've been seeking window functions in EF for the past 10 years, and finally, I've found a promising initiative here. WHERE clause on top of window function is working good now on this package Can we introduce the FIRST_VALUE and LAST_VALUE functions? I'll also try contribute if I get any free time. |
If I have multiple window functions in the query, it is considering only the first one
Rank is completely ignored from the query Another potential issue |
@virzak - This approach isn't feasible. How difficult would it be to retain the user's query as-is as an inner query and simply add a new outer query to access the window function columns defined in the inner query for filtering purposes? |
I was going to use a CTE for this, but linq2db can't figure out how to handle SQL Server temporal tables.
The query below translates to invalid SQL om efcore-extensions:
Produces this...
Should produce...
Below is the expression tree of the LINQ query, which is a little overcomplicated by calling a number of EF Core methods inline.
The text was updated successfully, but these errors were encountered: