Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforce the same interface across all ...Query implementations #5210

Merged
merged 4 commits into from
May 1, 2024

Conversation

snejus
Copy link
Member

@snejus snejus commented Apr 26, 2024

Description

Make PlaylistQuery a FieldQuery

While working on the DB optimization and looking at updates upstream I discovered one query which does not follow the FieldQuery interface —PlaylistQuery, so I looked into it in more detail and ended up integrating it as a FieldQuery.

One special thing about it is that it uses IN SQL operator, so I added implementation for this sort of query outside the playlist context, see InQuery.

Otherwise, it seems like PlaylistQuery is a field query with a special way of resolving values it wants to query. In the future, we may want to consider moving this kind of custom initialization logic away from init methods to factory/@classmethod: this should make it more clear that the purpose of such logic is to resolve the data that is required to define a particular FieldQuery class fully.

Remove NamedQuery since it is unused

This simplifies query parsing logic in queryparse.py. We know that this logic can only receive FieldQuery classes thus I adjusted types and removed the logic that handles other cases.

Effectively, this means that the query parsing logic does not need to care whether the query is named by the corresponding DB field. Instead, queries like SingletonQuery and PlaylistQuery are initialized with the same data as others and take things from there themselves: in this case they translate singleton and playlist queries to the underlying DB filters.

@snejus snejus self-assigned this Apr 26, 2024
@snejus snejus force-pushed the add-in-query-and-remove-named-query branch from 5a4e33c to b9c3f14 Compare April 26, 2024 00:06
@wisp3rwind
Copy link
Member

Hey, nice to see that you're picking up the work on dbcore again!

I introduced the NamedQuery class at some point when adding typings to the module. Did you check whether this refactoring still passes type checking (i.e. mypy)? While we don't really enforce this (for now), it would be unfortunate to roll it back only to realize later that just FieldQuery doesn't suffice to express the involved types and constructor signatures.

Generally, I agree that construct_query_part has a quite complicated dispatch to the various query constructors. It would be great if that could be expressed in a more concise way.

@snejus
Copy link
Member Author

snejus commented Apr 28, 2024

Did you check whether this refactoring still passes type checking (i.e. mypy)?

It's happy!

image

mypy runs live in my editor as I code where it's allowed to control my implementation :)

You've done an amazing job designing the generic types for FieldQuery class - you can see here how the pattern supported the new InQuery nicely out of the box.

@snejus snejus force-pushed the add-in-query-and-remove-named-query branch from b9c3f14 to 28a78d1 Compare April 28, 2024 16:43
Copy link
Member

@wisp3rwind wisp3rwind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, ok, so the point is that we don't really have any non-field query at this point. One could in principle imagine this: Maybe a query that involves several fields (although we also have the computed fields machinery, which might express that as well), or a plugin maintaining an extra table and defining a query on top of that. Thus, we don't need to accommodate for such queries at this point.

Looks good to me; I left a few notes where things might be improved or I'm not quite sure I understand why you coded things the way you did.

beets/dbcore/db.py Show resolved Hide resolved
beets/dbcore/query.py Outdated Show resolved Hide resolved
beets/dbcore/query.py Outdated Show resolved Hide resolved
@snejus snejus force-pushed the add-in-query-and-remove-named-query branch from 2c286fe to 5276d04 Compare April 30, 2024 12:02
@snejus snejus changed the title Enforce the same interface across all \...Query\ implementations Enforce the same interface across all ...Query implementations Apr 30, 2024
@snejus
Copy link
Member Author

snejus commented Apr 30, 2024

@wisp3rwind you can see that I replaced memoryview with bytes in PlaylistQuery to query BLOB column Item.path.

Is there a reason we're using BLOB_TYPE = memoryview instead of bytes? I tested querying using bytes and I see to get the expected results

@snejus snejus requested a review from wisp3rwind April 30, 2024 12:06
@wisp3rwind
Copy link
Member

Is there a reason we're using BLOB_TYPE = memoryview instead of bytes? I tested querying using bytes and I see to get the expected results

beets/beets/library.py

Lines 41 to 44 in 92fb830

# To use the SQLite "blob" type, it doesn't suffice to provide a byte
# string; SQLite treats that as encoded text. Wrapping it in a
# `memoryview` tells it that we actually mean non-text data.
BLOB_TYPE = memoryview

@snejus
Copy link
Member Author

snejus commented Apr 30, 2024

Is there a reason we're using BLOB_TYPE = memoryview instead of bytes? I tested querying using bytes and I see to get the expected results

beets/beets/library.py

Lines 41 to 44 in 92fb830

# To use the SQLite "blob" type, it doesn't suffice to provide a byte
# string; SQLite treats that as encoded text. Wrapping it in a
# `memoryview` tells it that we actually mean non-text data.
BLOB_TYPE = memoryview

Thanks!

This may be slightly outdated (this was introduced ~8 years ago) or am I missing something?

🐶 beet list label:ᴄʟᴇʀɢʏ -f '$path'
/run/media/sarunas/music/Music/🔥/03_daxj_stealthmode.flac
[ins] In [17]: db = sqlite3.connect("/home/sarunas/.music/beets/library.db")

[ins] In [18]: res = list(db.execute("select path from items where path = ?", ("/run/media/sarunas/music/Music/🔥/03_daxj_stealthmode.flac".encode(),)))

[ins] In [19]: res
Out[19]: [(b'/run/media/sarunas/music/Music/\xf0\x9f\x94\xa5/03_daxj_stealthmode.flac',)]

[ins] In [20]: res[0][0].decode()
Out[20]: '/run/media/sarunas/music/Music/🔥/03_daxj_stealthmode.flac'

@wisp3rwind
Copy link
Member

Is there a reason we're using BLOB_TYPE = memoryview instead of bytes? I tested querying using bytes and I see to get the expected results

beets/beets/library.py

Lines 41 to 44 in 92fb830

# To use the SQLite "blob" type, it doesn't suffice to provide a byte
# string; SQLite treats that as encoded text. Wrapping it in a
# `memoryview` tells it that we actually mean non-text data.
BLOB_TYPE = memoryview

Thanks!

[...]

This may be slightly outdated (this was introduced ~8 years ago) or am I missing something?

I really don't know. Is there any benefit to changing this? If at all possible, I'd prefer not to do so, at least not without a very good understanding of the original reasoning. Both to keep this PR focused, and because I'm afraid that this has the potential to break things in unexpected ways on different systems (with different OS, filesystem, terminal encodings).

Note that memoryview was specifically introduced for Python 3, so it doesn't look like a leftover of Python 2 times.

Copy link
Member

@wisp3rwind wisp3rwind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another, hopefully last, round of nitpicks :)

beets/dbcore/query.py Show resolved Hide resolved
beets/util/__init__.py Show resolved Hide resolved
beetsplug/playlist.py Show resolved Hide resolved
@snejus snejus force-pushed the add-in-query-and-remove-named-query branch 3 times, most recently from 3e6e369 to dddf090 Compare April 30, 2024 18:57
snejus and others added 4 commits April 30, 2024 22:20
While working on the DB optimisation I discovered one query which does
not follow the 'FieldQuery' interface - 'PlaylistQuery', so I looked
into it in more detail.

One special thing about it is that it uses 'IN' SQL operator, so
I defined 'InQuery' query class to have this logic outside of the
playlist context.

Otherwise, it seems like 'PlaylistQuery' is a field query, even if it
has a very special way of resolving values it wants to query. In the
future, we may want to consider moving this kind of custom
_initialisation_ logic away from '__init__' methods to
factory/@classmethod: this should make it more clear that the purpose of
such logic is to resolve the data that is required to define
a particular FieldQuery class fully.
Remove 'NamedQuery' since it is not being used by any queries any more.

This simplifies query parsing logic in 'queryparse.py'. We know that
this logic can only receive 'FieldQuery' thus I adjusted types and
removed the logic that handles other cases.

Effectively, this means that the query parsing logic does not any more
care whether the query is named by the corresponding DB field. Instead,
queries like 'SingletonQuery' and 'PlaylistQuery' are responsible for
translating 'singleton' and 'playlist' to the underlying DB filters.
@snejus snejus force-pushed the add-in-query-and-remove-named-query branch from dddf090 to b5e98b5 Compare April 30, 2024 21:20
Copy link
Member

@wisp3rwind wisp3rwind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, looks good to me now. Thanks 👍

@snejus snejus merged commit 34a59f9 into beetbox:master May 1, 2024
11 checks passed
@snejus snejus deleted the add-in-query-and-remove-named-query branch May 1, 2024 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants