yt-dlp implementation in /play #115

bazettfraga · 2024-02-09T11:10:44Z

/play will now download YouTube URLs and upload them to the endpoint of a user's choice, then send the resulting URL to the clients in said area to play the song, allowing for more convenient use of music from YouTube.

The yt_cache dictionary can be saved as a YAML file so that the cache persists across server shutdowns, as previously discussed, though I have not yet integrated such a feature.

EstatoDeviato · 2024-02-15T17:13:24Z

Probably "https://www.youtube.com/" should be in the whitelist in url.txt.
Also I think ffmpeg_location should be added in config.

bazettfraga · 2024-02-15T23:03:15Z

I completely forgot to append it to the url.txt file; I will make sure to do so in the morning.
As for an ffmpeg_location config var, I will implement it as well, though I personally find it would have little purpose in the config. If the user does not have ffmpeg on their PATH environment variable or in any binary directory, it would be more convenient for them to let the get_ffmpeg() function handle downloading and extracting the binary.

If anyone is willing to edit the README to include ffmpeg instructions to explain how to handle that, that would be ideal.

OmniTroid · 2024-02-20T20:50:13Z

server/client_manager.py

@@ -431,6 +487,25 @@ def change_music(self, song, cid, showname="", effects=0, loop=True):
                            "This URL is not allowed."
                        )
                        return
+
+                yt_pattern = re.compile(r"^((?:https?:)?\/\/)?((?:www|m)\.)?((?:youtube\.com|youtu.be))(\/(?:[\w\-]+\?v=|embed\/|v\/)?)([\w\-]+)(\S+)?$")


There are more standard ways to parse URLs than regex, like urllib.parse. If we're lucky, it may even fix the recursion error below.

Recursion error is not caused by this line but rather yt-dlp's own URL parsing if memory serves. Still, I will fix this soon!
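
A minimal sketch of the urllib.parse approach suggested above, assuming only youtube.com (with its www and m subdomains) and youtu.be should count as YouTube. The function name is_youtube_url and the host set are illustrative and not taken from the PR.

```python
from urllib.parse import urlparse

# Hosts treated as YouTube; this set is an assumption for illustration.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def is_youtube_url(url: str) -> bool:
    """Return True if the URL's host exactly matches a known YouTube host."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    # .hostname is lowercased and has any port or credentials stripped.
    return parsed.hostname in YOUTUBE_HOSTS
```

Comparing the whole hostname avoids accepting lookalike domains that merely contain the word "youtube", which a substring check on netloc would let through.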

OmniTroid · 2024-02-20T20:54:06Z

server/client_manager.py

+            try:
+                os.remove(yt_song_path)
+            except PermissionError:
+                pass #FIXME: unlucky


At the very least, log something?

Agreed. I left this in while testing because I could not reproduce the bug on Linux; it only happened on Windows, likely because the file stream was not being closed. It should never occur now, but I will set up a virtual machine for further testing.
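
One way the cleanup could log instead of passing silently, as a sketch only. The helper name remove_temp_file and the logger setup are hypothetical and not part of the PR.

```python
import logging
import os

logger = logging.getLogger(__name__)


def remove_temp_file(path: str) -> bool:
    """Delete a temporary file, logging failures instead of ignoring them."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        # Already gone, so there is nothing left to clean up.
        return True
    except PermissionError:
        # Typically seen on Windows when another handle still has the file open.
        logger.warning("Could not remove %s; it may still be open", path, exc_info=True)
        return False
```

Returning a boolean lets the caller decide whether to retry later, for example once the file stream has been closed.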

OmniTroid · 2024-02-20T20:59:49Z

server/tsuserver.py

@@ -522,6 +546,60 @@ def send_discord_chat(self, name, message, hub_id=0, area_id=0):
            pos=self.config["bridgebot"]["pos"],
        )

+    def get_ffmpeg(self):


Does the yt_dlp package really require a local installation of ffmpeg? This practice of downloading dependencies at runtime seems unfortunate either way.

It requires it so that it can extract the audio stream from the video downloaded.
I am not proud of this function myself, but it felt like the ideal solution. Perhaps having the user acquire the binaries themselves would be better than doing this nonsense given the ffmpeg_location key, though I myself am not sure which is the better option of the two.
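
For the "user acquires the binary themselves" option, a rough sketch of how the server might only look for ffmpeg rather than download it. The function name find_ffmpeg is hypothetical, and treating ffmpeg_location as either a directory or a path to the executable is an assumption here.

```python
import os
import shutil
from typing import Optional


def find_ffmpeg(configured_location: Optional[str] = None) -> Optional[str]:
    """Return a usable ffmpeg path, or None if nothing can be found."""
    if configured_location:
        if os.path.isdir(configured_location):
            # Search only inside the configured directory.
            found = shutil.which("ffmpeg", path=configured_location)
        else:
            # Treat the value as a path to the executable itself.
            found = shutil.which(configured_location)
        if found:
            return found
    # Fall back to whatever is on PATH.
    return shutil.which("ffmpeg")
```

If this returns None, the server could refuse YouTube links and tell the host how to install ffmpeg, instead of fetching a binary at runtime.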

OmniTroid · 2024-02-20T21:00:28Z

Very cool concept, added some comments

OmniTroid · 2024-02-20T23:37:53Z

Also, the match-case keywords were introduced in Python 3.10, so we need to bump the version in CI to make the linter happy.

bazettfraga · 2024-02-21T02:25:25Z

Also the match-case keywords were introduced in python 3.10, so we need to bump version in CI to make the linter happy

OH, so that's what it was. It freaked me out the first time honestly lmao.

OmniTroid · 2024-02-21T13:56:28Z

server/client_manager.py

                        sys.setrecursionlimit(1200) #FIXME: figure out what's causing a RecursionError. Python's regex module should not recurse so far that it causes this to happen.
                        info = ""
-                        yt_id = yt_pattern.match(song).group(5)
+                        yt_id = parse_qs(yt_parse.query)['v'][0]


If the URL is malformed (e.g. does not have a v query string), this will throw a KeyError. How is that exception handled?

I neglected to consider that, my bad.
Perhaps I should also account for different URL formats (such as v/ and embed/).

It's probably "good enough" to just accept v= for now; if the link is not what we expect, we should let the client know (e.g. "Unable to play: URL is malformed"). IIRC, if you raise a ClientError, the server will send the exception message to the client. Need to check whether that's correct, though.

ClientError posts in OOC chat, yeah. I will implement it ASAP.
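
A sketch of the handling discussed above: accept only v= and raise an error the client can see otherwise. The ClientError class below is a local stand-in so the example runs on its own, and extract_video_id is a hypothetical name.

```python
from urllib.parse import parse_qs, urlparse


class ClientError(Exception):
    """Stand-in for the server's ClientError, defined here so the sketch is self-contained."""


def extract_video_id(url: str) -> str:
    """Return the value of the v query parameter, or raise ClientError."""
    # parse_qs drops blank values by default, so "v=" is treated as missing.
    values = parse_qs(urlparse(url).query).get("v")
    if not values:
        raise ClientError("Unable to play: URL is malformed (missing v= parameter).")
    return values[0]
```

Using .get() rather than indexing means a missing parameter becomes a readable message for the client instead of an unhandled KeyError.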

bazettfraga · 2024-02-21T16:18:20Z

Unless I am mistaken, frustratingly, all of yt-dlp's errors seem to raise DownloadError, which makes any error handling a nightmare.
e.g.:

yt_dlp.utils.PostProcessingError: ffprobe and ffmpeg not found. Please install or provide the path using --ffmpeg-location

During handling of the above exception, another exception occurred:

...

yt_dlp.utils.DownloadError: ERROR: Postprocessing: ffprobe and ffmpeg not found. Please install or provide the path using --ffmpeg-location

This seems to be because every function in YoutubeDL.py calls report_error on exception, which later raises DownloadError in the trouble function.
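
Since the traceback above shows the second exception being raised while the first was being handled, Python should have linked the two. A sketch of walking that chain to recover the underlying error follows. The two exception classes are local stand-ins rather than yt-dlp's own, and whether yt-dlp always leaves the original exception on the chain is an assumption based only on this one traceback.

```python
class PostProcessingError(Exception):
    """Stand-in for the underlying error."""


class DownloadError(Exception):
    """Stand-in for the generic wrapper error."""


def root_exception(exc: BaseException) -> BaseException:
    """Follow __cause__ / __context__ links back to the first exception raised."""
    seen = {id(exc)}
    current = exc
    while True:
        nxt = current.__cause__ or current.__context__
        if nxt is None or id(nxt) in seen:
            return current
        seen.add(id(nxt))
        current = nxt


def fake_download() -> None:
    """Mimic the wrapping pattern seen in the traceback."""
    try:
        raise PostProcessingError("ffprobe and ffmpeg not found")
    except PostProcessingError as e:
        # Raising inside the except block sets __context__ automatically.
        raise DownloadError(f"ERROR: Postprocessing: {e}")
```

A handler could then catch the wrapper once and branch on `type(root_exception(err))` to choose the message shown to the client.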

OmniTroid · 2024-02-26T14:10:22Z

There is an open question whether this constitutes permitted use of litterbox.catbox.moe or not. Let's not merge before we have a clear answer here. I'll come back to it.

Salanto · 2024-02-26T14:21:07Z

Wouldn't it be better to push the media to the server's WebAO repo?
That way it can be used by the client through media streaming directly, without even having to bother with filesharing TOS nightmares.

It would avoid all legal questions and ensure operation for as long as the server itself is operational.

bazettfraga · 2024-02-27T01:12:37Z

I was wondering myself whether using catbox as a CDN is a good idea at all, as it might be an abuse of the service, but for the purposes of testing it has done its job. It is also the reason why I implemented the ability to change how and where the files are passed, though.

Wouldn't it be better to push the media to the servers WebAO repo?

I'm unaware of how this works, as I've not looked into it. Could you point me to the right resources for that service?
As an aside, the more I've thought about this project, the more sane it has felt to implement the YT downloading/streaming functionality in the client itself rather than the server, something I had not considered at first because I thought this was a simpler problem than it was.

Salanto · 2024-02-27T06:17:55Z

I'm unaware of how this works as I've not looked into it. Could you point me to the right resources for that service?

If the Webserver and the AO2 Server run on the same machine, which is the case for 99.9% of servers, the software can just store the saved file in the directory the Webserver serves data from.

bazettfraga · 2024-02-27T22:07:20Z

If the Webserver and the AO2 Server run on the same machine

Ah, I think I see what you mean now. Wouldn't this still be an issue for server hosts who only want to use KFO-Server without anything else, however? In that case, it may be simpler to have the server itself open a simple webserver to serve the files from, as this would only require the host to open an additional port on their router.

Unless I am mistaken, the only webAO repo I have found is AttorneyOnline/webAO, and requiring the host to set this up for the purpose of serving music feels redundant and needlessly complicated.
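
For the idea of the server opening its own simple webserver, a rough standard-library sketch. The function name start_music_server and its parameters are hypothetical, and nothing here is hardened for public exposure.

```python
import threading
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def start_music_server(directory: str, host: str = "0.0.0.0", port: int = 0) -> ThreadingHTTPServer:
    """Serve files from `directory` over HTTP on a background thread."""
    # Bind the handler to one directory so nothing outside it is served.
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    httpd = ThreadingHTTPServer((host, port), handler)
    # port=0 lets the OS pick a free port; read it from httpd.server_address.
    threading.Thread(target=httpd.serve_forever, daemon=True).start()
    return httpd
```

The caller would build the song URL from the host's public address plus `httpd.server_address[1]`, and call `httpd.shutdown()` when the server stops. Note that SimpleHTTPRequestHandler also produces directory listings, which a host may not want.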

lambdcalculus · 2024-02-29T03:41:07Z

Unless I am mistaken, but the only webAO repo I have found is AttorneyOnline/webAO and requiring the host to set this up for the purpose of serving music feels redundant and needlessly complicated.

That repo is for the webAO website - that's not what hosts would need to set up. A webAO asset server (which is what is being referred to) is simply a file server, which can be set up with, say, nginx or Apache. The AO server then sends its URL during the handshake (iirc?), which is what webAO clients use for assets. I think it's not that big of an ask to set this up, at least for 'big' hosts who are hosting their AO servers 24/7 (and most of them already have this), since it's necessary if they want to pull in webAO users. You also get the advantage that you can choose the filename, and thus the URL, to be something pretty, which means you won't get some gibberish like "Phoenix has played a song: gdjkshdfgjl".

If you want this feature to also be available to the 'little guy,' then I think just catbox might be fine in those cases? Since, as far as demand goes, it wouldn't be that much more than users themselves using yt-dlp to download YT vids and uploading them to catbox to use on AO. It just automates the process. And I think it's unlikely small, private, and/or non-permanent servers would end up using it enough to warrant attention.

bazettfraga · 2024-02-29T09:44:05Z

Ah, I see. I was completely unaware of this, thank you for letting me know!
Then yes, if this is something major servers are utilizing already I see no reason not to implement support for it already, and I'll have to look into it over the coming days.

If you want this feature to also be available to the 'little guy,' then I think just catbox might be fine in those cases

That was my (albeit shortsighted) original intention when creating the feature in such a way, as I know a few server owners who only host for the purpose of one or two of their own RPs, which I feel should not be neglected.

oldmud0

CW personally requested me to look at this, so here I am

oldmud0 · 2024-04-12T04:52:44Z

server/tsuserver.py

+                     "manjaro": "pacman -S ffmpeg",
+                     "fedora": "dnf install ffmpeg",
+                     "opensuse-leap": "zypper install ffmpeg-4", #This surely won't deprecate.
+                     "opensuse-tumbleweed": "zypper install ffmpeg-4" #Are they fucking serious?


truly idiot proof design

bazettfraga · 2024-04-14T14:51:20Z

Honestly, the code was made just to get shoved out the door so I could fix this issue (though it caused a few others that I should resolve like an idiot).

oldmud0 · 2024-04-12T04:56:24Z

server/client_manager.py

+                yt_parse = urlparse(song)
+
+                if "youtube" in yt_parse.netloc:
+                        sys.setrecursionlimit(1200) #FIXME: figure out what's causing a RecursionError. Python's regex module should not recurse so far that it causes this to happen.


This looks sus. Check the stack trace. If this keeps being an issue then the regex that parse_qs or parse_qsl uses is too complicated and you will need to write an iterative version. You do not want people crashing the darn server from writing long links.

bazettfraga · 2024-04-14T14:55:09Z

Extremely suspect, I fully agree, though I now believe yt-dlp to be a bit brain-damaged in some ways.
When I was working on the project, I considered finding alternative ways of downloading songs from YouTube due to the variety of annoyances it caused.

oldmud0 · 2024-04-12T04:58:25Z

server/client_manager.py

+                        'preferredcodec': 'vorbis',
+                        'preferredquality': '192',
+                    }],
+                    "outtmpl": 'storage/tmp/%(title)s.%(ext)s',


Why don't you use Python's NamedTemporaryFile object, which gives you a file and filename you can use? This way you don't need to worry about this folder having the correct permissions set by the admin, or about cleaning it up.

bazettfraga · 2024-04-14T14:53:01Z

At the time, I was uncertain whether I wanted the files to be reuploaded from local storage (if the link somehow expired) rather than redownloaded, and I would have made appropriate configs for that.
Of course, it would be smarter to host a local webserver to serve these files once they are pulled, but that does not resolve the core issue of the server waiting for the download function to finish when it is called, which I have not looked at at all, as I am preoccupied due to personal circumstances.
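
A sketch of the temporary-storage idea from the review comment, using tempfile.TemporaryDirectory rather than NamedTemporaryFile. That swap is my assumption: the output template contains %(ext)s, so the final filename is chosen by yt-dlp, and a directory seems a better fit than a single pre-named file. The helper build_ydl_output_opts is hypothetical, and the placeholder file stands in for a real download.

```python
import os
import tempfile


def build_ydl_output_opts(tmp_dir: str) -> dict:
    """Return only the output-related option, pointing inside tmp_dir."""
    return {"outtmpl": os.path.join(tmp_dir, "%(id)s.%(ext)s")}


with tempfile.TemporaryDirectory(prefix="yt_") as tmp_dir:
    opts = build_ydl_output_opts(tmp_dir)
    # A real run would merge opts into the yt-dlp options, download, and
    # upload the result before leaving this block. Here a placeholder file
    # stands in for the downloaded song.
    placeholder = os.path.join(tmp_dir, "example.ogg")
    with open(placeholder, "wb") as fh:
        fh.write(b"")
    existed_inside = os.path.exists(placeholder)

# On exit the directory and everything in it has been removed.
removed_after = not os.path.exists(tmp_dir)
```

This trades away the option of reuploading from local storage, since nothing survives past the block.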

mposs00 · 2024-04-12T14:17:31Z

please for the love of god do not merge this ever

bazettfraga · 2024-02-21T15:10:50Z

server/client_manager.py

+        def mirror_youtube(self, yt_url):
+            try:
+                with yt_dlp.YoutubeDL(self.ydl_opts) as ydl:
+                    info = ydl.extract_info(yt_url, download=True)


I have just discovered this function locks the entire server until it concludes (both the downloading and conversion). Implementing async into the code might prove problematic for this one function. I will need to think on how to proceed with this in mind.
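
One possible direction, assuming the server runs on an asyncio event loop: hand the blocking call to a worker thread with run_in_executor so the loop keeps serving other clients. Everything below is a stand-in. blocking_download simulates the yt-dlp call with a sleep, and heartbeat represents other server work.

```python
import asyncio
import time


def blocking_download(url: str) -> str:
    """Stand-in for the blocking download and conversion step."""
    time.sleep(0.2)
    return f"downloaded:{url}"


async def heartbeat(ticks: list) -> None:
    """Represents other server work that should keep running meanwhile."""
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.monotonic())


async def main() -> tuple:
    ticks: list = []
    loop = asyncio.get_running_loop()
    # The default executor is a thread pool; the event loop stays free.
    download = loop.run_in_executor(None, blocking_download, "https://example.invalid/v")
    await heartbeat(ticks)
    result = await download
    return result, ticks


result, ticks = asyncio.run(main())
```

The heartbeat ticks accumulate while the download sleeps, which is the behaviour the server would need. Any shared state touched from the worker thread, such as a cache dictionary, would still need care.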

bazettfraga added 8 commits February 8, 2024 14:52

ytdlp functionality

4e6b044

fix stray newlines lol

2561d73

include configs for youtube music features

0a27bb2

cache links in memory

a591c81

remove unused function call

b294bcf

ffmpeg installation automated for yt use

ea5a49c

fixed dogshit zip extraction

c83db9d

it worked on linux, okay?

1e51ad0

bazettfraga force-pushed the yt-dlp branch from 4d2f27c to 9a51fe1 Compare February 10, 2024 22:36

make check for bin/ffmpeg* work on both MacOS AND Windows now.

a9d2f76

bazettfraga force-pushed the yt-dlp branch from 9a51fe1 to a9d2f76 Compare February 10, 2024 22:38

bazettfraga added 2 commits February 10, 2024 23:41

fix indentation for error report...

6654711

close file stream.

e33e2b7

bazettfraga added 2 commits February 16, 2024 10:57

add youtube to url.txt

1eb12dc

make ffmpeg_location configurable

5f22af8

OmniTroid suggested changes Feb 20, 2024

bazettfraga added 2 commits February 21, 2024 14:14

logging, fix default ffmpeg_location breaking ffmpeg in path.

8eee7ae

using urllib parse to detect youtube URLs instead of regex.

aa3cac6

OmniTroid reviewed Feb 21, 2024

oldmud0 reviewed Apr 12, 2024

bazettfraga commented Apr 14, 2024
