Issue with decoding .html file #11
Comments
Hey, @vgzhn, I'd assume that the
I had the same error, but changing the BeautifulSoup encoding to UTF-8 seemed to fix it.
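For anyone wondering what "changing the encoding to UTF-8" can look like in practice, here is a minimal sketch. It assumes the fix is to pass the encoding explicitly when the HTML file is read, rather than relying on the Windows default (cp1252 in the traceback). The helper name `read_html_utf8` and the sample file are hypothetical and not taken from the project, and the commenter's exact change may have been different.

```python
import tempfile
from pathlib import Path


def read_html_utf8(path: Path) -> str:
    """Read an HTML file as UTF-8 instead of the platform default encoding."""
    return path.read_text(encoding="utf-8")


# Hypothetical stand-in for the Takeout watch-history file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "watch-history.html"
    # Curly quotes are an example of text that cp1252 cannot always decode.
    sample.write_text("<p>Watched \u201cSome video\u201d</p>", encoding="utf-8")
    html = read_html_utf8(sample)

# ascii() keeps the output printable on consoles that are not UTF-8.
print(ascii(html))
```

The returned string can then be handed to `BeautifulSoup(html, 'html.parser')`, as in the line of the script shown in the traceback.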
I'm having the same exact problem and I have no idea what to do.
@rstebee if you can post some or all of the data that's causing trouble, that'll help a lot.
Hi @Jessime, first of all, thank you very much for this awesome work. I was trying out this project last night and ran into the exact error posted here. After a bit of looking around, I found two threads on Stack Overflow that helped me find a workaround: So, in the file
It seems like the
Hope this helps with your problem. I'm also a Python noob, so please feel free to propose a better solution 👏.
Hey all, this commit should fix things up: Thanks for reporting the issues!
Welcome! Extracting video urls from Takeout.
Traceback (most recent call last):
  File "youtube_history.py", line 369, in <module>
    analysis.run()
  File "youtube_history.py", line 348, in run
    self.download_data()
  File "youtube_history.py", line 155, in download_data
    soup = BeautifulSoup(watch_history.read_text(), 'html.parser')
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2288.0_x64__qbz5n2kfra8p0\lib\pathlib.py", line 1236, in read_text
    return f.read()
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2288.0_x64__qbz5n2kfra8p0\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 3321: character maps to <undefined>
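A small sketch of how this kind of error can arise. It assumes, without having seen the actual file, that the offending byte belongs to a UTF-8 encoded character such as the right curly quote (U+201D), whose last byte is 0x9d. That byte has no mapping in cp1252, the default codec shown in the traceback.

```python
# U+201D (right double quotation mark) takes three bytes in UTF-8.
data = "\u201d".encode("utf-8")
print(data)  # b'\xe2\x80\x9d'

# Decoding the same bytes as cp1252 fails on the final byte, 0x9d.
message = ""
try:
    data.decode("cp1252")
except UnicodeDecodeError as exc:
    message = str(exc)
print(message)

# Decoding as UTF-8 recovers the original character.
round_trip = data.decode("utf-8")
print(round_trip == "\u201d")  # True
```

If that assumption holds, position 3321 would look like an ordinary quotation mark in an editor, which would explain why nothing obvious stands out there.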
I've tried locating position 3321 and couldn't find anything obvious to remove. Inserting "file = open(filename, errors="ignore")" didn't work for me either.
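One possible reason the extra `open(...)` line had no effect, offered as a guess: the script in the traceback reads the file through `watch_history.read_text()`, so a separate `open` call elsewhere never touches that read. `Path.read_text` accepts `encoding` and `errors` arguments itself. The sketch below uses a hypothetical helper, `read_text_lenient`, and a made-up file to show a fallback that replaces undecodable bytes instead of raising.

```python
import tempfile
from pathlib import Path


def read_text_lenient(path: Path, encoding: str = "utf-8") -> str:
    """Read text, replacing undecodable bytes with U+FFFD instead of raising."""
    return path.read_text(encoding=encoding, errors="replace")


# Hypothetical file containing one byte (0xff) that is never valid UTF-8.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "broken.html"
    sample.write_bytes(b"ok \xff end")
    text = read_text_lenient(sample)

print(ascii(text))  # 'ok \ufffd end'
```

Replacing or ignoring bytes silently loses data, so setting the correct encoding is usually the better first step; this is only a fallback for files that are genuinely damaged.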
I'm an absolute beginner with Python.
Maybe that could be avoided by using the .json takeout?