-
Notifications
You must be signed in to change notification settings - Fork 10.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix npo support #31976
base: master
Are you sure you want to change the base?
Fix npo support #31976
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your work!
I've made a few suggestions, really just conventions, so I'll let the CI test run now.
Co-authored-by: dirkf <[email protected]>
I have accepted all suggestions here. Sorry I missed some conventions. |
* simplify comment * force CI
@dirkf Is it ready to merge now? |
I have had great success using this fork to download NPO Start videos. Greatly appreciated @bartbroere. It would be nice to see these changes merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to merge this but it needs valid tests.
All the current test URLs redirect to the home page, but maybe that doesn't happen in NL. Please ensure that there is at least one valid test (with playable content, non-DRM) and mark any invalid URLs with
'skip': 'Content expired',
(or ...: 'only available in NL',
, or whatever is appropriate) after the info_dict
. If an invalid test URL doesn't match the extractor's _VALID_URL
, it should be deleted (but I don't think that applies here).
If URLs are only valid in NL, or with an account, please post a console log of successful tests.
I've made some additional suggestions that I might improve with the aid of a usable test URL.
Thanks for the suggested improvements. I'll address all the feedback, and re-request review once I feel it's all done. |
@dirkf Looking into it I realised this is probably caused by the recently released new NPO app and website. I'll check all tests and maybe add some new ones |
yt-dlp/yt-dlp#9319 addresses the same issues. I think (speaking for the Dutch users a bit) that support for npo.nl is the main feature, and all the other sites often just embed the NPO player in some way. Since almost all of the other sites no longer work in the same way as before, I would propose throwing away extractors for many other domains, since rebuilding them is probably a lot quicker. |
Re-implementing these is quicker for the cases where that's even still possible
It's a bit vexing that the yt-dlp PR has done the same job, but by all means plunder any useful stuff from there. I would generally merge/back-port an updated yt-dlp extractor but generally also only when ours doesn't work. |
I'm not convinced that the pull request on On the other hand, my branch probably doesn't work for some of the So I think we can do a nice bit of "cross-plundering" between my |
Great. Also, if there are standard patterns for the discarded sites where NPO videos are embedded, we can add a class variable @classmethod
def _extract_urls(cls, webpage):
def yield_urls():
for p in cls._EMBED_REGEX:
for m in re.finditer(p, webpage):
yield m.group('url')
return list(yield_urls()) Eventually the webpage extraction system from yt-dlp should be pulled in, with the method standardised in |
Co-authored-by: Roy <[email protected]>
Co-authored-by: dirkf <[email protected]>
Co-authored-by: dirkf <[email protected]>
I'll be mostly offline during April, so this is just a heads up that this PR will not see much progress for a few weeks. I didn't forget about it though, and plan to continue working on it in May. |
I may be able to save you much of that work. I've run through the d/l tests locally and adjusted the code somewhat to match. If anyone has an answer to #31976 (comment), I'll be able to settle that extractor and push an update that should be close to mergeable. |
Co-authored-by: dirkf <[email protected]>
It successfully downloads all test URLs right now, but I'm not sure what it should say in the The example URLs all download successfully with this command python3.11 -m youtube_dl https://npo.nl/start/serie/vpro-tegenlicht/seizoen-11/zwart-geld-de-toekomst-komt-uit-afrika https://npo.nl/start/serie/zembla/seizoen-2015/wie-is-de-mol-2/ https://www.bnnvara.nl/videos/27455 https://ongehoordnederland.tv/2024/03/01/korte-clips/heeft-preppen-zin-betwijfel-dat-je-daar-echt-iets-aan-zult-hebben-bij-oorlog-lydia-daniel/ https://www.zapp.nl/programmas/zappsport/gemist/POMS_AT_811523 https://schooltv.nl/item/zapp-music-challenge-2015-zapp-music-challenge-2015 https://hetklokhuis.nl/dossier/142/zoek-het-uit/tv-uitzending/2987/aliens https://www.vpro.nl/programmas/tegenlicht/kijk/afleveringen/2015-2016/offline-als-luxe.html https://anderetijden.nl/programma/1/Andere-Tijden/aflevering/676/Duitse-soldaten-over-de-Slag-bij-Arnhem but I can't get the test framework to pass. That gives me the error:
So even with the failing tests I think this is an improvement over the current |
Before submitting a pull request make sure you have:
In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:
What is the purpose of your pull request?
Fix support for NPO sites
This fixes two things that have been changed on the NPO websites:The current video player no longer returns the token in the JSON body, but instead provides us with an XSRF token in the cookie.A second call changed from GET to POST and should include this XSRF token.This branch started out as a small fix, but in the ~11 months the PR was open the NPO site was updated heavily, so now it changes a lot more than the things above. Many of the broadcaster (NL: omroep) sites' extractors (vpro etc.) no longer worked, so these have been removed entirely. I'm always willing to look into re-implementing support for some of these, if that's still possible. However, typically the broadcaster's website just embeds the npo.nl player, so grabbing the same media from the npo.nl URL is recommended.