[XVideos] Support playlists, searches and channels #30774

Open
wants to merge 10 commits into
base: master

dirkf
Contributor

@dirkf dirkf commented Mar 25, 2022

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense, except for code from yt-dlp for which either this or the below has been separately asserted
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

The XVideos extractor supported single videos presented via various URL formats. The site also offers various playlist-like pages:

  • actual playlists: /favorite/ID/SLUG
  • related videos/playlist: video URL + #_related-...
  • channel activity: /...channels/ID
  • channel videos: channel URL + #_tabVideos
  • channel favourites (playlist): channel URL + #_tabFavorites
  • search results: ...?k=(ID).

This PR tries to support extracting from those pages. It also pulls in small changes from the yt-dlp extractor.

To do:

  • more tests
  • simplify?

@dirkf dirkf mentioned this pull request Mar 25, 2022
11 tasks
@afterdelight
Contributor

pls merge

@dirkf
Contributor Author

dirkf commented Mar 29, 2022

Before I merge it, please test it further if you can. Only a few URLs were used to develop the new extractor functions.

@afterdelight
Contributor

afterdelight commented Mar 29, 2022

It works for channels, except for the uploader output: -o "%(uploader)s" created a folder named NA instead of one named after the uploader.

@dirkf
Contributor Author

dirkf commented Mar 30, 2022

These metadata fields are in the hydration JSON for, eg, https://www.xvideos.com/video64379435/verification_video:

    'uploader_id': 505962135,
    'uploader': 'tianmeichuanmei',

#30689 is supposed to

  • extract uploader, uploader_id and uploader_url

I guess this uses those fields.

With just this PR, use %(playlist_id)s to get the uploader.

@afterdelight
Contributor

OK, then will this PR be merged with the other PR?

@xspish

xspish commented Apr 12, 2022

Hey guys! Here just to comment that it would be awesome to have channel dl support, as well as uploader/uploader_id as --output option for xvideos.

I've been using all sorts of workarounds with other apps to get xvideos downloads working correctly. In yt-dlp, the above options work flawlessly on pornhub, but it would be awesome to get them for xvideos too.

thanks and I hope it gets implemented soon :)

@zapper

zapper commented Jun 23, 2022

There is some issue once the error "requested format not available" is thrown during download of "#_tabVideos".

[download] Downloading video 12 of 575
[XVideos] 59824163: Downloading webpage
[XVideos] 59824163: Downloading m3u8 information
[XVideos] 59824163: Checking hls-480p video format URL
[XVideos] 59824163: Checking hls-720p video format URL
[XVideos] 59824163: Checking hls-360p video format URL
[XVideos] 59824163: Checking hls-250p video format URL
ERROR: requested format not available
Traceback (most recent call last):
  File "/home/zap/git/youtube-dl_PR_30774/youtube_dl/YoutubeDL.py", line 816, in wrapper
    return func(self, *args, **kwargs)
  File "/home/zap/git/youtube-dl_PR_30774/youtube_dl/YoutubeDL.py", line 848, in __extract_info
    return self.process_ie_result(ie_result, download, extra_info)
  File "/home/zap/git/youtube-dl_PR_30774/youtube_dl/YoutubeDL.py", line 882, in process_ie_result
    return self.process_video_result(ie_result, download=download)
  File "/home/zap/git/youtube-dl_PR_30774/youtube_dl/YoutubeDL.py", line 1684, in process_video_result
    raise ExtractorError('requested format not available',
youtube_dl.utils.ExtractorError: requested format not available

Then the download is interrupted... Is this expected behaviour?

The used command line options are:

-f "bestvideo[height>=2160]+bestaudio/best[height>=2160]/bestvideo[height>=1440]+bestaudio/best[height>=1440]/bestvideo[height>=1080]+bestaudio/best[height>=1080]/bestvideo[height>=720]+bestaudio/best[height>=720]" --ignore-errors --no-progress --playlist-random --verbose --download-archive .download_archive

Also, I can confirm that the IDs are being successfully stored for --download-archive.

@dirkf
Contributor Author

dirkf commented Jun 23, 2022

Surely your format selection is equivalent to (bestvideo+bestaudio/best)[height >=? 720]

Post the full verbose log including the URL: perhaps use --playlist-items 12 to get the problem item directly, if only you weren't using --playlist-random.

@zapper

zapper commented Jun 25, 2022

I found the issue - the root cause is actually sitting in front of the keyboard.

So I can confirm that your PR works like a charm.

There was a bug in my wrapper script: it stored the command-line arguments in a variable that was wrongly quoted, so most of them reached youtube-dl as one single argument and the "-f" parameter did not receive the intended value:

[debug] Command-line args: ['-f bestvideo[height>=2160]+bestaudio/best[height>=2160]/bestvideo[height>=1440]+bestaudio/best[height>=1440]/bestvideo[height>=1080]+bestaudio/best[height>=1080]/bestvideo[height>=720]+bestaudio/best[height>=720] --ignore-errors --no-progress --playlist-random', '--verbose', '--download-archive', '/srv/x3/Hub/.download_archive_all', 'https://www.xvideos.com/amateur-channels/closeupfantasy#_tabVideos']

Replacing my long argument string with your short version helped me a lot to spot the problem visually. It's usually better to have everything more readable. Thanks for that. So my issue was completely unrelated to your PR.

Sorry for wasting your time. I am looking forward to seeing your changes in the master branch soon, and later in the official binary.
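
For anyone hitting the same symptom, here is a minimal Python sketch of the quoting mistake described above. It is not the actual wrapper script (which was not posted); the variable names are hypothetical and the option string is shortened to the equivalent form suggested earlier in the thread. When the whole option string is handed over as one element, the option parser most likely treats everything after "-f" as the format value, which would explain both the "requested format not available" error and --ignore-errors having no effect.

```python
import shlex

# Option string as a wrapper might keep it in a single variable
# (hypothetical name, shortened selection from the thread).
options = '-f "(bestvideo+bestaudio/best)[height >=? 720]" --ignore-errors --no-progress'

# Wrong: the variable is passed as ONE argument, like the first
# element of the "[debug] Command-line args" list shown above.
bad_argv = ['youtube-dl', options, '--verbose']

# Better: split the string the way a POSIX shell would, so each
# option and each value becomes its own argument.
good_argv = ['youtube-dl'] + shlex.split(options) + ['--verbose']

print(len(bad_argv), len(good_argv))
for arg in good_argv:
    print(repr(arg))
```

Comparing the split list with the "[debug] Command-line args" line of a verbose run is a quick way to check that a wrapper passes what was intended.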

@dirkf dirkf linked an issue Jan 5, 2023 that may be closed by this pull request
6 tasks
@dirkf dirkf marked this pull request as ready for review October 2, 2023 02:42
@nicolaasjan

nicolaasjan commented Oct 3, 2023

Tested on channels and that worked, but not on profiles:

youtube-dl -v https://www.xvideos.com/profiles/lewdgamerx1990#
[debug] System config: []
[debug] User config: ['--rm-cache-dir', '-i', '-o', '/dev/shm/test-ytd/%(title)s.%(ext)s', '-f', 'bestvideo[height<=1080][ext=mp4][vcodec^=avc]+bestaudio[ext=m4a]/best[ext=mp4]/best', '--no-mtime', '--embed-thumbnail', '--force-ipv4']
[debug] Custom config: []
[debug] Command-line args: ['-v', 'https://www.xvideos.com/profiles/lewdgamerx1990#']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2023.10.03
[debug] Lazy loading extractors enabled
[debug] Single file build
[debug] Python 3.8.10 (CPython x86_64 64bit) - Linux-5.4.0-163-generic-x86_64-with-glibc2.29 - OpenSSL 1.1.1f  31 Mar 2020 - glibc 2.31
[debug] exe versions: ffmpeg N-112061-g654e4b00e2-Nico-20230914, ffprobe N-112061-g654e4b00e2-Nico-20230914, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {}
Removing cache dir /home/nico/.cache/youtube-dl ..
[XVideosPlaylist] lewdgamerx1990: Downloading webpage
[download] Downloading playlist: lewdgamerx1990
[XVideosPlaylist] playlist lewdgamerx1990: Collected 0 video ids (downloading 0 of them)
[download] Finished downloading playlist: lewdgamerx1990

@dirkf dirkf force-pushed the df-xvideos-playlist-patch branch from da0092c to 2883baa Compare October 16, 2023 03:22
@githubafterdark

I tested this and for me it didn't seem to work with models pages, like for example:

 ~/Desktop/youtube-dl/bin git:(df-xvideos-playlist-patch) ✗ '/home/account/Desktop/youtube-dl/bin/youtube-dl'  https://www.xvideos.com/models/lizvicious1        
[generic] lizvicious1: Requesting header
WARNING: Falling back on generic information extractor.
[generic] lizvicious1: Downloading webpage
[generic] lizvicious1: Extracting information
ERROR: Unsupported URL: https://www.xvideos.com/models/lizvicious1

It didn't work with #_tabVideos appended either. I don't think I'm doing anything incorrectly:

 ~/Desktop/youtube-dl/bin git:(df-xvideos-playlist-patch) ✗ '/home/account/Desktop/youtube-dl/bin/youtube-dl'  https://www.xvideos.com/models/lizvicious1#_tabVideos
[generic] lizvicious1#_tabVideos: Requesting header
WARNING: Falling back on generic information extractor.
[generic] lizvicious1#_tabVideos: Downloading webpage
[generic] lizvicious1#_tabVideos: Extracting information
ERROR: Unsupported URL: https://www.xvideos.com/models/lizvicious1#_tabVideos

@dirkf dirkf force-pushed the df-xvideos-playlist-patch branch from 150aa26 to d4ba4d2 Compare October 22, 2023 02:44
dirkf added 8 commits October 22, 2023 03:58
* check all playlist counts, not just max
* also consider any actual playlist in the test case
* add uploader, tag, performer and view_count extraction (closes ytdl-org#30689)
* add dis/like_count extraction
…star pages

* various -channels/...
* profiles
* pornstars, models
* tabs within the above, with sorting and pagination where applicable
* also quickie lists and videos
@dirkf dirkf force-pushed the df-xvideos-playlist-patch branch from d4ba4d2 to 629d407 Compare October 22, 2023 03:00
@dirkf
Contributor Author

dirkf commented Oct 22, 2023

You may not have installed the working code, which gives this:

$ python -m youtube_dl --flat-playlist 'https://www.xvideos.com/models/lizvicious1#_tabVideos'
[XVideosChannel] lizvicious1: Downloading webpage
[download] Downloading playlist: Lizvicious - Pornstar page (videos,all)
[XVideosChannel] lizvicious1/videos: Downloading webpage
[XVideosChannel] lizvicious1/videos (+1): Downloading webpage
[XVideosChannel] lizvicious1/videos (+2): Downloading webpage
[XVideosChannel] lizvicious1/videos (+3): Downloading webpage
[XVideosChannel] playlist Lizvicious - Pornstar page (videos,all): Downloading 109 videos
[download] Downloading video 1 of 109
[download] Downloading video 2 of 109
[download] Downloading video 3 of 109
...
[download] Downloading video 107 of 109
[download] Downloading video 108 of 109
[download] Downloading video 109 of 109
[download] Finished downloading playlist: Lizvicious - Pornstar page (videos,all)
$

See, eg, https://stackoverflow.com/questions/13561618/pip-how-to-install-a-git-pull-request.

@dirkf dirkf force-pushed the df-xvideos-playlist-patch branch from 629d407 to cc1657b Compare October 22, 2023 03:39
@githubafterdark

githubafterdark commented Oct 22, 2023

Interesting, I didn't think I did anything incorrectly.

I cloned the repo, checked out the branch and ran make; then I ran the binary it produced, which gave me that output. Here are the commands I used:

git clone https://github.com/dirkf/youtube-dl/
cd youtube-dl
git checkout df-xvideos-playlist-patch
make

Then I ran it from the bin folder.

With that said, your pip install suggestion worked for me, thanks!

Here is the command I used: pip install git+https://github.com/ytdl-org/youtube-dl.git@refs/pull/30774/head

@dirkf
Contributor Author

dirkf commented Oct 22, 2023

After checking out the code, you can run it from the youtube-dl directory (the one that contains youtube_dl) using python -m youtube_dl ..., where you would normally have youtube-dl ..., so make isn't needed. Here python is whatever you use to invoke the Python that you wish to use, nowadays typically python3.

@master-leonardo

Hey, could you please update this code for the new site changes?

@dirkf
Contributor Author

dirkf commented Apr 12, 2024

You mean like yt-dlp/yt-dlp#9502 ?

@master-leonardo

> You mean like yt-dlp/yt-dlp#9502 ?

Yes and no.
Yes as in it works for individual downloads.
And no, as that change doesn't support profiles, channels, searches, etc.

@dirkf
Contributor Author

dirkf commented Apr 15, 2024

If anyone would like to document the site changes with regard to the various types of pages that the site supports and the PR extractor is meant to support, that would make it a lot quicker to update.

@dirkf dirkf mentioned this pull request Apr 27, 2024
5 tasks
@Lux-Hue

Lux-Hue commented May 13, 2024

> If anyone would like to document the site changes with regard to the various types of pages that the site supports and the PR extractor is meant to support, that would make it a lot quicker to update.

The most important change is the formatting of the video links.

Was:
https://www.xvideos.com/video66313831/hottest_car_sex_ever_tesla_autopilot_driving

Is now:
https://www.xvideos.com/video.kfkdepm21e2/hottest_car_sex_ever_tesla_autopilot_driving

For some reason (I haven't had much time to dive into the code), youtube-dl extracts the old-style video URL, which is now invalid.
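
To make the difference between the two URL forms concrete, here is a rough Python sketch of a pattern that accepts both. It is not the extractor's actual _VALID_URL; the pattern, the function name and the assumption that new-style IDs consist only of lowercase letters and digits are all hypothetical, based solely on the two example URLs above.

```python
import re

# Hypothetical pattern: the old form is "video" + digits, the new form
# is "video." + an alphanumeric token (character set assumed from one example).
VIDEO_URL_RE = re.compile(
    r'https?://(?:www\.)?xvideos\.com/video'
    r'(?:(?P<old_id>\d+)|\.(?P<new_id>[0-9a-z]+))'
    r'/(?P<slug>[^/?#]+)'
)


def parse_video_url(url):
    """Return (kind, video_id, slug), or None if the URL does not match."""
    m = VIDEO_URL_RE.match(url)
    if not m:
        return None
    if m.group('old_id'):
        return ('old', m.group('old_id'), m.group('slug'))
    return ('new', m.group('new_id'), m.group('slug'))


# "some_title" is a placeholder slug; the IDs are the ones quoted above.
print(parse_video_url('https://www.xvideos.com/video66313831/some_title'))
print(parse_video_url('https://www.xvideos.com/video.kfkdepm21e2/some_title'))
```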

@Lux-Hue

Lux-Hue commented May 13, 2024

The way I see it, the issue is here:

extractor/xvideos.py around line 914

def _extract_videos(self, url, playlist_id, num, page):

This is the input of one video's data in the page variable.

{
  "id": 72799231,
  "u": "/prof-video-click/model/creamy-spot1/uopdkffea37/big_dildo_for_creamy_pussy_squirt_watch_full_uncensored_video_in_red_subscription_",
  "i": "https://cdn77-pic.xvideos-cdn.com/videos/thumbs169/28/6d/6f/286d6f37c99ae3ceee9d25ecbe688a59/286d6f37c99ae3ceee9d25ecbe688a59.27.jpg",
  "il": "https://cdn77-pic.xvideos-cdn.com/videos/thumbs169ll/28/6d/6f/286d6f37c99ae3ceee9d25ecbe688a59/286d6f37c99ae3ceee9d25ecbe688a59.27.jpg",
  "if": "https://cdn77-pic.xvideos-cdn.com/videos/thumbs169lll/28/6d/6f/286d6f37c99ae3ceee9d25ecbe688a59/286d6f37c99ae3ceee9d25ecbe688a59.27.jpg",
  "ip": "https://cdn77-pic.xvideos-cdn.com/videos/thumbs169poster/28/6d/6f/286d6f37c99ae3ceee9d25ecbe688a59/286d6f37c99ae3ceee9d25ecbe688a59.27.jpg",
  "c": 10,
  "tf": "Big Dildo for Creamy Pussy Squirt (Watch full uncensored video in RED subscription)",
  "t": "Big Dildo for Creamy Pussy Squirt (Watch full u...",
  "d": "2 min",
  "r": "99%",
  "n": "247.4k",
  "v": false,
  "vim": 0,
  "vv": 0,
  "hm": 1,
  "h": 1,
  "hp": 1,
  "td": 0,
  "fk": 0,
  "ve": 0,
  "ui": 611724437,
  "p": "creamyspot",
  "pn": "Creamyspot",
  "pu": "/creamyspot",
  "ch": true,
  "pm": false,
  "ut": null,
  "iu": false
}

It seems the extractor is using id, which has the wrong value; instead it should use the u field, which contains uopdkffea37 (my best guess).

Hope this makes things easier
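
A rough Python sketch of that guess follows. It is not code from the PR; the function name is made up, and the rule that the new-style ID is the second-to-last path segment of u is only inferred from the single sample above, so it may well not hold for other page types.

```python
def id_from_listing_entry(entry):
    """Guess the new-style video ID from a listing entry's "u" path.

    Assumption (from one sample only): the ID is the second-to-last
    path segment, possibly prefixed with "video.". Falls back to the
    numeric "id" when "u" has no usable segments.
    """
    segments = [s for s in (entry.get('u') or '').split('/') if s]
    if len(segments) >= 2:
        candidate = segments[-2]
        if candidate.startswith('video.'):
            candidate = candidate[len('video.'):]
        return candidate
    if entry.get('id') is not None:
        return str(entry['id'])
    return None


# Trimmed-down entry: the ID values come from the sample above,
# the other path parts are placeholders.
entry = {
    'id': 72799231,
    'u': '/prof-video-click/model/some-model/uopdkffea37/some_title',
}
print(id_from_listing_entry(entry))
```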

@dirkf dirkf linked an issue Oct 2, 2024 that may be closed by this pull request
@ifixthat-gmx

Hello,
This PR hasn't been merged yet.
I am currently working on an implementation in my fork, where I am trying to combine the modifications from @DarkFighterLuke (#30689) and @dirkf with my own.
So I will probably open another PR when I have finished and cleaned up the code.
You can check the code in my fork (https://github.com/ifixthat-gmx/ytdl-org---youtube-dl/tree/tmp-xvideos).

@ifixthat-gmx

@dirkf
I have not added my User/Channel and Search IEs yet.
But I will add all the IEs over multiple commits, probably one IE at a time.
Can I @ you here whenever I have added them?

@ifixthat-gmx

ifixthat-gmx commented Oct 6, 2024

I just finished 'XVideosIE' with commit ebd496c0a125489ae260cab6ec72cbb2b52e16f0

but line 59 is still set to debug instead of normal (so that I can directly start importing my implementation of user profiles):
doprint = 1 # 0=normal , 1=debug

Edit 1)
XVideosUserIE should be done as well (see here).
It may produce duplicate videos in the playlist between quickies and normal videos (a fix will be applied later: I have to figure out where the dupes in quickies come from, as they are not shown as quickies in the web browser).

Edit 2)
In XVideosIE, I updated _VALID_URL to allow the old ID-scheme URL format (partially?).

@ifixthat-gmx

@dirkf
In case you have some time, could you check out my xvideos.py?
It would help quite a lot to have some feedback on it.

@Lux-Hue

Lux-Hue commented Nov 14, 2024

@ifixthat-gmx one issue I found when running myself, this https://github.com/ifixthat-gmx/ytdl-org---youtube-dl/blob/tmp-xvideos/youtube_dl/extractor/xvideos.py#L34 should be compat_urllib_parse. Other than that, it works as it should when downloading channels/model pages off of xvideos. 👍

@master-leonardo

> @ifixthat-gmx one issue I found when running myself, this https://github.com/ifixthat-gmx/ytdl-org---youtube-dl/blob/tmp-xvideos/youtube_dl/extractor/xvideos.py#L34 should be compat_urllib_parse. Other than that, it works as it should when downloading channels/model pages off of xvideos. 👍

if i wanted to try this fork, how do i install it? i can't find instructions anywhere. Does anyone have a step-by-step?
Not sure where i went wrong, if it's already broken or what

@ifixthat-gmx

> @ifixthat-gmx one issue I found when running myself, this https://github.com/ifixthat-gmx/ytdl-org---youtube-dl/blob/tmp-xvideos/youtube_dl/extractor/xvideos.py#L34 should be compat_urllib_parse. Other than that, it works as it should when downloading channels/model pages off of xvideos. 👍

it works because of https://github.com/ifixthat-gmx/ytdl-org---youtube-dl/blob/master/youtube_dl/compat.py#L97 , but thanks, I will update it next week (when I am working on it again) :)

@ifixthat-gmx

> > @ifixthat-gmx one issue I found when running myself, this https://github.com/ifixthat-gmx/ytdl-org---youtube-dl/blob/tmp-xvideos/youtube_dl/extractor/xvideos.py#L34 should be compat_urllib_parse. Other than that, it works as it should when downloading channels/model pages off of xvideos. 👍
>
> if i wanted to try this fork, how do i install it? i can't find instructions anywhere. Does anyone have a step-by-step? Not sure where i went wrong, if it's already broken or what

I am using docker to build a custom youtube-dl docker-image based on a specific python3 base-image.

  1. take python3
  2. install youtube-dl via https://github.com/ytdl-org/ytdl-nightly/archive/refs/tags/2024.08.07.zip
  3. copy some additions (config and ~patches) into image
    (on production / rollout) :
  4. copy modified extractors
  5. build docker image
    (while developing) :
  6. on docker run link/bind modified extractors

But I suggest you just install it like this:

  1. pip3 install --no-cache-dir https://github.com/ytdl-org/ytdl-nightly/archive/refs/tags/2024.08.07.zip
  2. download my xvideos.py and use it to replace the one in the extractor directory (also update extractors.py)
    see my repo/branch for the differences or this link

@master-leonardo

I get this error after I swap those two files (extractors.py and xvideos.py).
Is it my Python version? Do I need to install anything else?
I just enter the command youtube-dl; it worked before the swap.

C:\Windows\system32>youtube-dl
Traceback (most recent call last):
  File "c:\python38-32\lib\site-packages\youtube_dl\extractor\__init__.py", line 4, in <module>
    from .lazy_extractors import *
ModuleNotFoundError: No module named 'youtube_dl.extractor.lazy_extractors'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\python38-32\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\python38-32\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Python38-32\Scripts\youtube-dl.exe\__main__.py", line 4, in <module>
  File "c:\python38-32\lib\site-packages\youtube_dl\__init__.py", line 14, in <module>
    from .options import (
  File "c:\python38-32\lib\site-packages\youtube_dl\options.py", line 8, in <module>
    from .downloader.external import list_external_downloaders
  File "c:\python38-32\lib\site-packages\youtube_dl\downloader\__init__.py", line 23, in <module>
    from .niconico import NiconicoDmcFD
  File "c:\python38-32\lib\site-packages\youtube_dl\downloader\niconico.py", line 11, in <module>
    from ..extractor.niconico import NiconicoIE
  File "c:\python38-32\lib\site-packages\youtube_dl\extractor\__init__.py", line 9, in <module>
    from .extractors import *
  File "c:\python38-32\lib\site-packages\youtube_dl\extractor\extractors.py", line 1616, in <module>
    from .xvideos import (
  File "c:\python38-32\lib\site-packages\youtube_dl\extractor\xvideos.py", line 25, in <module>
    from ..utils import (
ImportError: cannot import name 'compat_urlparse' from 'youtube_dl.utils' (c:\python38-32\lib\site-packages\youtube_dl\utils.py)

C:\Windows\system32>

@nicolaasjan

nicolaasjan commented Dec 16, 2024

@master-leonardo
What happens if you change line 34 (compat_urlparse,) in xvideos.py to compat_urllib_parse,?

youtube-dl -v
[debug] System config: []
[debug] User config: ['--rm-cache-dir', '-i', '-o', '/dev/shm/test-ytd/%(title)s.%(ext)s', '-f', 'bestvideo[height<=1080][ext=mp4][vcodec^=avc]+bestaudio[ext=m4a]/best[ext=mp4]/best', '--no-mtime', '--embed-thumbnail', '--force-ipv4']
[debug] Custom config: []
[debug] Command-line args: ['-v']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2024.12.14.1
[debug] Lazy loading extractors enabled
[debug] Single file build
[debug] Python 3.10.12 (CPython x86_64 64bit) - Linux-5.15.0-126-generic-x86_64-with-glibc2.35 - OpenSSL 3.0.2 15 Mar 2022 - glibc 2.35
[debug] exe versions: ffmpeg N-118051-geb79c316c7-20241213, ffprobe N-118051-geb79c316c7-20241213, phantomjs 140260119963584, rtmpdump 2.4
[debug] Proxy map: {}
Removing cache dir /home/nico/.cache/youtube-dl ...

@dirkf
Contributor Author

dirkf commented Dec 16, 2024

Import from ..compat would be correct.
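
As a sketch of what that means in practice: the traceback above fails because xvideos.py asks youtube_dl.utils for compat_urlparse, whereas the compat module is where that name is defined. The snippet below uses the absolute import so that it can be run outside the package; inside youtube_dl/extractor/xvideos.py the equivalent relative form would be from ..compat import compat_urlparse. The fallback branch and the sample path are only there to keep the example self-contained and are not part of any proposed change.

```python
try:
    # Absolute form, usable from a standalone script. Inside the
    # extractor module the relative form would be:
    #     from ..compat import compat_urlparse
    from youtube_dl.compat import compat_urlparse
except ImportError:
    # Fallback so the sketch also runs without youtube-dl installed
    # (Python 3 only); on Python 3 this is the module compat aliases.
    import urllib.parse as compat_urlparse

# Typical use in an extractor: resolve a site-relative path.
# The path is a placeholder built from an ID quoted earlier in the thread.
full_url = compat_urlparse.urljoin(
    'https://www.xvideos.com/', '/video.kfkdepm21e2/some_title')
print(full_url)
```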

Successfully merging this pull request may close these issues.

  • Site xvideos.com not downloading amateur-channels
  • xvideos.com pornstar page
9 participants