[cleanup] Misc (#8598)

Authored by: bashonly, pukkandan, seproDev, Grub4K

Co-authored-by: bashonly <bashonly@protonmail.com>
Co-authored-by: pukkandan <pukkandan.ytdlp@gmail.com>
Co-authored-by: sepro <4618135+seproDev@users.noreply.github.com>
Simon Sawicki 2023-12-30 22:27:36 +01:00 committed by GitHub
parent 5f009a094f
commit f9fb3ce86e
30 changed files with 77 additions and 82 deletions


@@ -36,8 +36,8 @@ jobs:
fail-fast: false
matrix:
os: [ubuntu-latest]
# CPython 3.11 is in quick-test
python-version: ['3.8', '3.9', '3.10', '3.12', pypy-3.8, pypy-3.10]
# CPython 3.8 is in quick-test
python-version: ['3.9', '3.10', '3.11', '3.12', pypy-3.8, pypy-3.10]
include:
# at least one of each CPython/PyPy test must be in windows
- os: windows-latest


@@ -10,10 +10,10 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Python 3.11
- name: Set up Python 3.8
uses: actions/setup-python@v4
with:
python-version: '3.11'
python-version: '3.8'
- name: Install test requirements
run: pip install pytest -r requirements.txt
- name: Run tests


@@ -29,6 +29,7 @@ ## [coletdjnz](https://github.com/coletdjnz)
[![gh-sponsor](https://img.shields.io/badge/_-Github-white.svg?logo=github&labelColor=555555&style=for-the-badge)](https://github.com/sponsors/coletdjnz)
* Improved plugin architecture
* Rewrote the networking infrastructure, implemented support for `requests`
* YouTube improvements including: age-gate bypass, private playlists, multiple-clients (to avoid throttling) and a lot of under-the-hood improvements
* Added support for new websites YoutubeWebArchive, MainStreaming, PRX, nzherald, Mediaklikk, StarTV etc
* Improved/fixed support for Patreon, panopto, gfycat, itv, pbs, SouthParkDE etc
@@ -46,16 +47,17 @@ ## [Ashish0804](https://github.com/Ashish0804) <sub><sup>[Inactive]</sup></sub>
## [bashonly](https://github.com/bashonly)
* `--update-to`, automated release, nightly builds
* `--cookies-from-browser` support for Firefox containers
* Added support for new websites Genius, Kick, NBCStations, Triller, VideoKen etc
* Improved/fixed support for Anvato, Brightcove, Instagram, ParamountPlus, Reddit, SlidesLive, TikTok, Twitter, Vimeo etc
* `--update-to`, self-updater rewrite, automated/nightly/master releases
* `--cookies-from-browser` support for Firefox containers, external downloader cookie handling overhaul
* Added support for new websites like Dacast, Kick, NBCStations, Triller, VideoKen, Weverse, WrestleUniverse etc
* Improved/fixed support for Anvato, Brightcove, Reddit, SlidesLive, TikTok, Twitter, Vimeo etc
## [Grub4K](https://github.com/Grub4K)
[![ko-fi](https://img.shields.io/badge/_-Ko--fi-red.svg?logo=kofi&labelColor=555555&style=for-the-badge)](https://ko-fi.com/Grub4K) [![gh-sponsor](https://img.shields.io/badge/_-Github-white.svg?logo=github&labelColor=555555&style=for-the-badge)](https://github.com/sponsors/Grub4K)
[![gh-sponsor](https://img.shields.io/badge/_-Github-white.svg?logo=github&labelColor=555555&style=for-the-badge)](https://github.com/sponsors/Grub4K) [![ko-fi](https://img.shields.io/badge/_-Ko--fi-red.svg?logo=kofi&labelColor=555555&style=for-the-badge)](https://ko-fi.com/Grub4K)
* `--update-to`, automated release, nightly builds
* Rework internals like `traverse_obj`, various core refactors and bug fixes
* Helped fix crunchyroll, Twitter, wrestleuniverse, wistia, slideslive etc
* `--update-to`, self-updater rewrite, automated/nightly/master releases
* Reworked internals like `traverse_obj`, various core refactors and bug fixes
* Implemented proper progress reporting for parallel downloads
* Improved/fixed/added Bundestag, crunchyroll, pr0gramm, Twitter, WrestleUniverse etc


@@ -159,6 +159,7 @@ ### Differences in default behavior
* yt-dlp versions between 2021.09.01 and 2023.01.02 applied `--match-filter` to nested playlists. This was an unintentional side-effect of [8f18ac](https://github.com/yt-dlp/yt-dlp/commit/8f18aca8717bb0dd49054555af8d386e5eda3a88) and was fixed in [d7b460](https://github.com/yt-dlp/yt-dlp/commit/d7b460d0e5fc710950582baed2e3fc616ed98a80). Use `--compat-options playlist-match-filter` to revert this
* yt-dlp versions between 2021.11.10 and 2023.06.21 estimated `filesize_approx` values for fragmented/manifest formats. This was added for convenience in [f2fe69](https://github.com/yt-dlp/yt-dlp/commit/f2fe69c7b0d208bdb1f6292b4ae92bc1e1a7444a), but was reverted in [0dff8e](https://github.com/yt-dlp/yt-dlp/commit/0dff8e4d1e6e9fb938f4256ea9af7d81f42fd54f) due to the potentially extreme inaccuracy of the estimated values. Use `--compat-options manifest-filesize-approx` to keep extracting the estimated values
* yt-dlp uses modern http client backends such as `requests`. Use `--compat-options prefer-legacy-http-handler` to fall back to the legacy http handler (`urllib`) for standard http requests.
* The sub-module `swfinterp` is removed.
For ease of use, a few more compat options are available:
@@ -299,7 +300,7 @@ ### Misc
* [**pycryptodomex**](https://github.com/Legrandin/pycryptodome)\* - For decrypting AES-128 HLS streams and various other data. Licensed under [BSD-2-Clause](https://github.com/Legrandin/pycryptodome/blob/master/LICENSE.rst)
* [**phantomjs**](https://github.com/ariya/phantomjs) - Used in extractors where javascript needs to be run. Licensed under [BSD-3-Clause](https://github.com/ariya/phantomjs/blob/master/LICENSE.BSD)
* [**secretstorage**](https://github.com/mitya57/secretstorage) - For `--cookies-from-browser` to access the **Gnome** keyring while decrypting cookies of **Chromium**-based browsers on **Linux**. Licensed under [BSD-3-Clause](https://github.com/mitya57/secretstorage/blob/master/LICENSE)
* [**secretstorage**](https://github.com/mitya57/secretstorage)\* - For `--cookies-from-browser` to access the **Gnome** keyring while decrypting cookies of **Chromium**-based browsers on **Linux**. Licensed under [BSD-3-Clause](https://github.com/mitya57/secretstorage/blob/master/LICENSE)
* Any external downloader that you want to use with `--downloader`
### Deprecated


@@ -114,5 +114,11 @@
"action": "add",
"when": "f04b5bedad7b281bee9814686bba1762bae092eb",
"short": "[priority] Security: [[CVE-2023-46121](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-46121)] Patch [Generic Extractor MITM Vulnerability via Arbitrary Proxy Injection](https://github.com/yt-dlp/yt-dlp/security/advisories/GHSA-3ch3-jhc6-5r8x)\n\t- Disallow smuggling of arbitrary `http_headers`; extractors now only use specific headers"
},
{
"action": "change",
"when": "15f22b4880b6b3f71f350c64d70976ae65b9f1ca",
"short": "[webvtt] Allow spaces before newlines for CueBlock (#7681)",
"authors": ["TSRBerry"]
}
]


@@ -40,20 +40,6 @@ def subgroup_lookup(cls):
return {
name: group
for group, names in {
cls.CORE: {
'aes',
'cache',
'compat_utils',
'compat',
'cookies',
'dependencies',
'formats',
'jsinterp',
'outtmpl',
'plugins',
'update',
'utils',
},
cls.MISC: {
'build',
'ci',
@@ -404,9 +390,9 @@ def groups(self):
if not group:
if self.EXTRACTOR_INDICATOR_RE.search(commit.short):
group = CommitGroup.EXTRACTOR
logger.error(f'Assuming [ie] group for {commit.short!r}')
else:
group = CommitGroup.POSTPROCESSOR
logger.warning(f'Failed to map {commit.short!r}, selected {group.name.lower()}')
group = CommitGroup.CORE
commit_info = CommitInfo(
details, sub_details, message.strip(),


@@ -9,11 +9,7 @@
import re
from devscripts.utils import (
get_filename_args,
read_file,
write_file,
)
from devscripts.utils import get_filename_args, read_file, write_file
VERBOSE_TMPL = '''
- type: checkboxes


@@ -1,6 +1,5 @@
mutagen
pycryptodomex
websockets
brotli; implementation_name=='cpython'
brotlicffi; implementation_name!='cpython'
certifi


@@ -730,7 +730,7 @@ def expect_same_infodict(out):
self.assertEqual(got_dict.get(info_field), expected, info_field)
return True
test('%()j', (expect_same_infodict, str))
test('%()j', (expect_same_infodict, None))
# NA placeholder
NA_TEST_OUTTMPL = '%(uploader_date)s-%(width)d-%(x|def)s-%(id)s.%(ext)s'


@@ -9,7 +9,7 @@
from test.helper import FakeYDL, report_warning
from yt_dlp.update import Updater, UpdateInfo
from yt_dlp.update import UpdateInfo, Updater
# XXX: Keep in sync with yt_dlp.update.UPDATE_SOURCES


@@ -2110,6 +2110,8 @@ def test_traverse_obj(self):
self.assertEqual(traverse_obj(_TEST_DATA, (..., {str_or_none})),
[item for item in map(str_or_none, _TEST_DATA.values()) if item is not None],
msg='Function in set should be a transformation')
self.assertEqual(traverse_obj(_TEST_DATA, ('fail', {lambda _: 'const'})), 'const',
msg='Function in set should always be called')
if __debug__:
with self.assertRaises(Exception, msg='Sets with length != 1 should raise in debug'):
traverse_obj(_TEST_DATA, set())
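
The two added assertions pin down set semantics in traversal paths: a one-element set wraps a transformation, and the wrapped callable runs even when the preceding lookup yields nothing. A minimal sketch of that behavior (assuming `traverse_obj` is imported from `yt_dlp.utils`, where it is re-exported):

```python
from yt_dlp.utils import traverse_obj

data = {'id': '123'}

# A one-element set wraps a transformation applied to the traversed value.
print(traverse_obj(data, ('id', {int})))  # 123
# Per the added test, the callable runs even for a missing key,
# so a constant-returning function still produces its value.
print(traverse_obj(data, ('missing', {lambda _: 'const'})))  # 'const'
```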


@@ -1 +1 @@
@py -bb -Werror -Xdev "%~dp0yt_dlp\__main__.py" %*
@py -Werror -Xdev "%~dp0yt_dlp\__main__.py" %*


@@ -1,2 +1,2 @@
#!/usr/bin/env sh
exec "${PYTHON:-python3}" -bb -Werror -Xdev "$(dirname "$(realpath "$0")")/yt_dlp/__main__.py" "$@"
exec "${PYTHON:-python3}" -Werror -Xdev "$(dirname "$(realpath "$0")")/yt_dlp/__main__.py" "$@"


@@ -60,7 +60,13 @@
get_postprocessor,
)
from .postprocessor.ffmpeg import resolve_mapping as resolve_recode_mapping
from .update import REPOSITORY, _get_system_deprecation, _make_label, current_git_head, detect_variant
from .update import (
REPOSITORY,
_get_system_deprecation,
_make_label,
current_git_head,
detect_variant,
)
from .utils import (
DEFAULT_OUTTMPL,
IDENTITY,


@@ -152,7 +152,7 @@ def page_func(page_num):
'sort': 'new',
'limit': self._PAGE_SIZE,
'offset': page_num * self._PAGE_SIZE,
}, note=f'Downloading page {page_num+1}')
}, note=f'Downloading page {page_num + 1}')
return [
self.url_result(f"{self._VIDEO_BASE}/{video['_id']}", BanByeIE)
for video in data['items']


@@ -53,21 +53,6 @@ class DuoplayIE(InfoExtractor):
'episode_id': 14,
'release_year': 2010,
},
}, {
'note': 'Movie',
'url': 'https://duoplay.ee/4325/naljamangud',
'md5': '2b0bcac4159a08b1844c2bfde06b1199',
'info_dict': {
'id': '4325',
'ext': 'mp4',
'title': 'Näljamängud',
'thumbnail': r're:https://.+\.jpg(?:\?c=\d+)?$',
'description': 'md5:fb35f5eb2ff46cdb82e4d5fbe7b49a13',
'cast': ['Jennifer Lawrence', 'Josh Hutcherson', 'Liam Hemsworth'],
'upload_date': '20231109',
'timestamp': 1699552800,
'release_year': 2012,
},
}, {
'note': 'Movie without expiry',
'url': 'https://duoplay.ee/5501/pilvede-all.-neljas-ode',


@@ -173,8 +173,8 @@ def format_path(params):
'formats': formats,
})
uploader_url = format_field(traverse_obj(
post_data, 'creator'), 'urlname', 'https://www.floatplane.com/channel/%s/home', default=None)
uploader_url = format_field(
post_data, [('creator', 'urlname')], 'https://www.floatplane.com/channel/%s/home') or None
channel_url = urljoin(f'{uploader_url}/', traverse_obj(post_data, ('channel', 'urlname')))
post_info = {
@@ -248,7 +248,7 @@ def _fetch_page(self, display_id, creator_id, channel_id, page):
for post in page_data or []:
yield self.url_result(
f'https://www.floatplane.com/post/{post["id"]}',
ie=FloatplaneIE, video_id=post['id'], video_title=post.get('title'),
FloatplaneIE, id=post['id'], title=post.get('title'),
release_timestamp=parse_iso8601(post.get('releaseDate')))
def _real_extract(self, url):
@@ -264,5 +264,5 @@ def _real_extract(self, url):
return self.playlist_result(OnDemandPagedList(functools.partial(
self._fetch_page, display_id, creator_data['id'], channel_data.get('id')), self._PAGE_SIZE),
display_id, playlist_title=channel_data.get('title') or creator_data.get('title'),
playlist_description=channel_data.get('about') or creator_data.get('about'))
display_id, title=channel_data.get('title') or creator_data.get('title'),
description=channel_data.get('about') or creator_data.get('about'))
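
This refactor assumes `format_field` resolves nested field paths itself (internally it traverses `obj` with the given path) and that `or None` maps the empty-string default back to `None`. A rough sketch of the intended equivalence under that assumption (the sample `post_data` is made up):

```python
from yt_dlp.utils import format_field, traverse_obj

post_data = {'creator': {'urlname': 'someone'}}  # hypothetical API payload
TMPL = 'https://www.floatplane.com/channel/%s/home'

# Old shape: traverse first, then format the single remaining field.
old = format_field(traverse_obj(post_data, 'creator'), 'urlname', TMPL, default=None)
# New shape: let format_field walk the nested path; '' (its default) becomes None.
new = format_field(post_data, [('creator', 'urlname')], TMPL) or None
assert old == new == 'https://www.floatplane.com/channel/someone/home'
```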


@@ -35,8 +35,8 @@
unified_timestamp,
unsmuggle_url,
update_url_query,
urlhandle_detect_ext,
url_or_none,
urlhandle_detect_ext,
urljoin,
variadic,
xpath_attr,


@@ -536,7 +536,7 @@ def _fetch_page(self, base_url, query_params, display_id, page):
}
response = self._call_api(
base_url, '/Services/Data.svc/GetSessions', f'{display_id} page {page+1}',
base_url, '/Services/Data.svc/GetSessions', f'{display_id} page {page + 1}',
data={'queryParameters': params}, fatal=False)
for result in get_first(response, 'Results', default=[]):


@@ -264,7 +264,7 @@ def _real_extract(self, url):
}
class RadioFrancePlaylistBase(RadioFranceBaseIE):
class RadioFrancePlaylistBaseIE(RadioFranceBaseIE):
"""Subclasses must set _METADATA_KEY"""
def _call_api(self, content_id, cursor, page_num):
@@ -308,7 +308,7 @@ def _real_extract(self, url):
})})
class RadioFrancePodcastIE(RadioFrancePlaylistBase):
class RadioFrancePodcastIE(RadioFrancePlaylistBaseIE):
_VALID_URL = rf'''(?x)
{RadioFranceBaseIE._VALID_URL_BASE}
/(?:{RadioFranceBaseIE._STATIONS_RE})
@@ -369,7 +369,7 @@ def _call_api(self, podcast_id, cursor, page_num):
note=f'Downloading page {page_num}', query={'pageCursor': cursor})
class RadioFranceProfileIE(RadioFrancePlaylistBase):
class RadioFranceProfileIE(RadioFrancePlaylistBaseIE):
_VALID_URL = rf'{RadioFranceBaseIE._VALID_URL_BASE}/personnes/(?P<id>[\w-]+)'
_TESTS = [{


@@ -70,7 +70,7 @@ def _extract_from_webpage(self, url, webpage):
'height': int_or_none(traverse_obj(track, ('dimensions', 'original', 'height'))),
'width': int_or_none(traverse_obj(track, ('dimensions', 'original', 'width'))),
} for track in traverse_obj(playlist_json, ('tracks', ...), expected_type=dict)]
yield self.playlist_result(entries, self._generic_id(url) + f'-wp-playlist-{i+1}', 'Wordpress Playlist')
yield self.playlist_result(entries, self._generic_id(url) + f'-wp-playlist-{i + 1}', 'Wordpress Playlist')
class WordpressMiniAudioPlayerEmbedIE(InfoExtractor):


@@ -5297,6 +5297,7 @@ def _extract_webpage(self, url, item_id, fatal=True):
# See: https://github.com/yt-dlp/yt-dlp/issues/116
if not traverse_obj(data, 'contents', 'currentVideoEndpoint', 'onResponseReceivedActions'):
retry.error = ExtractorError('Incomplete yt initial data received')
data = None
continue
return webpage, data


@@ -28,4 +28,3 @@
pass
except Exception as e:
warnings.warn(f'Failed to import "websockets" request handler: {e}' + bug_reports_message())


@@ -219,7 +219,7 @@ def _socket_connect(ip_addr, timeout, source_address):
sock.bind(source_address)
sock.connect(sa)
return sock
except socket.error:
except OSError:
sock.close()
raise
@@ -237,7 +237,7 @@ def create_socks_proxy_socket(dest_addr, proxy_args, proxy_ip_addr, timeout, sou
sock.bind(source_address)
sock.connect(dest_addr)
return sock
except socket.error:
except OSError:
sock.close()
raise
@@ -255,7 +255,7 @@ def create_connection(
host, port = address
ip_addrs = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
if not ip_addrs:
raise socket.error('getaddrinfo returns an empty list')
raise OSError('getaddrinfo returns an empty list')
if source_address is not None:
af = socket.AF_INET if ':' not in source_address[0] else socket.AF_INET6
ip_addrs = [addr for addr in ip_addrs if addr[0] == af]
@@ -272,7 +272,7 @@ def create_connection(
# https://bugs.python.org/issue36820
err = None
return sock
except socket.error as e:
except OSError as e:
err = e
try:
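
These `except socket.error` → `except OSError` swaps (here and in the handlers below) are behavior-preserving: since Python 3.3 (PEP 3151), `socket.error` is a plain alias of `OSError`, so only the legacy spelling goes away:

```python
import socket

# PEP 3151 merged the old exception types into OSError; the alias check
# shows that catching OSError and catching socket.error are identical.
assert socket.error is OSError
```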


@@ -188,6 +188,7 @@ class RequestsSession(requests.sessions.Session):
"""
Ensure unified redirect method handling with our urllib redirect handler.
"""
def rebuild_method(self, prepared_request, response):
new_method = get_redirect_method(prepared_request.method, response.status_code)
@@ -218,6 +219,7 @@ def filter(self, record):
class Urllib3LoggingHandler(logging.Handler):
"""Redirect urllib3 logs to our logger"""
def __init__(self, logger, *args, **kwargs):
super().__init__(*args, **kwargs)
self._logger = logger
@@ -367,7 +369,7 @@ def _new_conn(self):
self, f'Connection to {self.host} timed out. (connect timeout={self.timeout})') from e
except SocksProxyError as e:
raise urllib3.exceptions.ProxyError(str(e), e) from e
except (OSError, socket.error) as e:
except OSError as e:
raise urllib3.exceptions.NewConnectionError(
self, f'Failed to establish a new connection: {e}') from e


@@ -5,20 +5,26 @@
import ssl
import sys
from ._helper import create_connection, select_proxy, make_socks_proxy_opts, create_socks_proxy_socket
from .common import Response, register_rh, Features
from ._helper import (
create_connection,
create_socks_proxy_socket,
make_socks_proxy_opts,
select_proxy,
)
from .common import Features, Response, register_rh
from .exceptions import (
CertificateVerifyError,
HTTPError,
ProxyError,
RequestError,
SSLError,
TransportError, ProxyError,
TransportError,
)
from .websocket import WebSocketRequestHandler, WebSocketResponse
from ..compat import functools
from ..dependencies import websockets
from ..utils import int_or_none
from ..socks import ProxyError as SocksProxyError
from ..utils import int_or_none
if not websockets:
raise ImportError('websockets is not installed')


@@ -2,7 +2,7 @@
import abc
from .common import Response, RequestHandler
from .common import RequestHandler, Response
class WebSocketResponse(Response):


@@ -49,7 +49,7 @@ class Socks5AddressType:
ATYP_IPV6 = 0x04
class ProxyError(socket.error):
class ProxyError(OSError):
ERR_SUCCESS = 0x00
def __init__(self, code=None, msg=None):


@@ -558,7 +558,7 @@ def decode(self, s):
s = self._close_object(e)
if s is not None:
continue
raise type(e)(f'{e.msg} in {s[e.pos-10:e.pos+10]!r}', s, e.pos)
raise type(e)(f'{e.msg} in {s[e.pos - 10:e.pos + 10]!r}', s, e.pos)
assert False, 'Too many attempts to decode JSON'
@@ -1885,6 +1885,7 @@ def setproctitle(title):
buf = ctypes.create_string_buffer(len(title_bytes))
buf.value = title_bytes
try:
# PR_SET_NAME = 15 Ref: /usr/include/linux/prctl.h
libc.prctl(15, buf, 0, 0, 0)
except AttributeError:
return # Strange libc, just skip this
@@ -2260,6 +2261,9 @@ def __getitem__(self, idx):
raise self.IndexError()
return entries[0]
def __bool__(self):
return bool(self.getslice(0, 1))
class OnDemandPagedList(PagedList):
"""Download pages until a page with less than maximum results"""
@@ -5070,7 +5074,7 @@ def truncate_string(s, left, right=0):
assert left > 3 and right >= 0
if s is None or len(s) <= left + right:
return s
return f'{s[:left-3]}...{s[-right:] if right else ""}'
return f'{s[:left - 3]}...{s[-right:] if right else ""}'
def orderedSet_from_options(options, alias_dict, *, use_regex=False, start=None):
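
For reference, the respaced slice in `truncate_string` is purely cosmetic; the function still reserves three characters for the ellipsis (a quick check, assuming `truncate_string` is exported from `yt_dlp.utils`):

```python
from yt_dlp.utils import truncate_string

# left=6 keeps 6 - 3 = 3 leading characters plus '...'; right=0 keeps no tail.
print(truncate_string('abcdefghij', 6))  # 'abc...'
print(truncate_string('short', 6))       # 'short' (short enough, unchanged)
```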


@@ -23,7 +23,7 @@ def traverse_obj(
>>> obj = [{}, {"key": "value"}]
>>> traverse_obj(obj, (1, "key"))
"value"
'value'
Each of the provided `paths` is tested and the first producing a valid result will be returned.
The next path will also be tested if the path branched but no results could be found.
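
The corrected doctest quoting aside, the first-match rule described here is easy to demonstrate (a minimal sketch using the same example object):

```python
from yt_dlp.utils import traverse_obj

obj = [{}, {'key': 'value'}]
# Paths are tried in order; the first to produce a valid result wins.
print(traverse_obj(obj, (0, 'key'), (1, 'key')))  # 'value'
```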