One annoying thing about YouTube is that, by default, some videos are now served in .webm format or use VP9 encoding. I prefer storing media in more widely supported formats like .mp4, which plays on more devices than .webm. And sometimes I specifically want AVC1 (H.264) MP4 encoding because it just works out of the box on OSX with QuickTime, which doesn't natively support VP9 (listed as vp9 or vp09 in yt-dlp's codec column). AVC1-encoded MP4s are still the most portable video format.
AVC1 ... is by far the most commonly used format for the recording, compression, and distribution of video content, used by 91% of video industry developers as of September 2019.[1]
yt-dlp, the command-line audio/video downloader, is a great project. But between YouTube serving several codecs and the compatibility quirks of various video players, getting what you want out of yt-dlp can be a bit challenging:
$ yt-dlp -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best" "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
For example, the format selector above does not reliably pick the best possible formats for every YouTube URL on my OSX machine.
This usually happens when a YouTube URL tries to serve a .webm file. If you use the format flag above to extract the best quality MP4-compatible audio and video from a list of YouTube URLs, and you come across a URL that serves a .webm file, yt-dlp won't error out, abort, or skip the URL. Instead, it produces improperly formatted video: .mp4 files that cannot be opened or played.
However, we can fix this problem without even bothering yt-dlp with a pull request, because yt-dlp lets us dump all of the audio and video formats available for any video with the -F flag:
$ yt-dlp -F "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
[youtube] Extracting URL: https://www.youtube.com/watch?v=dQw4w9WgXcQ
[youtube] dQw4w9WgXcQ: Downloading webpage
[youtube] dQw4w9WgXcQ: Downloading tv client config
[youtube] dQw4w9WgXcQ: Downloading player b21600d5
[youtube] dQw4w9WgXcQ: Downloading tv player API JSON
[youtube] dQw4w9WgXcQ: Downloading ios player API JSON
[youtube] dQw4w9WgXcQ: Downloading m3u8 information
[info] Available formats for dQw4w9WgXcQ:
ID EXT RESOLUTION FPS CH │ FILESIZE TBR PROTO │ VCODEC VBR ACODEC ABR ASR MORE INFO
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
sb3 mhtml 48x27 0 │ mhtml │ images storyboard
sb2 mhtml 80x45 1 │ mhtml │ images storyboard
sb1 mhtml 160x90 1 │ mhtml │ images storyboard
sb0 mhtml 320x180 1 │ mhtml │ images storyboard
233 mp4 audio only │ m3u8 │ audio only unknown [en] Default
234 mp4 audio only │ m3u8 │ audio only unknown [en] Default
249 webm audio only 2 │ 1.18MiB 46k https │ audio only opus 46k 48k [en] low, webm_dash
250 webm audio only 2 │ 1.55MiB 61k https │ audio only opus 61k 48k [en] low, webm_dash
140 m4a audio only 2 │ 3.27MiB 130k https │ audio only mp4a.40.2 130k 44k [en] medium, m4a_dash
251 webm audio only 2 │ 3.28MiB 130k https │ audio only opus 130k 48k [en] medium, webm_dash
602 mp4 256x144 13 │ ~ 2.04MiB 81k m3u8 │ vp09.00.10.08 81k video only
269 mp4 256x144 25 │ ~ 3.95MiB 156k m3u8 │ avc1.4D400C 156k video only
160 mp4 256x144 25 │ 1.78MiB 70k https │ avc1.4d400c 70k video only 144p, mp4_dash
...
270 mp4 1920x1080 25 │ ~123.87MiB 4902k m3u8 │ avc1.640028 4902k video only
//snipped
It turns out it's actually much better to first manually list the formats this way, use grep and awk to extract the best possible codecs for an mp4 file, and then run yt-dlp with the specifically related codecs for each video URL. Here's a Bash script to automate this process, which makes downloading stuff from YouTube easier, in my opinion:
#!/bin/bash

# Require a video or playlist URL as the only argument.
if [ -z "$1" ]; then
    echo "Usage: $0 <youtube_url>"
    exit 1
fi

url="$1"

# Pick an AVC1/MP4 video stream and an M4A audio stream for one video,
# then download both and merge them into a single MP4.
processVideo() {
    local videoUrl="$1"

    echo "Fetching available formats for video: $videoUrl"
    formats=$(yt-dlp -F "$videoUrl")
    if [ $? -ne 0 ]; then
        echo "Error: Failed to fetch formats for $videoUrl. Is yt-dlp installed and the URL valid?"
        return
    fi

    # Prefer AVC1: pair each ID with the last field ending in "k",
    # sort numerically in descending order, and keep the top ID.
    videoFormat=$(echo "$formats" | grep 'mp4' | grep -E 'avc1' | \
        awk '{for (i=1; i<=NF; i++) if ($i ~ /k$/) tbr=$i; print $1, tbr}' | \
        sort -k2 -nr | awk '{print $1}' | head -1)

    # Fall back to any row mentioning mp4 when no AVC1 row exists.
    if [ -z "$videoFormat" ]; then
        echo "No AVC1 video format found, falling back to any MP4 format."
        videoFormat=$(echo "$formats" | grep 'mp4' | \
            awk '{for (i=1; i<=NF; i++) if ($i ~ /k$/) tbr=$i; print $1, tbr}' | \
            sort -k2 -nr | awk '{print $1}' | head -1)
    fi

    # Same pipeline for the audio stream, restricted to m4a rows.
    audioFormat=$(echo "$formats" | grep 'm4a' | \
        awk '{for (i=1; i<=NF; i++) if ($i ~ /k$/) tbr=$i; print $1, tbr}' | \
        sort -k2 -nr | awk '{print $1}' | head -1)

    if [ -z "$videoFormat" ] || [ -z "$audioFormat" ]; then
        echo "Error: No compatible MP4 video or M4A audio formats found for $videoUrl!"
        return
    fi

    echo "Selected video format: $videoFormat [MP4 : AVC1 preferred]"
    echo "Selected audio format: $audioFormat [M4A : highest quality]"
    echo "Downloading video with yt-dlp..."

    yt-dlp --restrict-filenames \
        -f "${videoFormat}+${audioFormat}" \
        --merge-output-format mp4 "$videoUrl"
    if [ $? -ne 0 ]; then
        echo "Error: Failed to download video. Check the format IDs and URL."
    fi
}

# A URL counts as a playlist when it carries a list= parameter.
isPlaylist() {
    if echo "$url" | grep -q "list="; then
        return 0
    else
        return 1
    fi
}

if isPlaylist; then
    echo "Processing playlist..."
    videoUrls=$(yt-dlp --flat-playlist --get-url "$url")
    if [ -z "$videoUrls" ]; then
        echo "Error: No videos found in the playlist. Is the URL correct?"
        exit 1
    fi
    for videoUrl in $videoUrls; do
        echo "Processing video: $videoUrl"
        processVideo "$videoUrl"
    done
else
    echo "Processing single video..."
    processVideo "$url"
fi
We grab the entire "available formats" table as input, storing it as plaintext in the $formats variable. We then grep $formats for 'mp4' listings, then grep again to keep only the listings that use the AVC1 (H.264) codec. If no AVC1 listing is found, we fall back to whatever is MP4 compatible. After filtering twice with grep, our list looks something like this:
269 mp4 256x144 25 | ~ 3.95MiB 156k m3u8 | avc1.4D400C 156k video only
160 mp4 256x144 25 | 1.78MiB 70k https | avc1.4d400c 70k video only 144p, mp4_dash
229 mp4 426x240 25 | ~ 5.73MiB 227k m3u8 | avc1.4D4015 227k video only
133 mp4 426x240 25 | 2.88MiB 114k https | avc1.4d4015 114k video only 240p, mp4_dash
230 mp4 640x360 25 | ~ 12.09MiB 478k m3u8 | avc1.4D401E 478k video only
134 mp4 640x360 25 | 5.42MiB 214k https | avc1.4d401e 214k video only 360p, mp4_dash
18 mp4 640x360 25 2 | ≈ 8.68MiB 343k https | avc1.42001E mp4a.40.2 44k [en] 360p
231 mp4 854x480 25 | ~ 16.69MiB 660k m3u8 | avc1.4D401E 660k video only
135 mp4 854x480 25 | 8.28MiB 328k https | avc1.4d401e 328k video only 480p, mp4_dash
232 mp4 1280x720 25 | ~ 28.59MiB 1131k m3u8 | avc1.4D401F 1131k video only
136 mp4 1280x720 25 | 16.01MiB 633k https | avc1.4d401f 633k video only 720p, mp4_dash
270 mp4 1920x1080 25 | ~123.87MiB 4902k m3u8 | avc1.640028 4902k video only
137 mp4 1920x1080 25 | 76.46MiB 3025k https | avc1.640028 3025k video only 1080p, mp4_dash
//snipped
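The two grep stages can be tried on their own against a handful of rows. This is a minimal sketch, assuming a few rows copied from the listing above with the box-drawing separators written as "|"; the awk variant at the end is a hypothetical stricter filter, not something the script uses.

```shell
# A few rows copied from the listing above ("|" stands in for the box-drawing separators).
formats='140 m4a audio only 2 | 3.27MiB 130k https | audio only mp4a.40.2 130k 44k [en] medium, m4a_dash
602 mp4 256x144 13 | ~ 2.04MiB 81k m3u8 | vp09.00.10.08 81k video only
160 mp4 256x144 25 | 1.78MiB 70k https | avc1.4d400c 70k video only 144p, mp4_dash
278 webm 256x144 25 | 2.29MiB 91k https | vp9 91k video only 144p, webm_dash
137 mp4 1920x1080 25 | 76.46MiB 3025k https | avc1.640028 3025k video only 1080p, mp4_dash'

# Stage 1 alone: a substring match on "mp4" (it also catches "mp4a" in the m4a row).
mp4_rows=$(printf '%s\n' "$formats" | grep 'mp4')

# Stage 1 + stage 2, as in the script: only the AVC1 (H.264) rows survive.
avc1_rows=$(printf '%s\n' "$formats" | grep 'mp4' | grep -E 'avc1')

# Hypothetical stricter variant: require the EXT column (field 2) to be exactly "mp4".
strict_rows=$(printf '%s\n' "$formats" | awk '$2 == "mp4" && /avc1/')

printf '%s\n' "$avc1_rows"
```

On these rows the script's filter and the stricter variant agree (IDs 160 and 137). The difference only shows up in the fallback path, where grep 'mp4' by itself also lets the m4a audio row through.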
Then we use a for loop in awk, bounded by NF (the number of fields in the row), to walk through every field and pair the row's ID with a bitrate. The loop tests each field for a trailing lowercase "k"; because every match overwrites the previous one, the value that gets printed is the last such field in the row. For video-only rows that is the VBR column, which holds the same value as TBR:
awk '{for (i=1; i<=NF; i++) if ($i ~ /k$/) tbr=$i; print $1, tbr}'
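To see exactly which field the loop ends up with, here is a small sketch run on two rows from the listing above. The second awk program is a hypothetical variant of mine, not the script's code: it clears tbr for every row and stops at the first purely numeric "k" field, which is the TBR column.

```shell
# Two rows copied from the listing above ("|" stands in for the box-drawing separators).
rows='140 m4a audio only 2 | 3.27MiB 130k https | audio only mp4a.40.2 130k 44k [en] medium, m4a_dash
233 mp4 audio only | m3u8 | audio only unknown [en] Default'

# The loop as written in the script: the last field ending in "k" wins,
# and tbr is not cleared between rows.
as_written=$(printf '%s\n' "$rows" |
  awk '{for (i=1; i<=NF; i++) if ($i ~ /k$/) tbr=$i; print $1, tbr}')

# Hypothetical variant: clear tbr per row and stop at the first numeric "k" field (TBR).
first_match=$(printf '%s\n' "$rows" | awk '{
  tbr = ""
  for (i = 1; i <= NF; i++) {
    if ($i ~ /^[0-9]+k$/) { tbr = $i; break }
  }
  print $1, tbr
}')

printf 'as written:\n%s\n' "$as_written"    # 140 44k, then 233 44k
printf 'first match:\n%s\n' "$first_match"  # 140 130k, then 233 with no bitrate
```

Row 233 has no field ending in "k", so the original loop reuses the 44k left over from the previous row, while the variant prints the ID with an empty bitrate.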
At this point, our output looks something like this -- just a list of mp4 IDs and bitrates from our AVC1 list:
269 156k
160 70k
230 478k
134 214k
232 1131k
...
137 3025k
270 4902k
//snipped
Afterward, we use sort to order the listings by bitrate, highest first, then awk and head -1 to print back only the ID of the mp4 video listing with the highest bitrate.
sort -k2 -nr | awk '{print $1}' | head -1
Our final output is just 270, the ID, which is what we pass to yt-dlp for the video portion of the download.
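The ranking step is easy to check by hand. A minimal sketch, assuming a few ID and bitrate pairs taken from the AVC1 rows of the table above:

```shell
# ID/bitrate pairs taken from the AVC1 rows of the table above.
pairs='160 70k
270 4902k
137 3025k
232 1131k'

# -k2 sorts on the second column, -n compares the leading number
# (the trailing "k" is ignored), and -r puts the largest value first.
ranked=$(printf '%s\n' "$pairs" | sort -k2 -nr)

# Keep only the ID column, then only the first line.
best_id=$(printf '%s\n' "$ranked" | awk '{print $1}' | head -1)

printf 'ranked:\n%s\nbest: %s\n' "$ranked" "$best_id"   # best: 270
```

The numeric flag matters here: a plain text sort would compare "70k" and "4902k" character by character and rank 70k above 4902k.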
We repeat the process for the audio listings by grepping for lines containing the m4a format extension. Again, we print the ID alongside the last field ending in "k" (for audio rows this is the ASR sample-rate column rather than the bitrate), then sort and extract the ID at the top of the list.
We pass both the high quality video and audio IDs to yt-dlp for downloading. yt-dlp automagically merges these two files to produce a finalized MP4.
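The walkthrough above covers a single video. For playlists, the script loops over the URLs with an unquoted variable, which relies on word splitting. Here is a rough sketch of the same branch that reads one URL per line instead; process_one and for_each_url are hypothetical names of mine, and the URLs are placeholders.

```shell
# Returns success when the URL carries a list= parameter, mirroring the script's isPlaylist check.
is_playlist() {
  case "$1" in
    *list=*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical stand-in for the script's processVideo; it only reports what it would do.
process_one() {
  printf 'would process: %s\n' "$1"
}

# Reads one URL per line from stdin, skipping blank lines, so each URL stays intact.
for_each_url() {
  while IFS= read -r line; do
    if [ -n "$line" ]; then
      process_one "$line"
    fi
  done
}

# Placeholder URLs for illustration only.
printf '%s\n' 'https://example.com/watch?v=aaa' '' 'https://example.com/watch?v=bbb' | for_each_url
```

In the real script the input to the loop would be the output of the yt-dlp playlist listing rather than a printf.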
You could modify the grep and awk statements for any other preferred video format, but this Bash script works for downloading lectures I can natively watch and listen to on OSX. Below is yt-dlp's own listing of the available formats, followed by an example run of our Bash script, which uses yt-dlp to extract the highest quality AVC1 MP4 streams and produce a portable, high quality video.
% yt-dlp -F "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
[youtube] Extracting URL: https://www.youtube.com/watch?v=dQw4w9WgXcQ
[youtube] dQw4w9WgXcQ: Downloading webpage
[youtube] dQw4w9WgXcQ: Downloading tv client config
[youtube] dQw4w9WgXcQ: Downloading player 6b3caec8
[youtube] dQw4w9WgXcQ: Downloading tv player API JSON
[youtube] dQw4w9WgXcQ: Downloading ios player API JSON
[youtube] dQw4w9WgXcQ: Downloading m3u8 information
[info] Available formats for dQw4w9WgXcQ:
ID EXT RESOLUTION FPS CH │ FILESIZE TBR PROTO │ VCODEC VBR ACODEC ABR ASR MORE INFO
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
sb3 mhtml 48x27 0 │ mhtml │ images storyboard
sb2 mhtml 80x45 1 │ mhtml │ images storyboard
sb1 mhtml 160x90 1 │ mhtml │ images storyboard
sb0 mhtml 320x180 1 │ mhtml │ images storyboard
233 mp4 audio only │ m3u8 │ audio only unknown [en] Default
234 mp4 audio only │ m3u8 │ audio only unknown [en] Default
249 webm audio only 2 │ 1.18MiB 46k https │ audio only opus 46k 48k [en] low, webm_dash
250 webm audio only 2 │ 1.55MiB 61k https │ audio only opus 61k 48k [en] low, webm_dash
140 m4a audio only 2 │ 3.27MiB 130k https │ audio only mp4a.40.2 130k 44k [en] medium, m4a_dash
251 webm audio only 2 │ 3.28MiB 130k https │ audio only opus 130k 48k [en] medium, webm_dash
602 mp4 256x144 13 │ ~ 2.04MiB 81k m3u8 │ vp09.00.10.08 81k video only
269 mp4 256x144 25 │ ~ 3.95MiB 156k m3u8 │ avc1.4D400C 156k video only
160 mp4 256x144 25 │ 1.78MiB 70k https │ avc1.4d400c 70k video only 144p, mp4_dash
603 mp4 256x144 25 │ ~ 3.88MiB 154k m3u8 │ vp09.00.11.08 154k video only
278 webm 256x144 25 │ 2.29MiB 91k https │ vp9 91k video only 144p, webm_dash
394 mp4 256x144 25 │ 1.41MiB 56k https │ av01.0.00M.08 56k video only 144p, mp4_dash
229 mp4 426x240 25 │ ~ 5.73MiB 227k m3u8 │ avc1.4D4015 227k video only
133 mp4 426x240 25 │ 2.88MiB 114k https │ avc1.4d4015 114k video only 240p, mp4_dash
604 mp4 426x240 25 │ ~ 7.26MiB 287k m3u8 │ vp09.00.20.08 287k video only
242 webm 426x240 25 │ 3.72MiB 147k https │ vp9 147k video only 240p, webm_dash
395 mp4 426x240 25 │ 2.77MiB 109k https │ av01.0.00M.08 109k video only 240p, mp4_dash
230 mp4 640x360 25 │ ~ 12.09MiB 478k m3u8 │ avc1.4D401E 478k video only
134 mp4 640x360 25 │ 5.42MiB 214k https │ avc1.4d401e 214k video only 360p, mp4_dash
18 mp4 640x360 25 2 │ ≈ 8.68MiB 343k https │ avc1.42001E mp4a.40.2 44k [en] 360p
605 mp4 640x360 25 │ ~ 14.26MiB 564k m3u8 │ vp09.00.21.08 564k video only
243 webm 640x360 25 │ 6.32MiB 250k https │ vp9 250k video only 360p, webm_dash
396 mp4 640x360 25 │ 4.85MiB 192k https │ av01.0.01M.08 192k video only 360p, mp4_dash
231 mp4 854x480 25 │ ~ 16.69MiB 660k m3u8 │ avc1.4D401E 660k video only
135 mp4 854x480 25 │ 8.28MiB 328k https │ avc1.4d401e 328k video only 480p, mp4_dash
606 mp4 854x480 25 │ ~ 19.74MiB 781k m3u8 │ vp09.00.30.08 781k video only
244 webm 854x480 25 │ 8.92MiB 353k https │ vp9 353k video only 480p, webm_dash
397 mp4 854x480 25 │ 8.18MiB 324k https │ av01.0.04M.08 324k video only 480p, mp4_dash
232 mp4 1280x720 25 │ ~ 28.59MiB 1131k m3u8 │ avc1.4D401F 1131k video only
136 mp4 1280x720 25 │ 16.01MiB 633k https │ avc1.4d401f 633k video only 720p, mp4_dash
609 mp4 1280x720 25 │ ~ 29.81MiB 1180k m3u8 │ vp09.00.31.08 1180k video only
247 webm 1280x720 25 │ 14.65MiB 580k https │ vp9 580k video only 720p, webm_dash
398 mp4 1280x720 25 │ 14.98MiB 593k https │ av01.0.05M.08 593k video only 720p, mp4_dash
270 mp4 1920x1080 25 │ ~123.87MiB 4902k m3u8 │ avc1.640028 4902k video only
137 mp4 1920x1080 25 │ 76.46MiB 3025k https │ avc1.640028 3025k video only 1080p, mp4_dash
614 mp4 1920x1080 25 │ ~ 71.55MiB 2831k m3u8 │ vp09.00.40.08 2831k video only
248 webm 1920x1080 25 │ 39.24MiB 1552k https │ vp9 1552k video only 1080p, webm_dash
399 mp4 1920x1080 25 │ 27.67MiB 1095k https │ av01.0.08M.08 1095k video only 1080p, mp4_dash
616 mp4 1920x1080 25 │ ~144.16MiB 5704k m3u8 │ vp09.00.40.08 5704k video only Premium
% ./yt.sh "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
Processing single video...
Fetching available formats for video: https://www.youtube.com/watch?v=dQw4w9WgXcQ
Selected video format: 270 [MP4 : AVC1 preferred]
Selected audio format: 140 [M4A : highest quality]
Downloading video with yt-dlp...
[youtube] Extracting URL: https://www.youtube.com/watch?v=dQw4w9WgXcQ
[youtube] dQw4w9WgXcQ: Downloading webpage
[youtube] dQw4w9WgXcQ: Downloading tv client config
[youtube] dQw4w9WgXcQ: Downloading player 6b3caec8
[youtube] dQw4w9WgXcQ: Downloading tv player API JSON
[youtube] dQw4w9WgXcQ: Downloading ios player API JSON
[youtube] dQw4w9WgXcQ: Downloading m3u8 information
[info] dQw4w9WgXcQ: Downloading 1 format(s): 270+140
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 39
[download] Destination: Rick_Astley_-_Never_Gonna_Give_You_Up_Official_Music_Video-[dQw4w9WgXcQ].f270.mp4
[download] 100% of 78.70MiB in 00:00:27 at 2.83MiB/s
[download] Destination: Rick_Astley_-_Never_Gonna_Give_You_Up_Official_Music_Video-[dQw4w9WgXcQ].f140.m4a
[download] 100% of 3.27MiB in 00:00:00 at 4.50MiB/s
[Merger] Merging formats into "Rick_Astley_-_Never_Gonna_Give_You_Up_Official_Music_Video-[dQw4w9WgXcQ].mp4"
Deleting original file Rick_Astley_-_Never_Gonna_Give_You_Up_Official_Music_Video-[dQw4w9WgXcQ].f270.mp4 (pass -k to keep)
Deleting original file Rick_Astley_-_Never_Gonna_Give_You_Up_Official_Music_Video-[dQw4w9WgXcQ].f140.m4a (pass -k to keep)
% exiftool Rick_Astley_-_Never_Gonna_Give_You_Up_Official_Music_Video-\[dQw4w9WgXcQ\].mp4
ExifTool Version Number : 13.10
File Name : Rick_Astley_-_Never_Gonna_Give_You_Up_Official_Music_Video-[dQw4w9WgXcQ].mp4
Directory : .
File Size : 84 MB
File Modification Date/Time : 2024:05:30 01:43:41-04:00
File Access Date/Time : 2024:05:30 01:43:41-04:00
File Inode Change Date/Time : 2025:03:15 19:30:18-04:00
File Permissions : -rw-r--r--
File Type : MP4
File Type Extension : mp4
MIME Type : video/mp4
Major Brand : MP4 Base Media v1 [IS0 14496-12:2003]
Minor Version : 0.2.0
Compatible Brands : isom, iso2, avc1, mp41
Movie Header Version : 0
Create Date : 0000:00:00 00:00:00
Modify Date : 0000:00:00 00:00:00
Time Scale : 1000
Duration : 0:03:32
Preferred Rate : 1
Preferred Volume : 100.00%
Preview Time : 0 s
Preview Duration : 0 s
Poster Time : 0 s
Selection Time : 0 s
Selection Duration : 0 s
Current Time : 0 s
Next Track ID : 3
Track Header Version : 0
Track Create Date : 0000:00:00 00:00:00
Track Modify Date : 0000:00:00 00:00:00
Track ID : 1
Track Duration : 0:03:32
Track Layer : 0
Track Volume : 0.00%
Image Width : 1920
Image Height : 1080
Graphics Mode : srcCopy
Op Color : 0 0 0
Compressor ID : avc1
Source Image Width : 1920
Source Image Height : 1080
X Resolution : 72
Y Resolution : 72
Bit Depth : 24
Color Profiles : nclx
Color Primaries : BT.709
Transfer Characteristics : BT.709
Matrix Coefficients : BT.709
Video Full Range Flag : Limited
Pixel Aspect Ratio : 1:1
Buffer Size : 0
Max Bitrate : 3023409
Average Bitrate : 3023409
Video Frame Rate : 25
Matrix Structure : 1 0 0 0 1 0 0 0 1
Media Header Version : 0
Media Create Date : 0000:00:00 00:00:00
Media Modify Date : 0000:00:00 00:00:00
Media Time Scale : 44100
Media Duration : 0:03:32
Media Language Code : eng
Handler Description : ISO Media file produced by Google Inc.
Balance : 0
Audio Format : mp4a
Audio Channels : 2
Audio Bits Per Sample : 16
Audio Sample Rate : 44100
Handler Type : Metadata
Handler Vendor ID : Apple
Encoder : Lavf61.7.100
Media Data Size : 83529233
Media Data Offset : 171478
Image Size : 1920x1080
Megapixels : 2.1
Avg Bitrate : 3.15 Mbps
Rotation : 0
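The two lines that matter in that report are Compressor ID and Audio Format. As a small sketch, assuming the report has been saved as "Name : value" text like the dump above, they can be pulled out with awk; tag_value is a hypothetical helper of mine, not part of exiftool.

```shell
# A few lines in the shape of the exiftool report above.
report='File Type : MP4
Compressor ID : avc1
Audio Format : mp4a'

# Prints the value of the first line that starts with the given tag name.
# Splitting on ": " keeps timestamps such as 01:43:41 in one piece.
tag_value() {
  awk -F': ' -v tag="$1" 'index($0, tag) == 1 { print $2; exit }'
}

video_codec=$(printf '%s\n' "$report" | tag_value 'Compressor ID')
audio_codec=$(printf '%s\n' "$report" | tag_value 'Audio Format')

printf 'video: %s, audio: %s\n' "$video_codec" "$audio_codec"   # video: avc1, audio: mp4a
```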