Thanks liberal justices, very cool!
According to a site admin from that forum post (which is from April 2021–who knows where things stand now):
If you use the OpenSubtitles website manually, you will have advertisements on the web site, NOT inside the subtitles.
If you use some API-software to download subtitles (Plex, Kodi, BSPlayer or whatever), you are not using the web site, so you do NOT have these web advertisements. To compensate this, ads are being added on-the-fly to the subtitles itself.
Also, from a different admin:
add few words from my side - it is good you are talking about ads. They not generating a lot of revenue, but on other side we have more VIP subscriptions because of it :) We have in ads something like “Become VIP member and Remove all ads…”
Also, the ads in subtitles are always inserted on “empty” space. It is never in middle of movie. What Roozel wrote - “I think placing those ads at the beginning and end is somewhat OK but not in the middle or at random points in the film” - should not happen, if yes, send me the subtitle.
If the subtitle is from tv series, there are dialogues from beginning usually. System is finding “quiet” place where ads would fit, and yes, this can be after 3 minutes of dialogue…
This is important to know, I hope now it is more clear about subtitle ads - why we are doing this, there is possibility to remove them and how system works.
So a scenario like the one in the screenshot isn’t supposed to happen. I guess if you really wanted to check whether it does, you could grab all the English subs via the API and just do a quick grep or what-have-you.
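For what it’s worth, that quick check could look something like the sketch below. It assumes the subtitles are already sitting in a local folder as .srt files, and the ad patterns are my own guesses based on the admin’s “Become VIP member and Remove all ads” example, not an official list. It reports the start time of every matching cue, so you can see whether an ad landed somewhere in the middle of the runtime.

```python
import re
from pathlib import Path

# Guessed ad markers; the admin quote only confirms wording along the
# lines of "Become VIP member and Remove all ads".
AD_PATTERNS = re.compile(r"opensubtitles|become vip member|remove all ads", re.IGNORECASE)

# SRT timing line, e.g. "00:03:12,500 --> 00:03:16,000"
TIMING = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->")


def find_ad_cues(srt_text):
    """Return (start_seconds, text) for every cue that looks like an ad."""
    hits = []
    # Cues in an SRT file are separated by blank lines.
    for cue in re.split(r"\n\s*\n", srt_text.strip()):
        timing = TIMING.search(cue)
        if timing and AD_PATTERNS.search(cue):
            h, m, s, ms = (int(g) for g in timing.groups())
            start = h * 3600 + m * 60 + s + ms / 1000
            # Everything after the timing line is the cue text.
            text = cue[timing.end():].split("\n", 1)[-1].strip()
            hits.append((start, text))
    return hits


def scan_folder(folder):
    """Scan every .srt file under `folder` (a hypothetical local dump)."""
    for path in Path(folder).rglob("*.srt"):
        for start, text in find_ad_cues(path.read_text(errors="replace")):
            print(f"{path.name}: {start:8.1f}s  {text!r}")
```

Comparing each reported start time against the length of the film would then show whether the “empty space” logic ever misfires.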
Original Phoronix article, which has all the individual benchmarks (weird that they didn’t link to it)
There’s a variable that contains the number of cores (called cpus) which is hardcoded to max out at 8, but that doesn’t mean cores beyond the eighth go unused. It just means that the scheduling scaling factor stops changing, in either the linear or logarithmic case, once you go above that number:
/*
 * Increase the granularity value when there are more CPUs,
 * because with more CPUs the 'effective latency' as visible
 * to users decreases. But the relationship is not linear,
 * so pick a second-best guess by going with the log2 of the
 * number of CPUs.
 *
 * This idea comes from the SD scheduler of Con Kolivas:
 */
static unsigned int get_update_sysctl_factor(void)
{
	unsigned int cpus = min_t(unsigned int, num_online_cpus(), 8);
	unsigned int factor;

	switch (sysctl_sched_tunable_scaling) {
	case SCHED_TUNABLESCALING_NONE:
		factor = 1;
		break;
	case SCHED_TUNABLESCALING_LINEAR:
		factor = cpus;
		break;
	case SCHED_TUNABLESCALING_LOG:
	default:
		factor = 1 + ilog2(cpus);
		break;
	}

	return factor;
}
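To make the effect of that cap concrete, here’s a small Python model of the same arithmetic. This is my own transcription for illustration, not kernel code, and the mode names are just strings I picked. Past 8 CPUs the factor stays at 4 in the default log case and at 8 in the linear case:

```python
def sysctl_factor(online_cpus, scaling="log"):
    """Python model of get_update_sysctl_factor(), for illustration only."""
    cpus = min(online_cpus, 8)  # the hardcoded cap
    if scaling == "none":
        return 1
    if scaling == "linear":
        return cpus
    # Default (log) case: 1 + ilog2(cpus). For a positive integer,
    # bit_length() - 1 equals floor(log2(n)), which is what ilog2 returns.
    return 1 + (cpus.bit_length() - 1)


# Print the log and linear factors for a range of CPU counts.
for n in (1, 2, 4, 8, 16, 64, 128):
    print(n, sysctl_factor(n), sysctl_factor(n, "linear"))
```

So a 128-core machine gets the same factor as an 8-core one, which is all the cap affects.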
The core claim is this:
It’s problematic that the kernel was hardcoded to a maximum of 8 cores (scaling factor of 4). It can’t be good to reschedule hundreds of tasks every few milliseconds, maybe on a different core, maybe on a different die. It can’t be good for performance and cache locality.
On this point, I have no idea (hope someone more knowledgeable will weigh in). But I’d say the headline is misleading at best.
Great, I’m so glad to hear that! Tartube can be a little intimidating with its sprawling menus and sub-menus, but when it comes down to it most of the core functionality is pretty accessible once you know where to look and can ignore all the hyper-specific options for power users.
1. No idea, to be honest. In the environment I tested this in (Windows 10 Sandbox) Windows Defender didn’t complain, and I’ve never had an issue with my actual install either. In fact, I just checked my installation folders on my PC and didn’t even find that executable (maybe it’s only used during installation?) although I do have it on my system for a different program. I only found one Google hit from 5 years ago on the glslang GitHub itself, and the user seemed to think it was a false positive for what it’s worth.
2. They are supposed to be there by default (they store metadata) but you can set up Tartube to put them in separate folders if you want to just have a nice clean folder with only videos or just not write them in the first place if you don’t want them. I believe the metadata is copied into Tartube’s database, so deleting them shouldn’t change anything (they’re mostly useful for archival purposes or if you want to do some processing with external tools), but Tartube references the thumbnail image files for display in the GUI so removing them will remove the thumbnail from the GUI like so:
This is pretty straightforward to configure, thankfully:
3. (hosted externally due to Lemmy sanitization bug causing less-than symbols to be HTML escaped)
edit: accidentally left out a line in the externally hosted markdown
Okay, Tartube can definitely handle what you want with a few additional flags! Here’s the mediainfo for the output file after doing a test run on this MrBeast video (note that the audio track is incorrectly marked as English but is indeed Japanese, and that S_TEXT is how SRT appears in an MKV file):
General
Unique ID : 242275721910232180380466434100717751726 (0xB6449B54C970D7DBA0EB469BBD590DAE)
Complete name : C:\Users\WDAGUtilityAccount\Tartube\Test Audio Playlist\$1 vs $100,000,000 House!.mkv
Format : Matroska
Format version : Version 4
File size : 556 MiB
Duration : 17 min 35 s
Overall bit rate : 4 418 kb/s
Frame rate : 29.970 FPS
Writing application : Lavf60.3.100
Writing library : Lavf60.3.100
ErrorDetectionType : Per level 1
Video
ID : 1
Format : VP9
Format profile : 0
Codec ID : V_VP9
Duration : 17 min 35 s
Width : 1 920 pixels
Height : 1 080 pixels
Display aspect ratio : 16:9
Frame rate mode : Constant
Frame rate : 29.970 (29970/1000) FPS
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Title : ISO Media file produced by Google Inc. Created on: 11/09/2023.
Default : Yes
Forced : No
Color range : Limited
Color primaries : BT.709
Transfer characteristics : BT.709
Matrix coefficients : BT.709
VENDOR_ID : [0][0][0][0]
Audio
ID : 2
Format : Opus
Codec ID : A_OPUS
Duration : 17 min 35 s
Channel(s) : 2 channels
Channel layout : L R
Sampling rate : 48.0 kHz
Bit depth : 32 bits
Compression mode : Lossy
Delay relative to video : 7 ms
Language : English
Default : Yes
Forced : No
Text
ID : 3
Format : UTF-8
Codec ID : S_TEXT/UTF8
Codec ID/Info : UTF-8 Plain Text
Duration : 17 min 30 s
Language : English
Default : No
Forced : No
I did this all on Windows 10 in Windows Sandbox with a fresh Tartube install to make sure I didn’t have some lurking non-default setting causing unexplained behavior. Here’s what to do to get the same results, along with a screen recording of the same process with some rough edits (don’t be scared off by the long instructions–it’s mostly me just explaining what the options do, and it should only take about five to ten minutes!):
Grab the 64-bit Windows installer here
Go through the install process leaving everything as default and installing yt-dlp and FFmpeg when prompted
Go through the tutorial just to get a sense of how things are laid out (it’s a lot to take in so don’t expect to remember everything, and I’m going to guide you through the exact steps so don’t worry)
Add your channel by first copying the /videos URL (e.g. https://youtube.com/@MrBeast/videos), clicking the Add channel icon (second from the left in the toolbar), and entering the channel name (this will be the name of the folder that videos are stored in). If the URL isn’t automatically grabbed, paste it into the second box.
Right-click the channel in the left-hand menu and select Downloads -> Apply download options
Make sure Create new download options is selected and click OK
(Optional) Give the options a sensible name, e.g. “Japanese Audio with embedded English SRT”
Paste these options into the Additional download options box:
--format bv*+ba[language=ja]/bv*+ba[language=en]/bv*+ba/best
--convert-subs srt
--compat-options no-keep-subs
Explanation of the options:
--convert-subs is pretty self-explanatory: it will convert the YouTube VTT subs to SRT.
--format: the format selection is a hierarchy delineated by the /. First, it tries to download the best video with the best audio in Japanese (bv*+ba[language=ja]). If Japanese audio isn’t present, it tries English audio (bv*+ba[language=en]). If neither is present (which can also happen if the uploader failed to mark the language correctly), it grabs whatever the default audio track is. If all else fails, it grabs the best combined format (this should realistically never happen on YouTube). If you dislike any of those fallback options and/or would prefer that the download simply fail, feel free to delete any/all of them along with the preceding /, although I recommend at least leaving bv*+ba. For your use case, --format bv*+ba[language=ja] is the bare minimum, which will fail if there isn’t an audio track explicitly labeled as Japanese.
--compat-options no-keep-subs is necessary to make sure the subtitles are deleted after merging them into the MKV, since the options we will be setting through the GUI include both --write-subs and --embed-subs, and the default behavior in this scenario is to both embed the subs and write them to an external file. If you prefer to keep the external SRT file, simply remove this line.
(Optional) Click the Files tab at the top and customize the filename format. Personally, I’m partial to %(upload_date)s %(title)s-%(id)s.%(ext)s so that I can naturally sort things by upload date and easily go between URLs and videos (since YouTube URLs are just https://youtube.com/watch?v=[id]), but if you’re happy with the default title-only you can leave this be.
Click the Formats tab at the top. Set the drop-down for If merge is required after post-processing, output this format: to mkv. It will give you a warning that you need to also add it above, but as far as I can tell this is neither true (works fine without it) nor possible (mkv isn’t even listed there). If you do prefer specific video/audio formats or want a specific/maximum resolution, let me know and I can change the --format option to accommodate that preference, since unfortunately this tab doesn’t account for multiple audio tracks.
Click the Subtitles tab at the top. Ensure that Download subtitle file for these languages: is selected and that English [en] is listed (if your default Windows language is English I think it’ll already be there, but if not, add it from the list on the left). Note that this will not grab the automatically-generated subtitles from YouTube, but it sounds like you don’t need these for your specific situation.
Click the More options sub-tab. Under Preferred subtitle formats write srt/best (I honestly don’t think this will affect YouTube since all subs seem to be VTT, but it can’t hurt). More importantly, check the box for During post-processing, merge subtitles file with video.
Click OK in the lower-right to save the download options. You’re done with the setup!
If you want to download the entire channel in this way, right-click the channel in the left-hand menu and click Download channel. You can monitor the download progress in the Progress tab and see the raw yt-dlp command line output in the Output tab. If you only want certain videos, instead choose Check channel. This will grab all the metadata for the channel’s videos, displaying them as a grid of thumbnails, and then you can select them through the GUI and download the specific ones you want. It also might be a good idea to do this if you want to test the options on one video to make sure you’re getting the result you want before going all-in on downloading the channel.
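If the fallback order of the --format string from the options above is hard to picture, here’s a deliberately simplified Python model of it. To be clear, this is not how yt-dlp actually parses selectors: it only looks at the audio half of each alternative, it assumes the track list is already sorted best-first, and the track IDs in the usage example are made up.

```python
import re

SELECTOR = "bv*+ba[language=ja]/bv*+ba[language=en]/bv*+ba/best"


def pick_audio(selector, audio_tracks):
    """Return (alternative, track) for the first alternative that matches.

    `audio_tracks` is a list of dicts such as {"id": "251-1", "language": "ja"},
    assumed to be sorted best-first. Video selection is ignored entirely.
    """
    for alternative in selector.split("/"):
        if alternative == "best":
            # Pre-merged format; not modelled beyond reporting the fallback.
            return alternative, None
        wanted = re.search(r"ba\[language=([\w-]+)\]", alternative)
        for track in audio_tracks:
            # No language filter means "take the first (best) track".
            if wanted is None or track.get("language") == wanted.group(1):
                return alternative, track
    # Every alternative failed, so the download would fail too.
    return None, None


tracks = [{"id": "251-0", "language": "en"}, {"id": "251-1", "language": "ja"}]
print(pick_audio(SELECTOR, tracks))
```

With both tracks present the first alternative wins and the Japanese track is picked. Drop the Japanese track from the list and the second alternative takes over, and so on down the chain.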
Looking over the yt-dlp output as a sanity-check, I can confirm it does the following things:
Writes en.vtt subtitles (English subtitles in the default YouTube format)
Selects the best video format (1080p VP9)
Selects the audio format 251-1 (which is the best Japanese audio on this particular video)
Converts subtitles to SRT
Merges all three tracks into MKV
Deletes external SRT
which I think is all the functionality you requested! Let me know if you have any further questions and I’ll do my best to answer them.
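One more note on the optional filename template from the Files tab step: yt-dlp’s output templates follow Python’s %-style string formatting, so you can preview roughly what a template will produce in plain Python. The metadata values below are made up, and real yt-dlp also sanitizes titles for the filesystem, which this sketch doesn’t do.

```python
# Example metadata; the values are made up for illustration.
info = {
    "upload_date": "20230911",  # yt-dlp reports dates as YYYYMMDD
    "title": "Example video",
    "id": "abc123DEF45",
    "ext": "mkv",
}

template = "%(upload_date)s %(title)s-%(id)s.%(ext)s"
filename = template % info
print(filename)  # 20230911 Example video-abc123DEF45.mkv
```

Because the date comes first and is zero-padded, a plain alphabetical sort of the folder is also a chronological one.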
yt-dlp is gonna be the go-to tool for any YouTube downloading, but I don’t have much experience with frontends for it. I use Tartube for archiving channels, but it can be a bit byzantine and might be overkill for what you need. Plus, there’s a decent chance you will need to manually enter some yt-dlp options anyway (although only during the setup process). That being said, it’s the only one I have experience with, so it’s the one I’ll recommend!
Couple of clarifying questions:
When you say “download a YouTube channel in a particular language”, do you just mean a general monolingual channel (e.g. Masahiro Sakurai’s Japanese channel), or do you mean a channel that has videos with multiple audio tracks (such as this video with three different language tracks)? Both are doable, but I think you’ll need to add an actual command line flag for the latter whereas the former should be achievable pretty simply through Tartube’s GUI.
Are the subtitles you’re talking about added by the uploader, or are they auto subs (in this case, auto subs that are auto translated)? Both are easily achievable through the GUI, just slightly different instructions for either one. Also, depending on the scope of things, the simplest approach might be to simply download all subtitles (may not want to do that for like a MrBeast video with a dozen subtitle tracks), which also sidesteps the possible issue where the language of tracks isn’t properly indicated by the uploader.
When you say “put all streams for a single video together”, do you mean that you don’t want the video and audio tracks merged into a single file, or just that when you try to download the video you get a pre-merged file that doesn’t contain the tracks that you want? Was a little confused by this part.
I know you’re looking for a GUI solution, but while I wait for clarification I might as well drop a basic yt-dlp command to give you an idea of the parameters we’re dealing with (here I’m assuming separate audio tracks and uploader-added subs):
yt-dlp --format "bv+ba[language=ja]" --sub-langs en --write-subs --convert-subs srt --download-archive channel_archive.txt video_or_channel_url_goes_here
--format bv+ba[language=ja]: gets the “best” video track and Japanese audio track (for a 4K video yt-dlp prefers the VP9 encode, but if it’s a video with a lot of views there may also be an AV1 encode. If you want that AV1 encode you have to explicitly opt for it by using bv[vcodec^=av01] instead of plain bv)
--sub-langs en: downloads English subtitle(s)
--write-subs: writes subs to an external file (as opposed to embedding them)
--convert-subs srt: converts subs to srt format, if possible
--download-archive channel_archive.txt: writes the IDs of successfully downloaded videos to the specified file channel_archive.txt. If you re-run this command, these videos will be automatically and very speedily skipped over without needing to fetch any additional information. Even without this option, yt-dlp is smart enough to skip over videos that have already been downloaded (assuming the output filenames will be the same), but it will go through the entire process of fetching all the video information for each video up to the point it is about to start downloading, which is a huge waste of time if you’re just updating a channel archive and need only the newest three videos.
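To illustrate why the archive file makes re-runs so quick, here’s a toy Python model of the idea. It is not yt-dlp’s code: the download callback is a stand-in, the video IDs are made up, and the “youtube ID” line format is my assumption about how entries are stored.

```python
from pathlib import Path


def load_archive(path):
    """Read the archive file into a set of entries (one per line)."""
    archive = Path(path)
    if not archive.exists():
        return set()
    return {line.strip() for line in archive.read_text().splitlines() if line.strip()}


def download_new(video_ids, archive_path, download):
    """Call `download(video_id)` only for IDs missing from the archive."""
    seen = load_archive(archive_path)
    fetched = []
    with open(archive_path, "a") as archive:
        for video_id in video_ids:
            entry = f"youtube {video_id}"  # assumed line format
            if entry in seen:
                continue  # skipped without fetching any metadata
            download(video_id)
            # Record the ID only after the download succeeded.
            archive.write(entry + "\n")
            seen.add(entry)
            fetched.append(video_id)
    return fetched
```

On a second run over the same channel, only IDs that aren’t in channel_archive.txt ever reach the download step, which is the time saver described above.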
Everything in that command (except for the audio track bit, to my knowledge) can be handled in the Tartube GUI in relatively simple fashion, provided you know which menus to dig into.
edit: forgot the URL in my command, kinda important!
Real babies in incubators
Link to the report