@weirdwriter tsv_edl is an amazing caption editing system that also outputs video
https://github.com/scateu/tsv_edl.vim
Basically, you convert your video into text, edit the text to create breakpoints (and fix typos), and then output the rough-cut SRT and video files.
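
To give a rough idea of the "edit the text, get a rough cut" step, here is a minimal Python sketch. It is not tsv_edl's actual code or file format: the `Segment` class and the `fmt` and `roughcut_srt` helpers are made-up names for illustration, and it assumes each caption line already carries start and end times in seconds. You keep the lines you want, and the kept lines get re-timed back to back into a new SRT.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One caption line (hypothetical structure, not tsv_edl's own format)."""
    start: float  # seconds into the source video
    end: float    # seconds into the source video
    text: str


def fmt(t: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(t * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def roughcut_srt(kept: list[Segment]) -> str:
    """Place the kept segments back to back and render them as SRT text."""
    blocks = []
    cursor = 0.0  # current position on the rough-cut timeline
    for i, seg in enumerate(kept, start=1):
        dur = seg.end - seg.start
        blocks.append(f"{i}\n{fmt(cursor)} --> {fmt(cursor + dur)}\n{seg.text}\n")
        cursor += dur
    return "\n".join(blocks)


# Keep two lines from the transcript and drop everything between them.
kept = [Segment(0.0, 2.5, "Hello"), Segment(10.0, 11.0, "world")]
print(roughcut_srt(kept))
```

The same list of kept start/end times is what you would hand to a video tool to cut the matching clips, which is why the captions and the video stay in sync.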