Sebastian Krzyszkowiak @dos@librem.one

2.55K Posts

630 Following

1.08K Followers

Homepage: https://dosowisko.net

Games: https://dos.itch.io

Holy Pangolin: https://holypangolin.com

Liberapay: https://liberapay.com/dos

Hi, I'm dos. Silly FLOSS games, open smartphones, terrible music and more. 50% of @holypangolin; 100% of dosowisko.net. he/him/any. I don't receive DMs.

Joined Apr 2019

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 05, 2025, 12:28

@pavel Not sure what you mean. GStreamer is internally multi-threaded, but its API is thread-safe and there's only one thread in this code. Of course any kind of production-quality code will use some mainloop and enqueue buffers based on callbacks rather than a while(!processed){} loop, but it's not exactly rocket science.

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 05, 2025, 12:22

@pavel You've got a dma-buf handle, an already mapped buffer and even GStreamer with all its sinks available, so... however you want? Pretty much anything will be able to consume it easily.

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 04, 2025, 16:04

@pavel I'm playing with GStreamer now (which is new for me) and it seems like most of this code could be replaced with GStreamer elements, and the rest should neatly plug in as custom elements 😂

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 04, 2025, 13:45

@pavel Yes, of course.

BTW. Turns out that streaming to YouTube instead of a local file is just a matter of using rtmpsink instead of filesink 😁

[Image: eff4dcd93e28ca96.png]
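
To illustrate the "just swap the sink" point, here is a minimal sketch that builds two `gst-launch-1.0`-style pipeline descriptions differing only in their last element. The actual pipeline from the thread isn't shown, so the element chain (`videotestsrc`, `x264enc`, `flvmux`), the file name and the RTMP URL are all assumptions picked for illustration; only `filesink` and `rtmpsink` come from the post.

```python
# Assumed encode chain; the real one in the thread is fed from the camera/GPU,
# not from videotestsrc. flvmux is used because RTMP carries FLV.
ENCODE_CHAIN = (
    "videotestsrc is-live=true ! videoconvert ! "
    "x264enc tune=zerolatency ! h264parse ! flvmux streamable=true"
)


def file_sink(path: str) -> str:
    """Sink element that writes the muxed stream to a local file."""
    return f'filesink location="{path}"'


def rtmp_sink(url: str) -> str:
    """Sink element that pushes the muxed stream to an RTMP server."""
    return f'rtmpsink location="{url}"'


def pipeline_description(sink: str) -> str:
    """Join the shared encode chain with whichever sink was chosen."""
    return f"{ENCODE_CHAIN} ! {sink}"


if __name__ == "__main__":
    # Both URLs/paths below are placeholders, not real endpoints.
    print(pipeline_description(file_sink("out.flv")))
    print(pipeline_description(rtmp_sink("rtmp://example.invalid/live/STREAM_KEY")))
```

Everything upstream of the sink stays identical, which is why the switch is a one-element change.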

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 04, 2025, 09:01

@pavel Pretty sure it will just work fine once it's rewritten cleanly and does such arcane magic as releasing the buffers at the right time etc. :)

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 03, 2025, 23:01

@pavel When I lie to GStreamer and tell it that its input is in YUY2, it gets faster - perhaps even fast enough to encode at 1052x780. That's another opportunity for improvement.

(and there's nothing magic about fences, it's just a simple synchronization primitive 😛)

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 03, 2025, 22:57

@pavel Toggling the killswitch makes it appear though.

IIRC PDAF was also usable at half-res.

RAW10 is just a matter of setting up clocks for higher bandwidth and more lanes. Switching data format is then just a single register away.
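
A rough sketch of why 10-bit needs more link bandwidth: raw payload scales linearly with bits per pixel. The 1052x780 resolution appears elsewhere in this thread; the 30 fps rate and the idea that the current mode is 8-bit are assumptions, and blanking and protocol overhead are ignored.

```python
def payload_bits_per_second(width: int, height: int,
                            bits_per_pixel: int, fps: int) -> int:
    """Raw pixel payload only: no blanking, no CSI-2 packet overhead."""
    return width * height * bits_per_pixel * fps


if __name__ == "__main__":
    fps = 30  # assumed frame rate, not stated in the thread
    raw8 = payload_bits_per_second(1052, 780, 8, fps)
    raw10 = payload_bits_per_second(1052, 780, 10, fps)
    print(f"8-bit:  {raw8 / 1e6:.1f} Mbit/s")
    print(f"10-bit: {raw10 / 1e6:.1f} Mbit/s")
    print(f"ratio:  {raw10 / raw8:.2f}x")
```

Whatever the real resolution and rate are, going from 8 to 10 bits per pixel means 25% more payload, which is where the faster clocks or extra lanes come in.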

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 03, 2025, 22:19

@pavel There's a question whether it will be worth the elevated power consumption though. I've also stumbled upon csi erroring out with "Rx fifo overflow", requiring a reboot to recover, which I haven't seen at lower resolutions, but I haven't looked closer.

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 03, 2025, 21:58

@pavel LDLIBS = -lEGL -lGLESv2 -lm -ldrm -I/usr/include/libdrm -lgbm -lgstvideo-1.0 -lgstapp-1.0 -lgstallocators-1.0 -lgstreamer-1.0 -lgobject-2.0 -lglib-2.0 -I/usr/include/gstreamer-1.0 -I/usr/include/glib-2.0 -I/usr/lib/aarch64-linux-gnu/glib-2.0/include

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 03, 2025, 21:57

@pavel There's plenty of low-hanging fruit in there. Higher frame rates and 10-bit output are also likely a debugging session or two away 😜

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 03, 2025, 21:45

@pavel (the parts that I added at least, there are parts of your code in there still)

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 03, 2025, 21:44

@pavel Good question. Not sure what license would be appropriate to put on something that's mostly an output of a model trained on code under all sorts of licenses anyway...

But given that it's just a bit of glue code between three APIs put together as an example, consider it to be under MIT-0 😜

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 03, 2025, 21:36

@pavel BTW. The fact that I could stream full-res frames and bin them down in the shader in real time is interesting news, as this may open up the possibility to use phase detection autofocus.
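
For readers unfamiliar with binning, here is a CPU-side sketch of the idea. The actual shader isn't shown in the thread, so this assumes the simplest variant: plain averaging of square blocks on a single-channel image, ignoring the Bayer pattern. `bin_down` is a hypothetical helper name.

```python
def bin_down(image, factor):
    """Average each factor x factor block of a 2D list into one output pixel."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be a multiple of the binning factor")
    out = []
    for block_y in range(0, height, factor):
        row = []
        for block_x in range(0, width, factor):
            total = sum(
                image[y][x]
                for y in range(block_y, block_y + factor)
                for x in range(block_x, block_x + factor)
            )
            row.append(total // (factor * factor))  # integer mean of the block
        out.append(row)
    return out


def binned_size(width, height, factor):
    """Output dimensions after binning by the given factor."""
    return width // factor, height // factor
```

With the resolutions mentioned in this thread, `binned_size(1052, 780, 2)` gives 526x390 and `binned_size(1052, 780, 4)` gives 263x195.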

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 03, 2025, 21:12

@pavel The first thing to do to improve it (after cleaning it up) would be to actually make use of the buffer pool. Dequeue the buffer, attach it as a texture, kick off rendering, get a fence and pass it with the output buffer to GStreamer without waiting on rendering to finish, then queue it back asynchronously once rendering is done. This should allow for much more complex shaders than this sequential code does.
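
The flow described above can be modelled without any real V4L2, EGL or GStreamer calls. In this sketch a `concurrent.futures.Future` stands in for the GPU fence, a worker pool stands in for the GPU, and `BufferPool`, `render` and `run_pipeline` are hypothetical names. It only shows the ordering of steps, not the real APIs.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


class BufferPool:
    """Hypothetical stand-in for a pool of camera buffers."""

    def __init__(self, count):
        self._free = queue.Queue()
        for index in range(count):
            self._free.put(index)

    def dequeue(self, timeout=1.0):
        # Blocks while every buffer is still in flight.
        return self._free.get(timeout=timeout)

    def requeue(self, index):
        self._free.put(index)

    def free_count(self):
        return self._free.qsize()


def render(buffer_index, frame_no):
    """Stand-in for the GPU work that reads the input buffer."""
    return frame_no


def run_pipeline(frames, pool_size=3):
    pool = BufferPool(pool_size)
    handed_off = []
    with ThreadPoolExecutor(max_workers=pool_size) as gpu:
        for frame_no in range(frames):
            buf = pool.dequeue()                       # 1. dequeue a buffer
            fence = gpu.submit(render, buf, frame_no)  # 2. kick off rendering
            handed_off.append(fence)                   # 3. hand off with the fence, no waiting
            # 4. give the buffer back only once rendering has finished
            fence.add_done_callback(lambda _f, b=buf: pool.requeue(b))
    # The consumer waits on each fence only when it actually needs the data.
    order = [fence.result() for fence in handed_off]
    return order, pool.free_count()
```

The producer loop never blocks on rendering itself; it only blocks in `dequeue` when the whole pool is in flight, which is the back-pressure you want.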

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 03, 2025, 21:04

@pavel https://paste.debian.net/1384224/

It's ugly, hardcodes everything, lies about frame timing, occasionally segfaults. Most of it is copied straight from an LLM, I just massaged the pieces to work together. Not the kind of code I'd like to sign off on :) But it's a working example, so have fun with it.

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 03, 2025, 20:26

@pavel Seems it's the latter, as the result's exactly the same with 1052x780 camera frames and 263x195 video 😁

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 03, 2025, 20:15

@pavel Plugged it into V4L2 - with a caveat that for now I fed the GPU full-res 13MP frames to meet the stride alignment requirement (the shader output is still 526x390). It says it does 240 frames in 10.55s. I wonder if it's really slightly too slow, or just bad timing from our camera stack :)
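
The quoted numbers work out to a bit under 23 fps. A quick sketch of the arithmetic; the intended frame rate isn't stated in the post, so none is assumed here:

```python
def effective_fps(frames: int, seconds: float) -> float:
    """Average frame rate over the whole run."""
    return frames / seconds


def time_per_frame_ms(frames: int, seconds: float) -> float:
    """Average wall-clock time spent per frame, in milliseconds."""
    return seconds * 1000.0 / frames


if __name__ == "__main__":
    print(f"{effective_fps(240, 10.55):.2f} fps")
    print(f"{time_per_frame_ms(240, 10.55):.2f} ms per frame")
```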

[Video: d74ce9e196831394.mp4]

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 03, 2025, 19:06

@pavel > I can't easily connect gstreamer to that

Why not? I quickly hacked up passing dma-bufs to GStreamer and even though I'm glFinishing and busy-waiting on a frame to get encoded sequentially it still manages to encode a 526x390 h264 stream in real time on L5.

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 03, 2025, 15:07

@pavel That said, rendering to a linear buffer can be slower, that's expected. The question is whether gains from passing buffers around for free are higher, which for an actual "record video from a camera" use case will almost certainly be true (and which has very different performance characteristics from reading images from files - you can't directly attach a file as a texture).

**Sebastian Krzyszkowiak** @dos@librem.one · Jul 03, 2025, 15:00

@pavel I left the memcpy line commented out for a reason - with it uncommented, the result is exactly the same as with glReadPixels (which is effectively a memcpy on steroids). The point is to pass that buffer to the encoder directly, so it can read the data straight from the output buffer without waiting for memcpy to conclude.

I've also verified that the approach is sound by having the shader output different values each frame and accessing it via hexdump_pixels inside the loop. Still fast ;)
