I have a big 49" wide screen monitor and sharing my screen in Google Meet was cumbersome because you can only share a window or the whole screen, but not a screen region.
So I wrote a small tool that uses the xrandr extension to mirror an area to a virtual monitor which then can be shared.
See my blog post for some more details: https://www.splitbrain.org/blog/2024-10/11-introducing_clips...
I love how simple this is. Barely 100 lines of C++ (ignoring comments). That's one thing that makes me prefer X11 over Wayland.
The code is a little weird. There is no XLib event loop. It calls sleep(100) in a loop until it hits SIGINT. That will have high cpu usage for no reason.
It will not. Even adding just a 1ms sleep in a loop will drop CPU usage to barely noticeable levels; 10 wakes a second is barely anything for any CPU from the past 3 decades.
Not my experience at all. Granted I haven't tried writing a loop like this in 20ish years, because once you spot that mistake you don't tend to make it again, and CPUs are better now.
Another thing to note is when you call sleep with a low value it may decide not to sleep at all, so this loop just might be constantly doing syscalls in a tight loop.
> Not my experience at all. Granted I haven't tried writing a loop like this in 20ish years, because once you spot that mistake you don't tend to make it again, and CPUs are better now.
You can trivially verify it by running the following. I have personally been using "sleep for 1ms in a loop to prevent CPU burn" for years and never noticed it having any impact; it's not until I go into microseconds that I start noticing my CPU doing more busy work.
// g++ -std=c++20 -osleep sleep.cpp
#include <thread>
#include <chrono>

int main(int, char **)
{
    while (true) {
        std::this_thread::sleep_for(std::chrono::milliseconds {1});
    }
    return 0;
}
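If you want a number rather than a glance at top, one option is to compare CPU time against wall-clock time over a run of 1 ms sleeps. This is only a sketch: `sleep_loop_cpu_fraction` is a made-up helper name, `std::clock()` is assumed to report process CPU time (true on POSIX systems, not on Windows), and the result will vary with kernel, timer slack, and system load.

```cpp
#include <chrono>
#include <ctime>
#include <thread>

// Hypothetical helper: performs `iterations` sleeps of 1 ms each and returns
// the fraction of the elapsed wall-clock time that the process spent on the CPU.
double sleep_loop_cpu_fraction(int iterations)
{
    using wall_clock = std::chrono::steady_clock;

    const std::clock_t cpu_start = std::clock();    // process CPU time (POSIX)
    const auto wall_start = wall_clock::now();      // elapsed real time

    for (int i = 0; i < iterations; ++i) {
        std::this_thread::sleep_for(std::chrono::milliseconds{1});
    }

    const double cpu_seconds =
        static_cast<double>(std::clock() - cpu_start) / CLOCKS_PER_SEC;
    const double wall_seconds =
        std::chrono::duration<double>(wall_clock::now() - wall_start).count();

    return cpu_seconds / wall_seconds;
}
```

Printing `sleep_loop_cpu_fraction(1000)` from a small main gives the share of one core the loop consumed; a value near 1.0 would mean the loop is effectively spinning.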
> Another thing to note is when you call sleep with a low value it may decide not to sleep at all, so this loop just might be constantly doing syscalls in a tight loop.
On what system? AFAIK, if your sleep time is low enough, it will round up to whatever is the OS clock resolution multiple, not skip the sleep call completely. On Linux, it will use nanosleep(2) and I cannot see any mention of the sleep not suspending the thread at all with low values.
If memory serves, Windows treats a sleep under the scheduler quantum length as a yield. It may take you off the cpu if there's something else to run but it may not.
At any rate, back to the code at hand, there are many ways to block on SIGINT without polling. But it's also hugely odd that this code does not read events from the X11 socket while it does so. This is code smell, and a poorly behaved X client.
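For what it's worth, one of those ways on POSIX systems is to block the signal and then wait for it with sigwait(3), which parks the thread in the kernel until the signal arrives. A minimal sketch, not the tool's actual code: the two helper names are made up, and including SIGTERM is my own assumption about what a clean exit should cover.

```cpp
#include <csignal>
#include <signal.h>

// Hypothetical helper: fills `set` with SIGINT and SIGTERM and blocks them,
// so they stay pending instead of killing the process.
// Call it early, before any other threads are started. Returns true on success.
bool block_exit_signals(sigset_t &set)
{
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    sigaddset(&set, SIGTERM);
    return sigprocmask(SIG_BLOCK, &set, nullptr) == 0;
}

// Hypothetical helper: sleeps in the kernel until one of the signals in `set`
// is pending, then returns its number (or -1 on error). No polling involved.
int wait_for_exit_signal(const sigset_t &set)
{
    int signo = 0;
    return sigwait(&set, &signo) == 0 ? signo : -1;
}
```

In a main function, `block_exit_signals` would come first, then the X setup, then `wait_for_exit_signal`, and finally the cleanup. This only covers the polling complaint; it does nothing about reading events from the X connection.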
In Wayland you just start a capture with the xdg-desktop-portal API and it notifies the user and lets them select the area to capture.
Yes, but I believe op was referring to how interacting with all things Wayland seems to be more involved than with X11. I'm not sure that's actually the case, I have zero experience in developing for Wayland, but I think this is what op meant.
From a quick "how do I implement this in Python" with ChatGPT it seems to be about 30 lines, since most of the heavy lifting is done for you by the API.
As someone who uses LLMs regularly to assist in code creation, take that output with a huge grain of salt until you've actually tested it. Especially as it relates to Wayland, I've pulled my hair out trying to get an LLM to assist with very similar tasks to this.
there's very little code because there's very little error handling / sanity checking. not saying X11 isn't hackable and cool, but a lot of code gets bloated and complex (and robust!) by not assuming perfect usage.
for example. run ./clipscreen 1 2 3 4
True. If something goes wrong this will just crash. But to be fair, the only error handling I could think of would probably just exit with a vague error message... Pull requests to make it more robust welcome anyway!
To the parent, splitbrain just got you to QA this for him. The true cost of software is the maintenance and QA, and he got you to do free work, and here I am doing free work writing about it. How hard we BOTH just got pwned! </joke>
will work for food
haha yeah, it's ok for a tool, it's really cool honestly :p just commenting on the 'so little code'. might be good to check if the x, y etc. are within the screen / set resolution perhaps.
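A bounds check along those lines could look roughly like this. It's a sketch only: `check_region` is a hypothetical helper, not something from clipscreen, and the screen size is assumed to be passed in by the caller (with Xlib it could come from DisplayWidth and DisplayHeight).

```cpp
#include <string>

// Hypothetical validation helper; not part of clipscreen.
// Returns an empty string if the region fits on the screen,
// otherwise a short message describing what is wrong.
std::string check_region(long width, long height, long x, long y,
                         long screen_width, long screen_height)
{
    if (width <= 0 || height <= 0)
        return "width and height must be positive";
    if (x < 0 || y < 0)
        return "x and y must not be negative";
    // Written as a subtraction so that huge inputs cannot overflow the sum.
    if (width > screen_width - x || height > screen_height - y)
        return "region extends beyond the screen";
    return "";
}
```

The caller would print the returned message and exit with a non-zero status before touching the X server whenever the string is not empty.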
This certainly is an elegant X.org party trick that can't be done easily in almost any other windowing system: creating a virtual Xrandr display that overlaps with existing physical displays. It's slightly awkward since if it exits outside of sigint it will leave a virtual output and no overlay window but that's a pretty minor issue. (All of that having been said, I would strongly advise to not over-index on SLoC as a measure of quality or elegance.)
This flat-out can't be done in Wayland. Though all is not lost, you might not need this at all in Wayland. The standard way to capture the screen from an unprivileged process in Wayland is through desktop portals, and at least KDE supports a wide variety of different capture options including capturing a rectangle of the screen. I haven't tried, but I suspect this is even true when running X.org applications, thanks to XWaylandVideoBridge.
I am not really thrilled about D-Bus stuff everywhere, but it is nice that you can pretty much override any screen capture behavior you want by changing the org.freedesktop.impl.portal.ScreenCast implementation: I think that's actually a step in a better direction versus having every application implement its own functionality for selecting capture targets.
Is it much more difficult under Wayland?
Yeah. I mean, not to deny the decades of arguments over its warts, but it's kind of amazing to me the extent to which X11 has emerged as, well, the simplest/best and most hackable desktop graphics environment available. You want to play a trick, it's right there. The ICCCM got a ton of hate back in the early 90's, but... no one else has an equivalent and people still innovate in the WM space even today.
Hackable is right. But not always in the positive sense of the word.
I find it very interesting how much our threat model has changed in the last 10-15 years. We no longer trust even local software, as we have to assume everything is now malicious. Commercial software from "reputable" companies can't be trusted to not pull a ton of analytics and personal data off your computer. We now have to worry about every piece of software being a keylogger and spying on other windows/applications and reporting back.
We've had to give up so much flexibility. Wayland certainly focuses on plugging this hole, but it means we've lost all these cool utilities like this one. There was just so much you could do with devilspie, xdotool, and others to make sure my operating system and window environment worked for me.
I still really miss X11's Zaphod mode, where you had two independent X sessions (:0.0 and :0.1) on two different monitors, with different window managers and different windowing rules.
I miss the days of being able to trust my computer and trust my software.
If you can't trust your locally installed software, everything is lost. I understand where this new threat model comes from for some people but I'd rather continue to avoid bad software sources than hamstring my OS in the hopes of avoiding malware I installed on purpose.
I agree. But can you trust Zoom? What about Office or Photoshop? Can you trust Websites or your browser anymore? Even open source apps have analytics in them that may not be trustworthy anymore (firefox, audacity, ...).
> If you can't trust your locally installed software, everything is lost.
That's only true if you decide to trust it.
You can deal perfectly well with software you distrust, and not have it harm your system.
FWIW, the threat model you're imagining is an attacker being able to run code to display directly to the desktop using the lowest level native API. A local[1] code exploit at the level of an interactive user is already a huge failure in the modern world.
Is that a reasonable argument against using X11? Sure, for some use cases. Is it a good argument for wayland/windows/OSX/whatever to do your tiling WM experimentation? Not really, those environments kinda suck for playing around with.
[1] Or "local-ish", your system or a trusted remote has to have been compromised already. Untrusted X11 protocol still exists but is deliberately disabled (and often blocked) everywhere. Even ssh won't forward it anymore unless you dig out the option and turn it on manually.
Isn't any app that can access the X11 socket able to read any input? It's not just running an explicitly malicious app but also the risk of compromising an app which can read the X11 socket (e.g. Firefox).
It's also why there existed more advanced security extensions for X11 (like security labels for windows), but also why even bare-bones X11 had methods to ensure that only one specific application was getting input, specifically to handle secure input like with passwords.
Yes, exactly. I'm just saying that the response to a remote browser exploit in firefox is more likely to be "YIKES ZERO DAY IN FIREFOX!!!!!" and not "well it's a good thing we're running it in windows so it can't screenshot other apps or inject key events".
It's not like it's not a valid argument, just that it's sort of a nitpick. Security is hard, and defense in depth is a thing, but this particular attack surface is way, way back in the "depth" stack for a modern app deployment.
Javascript has managed to even ruin the linux desktop. Running every random JS application sent to your browser VM makes the browser insecure which means the entire computer can't be trusted. This is the reason things like the waylands enforce a smartphone like model of security where the user's applications aren't allowed to communicate or interact with other elements of the graphical desktop. Applications aren't trusted. So the user isn't trusted. A trade-off not worth it.
X11 is the opposite of simple and hackable. What you are thinking of as "hackable" is actually the result of it having a ton of legacy features that enable users to do neat tricks.
Wayland breaks a lot of these tools because it is so much simpler than X.
My window manager started out as ~50 lines of Ruby copying an equivalent amount of C.
You can say many things about Wayland, but it's "simple" from a point of view I for one really do not care about. Wayland may be "simple" in some respects, but it makes most of the things I care about doing unnecessarily complex.
Wayland probably would have been better if wlroots had been developed as a (whatever this means) first-party “built-in” library.
Lacking features isn't the same thing as "simpler". Wayland is great, but is very much a subset of the features implemented on an X11 desktop. Wayland doesn't do selections or provide any IPC mechanism of its own, much less something like an ICCCM that allows you to identify/target other users of the desktop and interact with them in a flexible way. As I understand it, the linked tool is in fact impossible to write in Wayland.
Again, this isn't the fault of "Wayland", which is just a compositor framework. The complaint is that the ecosystem of "desktop" software which evolved around Wayland is an ad hoc monstrosity that lacks the unified structure that its ancestor had way back in the X11R5 days.
The most hackable would have been a Lisp based desktop.
Do I understand correctly that you could do this with OBS on any platform, including Wayland? I'm reading many comments that make me think either many people don't know about OBS, or I'm overestimating its abilities.
You probably can. I never used OBS, but it's probably a bit more than a 20kb binary though ;-)
I don't understand, what is the significance of a 20kb binary? The only person using this would be someone who takes Zoom meetings on a company-issued computer and I can't imagine such machines are disk space-constrained.
Also, I remember a friend showing me in Zoom that you can share not just one but multiple screens/windows: hold the Shift key while clicking the windows you want to share.
How do people discover these things?
Isn't that the same as how you select multiple files in most file managers?
Shift+Click: select from currently selected item to clicked item
Ctrl+Click: add/remove clicked item to set of selected items
The same question I asked (that was me after using Zoom for 3+ years).
This is surely useful right now. I wonder what will happen to all the nice X11 tools once Wayland (hopefully soon) becomes the gold standard. There are options to enable X11 behaviors in Wayland but I guess that is just a fallback to the insecure implementation.
Wow, this is fantastic! This exact use case, on Linux, is why our company selected Zoom instead of Meet.
Awesome!
Built it and took a fullscreen screenshot with GIMP to figure out the width/height/x/y coordinates I wanted and tested with Google Meet. Working perfectly!
https://github.com/naelstrof/slop Can also use a utility like this one, which lets you select an area of the screen and output it in a specified format.
Wow that is also very cool. For those wondering, this is what it looks like:
$ sudo apt install slop
$ slop
<selects an area on screen>
1719x1403+1080+277
Putting the two together is easy too:
$ clipscreen $(slop -F "%x %y %w %h")
NB. The lack of quotes around $() enables wordsplitting to occur.
I think you got the size and position switched.
Or
$ clipscreen $(slop | tr -s "+x" " ")
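If a tool wanted to accept slop's default WxH+X+Y string directly instead of relying on shell word splitting, the parsing is small. A sketch, with a made-up `Region` struct and `parse_geometry` function; clipscreen is not known to take its arguments in this form.

```cpp
#include <cstdio>
#include <string>

// Hypothetical container for the four values in a "WxH+X+Y" string.
struct Region {
    int width;
    int height;
    int x;
    int y;
};

// Parses a geometry string such as "1719x1403+1080+277" into `out`.
// Returns false if the text does not match the pattern or has trailing junk
// (trailing whitespace, such as the newline slop prints, is accepted).
bool parse_geometry(const std::string &text, Region &out)
{
    Region r{};
    char trailing = 0;
    // The final " %c" only converts if a non-space character follows the
    // fourth number, so a result of exactly 4 means a clean match.
    const int matched = std::sscanf(text.c_str(), "%dx%d+%d+%d %c",
                                    &r.width, &r.height, &r.x, &r.y, &trailing);
    if (matched != 4)
        return false;
    out = r;
    return true;
}
```

With this, the slop output shown above would yield width 1719, height 1403, x 1080 and y 277, the same order the tr variant produces.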
I've been looking for something like this for quite some time. It's simple, clean and elegant.
This is only helpful if you are using a desktop environment. What about window managers like i3?
I've always wanted something like this, but for i3 workspaces. Something like "share workspace 2." Anyone know how to accomplish this?
You can literally do this with just xrandr.
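Presumably this refers to xrandr's --setmonitor option, which defines a monitor from a geometry of the form w/mmwxh/mmh+x+y and accepts the keyword none in place of an output list. Below is a sketch that only builds the command lines. The function names are made up, and reusing the pixel counts as the physical millimetre sizes is a placeholder assumption, not a recommendation.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: builds an "xrandr --setmonitor" command for a virtual
// monitor covering the given pixel region. The physical size in millimetres
// is simply set to the pixel counts as a placeholder.
std::string setmonitor_command(const std::string &name,
                               int width, int height, int x, int y)
{
    std::ostringstream cmd;
    cmd << "xrandr --setmonitor " << name << ' '
        << width << '/' << width << 'x'
        << height << '/' << height
        << '+' << x << '+' << y
        << " none";   // "none": the monitor is not tied to a physical output
    return cmd.str();
}

// Hypothetical helper: builds the command that removes the monitor again.
std::string delmonitor_command(const std::string &name)
{
    return "xrandr --delmonitor " + name;
}
```

For a monitor named clip, `setmonitor_command("clip", 1280, 720, 100, 100)` produces `xrandr --setmonitor clip 1280/1280x720/720+100+100 none`, which can be pasted into a shell. What plain xrandr won't give you is any on-screen indication of the shared area.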
Nice. This is the first time I read about creating a virtual monitor in X.
This is brilliant. I've wanted this so many times and had to awkwardly switch between window being shared instead.
I wouldn't mind switching between windows if I could use the GNOME Activities overview for that. But maybe that is not possible because there is no way to communicate the change in stream size if the windows have different sizes?
Can you not use std::condition_variable to avoid the active waiting of the signal?
Dang. I need this for Mac. I’ve been wishing I had exactly this for years.
Wasn't the same thing posted for MacOS a few days back, can't recall the name? Looking at the time on the repo makes me think the author pushed after seeing people requesting something similar for Linux.
edit: Here you go https://github.com/Stengo/DeskPad
That's not quite the same. With DeskPad you have to move the window to the virtual monitor. clipscreen allows you to select a portion of your screen without moving any windows.
I use "Advanced Screen Share" for this purpose. It has a one-time purchase if you want to remove a small overlay, but it gets the job done and is installable through the App Store.
That's very cool... Speaking of which: any easy way to allow two people, both on X, to both share and interact (keyboard and mouse) with a common X window?
The app that we'd like to share and both control is a browser (running on a machine on our LAN) so a browser extension would work too I guess.
I think there was some way to do that with existing tools. I forget the details because I only threw it together as a bit of fun novelty. I think the terms to google are x2x and multiseat though, at least to start your search…
My preferred solution for that would be a VNC server (so that it shares the whole screen) installed in a VM.
Neat. Now I want this for Wayland. I haven't used X11 for some years.
Never too late to upgrade to X11 :)