@tetha

tetha@feddit.de · edited 1 year ago

I mean, to a certain degree, I can understand if people have a problem with Poettering’s approach of doing things !CORRECTLY!. Like, systemd-resolved resolving A-records with multiple addresses in a deterministic fashion, because it’s not defined not to be deterministic and because actual load balancing would be better. It’s not wrong, but it’s breaking everything. And it got patched after some uproar. And there are a few things like that.
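To illustrate why a deterministic answer order can hurt, here is a rough Python simulation. It is only a sketch under assumptions, not a model of what systemd-resolved actually did: it assumes clients always connect to the first address they get back, and the addresses and function names are made up for the example.

```python
import itertools
from collections import Counter

# Hypothetical backends behind one name (addresses from the documentation range).
ADDRESSES = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]


def deterministic_answers(addresses):
    """Yield the same address order on every lookup."""
    while True:
        yield list(addresses)


def rotating_answers(addresses):
    """Yield the address list rotated by one position per lookup (round-robin)."""
    for offset in itertools.cycle(range(len(addresses))):
        yield addresses[offset:] + addresses[:offset]


def load_per_backend(answers, lookups):
    """Count connections per backend, assuming clients use the first address."""
    return Counter(next(answers)[0] for _ in range(lookups))


if __name__ == "__main__":
    print("deterministic:", dict(load_per_backend(deterministic_answers(ADDRESSES), 300)))
    print("rotating:     ", dict(load_per_backend(rotating_answers(ADDRESSES), 300)))
```

With a fixed order, every one of these naive clients lands on the same backend, while the rotating order spreads them evenly. Anything that quietly relied on rotation for load distribution would notice the difference.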

But at the same time - I don’t think people appreciate how hard doing process management right on Linux can be, especially if the daemon to run is shitty. Like, init scripts just triggering the shutdown port on a Tomcat - except the Tomcat is stuck and not reacting to the normal shutdown port, and now you have a zombie process and an init script in a fucked up state. Or, just killing the main process and for some reason not really removing the children, so now there are zombies all over the place. Or, not trying appropriate shutdown procedures first and just killing things, “because it’s easier” - except my day just got harder with a corrupt dataset. Or, just trying soft and “Pwease wexit Mr Pwocess” signals and then just giving up. Or having “start” just crash because there was a stale PID around from an OOM-killed process. Man, I’m getting anxiety just thinking about this.
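To make that escalation concrete, here is a minimal Python sketch of the “ask nicely, wait, then kill the whole process group” pattern, plus a stale-PID check. It is an illustration under assumptions, not how systemd or any particular init script does it: the function names and the 10-second grace period are made up, and it assumes the daemon was started in its own session so that its PID doubles as the process group ID.

```python
import os
import signal
import subprocess


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists.

    Note: an exited but not yet reaped (zombie) process still counts as existing.
    """
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing, it only checks the PID
    except ProcessLookupError:
        return False  # stale PID: nothing is running under it
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stop_process_group(proc: subprocess.Popen, grace_seconds: float = 10.0) -> str:
    """Stop a daemon and its children; return "graceful" or "killed".

    Assumption: proc was started with start_new_session=True, so proc.pid
    is also the ID of the process group containing its children.
    """
    pgid = proc.pid

    # Step 1: polite request to the whole group.
    try:
        os.killpg(pgid, signal.SIGTERM)
    except ProcessLookupError:
        pass  # the group is already gone

    # Step 2: give the main process a bounded amount of time to exit.
    try:
        proc.wait(timeout=grace_seconds)
        outcome = "graceful"
    except subprocess.TimeoutExpired:
        outcome = "killed"

    # Step 3: SIGKILL whatever is left in the group, including children
    # that outlived the main process. There is a small PID-reuse race here.
    try:
        os.killpg(pgid, signal.SIGKILL)
    except ProcessLookupError:
        pass  # nothing left to kill

    proc.wait()  # reap the main process so it does not linger as a zombie
    return outcome
```

Usage would look something like `proc = subprocess.Popen(cmd, start_new_session=True)` followed later by `stop_process_group(proc)`. Even this version has holes: a child that calls `setsid` itself leaves the group and escapes the sweep. systemd tracks a service’s processes through cgroups instead, which a process cannot leave on its own.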

And that’s just talking about ExecStart and ExecStop, pretty much, which I have done somewhat correctly in a few init scripts back in the day (over months of iterating on edge cases). Now start thinking about the security features systemd-analyze security can tell you about, like namespaces, filtering syscalls, masking parts of the filesystem, … imagine doing that with the jankiness of the average init.d script. At that point I’d start thinking about rebooting systems instead of trying to restart services, honestly.

And similarly, I’m growing fond of things like systemd-networkd and systemd-timesyncd. I’ve had to try to manage NetworkManager automatically and jeez… Or just directly handling networking with network-scripts. Always a pleasure. Chucking a bunch of pretty readable ini-files into /etc/systemd/network is a blessing. They are readable even to people who are rather faint of heart when it comes to networking.

tetha@feddit.de · edited 1 year ago

And even password-based disk encryption can be defeated with 2-3 physical accesses if an organization wants to hard enough. Keyloggers can be very, very sneaky.

At that point you’d have to roll something like YubiKey-based disk encryption to be safe, because this re-establishes control over some physical parts of the system. Until they find the backup YubiKey, the one you need so that losing the primary key you carry around doesn’t cost you all your data.

It’s not a battle the defending side can win.

tetha@feddit.de · 1 year ago

And that skeleton of a system becomes easier to test.

I don’t need to test ~80 - 100 different in-house applications on however many different versions of Java, Python, .NET and so on.

Rather, I end up with 12 different classes of systems. My integration tests on a build server can check these thoroughly every night against multiple versions of the OS. And if the integration tests are green, I can be 95 - 99% sure things will work right. The dev and testing environments will figure out the rest if something wonky is going on with Docker and new kernels.

tetha@feddit.de · 1 year ago

Inside, I like Abyss, similar to the OceanDeep theme of Vim.

When I’m on the balcony, it’s hard to see anything, so then I switch to Solarized Light.

tetha@feddit.de · 1 year ago

IMO, this is the elephant in the room.

If you’re looking at what people used CentOS or Rocky or Alma for - dev systems, CI systems, … - these aren’t lost sales. If you forced them off of their solution, they aren’t going to pay the price tag and put up with the management/installation pain of RHEL. If they have people who know how to run Linux, they’ll use something else. And sure, they are drawing some resources from RH (bandwidth for packages at the very least), but they are giving the RH system a larger footprint in deployed systems. And the people running it had a positive opinion about the system.

But Oracle Linux is a different beast. Here a company is poaching large customers willing to pay for support by repackaging your product for less effort. It sucks, but it’s entirely consistent for Oracle to be part of ruining a good thing.

tetha@feddit.de · 1 year ago

A bit from an ops-side, but I think it applies. I think pair-work, pair-programming, pair-troubleshooting is a tool for specific situations. It’s amazing in some places, and an exhausting waste of company resources in other places.

Like, if we’re in a hard situation with many unknowns and possibly horrible consequences of mistakes. Critical systems down, situation is weird. Or, implementing config management for something entirely new. Or, trying out new code structures, ideas. That’s when being two is great. You can bounce changes you make to the system off of your copilot, so you can be very safe while being fast. You can have two opinions about shaping a piece of code and APIs. You can iterate very quickly if necessary.

On the other hand though, there are things that require deep thought. Like, I had to figure out how 4-5 teams use an infrastructural component, what the life cycle of the component is, when to create it, when to delete it, how to remove it. It ended up being twelve lines of code in the end, but with like 1 phone call every two lines of code, and an hour of thinking per line of code. Pair programming would not have been compatible with this.

Or, the third kind, is crunch work. The best way to do crunch work (besides automating it) is to just put on headphones, find flow and hammer it down. Have it reviewed later if necessary. But why would we need 2 guys following the same runbook for the umpteenth time?

It’s a great tool to share knowledge and to handle critical tasks with high error potential and I wouldn’t want to miss it for that. But it can be overused in my book.

tetha@feddit.de · 1 year ago

Well, you’re looking at a method, and imagine two things.

The first is a link to a Confluence article. You click on it. Nothing loads. Ah, right. Activate the VPN. Click the link again. You have no access. So you send your IT a ticket to gain access. One week later you get a mail that you have access now. For what? Who’d remember last week?

Alternatively, there’s an inline comment, or a markdown file in the same repo, so you click on it and your IDE opens it. And then you modify the piece of code and you realize you still have that markdown file open, so you adjust a few things and also note down a weird quirk you found during debugging.

However, in the first case… well, you finally had access to the documentation, so you want to modify it to bring it up to date. Well, guess what. You have read access. So back to another ticket with IT, saying that you’re actually working on this and you’d like to update the documentation. After a week, you’re notified: Well, they need approval of the documentation owner for you to get write access. They are on vacation. When they get back after 2 weeks, they approve the request, and it goes into a second round of approval with your team lead. And guess what? Right, he’s not in for the next 2 weeks. By the time you finally have write access, you’re not working in that department anymore. And no, that other department doesn’t use that Confluence.

Overall, documentation tends to be somewhat of a chore for many people. If it’s close - it’s in the same repo, you can open the file in your IDE, you can commit updated documentation with your code in the same PR - there’s a slightly higher chance for folks to update documentation. If you put it behind the hellscape of a process some companies have for their tooling, no one will ever touch the documentation.