• 2 Posts
  • 20 Comments
Joined 1 year ago
cake
Cake day: June 2nd, 2023

help-circle
  • It’s not directly related to the torrent or its content no. It’s more related to the potential bugs in Transmission that might be exploited to propagate viruses.

    Since Transmission has to exchange data with un-trusted parties, before knowing whether the data is relevant to the torrent you are downloading, anyone could exploit bugs that exist in the parsing of these messages.

    So running Transmission as a dedicated user limits what an attacker may have access to once they take control of Transmission through the exploit of known or unknown bugs.

    Obviously, this user need to have many restriction in place as to prevent the attacker from installing malware permanently on the machine. And when you copy over data that has been downloaded by Transmission, you’d have to make sure it has not been tampered with by the attacker in an attempt to get access to the data available to your real account.

    If you just use transmission occasionally, not on a server, I would not bother with it. Either use the flatpak version for some sandboxing and similar security guarantees as having a dedicated user running Transmission, or use an up to date version (the one from your distro should be fine) and don’t leave it running when you do not need to.


  • the common practice is to relax the dependencies

    I found this a bit disturbing

    I find that funny that, since this is rust, this is now an issue.

    I have not dwelved in packaging in a long while, but I remember that this was already the case for C programs. You need to link against libfoo? It better work with the one the distribution ship with. What do you mean you have not tested all distributions? You better have some tests to catch those elusive ABI/API breakage. And then, you have to rely on user reported errors to figure out that there is an issue.

    On one hand, the package maintainer tend to take full ownership and will investigate issues that look like integration issue themselves. On the other hand, your program is in a buggy or non-working state until that’s sorted.

    And the usual solutions are frown upon. Vendoring the dependencies or static linking? Are you crazy? You’re not the one paying for bandwidth and storage. Which is a valid concern, but that just mean we reached a stalemate.

    Which is now being broken by

    • slower moving C/C++ projects (though the newer C++ standards did some waves a few years back) which means that even Debian is likely to have a “recent” enough version of your dependencies.
    • flatpack and the likes, which are vendoring everything and the kitchen sink
    • newer languages that static link by default (and some distributions being OK with it)

    In other words, we never figured out a proper solution for C projects that will link with a different minor than the one the developer tested.

    Well, /rant I guess. The point I’m raising does not seem to be the only one, and maybe far from the main one, for which bcachefs-tools is now orphaned. But I’ve seen very dubious arguments to try and push back against rust adoption. I feel like people have forgotten where we came from. And while there is no reason to go back per say, any new language that integrate this deep into the system will face similar challenges.


  • I don’t think that’s the issue. As said in the article, the researchers found the flaw by reading the architecture documentation. So the flaw is in the design of the API the operating system uses to configure the CPU and related resources. This API is public (though not open source) as to allow operating system vendors to do their job. It usually comes with examples and pseudo code on how some operations work. Here is an example (PDF).

    Knowing how this feature is actually implemented in hardware (if the hardware was open source) would not have helped much. I would argue you are one level too low to properly understand the consequences of the implementation.

    By the vague description in the article it actually looks like a meltdown or specter like issue where some code gets executed with the inappropriate privileges. Such issues are inherent to complex designs and no amount of open-source will save you there. We need a cultural and maybe a paradigm shift on how we design CPU to fully address those issues.



  • making the place less equal, more of a broadcast medium, and less accessible to unconnected individuals and small groups.

    I do not think it is a very good analogy. I do not see how this would turn into a broadcast medium. Though I do agree it can feel less accessible and there is a risk of building echo chambers.

    How does an instance get into one of these archipelagos if they use allowlists?

    By reaching out, I would say. It’s most likely a death sentence for one-persone instances. Which is not ideal. On the other hand, I’ve seen people managing their own instance give up on the idea when they realized how little control they have over what gets replicated on their instance and how much work is required to moderate replies and such. In short, the tooling is not quite there.


  • I think both models (i.e. allowlist/blocklist) have their own perks and drawbacks and are all necessary for a healthy and enjoyable internet.

    I would tend to agree. I think both methods have their merits. Though ideally I’d rather have most instances use a blocklist model. This is less cumbersome to the average user and this achieves (in my opinion) one of fediverse goal of having an online identity not tied to an instance, an online identity you can easily migrate (including comments, follow, DMs, …) if needed.

    But the blocklist model is too hard to maintain at this time. There are various initiative to try and make it work, such as fediseer, and it might be good enough for most. But I think it’s a trap we should not fall into. On the fediverse, “good enough for most” is not good enough.

    Now that people are fleeing to the Fediverse, we’re just gathering our tribe - and this is a natural phenomenon.

    I think there is indeed something of that effect going on as well, this is true. But I do not think this warrants a move to allowlist by itself.

    I think the move to allowlist is mandated by the fact that building a safe space for “minorities” is hard. The tools to alleviate issues such as harassment and bigotry are not sufficient at this time to keep those communities fully open.

    Which is a shame as I think the best way to fight those issues, as a society, is to have people express themselves and have healthy conversation on issues that are rarely brought up.

    But we are not entirely giving that up by moving to an archipelago model. It just means that individuals would have multiple accounts, on different archipelago. The downside is that it makes the fediverse less approachable to the average person.


  • I think the current technical limitations push us toward this archipelago model.

    The thing is, bigotry and racism, to name only two, will exist on any social media, any platform where anyone is free to post something. And since those are societal issue, I don’t think it is up to the fediverse to solve. Not all by itself by any means.

    What the fediverse can solve however, is to allow instances to protect themselves and their members from such phenomenon. And my limited understanding, as a simple user, is that’s it’s not possible right now. Not on lemmy nor on Mastodon, if I trust the recent communications around moderation and instance blocking. Not without resorting to allow list.

    This is annoying to admit because it goes against the spirit of the fediverse. But the archipelago model is the only sane solution short term IMO. And it will stay that way until the moderation tools make a leap and allow some way to share the load between instances and even between users.



  • Enable permissions for KMS capture.

    Warning

    Capture of most Wayland-based desktop environments will fail unless this step is performed.

    Note

    cap_sys_admin may as well be root, except you don’t need to be root to run it. It is necessary to allow Sunshine to use KMS capture.

    Enable

       sudo setcap cap_sys_admin+p $(readlink -f $(which sunshine))
    

    Disable (for Xorg/X11 only)

       sudo setcap -r $(readlink -f $(which sunshine))
    

    Their install instruction are pretty clear to me. The actual instruction is to run

    sudo setcap cap_sys_admin+p $(readlink -f $(which sunshine))

    This is vaguely equivalent to setting the setuid bit on programs such as sudo which allows you to run as root. Except that the program does not need to be owned by root. There are also some other subtleties, but as they say, it might as well be the same as running the program directly as root. For the exact details, see here: https://www.man7.org/linux/man-pages/man7/capabilities.7.html and look for CAP_SYS_ADMIN.

    In other words, the commands gives all powers to the binary. Which is why it can capture everything.

    Using KMS capture seems way overkill for the task I would say. But maybe the wayland protocol was not there yet when this came around or they need every bit of performance they can gain. Seeing the project description, I would guess on the later as a cloud provider would dedicate a machine per user and would then wipe and re-install between two sessions.



  • On the topic of exposing sequence number in APIs, this has been a security issue in the past. Here is one I remember: https://www.reuters.com/article/us-cyber-travel-idUSKBN14G1I6/

    From the article:

    Two of the three big booking systems - Amadeus and Travelport - assign booking codes sequentially, making brute-force computer guesswork easier. Of the three, Amadeus, through its web portal CheckMyTrip, is especially vulnerable, Nohl said.

    The PNRs (flight booking code) have many more security issues, but at least nowadays, their sequential aspect should no longer be exposed.

    So that’s one more reason to be careful when exposing DB id in APIs, even if converted to a natural looking key or at least something easier to remember.




  • The reason behind kernel mode/user mode separation is to require all user-land programs to have to go through the kernel to do any modification to the system. In other words, would it not be for syscalls, the only thing a user land program could do would be to burn CPU cycles. And even then, the kernel can still preempt it any time to let other, potentially more important programs, run instead.

    So if a program can harm your system from userland, it’s because the kernel allowed it, every time. Which is why we currently see a slow move toward sandboxing everything. Basically, the idea of sandboxing is to give the kernel enough information about the running program so that we can tailor which syscalls it can do and with which arguments. For example: you want to prevent an application from accessing the network? Prevent it from allocating sockets through the associated syscall.

    The reason for this slow move is historical really: introducing all those protections from the get go would require a lot of development time to start with, but it had to be built unpon non-existant security layers and not break all programs in the process. CPUs were not even powerful enough to waste cycles on such concerns.

    Now, to better understand user mode/kernel mode, you have to realize that there are actually more modes than this. I can only speak for the ARM architecture because it’s the one I know, but x86 has similar mechanisms. Basically, from the CPU perspective, you have several privilege levels. On x86 those are called rings, on ARM, they’re called Exception Level. On ARM, a CPU has up to four of those, EL3 to EL0. They also have names based on their purpose (inherited from ARMv7). So EL3 is firmware level, EL2 is hypervisor, EL1 is system and EL0 is user. A kernel typically run on EL2 and EL1. EL3 is reserved for the firmware/boot process to do the most basic setup, partly required by the other ELs. EL2 is called hypervisor because it allows to have several virtual EL1 (and even EL2). In other words, a kernel running at EL2 can run several other kernels at EL1: this is virtualization and how VMs are implemented. Then you have your kernel/user land separation with most of the kernel (and driver) logic running at EL1 and the user programs running at EL0.

    Each level allocates resources for the sub-level (under the form of memory map, as memory maps, which do not necessarily map to RAM, are also used to talk to devices). Would a level try to access a resource (memory address) it has no rights to, an exception would be raised to the upper level, which would then decide what to do: let it through or terminate the program (the later translates to a kernel panic/BSOD when the program in question is the kernel itself or a segmentation fault/bus error for user land programs).

    This mechanism is fairly easy to understand with the swap mechanism: the kernel allows your program to access some page in memory when asked through brk or mmap, used by malloc. But then, when the system is under memory pressure, and it turns out your program has not used that memory region for a little while, the kernel swaps it out. Which means your program is now forbidden from accessing this memory. When the program tries to access that memory again, the kernel is informed of the action through a exception raised (unintentionally) by your program. The kernel then swaps back the memory region from disk, allows your program to access the memory region again, and then let the program resume to a state prior to the memory access (that it will then re-attempt without even realizing).

    So basically, a level is fully responsible for what a sub-level does. In theory, you could have no protection at all: EL1 (the kernel) could allow EL0 to modify all the memory EL1 has access to (again, those are memory maps, that can also map to devices, not necessarily RAM). In practice, the goal of EL1 is to let nothing through without being involved itself: the program wants to write something on the disk: syscall, wants more memory: syscall, wants to draw something on the screen: syscall, use the network: syscall, talk to another program: syscall.

    But the reason is not only security. It is also, and most importantly, abstraction. For example, when talking to a USB device, a user program does not have to know the USB protocol. This is implemented once in the kernel and then userland programs can use that to deal with all the annoying stuff such as timings, buffers, interruptions and so on. So the syscalls were initially designed for that: build a library of functions all user programs can re-use without having to re-implement them, or worse, without having to deal with the specifics of every device/vendor: this is the sole responsibility of the kernel.

    So there you have it: a user program cannot harm the computer without going through the kernel first. But the kernel allows it nonetheless because it was not initially designed as a security feature. The security concerns came afterward and were initially implemented with users, which are mostly enough for servers, and where root has nearly as many privileges as the kernel itself (because the kernel allows it). Those are currently being improved under the form of sandboxes, for which the work started a while ago, with every OS (and CPU architecture) having its own implementation. But we are only seeing widespread adoption by userland since fairly recently on desktop. Partly thanks to the push from smartphones where application-level privileges (to access the camera for example) were born AFAIK.

    Nowadays, CPUs are powerful enough to even have security features to try to protect a userland program from itself: from buffer overflow, return address manipulation and the like. If you’re interested, I recommend you look at the concept of pointer authentication.


  • I feel like since Glassdoor has a job board, there is an obvious conflict of interest that cannot be solved and ultimately yields to the company review part to be mostly useless. If you browse through a few company profiles, some that offer jobs on Glassdoor (and reply to reviews) and some that do not, you will quickly see what I’m talking about.

    I mean, who is going to post job offers on a board that also list your company at <4 stars with plenty of mention of “bad company culture” and “horrendous working condition”?

    I also do not like how most of the would-be-intersting-info is gatekeep behind you sharing various info on your own employer. I mean, it seems to make sense on the surface, but with the above, the whole thing feels like a scam. Where both employers and employees get scammed.

    How does it compare to alternatives ? Well, LinkedIn is terrible, with abysmal search function, results completely irrelevant to your profile, and the same seems to be true for recruiters seeing how often I’m proposed jobs completely outside of my skillset.

    So I guess Glassdoor is a viable source of job offer, especially if you have already selected companies you’d like to work at and those companies do not have their own board. But take all reviews and comments with a huge grain of salt.


  • I disagree. The question is not really “should we give programmer more power at the cost of yet another UB” but more “should we grow the API and add another UB for the select few for whom it might matter”. When you consider choices made on other parts of the STL, such as std::unordered_map, then you realize the STL is not about being the most performant things around, but rather a collection of reliable tools covering basic usage for 80% of the user base.

    With that in mind, I am against adding yet another function, which has its pitfalls, for minimal benefits. Again, such a function would be made almost entirely obsolete by a safe function that works with iterators/generators of known sizes. So I see even less benefit in adding a function that will just become yet another liability down the line.


  • The benchmark looks off. The msvc one may be the only one vaguely reliable. I suspect clang and GCC were able to optimize the synthetic benchmark to a little more than a loop doing additions. At 96ns for 1000 iteration, you are looking at 10G iterations per seconds. Which can only be achieved by a loop of two instructions executing at 2 inst/s on a 5GHz processor. And you will not get a 5x just for removing a highly predictable branch.

    So yes, std::vector leaves performance on the table, but no more than 10~15% for trivial loops that are not that uncommon but are rarely a bottleneck.

    Then you have to ask yourself, is it worth it to add yet another function that can crash your program if misused just for that 10% in a situation where they might not even matter. I mean, I know, it’s c++, zero cost abstraction, yadi yada, but if you’re looking for consistent performance you should have moved away from the STL already. As this post shows, your STL vendor already has a huge impact on the performance, and there are widely available options to optimize specific cases.

    So I’d rather keep the STL fairly simple. Add one function to work well with generators/iterators that have a known size if you want, but adding unchecked versions of every insertion function of every STL container is not worth it IMO.


  • First of, especially in C, you should very carefully read the documentation of the functions you use. It then should be obvious to you you are currently misusing it on two accounts:

    • You are not checking for errors
    • You are assuming the presence of a \n that might never be there (this one leads to your unexpected behavior)

    The manual tells you it will insert a \0 at the end of what it reads within the limits of the buffer. So this \0 is what you will need to look for when determining the size of the input.

    If there is a \n, it will precede the \0. Just make sure the \0 is not at index 0 before trying to erase the \n. If there is no \n before the \0, you are in either of two cases (again, this is detailed in the documentation): the input is truncated (you did not read the full line, as in your unexpected behavior above) or you are reaching the end of the stream. Note that even if the stream ends with \n, you might need to issue an additional fgets to know you are at the end of the stream in which case a \0 will be placed as the first byte of your buffer.

    If you really want to handle input that exceeds your initial buffer, then you need to dynamically allocate one and grow it as needed. A well behaved program will have an upper limit to the size of the input anyway (and this is why you don’t use gets). So you will need a combination of malloc/realloc and string concatenation. That means you need to learn all the pitfalls of dynamic memory allocation in C and how to use valgrind. For the string concatenation, even though strcat should be OK in your case, I’d recommend against it.

    In order to use strcat properly you need to keep track of the usage of the dynamically allocated buffer by hand anyway because you want to know when you will attempt to store more bytes in the buffer than is currently allocated. And once you know the number of bytes stored in the buffer, copying over the bytes that fgets returns by hand is fairly trivial and has less pitfalls. This also circumvent one of the performance pitfall of strcat: it needs to find the \0 in the destination buffer for every call. So effectively, it can transform all by itself a trivial usecase such as yours, that one would expect to be linear in algorithmic complexity, to be of O(N^2) complexity.

    On a final note: fgets does not allow you to handle binary data properly because you wont be able to tell apart a legitimate \0 coming from your input from a \0 inserted by fgets. So you will need to use fread in this case. I actually recommend using fread instead of fgets because it directly returns the number of bytes read, no need to use strlen to guess it and it makes error handling easier. Though you’ll need to add the trailing \0 yourself.


  • Contrieved examples you usually see in design pattern tutorials do not properly highlight the use case and usefulness of a specific pattern.

    What you did might have been fine. You only rarely need the builder design pattern.

    It’s useful when the object you want to build has a complex constructor, or several ones, or you want to more easily enforce the self consistency of the created object (some languages make it hard to share initialization code) or you want to hide some internal details (I.e., you are at an API boundary and want to be able to freely change the objects your main object is composed of).

    In short, Builder is a way to have a handy interface to an object constructor. It’s nice for users of a library, or if you find yourself repeating a complex constructor invocation often, but you usually do not want to have to maintain such an interface that is, at the end of the day, mostly boilerplate code.


  • I second that. You can also go a fair way without coding much. So you can learn the engine as you learn programming.

    Also, look at some example projects. There are plenty, like a vampire survivor like. Try to understand the structure and add some very small feature.

    There are also many resources online to get started. Don’t hesitate to create many small projects where you only try out a concept or idea. Godot is very composable once you know your way around, this will make things easier to move your learnings from one project to the next.



  • I like the idea of aggregating communities. Especially if the modding tools are powerful enough. This could lead to communities being essentially curated lists of other communities. Which is great for new users to discover new communities without being overwhelmed by the unordered list of communities on the instance.

    Another feature that I’d like to see is an equivalent to the mastodon’s lists, a way to aggregate communities for yourself. So that you could browse the content of communities sharing a same theme in a dedicated view.