he/him

Materials Science PhD candidate in Pittsburgh, PA, USA

My profile picture is the cover art from Not A Lot of Reasons to Sing, But Enough, and was drawn by Casper Pham (recolor by me).

  • 11 Posts
  • 76 Comments
Joined 1 year ago
Cake day: June 7th, 2023


  • It seems like you’re working under the core assumption that the trained model itself, rather than just the products thereof, cannot be infringing?

    Generally if someone else wants to do something with your copyrighted work – for example your newspaper article – they need a license to do so. This isn’t only the case for direct distribution; it also covers things like the creation of electronic copies (which must have been made during training), adaptations, and derivative works. NYT did not grant OpenAI a license to adapt their articles into a training dataset for their models. To use a copyrighted work without a license, you need to be using it under fair use. That’s why it’s relevant: is it fair use to make electronic copies of a copyrighted work and adapt them into a training dataset for an LLM?

    You also seem to be assuming that a generative AI model training on a dataset is legally the same as a human learning from those same works. If that’s the case then the answer to my question in the last paragraph is definitely, “yes,” since a human reading the newspaper and learning from it is something that, as you say, “any intelligent rational human being” would agree is fine. However, as far as I know there’s not been any kind of ruling to support the idea that those things are legally equivalent at this point.

    Now, if you’d like to start citing code or case law go ahead, I’m happy to be wrong. Who knows, this is the internet, maybe you’re actually a lawyer specializing in copyright law and you’ll point out some fundamental detail of one of these laws that makes my whole comment seem silly (and if so I’d honestly love to read it). I’m not trying to claim that NYT is definitely going to win or anything. My argument is just that this is not especially cut-and-dried, at least from the perspective of a non-expert.






  • With all due respect to Penrose – who is indisputably brilliant – in probability when you start to say things like, “X is 10^10^100 times more likely than Y,” it’s actually much more likely that there’s some flaw in your priors or your model of the system than that such a number is actually reflective of reality.

    That’s true even for really high probability things. Like if I were to claim that it’s 10^10^100 times more likely that the sun will rise tomorrow than that it won’t, then I would have made much too strong a claim. It’s doubly true for things like the physics of the early universe, where we know our current laws are at best an incomplete description.



  • Maybe all of those PhD students would have their time better spent on this task than pretending, as is often the case, they’ve done some original work on an important theory that’s found something “for the first time”.

    I mean I’m personally biased as a PhD student myself, but I think this is a great idea. I made the core of my project to basically take a picture of a phenomenon that has been inferred from spectroscopy but not observed directly. So verification, not exactly replication, but same idea. Turns out that doing something like this is very hard and makes a worthy PhD project. (I haven’t managed it yet, and am starting to wonder if my eventual paper might actually end up being in support of the null hypothesis…)

    But I’m also not looking to go into academia after I graduate, so I’m not too worried about trying for something high impact or anything like that. I think for someone angling to be a professor the idea of a replication or verification project may be a harder sell, which is largely down to the culture of academia and how universities do their hiring of post-docs and such. I mean, even in this case more people are still going to be familiar with the names of Lee and Kim than any of the researchers who put in work on replication studies (can you name any of them without checking the article?).

    tl;dr definitely a worthy goal and replications should absolutely be encouraged, but it’s going to take a while to change the whole academic culture to reinforce that they’re valuable contributions.



  • I think the biomechanics of walking and running make this a little more complicated than that. Different gaits move your body with different efficiency. I’m certainly no expert, but if I’m reading this study right (it’s open access so feel free to check me), then walking will pretty much always use less energy to cover a given distance than running/jogging, unless you force yourself to “fast-walk” at high speeds where a running/jogging gait would feel more natural.

    I’m also pretty sure that for a given distance you would count fewer steps while running than you would if you walked the same distance, since each step covers a lot more distance when you run. So in terms of step counting, steps taken while running should be “worth” a lot more in terms of exercise than steps taken while walking.

    In either case, my understanding of the evidence is that it has pretty consistently been shown across many different studies that almost any amount of daily exercise – walking, jogging, cycling, etc – is way, way better than no daily exercise at all. This study seems to fall nicely into that pattern.







  • The one for dispersion feels fishy; is dispersion really expected to be measured by the square root of length?

    Yeah that’s a pretty standard way to do things for all kinds of random walk processes. You don’t pick up error at a constant rate with distance, as you can go either forward or backward and will often be undoing dispersion you’ve already accumulated. The single most likely place to end up after any distance is still right around where you started. However, as stated in the video, the root-mean-square distance from the origin (i.e. the typical size of your distance from the origin) for a random walker after N unit-length steps is the square root of N. There’s a quite good explanation on this page.
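
    If you’d rather see the square root fall out of a simulation than take it on faith, here’s a minimal sketch. It assumes the simplest possible case (a 1D walker taking unit steps left or right with equal probability); the function name, walker count, and seed are just made up for illustration and aren’t from the video or the linked page.

```python
import math
import random


def rms_distance(n_steps, n_walkers=1000, seed=0):
    """Estimate the RMS distance from the origin after n_steps unit steps."""
    rng = random.Random(seed)
    total_squared = 0
    for _ in range(n_walkers):
        # Each step is +1 or -1 with equal probability.
        position = sum(rng.choice((-1, 1)) for _ in range(n_steps))
        total_squared += position * position
    return math.sqrt(total_squared / n_walkers)


# Compare the simulated RMS distance against sqrt(N).
for n in (100, 400, 900):
    print(n, round(rms_distance(n), 1), math.sqrt(n))
```

    The simulated values should land close to 10, 20, and 30, with some scatter because the number of walkers is finite. Taking nine times as many steps only gets you about three times as far.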

    If you really dislike having the square root in there you can of course square everything to get rid of it, but at the cost of your other dimension being squared. I’d personally argue that it’s a lot easier to get a physical intuition from the ps/sqrt(km) units (you can expect to pick up dispersion proportional to the square root of the length of your fiber) than from ps^2/km (which to me just looks like inverse acceleration). The latter is valid though. In fact, if you type that into Wolfram it’ll tell you that those units are physically interpretable as the “group velocity dispersion with respect to angular frequency”!

    A way that I’ve found to avoid “cursing” units is to always include what they refer to

    I actually have a very neglected side project to build a little calculator app that treats units this way, where you can label them to avoid letting them cancel out. I might get some time to work on it in like a month? Or maybe I won’t get around to it until after I graduate, we’ll see 🙃
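
    To give a rough idea of what “labeled units” could look like, here’s a small Python sketch. It isn’t the app, just an illustration under my own assumptions: the Quantity class, the (unit, label) keys, and the fuel/air example are all hypothetical.

```python
class Quantity:
    """A number whose units carry a label saying what they refer to."""

    def __init__(self, value, units=None):
        self.value = value
        # Maps (unit, label) -> exponent; zero exponents are dropped.
        self.units = {k: e for k, e in (units or {}).items() if e != 0}

    def _merge(self, other, sign):
        # Add (sign=+1) or subtract (sign=-1) the other quantity's exponents.
        merged = dict(self.units)
        for key, exponent in other.units.items():
            merged[key] = merged.get(key, 0) + sign * exponent
        return merged

    def __mul__(self, other):
        return Quantity(self.value * other.value, self._merge(other, +1))

    def __truediv__(self, other):
        return Quantity(self.value / other.value, self._merge(other, -1))

    def __repr__(self):
        parts = [f"{u}({label})^{e}" for (u, label), e in sorted(self.units.items())]
        return f"{self.value:g} " + " ".join(parts)


fuel = Quantity(2.0, {("kg", "fuel"): 1})
air = Quantity(30.0, {("kg", "air"): 1})

print(fuel / air)   # kg(fuel) and kg(air) stay distinct instead of cancelling
print(fuel / fuel)  # identical labels do cancel, leaving a bare number
```

    The design choice is that only units with the same name and the same label cancel, so a ratio like kg of fuel per kg of air keeps its meaning instead of collapsing to a dimensionless number.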


  • I basically have a layman’s perspective here, but just based on the abstract this particular paper doesn’t seem to be challenging the idea of a cosmological constant or the big bang as a thing that happened. Looking at the author’s other works, he seems pretty big on the idea that the values of physical constants may have changed over time, which seems to be basically his argument here too?

    I’ll admit, though, I’ve not heard the phrase “tired light” before this morning, so maybe it’s enough of a red flag to discard the work out of hand. I don’t know.





  • I don’t have hundreds of hours

    Don’t start with XIV then!

    So what is the most recent game in the series that I can start with that is worth it to play and wouldn’t confuse a newcomer?

    All of the FF games – barring the ones that are explicitly sequels, like X-2 – are totally separate from each other, so you can jump in anywhere. At most you might miss some references or easter eggs.

    If you want the most recent, then that’d be XVI, although I’d personally recommend looking up what the gameplay is like in the different games and starting wherever you feel you’ll have the most fun! There are some weirder ones out there, like Crystal Chronicles (my own first Final Fantasy game) and Tactics, so you have a lot of options!