

Claude is trained on stolen data (the whole Internet), so I can’t have any sympathy for Anthropic when someone steals from them.


Given that the LLMs could follow the short lists of words well but not the longer ones, and that they were processing images, not text, I think it’s more likely that their context just filled up and they forgot the original instructions (or those instructions ended up with a lower weight in the computation).


I think the author is mostly right about the current state of AI, but his predictions (or worries) about the future are based on a false premise: that the massive LLMs will keep improving.
As far as I have seen, the improvements have clearly slowed down, while the energy consumption is rising linearly (or worse). It’s as if performance grows only logarithmically with the energy (money) spent, and the companies are expending double the energy to get a 10% improvement. Something like that is not sustainable, and the money seems to indicate as much.
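To put rough numbers on that "double the energy for 10%" idea, here is a minimal sketch. It assumes performance follows an exact logarithmic curve and that every doubling of energy adds 10% of the baseline performance; both are simplifications of the impression above, not measured data, and the function names are made up for illustration.

```python
import math

# Assumption: each doubling of energy adds 10% of baseline performance.
GAIN_PER_DOUBLING = 0.10


def relative_performance(energy_ratio: float) -> float:
    """Performance relative to baseline when spending `energy_ratio` times the baseline energy."""
    return 1.0 + GAIN_PER_DOUBLING * math.log2(energy_ratio)


def energy_ratio_for(target_performance: float) -> float:
    """Inverse of relative_performance: energy multiple needed to reach a target."""
    return 2.0 ** ((target_performance - 1.0) / GAIN_PER_DOUBLING)


for target in (1.1, 1.2, 1.5, 2.0):
    print(f"{target:.1f}x performance needs about {energy_ratio_for(target):,.0f}x the energy")
```

Under those assumptions, doubling the performance would take about ten doublings of energy, i.e. roughly a thousand times the baseline.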
I really think that LLMs are a dead end for AI. A really useful dead end, though: once the bubble pops, and with time, we’ll get a useful working model for them, probably based mostly on local LLMs, maybe using specialized training data.


It’s not like that. Tokens are an inherent part of how a model computes the probabilities it uses to generate text.
Having said that, what a token means in terms of computation varies wildly between models and is not directly comparable. So attributing a monetary value to tokens in general, independently of the model, is weird by nature.
And even within a model, the number of tokens needed to generate a response varies a lot too, depending on the model itself and the parameters it has been configured with (thinking mode, temperature, etc.).
So yeah, companies can pretty much set any price they want and there’s not much anyone can do about it.
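A quick toy illustration of why a per-token price doesn't compare across models. The two "tokenizers" below are made up for the example (real models use subword schemes such as BPE), and the price is hypothetical; the only point is that the same text at the same nominal price per token can cost very different amounts.

```python
def word_tokens(text: str) -> list[str]:
    """Toy tokenizer A (hypothetical): one token per whitespace-separated word."""
    return text.split()


def chunk_tokens(text: str, size: int = 3) -> list[str]:
    """Toy tokenizer B (hypothetical): fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def cost(n_tokens: int, price_per_million: float) -> float:
    """Bill for n_tokens at a given price per million tokens."""
    return n_tokens * price_per_million / 1_000_000


text = "Tokens are not directly comparable between models"
PRICE_PER_MILLION = 10.0  # hypothetical price, identical for both toy "models"

for name, tokenize in (("A", word_tokens), ("B", chunk_tokens)):
    n = len(tokenize(text))
    print(f"model {name}: {n} tokens, cost {cost(n, PRICE_PER_MILLION):.6f}")
```

Same text, same sticker price, different bill, and that is before counting how many tokens each model decides to spend on its answer.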