I use Arch in WSL BTW. This is not a joke, it’s actually quite nice.
Not an answer to the question, but in case performance is the goal, Torchaudio has it here
Depending on whether you wrote the kernel cmdline yourself, I imagine this might happen with /dev/sdN-style device paths? The BIOS might change drive enumeration every now and then for fun, so referring to partitions by UUID would be a safer way if so.
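A minimal sketch of what that fix could look like (the UUID values below are made up; you’d get the real ones from blkid):

    # Find the partition's PARTUUID (values here are hypothetical)
    $ blkid /dev/sda2
    /dev/sda2: UUID="3f9a71c2-..." PARTUUID="0a1b2c3d-02" TYPE="ext4"

    # Kernel cmdline, before (breaks if the BIOS reorders drives):
    root=/dev/sda2 rw

    # After (stable regardless of enumeration order):
    root=PARTUUID=0a1b2c3d-02 rw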
Ah, even then it could just be a consequence of training samples usually being chronological (most often the expected resolution for conflicting instructions is “whatever you heard last”, with some exceptions when explicitly stated otherwise), so it learns to think that way. I did find the pattern also applies to GPT trained on long articles, where you’d expect it not to, so I wanted to explain why that might be.
Or, to explain it better: most training samples will be cut off at the top, so the network sort of learns to ignore the beginning a bit.
Yes, that’s by design: the network works on one transcript per input, and the transcript does genuinely get cut off eventually. Usually an entire older line is purged when the token count exceeds the limit.
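As a toy illustration of that purging behaviour (not any vendor’s actual implementation; the “tokenizer” here is a naive whitespace split and the limit is arbitrary):

    # Toy context-window truncation: drop whole oldest lines until the
    # transcript fits the token budget.
    def truncate_transcript(lines, max_tokens=50):
        def count_tokens(line):
            return len(line.split())  # naive stand-in for a real tokenizer
        kept = list(lines)
        while kept and sum(count_tokens(l) for l in kept) > max_tokens:
            kept.pop(0)  # purge the entire oldest line
        return kept

    transcript = ["user: " + "hello " * 60, "bot: hi", "user: how are you?"]
    print(truncate_transcript(transcript))  # the oldest line gets dropped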
I was a curious child, and things spiralled out of control from there…
Ah, that makes sense. Most cloud providers go the whole nine yards with online hardware provisioning and imaging; I forgot you could still just rent a real machine.
Hmm, I wonder if there was some reason they didn’t just extract the original certificates from the VPS, if it really was the hosting provider. I mean, even with mitigations the keys should be sitting in a temp folder somewhere, surely they could? Issuing new ones seems like a surefire way to alert the operators, unless they already used Let’s Encrypt of course.
I feel like this is just describing the future of business process consultants. There’s already a role for this, unless I’m missing something?
I think the part that annoys me the most is the hype around it, just like with blockchain: people who don’t know any better claiming it’s magic.
We’ve had a few sequence-specific architectures over the years: GRUs, LSTMs, and now Transformers. Each was better than the last at sequence-to-sequence transformations, and for the last one the original task was language translation. We eventually figured out these guys have a bit of clairvoyance too: they can make predictions from past data accurate enough to bet on, and you can bet traders of various stripes have already made billions off that fact. I’ve even seen a transformer-based weather model. It did OK, but transformers are better at language.
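That “predict the next value from the past” framing is easy to see in code. A minimal PyTorch sketch (the architecture and sizes are arbitrary, and positional encoding is omitted for brevity; this is just the shape of the problem, not any real trading or weather model):

    import torch
    import torch.nn as nn

    # Tiny transformer that maps a window of past values to a next-step
    # prediction -- the same setup as language modelling, just on numbers.
    class NextStepTransformer(nn.Module):
        def __init__(self, d_model=32, nhead=4, num_layers=2):
            super().__init__()
            self.proj_in = nn.Linear(1, d_model)   # scalar -> embedding
            layer = nn.TransformerEncoderLayer(
                d_model=d_model, nhead=nhead, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
            self.proj_out = nn.Linear(d_model, 1)  # embedding -> scalar

        def forward(self, x):                      # x: (batch, seq, 1)
            h = self.encoder(self.proj_in(x))
            return self.proj_out(h[:, -1])         # predict the next value

    model = NextStepTransformer()
    window = torch.randn(8, 16, 1)                 # 8 sequences, 16 past steps
    print(model(window).shape)                     # torch.Size([8, 1])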
And that’s all it is! ChatGPT is a Transformer in the predictive stance: it looks at a transcript of a conversation and predicts what a human is most likely to say next. It’s a very complex transformation of historical data. Give it the exact same transcript (with sampling turned off, or the seed fixed) and it gives the exact same answer. In a mathematically rigorous sense it is entirely incapable of an original thought. Any perceived sentience is a shadow of OpenAI’s army of annotators or the corpus it was trained on, and I have a hard time assigning sentience to tomorrow’s forecast, which may well have used similar technology. It’s an ultra-fancy search engine index.
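The “same transcript in, same answer out” point is easy to demonstrate with any open model; here’s a sketch using Hugging Face’s transformers library with GPT-2 and greedy decoding (i.e. sampling disabled; the prompt is just an example):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # With sampling disabled, generation is a pure function of the input:
    # the same prompt always yields the same continuation.
    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tok("The weather tomorrow will be", return_tensors="pt").input_ids
    with torch.no_grad():
        a = model.generate(ids, max_new_tokens=10, do_sample=False)
        b = model.generate(ids, max_new_tokens=10, do_sample=False)

    assert torch.equal(a, b)  # identical every time
    print(tok.decode(a[0]))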
Anyway, that’s my rant done, I guess. Call it a cynical engineer’s opinion. To be clear, I think it’s a fantastic and useful technology, and it WILL change how we interact with machines. It can do fancy things in combination with the “shell” code driving its UI, like multi-step “agents” or running code, and I actually hope OpenAI extends it far into the future. But I sincerely think any form of AGI will be something entirely different from LLMs, or at least they’ll form only a small part of it, as an encoder/decoder for its thoughts.
EDIT: Added some paragraph spacing. Sorry, went into a broader AI rant rather than staying on topic about coding specifically lol
Surprisingly, just setting the systemd flag in the WSL settings worked, though for a long time I simply didn’t use systemd.
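For anyone searching later, that flag lives in /etc/wsl.conf inside the distro (then restart with wsl --shutdown from Windows):

    # /etc/wsl.conf
    [boot]
    systemd=true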