DMs, have you ever had NPCs trick or scam your players? Would you? If so, how did it go?

queerlilhayseed@piefed.blahaj.zone · 2 days ago

P.S. This is a hypothesis; I haven’t even designed a test for it, much less run one. What follows are my suppositions.

I think whether or not it’s a good idea depends on how similar all the models are. I don’t have a rigorous definition of “similar” but things like similar training data, similar design methodologies, similar QA processes would all contribute. Theoretically (I think), if they’re all dissimilar, they should each catch errors the others miss. However, the more similar they are, the more likely they have the same biases and weak spots, and your error rate from a response + verification may be the same or even higher than the error rate for just the original prompt, and you’d be unlikely to detect those errors using just two similar models. It can instill false confidence in the results because you’re doing something that should in theory increase the validity of the data, but in practice might make no difference or even make the quality of responses worse.
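To put rough numbers on that intuition, here is a toy model, not a tested result. It treats "model A gets the question wrong" and "model B gets the same question wrong" as two yes/no events with some correlation between them, and asks how often both fail at once (the errors a second model can't catch). The function name, the 10% error rates, and the correlation values are all made up for illustration.

```python
import math

def both_wrong_rate(p_a, p_b, rho):
    """Probability that two models are wrong on the same question.

    p_a, p_b: each model's individual error rate (made-up numbers)
    rho: correlation between their errors, assumed for illustration;
         0 means independent, 1 means identical blind spots
    """
    # For two yes/no events, P(both) = P(a) * P(b) + covariance.
    covariance = rho * math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    return p_a * p_b + covariance

for rho in (0.0, 0.5, 1.0):
    rate = both_wrong_rate(0.1, 0.1, rho)
    print(f"correlation {rho}: both wrong {rate:.3f} of the time")
```

In this sketch, independent models share a mistake about 1% of the time, while perfectly correlated ones share it 10% of the time, which is no better than asking one model once. It doesn't model the case where verification makes things worse, only the case where it stops helping.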

queerlilhayseed@piefed.blahaj.zone · 2 days ago

I think it’s tricky. It’s kind of like adding LLMs together the way you’d add vectors, and hopefully the effect can soften or at least reveal the shortcomings of individual models. Is it a good idea? I don’t know, I think there are good reasons to think it’s a waste of time and resources. I certainly think I’d need a better explanation of what use it would be before I spent more time building it. But I still think about what use it would be from time to time; I haven’t decided that it’s a bad idea yet.

queerlilhayseed@piefed.blahaj.zone · 2 days ago

One of the projects I started and never got to a satisfactory end state was basically that, plus a judging round. Every model would respond to the same prompt, then every model would evaluate every other model’s response for accuracy and completeness. Then the results would get logged to a spreadsheet.

It’s simple enough, but for N models it requires N + N(N-1) = N^2 model calls (N responses, then each model judging the N-1 others), so it takes forever to run any decent dataset on consumer hardware. If I had the resources and a way to run it that didn’t fry the planet, I think it would be a cool running set of comparative benchmarks. IDK if it’d be useful at all but I’m still interested to see the data.
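A minimal sketch of that respond-then-judge loop, mostly to show where the call count comes from. This is not the original script: `ask` is a hypothetical stand-in for however you actually call a model, and the judging prompt wording is invented. With models only judging each other, it makes N response calls plus N(N-1) judging calls; letting models grade their own answers too would add another N.

```python
from itertools import permutations

def run_round(models, prompt, ask):
    """One benchmark round. ask(model, text) -> str is a hypothetical model call."""
    # Round 1: every model answers the same prompt (N calls).
    responses = {model: ask(model, prompt) for model in models}

    # Round 2: every model judges every *other* model's answer (N * (N - 1) calls).
    verdicts = []
    for judge, author in permutations(models, 2):
        judging_prompt = (
            "Rate this answer for accuracy and completeness.\n"
            f"Prompt: {prompt}\nAnswer: {responses[author]}"
        )
        verdicts.append((judge, author, ask(judge, judging_prompt)))
    return responses, verdicts

# Fake model call that just counts how often it is used.
call_count = 0
def fake_ask(model, text):
    global call_count
    call_count += 1
    return f"{model} says something"

responses, verdicts = run_round(["a", "b", "c", "d"], "Is a hot dog a sandwich?", fake_ask)
print(len(responses), "responses,", len(verdicts), "verdicts,", call_count, "calls")
```

Each `(judge, author, verdict)` tuple is one row you could log to the spreadsheet.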

queerlilhayseed@piefed.blahaj.zone · 2 days ago

Oh neat. Yeah, if something like that had existed (and I’d been aware of it) I probably would have used it instead of building my own shoestring version.

queerlilhayseed@piefed.blahaj.zone · 2 days ago

Nothing so fancy. I just made a little python script to prompt the first model, wait for a response, then prompt the next model with the initial prompt + the response, and so on. It was very hacky and slow.
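Roughly that shape, as a sketch rather than the actual script. `ask` is again a hypothetical placeholder for the real model call, and this version assumes each model only sees the initial prompt plus the most recent response, not the whole history.

```python
def chain(models, prompt, ask):
    """Prompt each model in turn with the initial prompt plus the previous response.

    ask(model, text) -> str is a hypothetical stand-in for the real model call.
    """
    transcript = []
    previous = None
    for model in models:
        if previous is None:
            text = prompt  # the first model only gets the initial prompt
        else:
            text = f"{prompt}\n\nPrevious response:\n{previous}"
        previous = ask(model, text)  # blocks until this model responds
        transcript.append((model, previous))
    return transcript

# Fake model call so the sketch runs without any models installed.
def fake_ask(model, text):
    return f"[{model}] read {len(text)} characters"

for model, reply in chain(["model-a", "model-b", "model-c"], "Is a hot dog a sandwich?", fake_ask):
    print(model, "->", reply)
```

Because every call waits on the one before it, nothing runs in parallel, which is a big part of why this approach is slow.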

queerlilhayseed@piefed.blahaj.zone · 2 days ago

Yup, ollama, various models. I initially downloaded it because I, along with thousands of other people, wanted to see what would happen if I made models debate with each other after RAGging them with various books (The Prince, The Art of War, The complete works of Shakespeare, etc.).

The results were uninteresting and I abandoned the project pretty quickly. I’ll sometimes use them for code analysis but they’re too slow on my rig to be really useful.

queerlilhayseed@piefed.blahaj.zone · 1 month ago

the actual essence of the community and the thoughts and ideas are reflected not in your posts but in your ideologies

lizlemonmegaeyeroll.gif

Very impressive that you can divine the essence of a community, not reflected in posts, just by reading a bunch of posts and making assumptions about them. That’s a neat trick.

queerlilhayseed@piefed.blahaj.zone · 2 months ago

There are tons of spam factories that pose as local newspapers. The first one that comes to mind is the Denver Guardian, which gained brief notoriety during Trump’s rise to power. But there are a million of them, probably literally. They are easy to make and they are easy to launder through social media bot networks.

queerlilhayseed@piefed.blahaj.zone · 2 months ago

The list, for the curious. I don’t mind if rimu wants to maintain a default blocklist; if I maintained my own fediverse app I would probably make something similar, based on my own preferences, to cut down on the mod work. If you want your piefed instance to allow botfarm produce, disable the blocklist or just fork it and live your dream.

queerlilhayseed@piefed.blahaj.zone · 3 months ago

DMs, have you ever had NPCs trick or scam your players? Would you? If so, how did it go?

queerlilhayseed@piefed.blahaj.zone · 7 months ago

A scryer who can see through another creature's eyes, but if the target is moving too much they get motion sick.
