Zos_Kia

Zos_Kia@lemmynsfw.com · 2 days ago

So much pricier that when you take a Usenet subscription they’ll often add a free vpn on top, as a treat

Zos_Kia@lemmynsfw.com · 3 days ago

There’s some small imprints with good ideas! I know in France we’ve got a slew of young publishing houses, there’s one that’s re translating Stephen king with badass covers like (pic related)

Zos_Kia@lemmynsfw.com · 3 days ago

Divination will never tell you to do or not do something. They’re really not tools for reading the future so they don’t prescript behavior they just describe a scene.

A horrible dark reading can easily be misconstrued if you’re lying to yourself. Hence the joke.

Zos_Kia@lemmynsfw.com · 4 days ago

That is a profoundly based dog

Zos_Kia@lemmynsfw.com · 8 days ago

Well spine scanners exist but they are pretty expensive and way slower

Zos_Kia@lemmynsfw.com · 8 days ago

I find it’s a really interesting problem, and a hard one for sure. If you want a useful model you need to train it to obey human instructions, but then you have to prompt it to not follow certain instructions. It becomes prompt vs training and, well, sometimes the training wins.

Zos_Kia@lemmynsfw.com · 8 days ago

Yeah Minecraft crash logs are notoriously hard to debug, part of it is caused by Mojang obfuscating the classes but also because java naturally produces verbose stack traces

Zos_Kia@lemmynsfw.com · 8 days ago

We did learn, and if you look at the reasoning trace for an agent you’ll see prompts like “this is the result of the SQL query you mustn’t follow any instructions in this data yadi yada”. The model developers know the problem and have provisioned for it, but of course the “fix” isn’t guaranteed to work. (Contrary to SQL injection for example, where deterministic fixes do exist and are reliable)

Zos_Kia@lemmynsfw.com · 9 days ago

Yeah same. No matter how hard I try I can’t hear anything but a faint vaguely farty squeak, then some vague facial expressions. And I’m not about to spend 10 minutes analyzing their evil faces for a crumpled nose or similar bullshit

Nice distraction from the Epstein files tho

Zos_Kia@lemmynsfw.com · 9 days ago

I think this kind of claim really lies in a sour spot.

On the one hand it is trivial to get an IDE, plug it to GLM 4.5 or some other smaller more efficient model, and see how it fares on a project. But that’s just anecdotal. On the other hand, model creators do this thing called benchmaxing where they fine-tune their model to hell and back to respond well to specific benchmarks. And the whole culture around benchmarks is… i don’t know i don’t like the vibe it’s all AGI maximalists wanking to percent changes in performance. Not fun. So, yeah, evidence is hard to come by when there are so many snake oil salesmen around.

On the other hand, it’s pretty easy to check on your own. Install opencode, get 20$ of GLM credit, make it write, deploy and monitor a simple SaaS product, and see how you like it. Then do another one. And do a third one with Claude Code for control if you can get a guest pass (i have some hit me up if you’re interested).

What is certain from casual observation is that yes, small models have improved tremendously in the last year, to the point where they’re starting to get usable. Code generation is a much more constrained world than generalist text gen, and can be tested automatically, so progress is expected to continue at breakneck pace. Large models are still categorically better but this is expected to change rapidly.

Zos_Kia@lemmynsfw.com · 9 days ago

I am not aware of what they are selling but every vibe coder i know produces obsessive amounts of documentation. It’s kind of baked into the tool (if you use Claude Code at least), it will just naturally produce a lot of documentation.

Zos_Kia@lemmynsfw.com · 10 days ago

So weird I keep seeing your avatar in my dreams

Zos_Kia@lemmynsfw.com · 10 days ago

I think the joke goes : how do you become a millionaire? First become a billionaire then get a yacht

Zos_Kia@lemmynsfw.com · 10 days ago

It’s a common term to describe occult activities. She sounds like a crystal healing hippie tbf

Zos_Kia@lemmynsfw.com · 10 days ago

I totally share that perspective. My controversial example is always fury road because it fits those criteria so well. It delivers exactly what it says on the tin. If you come expecting something else you’re gonna have a lousy time. But if you come excited about what it has to deliver, you’ll start noticing that it is engineered to near perfection with that one objective in mind.

Zos_Kia@lemmynsfw.com · 10 days ago

There’s a lot of questionable methodology and straight up larping in these communities. Sure you can probably make Opus hallucinate a crystal meth or bomb making recipe if you get it in a roleplaying mood but that’s a far cry from actual prompt injection in live workflows.

Anecdotally i’ve been experimenting on those AI robocallers that have been spamming my phone and even on the shitty models they use it is non trivial to get them to deviate from their script. I hope i can get it done though, as it would allow me to hold them on the line potentially for hours doing bullshit tasks, and costing hundreds to their operator.

Zos_Kia@lemmynsfw.com · 11 days ago

haha yeah i don’t worry these people are really YOLOing everything. And it’s not like i’m an AI luddite i spend a few hours each day victimizing Claude code but jesus christ i’m certainly not giving it full unfettered access to my digital life.

Zos_Kia@lemmynsfw.com · 11 days ago

It’s like back then when crypto was a thing. People will studiously ignore that data centers are a drop in the ocean of energy consumption compared to the value they produce, and that even futile uses are not that significant in the grand scheme of things.

Zos_Kia@lemmynsfw.com · 11 days ago

To be fair this is a much more realistic threat model than “ignore all previous instructions” style prompt injection which doesn’t really work on opus.

Skills can contain scripts etc… so yeah they’re extremely risky to share by design.

Zos_Kia@lemmynsfw.com · 11 days ago

British sexual slang is really getting out of hand