
ffsm8 · 2026-05-21T19:11:48 1779390708

People usually make the determination by reading at least part of the text and then finding multiple smoking guns / LLM-isms.

The comment you responded to did not have those.

Fwiw, the article we're commenting on was likely not LLM written. The sentence structure is too convoluted, no LLM would've generated it like that - unless very carefully prompted ... But at that point it's no longer pure AI slop (imo).

ffsm8 · 2026-05-20T20:21:49 1779308509

Isn't that precisely the reason why we introduced the term hallucination? Because LLMs have historically always made up bullshit if they cannot answer directly... If they've now nailed this so that the model declines to respond instead of responding incorrectly, then a lot of previously unusable use cases would become feasible.

So I feel like that's exactly the right metric and the way to track it wrt hallucinations.

doublescoop · 2026-05-20T22:41:36 1779316896

I had a buddy in high school that was notorious for doing the same thing. (He's now a senior director at a Big 4 consultancy. :) )

rrgok · 2026-05-21T06:09:58 1779343798

Do you mind expanding a little more?

alfiedotwtf · 2026-05-22T09:21:53 1779441713

They had a buddy who used to lie a lot when they were younger… now they get paid for it

akoboldfrying · 2026-05-21T03:08:00 1779332880

The point is that it's not a useful metric on its own. For example, redirecting from /dev/null also achieves a zero hallucination rate.

We want the hallucination rate to decrease while the overall answer rate of queries remains sufficiently high. For more specifics, look into ROC and AUC.
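A rough sketch of why the two numbers have to be read together. The `Outcome` type and the choice to count hallucinations as wrong answers over all queries are my own assumptions for illustration, not a standard benchmark definition:

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    answered: bool  # False means the model abstained
    correct: bool   # only meaningful when answered is True


def rates(outcomes):
    """Return (answer_rate, hallucination_rate), both measured over all queries."""
    total = len(outcomes)
    answered = sum(o.answered for o in outcomes)
    wrong = sum(o.answered and not o.correct for o in outcomes)
    return answered / total, wrong / total


# The /dev/null "model": never answers, so it never hallucinates.
silent = [Outcome(False, False)] * 10

# A model that answers 8 of 10 queries and gets 1 of those wrong.
useful = (
    [Outcome(True, True)] * 7
    + [Outcome(True, False)]
    + [Outcome(False, False)] * 2
)

print(rates(silent))  # (0.0, 0.0)
print(rates(useful))  # (0.8, 0.1)
```

The silent model wins on hallucination rate alone and is useless, which is why the answer rate has to be reported next to it.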

ffsm8 · 2026-05-20T15:18:12 1779290292

Uh, aren't you confirming his opinion with that? After all, Anna doesn't have the money to fight this in court

YetAnotherNick · 2026-05-20T15:19:45 1779290385

No. Anthropic fought and paid $1.5 billion in settlement and agreed to delete all the copyrighted material.

ffsm8 · 2026-05-20T15:22:23 1779290543

I'm confused here, how is this not even more of a confirmation?

Essentially: have funny amounts of money and the law ceases to matter. Or don't, and be squashed by the rights holders.

jstanley · 2026-05-20T15:24:08 1779290648

$1.5 billion is more than $19.5 million though.

YetAnotherNick · 2026-05-21T09:56:46 1779357406

Also, if I were to guess, the damages caused by Sci-Hub are higher than those from Anthropic training its models. I don't think I know anyone who didn't buy a book because a summary is available or because they can ask an AI about it.

whycome · 2026-05-20T15:28:27 1779290907

Delete? Wasn’t that material already used to train models?

rho_soul_kg_m3 · 2026-05-20T15:43:24 1779291804

All AI companies should be forced to re-train their models without the offending materials, and this should also extend to all LLMs distilled from models exposed to copyrighted works. Also cover code under licences such as GPL as well. Not to mention patents and designs. This whole LLM business is a giant IP laundromat.

tekne · 2026-05-20T20:04:19 1779307459

One of the best things about it IMO -- or we'd be spending the next hundred years waiting for copyright reform

musicale · 2026-05-21T03:17:29 1779333449

"Deleting" data they already ingested is meaningless.

saidnooneever · 2026-05-20T15:45:09 1779291909

well i guess it's copyright, not distill-statistical-model-from-it-rights.

ffsm8 · 2026-05-18T17:36:22 1779125782

It's almost certainly ai written though. All the regular tells are there... Though he likely edited some out, like that "just"

Also if it was handwritten, it'd have been a third in length, the rest was LLM fluff

bergheim · 2026-05-18T18:55:22 1779130522

Correct, that was my point

ffsm8 · 2026-05-19T05:02:38 1779166958

I see. I actually like these tells. They let us easily distinguish garbage from someone's thoughts.

And you can also see how brainrotten someone's gotten when they start accidentally sneaking in these tells into their normal communication.

As a matter of fact, after a full workday in which I'm essentially forced to read LLM garbage for 9h a day... I sadly notice myself adding the same pointless fluff to how I express myself. Like I caught a viral contagion that's actively siphoning my humanity away.

And expectedly, when coming back to those opinions with a less infected mindset, I frequently have to reevaluate these thoughts later on

ffsm8 · 2026-05-18T15:10:28 1779117028

Didn't you mean Claude take? It's ai written after all...

ffsm8 · 2026-05-18T15:01:06 1779116466

The 512 GB ram studio can't even be purchased anymore. It's been delisted

https://www.apple.com/shop/buy-mac/mac-studio

Same with the Mac mini. Entirely removed from all store references.

ffsm8 · 2026-05-17T18:36:21 1779042981

And it's gonna be interesting to see where this narrative shifts over the next 5 yrs.

I keep hearing that property in the USA is in its biggest bubble yet, with the affordable housing shortage being a red herring: real estate managers and boomers are unwilling/unable to reduce their prices despite not getting renters/buyers, because doing so would kick off a death spiral as their interest rates would consequently go up (because of lower security). Along with the AI layoffs etc.

I'm not American and only hear the occasional interview, so I don't have any idea if it's really as pressing as these industry professionals keep saying, but I'm definitely at the edge of my seat watching...

ffsm8 · 2026-05-17T07:23:31 1779002611

I was on a quarter demo the other day and the project lead for ai innovation was talking about the things he's preparing for the company.

I will not address the things he pitched (as coming soon), as I'm a developer and (hopefully) not the target audience, but I was quite surprised when they ran a questionnaire asking how many people use AI and how frequently. (The target demographic was middle management, product owners etc.)

75% of people answering said they're using it daily and considered it an essential tool they need to work

Considering it was anonymous I was expecting lower numbers, honestly.

overfeed · 2026-05-17T08:51:52 1779007912

> Considering it was anonymous

In the recent past, my department received an email from on high with a list of people who were yet to complete the "anonymous" survey.

I always assume my work-survey answers are traceable back to me, whether it's via self-doxxing with my answers, tracking links, or the rootkit-level MDM software that can record my screen but that they pinky-promise to use only for remote assistance, in case I open a ticket with IT.

andy99 · 2026-05-17T21:23:58 1779053038

Talked to someone at a large company who had admin access to survey results (required to do some analytics). The survey was “anonymous” but results were geo-located, and had some information about the team they came from, which in many cases was enough to clearly identify people. There is a difference between “doesn’t have a person’s name on it” anonymous and actually anonymized in a way hardened against figuring out who is who. I don’t think anyone really does the latter.

jpc0 · 2026-05-17T09:34:12 1779010452

You do know it is possible for the answers to be anonymous while who submitted is still tracked?

iinnPP · 2026-05-17T09:55:40 1779011740

Depends on how it's done.

Trusting that process to be done well is probably not the greatest plan.

HumblyTossed · 2026-05-17T11:24:24 1779017064

I have taken some really badly (on purpose?) written questionnaires in the past. Asking about team size, role, etc.

That’s not anonymous at that point. That’s an agenda.

overfeed · 2026-05-17T20:27:03 1779049623

I've seen questions asking for my org, team size, role, and when I joined, and thought it would have saved me time had they asked for my employee number instead.

close04 · 2026-05-17T11:05:06 1779015906

Most external survey providers claimed anonymity but in their T&Cs stated in a very roundabout way that they could provide some information to customers for quality purposes or something. Read “we’ll deanonymize some users if the paying customer wants it”. Internal survey tools are subject to internal management pressure.

Even when you use a tool like Microsoft Forms, where MS really can’t be bothered to deanonymize users unless 3 letter agencies get involved, it’s still possible to do timestamp matching between the proxy/VPN logs and the submission time.
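To make the timestamp matching concrete, here is a minimal sketch. The log format, user names, times, and the 5-second window are all made up for illustration; real proxy/VPN logs will look different:

```python
from datetime import datetime, timedelta


def candidates(submission_time, proxy_log, window_seconds=5):
    """Return users with a request to the survey host close to the submission time."""
    window = timedelta(seconds=window_seconds)
    return sorted(
        {user for user, ts in proxy_log if abs(ts - submission_time) <= window}
    )


# Hypothetical proxy log: (user, time of a request to the survey host).
proxy_log = [
    ("alice", datetime(2026, 5, 17, 9, 0, 2)),
    ("bob", datetime(2026, 5, 17, 9, 14, 40)),
    ("carol", datetime(2026, 5, 17, 9, 14, 43)),
]

# Submission time as recorded by the survey tool.
submitted = datetime(2026, 5, 17, 9, 14, 41)

print(candidates(submitted, proxy_log))                    # ['bob', 'carol']
print(candidates(submitted, proxy_log, window_seconds=1))  # ['bob']
```

With few respondents or tight clock sync, the candidate set often shrinks to a single person.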

Assume real anonymity only if the URL is the same for everyone and you can fill in the survey from any computer on the internet.

But the explanation for why people overhype AI usage is probably simpler. They want to keep their license because it’s a nice perk. They’ll use it to get the gist of a long email thread without bothering to read the details, to get some meeting minutes without validating if that was actually what was said, to generate some crappy modern equivalent of wordart graphics for their presentations, and feel like the time saved generating what is mostly slop was worth it.

When I worked on this (outside of coding) it was a pain to find a use case that really benefited. The ones that did were all niche uses that fit an LLM like a glove. The rest was slop; I could see the usage reports and the BS self-reporting surveys. Everyone inflated the numbers and usage to justify keeping their license.

DrewADesign · 2026-05-17T11:37:45 1779017865

You do know it’s possible for insecure leaders to lie about things like that, and that there’s no possible way to definitively tell beforehand?

Survey8430 · 2026-05-17T10:09:37 1779012577

This guy is wrong.

hennell · 2026-05-17T12:25:48 1779020748

It's perfectly possible. Two tables, one stores answer responses only, the other just marks off who has responded. No link between them and you have anonymous data but can tell who hasn't responded.

Of course if you record created/updated timestamps on both, insert both records in the same order, accidentally record the user code in the response data, take backups in between responses, have identifying questions or just don't have that many people responding, it's not hard to reverse engineer.
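Roughly what I mean by two tables, as a sketch using Python's built-in `sqlite3`. The table and column names are invented for this example, and it only covers the "no shared key, no timestamps" part:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE respondents (
        user_id   TEXT PRIMARY KEY,
        responded INTEGER NOT NULL DEFAULT 0
    );
    -- Deliberately no user id and no timestamp column here.
    CREATE TABLE answers (answer TEXT NOT NULL);
""")
conn.executemany(
    "INSERT INTO respondents (user_id) VALUES (?)",
    [("u1",), ("u2",), ("u3",)],
)


def submit(conn, user_id, answer):
    """Mark the user as done and store the answer without linking the two."""
    with conn:  # commit both writes together
        conn.execute(
            "UPDATE respondents SET responded = 1 WHERE user_id = ?", (user_id,)
        )
        conn.execute("INSERT INTO answers (answer) VALUES (?)", (answer,))


def pending(conn):
    """Users who still need a reminder."""
    rows = conn.execute(
        "SELECT user_id FROM respondents WHERE responded = 0 ORDER BY user_id"
    )
    return [row[0] for row in rows]


submit(conn, "u2", "daily")
submit(conn, "u1", "never")
print(pending(conn))  # ['u3']
```

Note that SQLite's implicit `rowid` still records insertion order in `answers`, which is exactly the "same order" leak from the list above. A real setup would have to shuffle or batch the stored answers as well.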

But it's quite possible to do right; I did it quite effectively almost by mistake years ago. Sent a customer survey out with generated codes as identifiers recorded with answers. Before sending reminder emails, a script grabbed the codes, marked the customer as responded and wiped the code (so I could just get future responses where code was not null to mark the next people off). Although I had timestamps, the script meant customers were updated in blocks, so there really wasn't any data to link them.

I know because the Boss was not happy he couldn't find out which customer had said what, and I had to point out all the communication (with customers and me) called it an anonymous survey, so why would I have saved them?

So it is possible, just not easy even if you intend it, and it's often not intentional...

I don't trust anonymous surveys either now...

Survey8430 · 2026-05-17T13:46:23 1779025583

The way I see it:

If the participant has to trust the survey creator, then it is not anonymous. The survey creator can link the data.

If the survey creator has to trust the participant, the survey is anonymous. The participant can lie in the survey, lie about participating, or submit the survey multiple times.

Your example was not anonymous. But you did not break the participant's trust, thank you! (Or maybe you are lying.)

Anonymous example: Sending a clean link to people to take the survey. If not enough answers have been received, a reminder can be sent to all, with a clause, that says: "if you have already done it, you can ignore the reminder."

HumblyTossed · 2026-05-17T11:17:37 1779016657

The pressure to use AI is worse than the pressure kids get to use drugs. It’s insane.

The job market right now sucks so everyone is really just trying to not be the next cut.

kakacik · 2026-05-17T10:15:28 1779012928

Never expect anonymous voting/quiz/whatever to be fully anonymous in big corporations. If it's something about touchy topics and/or can affect the employment/performance of a given person, results will be skewed. If a metric becomes the target it ceases to be a good metric and all that.

It all rests on the shoulders of the responsible manager(s) and how moral they are. Many are not.

ravenical · 2026-05-17T09:00:14 1779008414

If it's 75% exactly, that's consistent with them asking four people

ffsm8 · 2026-05-17T09:18:59 1779009539

It wasn't, and it was visibly updating while people were submitting their answers. I just rounded it as I don't remember the exact number at the time they closed the submission.

Could still be faked ofc, but I don't think they did.

SlinkyOnStairs · 2026-05-17T10:16:35 1779012995

> 75% of people answering said they're using it daily and considered it an essential tool they need to work

> (The target demographic was middle management, product owners etc)

This leaves a fairly wide set of options for what "essential" entails.

Do 75% of middle management and product owners actually need AI for their job? Seems unlikely.

Do 75% of middle management and product owners use AI to slop up emails, meeting "summaries", and reports? That's quite possible. Would they declare it to be an "essential tool"? One imagines they are not too fond of actually doing meaningful work.

It's quite easy to get high percentages like this when the AI is involved in make-work and the costs are low if not zero. The moment inference costs go up, most of this usage will evaporate.

Tesl · 2026-05-17T14:34:01 1779028441

Most of the answers to your post are reasons this 75% must be fake / lies or whatever.

But maybe the simplest answer is that most people do use the tools daily now and consider them essential...?

As much as HN would hate to think that

ffsm8 · 2026-05-17T06:45:16 1779000316

Owners of the cars, not the company.

It's in the first few paragraphs - just hard to get there (without instantly closing) because it's pure AI slop...

ffsm8 · 2026-05-17T06:02:40 1778997760

Technically, a skill is equivalent to adding

'"The skill description": if this applies, read /path/to/skill/definition.md'

to your agents.md

At least currently skills don't let you set the model (to my knowledge), so that's not a distinction either here (it would be with agent definitions)