I can't count the times I've told clients and prospects _not_ to hire us to build something they wanted, because they could just use off-the-shelf solutions that were cheaper financially, at least in the short to mid term, and much, much cheaper in terms of opportunity costs. I struggle to put even billed hours into something that doesn't make sense to me.
Of course some overdo it. I've seen companies with more random SaaS tools than staff, connected with shaky Zapier workflows, manual processes, or not at all. No backups, no sense of risks, just YOLOing. That's OK in some cases, in others really not.
I suppose it does need some engineering thinking to find the right things and employ them in a good way. Unfortunately even developers commonly lack that.
It is tool use with natural-language search queries, but one layer down those queries are run against a vector DB, very similar to RAG. Essentially, Google RankBrain is the distant ancestor of RAG, before compute and scaling caught up.
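For illustration, here's a minimal sketch of that layer (the toy `embed` function and the document list are my own stand-ins, not any particular product's API): the natural-language query from the tool call is embedded and matched by cosine similarity against stored vectors, which is essentially the retrieval half of RAG.

```python
import numpy as np

def embed(texts):
    # Stand-in for a real embedding model (normally an API call or a local
    # model); here we just hash tokens into a fixed-size vector to illustrate.
    dim = 256
    vecs = np.zeros((len(texts), dim))
    for i, text in enumerate(texts):
        for token in text.lower().split():
            vecs[i, hash(token) % dim] += 1.0
    return vecs / (np.linalg.norm(vecs, axis=1, keepdims=True) + 1e-9)

# The "vector DB": documents embedded ahead of time.
docs = [
    "How to rotate API keys for the billing service",
    "Deploying the frontend with blue/green releases",
    "Postmortem: search index rebuild caused downtime",
]
doc_vecs = embed(docs)

def search_tool(query, k=2):
    # What the model's natural-language tool call effectively does:
    # embed the query and return the most similar documents.
    q = embed([query])[0]
    scores = doc_vecs @ q  # cosine similarity, since vectors are normalized
    top = np.argsort(-scores)[:k]
    return [(docs[i], float(scores[i])) for i in top]

print(search_tool("how do I rotate credentials?"))
```

Swap in a real embedding model and a proper vector store and you have the same retrieval pattern the comment above describes.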
I _think_ the idea is that the first one to hit self improving AGI will, in a short period of time, pull _so_ far ahead that competition will quickly die out, no longer having any chance to compete economically.
At the same time, it'd give the country controlling it so much economic, political and military power that it becomes impossible to challenge.
I find that all to be a bit of a stretch, but I think that's roughly what people talking about "the AI race" have in mind.
Not sure a lot of people would say "no" to either of these questions.
The only other question I think is worth asking, for investors, is how much stock in the acquiring company they get for their stock in the acquired company. If the valuation of the acquired company in the deal is... optimistic enough, that seems like a no-brainer.
First thought: In my experience, this is a muscle we build over time. Humans are pretty great at pattern detection, but we need some time to get there with new input. Remember 3D graphics in movies ~15 years ago? They looked mind-blowingly realistic. Watching old movies now, I find they look painfully fake. YMMV of course.
Second thought: Does it _really_ matter? You find it interesting, you continue reading. You don't like it, you stop reading. That's how I do it. If I read something from a human, I expect it to be their thoughts. I don't know if I should expect it to be their hand typing. Ghost writers were a thing long before LLMs. That said, it wouldn't even _occur_ to me to generate anything I want to say. I don't even spell check. But that's me. I can understand that others do it differently.
Exactly! It must be exhausting to have this huge preoccupation with determining whether something came from an LLM or not. Just judge the content on its own merits! Just because an LLM was involved doesn't mean the underlying ideas are devoid of value. Conversely, the fact that an LLM wasn't involved doesn't mean the content is worth your time of day. It's annoying to read AI slop, but if you're spending more effort suspiciously squinting at it for LLM signs than assessing the content itself, you're doing yourself a disservice IMO.
Just like how writing helps memorisation. Our brains are efficient; they only do what they have to do. Just like you won't build much muscle by using forklifts.
I've seen multiple cases of... inception. Someone going all in with ChatGPT and what not to create their strategy. When asked _anything_ about it, they defended it as if they came up with it, but could barely reason about it. Almost as if they were convinced it was their idea, but it really wasn't. Weird times.
Parent didn't mention Simon Willison, and neither I nor the parent appear to imply that _all_ people posting positively about LLMs are paid influencers; that'd be a ridiculous claim. It's just that there _are_ paid influencers, at every level, down to non-famous people getting a few bucks, and that's worth knowing.
Their Readme.md is weirdly obsessed with "2 hours":
"before Claude Opus 4.5 started doing better than humans given only 2 hours"
"Claude Opus 4.5 in a casual Claude Code session, approximately matching the best human performance in 2 hours"
"Claude Opus 4.5 after 2 hours in our test-time compute harness"
"Claude Sonnet 4.5 after many more than 2 hours of test-time compute"
So that does make one wonder where this comes from. It could just be LLM-generated with a talking point of "2 hours"; models can fall in love with that kind of thing. "after many more than 2 hours" is a bit of a tell.
Would be quite curious to know though. How I usually design take home assignments is:
1. Candidate has several _days_ to complete (usually around a week).
2. I design the task to only _take_ 2-4 hours, informing the candidate about that, but that doesn't mean they can't take longer. The subsequent interview usually reveals if they went overboard or struggled more than expected.
But I can easily picture some places sending a candidate the assignment and asking them to hand in their work within two hours. Similar to good old coding competitions.
Bear in mind that there is a lot of money riding on LLMs leading to cost savings, and development (seen as expensive and a common bottleneck) is a huge opportunity. There are paid (micro) influencer campaigns going on and what not.
Also bear in mind that a lot of folks want to be seen as being on the bleeding edge, including famous people. They get money from people booking them for courses and consulting, buying their books, products and stuff. A "personal brand" can have a lot of value. They can't be seen as obsolete. They're likely to talk about what could or will be, more than about what currently is. Money isn't always the motive for sure, people also want to be considered useful, they want to genuinely play around and try and see where things are going.
All that said, I think your approach is fine. If you don't inspect what the agent is doing, you're down to faith. Is it the fastest way to get _something_ working? Probably not. Is it the best way to build an understanding of the capabilities and pitfalls? I'd say so.
This stuff is relatively new, I don't think anyone has truly figured out how to best approach LLM assisted development yet. A lot of folks are on it, usually not exactly following the scientific method. We'll get evidence eventually.
> There are paid (micro) influencer campaigns going on and what not.
Extremely important to keep in mind when you read about LLMs, agents and what not, both here, on reddit, and elsewhere.
Just the other day I got offered 200 USD to post about some new version of an "agentic coding platform" on HN. That's obviously too little for me to compromise my ethics and morals, but it makes very clear how much of this must be going on, if I, some random user, get offered money just to post about their platform. If I'd been offered that 15-20 years ago when I was broke and cleaning hotels, I'd probably have taken them up on it.
Hah, after submitting my comment, I actually thought about it, because I knew someone would eventually ask :)
I'm fortunate enough to live a very comfortable life after working myself to death, so I think for 20,000,000 USD I'd do it, happily so. 2,000,000 would be too little. So somewhere between those sits the real price to purchase my morals and ethics :)
It wasn't a shot at you personally; the point was that AI companies are flush with money, desperate to show any kind of growth, and willing to spend money to do that. I'm sure they are finding people that have some social following and will happily pocket a couple of extra green bills to present AI products in a positive light with little to no actual proof.
> they are finding people that have some social following and will happily pocket a couple of extra green bills to present AI products in a positive light with little to no actual proof.
No doubt about it. I don't think people realize how pervasive this really is, though; people still sometimes tell me they trusted something on HN/reddit just because it was the most upvoted answer, or that they chose a product based on what was mentioned the most, etc.
I can definitely be bought for much, much less. But only because I'm pretty sure I could rave about some AI platform while still being honest about it. Why do you draw such a hard ethical line? I agree with your sentiment; AI is currently a net negative on the world that I care about. But I also believe people when they say it helps them, despite their inability to articulate anything useful to me, or anything I can understand.
This is true about everything you see advertised to you.
When I went to an open-source conference (or whatever it was called) in San Diego ~8y ago, there were so many Kubernetes people. When you talked with them, nobody was actually using k8s in production; they were clearly devrel/paid people.
Now it seems to be everywhere... so be careful with what you ignore too
As a very silly example, my father-in-law has a school in Mexico, and he is now using ChatGPT to generate all of the visual materials they used to pay someone to produce.
They also used to pay someone to take school pictures so the books would look professional; now they use AI to make them look good/professional.
My father-in-law has no knowledge of technology, yet he uses ChatGPT daily to do professional work for his school. That's already two jobs gone.
People must be living under a rock if they don't think this will have big consequences for society.
> This stuff is relatively new, I don't think anyone has truly figured out how to best approach LLM assisted development yet.
Exactly. But as you say, there are so many people riding the hype wave that it is difficult to come to a sober discussion. LLMs are a new tool that is a quantum leap but they are not a silver bullet for fully autonomous development.
It can be a joy to work with LLMs if you have to write the umpteenth javascript CRUD boilerplate. And it can be deeply frustrating once projects are more complex.
Unfortunately, I think benchmaxxing and lm-arena are currently pushing in the wrong direction. But trillions in VC money are at stake, and leaning back, digesting, reflecting and discussing things is not an option right now.
> But as you say, there are so many people riding the hype wave that it is difficult to come to a sober discussion. LLMs are a new tool that is a quantum leap but they are not a silver bullet for fully autonomous development.
While I agree with the latter, I actually think the former point - that hype is making sober discussion impossible - is directionally incorrect. Like a lot of people I speak to privately, I'm making a lot of money directly from software largely written by LLMs (roadmaps compressed from 1-2 years to months since Claude Code was released), but the company has never mentioned LLMs or AI in any marketing, client communications, or public releases. We are all very aware that we need to be able to retire before LLMs swamp or obsolete our niche, and we don't want to invite competition.
Outside of tech companies, I think this is extremely common.
> It can be a joy to work with LLMs if you have to write the umpteenth javascript CRUD boilerplate.
There is so much latent demand for slightly customised enterprise CRUD apps. An enormous swathe of corporate jobs are humans performing CRUD and task management. Even if LLMs top out here, the economic disruption from this alone is going to be immense.
It is delusional to believe the current frontier models can only write CRUD apps.
I would think someone would have to only write CRUD apps themselves to believe this.
It doesn't matter anyway what a person "believes". If anything, I am having the opposite experience: conversing with people is becoming a bigger and bigger waste of time compared to just talking to Gemini. It is not Gemini that is hallucinating all kinds of nonsense compared to the average person; it is the other way around.
There is a critical failure in education - people are not being taught how to debate without degenerating into a dominance contest or a mere echo chamber of talking points. It's a real problem; people literally do not understand that a question is not an opportunity to show off or to dominate, but a request for an exchange of information.
And that problem is not just between people, this lack of communication skill continues with people's internal self conversations. Many a bully personality is that way because they bully themselves and terrorize themselves.
It's a wonder that people can use AI at all, given how poorly people communicate. So the cluster of nonsense that is all the shallow thinkers directing people down incorrect paths is completely understandable. They are learning by doing, which with any other technology would be fine, but with AI, learning how to use it by using it can seriously damage one's cognitive ability, as well as leave a junkyard of failed projects behind.
I’m not sure you read my comment. I didn’t claim LLMs have reached a ceiling – I’m very bullish on them.
The point I was making is about the baseline capability that even sceptics tend to concede: if LLMs were “only” good at CRUD and task automation (which I don’t think is their ceiling), that alone is already economically and socially transformative.
A huge share of white-collar work is effectively humans doing CRUD and coordination. Compressing or automating that layer will have second- and third-order effects on productivity, labour markets, economics, and politics globally for decades.
Even for CRUD I'm finding it quite frustrating. The question is no longer whether AI can write the code you specify: it can.
It just writes terrible code I'd never want to maintain. Can I refactor and have it cleaned up by the AI as well? Sure... but then I need to specify exactly how it should go about it, and, eugh, should I just be writing this code myself?
It really excels when there are existing conventions within the app it can use as examples.
> This stuff is relatively new, I don't think anyone has truly figured out how to best approach LLM assisted development yet. A lot of folks are on it, usually not exactly following the scientific method. We'll get evidence eventually.
I try to think about other truly revolutionary things.
Was there evidence that GUIs would dramatically increase productivity / accessibility at first? I guess probably not. But the first time you used one, you would understand its value on some kind of intuitive level.
Having the ability to start OpenCode, give it an issue, add a little extra context, and have the issue completed without writing a single line of code?
The confidence of being able to dive into an unknown codebase and becoming productive immediately?
It's obvious there's something to this even if we can't quantify it yet. The wildly optimistic takes end with developers completely eliminated, but the wildly pessimistic ones - if clear eyed - should still acknowledge that this is a massive leap in capabilities and our field is changed forever.
> Having the ability to start OpenCode, give it an issue, add a little extra context, and have the issue completed without writing a single line of code?
Is this a good thing? I'm asking why you said it like this; I'm not asking you to defend anything. I'm genuinely curious about your rationale/reasoning/context for why you used those words specifically.
I ask because I wouldn't willingly phrase it like this. I enjoy writing code. The expression of the idea, while not even close to the value I assign to fixing the thing, still has meaning.
e.g. I would happily share code my friend wrote that fixed something. But I wouldn't take any pride in it. Is that difference irrelevant to you, or do you still feel that sense of significance when an LLM emits the code for you?
> should still acknowledge that this is a massive leap in capabilities and our field is changed forever.
Equally, I don't think I have to agree with this. Our field is likely changed, arguably for the worse if the default IDE now requires a monthly rent payment. But I have only found examples of AI generating boilerplate. If it's not able to copy the code from some other existing source, it's unable to emit anything functional. I wouldn't agree that's a massive leap. Boilerplate has always been the least significant portion of code, no?
> We are paid to solve business problems and make money.
> People who enjoy writing code can still do so, just not on a business context if there's a more optimal way
Do you mean optimal, or expedient?
I hate working with people whose idea of solving problems is punting them down the road for the next person to deal with. While I do see people do this kinda thing often, I refuse to be someone who claims credit for "fixing" some problem knowing I'm only creating a worse, or different, problem for the next guy. If you're working on problems that require collaboration, creating more problems for the next guy is unlikely to give you an optimal outcome, because soon no one will willingly work with you. It's possible to fix business problems and maintain your ethics; it just feels easier to abandon them.
Cards on the table: this stuff saps the joy from something I loved doing, and turns me into a manager of robots.
I feel like it's narrowly really bad for me. I won't get rich and my field is becoming something far from what I signed up for. My skills long developed are being devalued by the second.
I hate that using these tools increases wealth inequality and concentrates power with massive corporations.
I wish it didn't exist. But it does. And these capabilities will be used to build software with far less labor.
Is that trade-off worth the negatives to society and the art of programming? Hard to say really. But I don't get to put this genie back in the bottle.
> Cards on the table: this stuff saps the joy from something I loved doing, and turns me into a manager of robots.
Pick two non-trivial tasks where you feel you can make a half-reasonable estimate on the time it should take, then time yourself. I'd be willing to bet that you don't complete it significantly faster with AI. And if you're not faster using AI, maybe ignore it like I and many others. If you enjoy writing code, keep writing code, and ignore the people lying because they need to spread FUD so they can sell something.
> But I don't get to put this genie back in the bottle.
Sounds like you've already bought into the meme that AI is actually magical, and can do everything the hype train says. I'm unconvinced. Just because there's smoke coming from the bottle doesn't mean it's a genie. What's more likely, magic is real? Or someone's lying to sell something?
> Sounds like you've already bought into the meme that AI is actually magical, and can do everything the hype train says. I'm unconvinced. Just because there's smoke coming from the bottle doesn't mean it's a genie. What's more likely, magic is real? Or someone's lying to sell something?
There are a lot of lies and BS out there in this moment, but it doesn't have to do everything the hype train says to have enough value that it will be adopted.
After my (getting to be long) career, there's a constant about software development: higher level abstractions will be used, because they enable people to either work faster, or they enable people who can't "grok" lower level abstractions to do things they couldn't before.
The output I can get from these tools today exceeds what I could've ever gotten from a junior developer before their existence, and it will never be worse than it is right now.
Sorry, couldn't resist :P But I do, in fact, agree based on my anecdotal evidence and feeling. And I'm bullish that even if we _haven't_ cracked how to use LLMs in programming well, we will, in the form of quite different tools maybe.
Point is, I don't believe anyone is at the local maximum yet; models have changed too much over the last few years to really settle on something stable.
And I'm also willing to leave some doubt that my impression/feeling might be off. Measuring short term productivity is one thing. Measuring long term effects on systems is much harder. We had a few software crises in the past. That's not because people back then were idiots, they just followed what seemed to work. Just like we do today. The feedback loop for this stuff is _long_. Short term velocity gains are just one variable to watch.
Anyway, all my rambling aside, I absolutely agree that LLMs are both revolutionary and useful. I'm just careful not to prematurely form a strong opinion on where/how exactly.
> The confidence of being able to dive into an unknown codebase and becoming productive immediately?
I don't think there's any public evidence of this happening, except for the debacles with LLM-generated pull requests (which is evidence against, not for this happening).