Dependency on US-hosted digital services (emails, chat, calendars, ticket systems, online editors, file hosting, sync services, payment providers) — e.g., Gmail, Google Docs.
Dependency on third-party authentication providers — e.g., login via Google, Apple, Facebook, Twitter, GitHub, and iCloud on iPhones.
Dependency on US cloud infrastructure providers — e.g., companies relying on Amazon Web Services (AWS) and Microsoft Azure.
Dependency through supply chain partners who rely on US tech — e.g., digitization partners using AWS/Azure impacting invoice processing.
Dependency on US-based business IT software and data services — e.g., banks using Microsoft LDAP, accountants using Dropbox, telecoms storing data in Oracle data lakes.
Dependency on US-controlled operating systems on user devices — e.g., Windows, macOS, iOS, Android.
Dependency on US-designed chips in most devices — e.g., Qualcomm, Intel, AMD, Nvidia, Apple chip hardware.
But seriously, this is my main answer to people telling me AI is not reliable: "guess what, most humans are not either, but at least I can tell AI to correct course and its ego won't get in the way of fixing the problem".
In fact, while AI is not nearly as good as a senior dev for non-trivial tasks yet, it is definitely more reliable than most junior devs at following instructions.
That's exactly the thing. Claude Code with Opus 4.5 is already significantly better at essentially everything than a large percentage of devs I had the displeasure of working with, including learning when asked to retain a memory. It's still very far from the best devs, but this is the worst it'll ever be, and it already significantly raised the bar for hiring.
And even if the models themselves for some reason were to never get better than what we have now, we've only scratched the surface of harnesses to make them better.
We know a lot about how to make groups of people achieve things individual members never could, and most of the same techniques work for LLMs, but it takes extra work to figure out how to most efficiently work around limitations such as lack of integrated long-term memory.
A lot of that work is in its infancy. E.g. I have a project I'm working on now where I'm up to a couple dozen agents, and every day I'm learning more about how to structure them to squeeze the most out of the models.
One learning that feels relevant to the linked article: instead of giving an agent the whole task across a large dataset that'd overwhelm its context, it often helps to have an agent (which can use Haiku, because it's fine if it's dumb) comb the data for <information relevant to the specific task> and generate a list of that information, and have the bigger model use that list as a guide.
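Roughly, the shape is something like this (an untested sketch using the Anthropic TypeScript SDK; the model names are placeholders and the prompts are only illustrative):

    import Anthropic from "@anthropic-ai/sdk";

    const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

    // Stage 1: a cheap model combs one chunk of the data and returns a terse
    // list of facts relevant to the task (or "none").
    async function combChunk(task: string, chunk: string): Promise<string> {
      const msg = await client.messages.create({
        model: "haiku-model-placeholder", // assumption: whatever cheap/fast model you use
        max_tokens: 500,
        system:
          "Extract, as short bullet points, only the facts in the input relevant to the task. Reply 'none' if nothing is relevant.",
        messages: [{ role: "user", content: `Task: ${task}\n\nInput:\n${chunk}` }],
      });
      return msg.content.map((b) => (b.type === "text" ? b.text : "")).join("");
    }

    // Stage 2: the big model never sees the raw dataset, only the distilled notes.
    async function solveWithGuide(task: string, chunks: string[]): Promise<string> {
      const notes = await Promise.all(chunks.map((c) => combChunk(task, c)));
      const guide = notes.filter((n) => n.trim().toLowerCase() !== "none").join("\n");
      const msg = await client.messages.create({
        model: "opus-model-placeholder", // assumption: the expensive model
        max_tokens: 2000,
        messages: [{ role: "user", content: `${task}\n\nNotes extracted from the dataset:\n${guide}` }],
      });
      return msg.content.map((b) => (b.type === "text" ? b.text : "")).join("");
    }

The point is that the dumb pass is cheap and embarrassingly parallel, while the expensive model's context stays small.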
So the progress we're seeing is not just raw model improvements, but also work like that in this article: figuring out how to squeeze the best results out of any given model. That work would continue to yield improvements for years even if models somehow stopped improving.
Humans are reliably unreliable. Some are lazy, some sloppy, some obtuse, some all at once. As a tech lead you can learn their strengths and weaknesses. LLMs vacillate wildly while maintaining sycophancy and arrogance.
Human egos make them unlikely to admit error, sometimes, but that fragile ego also gives them shame and a vision of glory. An egotistical programmer won’t deliver flat garbage for fear of being exposed as inferior, and can be cajoled towards reasonable output with reward structures and clear political rails. LLMs fail hilariously and shamelessly in indiscriminate fashion. They don’t care, and will happily argue both sides of anything.
Also that thing that LLMs don’t actually learn. You can threaten to chop their fingers off if they do something again… they don’t have fingers, they don’t recall, and can’t actually tell if they did the thing. “I’m not lying, oops I am, no I’m not, oops I am… lemme delete the home directory and see if that helps…”
If we’re going to make an analogy to a human, LLMs reliably act like absolute psychopaths with constant dissociation. They lie, lie about lying, and lie about following instructions.
I agree LLMs are better than your average junior at following first directives the first time. I'm far less convinced about that story over time, as the dialog develops and juniors become more effective.
You can absolutely learn LLMs' strengths and weaknesses too.
E.g. Claude gets "bored" easily (it will even tell you this if you give it tasks that are too repetitive). The solution is simple: since we control the context and it has no memory outside of that, make it seem like it's not doing repetitive tasks by having the top agent "only" manage and sub-divide the work, and farm out each sub-task to a sub-agent that won't get bored because it only sees a small part of the problem.
> Also that thing that LLMs don’t actually learn. You can threaten to chop their fingers off if they do something again… they don’t have fingers, they don’t recall, and can’t actually tell if they did the thing. “I’m not lying, oops I am, no I’m not, oops I am… lemme delete the home directory and see if that helps…”
No, like characters in a "Groundhog Day" scenario, they also don't remember or change their behaviour while you figure out how to get them to do what you want, so you can test and adjust until you find what makes them do it, and while it's not perfectly deterministic, you get close.
And unlike humans, sometimes the "not learning" helps us address other parts of the problem. E.g. if they learned, the "sub-agent trick" above wouldn't work, because they'd realise they were carrying out a bunch of tedious tasks instead of remaining oblivious thanks to us letting them forget in between each one.
LLMs in their current form need harnesses, and we can - and are - learning which types of harnesses work well. Incidentally, a lot of them do work on humans too (despite our pesky memory making it harder to slip things past us), and a lot of them are methods we know of from the very long history of figuring out how to make messy, unreliable humans adhere to processes.
E.g. to go back to my top example of getting adherence to a boring, repetitive task: create checklists, subdivide the task with individual reporting gates, spread it across a team if you can, and put in place a review process (with a checklist). All of these are techniques that work both on human teams and LLMs to improve process adherence.
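In code, the same structure can be pretty dumb and still work. A rough sketch (runAgent is a stand-in for whatever model call or CLI you drive; the checklist and review prompts are only illustrative):

    // runAgent is an assumption: whatever wraps your model call or agent CLI.
    // Each invocation is a fresh context, so the worker never sees the whole batch.
    type RunAgent = (prompt: string) => Promise<string>;

    interface ItemResult {
      item: string;
      output: string;
      passedReview: boolean;
    }

    async function processBatch(
      items: string[],
      taskDescription: string,
      checklist: string[],
      runAgent: RunAgent,
    ): Promise<ItemResult[]> {
      const results: ItemResult[] = [];
      for (const item of items) {
        // Worker gate: one small, self-contained sub-task per call.
        const output = await runAgent(`${taskDescription}\n\nItem:\n${item}`);
        // Review gate: a separate call checks the output against the checklist,
        // mirroring the "review process (with a checklist)" used on human teams.
        const review = await runAgent(
          `Check this output against the checklist and answer PASS or FAIL.\n` +
            `Checklist:\n- ${checklist.join("\n- ")}\n\nOutput:\n${output}`,
        );
        results.push({ item, output, passedReview: /\bPASS\b/.test(review) });
      }
      return results;
    }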
You still can't have a "share to" target that is a web app on iOS. And the amount of data you can store in local storage on Safari is a joke.
Of course, forget about background tasks and integrated notifications.
In fact, even on Android you miss features with web apps, like widgets for quick actions, mapping actions to buttons and so on.
And no matter how well you cache things, the mobile browser will unload the app, and you will always get that friction of a fresh render when you load the web app again, friction you don't have with regular apps.
No, I use them, but loading and unloading the app in the tab still happens when the browser flushes the app from memory, whether because the OS killed it or the browser's eviction policy kicked in.
This loading is not nearly as seamless as a regular app starting back up.
For a regular app, you have the app loading, with the OS cache helping it along. If you do your job half correctly, it loads as a block almost instantly.
For a web app you have the web browser loading, then the flash of the white viewport, then the app loading inside the browser (with zero OS cache to help, so it's slower). Then it needs to render. Then you restore the scroll position (which is a mess in a browser) and as much of the state as you can, but you are limited by persistence size, so most content must be reloaded, which means the layout jumps around. Not to mention JS in a browser is not nearly as performant as a regular app, so as your app grows, it gets worse.
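To make it concrete, this is roughly the manual save/restore dance a web app has to do just to approximate what the OS gives a native app for free (sketch only; currentAppState and applyAppState are hypothetical stand-ins for your own state handling):

    // Stand-ins for the app's own state handling (hypothetical).
    declare function currentAppState(): unknown;
    declare function applyAppState(state: unknown): void;

    interface Snapshot {
      scrollY: number;
      state: unknown; // whatever small slice of state fits in the storage quota
    }

    function saveSnapshot(): void {
      const snap: Snapshot = { scrollY: window.scrollY, state: currentAppState() };
      try {
        // iOS Safari quotas are small, so this has to stay tiny.
        sessionStorage.setItem("app-snapshot", JSON.stringify(snap));
      } catch {
        // Storage full or unavailable; the app has to cope with losing the snapshot.
      }
    }

    function restoreSnapshot(): void {
      const raw = sessionStorage.getItem("app-snapshot");
      if (!raw) return;
      const snap = JSON.parse(raw) as Snapshot;
      applyAppState(snap.state);
      // Often fights the browser's own scroll restoration, hence "a mess".
      window.scrollTo(0, snap.scrollY);
    }

    // Save at the last reliable moment before the tab may be evicted,
    // restore once the page has loaded again.
    document.addEventListener("visibilitychange", () => {
      if (document.visibilityState === "hidden") saveSnapshot();
    });
    window.addEventListener("load", restoreSnapshot);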
Pretty sure it destroys something in you as well. So many context changes with no relation whatsoever and regular hooks that give you a pinch.
We haven't evolved for that. Our brain tries to figure out a narrative between two things that follow each other. It needs time to process stuff. And there is only so much shock it can absorb at once. So many "?!" and open loops in a day.
I made a TikTok account to at least know what people were talking about. After 3 months, I got it.
And I deleted it.
I felt noticeably worse when using it, in a way that nothing bad for me, including the news, refined sugar and pron, ever made me feel. The destruction was more intense, more structural. I could feel it gnarling.
In a way, such fast feedback is good, because it makes it easy to stop, while I'm still eating tons of refined sugar.
Thirty years ago, I read a book called Amusing Ourselves to Death by Neil Postman, in which he made very similar points about broadcast television. I don't remember all his points, but I vividly remember how he talked about how you'll be watching a news story about something awful, maybe an earthquake in which hundreds of people died, and then with practically no warning you'll be hearing a happy jingle from a toothpaste commercial. The juxtaposition, he said, was bad for the human mind, and was going to create a generation that couldn't focus on important things.
I suspect that the rapid-fire progression of one one-minute video after another does something similar, and is also equally bad for you.
I've noticed that I can read or see something very emotionally engaging - something that really resonates with me, so much so that I'm maybe even choking up over it - and while I'm still having that emotional response, move onto the next post. I almost always have a moment of meta-reflection that scares me - why wasn't I content to just sit there and process these big emotions? How is the dopamine part of my brain so much more powerful than even the emotional part, that it forces me to continue what I'm doing rather than just feeling?
That talking point, that rapid-form media creates attention deficit problems, is honestly overdone, and there's no evidence that it's true at all (that I know of). ADHD exists and is a mostly genetic condition; you can't catch it without something serious like cPTSD. Amusing Ourselves To Death emphasized the desensitization angle much more.
I used to think doomscrolling broke my brain before I was diagnosed. Later I realized I was "doomscrolling" way before I got my first digital device, rereading the same fiction books late into the night.
I can buy the argument that rapid-form media consumption acutely creates symptoms like ADHD (for at most a few hours after exposure) because I see it even in NT people.
I have ADHD myself, so you're not telling me anything I didn't know. Rapid-fire media consumption cannot create the genetic condition, but as you said it can create the symptoms. And that's the important part anyway: a generation that has trouble paying attention to important things because they're getting habituated to rapid-fire video formats. Even if the symptoms (chasing the next dopamine hit) are only acute and not chronic, as long as people are addicted (behaviorally, not chemically) to phone screens, those acute symptoms will occur so often that they might as well be chronic for all practical purposes, because more often than not, people will be in that slightly-dazed state caused by coming off the addictive behavior. (I used to have that myself after a multi-hour gaming session, before I realized that I was displaying all the signs of addiction and quit computer games cold turkey. So I know what it feels like.)
Got it, very good point. Hope somebody studies this soon, I can imagine the title: "Creation of ADHD-like symptoms in neurotypical individuals after exposure to superstimuli/digital content".
The same is true with the "In other news..." technique of seguing to the next story: its end result is overall desensitization and passive consumption.
As usual, if both sides exist, it's because they both provide benefits. The guessers' benefits are just not obvious at first glance.
Taleb has a nice bit on that, explaining that if something exists for long, it must have enduring beneficial properties, and if you think it's stupid, you are the one having a blind spot.
Dawkins led to the same conclusion: stuff that works stays and multiplies. You may not like it, but nature doesn't care what you think.
It's true for entities, systems, traits, concepts...
Everyone mocks Karens, until your flight is delayed and that insufferable lady wears the staff down so much that everyone gets compensation.
I dislike lying but it works, and our entire society is based on it (but we call it advertising).
Don't like misandry? Don't understand why nature didn't select out ugly people? Think circumcision is dumb?
All those things give some advantages in some contexts, to such an extent that they still prosper today.
In fact, several things can be true. Something can be alienating, and yet give enough benefits that it stays around.
A huge number of things are immoral, create suffering, confusion, destruction, even to the practitioner themselves, and yet are still here because they bring something to the table that is just sufficient to justify their existence.
See your friend yet again making a terrible romantic choice, getting pregnant, and ending up stuck with a baby and no father? From a natural selection standpoint, it could very well be a super successful strategy for both parties. The universe doesn't optimize for our happiness or morality.
Enduring survival properties aren't the same as enduring beneficial properties. Feudalism and slavery stuck around for quite a long time and were mostly forced out against their will.
This is a good thing and a required first step, but it's a drop in the sea.
macOS, iOS, Windows and Android are all produced by the USA. Virtually all chips as well.
It is foolish to assume there are not backdoors in every one of them.
Meaning we should assume the USA can shut down all of Europe's IT if they really want to.
Then you have the authentication systems, security software (antivirus, proxies like Cloudflare, CrowdStrike and so on), the various SaaS (doc editors, drives, ticket systems, chats...), the payment systems (including Visa and SWIFT, but also PayPal, Google Pay, Stripe, etc.), the software stores, the root DNS, the SSL root certificates and a ton of network hardware.
Given the current political situation, it's a very bad spot to be in.
- Old people to stabilise and ensure sustainability
Fire one group and you get problems in the long run.
The hard part is to keep the balance between each group's influence. They don't have the same needs, desires, agendas, and flaws.