Hacker News: prodigycorp's comments

What a deeply embarrassing thing to post.

This is great.

In the past month, OpenAI has released for Codex users:

- subagents support

- a better multi agent interface (codex app)

- 40% faster inference

No joke, with the first two my productivity is already up like 3x. I am so stoked to try this out.


How do you get sub agents to work?

Add this to config.toml:

  [features]
  collab = true

This is for API use only.

Is it even possible to actually use codex any other way? Every time I’ve tried logging in instead of using the API, I’ve hit the usage limits within a couple of hours.

Yes, with a Pro tier sub.

Shoot me

Looks like I'm wrong.


Thanks for updating this!

Try Claude and you can get x^2 performance. OpenAI is sweating

May be a bit different depending on what kind of work you're doing, but for me 5.2-codex finally reached a higher level than Opus.

5.2-codex is pretty solid and you get dramatically higher usage rates with cheap plans. I would assume API use is much cheaper as well.

People are sleeping on OpenAI right now, but codex 5.2 xhigh is at least as good as Opus, and you get a TON more usage out of OpenAI's $20/mo plan than Claude's $20/mo plan. I'm always hitting the 5-hour quota with Opus but never have with Codex. The Codex tool itself is not quite as good, but close.

Is there a plan like the $100 Claude Max? $200 for ChatGPT Pro is a little bit too much for me.

Whereas Claude Max 5x is enough that I don’t really run out with my usage patterns.


If $20/mo Claude is not enough for you but 5x Claude at $100/mo is, the $20/mo ChatGPT Plus subscription might give you enough Codex for your usage.

I do not think so. I have been using both for a long time; with Claude I keep hitting the limits quickly and spend most of my time arguing with it. The latest GPT just gets things done, and does it fast. I also agree with most of them that the limits are more generous. (Context: I do a lot of web, backend, and mobile development.)

If I could use GPT-5.2 with Claude Code, yeah. Otherwise slOpus requires too much steering to get things done. GPT-5.2 just works.

4.1 or 4.5? I did not need to steer Opus 4.5 much. A good description was more than enough.

I consider revealing my file structure and file paths to be PII so naturally seeing people's comfort with putting all that up there makes me queasy.

Thank you for such a thoughtful comment. There's politics that gets flagged on this site, and there's politics that makes me think about things with more clarity. Yours is obviously the latter.

Collective decisions are unavoidably political, and grid electricity has to be a collective decision! My hope is that we can take the partisan aspect out of the politics, however, and reduce it to a discussion of the tradeoff of values: cost, reliability, climate, and any other values that we need to include. Fortunately I think that for nearly any value set, the answer is very similar.

My caveman brain was psyched out by the idea of stopping my Coke-drinking habit. I thought I had a soda addiction. Turns out I didn't, I just didn't drink enough water. After I pulled water bottles instead of Coke cans from the fridge, the cravings went away.

Sometimes we don't need cold baths or extreme regimens to fix all the messed up things we're doing to our bodies. Simple changes go far to heal the damage.


I think what you experienced was behavioral addiction, which tends to be a lot easier to overcome than chemical/physical addiction, often enough by just replacing the habit/behavior with something else.

Most people fighting addiction and having a hard time are fighting a chemical dependency, which is a lot harder, and which is when people need to start looking beyond "just do X instead".


You're probably right. It seems like there's not a hard line between behavior and chemical addiction, because of how the chemicals create reward signals to reinforce certain behavior.

From the article:

> Basic science models show that liquid sugar concentrations around 10% by weight—comparable with Coca-Cola, Pepsi, and Mountain Dew—can reliably trigger addictive behaviors in animals, including bingelike consumption, withdrawal, and dopamine system alterations.

But yeah, it's obviously nothing close to nicotine.


Claude responds differently to "think", "think hard", and "think very hard". Just because it's hidden from you doesn't mean a user doesn't have a choice.

Saying gpt-3.5-turbo is better than gpt-5.2 makes me think you've got some of them hidden motives.

https://code.claude.com/docs/en/common-workflows#use-extende...


> Phrases like “think”, “think hard”, “ultrathink”, and “think more” are interpreted as regular prompt instructions and don’t allocate thinking tokens.

They don't allocate thinking tokens, but they do change model behavior.

I was getting this in my Claude code app, it seems clear to me that they didn’t want users to do that anymore and it was deprecated. https://i.redd.it/jvemmk1wdndg1.jpeg

Thanks for the correction. This changed a couple of weeks ago. https://decodeclaude.com/ultrathink-deprecated/

Nice blog, this post is interesting: https://decodeclaude.com/compaction-deep-dive/ Didn't know about Microcompaction!

If you're a big context/compaction fan and want another fun fact: did you know that instead of doing regular compaction (prompting the agent to summarize the conversation in a particular way and starting the new conversation with that summary), Codex passes around a compressed, encrypted object that supposedly preserves the latent space of the previous conversation in the new conversation?

https://openai.com/index/unrolling-the-codex-agent-loop/

https://platform.openai.com/docs/guides/conversation-state#c...
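A rough sketch of what that stateless chaining could look like against the Responses API, going by the linked docs. The model name and exact field values here are assumptions, and no network call is made; this just assembles the request payloads to show the shape of the idea:

```python
# Sketch: stateless conversation chaining via opaque encrypted reasoning
# items, per OpenAI's conversation-state docs. Nothing is stored
# server-side; the encrypted blob carries the prior context forward.

def first_turn_params(prompt: str) -> dict:
    return {
        "model": "gpt-5.2-codex",  # assumed model name
        "input": [{"role": "user", "content": prompt}],
        "store": False,  # stateless: nothing kept server-side
        # ask the API to return reasoning as an opaque encrypted blob
        "include": ["reasoning.encrypted_content"],
    }

def next_turn_params(prior_output: list, prompt: str) -> dict:
    # Feed the previous turn's output items (including the encrypted
    # reasoning item) back in verbatim, then append the new user message.
    return {
        "model": "gpt-5.2-codex",
        "input": prior_output + [{"role": "user", "content": prompt}],
        "store": False,
        "include": ["reasoning.encrypted_content"],
    }

# Stand-in for what a real first turn would have returned:
fake_prior = [{"type": "reasoning", "encrypted_content": "opaque-blob"}]
params = next_turn_params(fake_prior, "now fix the failing test")
```

The point is that the server never needs the plaintext history: the encrypted item is the only thing that round-trips.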

Context management is the new frontier for these labs.


If you're using 5.2 high, with all due respect, this has to be a skill issue. If you're using 5.2 Codex high — use 5.2 high. gpt-5.2 is slow, yes (ok, keeping it real, it's excruciatingly slow). But it's not the moronic caricature you're saying it is.

If you need it to be up to date with your version of a framework, then ask it to use the context7 MCP server. Expecting training data to be up to date is unreasonable for any LLM, and we now have useful solutions to the training data issue.
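For reference, registering an MCP server in Codex's config.toml looks roughly like this (the context7 package name here is an assumption; check the context7 docs for the current install command):

```toml
# Add to ~/.codex/config.toml — Codex will launch the server on startup.
[mcp_servers.context7]
command = "npx"
args = ["-y", "@upstash/context7-mcp"]
```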

If you need it to specify the latest version, don't say "latest". That word would be interpreted differently by humans as well.

Claude is well known for its one-shotting skills. But that comes at the expense of strict instruction-following adherence and thinner context gathering (it doesn't spend as much time gathering context in larger codebases).


I am using GPT-5.2 Codex with reasoning set to high via OpenCode and Codex and when I ask it to fix an E2E test it tells me that it fixed it and prints a command I can run to test the changes, instead of checking whether it fixed the test and looping until it did. This is just one example of how lazy/stupid the model is. It _is_ a skill issue, on the model's part.

Non-Codex GPT-5.2 is much better than Codex GPT-5.2 for me. It does everything better.

Yup, I find it very counter-intuitive that this would be the case, but I switched today and I can already see a massive difference.

It fits with the intuition that codex is simply overfitted.

Yeah, I meant it more like: it's not intuitive to me why OpenAI would fumble it this hard. They have got to have tested it internally and seen that it sucked, especially compared to GPT-5.2.

Codex runs in a stupidly tight sandbox and because of that it refuses to run anything.

But using the same model through pi, for example, it's super smart because pi just doesn't have ANY safeguards :D


I'll take this as my sign to give Pi a shot then :D Edit: I don't want to speak too soon, but this Pi thing is really growing on me so far… Thank you!

Wait until you figure out you can just say "create a skill to do..." and it'll just do it, write it in the right place and tell you to /reload

Or "create an extension to..." and it'll write the whole-ass extension and install it :D


I refuse to defend the 5.2-codex models. They are awful.

Perhaps if he was able to get Claude Code to do what he wanted in less time, and with a better experience, then maybe that's not a skill he (or the rest of us) want to develop.

Talking LLMs off a ledge is a skill we will all need going forward.

Still a skill issue, not a Codex issue. Sure, this line of critique is also one levied by tech bros who want to transfer your company's balance sheet from salaries to AI-SaaS(-ery), but in what world does that automatically make the tech fraudulent or even deficient? And since when is not wanting to develop a skill a reasonable substitute for anything? If my doctor decided they didn't want to keep up on medical advances, I would find a different doctor.

Yet somehow finding fault with an AI because it can't read your mind, and, in response to that adversity, refusing to introspect at all about why that might be and blaming it on the technology, is a reasonable critique? Somehow we have magically discovered a technology to manufacture cognition from nothing more than the intricate weaving of silicon, dopants, et al., and the takeaway is that it sucks because it is too slow, doesn't get everything exactly right, etc.? And the craziest part is that the more time you spend with it, the better intuition you get for getting whatever it is you want out of it.

But, yeah... let's lend even more of an ear to the head-in-sand crowd: that's where the real thought leaders are. You don't have to be an AI techno-utopian maximalist to see the profound worthiness and promise of the technology; these things are manifestly self-evident.

Sure, that's fine. I wrote my comment for the people who don't get angry at AI agents after using them for the first time within five hours of their release. For those who aren't interested in portending doom for OpenAI. (I have elaborate setups for Codex/Claude, btw; there's no fanboying in this space.)

Some things aren't common sense yet so I'm trying my part to make them so.


Common sense has the misfortune of being less "common" than we would all like it to be. Because some breathless hucksters are overpromising and underdelivering in the present, we may as well throw out the baby, the bath water, and the bath tub itself! Who even wants computers to think like humans and automate jobs that no human would want to do? Don't you appreciate the self-worth that comes from menial labor? I don't even get why we use tractors to farm when we have perfectly good beasts of burden to do the same labor!

Feelings are information with just as much, or more, value as biased intellectualizing.

Ask Linus Torvalds.


I have absolutely no idea whatsoever what this means.

TBH, "use a package manager, don't specify versions manually unless necessary, don't edit package files manually" is an instruction that most agents still need to be given explicitly. They love manually editing package.json / cargo.toml / pyproject.toml / what have you, and using whatever version is given in their training data. They still don't have an intuition for which files should be manually written and which files should be generated by a command.

Agree, especially if they're not given access to the web, or if they're not strongly prompted to use the web to gather context. It's tough to judge models and harnesses by pure feel until you understand their proclivities.

Thanks for the tip on the context7 MCP, btw.

How would a person interpret the latest version of flowbite?

Ok. You do you. I'll stick with the models that understand what latest version of a framework means.

I'm extremely wary about any application pushing politics.

I subscribe to MacPaw, who makes excellent apps like Setapp, Gemini, and CleanMyMac, all of which I use.

At some point, CleanMyMac started putting the Ukrainian flag on the app icon and flagging utilities by any Russian developer as untrustworthy (because they are Russian), and recommended that I uninstall them.

I am not pro-Russia/anti-Ukraine-independence by any means, but CleanMyMac is one of those apps that require elevated system permissions. Seeing them engage in software McCarthyism makes me very, very hesitant to grant those permissions.


Sorry, what does this have to do with notepad++?

Sorry, I meant to reply to this comment: https://news.ycombinator.com/item?id=46851664

Please refer to it for context.


You should repost under the intended post

The notepad++ author has publicly come out in favor of Taiwanese independence.

Taiwan is already independent. Surely the normal way to refer to it would be as coming out against assimilation with mainland China?

The official position of Taiwan (Republic of China) and the People's Republic of China is that they're rival governments of the same China.

The Taiwanese government has never formally declared itself independent from the mainland. Such a declaration would likely cause the PRC to invade.

https://en.wikipedia.org/wiki/1992_Consensus


Nonetheless, de facto, they are independent. And if you'd glance back at my comment, I deliberately referred to it as "assimilating with mainland China" to pay lip service to them seeing themselves as the true government of China, in an attempt to avoid this very nitpick.

>Taiwan is already independent.

That is a very controversial statement, and one that both Taipei and Beijing disagree with.


Controversy doesn't change the reality. Stating that Taiwan is not independent is political posturing. Look to French Guiana, which is not independent.

Taipei only disagrees because they're under threat. Doublespeak should generally be called out. Taiwan lives under perpetual fear of occupation and forced assimilation.

>Surely the normal way to refer to it would be as coming out against assimilation with mainland China?

I suppose, though that's not really how I tend to see it phrased on socials or in the media.


De facto, but not de jure.

Before Trump set his sights on Greenland, Denmark also considered Kosovo to be independent.

If you're going to give in and avoid applications because, as in this case, they take a strong stance on Ukraine or Taiwan, the hack has literally achieved its purpose: either silencing the author directly or destroying their userbase.

Fuck 'em, and just donate ten bucks to Notepad++. I'd rather my PC break than reward this crap.


I think I made it clear that I use (and pay for) their applications. I also think I made a sufficiently nuanced comment that doesn't suggest that I've "given in" to anything.

What I took a bit of offense at is the term "software McCarthyism". That's a movement now remembered for an overreaction to often-imaginary enemies. Ukraine is right now fighting for its life in a hot war on our continent here in Europe. Taiwan is at very real risk of being invaded.

American and European infrastructure is subject to cyberattacks that are effectively hostile military acts already. I don't think a vocal stance on Ukraine and an exclusion of Russian developers deserves the rhetoric of McCarthyism, or of being "too political", as has become a fashionable accusation these days. This is no Red Scare; this is speaking up for people being bombed on a daily basis.


AdGuard is flagged as "suspicious". If it is, I'd like a better reason than "because it's Russian".

> a movement now remembered for an over-reaction to often imaginary enemies

I'm sure it felt very real at the time.


I can see where they got that idea from. You saying you won't provide permissions at the end ends up sounding a lot more like you won't use the app than I imagine you intended. (Although, subscribing to an app and then not using it would be silly.)

I support the Ukraine effort as well, but breaking my applications seems like a bridge too far.

> anti-ukraine independence

What the fuck is that supposed to mean, lol. Ukraine isn't some secessionist state.

> Seeing them engage in software maccarythism makes me very, very hesitant to provide them.

So are they wrong when flagging software or not? You haven’t provided any details.


They flag AdGuard for Safari as suspicious. It's one of the most popular Mac apps; if AdGuard is truly suspicious, it should be bigger news.

I hate to say this, but wariness of software developed within Russia has been around for ages, long before the current war.

Since there are a lot of both Ukrainian and Russian software developers, this is personal for a lot of people in the industry.


Can you relax the restrictions on your link or share a direct link to the video? I don't have a Bluesky account.

Yes. Sorry, I'd no idea / had forgotten it worked that way. Thank you for pointing it out. I've updated my settings.


Random aside about training data:

One of the funniest things I've started to notice from Gemini in particular is that in random situations, it talks in English with an agreeable affect that I can only describe as... Indian? I've never noticed such a thing leak through before. There must be a ton of people in India who are generating new datasets for training.


There was a really great article or blog post published in the last few months about the author's very personal experience whose gist was "People complain that I sound/write like an LLM, but it's actually the inverse because I grew up in X where people are taught formal English to sound educated/western, and those areas are now heavily used for LLM training."

I wish I could find it again, if someone else knows the link please post it!


I'm Kenyan. I don't write like ChatGPT, ChatGPT writes like me

https://news.ycombinator.com/item?id=46273466


Thanks for that link.

This part made me laugh though:

> These detectors, as I understand them, often work by measuring two key things: ‘Perplexity’ and ‘burstiness’. Perplexity gauges how predictable a text is. If I start a sentence, "The cat sat on the...", your brain, and the AI, will predict the word "floor."

I can't be the only one whose brain predicted "mat"?


And I thought it would be a hat...

No, that would be "in the hat."

Thank you!!! :)

I've been critical of people who default to "an em dash being used means the content is generated by an LLM", or "they've numbered their points, must be an LLM".

I do know that LLMs generate content heavy with those constructs, but they didn't create the ideas out of thin air; it was in the training set, and existed strongly enough that LLMs saw it as commonplace/best practice.


That's very interesting. Any examples you can share which have that agreeable affect?

I'm going to do a cursory look through my antigrav history; I want to find it too. I remember it's primarily in the exclamations of agreement/revelation, and one time expressing concern, which were slightly off from natural for an American English speaker.

Can't find anything; too many messages telling the agent "please do NOT make those changes". I'm going to remember to save them going forward.
