Hacker News: prodigycorp's comments

What a deeply embarrassing thing to post.

This is great.

In the past month, OpenAI has released for Codex users:

- subagents support

- a better multi agent interface (codex app)

- 40% faster inference

No joke, with the first two my productivity is already up like 3x. I am so stoked to try this out.


How do you get sub agents to work?

Add this to config.toml:

  [features]
  collab = true

This is for API use only.

Is it even possible to actually use codex any other way? Every time I’ve tried logging in instead of using the API, I’ve hit the usage limits within a couple of hours.

Yes, with a Pro tier sub.

Shoot me

Looks like I'm wrong.


Thanks for updating this!

Try Claude and you can get x^2 performance. OpenAI is sweating

May be a bit different depending on what kind of work you're doing, but for me 5.2-codex finally reached a higher level than Opus.

5.2-codex is pretty solid and you get dramatically higher usage rates with cheap plans. I would assume API use is much cheaper as well.

People are sleeping on OpenAI right now, but codex 5.2 xhigh is at least as good as Opus, and you get a TON more usage out of OpenAI's $20/mo plan than Claude's $20/mo plan. I'm always hitting the 5-hour quota with Opus but never have with Codex. The Codex tool itself is not quite as good, but close.

Is there a plan like the $100 Claude Max? $200 for ChatGPT Pro is a little bit too much for me.

Whereas Claude Max 5x is enough that I don’t really run out with my usage patterns.


If $20/mo Claude is not enough for you but 5x Claude at $100/mo is, the $20/mo ChatGPT Plus subscription might give you enough Codex for your usage.

I do not think so. I have been using both for a long time; with Claude I keep hitting the limits quickly and spend most of my time arguing with it. The latest GPT just gets things done, and does it fast. I also agree with most of them that the limits are more generous. (Context: I do a lot of web, backend, and mobile development.)

If I could use GPT-5.2 with Claude Code, yeah. Otherwise slOpus requires too much steering to get things done. GPT-5.2 just works.

4.1 or 4.5? I did not need to steer Opus 4.5 much. A good description was more than enough.

I consider revealing my file structure and file paths to be PII so naturally seeing people's comfort with putting all that up there makes me queasy.

Thank you for such a thoughtful comment. There's politics that gets flagged on this site, and there's politics that makes me think about things with more clarity. Yours is obviously the latter.

Collective decisions are unavoidably political, and grid electricity has to be a collective decision! My hope is that we can take the partisan aspect out of the politics, however, and reduce it to a discussion of the tradeoff of values: cost, reliability, climate, and any other values that we need to include. Fortunately I think that for nearly any value set, the answer is very similar.

My caveman brain was psyched out by the idea of stopping my Coke-drinking habit. I thought I had a soda addiction. Turns out I didn't, I just didn't drink enough water. After I pulled water bottles instead of Coke cans from the fridge, the cravings went away.

Sometimes we don't need cold baths or extreme regimens to fix all the messed up things we're doing to our bodies. Simple changes go far to heal the damage.


I think what you experienced was behavioral addiction, which tends to be a lot easier to overcome than chemical/physical addiction, often enough by just replacing the habit/behavior with something else.

Most people fighting addiction and having a hard time are fighting a chemical dependency, which is a lot harder, and which is when people need to start looking beyond "just do X instead".


You're probably right. It seems like there's not a hard line between behavior and chemical addiction, because of how the chemicals create reward signals to reinforce certain behavior.

From the article:

> Basic science models show that liquid sugar concentrations around 10% by weight—comparable with Coca-Cola, Pepsi, and Mountain Dew—can reliably trigger addictive behaviors in animals, including bingelike consumption, withdrawal, and dopamine system alterations.

But yeah, it's obviously nothing close to nicotine.


Claude responds differently to "think", "think hard", and "think very hard". Just because it's hidden from you doesn't mean a user doesn't have a choice.

Saying gpt-3.5-turbo is better than gpt-5.2 makes me think you've got some of them hidden motives.

https://code.claude.com/docs/en/common-workflows#use-extende...


> Phrases like “think”, “think hard”, “ultrathink”, and “think more” are interpreted as regular prompt instructions and don’t allocate thinking tokens.

They don't allocate thinking tokens, but they do change model behavior.

I was getting this in my Claude code app, it seems clear to me that they didn’t want users to do that anymore and it was deprecated. https://i.redd.it/jvemmk1wdndg1.jpeg

Thanks for the correction. This changed a couple of weeks ago. https://decodeclaude.com/ultrathink-deprecated/

Nice blog, this post is interesting: https://decodeclaude.com/compaction-deep-dive/ Didn't know about Microcompaction!

If you're a big context/compaction fan and want another fun fact: did you know that instead of doing regular compaction (prompting the agent to summarize the conversation in a particular way and starting the new conversation with that summary), Codex passes around a compressed, encrypted object that supposedly preserves the latent space of the previous conversation in the new conversation?

https://openai.com/index/unrolling-the-codex-agent-loop/

https://platform.openai.com/docs/guides/conversation-state#c...
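A rough sketch of what that stateless chaining could look like against the Responses API, going by the linked docs. The model name and exact field values here are assumptions, and no network call is made; this just assembles the request payloads to show the shape of the idea:

```python
# Sketch: stateless conversation chaining via opaque encrypted reasoning
# items, per OpenAI's conversation-state docs. Nothing is stored
# server-side; the encrypted blob carries the prior context forward.

def first_turn_params(prompt: str) -> dict:
    return {
        "model": "gpt-5.2-codex",  # assumed model name
        "input": [{"role": "user", "content": prompt}],
        "store": False,  # stateless: nothing kept server-side
        # ask the API to return reasoning as an opaque encrypted blob
        "include": ["reasoning.encrypted_content"],
    }

def next_turn_params(prior_output: list, prompt: str) -> dict:
    # Feed the previous turn's output items (including the encrypted
    # reasoning item) back in verbatim, then append the new user message.
    return {
        "model": "gpt-5.2-codex",
        "input": prior_output + [{"role": "user", "content": prompt}],
        "store": False,
        "include": ["reasoning.encrypted_content"],
    }

# Stand-in for what a real first turn would have returned:
fake_prior = [{"type": "reasoning", "encrypted_content": "opaque-blob"}]
params = next_turn_params(fake_prior, "now fix the failing test")
```

The point is that the server never needs the plaintext history: the encrypted item is the only thing that round-trips.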

Context management is the new frontier for these labs.


If you're using 5.2 high, with all due respect, this has to be a skill issue. If you're using 5.2 Codex high — use 5.2 high. gpt-5.2 is slow, yes (ok, keeping it real, it's excruciatingly slow). But it's not the moronic caricature you're saying it is.

If you need it to be up to date with your version of a framework, then ask it to use the context7 MCP server. Expecting training data to be up to date is unreasonable for any LLM, and we now have useful solutions to the training data issue.
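For reference, registering an MCP server in Codex's config.toml looks roughly like this (the context7 package name here is an assumption; check the context7 docs for the current install command):

```toml
# Add to ~/.codex/config.toml — Codex will launch the server on startup.
[mcp_servers.context7]
command = "npx"
args = ["-y", "@upstash/context7-mcp"]
```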

If you need it to specify the latest version, don't say "latest". That word would be interpreted differently by humans as well.

Claude is well known for its one-shotting skills. But that comes at the expense of strict instruction-following adherence and thinner context gathering (it doesn't spend as much time gathering context in larger codebases).


I am using GPT-5.2 Codex with reasoning set to high via OpenCode and Codex and when I ask it to fix an E2E test it tells me that it fixed it and prints a command I can run to test the changes, instead of checking whether it fixed the test and looping until it did. This is just one example of how lazy/stupid the model is. It _is_ a skill issue, on the model's part.

Non-Codex GPT-5.2 is much better than Codex GPT-5.2 for me. It does everything better.

Yup, I find it very counter-intuitive that this would be the case, but I switched today and I can already see a massive difference.

It fits with the intuition that codex is simply overfitted.

Yeah, I meant it more like: it's not intuitive to me why OpenAI would fumble it this hard. They have got to have tested it internally and seen that it sucked, especially compared to GPT-5.2.

Codex runs in a stupidly tight sandbox and because of that it refuses to run anything.

But using the same model through pi, for example, it's super smart because pi just doesn't have ANY safeguards :D


I'll take this as my sign to give Pi a shot then :D Edit: I don't want to speak too soon, but this Pi thing is really growing on me so far… Thank you!

Wait until you figure out you can just say "create a skill to do..." and it'll just do it, write it in the right place and tell you to /reload

Or "create an extension to..." and it'll write the whole-ass extension and install it :D


I refuse to defend the 5.2-codex models. They are awful.

Perhaps if he was able to get Claude Code to do what he wanted in less time, and with a better experience, then maybe that's not a skill he (or the rest of us) want to develop.

Talking LLMs off a ledge is a skill we will all need going forward.

Still a skill issue, not a Codex issue. Sure, this line of critique is also one levied by tech bros who want to transfer your company's balance sheet from salaries to AI-SaaS(-ery), but in what world does that automatically make the tech fraudulent or even deficient? And since when is not wanting to develop a skill a reasonable substitute for anything? If my doctor decided they didn't want to keep up on medical advances, I would find a different doctor.

Yet somehow finding fault with an AI because it can't read your mind, and, in response to that adversity, refusing to introspect at all about why that might be and blaming it on the technology, is a reasonable critique? Somehow we have magically discovered a technology to manufacture cognition from nothing more than the intricate weaving of silicon, dopants, et al., and the takeaway is that it sucks because it is too slow, doesn't get everything exactly right, etc.? And the craziest part is that the more time you spend with it, the better intuition you get for getting whatever it is you want out of it.

But, yeah... let's lend even more of an ear to the head-in-sand crowd: that's where the real thought leaders are. You don't have to be an AI techno-utopian maximalist to see the profound worthiness and promise of the technology; these things are manifestly self-evident.

Sure, that's fine. I wrote my comment for the people who don't get angry at AI agents after using them for the first time within five hours of their release. For those who aren't interested in portending doom for OpenAI. (I have elaborate setups for Codex/Claude, btw; there's no fanboying in this space.)

Some things aren't common sense yet so I'm trying my part to make them so.


Common sense has the misfortune of being less "common" than we would all like it to be. Because some breathless hucksters are overpromising and underdelivering in the present, we may as well throw out the baby, the bath water, and the bath tub itself! Who even wants computers to think like humans and automate jobs that no human would want to do? Don't you appreciate the self-worth that comes from menial labor? I don't even get why we use tractors to farm when we have perfectly good beasts of burden to do the same labor!

Feelings are information with just as much, or more, value as biased intellectualizing.

Ask Linus Torvalds.


I have absolutely no idea whatsoever what this means.

TBH, "use a package manager, don't specify versions manually unless necessary, don't edit package files manually" is an instruction that most agents still need to be given explicitly. They love manually editing package.json / cargo.toml / pyproject.toml / what have you, and using whatever version is given in their training data. They still don't have an intuition for which files should be manually written and which files should be generated by a command.

Agree, especially if they're not given access to the web, or if they're not strongly prompted to use the web to gather context. It's tough to judge models and harnesses by pure feel until you understand their proclivities.

Thanks for the tip on the context7 MCP, btw.

How would a person interpret the latest version of flowbite?

Ok. You do you. I'll stick with the models that understand what latest version of a framework means.

I'm extremely wary about any application pushing politics.

I subscribe to MacPaw, who makes excellent apps like Setapp, Gemini, and CleanMyMac, all of which I use.

At some point, CleanMyMac started putting the Ukrainian flag on the app icon and flagging utilities by any Russian developer as untrustworthy (because they are Russian), and recommended that I uninstall them.

I am not pro-Russia/anti-Ukraine-independence by any means, but CleanMyMac is one of those apps that require elevated system permissions. Seeing them engage in software McCarthyism makes me very, very hesitant to grant those permissions.


Sorry, what does this have to do with notepad++?

Sorry, I meant to reply to this comment: https://news.ycombinator.com/item?id=46851664

Please refer to it for context.


You should repost under the intended post

The notepad++ author has publicly come out in favor of Taiwanese independence.

Taiwan is already independent. Surely the normal way to refer to it would be as coming out against assimilation with mainland China?

The official position of Taiwan (Republic of China) and the People's Republic of China is that they're rival governments of the same China.

The Taiwanese government has never formally declared itself independent from the mainland. Such a declaration would likely cause the PRC to invade.

https://en.wikipedia.org/wiki/1992_Consensus


Nonetheless, de facto, they are independent. And if you'd glance back at my comment, I deliberately referred to it as "assimilating with mainland China" to pay lip service to them seeing themselves as the true government of China, in an attempt to avoid this very nitpick.

>Taiwan is already independent.

That is a very controversial statement, and one that both Taipei and Beijing disagree with.


Controversy doesn't change the reality. Stating that Taiwan is not independent is political posturing. Look to French Guiana, which is not independent.

Taipei only disagrees because they're under threat. Doublespeak should generally be called out. Taiwan lives under perpetual fear of occupation and forced assimilation.

>Surely the normal way to refer to it would be as coming out against assimilation with mainland China?

I suppose, though that's not really how I tend to see it phrased on socials or in the media.


De facto, but not de jure.

Before Trump set his sights on Greenland, Denmark also considered Kosovo to be independent.

If you're going to give in and avoid applications because, as in this case, they take a strong stance on Ukraine or Taiwan, the hack has literally achieved its purpose: either silencing the author directly or destroying their userbase.

Fuck 'em, and just donate ten bucks to Notepad++. I'd rather my PC break than reward this crap.


I think I made it clear that I use (and pay for) their applications. I also think I made a sufficiently nuanced comment that doesn't suggest that I've "given in" to anything.

What I took a bit of offense at is the term "software McCarthyism". That's a movement now remembered for an overreaction to often-imaginary enemies. Ukraine is right now fighting for its life in a hot war on our continent here in Europe. Taiwan is at very real risk of being invaded.

American and European infrastructure is subject to cyberattacks that are effectively hostile military acts already. I don't think a vocal stance on Ukraine and an exclusion of Russian developers deserves the rhetoric of McCarthyism, or of being "too political", as has become a fashionable accusation these days. This is no Red Scare; this is speaking up for people being bombed on a daily basis.


AdGuard is flagged as "suspicious". If it is, I'd like a better reason than "because it's Russian".

> a movement now remembered for an over-reaction to often imaginary enemies

I'm sure it felt very real at the time.


I can see where they got that idea from. You saying you won't provide permissions at the end ends up sounding a lot more like you won't use the app than I imagine you intended. (Although, subscribing to an app and then not using it would be silly.)

I support the Ukraine effort as well, but breaking my applications seems like a bridge too far.

> anti-ukraine independence

What the fuck is that supposed to mean, lol. Ukraine isn't some secessionist state.

> Seeing them engage in software maccarythism makes me very, very hesitant to provide them.

So are they wrong when flagging software or not? You haven’t provided any details.


They flag AdGuard for Safari as suspicious. It's one of the most popular Mac apps; if AdGuard is truly suspicious, it should be bigger news.

I hate to say this, but wariness of software developed within Russia has been around for ages, long before the current war.

Since there are a lot of both Ukrainian and Russian software developers, this is personal for a lot of people in the industry.


Can you relax the restrictions on your link or share a direct link to the video? I don't have a Bluesky account.

Yes. Sorry, I'd no idea / had forgotten it worked that way. Thank you for pointing it out. I've updated my settings.


Random aside about training data:

One of the funniest things I've started to notice from Gemini in particular is that in random situations, it talks in English with an agreeable affect that I can only describe as... Indian? I've never noticed such a thing leak through before. There must be a ton of people in India who are generating new datasets for training.


There was a really great article or blog post published in the last few months about the author's very personal experience whose gist was "People complain that I sound/write like an LLM, but it's actually the inverse because I grew up in X where people are taught formal English to sound educated/western, and those areas are now heavily used for LLM training."

I wish I could find it again, if someone else knows the link please post it!


I'm Kenyan. I don't write like ChatGPT, ChatGPT writes like me

https://news.ycombinator.com/item?id=46273466


Thanks for that link.

This part made me laugh though:

> These detectors, as I understand them, often work by measuring two key things: ‘Perplexity’ and ‘burstiness’. Perplexity gauges how predictable a text is. If I start a sentence, "The cat sat on the...", your brain, and the AI, will predict the word "floor."

I can't be the only one whose brain predicted "mat"?


And I thought it would be a hat...

No, that would be "in the hat."

Thank you!!! :)

I've been critical of people who default to "an em dash being used means the content is generated by an LLM", or "they've numbered their points, must be an LLM".

I do know that LLMs generate content heavy with those constructs, but they didn't create the ideas out of thin air; it was in the training set, and existed strongly enough that LLMs saw it as commonplace/best practice.


That's very interesting. Any examples you can share which have that agreeable affect?

I'm going to do a cursory look through my antigrav history; I want to find it too. I remember it's primarily in the exclamations of agreement/revelation, and one time expressing concern, which were slightly off from natural for an American English speaker.

Can't find anything; too many messages telling the agent "please do NOT make those changes". I'm going to remember to save them going forward.
