Why would I care if a protest is done in coordination with the local municipality. Why would I care if traffic gets blocked. Why would I care if a DHS agent has their finger bit off? These things require a level of respect for a system I do not have. I hope this clears things up.
I still have yet to find a "Small" model that can use function calls consistently enough to not be frustrating. That is the most noticeable difference I consistently see between even older "SOTA" models and the best performing "SMALL" models (<70b).
I think they are referring to the occurrence rate of false positives, not that of false negatives. E.g. the page for California lists back through to the Bond Fire, which was contained in 2020. The problem may stem from that the FEMA page lists the incident as a single day https://www.fema.gov/disaster/5385 so this tool doesn't set and end date like it would for https://www.fema.gov/disaster/5382
A similar kind of noise note could probably be made of the "Recent Earthquakes" section. E.g. if you select Indianapolis, IN it includes all the way down to a M2.6 which occurred in NW Tennessee 30 days ago.
’’’Phrases like “think”, “think hard”, “ultrathink”, and “think more” are interpreted as regular prompt instructions and don’t allocate thinking tokens.’’’
I was getting this in my Claude code app, it seems clear to me that they didn’t want users to do that anymore and it was deprecated. https://i.redd.it/jvemmk1wdndg1.jpeg
If you're a big context/compaction fan and want another fun fact, did you know that instead of doing regular compaction (prompting the agent to summarize the conversation in a particular way and starting the new conversation with that), Codex passes around a compressed, encrypted object that supposedly preserves the latent space of the previous conversation in the new conversation.
reply