No. There is a whole news cycle about how chats you delete aren't actually being deleted because of a lawsuit; they essentially have to respond. It's not an attempt to spin the lawsuit; it's about reassuring their customers.
The part where they go out of their way to call the lawsuit baseless is spin, though, and mixing that with this messaging does exactly that: it presents a mixed message. The NYT lawsuit is objectively not baseless. OpenAI did train on the Times, and ChatGPT does output information from that training. That’s the basis of the lawsuit. NYT may lose, this could end up being considered fair use, it might ultimately prove a flimsy basis for a lawsuit, but to call it baseless (with nothing to back that up) is spin and makes this message less reassuring.
No, it's not. It's absolutely standard corporate communications. If they're fighting the lawsuit, that is essentially the only thing they can say about it. Ford Motor Company would say the same thing (well, they'd probably say "meritless and frivolous").
No, this isn't even close to spin, it's just a standard part of defending your case. In the US tort system you need to be constantly publicly saying you did nothing wrong. Any wavering on that point could be used against you in court.
This is a funny thread. You say "No" but then restate the point with slightly different words. As if anything a company says publicly about ongoing litigation isn't spin.
Can you share your definition? This is actually quite puzzling, because as far as I know “spin” has always been associated with presenting things in a way that benefits you. Like, decades ago, they could have the show “Bill O’Reilly’s No Spin Zone” and everybody knew the premise was that they’d argue against guests who were trying to tell a “massaged” version of the story, and that they’d go for some actual truth (fwiw I thought the whole show was full of crap, but the name was not confusing or ambiguous).
I’m not aware of any definition of “spin” where being conventional is a defense against that accusation. Actually, that was the (imagined) value-add of the show, that conventional corporate and political messaging is heavily spun.
Spin, like you illustrate in your comment, has connotations of distorting the truth.
Simply denying the allegations isn't really spinning anything; it's just denying the allegations. And the thing I dislike about characterizing something like this as spin is that it defangs the term by removing all those connotations and instead turning it into just a buzzwordy way of saying, "I disagree with what this person said."
They didn’t just deny the allegations. They called the case baseless. The case is clearly not baseless, in the sense that there’s at least enough of a basis that the court didn’t vacate the order to preserve the chats.
It seems to me that the discussion of whether or not it is spin has turned into a discussion of which party people basically agree with.
My personal opinion is that OpenAI will probably win, or at least get away with a pretty minor fine or something like that. However, the communications coming from both parties in the case should be assumed to be corporate spin until proven otherwise. And calling an unfinished case baseless is, at the least, a bit presumptuous!
There's a difference between "we are choosing to phrase it this way" versus "our lawyers told us we have to say this". "Spin" is generally seen as a voluntary action, which makes the former a clearcut case of it, the latter less so.
1) taking your lawyer’s advice is a voluntary action (although it is probably a good one)
2) I don’t understand the distinction being made between voluntary or involuntary, in the sense that a corporation is a thing made up of people; it doesn’t have a will in-and-of-itself, so the communications it sends must always actually be made by somebody inside the corporation (whether a lawyer, a marketing person, or in the unlikely event that somebody lets them out, an engineer).
My understanding is that they have to keep chats based on an order, *as a result of their previous accidental deletion of potential evidence in the case*[0].
And per their own terms they likely only delete messages "when they want to" given the big catch-alls. "What happens when you delete a chat? -> It is scheduled for permanent deletion from OpenAI's systems within 30 days, unless: It has already been de-identified and disassociated from your account"[1]
They should include the part where the order is a result of them deleting things they shouldn’t have then. You know, if this isn’t spin.
Then again, I’m starting to think OpenAI is gathering a cult-leader-like following, where any negative comment will result in devoted followers, or those with something to gain, immediately jumping to its defense no matter how flimsy the ground.
>They should include the part where the order is a result of them deleting things they shouldn’t have then. You know, if this isn’t spin.
From what I can tell from the court filings, prior to the judge's order to retain everything, the request to retain everything was coming from the plaintiff, with openai objecting to the request and refusing to comply in the meantime. If so, it's a bit misleading to characterize this as "deleting things they shouldn’t have", because what they "should have" done wasn't even settled. That's a bit rich coming from someone accusing openai of "spin".
Your linked article talks about openai deleting training data. I don't see how that's related to the current incident, which is about user queries. The ruling from the judge for openai to retain all user queries also didn't reference this incident.
Without this devolving into a tit for tat: the article explains, for those following this conversation, why it’s been elevated to a court order and not just an expectation to preserve.
No worries. I can’t force understanding on anyone.
Here. I had an LLM summarize it for you.
A court order now requires OpenAI to retain all user data, including deleted ChatGPT chats, as part of the ongoing copyright lawsuit brought by The New York Times (NYT) and other publishers[1][2][6][7]. This order was issued because the NYT argued that evidence of copyright infringement—such as AI outputs closely matching NYT articles—could be lost if OpenAI continued its standard practice of deleting user data after 30 days[2][6][7].
This new requirement is directly related to a 2024 incident where OpenAI accidentally deleted critical data that NYT lawyers had gathered during the discovery process. In that incident, OpenAI engineers erased programs and search result data stored by NYT's legal team on dedicated virtual machines provided for examining OpenAI's training data[3][4][5]. Although OpenAI recovered some of the data, the loss of file structure and names rendered it largely unusable for the lawyers’ purposes[3][5]. The court and NYT lawyers did not believe the deletion was intentional, but it highlighted the risks of relying on OpenAI’s internal data retention and deletion practices during litigation[3][4][5].
The court order to retain all user data is a direct response to concerns that important evidence could be lost—just as it was in the accidental deletion incident[2][6][7]. The order aims to prevent any further loss of potentially relevant information as the case proceeds. OpenAI is appealing the order, arguing it conflicts with user privacy and their established data deletion policies[1][2][6][7].
Gruez said that is talking about an incident in this case but unrelated to the judge's order in question.
You said the article "explains for those following this conversation why it’s been elevated to a court order" but it doesn't actually explain that. It is talking about separate data being deleted in a different context. It is not user chats and access logs. It is the data that was used to train the models.
I pointed that out a second time since it seemed to be misunderstood.
Then you posted an LLM summary of something unrelated to the point being made.
Now we're here.
As you say, one cannot force understanding on another; we all have to do our part. ;)
Edit:
> The court order to retain all user data is a direct response to concerns that important evidence could be lost—just as it was in the accidental deletion incident[2][6][7].
What did you prompt the LLM with for it to reach this conclusion? The [2][6][7] citations similarly don't seem to explain how that incident from months ago informed the judge's recent decision. Anyway, I'm not saying the conclusion is wrong, I'm saying the article you linked does not support the conclusion.
I think in your rush to reply you may have not read the summarization.
Calm down, cool off, and read it again.
The point is that the circumstances of the incident in 2024 are directly related to the how and why of the NYT lawyers’ request and the judge’s order.
The article I linked was to the incident in 2024.
Not everything has to be about pedantry and snark, even on HN.
Edit: I see you edited your response after re-reading the summarization. I’m glad cooler heads have prevailed.
The prompt was simply “What is the relation, if any, between OpenAI being ordered to retain user data and the incident from 2024 where OpenAI accidentally deleted the NYT lawyers’ data while they were investigating whether OpenAI had used their data to train their models?”
> I see you edited your response after re-reading the summarization.
Just to be clear, the summary is not convincing. I do understand the idea but none of the evidence presented so far suggests that was the reason. The court expected that the data would be retained, the court learned that it was not, the court gave an order for it to be retained. That is the seeming reason for the order.
Put another way: if the incident last year had not happened, the court would still have issued the order currently under discussion.
It's hard to reassure your customers if you can't address the elephant in the room. OpenAI brought this on themselves by flouting copyright law and assuring everyone else that such aggressive and probably-illegal action would be retroactively acceptable once they were too big to fail.