August 28, 2025, 12:05 pm | Read time: 3 minutes
In a blog post, OpenAI openly discusses the safety measures in ChatGPT, which are designed to prevent the artificial intelligence from being used for dangerous and harmful purposes. To this end, company employees can read chats and, in serious cases, even forward them to the police.
Artificial intelligence has proven to be an extremely helpful tool, answering questions, speeding up workflows, and sometimes simply serving as a conversation partner. However, the technology is still in its early stages and often lacks sufficient safety measures to protect users. A Stanford University study shows that the use of AI in mental health care can have dangerous consequences. In the U.S., parents are suing ChatGPT operator OpenAI because their son allegedly obtained a suicide guide through the AI.
Police Are Involved in Cases of Immediate Danger
OpenAI has now published an article on its own blog addressing user safety on the platform. For the first time, the company reveals what happens to chats in which “we discover users who want to harm others.” A team of human employees reviews flagged content and is authorized to suspend accounts. Furthermore, OpenAI writes: “If human reviewers determine that a case poses an immediate threat to the physical safety of others, we may refer it to law enforcement.”
In its FAQ on data usage in ChatGPT, OpenAI states clearly that its own employees and “trusted” service providers can access user content. Reasons for data access may include:
- Investigating abuse or security incidents
- Providing support
- Handling legal matters
- Improving model performance (unless the user has opted out)
The company explicitly advises users not to share sensitive information that they do not want to be reviewed or used.
However, chats involving self-harm are explicitly excluded from being forwarded to the police “to respect people’s privacy.” Given the fatal consequences a conversation with ChatGPT can have if safety measures fail, this statement seems somewhat disconcerting.
When asked, TECHBOOK was referred to the privacy FAQ by an OpenAI agent. So far, we have not received an answer as to how the company plans to handle cases of self-harm better.
OpenAI Continually Improves Protection Mechanisms
In its blog article, OpenAI outlines specific measures it intends to take to address existing problems. According to the company, safety measures work best in short conversations; in longer exchanges, “parts of the model’s safety training can degrade.” Through extended back-and-forth, users can exploit this weakness to bypass the protections. OpenAI says it is working to keep these protection mechanisms intact even in such conversations.
Also interesting: How to Prevent ChatGPT from Using Your Inputs as Training Data
It can also happen that content that should be blocked still slips through. OpenAI is working to ensure that its filters do not underestimate the severity of such content.