Meta Platforms Inc., the parent company of Facebook, Instagram, and Threads, is in the process of replacing its human risk assessment teams with artificial intelligence systems, according to multiple sources familiar with the company's internal operations.
The transition, which remains in an early rollout phase, is part of Meta’s broader strategy to automate safety analysis, crisis prediction, and content moderation across its suite of digital platforms.
Sources say that Meta engineers are now deploying AI models trained on vast quantities of user behavior data and historical platform incidents. These systems are designed to identify potential threats—ranging from violent or extremist content to coordinated disinformation campaigns—without direct human intervention.
Instead of routing flagged posts or emerging crises to specialized human review teams, product managers and engineers can now submit structured questionnaires to the AI and receive near-instant risk ratings and policy recommendations.
Historically, Meta’s content moderation apparatus has relied on dedicated teams of analysts and risk assessors who review borderline cases flagged by automated filters.
Under the new approach, those human assessors may increasingly play supporting roles, auditing AI outputs or handling appeals, while AI models take on front-line decision-making responsibilities. According to one Meta engineer briefed on the project, "The goal is to accelerate compliance checks and reduce staff workload. The AI can crunch mountains of data much faster than a human team." However, the same engineer acknowledged that "it remains to be seen how well the models handle nuanced or context-dependent content, especially in crisis zones."
In April, Meta’s independent Oversight Board issued a series of rulings criticizing the company’s content filtering practices—while stopping short of ordering policy reversals. In one of those rulings, the Board specifically called on Meta to evaluate any adverse human rights impacts stemming from its shift toward greater automation.
The Board’s statement noted: “As these changes are being rolled out globally, the Board emphasizes it is now essential that Meta identify and address adverse impacts on human rights that may result from them. This should include assessing whether reducing its reliance on automated detection of policy violations could have uneven consequences globally, especially in countries experiencing current or recent crises, such as armed conflicts.”
Meta spokespersons have declined to confirm the precise scope of the shift away from human review. In a brief statement provided to the press, a company representative said: “We are continually investing in advanced AI tools to enhance user safety and reduce reliance on manual processes. Our human content reviewers remain integral to our overall safety ecosystem, providing critical context and oversight where AI may fall short.” The spokesperson added that Meta’s engineering teams are rolling out these AI-driven risk assessment tools incrementally, with ongoing internal audits to measure accuracy and fairness.
Industry analysts note that Meta’s decision comes amid mounting pressure to balance scale with precision. The platform handles billions of pieces of user-generated content daily, making it increasingly challenging—and expensive—to maintain large-scale human review teams. As one Silicon Valley consultant explained, “If you have thousands of employees sifting through flagged posts, the costs run into the hundreds of millions annually. AI promises to lower that cost curve, but it also introduces new risks around bias, false positives, and regional blind spots.”
Experts worry that Meta’s AI-driven model could produce uneven outcomes across countries and languages. Automated detection systems are typically trained on data sets that skew toward major languages like English, Spanish, and Mandarin.
Content in lower-resource languages, such as regional dialects spoken in parts of Africa or Southeast Asia, may be misinterpreted or overlooked altogether. Human reviewers, by contrast, often possess local expertise and can distinguish, for example, a protest chant from an incitement to violence.