The crisis of Meta’s left behind languages

Facebook has left many languages behind in the past, particularly in Asia and Africa. Could a new language model change that?

Here’s what you need to know this week:

  • Affairs of state: The crisis of Meta’s left behind languages

State-by-state:

  • Google acted to secure abortion-related data
  • Apple announced “Lockdown Mode”
  • Microsoft measured “thriving”
  • Meta retaliated against a whistleblower
  • Amazon Prime might have to give up “dark patterns”
  • Tencent has the lower hand in China

Affairs of state: Can AI help Meta’s left behind languages?

Facebook users – all 2.9 billion of them – speak over 110 different languages. 

However, Facebook only reviews content in around 70 languages and has published its community standards in just 50.

What’s the problem? Inaccuracy and omission. In certain languages, users can post content that is dangerous and illegal, but Facebook doesn’t notice. It also means that users on Facebook and Instagram can’t always read the menus, terms of service or privacy policy in a language that they understand. 

Is it really doing that much harm? Yes. Facebook has admitted to playing a role in the genocide against Rohingya people in Myanmar, by failing to detect incitement to violence in the local language, Burmese. 

An investigation by the Bureau of Investigative Journalism and the Observer also found that Facebook was permitting content containing incitements to ethnic violence in Ethiopia, in Amharic and other minority languages. 

Just last month, Facebook accepted advertisements containing “violent speech that directly calls for people to be killed, starved or ‘cleansed’ from an area” targeting Amhara, Oromo and Tigrayan people there – despite claiming it had “been implementing a comprehensive strategy to keep people in the country safe”. 

Internal memos have shown that Facebook’s staff know translations of the term “hate speech” into Pashto, a language widely spoken in Pakistan and Afghanistan, are not accurate. In Fiji, racist commentary during the country’s elections in 2018 was addressed by local officials sending translations to Facebook to seek moderation. 

Mohammed Saneem, an election supervisor in Fiji, argued at the time: “if they are allowing users to post in their language, there should be guidelines available in that same language”.

What’s the solution? For Facebook, one solution may be artificial intelligence. Last week the company announced that it would be using a new model – called “No Language Left Behind” (NLLB) – to translate 200 different languages and improve the quality of translation. 

“NLLB will help more people read things in their preferred language, rather than always requiring an intermediary language that often gets the sentiment or content wrong,” Facebook has said.

Facebook has emphasised that the purpose of NLLB is digital inclusion – allowing more users to engage on Facebook platforms in the language of their choice. 

But the company has also offered a $200,000 grant to researchers who apply NLLB to initiatives in “areas in support of the UN Sustainable Development Goals”. This would include Goal 16, which covers reducing the number of people fleeing war, persecution and conflict.

Will NLLB work to reduce harm? It might, but there are reasons for scepticism. 

If NLLB is applied to moderation and upholding community standards, rather than just translating the messages and posts that appear on the platform, it could help moderators to catch egregious examples of hate speech and misinformation.

But artificial intelligence is not especially effective at detecting nuance, and large models lack training data in lesser-used languages.

“AI enabled content moderation is using the AI as a tool to assist humans in doing their job, it is still a collaborative effort,” Hannah Rose Kirk, an AI and online safety researcher at the Alan Turing Institute, told us.

“You can’t just build an AI that’s capable of parsing or translating all of these different languages and deploy it to do all the moderation when you don’t also have the human infrastructure to supervise that process”.

“If people thought that Meta just had models deployed for moderation without humans to supervise those judgments, there would be a lot of pushback, and a lot of edge cases and cultural context would be missed.” 

In 2019, Facebook developers weren’t optimistic about the potential for automation to fix translation problems in other languages. 

A leaked document showed that 4,000 manually labelled content reviews were needed every day to train Facebook’s “immature” algorithms. “They need teachers (human reviewers) to grow,” one engineer wrote, saying that without further training, Facebook’s automated moderation was not effective.

Guy Rosen, Meta’s Chief Information Security Officer, has said that “hate speech is difficult to detect because of the complex nuances of speech” but defended the company over claims by a former employee that Facebook catches less than 5 per cent of hate speech on the platform.

“Platforms are concerned, because they have users all over the world speaking these low resource languages, meaning there aren’t data sets and algorithmic tools that exist already to help moderators understand. So it’s a big concern, and hiring moderators who speak all the languages is a really difficult problem,” Kyle Dent, head of ethics at Checkstep, a company using AI to provide moderation tools, told us.

Dent said the challenge is immense, but is optimistic about NLLB, and the fact that Facebook has made it accessible to other researchers as an open source tool. 

“Nobody should take away the idea that now this is solved for all these languages, but it is certainly progress”.


Google: Abortion-related data

Google is taking action following the end of Roe v. Wade. Google announced that it would automatically erase the location tracking data on phones that had been near a “sensitive medical location” such as an abortion clinic. It also said it would limit the ability of companies to acquire data about the other apps installed on devices, including period tracking, pregnancy and family planning apps, the FT has reported. Many women’s rights groups are concerned that those seeking abortions could be subjected to data surveillance, with companies still able to harvest their data and pass it on to third parties, or to the police. 


Apple: Lockdown mode

Last week, Apple unveiled a new Lockdown Mode. The feature lets users activate “extreme” protections for iPhones, iPads and Macs by shutting off web browsing, stopping incoming calls from unrecognised numbers and pausing any accessory connections that the device might have been making. Effectively, it closes off the device’s hackable surface, reducing the risk of a security breach. Apple is in a position to take user privacy seriously. Unlike Meta and Google, Apple is not as dependent on advertising revenue and data sharing – or as wedded to practices that often jeopardise user privacy. Every day, 2 billion people around the world use Apple devices. Apple isn’t just a platform, but a physical product provider.


Microsoft: Thriving metrics

Microsoft is going deep. It’s no longer concerned only with bolstering employee engagement, but also with making sure employees’ lives are “meaningfully” lived. The tech state has zeroed in on a new way of measuring “thriving” – which it has described as its new north star. Now it asks workers for feedback on the “five Ps”: pay, perks, people, pride and purpose. This could be dismissed as meaningless woke management speak, but it does recognise something important: employees want more from their companies now than ever before.


Meta: Whistleblower retaliation

Does Facebook ever really delete anything? Meta has been accused of secretly keeping users’ deleted Messenger data and then sharing it with police. The claim is made by former employee Brennan Lawson, who is suing the tech state for whistleblower retaliation, alleging he was fired after raising concerns about the tool’s legality. More broadly, he says that he – like others – was forced to watch brutal content as a “risk and response” specialist without Meta properly protecting his mental health. Facebook disputes his claim – but it’s one to watch.


Amazon: Dark patterns

The days of digital “dark patterns” might be coming to an end. An EU ruling could soon enable users to cancel their Amazon Prime subscriptions with just two clicks, rather than navigate the “dark patterns” of manipulative design intended to keep them from cancelling. This is bad news for the tech state – just as the cost-of-living crisis has seen millions of people reconsider their subscription commitments. Didier Reynders, the EU’s justice commissioner, announced after the ruling that “consumers must be able to exercise their rights without any pressure from platforms”. The UK is considering plans that would empower the Competition and Markets Authority to fine businesses that lock their consumers in “subscription traps”.


Tencent: Lower hand

Xi Jinping, China’s leader, said in 2013: “whoever controls data has the upper hand”. This philosophy has shaped the Chinese Communist Party’s approach to digital regulation ever since – collect, control and consolidate data. Tencent has been the biggest victim of the government crackdown on the growth and freedom of consumer data-driven companies in China. As Xi’s data regime has bedded in, Tencent has been repeatedly fined for its use of data and records of transactions. Shares in the company dropped this week after a fresh round of penalties was imposed by China’s State Administration for Market Regulation. Tencent is being driven into areas that are more appealing to the Party – like chip design, artificial intelligence and autonomous systems design – and away from gaming and social media.

Thanks for reading,

Luke Gbedemah
@LukeGbedemah

Alexi Mostrous
@AlexiMostrous