Lab Platform Governance, Media and Technology (PGMT)


Platforms overwhelmingly use automated content moderation, first DSA transparency reports show

November 6th 2023 was the first deadline for platforms categorised by the EU as “Very Large Online Platforms” (VLOPs) to deliver their transparency reports. The EU’s new Digital Services Act (DSA) requires VLOPs to publish these reports twice a year, containing standardized information about the number of users and aggregated numbers of content moderation decisions taken to mitigate potential “societal risks”. In addition to publishing numbers of content moderation decisions, the platforms need to describe the mechanisms they use to prevent these “risks”. Probably for the first time in the history of these tech giants, they are disclosing numbers about their use of automated moderation systems and human moderators. The current data covers April 2023 to September 2023.

Unsurprisingly, automated content moderation accounts for the major proportion of all user regulation decisions taken in the area of governance by platforms (Katzenbach, 2021; Gorwa et al., 2020). However, for the very first time, one can grasp just how big this proportion is.

We reviewed the transparency reports submitted by six platforms and analyzed the actions taken to remove content. Because decisions are largely based on each platform’s own policies (in most cases, community guidelines), there are no uniform categories to compare; for the moment, we have therefore compared aggregated data for the sake of comparability.

YouTube, in its report, only indicates “internal or external detection of content”, without specifying whether detection was automated or manual. Snapchat has likewise not provided numbers of automated versus human moderation decisions. Data for Facebook, Instagram, X, TikTok and LinkedIn is aggregated over the period they reported on. Data for X was counted from moderation decisions based on its community guidelines categories. Pinterest described most of its content moderation decisions as “hybrid”; they are included as “automated” in the following tables and graphs, since the process described relies mostly on automated moderation. The data for Pinterest includes only moderation decisions on “graphic violence and threats”, because Pinterest does not provide aggregated data across all categories.

| Moderation | Facebook   | Instagram  | X         | TikTok    | Pinterest | LinkedIn |
|------------|------------|------------|-----------|-----------|-----------|----------|
| Automated  | 43,870,765 | 75,113,462 | 1,449,607 | 1,800,826 | 2,413,403 | 34,966   |
| Human      | 2,827,041  | 1,184,951  | 507,506   | 2,199,134 | 408       | 80       |
| Total      | 46,697,806 | 76,298,413 | 1,957,113 | 3,999,960 | 2,413,811 | 35,046   |

Table 1. Number of automated and human content moderation decisions declared by platforms in the first DSA transparency reports.

| Moderation | Facebook | Instagram | X       | TikTok  | Pinterest | LinkedIn |
|------------|----------|-----------|---------|---------|-----------|----------|
| Automated  | 93.95%   | 98.45%    | 74.07%  | 45.02%  | 99.98%    | 99.77%   |
| Human      | 6.05%    | 1.55%     | 25.93%  | 54.98%  | 0.02%     | 0.23%    |
| Total      | 100.00%  | 100.00%   | 100.00% | 100.00% | 100.00%   | 100.00%  |

Table 2. Share of automated and human content moderation decisions (in %) declared by platforms in the first DSA transparency reports.
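The percentages in Table 2 follow directly from the raw counts in Table 1. As a minimal sketch in Python (with the counts copied from Table 1), the shares can be reproduced like this:

```python
# Reproduce Table 2 from Table 1: share of automated vs. human content
# moderation decisions per platform, as declared in the first DSA
# transparency reports (April-September 2023).
counts = {
    # platform: (automated decisions, human decisions)
    "Facebook":  (43_870_765, 2_827_041),
    "Instagram": (75_113_462, 1_184_951),
    "X":         (1_449_607,    507_506),
    "TikTok":    (1_800_826,  2_199_134),
    "Pinterest": (2_413_403,        408),
    "LinkedIn":  (    34_966,         80),
}

for platform, (automated, human) in counts.items():
    total = automated + human
    print(f"{platform:9s}  automated {automated / total:6.2%}  human {human / total:6.2%}")
```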

Figure 1: Share of automated and human content moderation decisions (in %) declared by platforms in the first DSA transparency reports.

The analysis shows that TikTok stands out for its high share of human content moderation (about 55% of its reported decisions), and with roughly 26% X (formerly Twitter) also shows a comparatively high human moderation rate, although the volume of moderated posts differs greatly between platforms.

How platforms report numbers on their human content moderators also varies by platform. Here, in addition to the six VLOPs presented above, we have also analyzed Snapchat’s data.

For example, Meta reported EU-related numbers of content moderators only for the EU member states’ official languages, as did TikTok and YouTube. X, however, also reported moderators for some global languages (such as Arabic and Hebrew), and Snapchat reported all the global moderators it has.

| Language   | Meta | X    | YouTube | TikTok | Pinterest | Snapchat | LinkedIn |
|------------|------|------|---------|--------|-----------|----------|----------|
| Bulgarian  | 20   | 2    | 9       | 69     | 0         | 0        | 0        |
| Croatian   | 19   | 1    | 24      | 20     | 0         | 0        | 0        |
| Czech      | 19   | 0    | 31      | 62     | 0         | 0        | 0        |
| Danish     | 17   | 0    | 94      | 20     | 2         | 3        | 0        |
| Dutch      | 54   | 1    | 24      | 167    | 0         | 24       | 9        |
| English    | 109  | 2294 | 1514    | 2213   | 145       | 1011     | N/A      |
| Estonian   | 3    | 0    | 7       | 6      | 0         | 0        | 0        |
| Finnish    | 15   | 0    | 15      | 40     | 0         | 12       | 0        |
| French     | 226  | 52   | 176     | 687    | 12        | 50       | 30       |
| German     | 242  | 81   | 231     | 869    | 2         | 72       | 20       |
| Greek      | 22   | 0    | 28      | 96     | 0         | 0        | 0        |
| Hungarian  | 24   | 0    | 25      | 63     | 0         | 0        | 0        |
| Irish      | 42   | 0    | 0       | 0      | 0         | 0        | 0        |
| Italian    | 179  | 2    | 91      | 439    | 0         | 18       | 13       |
| Latvian    | 2    | 1    | 11      | 9      | 1         | 0        | 0        |
| Lithuanian | 6    | 0    | 11      | 6      | 0         | 0        | 0        |
| Maltese    | 1    | 0    | 0       | 0      | 0         | 0        | 0        |
| Polish     | 65   | 1    | 99      | 208    | 0         | 41       | 0        |
| Portuguese | 58   | 41   | 464     | 754    | 3         | 63       | 3        |
| Romanian   | 35   | 0    | 34      | 167    | 0         | 4        | 0        |
| Slovak     | 11   | 0    | 54      | 4      | 0         | 0        | 0        |
| Slovenian  | 9    | 0    | 15      | 45     | 0         | 0        | 0        |
| Spanish    | 163  | 20   | 507     | 468    | 7         | 70       | 31       |
| Swedish    | 21   | 0    | 16      | 108    | 1         | 2        | 10       |
| Arabic     |      | 12   |         |        |           | 529      |          |
| Hebrew     |      | 2    |         |        |           | 1        |          |
| Hindi      |      |      |         |        |           | 29       |          |
| Indonesian |      |      |         |        |           | 7        |          |
| Japanese   |      |      |         |        |           | 4        |          |
| Mandarin   |      |      |         |        |           | 4        |          |
| Norwegian  |      |      |         |        |           | 32       |          |
| Punjabi    |      |      |         |        |           | 15       |          |
| Russian    |      |      |         |        |           | 10       |          |
| Tagalog    |      |      |         |        |           | 5        |          |
| Tamil      |      |      |         |        |           | 7        |          |
| Turkish    |      |      |         |        |           | 7        |          |
| Ukrainian  |      |      |         |        |           | 3        |          |

Table 3: Language proficiency of human moderators indicated by social media platforms in DSA Transparency reports. 

Figure 2: Pareto chart of the most common languages spoken by content moderators, as declared by all the included platforms in their DSA transparency reports in October 2023.

Figure 3: Stacked chart of language proficiency of content moderators on major social media platforms (English excluded) as declared in DSA Transparency reports in October 2023. 
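The ordering behind the Pareto chart is a straightforward aggregation: the declared headcounts for each language are summed across platforms, and languages are ranked by that total. A minimal Python sketch, using a few rows of Table 3 for illustration (LinkedIn’s unreported English figure is treated as zero):

```python
# Rank languages by the total number of declared moderators across platforms,
# as in the Pareto chart (Figure 2). Only a few rows of Table 3 are shown;
# the full table would be aggregated the same way.
from collections import Counter

# language: [Meta, X, YouTube, TikTok, Pinterest, Snapchat, LinkedIn]
moderators = {
    "English": [109, 2294, 1514, 2213, 145, 1011, 0],  # LinkedIn reported N/A
    "German":  [242,   81,  231,  869,   2,   72, 20],
    "French":  [226,   52,  176,  687,  12,   50, 30],
    "Maltese": [  1,    0,    0,    0,   0,    0,  0],
}

totals = Counter({language: sum(row) for language, row in moderators.items()})
for language, total in totals.most_common():
    print(f"{language:8s} {total:5d}")
```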

It is clear from the data that content moderators on social media platforms lack skills in certain languages, although this varies from one platform to another. Some languages (e.g. Maltese) are absent from almost all of them, and others are underrepresented. In addition, the DSA needs to include provisions for non-official languages, because at present one cannot see how content in, for example, Russian, Arabic or Hebrew is moderated on VLOPs in the EU.

Cover Photo Credits: Marvin Meyer / Unsplash

