RAND - Bad Actors in News Reporting - Tracking News Manipulation by State Actors (English) - 2021.11-20 final version.docx
BAD ACTORS IN NEWS REPORTING: TRACKING NEWS MANIPULATION BY STATE ACTORS

Christian Johnson, William Marcellino

The global spread of the coronavirus disease 2019 (COVID-19) created fertile ground for attempts to influence and destabilize different populations and countries. In response to this, RAND Corporation researchers conducted a proof-of-concept study for detecting these efforts at scale. Marrying a large-scale collection pipeline for global news with machine-learning and data analysis workflows, the RAND team found that both Russia and China appear to have employed information manipulation during the COVID-19 pandemic in service to their respective global agendas. This report is the second in a series of two reports; the first (Matthews, Migacheva, and Brown, 2021) focused on qualitative and descriptive analysis of the same data referred to in this report. Here, we describe our analytic workflows for detecting and documenting state-sponsored malign and subversive information efforts, and we report quantitative results that support the qualitative findings from the first report.

Introduction

As part of our analysis, we searched for both differences and similarities in the topics discussed by Russian, Chinese, and Western news media, and we found that conspiracy theories and geopolitical posturing were relatively common in Russian and Chinese news articles compared with Western (U.S. and UK) articles. The work we describe here lays the foundation for a robust protective capability that detects and sheds light on state actor information manipulation and misconduct in the global arena.

Disinformation, Propaganda, and Truth Decay

The world is experiencing a crisis related to disagreements over the established truth, a phenomenon that RAND refers to as Truth Decay: a shift in public discourse away from facts and analysis that is caused by four interrelated drivers (Rich and Kavanagh, 2018):

1. an increasing disagreement about facts and analytical interpretations of facts and data
2. a blurring of the line between opinion and fact
3. an increasing relative volume, and resulting influence, of opinion and personal experience over fact
4.
a declining trust in formerly respected sources of factual information.

RAND www.rand.org

Truth Decay is a serious threat to both domestic U.S. and international security, one that is being exacerbated by malign efforts from a variety of national bad actors. These ill-intentioned efforts to misuse information are labeled many ways: readers might have seen these efforts labeled as disinformation, misinformation, fake news, and information operations. For clarity and consistency's sake, we use the definitions taken from Rich and Kavanagh, 2018, in the remainder of this paper. Our definition of conspiracy theories comes from Douglas et al., 2019. (See the Key Information Definitions box.)

KEY INFORMATION DEFINITIONS

Topic: Definition
Disinformation: False or misleading information spread intentionally, usually to achieve some political or economic objective, influence public attitudes, or hide the truth (a synonym for propaganda)
Misinformation: False or misleading information that is spread unintentionally, by error or mistake
Conspiracy theories: Information that attempts to explain the ultimate causes of significant social and political events and circumstances with claims of secret plots by two or more powerful actors
Fake news: Newspaper articles, television news shows, or other information disseminated through broadcast or social media that are intentionally based on falsehoods or that intentionally use misleading framing to offer a distorted narrative

News Manipulation from Both China and Russia

We found that during the COVID-19 pandemic, both Russia and China engaged in news manipulation that served their geopolitical goals.¹ Although English-language news media from both nations did engage in traditional reporting on COVID-19 (reporting on infection, death rates, and medical responses globally), they also conducted distinct media efforts that appear to be politically driven news manipulation. We found that Russian media advanced anti-U.S. conspiracy theories about the virus and that Chinese media advanced pro-China news that laundered Beijing's reputation in terms of COVID-19 response. Additionally, we found that early in the pandemic, Russian media supported China's effort to burnish its reputation.

In total, three main pillars of Chinese and Russian news about the COVID-19 pandemic were identified. First, unsurprisingly, Chinese and Russian news agencies reported on stories with broad interest, that is, news topics covered similarly by Western news agencies. Good examples of this pillar are articles describing the case numbers and deaths related to COVID-19.

The second pillar of news stories consists of articles that perform geopolitical reputation-laundering on behalf of Russia and China. Many Chinese news articles, for example, praise China's handling of the pandemic and highlight its donations of aid to foreign countries. Interestingly, Russian news praises China in a similar way. Russian news also appeared to downplay the original COVID-19 outbreak in Wuhan. (We consider the interaction between these different pillars later in this report.)

Finally, Russian and Chinese news agencies promoted conspiracy theories regarding COVID-19 and the public health measures implemented to contain it. Examples of news in this pillar are the suggestion that COVID-19 is a bioweapon or otherwise engineered in a laboratory or the idea that contact-tracing efforts are part of an effort by government and technology companies to track citizens.

The success of our proof-of-concept study supports the idea that existing, off-the-shelf natural language processing methods can be used to make sense of news reporting by nation, at a global scale. These methods, linked to a scalable infrastructure that ingests news from around the world, could create a U.S.-supported capability to detect news manipulation at the nation-state level. In place of attempts to identify individual news stories or sources that are unreliable, such a capability could make manipulation of the broader news landscape publicly visible. Automated summarizing of a nation's news output at an aggregate level would quickly uncover a manipulation effort: for example, the spreading of a conspiracy theory that contact-tracing programs are part of a government tracking effort. (This is a real example that Russian news sources spread and that our model detected.)

¹ By news manipulation, we mean that news articles were published to further the agenda of a state sponsor rather than to inform the public.
These articles are therefore subject to pressures beyond the standard editorial control of a news agency.

We have several reasons for choosing to focus our analysis on data aggregated at the nation-state level (as opposed to, for example, the individual news outlet level). First, we viewed this study as an extension of prior work looking at nation-state level disinformation efforts (Marcellino, Johnson, et al., 2020; Marcellino, Marcinek, et al., 2020). These prior works looked at nation-state actors engaged in broad disinformation efforts to interfere with elections, and we looked specifically at state manipulation of news media during a pandemic. Second, key features that present themselves only at the national level were of interest: Most importantly, the United States and United Kingdom have robust, independent presses while Russia and China exert state control over their news media.

A separate and equally compelling analysis would examine potential news disinformation within nations (for example, by partisan news sources in the United States). It is likely that such an analysis would find significant differences between individual outlets that are worth exploring, especially through the lens of political polarization in the United States; partisan news has previously been identified as a driver of Truth Decay (Rich and Kavanagh, 2018).

A potential limitation of this work is that we focused only on English-language articles. Russia and China are not majority English-speaking, so we are comparing news stories aimed at domestic audiences (U.S. and UK) with ones aimed at foreign audiences (Russian and Chinese). Insofar as the news outlets are trying to influence English-speaking people, however, we feel that they can be usefully compared. Cross-linguistic comparison of domestically oriented reporting is another potential line of future research.

Given the effectiveness of combining existing off-the-shelf methods in our report, a public system for monitoring global news that detects and describes global news themes by nation is plausible. Such a system could help guard against Truth Decay efforts from malicious state actors. The system also could analyze additional sources of data, such as social media posts, to understand both the narratives being pushed and which ones take hold. More insight could also be garnered by performing deeper analysis at the individual news agency level: Different online communities are likely to respond differently to similar news stories, depending on which source they originate from, for example. More discussion of such a news monitoring system can be found in the Discussion section.

Methodology

Identifying disinformation in a large, complex dataset is not a simple task. The word disinformation is a catchall term used to refer to an array of different phenomena: from "fake news," to opinion pieces masquerading as journalism, to legitimate news stories that heap inordinate attention on certain topics (while ignoring others). As described in the definitions box, disinformation is used to refer to the deliberate spreading of misleading or incorrect information; misinformation refers to honest but incorrect knowledge. However, the line between the two can sometimes be blurred; prior RAND work (Marcellino, Johnson, et al., 2020) showed that coordinated bot activity was likely used deliberately in the run-up to the 2020 U.S. presidential election to amplify authentic tweets and make them appear more popular than they really were (commonly called astroturfing) in an attempt to create a false impression of grassroots spread.

Our goal, therefore, was not to detect disinformation per se but to identify when and
how Russian and Chinese news media appear to be manipulated by forces outside the normal news cycle and editorial processes. Because our dataset featured many articles from a variety of U.S. and UK media, we make the key assumption that newsworthy stories will be covered by these Western outlets; instances in which Russian and Chinese media cover stories that are qualitatively different from those covered by Western media are worthy of more scrutiny to determine whether they could be part of a disinformation campaign.

Computational techniques have previously been used by researchers to study dissemination of fake news, particularly on Twitter. Grinberg et al., 2019, demonstrated that fake news in the lead-up to the 2016 U.S. presidential election was seen and shared primarily by a relatively small number of Twitter users, consisting primarily of both highly conservative and cyborg accounts.² Using a similar methodology, Lazer et al., 2020, found that the same conclusions essentially held true for the spread of fake news related to COVID-19. Marcellino, Johnson, et al., 2020, used a different methodology to determine that bot-like accounts likely played a significant role in spreading far-right conspiracy theories and disinformation leading up to the 2020 election. In short, the available research suggests that much of the disinformation on social media is spread by a relatively small number of malign users.

These studies have mostly examined metadata and derived features to draw their conclusions instead of studying the language of disinformation itself.³ This paper builds on existing research to study not only metadata about news, but the actual content of the news itself. We hoped that understanding the topical themes being spread via disinformation would lead to new insights that cannot be seen simply by looking at user engagement on social networks, such as Twitter.

The first report in this series identified several key markers of disinformation in Russian and Chinese news: conspiracy theories, geopolitical posturing, and anti-U.S. messaging. Although we hoped that a data-driven approach would replicate these findings, we sought to perform our analysis as blindly as possible; that is, we did not seek to confirm our suspicions and simply search the data to find conspiracy theories. Instead, we used algorithms to detect the dominant themes in the data and only then analyzed these themes to determine their content.

Our overall strategy, as mentioned earlier, rested on the idea that any disinformation published by Russian and Chinese news sources would be detectable because its content would differ meaningfully from the content in U.S. and UK news articles. Certainly, some differences in content are to be expected under a no-manipulation hypothesis: For example, Russian news sources might be more likely to cover stories about Eastern Europe than news from the United States, simply because of geographical proximity. However, we hypothesized that by inspecting these differences closely, we would be able to uncover patterns associated with manipulation. Ultimately, any differences between Western and non-Western news articles would also require human analysis to determine whether the differences were innocuous or malign.

Data Description

We used NewsAPI to collect all English-language articles from 43 news sources (nine of which are Russian, five Chinese, 27 U.S., and two UK) for the period January 1, 2020, through August 31, 2020, that featured either "coronavirus" or "COVID" in the text.⁴ This resulted in a total of 247,315 articles, the vast majority of them (230,865) from U.S./UK sources, with smaller numbers from Russian (14,309) and Chinese (2,141) sources. (We provide a more detailed breakdown of articles published by news outlet in the Appendix.)

For our search period, the overall frequency of published articles with either term mentioned grew rapidly through January and February, reaching a peak in March and April. Article frequency by country of origin is shown over time in Figure 1. A similar pattern was seen in publishing frequency over time across U.S./UK, Russian, and Chinese sources, although Russian news sources appeared to publish somewhat less frequently in mid to late February. More-detailed analysis of this apparent Russian slowdown is described later in this report.

² A cyborg account is one that mixes automated bot activity with real human tweets.
³
Derived features refers to such things as the presence of fake news URLs in a Twitter feed.
⁴ NewsAPI is an application programming interface that allows users to automatically connect to and search a large database of news articles, including newswire services (an important advantage over such rival sources as LexisNexis). RAND has built a scalable infrastructure to retrieve, store, query, and then analyze very large news article datasets. This scalable architecture is a powerful tool that allows us to gather an enormous amount of news data for analysis, but it also has a constraint: We can collect only news articles from sources covered by the service, which does not include sources that are behind paywalls or otherwise restricted in access. For our study, in particular, only nine Russian and five Chinese sources in English are covered by NewsAPI.

FIGURE 1
Article Frequency over Time in 2020
Series shown: U.S./UK news, Russian news, Chinese news
NOTE: The moving seven-day average publishing rate is overlaid on each source as a solid line. Note that the y-axis is logarithmically scaled; we have about an order of magnitude fewer Russian news articles than U.S./UK, and about an order of magnitude fewer Chinese articles than Russian.

Because the COVID-19 pandemic was such an impactful worldwide event, we were not surprised to find that news stories about many other topics, such as those that were dominantly about economic or political stories, were also represented in our dataset because they also referenced the pandemic in some way. However, a cursory examination of random articles in our dataset showed that the majority were focused on a different (nonpandemic) topic, although the pandemic played a significant role in many of these articles.

We decided to model how this assortment of different subjects varied across Russian, Chinese, and Western news media. If we could determine that certain topics were being discussed quite often by Russian or Chinese news but rarely by Western outlets, that would suggest the need for additional examination and might even be indicative of a malign effort to push certain narratives. Natural language processing, the branch of machine learning that dea