Engineering 5 (2019) 1179–1192

Research
Cybersecurity—Article
Privacy Computing: Concept, Computing Framework, and Future Development Trends
Fenghua Li a,b, Hui Li c,*, Ben Niu a, Jinjun Chen d

a Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China
b School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China
c State Key Laboratory of Integrated Services Networks, School of Cyber Engineering, Xidian University, Xi'an 710071, China
d Department of Computer Science and Software Engineering, Swinburne University of Technology, Hawthorn, VIC 3122, Australia

* Corresponding author. E-mail address: lihui@mail.xidian.edu.cn (H. Li).
Article info

Article history: Received 15 December 2018; Revised 20 March 2019; Accepted 19 April 2019; Available online 6 September 2019

Keywords: Privacy computing; Private information description; Privacy metric; Evaluation of the privacy-preserving effect; Privacy computing language

Abstract

With the rapid development of information technology and the continuous evolution of personalized services, huge amounts of data are accumulated by large internet companies in the process of serving users. Moreover, dynamic data interactions increase the intentional/unintentional persistence of private information in different information systems. However, problems such as the cask principle of preserving private information among different information systems and the difficulty of tracing the source of privacy violations are becoming increasingly serious. Therefore, existing privacy-preserving schemes cannot provide systematic privacy preservation. In this paper, we examine the links of the information life-cycle, such as information collection, storage, processing, distribution, and destruction. We then propose a theory of privacy computing and a key technology system that includes a privacy computing framework, a formal definition of privacy computing, four principles that should be followed in privacy computing, algorithm design criteria, evaluation of the privacy-preserving effect, and a privacy computing language. Finally, we employ four application scenarios to describe the universal application of privacy computing, and discuss the prospect of future research trends. This work is expected to guide theoretical research on user privacy preservation within open environments.

© 2019 THE AUTHORS. Published by Elsevier LTD on behalf of Chinese Academy of Engineering and Higher Education Press Limited Company. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

In recent years, information technology and mobile communication technology have been closely integrated and rapidly developed, and the software and hardware of smart devices have been continuously upgraded. These technologies have promoted the development of the internet, the mobile internet, cloud computing, big data, and the Internet of Things. At the same time, a variety of new service models have greatly improved the quality of life; these include the e-commerce services represented by Amazon and Taobao, the social network services represented by Facebook and WeChat, and the ride-hailing services represented by Uber and Didi.

However, the emergence and rapid development of new technologies and new service modes lead to a common situation in which a massive amount of users' personal information interacts across information systems, digital ecosystems, and even national network boundaries. In each step of the whole information life-cycle, such as collection, storage, processing, release (including exchange), and destruction, users' personal information is inevitably retained in various information systems. This leads to the separation of the ownership, management, and utilization rights of information, which seriously threatens users' rights to consent, to be erased/forgotten, and to extend authorization. Furthermore, the lack of effective monitoring technology makes the tracing and forensics of privacy invasion difficult.
Most existing privacy-preserving schemes focus on relatively isolated application scenarios and technical points, and propose solutions to specific problems within a given application scenario. While a privacy-preserving scheme based on access control technology is suitable for a single information system, the problem of privacy preservation in metadata storage and publishing remains unsolved. Similarly, a privacy-preserving scheme based on cryptography is only applicable to a single information system. Although the implementation of key management with the help of trusted third parties can realize the exchange of private information between multiple information systems, users' deletion right and extended authorization after the exchange remain unsolved. A privacy-preserving scheme based on generalization, confusion, and anonymity technologies distorts the data, making it impossible to restore, and can therefore be applied to many scenarios, such as anonymizing data with one operation or with multiple operations to obtain an increasing level of privacy preservation. However, this kind of privacy-preserving scheme reduces the utility of the data, which leads to the adoption of weaker privacy-preserving schemes in actual information systems, or to the simultaneous storage of the original data. At present, a description method and a computing model that can integrate private information with the demand for privacy preservation are unavailable, and we lack a computing architecture to protect privacy on demand in complex application scenarios, such as private information exchange across information systems, private information sharing with multi-service requirements, and dynamic anonymization of private information.

In brief, existing privacy-preserving technologies cannot meet the privacy preservation requirements of complex information systems, which leads to unsolved privacy-preserving problems in typical application scenarios such as e-commerce and social networking. For this reason, we put forward a privacy computing theory and a key technology system for privacy preservation. The main technical contributions are as follows:

• For the first time, we propose a privacy computing theory and a key technology system for privacy preservation. This is done from the perspective of whole life-cycle preservation of private information, in order to answer the demand for systematic privacy preservation in complex application scenarios.
• We provide a general framework for privacy computing, including a concept and formal definition of privacy computing, four principles of the privacy computing framework, algorithm design criteria, evaluation of the privacy-preserving effect, and a privacy computing language.
• We introduce four cases to verify the effectiveness of our proposed framework and to demonstrate how the framework implements privacy preservation and traces evidence when a privacy invasion occurs.

The remainder of this paper is organized as follows: Section 2 describes related work, while the concept and key technologies of our privacy computing are introduced in Section 3. We utilize four scenarios to describe the ubiquitous application of our privacy computing framework in Section 4, and look forward to future research directions in privacy computing and unsolved problems in Section 5. We conclude our paper in Section 6.

2. Related work

Existing research on privacy preservation mainly focuses on the privacy-preserving techniques of data processing, and on privacy measurement and evaluation.

2.1. Privacy-preserving techniques of data processing

Research on privacy preservation has been conducted on all stages of the information life-cycle, including information collection, storage, processing, release, and destruction. In addition, based on access control, information confusion, and cryptography technologies, numerous privacy-preserving schemes have been proposed for typical scenarios such as social networking, location-based services, and cloud computing.

Access control technology protects private information by creating access strategies to ensure that only authorized subjects can access the data resource. In recent years, multiple privacy-preserving techniques based on access control technology have been presented. Scherzer et al. [1] proposed a high-assurance smart card privacy-preserving scheme with mandatory access controls (MACs) [2,3], and Slamanig [4] proposed a privacy-preserving framework for outsourced data storage based on discretionary access control (DAC) [5,6]. In order to improve the effectiveness of authority management, Sandhu et al. [7] presented role-based access control (RBAC). In RBAC, a user is mapped to a specific role in order to obtain the corresponding access authority, which greatly simplifies authority management in complicated scenarios. Dafa-Alla et al. [8] designed a privacy-preserving data-mining scheme with RBAC for multiple scenarios. In 2018, Li et al. [9] proposed a novel cyberspace-oriented access control (CoAC) model, which can effectively prevent security problems caused by the separation of data ownership and management rights and by secondary/multiple forwarding of information, by comprehensively considering vital factors such as the access requesting entity, general time and state, access point, device, networks, resources, network interactive graph, and chain of resource transmission. Based on this model, they proposed a scenario-based access control method called HideMe [10] for privacy-aware users in photo-sharing applications, in which a scenario is carefully defined based on a combination of factors such as temporal, spatial, and sharing-behavior factors. In addition, attribute-based encryption [11,12] transforms the identity of the user into a series of attributes, and the attribute information is embedded in the encryption and decryption process so that the public key cryptosystem has the ability of fine-grained access control. Shao et al. [13] achieved fine-grained access control with attribute-based encryption, and protected the user's location privacy in location-based services.
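The following minimal Python sketch is our own illustration of the role-based idea described above (it is not taken from Refs. [7–10], and all role and permission names are hypothetical): permissions attach to roles, and a user obtains an access authority only through an assigned role.

```python
# Minimal RBAC sketch (illustrative only; names are hypothetical).
# Permissions are attached to roles, and users acquire permissions
# only through the roles they are assigned.
ROLE_PERMISSIONS = {
    "physician": {"read_record", "update_record"},
    "researcher": {"read_anonymized_record"},
}
USER_ROLES = {
    "alice": {"physician"},
    "bob": {"researcher"},
}

def is_authorized(user: str, permission: str) -> bool:
    """A subject may access a resource only if one of its roles grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set())
               for role in USER_ROLES.get(user, set()))

assert is_authorized("alice", "update_record")
assert not is_authorized("bob", "update_record")
```

Granting or revoking a capability then means editing a single role entry rather than every user, which is the management simplification that RBAC targets.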
Information confusion technology protects the original data with generalization, anonymity, or confusion, which prevents attackers from obtaining useful information from the modified data. Anonymity technologies such as k-anonymity [14–17], l-diversity [18,19], and t-closeness [20,21] achieve privacy preservation by masking the original data within a cloaking region. Differential privacy [22,23] is widely considered to be a privacy-preserving technology because it does not require assumptions about the attackers' background knowledge. To address the issue of similarity attacks, Dewri [24] proposed an anonymization algorithm that applies differential privacy technology to location-related data; this method is able to maximize the effectiveness of differential privacy. However, differential privacy must add a great deal of randomization to query results, and its utility drastically decreases with increasing privacy preservation requirements [25].

Cryptography technology protects users' private information through encryption techniques and trapdoor functions. In order to protect private data in cloud computing, the concept of homomorphic encryption was first proposed by Rivest et al. [26]. With homomorphic encryption, Zhu et al. [27] proposed a privacy-preserving spatial query framework for location-based services. In 1999, based on composite residuosity, Paillier [28] designed an additively homomorphic encryption algorithm, which is widely used in multiple scenarios. For smart grids, Lu et al. [29] proposed a privacy-preserving data aggregation scheme with the Paillier cryptosystem, which can protect users' sensitive information and resist various attacks. In 2009, Gentry [30] successfully constructed a fully homomorphic encryption (FHE) algorithm based on an ideal lattice [31]; this method achieves additive and multiplicative homomorphic encryption simultaneously. However, the efficiency of FHE is far from practical in the real world, even though many modified schemes [32–34] have been proposed in recent years. In order to improve efficiency, Zhu et al. [35] proposed an efficient and privacy-preserving point of interest (POI) query [36] scheme with a lightweight cosine similarity computing protocol for location-based services. The proposed scheme is highly efficient and can protect users' query and location information simultaneously. Other cryptography-based solutions [37,38] have also been proposed to enhance the privacy of the data owner in cloud computing scenarios.
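As a rough illustration of the additive homomorphism that Paillier-style schemes such as the aggregation in Ref. [29] rely on, the toy Python sketch below (our own simplification, with insecurely small parameters, not the cited constructions) shows that multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts.

```python
# Toy sketch of additively homomorphic encryption in the style of Paillier [28]:
# the product of two ciphertexts decrypts to the sum of the plaintexts.
# The tiny fixed primes below are for illustration only and offer no real security.
import math
import random

p, q = 293, 433                    # toy primes; real deployments use moduli of >= 2048 bits
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)       # lambda = lcm(p - 1, q - 1)
g = n + 1                          # common simplification g = n + 1
mu = pow(lam, -1, n)               # mu = lambda^(-1) mod n (valid for g = n + 1)

def encrypt(m: int) -> int:
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:     # fresh randomness r must be invertible modulo n
        r = random.randrange(2, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    return (((pow(c, lam, n2) - 1) // n) * mu) % n   # L(x) = (x - 1) // n

c1, c2 = encrypt(15), encrypt(27)
assert decrypt((c1 * c2) % n2) == 42   # E(15) * E(27) decrypts to 15 + 27
```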
The above-mentioned privacy-preserving schemes are concrete algorithms that mainly focus on a partial dataset of a specific scenario. As a result, they lack an algorithm framework for the dynamic dataset of a specific scenario, and, further, a universal algorithm framework for the dynamic datasets of multiple scenarios. Moreover, for multimedia data, it is necessary to combine multiple algorithms to achieve privacy preservation, and mature schemes in this area are insufficient. Finally, further research is needed on superimposing different privacy-preserving algorithms on each other in order to obtain better preservation quality.

2.2. Privacy measurement and evaluation

Specific research groups are now focusing on the field of information theory and its applications. Oya et al. [39] proposed a scheme using conditional entropy and mutual information as complementary privacy metrics. Ma and Yau [40] proposed a privacy metric for time-series data to quantify the amount of released information obtained by attackers. Cuff and Yu [41] used mutual information to describe the information obtained by attackers from observing released data, and measured the decrease in uncertainty of the original data. Jorgensen et al. [42] combined the controllable character of the privacy budget ε with differential privacy, and generated noise calibrated to Lap(Δf/ε) based on the privacy demands of the user, where Lap(·) is the Laplace distribution function and Δf is the sensitivity of the data. When ε decreases, the added noise increases, and the intensity of the privacy protection is higher. Asoodeh et al. [43] depicted the risk of privacy leakage with mutual information; they calculated the decrease in the uncertainty of the private information in the original data during the release of the data. Zhao and Wagner [44] used four novel criteria to evaluate the strength of 41 privacy metrics for vehicular networks. Their results show that there is no metric that carries across all criteria and traffic conditions.
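To illustrate the calibration used in Ref. [42], the short sketch below (our own illustration assuming NumPy is available, not the authors' code) adds noise drawn from Lap(Δf/ε) to a query answer: a smaller ε or a larger sensitivity Δf yields a larger noise scale and hence stronger protection.

```python
import numpy as np

def laplace_mechanism(true_answer: float, sensitivity: float, epsilon: float) -> float:
    """Return the query answer perturbed with noise drawn from Lap(sensitivity / epsilon)."""
    scale = sensitivity / epsilon          # smaller epsilon -> larger scale -> more noise
    return true_answer + np.random.laplace(loc=0.0, scale=scale)

# A counting query changes by at most 1 when one record is added or removed,
# so its sensitivity is 1; epsilon = 0.1 gives fairly strong protection.
noisy_count = laplace_mechanism(true_answer=1274.0, sensitivity=1.0, epsilon=0.1)
```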
Furthermore, research on application fields mainly focuses on social networking, location-based services, cloud computing, and so forth.

In the field of social networking, with a focus on web page searching, Gervais et al. [45] proposed a privacy-preserving scheme based on an obfuscation technique, and quantified the users' privacy. Considering the different searching behaviors of users with various intentions, they designed a general-purpose tool to measure their obfuscation-based privacy-preserving scheme. Aiming at spatiotemporal correlation, Cao et al. [46] analyzed and computed over the data and quantified the potential risks under a differential privacy technique through a formal description of privacy. With a focus on mobile crowd sensing, Luo et al. [47] proposed using the Salus algorithm, which preserves differential privacy, to protect private data against data-reconstruction attacks. They also quantified privacy risks, and provided accurate utility predictions for crowd-sensing applications incorporating Salus. For the scenario of social recommendation, Yang et al. [48] proposed PrivRank, a framework that protects users from inference attacks while guaranteeing personalized ranking-based recommendations. They utilized Kendall's τ rank distance to measure data distortion, and minimized privacy leakage by means of optimal data obfuscation learning.

In the field of location-based services, with the goal of identifying the attacking model and the adversaries' background knowledge, Shokri et al. [49] used information entropy to describe the precision, certainty, and validity for measuring the effectiveness of privacy preservation. Based on the Bayesian Stackelberg model of game theory [50], in which the user acts as the leader and the attacker acts as the follower, a game-theoretic model is formed. Kiekintveld et al. [51] proposed a framework to find the optimal privacy mechanism that is able to resist the strongest inference attack. Recently, Zhao et al. [52] proposed a privacy-preserving paradigm-driven framework for indoor localization (P3-LOC). This framework utilizes specially designed k-anonymity and differential privacy techniques to protect the data transmitted in their indoor localization system, which guarantees both the users' location privacy and the location server's data privacy. Zhang et al. [53] proposed a location privacy-preserving approach that uses power allocation strategies to prevent eavesdropping. Based on their highly accurate approximate algorithms, different power-allocation strategies were able to achieve a better tradeoff between localization accuracy and privacy.

In the field of cloud computing, SAFE [54], a service-oriented privacy-preserving framework, implements secure coordination for cross-domain interactions between protocols and ontologies in cloud computing. Based on game theory and differential privacy, Wu et al. [55] quantified the game elements related to the users at multiple levels, and implemented users' privacy measurement by analyzing a single dataset. Zhang et al. [56] used a differentiation-based definition to quantify the privacy level of participating users, and then implemented an accurate incentive mechanism. To preserve the data owner's privacy in the cloud, Chaudhari and Das [57] presented a single-keyword-based searchable encryption scheme for applications in which multiple data owners upload their data and multiple users access the data.

Most of the above-mentioned schemes lack a unified definition of the concept of privacy. Moreover, the privacy metric varies dynamically with the subject receiving the information, the amount of data held, and the scenario, and a dynamic privacy metric method is currently lacking. Furthermore, information is disseminated across information systems, but the above schemes lack consistency among different information systems and also lack a formalized description method for dynamic privacy quantization. Therefore, they are far from satisfying the dynamic requirements of the privacy preservation of cross-platform private information exchange, extended authorization, and so on.

In summary, existing privacy-preserving technologies and privacy measurement methods are fragmented, and lack a formalized description method for the auditing of private information operations and constraint conditions. A scheme that integrates privacy preservation with the tracking and forensics of privacy infringement has not yet been considered. In addition, it is difficult to construct a unified technical system that covers all the stages of information collection, storage, processing, release, destruction, and so on.

3. Definition and framework of privacy computing

3.1. Concepts of privacy and privacy computing

3.1.1. Privacy right and private information

The legal definition of privacy emphasizes protecting an individual's rights according to the law, and includes the requirement that personal information, activities, and spaces cannot be published, interfered with, or intruded upon illegally. It emphasizes the independence of privacy from public interests and group interests, including personal information that a person does not want others to know, personal affairs that a person does not want others to touch, and a personal area that a person does not want others to invade. The essence of the legal definition is, in fact, privacy rights.

This paper focuses on the full-life-cycle preservation of private information. More specifically, private information includes personal information that a person does not want others to know or that is inappropriate for others to know, as well as personal information that a person only wants disseminated within an approved circle of people and in a way he/she agrees with. Private information can be used to deduce a user's profile, which may impact his/her daily life and normal work.

Academically speaking, private information is closely related to the spatiotemporal scenario and the cognitive ability of the subject, and it presents dynamic perceptual results. Unlike the definition of privacy in the law, we mainly define and describe private information technically, in order to support research on various technical aspects such as the semantic understanding of privacy, private information extraction, the design of privacy-preserving algorithms, the evaluation of privacy-preserving effectiveness, and so forth.
3.1.2. Privacy computing

In general, privacy computing refers to a computing theory and methodology that supports the describing, measuring, evaluating, and integrating operations performed on private information during the processing of video, audio, image, graph, numerical value, and behavioral information flows in a pervasive network. It comprises a set of symbolized and formalized privacy computing theories, algorithms, and application technologies with quantitative assessment standards and support for the integration of multiple information systems.

Privacy computing includes all computing operations by information owners, collectors, publishers, and users during the entire life-cycle of private information, from data generation, sensing, publishing, and dissemination, to data storage, processing, usage, and destruction. It supports privacy preservation for a massive number of users, with high concurrency and high efficiency. In a ubiquitous network, privacy computing provides an important theoretical foundation for privacy preservation.

From the perspective of full-life-cycle privacy preservation, we have constructed a framework of privacy computing, which is shown in Fig. 1. With the input of plaintext information M in any format, our framework first separates the whole process into a set of elements, as follows: semantic extraction, scenario extraction, private information transformation, integration of private information, privacy operation selection, privacy-preserving scheme selection/design, evaluation of the privacy-preserving effect, scenario description, and feedback mechanism. We further implement the privacy computing framework by carefully organizing these elements into five steps, listed below.

Fig. 1. Privacy computing framework. F: privacy computing operation set; A: privacy attribute vector; Γ: location information set; Ω: audit control information set; Θ: constraint condition set; Ψ: dissemination control operation set; X: normalized private information; f: privacy computing operation; f(X): normalized private information after the operation.

Step 1: Extract private information. According to the format and semantics of plaintext information M, we first extract private information X and obtain the private information vector I; the details can be found in Section 3.2.

Step 2: Abstract the scenario. According to the type and semantics of each private information element i_k in I, we then define and abstract the application scenario. The extracted private information should also be re-organized through transformation and integration.

Step 3: Select the privacy operation. According to the privacy operations supported by each i_k, we select and generate a dissemination control operation set.

Step 4: Select or design the privacy-preserving scheme. According to the application requirements, we select or design an appropriate privacy-preserving scheme. If capable schemes are available, they can be selected directly; otherwise, we have to design new schemes.

Step 5: Evaluate the privacy-preserving effectiveness. According to relevant assessment criteria, we assess the privacy-preservation effectiveness of the selected privacy-preserving scheme, by employing measurements such as an entropy-based or distortion-based privacy metric. Details on assessing privacy-preservation effectiveness can be found in Section 3.5.

If the evaluation result of the privacy preservation does not meet the expected requirements, the feedback mechanism is executed. This mechanism covers three situations: ① if the application scenario is mis-abstracted, it should be re-abstracted iteratively; ② if the application scenario is abstracted properly but the privacy operation is selected improperly, the privacy operation should be re-organized; or ③ if the application scenario and privacy operation are selected correctly, the privacy-preserving scheme should be adjusted or improved to eventually achieve a satisfactory effectiveness of privacy preservation. It is notable that these elements and steps can be combined freely according to the specific scenario; the process is depicted in Fig. 1.
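As a reading aid for the five steps and the feedback mechanism above, the Python skeleton below sketches one possible way to orchestrate the loop in Fig. 1. It is only an illustration of the control flow under our own assumptions; every parameter name (extract, abstract_scenario, select_ops, and so on) is a hypothetical placeholder rather than part of a published implementation.

```python
from typing import Any, Callable

def privacy_computing_pipeline(
    M: Any,
    extract: Callable,            # Step 1: M -> private information vector I
    abstract_scenario: Callable,  # Step 2: I -> application scenario
    select_ops: Callable,         # Step 3: (I, scenario) -> dissemination control operation set
    choose_scheme: Callable,      # Step 4: (I, scenario, ops) -> privacy-preserving scheme
    evaluate: Callable,           # Step 5: (scheme, I) -> dict of evaluation results
    adjust_scheme: Callable,      # used by feedback case 3
    max_rounds: int = 10,
):
    """Illustrative skeleton of Steps 1-5 and the three-branch feedback mechanism of Fig. 1."""
    I = extract(M)
    scenario = abstract_scenario(I)
    ops = select_ops(I, scenario)
    scheme = choose_scheme(I, scenario, ops)

    for _ in range(max_rounds):
        result = evaluate(scheme, I)
        if result["meets_requirements"]:
            break
        if result["scenario_mis_abstracted"]:      # feedback case 1: re-abstract the scenario
            scenario = abstract_scenario(I)
            ops = select_ops(I, scenario)
            scheme = choose_scheme(I, scenario, ops)
        elif result["operation_improper"]:         # feedback case 2: re-organize the privacy operations
            ops = select_ops(I, scenario)
            scheme = choose_scheme(I, scenario, ops)
        else:                                      # feedback case 3: adjust or improve the scheme
            scheme = adjust_scheme(scheme, result)
    return scheme
```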
3.2. Formalization of privacy computing

In this section, we first define private information X and its six basic components, along with related axioms, theorems, and assumptions; these provide a foundation for describing the other parts of privacy computing. It is noted that extraction methods for the private information vector of any information M are outside the scope of this paper, as they are subject to domain-specific extraction conditions. The quantification of the private information contained in content is also outside the scope of this paper, as it is the task of the programmer or modeler of the information system.

Definition 1: Private information X consists of six components, ⟨I, A, Γ, Ω, Θ, Ψ⟩: namely, the private information vector, the privacy attribute vector, the location information set, the audit control information set, the constraint condition set, and the dissemination control operation set, respectively.

Definition 2: The private information vector I = (I_ID, i_1, i_2, ..., i_k, ..., i_n), where i_k (1 ≤ k ≤ n) is a private information element. Each i_k represents semantically informative and indivisible atomic information. Information types include text, audio, video, images, and so forth, and/or any combination of these types. Semantic characteristics include words, phrases, tone of voice, pitch, phonemes, sounds, frames, pixels, colors, and so forth, and/or their combinations. These are used to represent atomic information that is semantically informative, indivisible, and mutually disjoint within information M. I_ID is the unique identifier of the private information vector, and is independent of the private information elements. For example, in the text "U1 and U2 went to Loc to drink beer," the private information vector is I = (I_ID, i_1, i_2, i_3, i_4, i_5, i_6, i_7) = (I_ID, U1, and, U2, went to, Loc, to drink, beer). In this case, n = 7. Note that certain special pieces of information, such as proverbs, can be effectively divided by natural language processing-based solutions.

Axiom 1: Within a natural language and its grammar rules, and at the granularity of words, phrases, and slang, the number of elements of the private information vector I is bounded.

Property 1: The private information vector conforms to the first normal form (1NF) and the second normal form (2NF). A private information component i_k is defined as the smallest granularity that cannot be divided further, which is called the atomic property. 1NF is a property of a relation in a relational database: a relation R is in 1NF if and only if the domain of each attribute contains only atomic values, and each attribute can only have a single value from that domain. Under this definition, i_k conforms to 1NF. Meanwhile, the private information vector I has the unique identifier I_ID as its primary key, and all other non-primary attributes are dependent on this primary key. A relation R is in 2NF if R ∈ 1NF and every non-primary attribute of the relation is dependent on the unique primary key. Therefore, i_k conforms to 2NF.
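The following toy Python sketch is our own illustration of how the six components of Definition 1 and the worked example of Definition 2 might be represented in a program; the field types are simplifying assumptions made for illustration, not part of the formal model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

@dataclass
class PrivateInformation:
    """Toy container for the six components of Definition 1: <I, A, Gamma, Omega, Theta, Psi>."""
    I: List[str]                                        # private information vector (elements i_1 ... i_n)
    IID: str                                            # unique identifier of the private information vector
    A: List[float] = field(default_factory=list)        # privacy attribute vector (Definition 4)
    Gamma: Set[str] = field(default_factory=set)        # location information set
    Omega: List[str] = field(default_factory=list)      # audit control information set
    Theta: List[Dict] = field(default_factory=list)     # constraint condition set (Definition 3)
    Psi: Set[str] = field(default_factory=set)          # dissemination control operation set

# The worked example of Definition 2: "U1 and U2 went to Loc to drink beer", n = 7.
X = PrivateInformation(
    IID="msg-001",
    I=["U1", "and", "U2", "went to", "Loc", "to drink", "beer"],
)
assert len(X.I) == 7
```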
Definition 3: The constraint condition set is denoted by Θ = {θ_1, θ_2, ..., θ_k, ..., θ_n}, where θ_k (1 ≤ k ≤ n) is the constraint condition vector corresponding to the private information component i_k. θ_k describes the permissions necessary for an entity to access i_k in different scenarios, such as who may access and use the private information vector, at what time, using what devices, by what means of access, and for how long. Only entities that satisfy the constraint condition vector θ_k can access the private information component i_k. An entity can be an owner, a receiver, or a publisher of the information.

Definition 4: The privacy attribute vector is A = (a_1, a_2, ..., a_k, ..., a_n, a_{n+1}, ..., a_m), where a_k denotes a privacy attribute component and is used to measure the degree of private information preservation. In practical applications, different private information components can form weighted dynamic combinations in different scenarios, and these combinations produce new private information. Based on the atomicity of the private information components, we represent the preservation degree of different combinations of i_k with privacy attribute components as well. When 1 ≤ k ≤ n, there is a one-to-one correspondence between a_k and i_k; when n < k ≤ m, a_k represents the preservation degree of a combination of two or more private information components.

We set a_k ∈ [0, 1]. a_k = 0 indicates that the private information component i_k has the highest degree of preservation; under this condition, information M is not shared at all, there is no possibility of any leakage, and the mutual information between the protected private information and the original private information is 0. For example, in cryptography-based privacy-preserving methods, a_k = 0 means that the secret key has been lost and the information cannot be recovered; in cases in which noise injection, anonymization, or other irreversible techniques have been applied, a_k = 0 indicates that the degree of distortion has made the processed information completely irrelevant to the initial information. a_k = 1 indicates that i_k is not protected and can be published freely without limit. Values between 0 and 1 represent different degrees of private information preservation: the lower the value, the higher the degree of preservation.

The privacy-preserving quantitative operation function is denoted by σ; this can be a manually labelled function, a weighting function, and so forth. Since different types of private information i_k correspond to different kinds of operation functions, the resulting privacy attribute components are also different, and are expressed as a_k = σ(i_k, θ_k), where 1 ≤ k ≤ n. Any combination of the private information components i_1, i_2, ..., i_n is denoted as i_{n+j} = i_{k1} ∨ i_{k2} ∨ ⋯ ∨ i_{ks}, where ∨ stands for the combination operation of private information components. Given the privacy-preserving quantitative operation function σ and the combination i_{n+j}, we have a_{n+j} = σ(i_{n+j}, θ_{k1}, θ_{k2}, ..., θ_{ks}), where 1 ≤ k1 < k2 < ⋯ < ks ≤ n.
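As a toy illustration of Definition 4 (our own sketch, not the authors' method), σ can be realized as the manually labelled function named above: a lookup that assigns a value in [0, 1] to a single component or to a combination of components. In this hypothetical labelling, a combination such as identity plus location is given a lower value (stronger protection requirement) than either component alone, since the combination reveals more.

```python
# Toy realization of the quantification function sigma as a manually labelled table
# (one of the options named in Definition 4). Keys are component combinations; values
# are privacy attribute components in [0, 1]: 0 = fully protected, 1 = freely publishable.
MANUAL_LABELS = {
    ("U1",): 0.3,          # a user identity on its own
    ("Loc",): 0.4,         # a location on its own
    ("U1", "Loc"): 0.1,    # identity combined with location reveals more, so it is labelled lower
}

def sigma(components: tuple, constraints=None) -> float:
    """sigma(i_k, theta_k) for this toy example; the constraint vectors theta are accepted
    but not used in this simplified sketch."""
    return MANUAL_LABELS.get(components, 1.0)   # unlabelled components default to "unprotected"

a_1 = sigma(("U1",))                 # a_k = sigma(i_k, theta_k) for a single component
a_5 = sigma(("Loc",))
a_combo = sigma(("U1", "Loc"))       # a_{n+j} = sigma(i_{n+j}, theta_k1, ..., theta_ks)
assert a_combo < min(a_1, a_5)       # the combination requires stronger protection
```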