Software Engineering Major Graduation Project: Foreign Literature Analysis
School code: 10128
Undergraduate Graduation Project: Foreign Literature Translation
English title: Software Database An Object-Oriented Perspective
Chinese title: 软件数据库的面向对象的视角
Student name: Song Lanlan (宋兰兰)
School: School of Information Engineering
Department: Department of Software Engineering
June 2013

A HISTORICAL PERSPECTIVE

From the earliest days of computers, storing and manipulating data have been a major application focus. The first general-purpose DBMS was designed by Charles Bachman at General Electric in the early 1960s and was called the Integrated Data Store. It formed the basis for the network data model, which was standardized by the Conference on Data Systems Languages (CODASYL) and strongly influenced database systems through the 1960s. Bachman was the first recipient of ACM's Turing Award (the computer science equivalent of a Nobel prize) for work in the database area; he received the award in 1973.

In the late 1960s, IBM developed the Information Management System (IMS) DBMS, used even today in many major installations. IMS formed the basis for an alternative data representation framework called the hierarchical data model. The SABRE system for making airline reservations was jointly developed by American Airlines and IBM around the same time, and it allowed several people to access the same data through a computer network. Interestingly, today the same SABRE system is used to power popular Web-based travel services such as Travelocity!

In 1970, Edgar Codd, at IBM's San Jose Research Laboratory, proposed a new data representation framework called the relational data model. This proved to be a watershed in the development of database systems: it sparked rapid development of several DBMSs based on the relational model, along with a rich body of theoretical results that placed the field on a firm foundation. Codd won the 1981 Turing Award for his seminal work. Database systems matured as an academic discipline, and the popularity of relational DBMSs changed the commercial landscape. Their benefits were widely recognized, and the use of DBMSs for managing corporate data became standard practice.

In the 1980s, the relational model consolidated its position as the dominant DBMS paradigm, and database systems continued to gain widespread use. The SQL query language for relational databases, developed as part of IBM's System R project, is now the standard query language. SQL was standardized in the late 1980s, and the current standard, SQL-92, was adopted by the American National Standards Institute (ANSI) and International Standards Organization (ISO). Arguably, the most widely used form of concurrent programming is the concurrent execution of database programs (called transactions). Users write programs as if they are to be run by themselves, and the responsibility for running them concurrently is given to the DBMS. James Gray won the 1999 Turing award for his contributions to the field of transaction management in a DBMS.

In the late 1980s and the 1990s, advances have been made in many areas of database systems. Considerable research has been carried out into more powerful query languages and richer data models, and there has been a big emphasis on supporting complex analysis of data from all parts of an enterprise. Several vendors (e.g., IBM's DB2, Oracle 8, Informix UDS) have extended their systems with the ability to store new data types such as images and text, and with the ability to ask more complex queries. Specialized systems have been developed by numerous vendors for creating data warehouses, consolidating data from several databases, and for carrying out specialized analysis.

An interesting phenomenon is the emergence of several enterprise resource planning (ERP) and management resource planning (MRP) packages, which add a substantial layer of application-oriented features on top of a DBMS. Widely used packages include systems from Baan, Oracle, PeopleSoft, SAP, and Siebel. These packages identify a set of common tasks (e.g., inventory management, human resources planning, financial analysis) encountered by a large number of organizations and provide a general application layer to carry out these tasks. The data is stored in a relational DBMS, and the application layer can be customized to different companies, leading to lower overall costs for the companies, compared to the cost of building the application layer from scratch.

Most significantly, perhaps, DBMSs have entered the Internet Age. While the first generation of Web sites stored their data exclusively in operating systems files, the use of a DBMS to store data that is accessed through a Web browser is becoming widespread. Queries are generated through Web-accessible forms and answers are formatted using a markup language such as HTML, in order to be easily displayed in a browser. All the
database vendors are adding features to their DBMS aimed at making it more suitable for deployment over the Internet.

Database management continues to gain importance as more and more data is brought on-line, and made ever more accessible through computer networking. Today the field is being driven by exciting visions such as multimedia databases, interactive video, digital libraries, a host of scientific projects such as the human genome mapping effort and NASA's Earth Observation System project, and the desire of companies to consolidate their decision-making processes and mine their data repositories for useful information about their businesses. Commercially, database management systems represent one of the largest and most vigorous market segments. Thus the study of database systems could prove to be richly rewarding in more ways than one!

INTRODUCTION TO PHYSICAL DATABASE DESIGN

Like all other aspects of database design, physical design must be guided by the nature of the data and its intended use. In particular, it is important to understand the typical workload that the database must support; the workload consists of a mix of queries and updates. Users also have certain requirements about how fast certain queries or updates must run or how many transactions must be processed per second. The workload description and users' performance requirements are the basis on which a number of decisions have to be made during physical database design.

To create a good physical database design and to tune the system for performance in response to evolving user requirements, the designer needs to understand the workings of a DBMS, especially the indexing and query processing techniques supported by the DBMS. If the database is expected to be accessed concurrently by many users, or is a distributed database, the task becomes more complicated, and other features of a DBMS come into play.

DATABASE WORKLOADS

The key to good physical design is arriving at an accurate description of the expected workload. A workload description includes the following elements:
1. A list of queries and their frequencies, as a fraction of all queries and updates.
2. A list of updates and their frequencies.
3. Performance goals for each type of query and update.

For each query in the workload, we must identify:
Which relations are accessed.
Which attributes are retained (in the SELECT clause).
Which attributes have selection or join conditions expressed on them (in the WHERE clause) and how selective these conditions are likely to be.

Similarly, for each update in the workload, we must identify:
Which attributes have selection or join conditions expressed on them (in the WHERE clause) and how selective these conditions are likely to be.
The type of update (INSERT, DELETE, or UPDATE) and the updated relation.
For UPDATE commands, the fields that are modified by the update.

Remember that queries and updates typically have parameters; for example, a debit or credit operation involves a particular account number. The values of these parameters determine the selectivity of selection and join conditions.

Updates have a query component that is used to find the target tuples. This component can benefit from a good physical design and the presence of indexes. On the other hand, updates typically require additional work to maintain indexes on the attributes that they modify. Thus, while queries can only benefit from the presence of an index, an index may either speed up or slow down a given update. Designers should keep this trade-off in mind when creating indexes.

NEED FOR DATABASE TUNING

Accurate, detailed workload information may be hard to come by while doing the initial design of the system. Consequently, tuning a database after it has been designed and deployed is important; we must refine the initial design in the light of actual usage patterns to obtain the best possible performance.

The distinction between database design and database tuning is somewhat arbitrary. We could consider the design process to be over once an initial conceptual schema is designed and a set of indexing and clustering decisions is made. Any subsequent changes to the conceptual schema or the indexes, say, would then be regarded as a tuning activity. Alternatively, we could consider some refinement of the conceptual schema (and physical design decisions affected by this refinement) to be part of the physical design process. Where we draw the line between design and tuning is not very important.

OVERVIEW OF DATABASE TUNING

After the initial phase of database design, actual use
of the database provides a valuable source of detailed information that can be used to refine the initial design. Many of the original assumptions about the expected workload can be replaced by observed usage patterns; in general, some of the initial workload specification will be validated, and some of it will turn out to be wrong. Initial guesses about the size of data can be replaced with actual statistics from the system catalogs (although this information will keep changing as the system evolves). Careful monitoring of queries can reveal unexpected problems; for example, the optimizer may not be using some indexes as intended to produce good plans.

Continued database tuning is important to get the best possible performance.

TUNING THE CONCEPTUAL SCHEMA

In the course of database design, we may realize that our current choice of relation schemas does not enable us to meet our performance objectives for the given workload with any (feasible) set of physical design choices. If so, we may have to redesign our conceptual schema (and re-examine physical design decisions that are affected by the changes that we make).

We may realize that a redesign is necessary during the initial design process or later, after the system has been in use for a while. Once a database has been designed and populated with data, changing the conceptual schema requires a significant effort in terms of mapping the contents of relations that are affected. Nonetheless, it may sometimes be necessary to revise the conceptual schema in light of experience with the system. We now consider the issues involved in conceptual schema (re)design from the point of view of performance.

Several options must be considered while tuning the conceptual schema:
We may decide to settle for a 3NF design instead of a BCNF design.
If there are two ways to decompose a given schema into 3NF or BCNF, our choice should be guided by the workload.
Sometimes we might decide to further decompose a relation that is already in BCNF.
In other situations we might denormalize. That is, we might choose to replace a collection of relations obtained by a decomposition from a larger relation with the original (larger) relation, even though it suffers from some redundancy problems. Alternatively, we might choose to add some fields to certain relations to speed up some important queries, even if this leads to a redundant storage of some information (and consequently, a schema that is in neither 3NF nor BCNF).
This discussion of normalization has concentrated on the technique of decomposition, which amounts to vertical partitioning of a relation. Another technique to consider is horizontal partitioning of a relation, which would lead to our having two relations with identical schemas. Note that we are not talking about physically partitioning the tuples of a single relation; rather, we want to create two distinct relations (possibly with different constraints and indexes on each).

Incidentally, when we redesign the conceptual schema, especially if we are tuning an existing database schema, it is worth considering whether we should create views to mask these changes from users for whom the original schema is more natural.

TUNING QUERIES AND VIEWS

If we notice that a query is running much slower than we expected, we have to examine the query carefully to find the problem. Some rewriting of the query, perhaps in conjunction with some index tuning, can often fix the problem. Similar tuning may be called for if queries on some view run slower than expected.

When tuning a query, the first thing to verify is that the system is using the plan that you expect it to use. It may be that the system is not finding the best plan for a variety of reasons. Some common situations that are not handled efficiently by many optimizers follow:
A selection condition involving null values.
Selection conditions involving arithmetic or string expressions or conditions using the OR connective. For example, if we have a condition E.age = 2*D.age in the WHERE clause, the optimizer may correctly utilize an available index on E.age but fail to utilize an available index on D.age. Replacing the condition by E.age/2 = D.age would reverse the situation.
Inability to recognize a sophisticated plan such as an index-only scan for an aggregation query involving a GROUP BY clause.

If the optimizer is not smart enough to find the best plan (using access methods and evaluation strategies supported by the DBMS), some systems allow users to guide the choice of a plan by providing hints to the optimizer; for example, users might be able to force the use of a particular index or choose the join order and join method. A user who wishes to guide optimization in this manner should have a thorough understanding of both optimization and the capabilities of the given DBMS.

OTHER TOPICS

MOBILE DATABASES

The availability of portable computers and wireless communications has created a new breed of nomadic database users. At one level these users are simply accessing a database through a network, which is similar to distributed DBMSs. At another level the network as well as data and user characteristics now have several novel properties, which affect basic assumptions in many components of a DBMS, including the query engine, transaction manager, and recovery manager.

Users are connected through a wireless link whose bandwidth is ten times less than Ethernet and 100 times less than ATM networks. Communication costs are therefore significantly higher in proportion to I/O and CPU costs.
Users' locations are constantly changing, and mobile computers have a limited battery life. Therefore, the true communication cost includes connection time and battery usage in addition to bytes transferred, and it changes constantly depending on location.
Data is frequently replicated to minimize the cost of accessing it from different locations.
As a user moves around, data could be accessed from multiple database servers within a single transaction. The likelihood of losing connections is also much greater than in a traditional network. Centralized transaction management may therefore be impractical, especially if some data is resident at the mobile computers. We may in fact have to give up on ACID transactions and develop alternative notions of consistency for user programs.

MAIN MEMORY DATABASES

The price of main memory is now low enough that we can buy enough main memory to hold the entire database for many applications; with 64-bit addressing, modern CPUs also have very large address spaces. Some commercial systems now have several gigabytes of main memory. This shift prompts a reexamination of some basic DBMS design decisions, since disk accesses no longer dominate processing time for a memory-resident database:

Main memory does not survive system crashes, and so we still have to implement logging and recovery to ensure transaction atomicity and durability. Log records must be written to stable storage at commit time, and this process could become a bottleneck. To minimize this problem, rather than commit each transaction as it completes, we can collect completed transactions and commit them in batches; this is called group commit. Recovery algorithms can also be optimized since pages rarely have to be written out to make room for other pages.
The implementation of in-memory operations has to be optimized carefully since disk accesses are no longer the limiting factor for performance.
A new criterion must be considered while optimizing queries, namely the amount of space required to execute a plan. It is important to minimize the space overhead because exceeding available physical memory would lead to swapping pages to disk (through the operating system's virtual memory mechanisms), greatly slowing down execution.
Page-oriented data structures become less important (since pages are no longer the unit of data retrieval), and clustering is not important (since the cost of accessing any region of main memory is uniform).

(1) A HISTORICAL RETROSPECTIVE

From the early days of databases, storing and manipulating data have been a major application focus. The first general-purpose DBMS was designed by Charles Bachman at General Electric in the early 1960s and was called the Integrated Data Store. It laid the foundation for the network data model. The network data model was standardized by the Conference on Data Systems Languages (CODASYL) and had a major influence on database systems throughout the 1960s. For his contributions to the database field, Bachman became the first ACM Turing Award (the computer science equivalent of a Nobel Prize)