Data compression

From Wikipedia, the free encyclopedia

In signal processing, data compression, source coding,[1] or bit-rate reduction involves encoding
information using fewer bits than the original representation.[2] Compression can be either lossy or
lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No
information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or
less important information.[3] The process of reducing the size of a data file is referred to as data
compression. In the context of data transmission, it is called source coding (encoding done at the source
of the data before it is stored or transmitted), in opposition to channel coding.[4]

Compression is useful because it reduces resources required to store and transmit data. Computational
resources are consumed in the compression process and, usually, in the reversal of the process
(decompression). Data compression is subject to a space-time complexity trade-off. For instance, a
compression scheme for video may require expensive hardware for the video to be decompressed fast
enough to be viewed as it is being decompressed, and the option to decompress the video in full before
watching it may be inconvenient or require additional storage. The design of data compression schemes
involves trade-offs among various factors, including the degree of compression, the amount of distortion
introduced (when using lossy data compression), and the computational resources required to compress
and decompress the data.[5][6]

Contents

1 Lossless

2 Lossy

3 Theory

3.1 Machine learning

3.2 Data differencing

4 Uses

4.1 Audio

4.1.1 Lossy audio compression

4.1.1.1 Coding methods

4.1.1.2 Speech encoding

4.1.2 History

4.2 Video

4.2.1 Encoding theory

4.2.2 Timeline

4.3 Genetics

4.4 Emulation

5 Outlook and currently unused potential

6 See also

7 References

8 External links

Lossless
Lossless data compression algorithms usually exploit statistical redundancy to represent data without
losing any information, so that the process is reversible. Lossless compression is possible because most
real-world data exhibits statistical redundancy. For example, an image may have areas of color that do not
change over several pixels; instead of coding "red pixel, red pixel, ..." the data may be encoded as
"279 red pixels". This is a basic example of run-length encoding; there are many schemes to reduce file
size by eliminating redundancy.
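
The "279 red pixels" idea can be sketched in a few lines of Python. This is only an illustration: the function names `rle_encode` and `rle_decode` and the (count, value) pair layout are assumptions made for the example, not part of any particular file format.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse each run of equal values into a (count, value) pair."""
    return [(len(list(run)), value) for value, run in groupby(pixels)]

def rle_decode(pairs):
    """Expand (count, value) pairs back into the original sequence."""
    out = []
    for count, value in pairs:
        out.extend([value] * count)
    return out

# Hypothetical image row: 279 red pixels followed by 3 blue ones.
row = ["red"] * 279 + ["blue"] * 3
encoded = rle_encode(row)
print(encoded)  # [(279, 'red'), (3, 'blue')]
assert rle_decode(encoded) == row  # lossless: the round trip is exact
```

Run-length encoding only pays off when runs are long; on data with no repeated neighbours the list of pairs is larger than the input.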

The Lempel-Ziv (LZ) compression methods are among the most popular algorithms for lossless storage.[7]
DEFLATE is a variation on LZ optimized for decompression speed and compression ratio, but compression
can be slow. DEFLATE is used in PKZIP, Gzip, and PNG. LZW (Lempel-Ziv-Welch) is used in GIF
images. Also noteworthy is the LZR (Lempel-Ziv-Renau) algorithm, which serves as the basis for the Zip
method. LZ methods use a table-based compression model where table entries are substituted for repeated
strings of data. For most LZ methods, this table is generated dynamically from earlier data in the input.
The table itself is often Huffman encoded (e.g. SHRI, LZX). Current LZ-based coding schemes that perform
well are Brotli and LZX. LZX is used in Microsoft's CAB format.
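
To make the table-based model concrete, here is a minimal sketch of LZW, the variant named above. It is a teaching version under stated assumptions: the table grows without bound, codes are returned as a plain list of integers rather than packed into bits, and nothing here follows the GIF container format. The names `lzw_compress` and `lzw_decompress` are chosen for the example.

```python
def lzw_compress(data: bytes) -> list:
    """Replace repeated byte strings with indexes into a table built on the fly."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate              # keep extending the match
        else:
            codes.append(table[current])     # emit the longest known string
            table[candidate] = len(table)    # learn the new, longer string
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes: list) -> bytes:
    """Rebuild the same table from the codes alone and expand them."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # The code refers to the entry that is about to be created.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)

codes = lzw_compress(b"abababab")
print(codes)  # [97, 98, 256, 258, 98]
assert lzw_decompress(codes) == b"abababab"
```

The table is never transmitted: the decoder regenerates it from earlier data, which is the dynamic table generation the paragraph describes.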

The best modern lossless compressors use probabilistic models, such as prediction by partial matching.
The Burrows-Wheeler transform can also be viewed as an indirect form of statistical modelling.[8]

The class of grammar-based codes is gaining popularity because such codes can compress highly repetitive
input extremely effectively, for instance, a biological data collection of the same or closely related
species, a huge versioned document collection, internet archival, etc. The basic task of grammar-based
codes is constructing a context-free grammar deriving a single string. Sequitur and Re-Pair are practical
grammar compression algorithms for which software is publicly available.

In a further refinement of the direct use of probabilistic modelling, statistical estimates can be coupled
to an algorithm called arithmetic coding. Arithmetic coding is a more modern coding technique that uses
the mathematical calculations of a finite-state machine to produce a string of encoded bits from a series
of input data symbols. It can achieve superior compression to other techniques such as the better-known
Huffman algorithm. It uses an internal memory state to avoid the need to perform a one-to-one mapping of
individual input symbols to distinct representations that use an integer number of bits, and it clears out
the internal memory only after encoding the entire string of data symbols. Arithmetic coding applies
especially well to adaptive data compression tasks where the statistics vary and are context-dependent, as
it can be easily coupled with an adaptive model of the probability distribution of the input data. An
early use of arithmetic coding was as an optional (but not widely used) feature of the JPEG image coding
standard. It has since been applied in various other designs including H.264/MPEG-4 AVC and HEVC for
video coding.
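
The following sketch illustrates why arithmetic coding is not tied to a whole number of bits per symbol. It is a simplification, not a production coder: it uses exact fractions instead of the finite-state integer arithmetic real codecs rely on, the model is static rather than adaptive, and the names `build_model`, `arithmetic_encode` and `arithmetic_decode` are invented for the example.

```python
import math
from collections import Counter
from fractions import Fraction

def build_model(message):
    """Static model: give each symbol a slice [start, end) of [0, 1)."""
    counts = Counter(message)
    total = len(message)
    model, start = {}, Fraction(0)
    for symbol in sorted(counts):
        width = Fraction(counts[symbol], total)
        model[symbol] = (start, start + width)
        start += width
    return model

def arithmetic_encode(message, model):
    """Narrow [low, low + width) once per symbol; the final interval is the code."""
    low, width = Fraction(0), Fraction(1)
    for symbol in message:
        s_start, s_end = model[symbol]
        low += width * s_start
        width *= s_end - s_start
    return low, width

def arithmetic_decode(value, length, model):
    """Recover `length` symbols from any number inside the final interval."""
    out = []
    for _ in range(length):
        for symbol, (s_start, s_end) in model.items():
            if s_start <= value < s_end:
                out.append(symbol)
                value = (value - s_start) / (s_end - s_start)
                break
    return out

message = list("aaaaaaaaaaaaaaab")  # 15 x 'a', 1 x 'b'
model = build_model(message)
low, width = arithmetic_encode(message, model)
assert arithmetic_decode(low, len(message), model) == message

# Roughly ceil(-log2(width)) + 1 bits are enough to name a point in the interval.
bits = math.ceil(-math.log2(width)) + 1
print(bits, "bits for", len(message), "symbols")
```

A code that assigns at least one whole bit to every symbol, as a Huffman code must, would need at least 16 bits for this message; the interval here can be identified with fewer bits than there are symbols.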

Lossy
Lossy data compression is the converse of lossless data compression. In these schemes, some loss of
information is acceptable. Dropping nonessential detail from the data source can save storage space.
Lossy data compression schemes are informed by research on how people perceive the data in question. For
example, the human eye is more sensitive to subtle variations in luminance than it is to variations in
color. JPEG image compression works in part by rounding off nonessential bits of information.[9] There is
a corresponding trade-off between preserving information and reducing size. A number of popular
compression formats exploit these perceptual differences, including those used in music files, images,
and video.
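
The "rounding off" step can be illustrated with uniform quantization. This is a sketch of the general idea only: actual JPEG quantizes frequency-domain (DCT) coefficients with per-coefficient step sizes, which is not reproduced here, and the sample values and the names `quantize` and `dequantize` are made up for the example.

```python
def quantize(values, step):
    """Keep only which multiple of `step` each value is closest to."""
    return [round(v / step) for v in values]

def dequantize(indices, step):
    """Reconstruct approximate values; the discarded detail is gone for good."""
    return [i * step for i in indices]

samples = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical brightness values
step = 10                                   # larger step: smaller data, more error

indices = quantize(samples, step)
restored = dequantize(indices, step)
errors = [abs(a - b) for a, b in zip(samples, restored)]

# The reconstruction error never exceeds half a step.
assert max(errors) <= step / 2
print(indices, restored)
```

The quantized indices span a much smaller range than the samples, so a lossless coder applied afterwards has less to encode; the step size is the knob for the trade-off between preserved information and size.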

Lossy image compression can be used in digital cameras, to increase storage capacities with minimal
degradation of picture quality. Similarly, DVDs use the lossy MPEG-2 video coding format for video
compression.

In lossy audio compression, methods of psychoacoustics are used to remove non-audible (or less audible)
components of the audio signal. Compression of human speech is often performed with even more
specialized techniques; speech coding, or voice coding, is sometimes distinguished as a separate
discipline from audio compression. Different audio and speech compression standards are listed under
audio coding formats. Voice compression is used in internet telephony, for example; audio compression is
used for CD ripping and is decoded by the audio players.[8]

Theory
The theoretical background of compression is provided by information theory (which is closely related to
algorithmic information theory) for lossless compression and rate-distortion theory for lossy compression.
These areas of study were essentially forged by Claude Shannon, who published fundamental papers on the
topic in the late 1940s and early 1950s. Coding theory is also related to this. The idea of data
compression is also deeply connected with statistical inference.[10]

Machine learning

There is a close connection between machine learning and compression: a system that predicts the
posterior probabilities of a sequence given its entire history can be used for optimal data compression
(by using arithmetic coding on the output distribution), while an optimal compressor can be used for
prediction (by finding the symbol that compresses best, given the previous history). This equivalence has
been used as a justification for using data compression as a benchmark for "general
intelligence."[11][12][13]
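
Both directions of this equivalence can be sketched with a deliberately simple predictor. The model below, an adaptive byte-frequency count with add-one smoothing, is an assumption chosen for brevity and is far weaker than the predictors the paragraph has in mind; the names `ideal_code_length` and `predict_next` are invented for the example. A symbol predicted with probability p costs about -log2(p) bits under an ideal arithmetic coder.

```python
import math
from collections import Counter

def ideal_code_length(data: bytes) -> float:
    """Bits an ideal arithmetic coder would spend when driven by the predictor."""
    counts = Counter()
    seen = 0
    bits = 0.0
    for byte in data:
        # Prediction made from the history only (add-one smoothing over 256 bytes).
        p = (counts[byte] + 1) / (seen + 256)
        bits += -math.log2(p)
        counts[byte] += 1   # then learn from the symbol just coded
        seen += 1
    return bits

def predict_next(history: bytes) -> int:
    """Prediction from compression: pick the byte that would be cheapest to code."""
    counts = Counter(history)
    return max(range(256), key=lambda b: counts[b])

print(round(ideal_code_length(b"a" * 1000)), "bits for 1000 repeated bytes")
print(chr(predict_next(b"abracadabra")))  # a
```

Better prediction means higher probabilities on what actually occurs, hence fewer bits; this is why compressed size can serve as a score for a predictive model.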

Data differencing

Data compression can be viewed as a special case of data differencing:[14][15] data differencing consists
of producing a difference given a source and a target, with patching producing a target given a source
and a difference, while data compression consists of producing a compressed file given a target, and
decompression consists of producing a target given only a compressed file. Thus, one can consider data
compression as data differencing with empty source data, the compressed file corresponding to a
"difference from nothing." This is the same as considering absolute entropy (corresponding to data
compression) as a special case of relative entropy (corresponding to data differencing) with no initial
data.

When one wishes to emphasize the connection, one may use the term differential compression to refer to
data differencing.
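
One way to see the relationship in practice is to prime a general-purpose compressor with the source. The sketch below uses the preset-dictionary (`zdict`) option of Python's `zlib` module for this; it is an illustration rather than a real delta tool, since DEFLATE can only look back over a 32 KiB window, and the helper names `make_delta` and `apply_delta` are invented for the example.

```python
import random
import zlib

def make_delta(source: bytes, target: bytes) -> bytes:
    """Encode `target` relative to `source`; an empty source is plain compression."""
    if source:
        compressor = zlib.compressobj(level=9, zdict=source)
    else:
        compressor = zlib.compressobj(level=9)
    return compressor.compress(target) + compressor.flush()

def apply_delta(source: bytes, delta: bytes) -> bytes:
    """Patch: rebuild the target from the source and the difference."""
    if source:
        decompressor = zlib.decompressobj(zdict=source)
    else:
        decompressor = zlib.decompressobj()
    return decompressor.decompress(delta) + decompressor.flush()

# Hypothetical data: a hard-to-compress source and a lightly edited copy of it.
rng = random.Random(0)
source = bytes(rng.randrange(256) for _ in range(2000))
target = source[:1000] + b"EDIT" + source[1000:]

delta = make_delta(source, target)        # difference given source and target
standalone = make_delta(b"", target)      # "difference from nothing"

assert apply_delta(source, delta) == target
assert apply_delta(b"", standalone) == target
print(len(delta), "bytes as a delta vs", len(standalone), "bytes compressed alone")
```

The same two functions cover both cases, which mirrors the view of compression as differencing against empty source data.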

Uses
Audio

Audio data compression, not to be confused with dynamic range compression, has the potential to reduce
the transmission bandwidth and storage requirements of audio data. Audio compression algorithms are
implemented in software as audio codecs. Lossy audio compression algorithms provide higher compression
at the cost of fidelity and are used in numerous audio applications. These algorithms almost all rely on
psychoacoustics to eliminate or reduce fidelity of less audible sounds, thereby reducing the space
required to store or transmit them.[2]

In both lossy and lossless compression, information redundancy is reduced, using methods such as coding,
pattern recognition, and linear prediction to reduce the amount of information used to represent the
uncompressed data.

The acceptable trade-off between loss of audio quality and transmission or storage size depends upon the
application. For example, one 640 MB compact disc (CD) holds approximately one hour of uncompressed
high-fidelity music, less than 2 hours of music compressed losslessly, or 7 hours of music compressed in
the MP3 format at a medium bit rate. A digital sound recorder can typically store around 200 hours of
clearly intelligible speech in 640 MB.[16]
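
The figures above can be checked with a little arithmetic. The calculation below rests on assumptions the text does not state: that 1 MB means 10^6 bytes, and that "uncompressed high-fidelity music" means CD audio at 44.1 kHz, 16 bits per sample, two channels. The helper names are chosen for the example.

```python
CAPACITY_BYTES = 640 * 10**6     # assumption: 1 MB = 10^6 bytes
CD_BITRATE = 44_100 * 16 * 2     # bits/s: sample rate x bit depth x channels

def hours_at(bitrate_bps, capacity_bytes=CAPACITY_BYTES):
    """Playing time that fits in the capacity at a given bit rate."""
    return capacity_bytes * 8 / bitrate_bps / 3600

def bitrate_for(hours, capacity_bytes=CAPACITY_BYTES):
    """Average bit rate implied by fitting `hours` of audio in the capacity."""
    return capacity_bytes * 8 / (hours * 3600)

print(f"uncompressed CD audio: {hours_at(CD_BITRATE):.2f} h")                   # 1.01 h
print(f"7 h of MP3 implies about {bitrate_for(7) / 1000:.0f} kbit/s")           # 203 kbit/s
print(f"200 h of speech implies about {bitrate_for(200) / 1000:.1f} kbit/s")    # 7.1 kbit/s
```

Under these assumptions the one-hour figure matches the CD bit rate of about 1411 kbit/s, and the speech figure corresponds to a rate roughly 200 times lower.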

Lossless audio compression produces a representation of digital data that decompresses to an exact
digital duplicate of the original audio stream, unlike playback from lossy compression techniques such as
Vorbis and MP3. Compression ratios are around 50-60% of original size,[17] which is similar to those for
generic lossless data compression. Lossless compression is unable to attain high compression ratios due
to the complexity of waveforms and the rapid changes in sound forms. Codecs like FLAC, Shorten, and TTA
use linear prediction to estimate the spectrum of the signal. Many of these algorithms use convolution
with the filter [-1 1] to slightly whiten or flatten the spectrum, thereby allowing traditional lossless
compression to work more efficiently. The process is reversed upon decompression.
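
A minimal sketch of this whitening step, assuming the simplest possible predictor: each sample is predicted to equal the previous one, so only the first-order difference is stored. Real codecs such as FLAC fit higher-order predictors per block, which is not shown here; the test signal and the names `whiten` and `unwhiten` are made up for the example.

```python
import math

def whiten(samples):
    """First-order difference: keep each sample minus its predecessor."""
    previous = 0
    residual = []
    for x in samples:
        residual.append(x - previous)
        previous = x
    return residual

def unwhiten(residual):
    """Reverse the filter on decompression with a running sum."""
    total = 0
    samples = []
    for r in residual:
        total += r
        samples.append(total)
    return samples

# Hypothetical smooth signal: a slow sine wave with integer samples.
signal = [round(1000 * math.sin(2 * math.pi * n / 100)) for n in range(1000)]
residual = whiten(signal)

assert unwhiten(residual) == signal  # exactly reversible, so still lossless
print(max(abs(x) for x in signal), max(abs(r) for r in residual))
```

For a smooth signal the residual values are much smaller than the samples themselves, so a conventional lossless coder applied to the residual has less to encode.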

When audio files are to be processed, either by further compression or for editing, it is desirable to
work from an unchanged original (uncompressed or losslessly compressed). Processing of a lossily
compressed file for some purpose usually produces a final result inferior to the creation of the same
compressed file from an uncompressed original. In addition to sound editing or mixing, lossless audio
compression is often used for archival storage, or as master copies.

A number of lossless audio compression formats exist. Shorten was an early lossless format. Newer ones
include Free Lossless Audio Codec (FLAC), Apple's Apple Lossless (ALAC), MPEG-4 ALS, Microsoft's
Windows Media Audio 9 Lossless (WMA Lossless), Monkey's Audio, TTA, and WavPack. See list of
lossless codecs for a complete listing.

Some audio formats feature a combination of a lossy format and a lossless correction; this allows
stripping the correction to easily obtain a lossy file. Such formats include MPEG-4 SLS (Scalable to
Lossless), WavPack, and OptimFROG DualStream.

Other formats are associated with a distinct system, such as:

Direct Stream Transfer, used in Super Audio CD
Meridian Lossless Packing, used in DVD-Audio, Dolby TrueHD, Blu-ray and HD DVD

Lossy audio compression

Lossy audio compression is used in a wide range of applications. In addition to the direct applications
(MP3 players or computers), digitally compressed audio streams are used in most video DVDs, digital
television, streaming media on the internet, satellite and cable radio, and increasingly in terrestrial
radio broadcasts. Lossy compression typically achieves far greater compression than lossless compression
(data of 5 percent to 20 percent of the original stream, rather than 50 percent to 60 percent), by
discarding less-critical data.[18]

The innovation of lossy audio compression was to use psychoacoustics to recognize that not all data in an
audio stream can be perceived by the human auditory system. Most lossy compression reduces perceptual
redundancy by first identifying perceptually irrelevant sounds, that is, sounds that are very hard to
hear. Typical examples include high frequencies or sounds that occur at the same time as louder sounds.
Those sounds are coded with decreased accuracy or not at all.

Due to the nature of lossy algorithms, audio quality suffers when a file is decompressed and recompressed
(digital generation loss). This makes lossy compression unsuitable for storing the intermediate results
in professional audio engineering applications, such as sound editing and multitrack recording. However,
lossy formats (particularly MP3) are very popular with end users, as a megabyte can store about a
minute's worth of music at adequate quality.

Coding methods
