STANDARD 15608-2
First edition
2008.08.22
Valid from
2008.09.22
ICS 33.160.01
ISBN 978-85-07-01115-6
Reference number
ABNT NBR 15608-2:2008
27 pages
ABNT 2008
ABNT 2008
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any
means, electronic or mechanical, including photocopying and microfilm, without permission in writing from ABNT.
ABNT office
Av. Treze de Maio, 13 - 28º andar
20031-901 - Rio de Janeiro - RJ
Tel.: + 55 21 3974-2300
Fax: + 55 21 3974-2346
abnt@abnt.org.br
www.abnt.org.br
Published in Brazil
Contents
Foreword
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviation
5 Video coding guidelines for full-seg service
5.1 General
5.2 Aspect ratio
5.2.1 For transmission
5.2.2 For reception
5.3 Channel change time (zapping)
5.4 H.264 Compression details
5.4.1 Coding parameters
5.4.2 Limit of video coding rates
5.4.3 Format conversion
5.4.4 Coding profile@level
5.4.5 Colorimetry
5.4.6 Transport stream restrictions
5.5 Use of pan-scan tools
5.5.1 Description of pan-scan vectors
5.5.2 Relationship of AFD with pan-scan vectors
5.6 Use of AFD
5.6.1 AFD description
5.6.2 Recommendations for the encoder
5.6.3 Recommendations for the decoder
5.6.4 Recommendations for the pan-scan vector
5.7 Quality parameters
6 Audio coding guidelines for full-seg reception
6.1 Pre- and post-processing
6.2 Downmixing
6.2.1 General guidelines
6.2.2 Dynamic range
6.2.3 Audio signal level and overflow
6.2.4 Volume uniformity
6.3 Channel mapping for program switching
6.4 Connectivity for home theater systems
6.4.1 General guidelines
6.4.2 Prioritizing the audio outlet
6.4.3 Home theater stereo and mono outputs
6.4.4 Multichannel outputs for home theater systems
6.4.5 Transcoding to DTS format
6.5 Audio coding parameters for the full-seg reception
6.5.1 Audio coding profiles
6.5.2 Main full-seg coding parameters
6.5.3 Notes about the signaling and modes
6.5.4 Coding parameters changes
6.6 Full-seg service quality parameters
6.6.1 Audio quality indicators
6.6.2 Range of audio coding bit rates
Foreword
Associação Brasileira de Normas Técnicas (ABNT) is the Brazilian Standardization Forum. Brazilian Standards,
whose content is the responsibility of the Brazilian Committees (Comitês Brasileiros - ABNT/CB), Sector
Standardization Bodies (Organismos de Normalização Setorial - ABNT/ONS), and Special Studies Committees
(Comissões de Estudo Especiais - ABNT/CEE), are prepared by Study Committees (Comissões de Estudo - CE),
made up of representatives from the sectors involved including: producers, consumers, and neutral entities
(universities, laboratories, and others).
Brazilian Standards are drafted in accordance with the rules given in the ABNT Directives (Diretivas), Part 2.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights.
ABNT shall not be held responsible for identifying any or all such patent rights.
ABNT NBR 15608-2 was prepared within the purview of the Special Studies Committee on Digital Television
(ABNT/CEE-00:001.85). The Draft Standard was circulated for National Consultation in accordance with ABNT
Notice (Edital) nº 07, from July 07th, 2008, to August 08th, 2008, with the number Draft 00:001.85-008/2.
Should any doubts arise regarding the interpretation of the English version, the provisions in the original text
in Portuguese shall prevail at all times.
This Standard is based on the work of the Brazilian Digital Television Forum as established by the Presidential
Decree number 5.820 of June 29th, 2006.
ABNT NBR 15608 consists of the following parts, under the general title "Digital terrestrial television - Operational
guidelines":
- Part 2: Video coding, audio coding and multiplexing - Guideline for implementation of ABNT NBR 15602:2007;
- Part 3: Multiplexing and service information (SI) - Guideline for implementation of ABNT NBR 15603:2007;
- Part 4: Data coding and transmission specification for digital broadcasting - Guideline for implementation
of ABNT NBR 15606:2007.
1 Scope
This part of ABNT NBR 15608 is a guideline for the implementation of ABNT NBR 15602 and contains
additional information on the audio and video coding parameters of the Brazilian digital terrestrial television system
(SBTVD).
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated references,
only the edition cited applies. For undated references, the latest edition of the referenced document (including any
amendments) applies.
ABNT NBR 15602-1:2007, Digital terrestrial television - Video coding, audio coding and multiplexing -
Part 1: Video coding
ABNT NBR 15602-2:2007, Digital terrestrial television - Video coding, audio coding and multiplexing -
Part 2: Audio coding
ISO/IEC 14496-10, Information technology - Coding of audio-visual objects - Part 10: Advanced Video Coding
ISO/IEC 13818-1, Information technology - Generic coding of moving pictures and associated audio information:
Systems
ETSI TS 101 154:2007, Digital video broadcasting (DVB); Implementation guidelines for the use of video and audio
coding in broadcasting applications based on the MPEG-2 transport stream
ITU-R Recommendation BT.601-6, Studio encoding parameters of digital television for standard 4:3 and
wide-screen 16:9 aspect ratios
ITU-R Recommendation BT.709-5, Parameter values for the HDTV standards for production and international
programme exchange
ITU-T Recommendation H.264, Advanced video coding for generic audiovisual services
3 Terms and definitions
3.1
full-seg receiver
device capable of decoding the audio, video, data, etc. carried by the 13-segment transport stream layer, designed
for fixed (indoor) and mobile service
NOTE The full-seg classification applies to digital converters, also called set-top boxes, and to
thirteen-segment receivers integrated with a display, but is not exclusive to these. This kind of receiver must be able to decode
high-definition digital terrestrial television signals and, at the manufacturer's discretion, also the information carried on layer A
of the transport stream, intended for services aimed at portable receivers, defined here as one-seg.
3.2
one-seg receiver
device capable of decoding the audio, video, data, etc. information carried on layer A, located in the central
segment of the 13 segments
NOTE The one-seg classification is intended for portable receivers, also known as handheld receivers, and is especially
recommended for small displays, normally up to 7 inches. Products classified as one-seg include, but are not limited to,
receivers integrated with cell phones, PDAs, dongles and portable TV sets, which are powered by an internal battery
and therefore do not need an external power supply, as well as receivers intended for vehicular reception.
This kind of receiver should be capable of receiving and decoding only the digital terrestrial television signals in layer A
of the transport stream, and therefore only the basic profile, intended for portable devices.
4 Abbreviation
For the purposes of this part of ABNT NBR 15608, the following abbreviations apply.
HD High Definition
SD Standard Definition
TS Transport Stream
VUI Video Usability Information
5 Video coding guidelines for full-seg service
5.1 General
The design of any full-seg receiver should be made under the assumption that any legal structure as permitted
by ITU-T Recommendation H.264 may occur in the transport stream complying with ABNT NBR 15602-1, even
if initially not used by the broadcasters.
5.2 Aspect ratio
5.2.1 For transmission
When the aspect ratio used for transmission differs from that of the original video source, such as when using
the pillar-box or letterbox format, bars should be inserted as part of the transmitted image. The added bars may,
at the content producer's discretion, contain graphics and colors.
Special care should be taken with the image transition regions, between the original content region and the regions
artificially inserted in the image, so that coding is smooth and no undesirable artifacts are inserted in the content.
5.2.2 For reception
Full-seg receivers should be capable of decoding pictures with the luminance resolutions shown in Table 1 and
of applying up-sampling so that the decoded pictures can be displayed at full-screen size whenever necessary.
In addition, full-seg receivers should be capable of decoding pictures with resolutions lower than that of the monitor (SD)
and displaying them with pillar-box.
Table 1
Coded picture                                                    Displayed picture - Horizontal up-sampling
Luminance resolution       Aspect ratio    Aspect_ratio_idc      4:3 Monitors    16:9 Monitors
(vertical x horizontal)
1 080 x 1 920              16:9            1                     x 4/3           x 1
720 x 1280                 16:9            1                     x 4/3           x 1
480 x 720                  4:3             3                     x 1             x 3/4
480 x 720 (a)              16:9            5                     x 4/3           x 1
(a) In this case, the up-sampling may be applied to the 16:9 image pixels in order to allow them to be
displayed in a 4:3 monitor.
Images with resolution lower than 480 lines should not be transmitted, even if the content has been converted
to one of the resolutions permitted in the system.
Up-sampling of 4:3 pictures for display on a 16:9 monitor is optional in the decoder, as 16:9 monitors can
be switched to operate in 4:3 mode.
For the display of 480 x 720 content with 4:3 aspect ratio on a 16:9 screen, it is recommended to remove 8 pixels
from each side of the image before applying the x 3/4 scaling.
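The cropping and scaling arithmetic above can be illustrated with a minimal sketch. The function names and the 1 920-pixel panel width are assumptions made for illustration only; they are not part of this Standard.

```python
from fractions import Fraction


def crop_sides(coded_width=720, crop_each_side=8):
    """Width left after removing crop_each_side pixels from both edges."""
    return coded_width - 2 * crop_each_side


def pillar_box_layout(panel_width, scale=Fraction(3, 4)):
    """Return (active_width, bar_width) after applying the horizontal scale.

    With the x 3/4 factor, a 4:3 picture occupies three quarters of a 16:9
    panel and the remainder is split into two equal side bars.
    """
    active_width = int(panel_width * scale)
    bar_width = (panel_width - active_width) // 2
    return active_width, bar_width


print(crop_sides())             # 704
print(pillar_box_layout(1920))  # (1440, 240)
```

On the assumed 1 920-pixel panel, the 704 remaining source pixels are mapped onto 1 440 display pixels, with a 240-pixel bar on each side.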
5.3 Channel change time (zapping)
At a random access point, the video stream is characterized by one of the two following items of information:
2) Sequence Parameter Set, Picture Parameter Set, Recovery Point SEI Message and I-type picture.
The time interval between random access points may vary between programs and within the same program.
However, broadcasters should keep the maximum interval between consecutive access points within 1 s.
Receivers should be capable of initiating decoding and presenting pictures starting from a single random access point.
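As an illustration of the 1 s recommendation, the sketch below checks the spacing of random access points given their presentation times in seconds. The function names and the way the timestamps are obtained are hypothetical; a real analyzer would extract them from the transport stream.

```python
def max_access_point_interval(times_s):
    """Largest gap, in seconds, between consecutive random access points."""
    gaps = [later - earlier for earlier, later in zip(times_s, times_s[1:])]
    return max(gaps, default=0.0)


def access_point_spacing_ok(times_s, limit_s=1.0):
    """True when no gap between consecutive access points exceeds limit_s."""
    return max_access_point_interval(times_s) <= limit_s


print(access_point_spacing_ok([0.0, 0.5, 1.5, 2.0]))  # True
print(access_point_spacing_ok([0.0, 0.5, 2.0]))       # False
```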
5.4 H.264 Compression details
5.4.1 Coding parameters
Compliance with the video coding restrictions of ABNT NBR 15602-1 may be ensured by observing the restrictions
of profile@level HP@L4.0 of ITU-T Recommendation H.264.
Both field picture format and frame picture format may be used for interlaced images. These formats may
be switched within sequential units.
5.4.2 Limit of video coding rates
The range of video coding rates serves as a guideline for the video coding rates of transmission equipment
in regular operation. No guaranteed limit is specified for the operation of receiver units.
It is recommended that the upper limit of video coding rates be the one defined by the ITU-T Recommendation H.264 profiles.
5.4.3 Format conversion
As described in 5.2, it is recommended to convert content originally produced in a resolution different from
those listed in ABNT NBR 15602-1 to one of the permitted transmission formats. In these cases it is strongly
recommended that the AFD information be transmitted in order to identify the region of interest of the transmitted
video.
5.4.4 Coding profile@level
Although ABNT NBR 15602-1 allows either the Main or the High coding profile, for both SDTV and HDTV source
signals, the use of High profile is strongly recommended for video coding.
High profile should be correctly signaled by setting profile_idc to the value 100.
Similarly, although level 4.0 is the maximum level to be supported by video decoders, the value of level_idc for each
resolution should be according to Table 2 and ABNT NBR 15602-1.
Table 2
Luminance resolution       Frame rate    Level_idc options
(vertical x horizontal)
480 x 720                  60i           30, 31, 32 or 40
480 x 720                  60p           31, 32 or 40
720 x 1280                 60p           32 or 40
1 080 x 1 920              60i           40
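The profile and level signaling described above can be sketched as a simple lookup. The dictionary layout and the function name are assumptions for illustration; the values are those listed in the table, together with profile_idc 77 (Main) and 100 (High).

```python
# Level_idc options per (vertical, horizontal, frame rate), as listed above.
ALLOWED_LEVEL_IDC = {
    (480, 720, "60i"): {30, 31, 32, 40},
    (480, 720, "60p"): {31, 32, 40},
    (720, 1280, "60p"): {32, 40},
    (1080, 1920, "60i"): {40},
}

MAIN_PROFILE_IDC = 77
HIGH_PROFILE_IDC = 100  # strongly recommended


def signaling_is_permitted(vertical, horizontal, frame_rate, profile_idc, level_idc):
    """Check a profile_idc / level_idc pair against the listed options."""
    allowed_levels = ALLOWED_LEVEL_IDC.get((vertical, horizontal, frame_rate))
    if allowed_levels is None:
        return False  # format not listed in the table
    if profile_idc not in (MAIN_PROFILE_IDC, HIGH_PROFILE_IDC):
        return False
    return level_idc in allowed_levels


print(signaling_is_permitted(1080, 1920, "60i", HIGH_PROFILE_IDC, 40))  # True
print(signaling_is_permitted(720, 1280, "60p", HIGH_PROFILE_IDC, 31))   # False
```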
5.4.5 Colorimetry
The information of the colorimetry system and the specifications of ABNT NBR 15602-1 assume that, in the case of
SDTV source signals, SMPTE 170M signals are used.
If the input signal follows luminance and chrominance (color-difference) signal equations from a specification
other than ITU-R Recommendation BT.709-5 or ITU-R Recommendation BT.601-6, the difference between
the equations should be compensated before transmission.
In exceptional cases where other color standards are used, the VUI information parameters of Table 3 should
be signaled accordingly.
5.4.6 Transport stream restrictions
The Sequence Parameter Set (SPS) configurations in Table 4 and the VUI parameters in Table 5 should be observed.
Table 4
Parameter                               Value (one column per coded format)              Remarks
profile_idc                             77 or 100   77 or 100   77 or 100   77 or 100    Main profile = 77 and High profile = 100 (recommended)
constraint_set0_flag                    0           0           0           0
constraint_set2_flag                    0           0           0           0
constraint_set3_flag                    0           0           0           0
chroma_format_idc                       1           1           1           1            4:2:0
chroma_sample_loc_type_top_field        0           0           0           0
chroma_sample_loc_type_bottom_field     0           0           0           0
qpprime_y_zero_transform_bypass_flag    0           0           0           0            Not applicable to Main profile
pic_width_in_mbs_minus1                 119         79          44          44
pic_height_in_map_units_minus1          33          44          29          14
frame_cropping_flag                     1           0           0           0
frame_crop_left_offset                  0           0           0           0
frame_crop_right_offset                 0           0           0           0
frame_crop_top_offset                   0           0           0           0
frame_crop_bottom_offset                2           0           0           0
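The four value columns listed for pic_width_in_mbs_minus1, pic_height_in_map_units_minus1 and the cropping offsets can be related to picture sizes with the size derivation of ITU-T Recommendation H.264 for 4:2:0 video. The sketch below is illustrative only: the function name is hypothetical, and the frame_mbs_only_flag values (0 for interlaced, 1 for progressive coding) are assumptions, since that flag does not appear among the parameters listed above.

```python
def cropped_luma_size(pic_width_in_mbs_minus1, pic_height_in_map_units_minus1,
                      frame_mbs_only_flag, crop_left=0, crop_right=0,
                      crop_top=0, crop_bottom=0):
    """Return (width, height) in luma samples for chroma_format_idc = 1 (4:2:0)."""
    width = (pic_width_in_mbs_minus1 + 1) * 16
    # A map unit is one macroblock row when frame_mbs_only_flag = 1 and a
    # macroblock-pair row (two rows) otherwise.
    height = (2 - frame_mbs_only_flag) * (pic_height_in_map_units_minus1 + 1) * 16
    crop_unit_x = 2                              # SubWidthC for 4:2:0
    crop_unit_y = 2 * (2 - frame_mbs_only_flag)  # SubHeightC * (2 - flag)
    width -= crop_unit_x * (crop_left + crop_right)
    height -= crop_unit_y * (crop_top + crop_bottom)
    return width, height


# Assumed frame_mbs_only_flag: 0 for the first and fourth columns, 1 otherwise.
print(cropped_luma_size(119, 33, 0, crop_bottom=2))  # (1920, 1080)
print(cropped_luma_size(79, 44, 1))                  # (1280, 720)
print(cropped_luma_size(44, 29, 1))                  # (720, 480)
print(cropped_luma_size(44, 14, 0))                  # (720, 480)
```

Under these assumptions, the first column codes 1 088 lines and crops 8 of them (frame_crop_bottom_offset = 2, in units of 4 lines) to obtain 1 080 lines, which is why it is the only column with frame_cropping_flag = 1.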
When fixed_frame_rate_flag is 1, the decoding and display interval between adjacent pictures is 1 001/15 000 s
or more and equals 2 * num_units_in_tick / time_scale, where num_units_in_tick and time_scale are specified in the VUI.
EXAMPLE 1 When time_scale = 30 000 and num_units_in_tick = 1 001, 2 * num_units_in_tick / time_scale = 2 * 1 001 / 30 000 =
2 / 29.97; in other words, the decoding and display interval between adjacent pictures should be 2/29.97 s.
EXAMPLE 2 When time_scale = 24 000 and num_units_in_tick = 1 001, the interval between adjacent pictures should be 2/23.976 s.
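The two examples can be reproduced with exact rational arithmetic. This is a minimal sketch; the function name is hypothetical.

```python
from fractions import Fraction


def picture_interval_s(num_units_in_tick, time_scale):
    """Interval between adjacent pictures, 2 * num_units_in_tick / time_scale, in seconds."""
    return Fraction(2 * num_units_in_tick, time_scale)


minimum_interval = Fraction(1001, 15000)

example_1 = picture_interval_s(1001, 30000)
example_2 = picture_interval_s(1001, 24000)

print(example_1, example_1 >= minimum_interval)  # 1001/15000 True
print(example_2, example_2 >= minimum_interval)  # 1001/12000 True
```

Example 1 sits exactly at the 1 001/15 000 s limit, while Example 2 gives a longer interval of 1 001/12 000 s, about 83.4 ms.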
5.5 Use of pan-scan tools
5.5.1 Description of pan-scan vectors
Full-seg receivers that support pan-scan vectors enable 4:3 monitors to completely fill the screen for 16:9 coded
images.
The pan-scan vector corresponds to the horizontal offset of the video frame centre, specified by
a non-zero value in the frame_centre_horizontal_offset field of an H.264 video stream.
5.5.2 Relationship of AFD with pan-scan vectors
The H.264 encoder may optionally include pan-scan vectors and AFD.
The decoder may use the AFD as part of the logic that decides how the full-seg decoder processes and positions
the reconstructed image for display on a monitor when the monitor aspect ratio does not match the source aspect
ratio, that is, whether to use pan-scan vectors or generate a letterbox.
5.6 Use of AFD
5.6.1 AFD description
The AFD should indicate the portion of the coded video frame that is of interest to a heterogeneous receiver
population. The AFD can ensure optimal content presentation on 4:3 and 16:9 monitors, while taking user-defined
preferences into account.
For correct display of the visual content, it is recommended to use this mechanism in conjunction with
the aspect ratio of the content and the frame format of the television.
The AFD should be a 4-bit field, which allows up to 16 options to be configured, of which only 10 can be used.
It is not necessary to use the AFD when protecting only the central 4:3 region of a 16:9 image.
This mechanism is optional in full-seg receivers. For a more detailed description, ETSI TS 101 154
should be consulted.
5.6.2 Recommendations for the encoder
The aspect ratio information should be used to define the active format to be broadcast, regardless of the format of
the transmitted frame (4:3 or 16:9):
1) if the aspect ratio is greater than 16:9, Active format 4 should be used (preferred to 8);
2) if the aspect ratio is 4:3, the frame mode should be coded in 4:3;
3) if the movie is coded to be shown in a greater aspect ratio (14:9) (with shoot & protect), Active
format 13 should be used, otherwise Active format 9 should be used;
4) if the aspect ratio is 16:9 and the movie was coded to be shown in a smaller aspect ratio (14:9 or 4:3)
(with shoot & protect), Active format 14 or 15 should be used, otherwise Active format 10 should be used.
To enable cropping of the images using Active format 13, 14 or 15 when the coded movie was not made for
a different aspect ratio, it is advisable to undertake subjective tests to determine the most adequate value.
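The selection rules above can be sketched as a small decision function. This is an illustrative reading only: the function name and arguments are hypothetical, item 3 is interpreted as applying to 4:3 content, and the split between Active format 14 (14:9 protected area) and 15 (4:3 protected area) is an assumption based on the active format descriptions in ETSI TS 101 154, which should be consulted for the normative definitions.

```python
from fractions import Fraction

AR_4_3 = Fraction(4, 3)
AR_14_9 = Fraction(14, 9)
AR_16_9 = Fraction(16, 9)


def choose_active_format(source_ar, protected_ar=None):
    """Pick an Active format code from the source and shoot & protect aspect ratios."""
    if source_ar > AR_16_9:
        return 4
    if source_ar == AR_4_3:
        return 13 if protected_ar == AR_14_9 else 9
    if source_ar == AR_16_9:
        if protected_ar == AR_14_9:
            return 14  # assumed mapping, see lead-in
        if protected_ar == AR_4_3:
            return 15  # assumed mapping, see lead-in
        return 10
    raise ValueError("aspect ratio not covered by the rules above")


print(choose_active_format(Fraction(221, 100)))  # 4
print(choose_active_format(AR_4_3))              # 9
print(choose_active_format(AR_16_9, AR_4_3))     # 15
```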
5.6.3 Recommendations for the decoder
The decoder should always consider the possible presence of an AFD field, even if this information is not used.
The decoder should not presume that the encoder did not use AFD.
If AFD information is not present, user preferences should be used or, preferably, the format that allows the full
image to be shown (letterbox if the image aspect ratio is greater than that of the TV, pillar-box if it is smaller).
Even where AFD is transmitted, it should be possible to bypass the AFD code by using the user preferences stored
in the decoder.
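The fallback behaviour for streams without AFD can be sketched as follows. The function name, the string labels and the representation of the user preference are assumptions for illustration, and this sketch gives a stored user preference priority over the automatic choice, which is only one possible reading of the text above.

```python
from fractions import Fraction


def choose_presentation(image_ar, display_ar, user_preference=None):
    """Decide how to fit the image on the display when no AFD is present."""
    if user_preference is not None:
        return user_preference  # stored user preference takes priority here
    if image_ar > display_ar:
        return "letterbox"      # image wider than the display
    if image_ar < display_ar:
        return "pillar-box"     # image narrower than the display
    return "full-screen"


print(choose_presentation(Fraction(16, 9), Fraction(4, 3)))  # letterbox
print(choose_presentation(Fraction(4, 3), Fraction(16, 9)))  # pillar-box
```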
5.6.4 Recommendations for the pan-scan vector
The pan-scan vector is a horizontal offset of the video frame centre, specified by a non-zero value in
the frame_centre_horizontal_offset parameter of a bitstream, in compliance with ITU-T Recommendation H.264.
Pan-scan vectors for a 4:3 window should be included in the transmitted bitstream when the source aspect ratio
is 16:9 or 2.21:1. The vertical component of the transmitted pan-scan vector should be zero.
5.7 Quality parameters
It is up to each broadcaster to determine the bit rate employed when coding SD and HD video signals
for broadcasting. The quality of the encoded video should be checked at the chosen bit rates.
Such checks can be performed using commercially available video quality analyzers.
6 Audio coding guidelines for full-seg reception
6.1 Pre- and post-processing
Pre-processing functions are frequently applied to the audio input signal at the encoder so that the audio signal(s)
conform(s) to the recommended parameters of audio quality. Pre-processing functions may include equalization,
re-sampling, gain control (intensity or volume) and dynamic range. When pre-processing is used, changes in the
original program content should be avoided.
Post-processing functions may be used to transform and adapt the signal(s) for final presentation in the receiver.
Post-processing functions at the receivers include mixing, downmixing, volume control, format transcoding and also
equalization with some devices. However, changes in the artistic program content should be avoided.
6.2 Downmixing
6.2.1 General guidelines
The automatic generation of downmix in the receiver should not be considered as the best quality compromise for
the generation of a stereo program. Whenever possible, downmix should be performed in a professional studio
environment, prior to coding and transmission. The stereo program generated should be sent as an additional
program for selection by the user.
It is recommended that the stereo downmix, that is, the conversion of a multichannel program into a stereo
program, be performed in all full-seg receivers that receive multichannel signals and have two-channel (stereo)
outputs. It is important for set-top box full-seg receivers to always have a stereo audio output.
It is recommended to follow the stereo downmix equation described in ISO/IEC 14496-3 (matrix-mixdown
process), together with the recommendations for its use, including the recommended procedure when
pseudo_surround_enable is activated. The method described can only be applied to mixing and converting
5-channel programs (3 front + 2 surround, discarding the LFE channel) into a stereo program. In a multichannel
broadcast, the downmix coefficient A to be used in the equation should always be transmitted as defined by
ABNT NBR 15602-2.
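The matrix-mixdown equation can be sketched per sample as shown below. This is an illustrative transcription of the equation as commonly stated for ISO/IEC 14496-3, not a normative implementation; the function name and argument order are hypothetical, and ISO/IEC 14496-3 remains the reference for the exact definition and for the pseudo-surround procedure.

```python
import math


def matrix_mixdown(l, r, c, ls, rs, a, pseudo_surround=False):
    """Mix one 5-channel sample set (LFE discarded) down to (left, right)."""
    k = 1 / math.sqrt(2)  # weight of the centre channel
    if pseudo_surround:
        # Surround channels are summed and mixed with opposite signs.
        norm = 1 / (1 + k + 2 * a)
        left = norm * (l + k * c - a * (ls + rs))
        right = norm * (r + k * c + a * (ls + rs))
    else:
        norm = 1 / (1 + k + a)
        left = norm * (l + k * c + a * ls)
        right = norm * (r + k * c + a * rs)
    return left, right


# Full-scale, in-phase input on all five channels with A = 1/sqrt(2):
left, right = matrix_mixdown(1.0, 1.0, 1.0, 1.0, 1.0, a=1 / math.sqrt(2))
print(abs(left) <= 1.0 + 1e-12 and abs(right) <= 1.0 + 1e-12)  # True
```

The normalization factor scales the sum of the mixing weights to one, so a full-scale in-phase input on all channels maps to a full-scale output rather than exceeding it.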
One of the objectives of this equation is to avoid overflow in the AAC decoder. However, when the equation is used for
converting (downmixing) a 5-channel program into a stereo program, the difference between the sound levels
of a normal 2-channel program and of the stereo audio generated from a 5-channel program may still be significant,
and the downmixing may still produce clipping (correlated audio in mixed channels). In contrast to listening
to a DVD, where the user is expected to make active adjustments when the DVD begins to play, volume
adjustment by the viewer for every program can be difficult. It is recommended to use the DRC control functionality
as an additional control of the stereo downmix so as to avoid clipping. The use of scale factors for downmix
operations is not recommended.
6.2.2 Dynamic range
It is recommended that the receiver make use of the dynamic range control information whenever the respective
metadata information is present.
In order to make effective use of the dynamic range in the downmix operation, A = 0 should be adopted, since it
produces lower attenuation of the average signal levels L and R.
The average reference level prog_ref_level = 96 (0x60) should be used, corresponding to -24 dBFS as the 0 dB
reference.
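The relation between prog_ref_level = 96 and -24 dBFS follows from the 0.25 dB step size of this field in the ISO/IEC 14496-3 dynamic range control data. A minimal sketch, with a hypothetical function name:

```python
def prog_ref_level_to_dbfs(prog_ref_level):
    """Convert prog_ref_level (0.25 dB steps below full scale) to dBFS."""
    return -prog_ref_level / 4


print(hex(96), prog_ref_level_to_dbfs(96))  # 0x60 -24.0
```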
6.2.3 Audio signal level and overflow
In order to ensure that the audio in each channel is free from overflow in any situation, it is recommended
to adopt values of A with higher attenuation, such as A = 0.707 (which produces a global attenuation
of approximately -7.7 dB). However, as this value causes a decrease in the normal audio level, the higher-order bits
may be virtually useless.
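The -7.7 dB figure can be checked from the normalization factor 1 / (1 + 1/sqrt(2) + A) of the matrix-mixdown equation, assuming the form of the equation without pseudo-surround. The function name below is hypothetical.

```python
import math


def downmix_gain_db(a):
    """Global gain, in dB, of the normalization factor 1 / (1 + 1/sqrt(2) + A)."""
    return 20 * math.log10(1 / (1 + 1 / math.sqrt(2) + a))


print(round(downmix_gain_db(1 / math.sqrt(2)), 1))  # -7.7
print(round(downmix_gain_db(0.0), 1))               # -4.6
```

With A = 0 the factor is about -4.6 dB, roughly 3 dB less attenuation than with A = 0.707.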
In the case of A = 0, when in-phase high level signals are present simultaneously in the five audio channels, L
and R may suffer clipping. However, this situation is considered unlikely in normal audio programming.
If the probability of clipping is low, the use of a coding scheme that limits the signal waveform when clipping
occurs may avoid significant distortion in the audio signal.
6.2.4 Volume uniformity
In a 2-channel (stereo) receiver, the perceived volume difference between a program originally produced in stereo
and a stereo program obtained by downmixing a 5-channel program should be as small as possible. This is
more likely when a downmix factor of A = 0.707 is adopted, so that the same attenuation factor is applied to the
central and surround channels.
6.3 Channel mapping for program switching
Program switching may lead to audio mode switching. For the encoding system, the recommended mapping
of audio channel pairs during program switching and in the transmission system is shown in Table 6, which uses
as an example pairs of audio channels delivered in AES/EBU (AES-3) format. Another transport system may
be adopted, using the same channel mapping by input pairs.
Table 6 defines the relationship between the audio mode at the input to the encoder and the channel mapping.
6.4 Connectivity for home theater systems
6.4.1 General guidelines
The decoder should be capable of decoding the MPEG-4 AAC audio signal in any of the channel modes permitted
under ABNT NBR 15602-2. However, it is optional for each manufacturer to provide an audio output terminal in any
type of receiver. The exception is set-top boxes, which should be equipped with a two-channel (stereo)
analog audio output.
The decoders should be capable of decoding HE-AACv1 level 2 signal (mono and stereo) at rates up to 48 kHz
and should be capable of decoding the HE-AACv1 level 4 signal (5.1 multichannel) at rates up to 48 kHz, according
to ETSI TS 101 154:2007, subsection 6.4.
6.4.2 Prioritizing the audio outlet
If the full-seg receiver has two audio output formats (stereo and multichannel), the default setting for the audio
signal should be stereo, and the user should be able to select multichannel audio for digital outputs (S/PDIF,
in accordance with IEC 60958, or HDMI) or for analog outputs when available.
It is recommended that the receiver make the appropriate conversion of the available signal for the selected output,
stereo or multichannel, analog or digital.
6.4.3 Home theater stereo and mono outputs
If the decoder has a digital output in the S/PDIF or HDMI format, it is recommended to make the PCM stereo signal
available for home theater systems.
In receivers without multichannel outputs, the audio signals should be downmixed to stereo for the analog output.
6.4.4 Multichannel outputs for home theater systems
For multichannel outputs, if the receiver supports HDMI or S/PDIF digital outputs, it is recommended
to export the multichannel digital signal to the home theater system or to provide analog outputs in the receiver.
The multichannel signal can be exported in PCM format (uncompressed channels) or coded (for example,
AAC, HE-AAC or DTS), and home theater systems should decode it and play it back on the corresponding analog
outputs.
Transcoding to formats other than MPEG-4 AAC is not recommended from the point of view of quality assurance.
However, if other requirements lead to transcoding to other existing formats (for example
DTS or Dolby AC-3) or to a future system, it is recommended to follow the pertinent interfacing standards,
while home theater systems should decode the chosen format and reproduce it on the corresponding
analog outputs.
6.4.5 Transcoding to DTS format
Transcoding to DTS audio streams is optional in full-seg receivers; however, once implemented, it should be
in accordance with ETSI TS 102 114 at a fixed bit rate of 1.536 Mbps. In the presence of a 5.1 signal, the DTS
transcoder should be set to AMode = 9.
6.5 Audio coding parameters for the full-seg reception
6.5.1 Audio coding profiles
The coding and decoding of MPEG-4 AAC, MPEG-4 HE-AAC and MPEG-4 HE-AAC v2 elementary streams are
based on ISO/IEC 14496-3.
The MPEG-4 AAC and the MPEG-4 HE-AAC v1 profiles are subsets of the MPEG-4 HE-AAC v2 profile.
The MPEG-4 HE-AAC profile adds the audio object type (AOT) SBR to the MPEG-4 AAC profile. The MPEG-4 HE-AAC
v2 profile adds the audio object type (AOT) PS to the MPEG-4 HE-AAC profile to improve the audio quality at low
bit rates. Every HE-AAC decoder can decode an HE-AAC v2 bitstream, but it may not be able to use the parametric
stereo (PS) information and may therefore reproduce only a mono signal.
HE-AAC v2 coding is not applicable to full-seg reception. However, all full-seg receivers should be capable
of decoding HE-AAC v1@L4, which corresponds to the most comprehensive compression format to be supported.
6.5.2 Main full-seg coding parameters
The audio parameters, such as content description and signal formatting, should be established, inserted
and modified in the following stages:
a) at the encoder input, defined by the signal input format (for example, sample rate);
b) in the encoder configuration (for example, profile/level, ancillary metadata, multichannel mode);
c) in assembling the SI tables (for example, audio component descriptor, MPEG-AAC parameters descriptor);
d) in the receiver, during decoding and playback (for example, volume, preferred multichannel reproduction mode).
The main audio coding parameters recommended for full-seg services are shown in Table 7.
If the SBR tool is used, the AAC core codec runs at half of the sampling frequency indicated in Table 7.
Details of the audio encoder specification are not included in this Standard.
Table 8 shows the signaling options that should be applied to audio coding, according to ABNT NBR 15602-2.
The receiver should deal smoothly with the service changes in order to minimize undesirable effects for the user.
This recommendation applies to modifications made to bitstream parameters within the same service ID
transmitted from a local station. Specifically, it applies to changes in the following parameters:
sampling rate;
bit rate;
When switching between audio parameters, decoding generates noise under specific circumstances and receiver
units insert a mute in many cases. Therefore silent passages should be inserted in the signal to the encoder in
order to prevent interruption to the audio program during switching.
a) changes in any of the audio parameters should be made with approximately 0.5 seconds of silence at
the input of the audio encoder. The silent passages should be kept to the minimum duration possible;
b) the encoder should wait until there is no more stored data in the encoder and decoder. Then, the encoder
should change the desired audio parameter and continue the encoding process. After coding resumes,
a predefined amount of coded audio data should be stored in the encoder memory and, finally, the encoded
data should be transmitted to the decoder;
c) since, in accordance with ABNT NBR 15602-2, the audio data is transmitted using MPEG-4, a PTS should
be added to the first transmitted data frame after any interruption. In addition, to ensure that the decoder can
detect a parameter change, there should be an interval of at least three frames between the old data PTS and
the PTS added to the new transmitted bitstream;
d) the audio decoder should have enough memory capacity for the maximum number of audio channels;
e) on the receiver side, the memory may suffer underflow and silence should be emitted if the memory becomes
empty (if necessary, the audio level should decay immediately before the memory becomes empty). After the
memory is empty, the decoding process should resume when predetermined audio data arrives.
It is recommended to use a fade-out strategy in the signal during the muting period, in order to avoid
undesirable noise when the audio returns;
f) the decoder should stop the decoding process and should silence audio when there is no audio data in
its input memory. If bitstream data remain in the decoder input memory and a new audio frame is found,
the decoder should wait for the correct amount of data to arrive, and only after storage in the memory should
the decoding process resume based on the new audio parameters.
The decoder should end the mute and emit audio signals when so requested (at any time after having
completed the decoding process of two frames). However, note that the bitstreams generated by the model above
are, in practice, transmitted through MPEG-2 systems and that the decoder carries out memory control using
system memory and PTS. In this case, it is not always possible for the decoder to realize that its memory is empty,
despite the above condition. The decoder should then determine that a parameter change has occurred
by noticing that the bitstreams are not successive, based on the PTS information added to the first audio frame
that follows the parameter change and also taking into account the system clock.
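The PTS-based detection described above can be illustrated with a minimal sketch. It assumes one PTS per audio frame on the 90 kHz MPEG-2 systems clock and 1 024 samples per frame at the given sampling rate; the function names, the one-tick tolerance and these assumptions are illustrative and are not taken from this Standard.

```python
from fractions import Fraction

PTS_CLOCK_HZ = 90_000   # resolution of the MPEG-2 systems PTS
PTS_MODULO = 1 << 33    # the PTS is a 33-bit counter and wraps around


def frame_duration_ticks(sample_rate_hz, samples_per_frame=1024):
    """Duration of one audio frame in 90 kHz ticks, as an exact fraction."""
    return Fraction(samples_per_frame * PTS_CLOCK_HZ, sample_rate_hz)


def pts_gap_in_frames(prev_pts, new_pts, sample_rate_hz, samples_per_frame=1024):
    """Distance between two PTS values, expressed in audio frames."""
    delta = (new_pts - prev_pts) % PTS_MODULO  # tolerate counter wrap-around
    return delta / frame_duration_ticks(sample_rate_hz, samples_per_frame)


def is_successive(prev_pts, new_pts, sample_rate_hz,
                  samples_per_frame=1024, tolerance_ticks=1):
    """True when new_pts is exactly one frame after prev_pts (within tolerance)."""
    delta = (new_pts - prev_pts) % PTS_MODULO
    expected = frame_duration_ticks(sample_rate_hz, samples_per_frame)
    return abs(delta - expected) <= tolerance_ticks
```

A decoder following this sketch would treat a frame for which `is_successive` is false as a candidate parameter change, and an encoder could use `pts_gap_in_frames` to check that the gap it leaves is at least three frames.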
When the sample rate is modified, the decoder should modify its reference clock, so an unstable transient
condition may occur during a limited time period.
An imperceptible bit rate change can be ensured by controlling the memory resources on the encoder side. If
memory control is not possible because of, for example, coding delay changes caused by bitrate changes, the
general recommendations listed in 6.5.4.2 should be adopted.
A service change may result in a channel configuration change and therefore trigger a decoder reset at the splicing
point.
On the decoder side, the default audio output configuration should be PCM stereo, with the full-seg receiver
in charge of the appropriate conversion for the user-selected output format.
The recommended reference volume level should be -24 dBFS. However, broadcasters may optionally transmit
the reference volume level actually used in the audio program, if different from the default level, so that
the full-seg receiver can properly equalize the volume level when switching channels.
Since the minimum acceptable signal level in an audio system is related to the minimum signal to noise ratio (SNR)
acceptable for low signal levels, determining it should be an operational decision. However, the lowest acceptable
signal should be at least 40 dB above the noise level. The usable dynamic range is therefore equal
to the difference between the maximum and minimum recordable audio levels (40 dB above noise level).
For a system capable of registering a dynamic range of 90 dB, the minimum recordable level may be around -62
dBu. This value may sometimes exceed the minimum acceptable level of absolute intensity, in which case the lower
levels are not used.
The 0 dB intensity reference at the encoder should be around +4 dBu or -24 dBFS, in line with
equalization practices and compatible with the reference levels practiced by commercial equipment in studios. In
this case, 24 dB of reserved headroom is available.
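The level arithmetic in the preceding paragraphs can be checked with a short sketch. It assumes that the +4 dBu and -24 dBFS references describe the same signal level, so that full scale corresponds to +28 dBu; the function names are illustrative.

```python
REFERENCE_DBU = 4.0     # analogue reference level quoted in the text
REFERENCE_DBFS = -24.0  # digital reference level quoted in the text


def dbfs_to_dbu(level_dbfs):
    """Convert a digital level to dBu, assuming the two references coincide."""
    return level_dbfs - REFERENCE_DBFS + REFERENCE_DBU


def headroom_db():
    """Headroom between the reference level and digital full scale."""
    return 0.0 - REFERENCE_DBFS


def minimum_recordable_dbu(dynamic_range_db):
    """Lowest recordable level for a system with the given dynamic range."""
    return dbfs_to_dbu(0.0) - dynamic_range_db
```

Under these assumptions a system with 90 dB of dynamic range bottoms out at 28 - 90 = -62 dBu, which matches the figure given above.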
The maximum acceptable signal level is related to the maximum total harmonic distortion (THD) and should ideally
not exceed 0.1 %.
If the user is listening to the program in a noisy environment, the usable dynamic range will be severely reduced
by the noise level, and may not exceed 40 dB. The DRC functionality should be used to reduce the dynamic
range, thus ensuring intelligibility.
The audio quality can be measured by subjective or objective audio listening tests. In order to assure an indication
of minimum quality, the SDG (subjective difference grade) and ODG (objective difference grade) values obtained
by comparing the encoded/decoded signal with the original signal should be between 0 and 3.
Figure 1 indicates the typical bit rate ranges for the use of MPEG-4 HE-AAC v2, MPEG-4 HE-AAC and MPEG-4
AAC encoders in coding stereo signals.
Table 9 presents the range of coding rates for audio sampled at 48 kHz, to be used as reference values
in order to assure uniformity in the minimum quality level of the broadcasting services. These values represent
indicative rates for LATM/LOAS using LC and HE AAC profiles.
One-seg video coding should follow ITU-T Recommendation H.264 and ISO/IEC 14496-10.
All one-seg capable receivers compliant with ABNT NBR 15602-2 should be capable of correctly interpreting
and decoding the parametric stereo (PS) parameters when decoding stereo audio.
The video coding parameter restrictions should be in accordance with ABNT NBR 15602-1, which specifies
Baseline profile and level 1.3.
I and P slices: I slices provide intra coding of macroblocks, and P slices additionally allow inter coding using a
temporal prediction signal;
4x4 transform: the prediction residual is transformed and quantized using 4x4 blocks;
CAVLC: the symbols of the coder are entropy-coded using a context-adaptive variable length code;
Flexible Macroblock Order (FMO): this Baseline profile functionality, which permits arbitrary assignment of
macroblocks to slice groups, is not envisaged in ABNT NBR 15602-1. The prohibition is signaled
by constraint_set1_flag being equal to 1;
Arbitrary Slice Order (ASO): this Baseline profile functionality, which permits arbitrary ordering of slices within a
picture, is not envisaged in ABNT NBR 15602-1. The prohibition is signaled by constraint_set1_flag being
equal to 1;
Redundant Slices (RS): this Baseline profile functionality, which permits the transmission of redundant slices
that approximate the primary slice, is not envisaged in ABNT NBR 15602-1. The prohibition is signaled
by constraint_set1_flag being equal to 1.
Considering the system's maximum resolution of 288 x 352 pixels and the maximum frame rate of the Baseline
profile at level 1.3, the coded video bit rate may reach a maximum of 768 kbps. It is recommended
that encoders and one-seg receivers support this maximum bit rate.
For the one-seg service the restrictions of PES packets below should be noted:
d) the PTS difference of two consecutive PES packets should be within 0.7 s.
IDR access units should be inserted into the bit stream at intervals of approximately two seconds to shorten
the required reproduction time. However, they should be inserted at intervals not exceeding five seconds,
in accordance with ABNT NBR 15602-1.
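The two timing limits above (0.7 s between consecutive PES packets and five seconds between IDR access units) lend themselves to a simple check. The sketch below assumes PTS values on the 90 kHz clock listed in stream order, which for a stream without B pictures is also presentation order; the function names and constants are illustrative.

```python
PTS_CLOCK_HZ = 90_000
PTS_MODULO = 1 << 33              # 33-bit PTS counter wraps around
MAX_PES_PTS_GAP_TICKS = 63_000    # 0.7 s expressed in 90 kHz ticks
MAX_IDR_INTERVAL_TICKS = 450_000  # 5 s expressed in 90 kHz ticks


def max_gap_ticks(pts_values):
    """Largest difference between consecutive PTS values, in 90 kHz ticks."""
    gaps = [(b - a) % PTS_MODULO for a, b in zip(pts_values, pts_values[1:])]
    return max(gaps, default=0)


def pes_spacing_ok(pes_pts_values):
    """Check the 0.7 s limit between consecutive PES packets."""
    return max_gap_ticks(pes_pts_values) <= MAX_PES_PTS_GAP_TICKS


def idr_spacing_ok(idr_pts_values):
    """Check the 5 s upper limit between consecutive IDR access units."""
    return max_gap_ticks(idr_pts_values) <= MAX_IDR_INTERVAL_TICKS
```

For example, IDR access units two seconds apart (180 000 ticks) pass `idr_spacing_ok`, while a six-second gap does not.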
Each IDR-AU should be an elementary stream access point described in ISO/IEC 13818-1 (see Figure 2).
The number and order of NAL units to configure the IDR AU and non-IDR AU should be in accordance with
Table 11. However, NAL units other than those listed should not be used (see Tables 11 and 12).
Type and order of NAL unit                      Quantity
                                                IDR AU         Non-IDR AU
Access unit delimiter                           1              1
Sequence parameter set (SPS)                    1              0
Picture parameter set (PPS)                     1              0 or 1
Supplemental enhancement information (SEI)      0 or 1 (a)     0 or 1 (a)
IDR picture coded slice                         1 and above    0
Non-IDR picture coded slice                     0              1 and above
Filler data                                     0 or 1         0 or 1
End of sequence                                 0 or 1         0 or 1
(a) Insertion condition of SEI NAL unit is given in Table 22.
Flag restrictions in the syntax are shown in Tables 13 and 14. However, the IDs of the SPS (Sequence
Parameter Set) and PPS (Picture Parameter Set) can be operated with a fixed value regardless of changes
in the content described by the parameters.
When fixed_frame_rate_flag is 0, the interval for decoding and displaying the next picture should be
at least 1 001/15 000 s and a multiple of num_units_in_tick / time_scale, where num_units_in_tick
and time_scale are specified in the VUI.
EXAMPLE 1 When time_scale = 30 000 and num_units_in_tick = 1 001, num_units_in_tick/time_scale = 1 001/30 000 = 1/29.97;
in other words, the unit of cpb_removal_delay becomes 1/29.97 s. Moreover, since the interval is restricted to 1 001/15 000 s
or more, the difference of cpb_removal_delay (interval of adjacent pictures) is two or more.
EXAMPLE 2 When time_scale = 24 000 and num_units_in_tick = 1 001, the unit of cpb_removal_delay becomes 1/23.976 s.
Also, since the interval is restricted to 1 001/15 000 s or more, the difference of cpb_removal_delay becomes 2 or more.
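The arithmetic of the two examples can be reproduced with exact fractions. This is a minimal sketch; the function names are illustrative, and the only inputs are the VUI values num_units_in_tick and time_scale quoted above.

```python
from fractions import Fraction
from math import ceil

# Minimum interval between adjacent pictures stated in the text, in seconds.
MIN_PICTURE_INTERVAL_S = Fraction(1001, 15000)


def tick_seconds(num_units_in_tick, time_scale):
    """Duration of one cpb_removal_delay unit, in seconds."""
    return Fraction(num_units_in_tick, time_scale)


def min_cpb_removal_delay_step(num_units_in_tick, time_scale):
    """Smallest whole number of ticks that satisfies the minimum interval."""
    return ceil(MIN_PICTURE_INTERVAL_S / tick_seconds(num_units_in_tick, time_scale))
```

With time_scale = 30 000 the minimum interval is exactly two ticks; with time_scale = 24 000 it is 1.6 ticks, which rounds up to two because the interval must be a whole multiple of the tick.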
Restrictions on the SEI (Supplemental Enhancement Information) are shown in Table 17. Only the buffering period,
picture timing, pan-scan rectangle and filler payload SEI messages may be inserted in the SEI.
The presence and order of the SEI messages in each access unit (AU) are as follows:
IDR-AU: the buffering period, picture timing, pan-scan rectangle and filler payload SEI messages may be
inserted in the relevant AU, as necessary. The insertion order should be as follows:
non-IDR-AU: only the picture timing and filler payload SEI messages may be inserted
in the relevant AU, as necessary. The insertion order should be as follows:
In addition, the presence or absence of the filler payload SEI message in IDR-AUs and non-IDR-AUs is not
specified.
The buffering period SEI message is shown in Table 17 and the picture timing SEI message is shown
in Table 18.
The parameters related to the pan-scan rectangle are described in Table 19.
Restrictions regarding the insertion of the buffering period SEI message and the picture timing SEI
message are shown in Table 20.
Table 20 - Restrictions on the buffering period SEI message and picture timing SEI message
When the buffering period SEI message and the picture timing SEI message are inserted, at least one of
nal_hrd_parameters_present_flag and vcl_hrd_parameters_present_flag should be 1.
When neither the buffering period SEI message nor the picture timing SEI message is inserted,
both nal_hrd_parameters_present_flag and vcl_hrd_parameters_present_flag should be 0.
EXAMPLE When time_scale = 30 000, num_units_in_tick = 1 001, the frame rate is 29.97/2 frames/s.
Restrictions on inserting the pan-scan rectangle in the SEI message are in Table 21. When the pan-scan
parameters are used, they should be inserted in IDR-AU.
Table 22 presents the reference picture list reordering restrictions while the decoded reference picture marking
restrictions are in Table 23.
The stream input in the CPB should be set to be decoded within 1.5 s.
Broadcasters may transmit 16:9 images even though the image format is only QVGA. When the pictures are 16:9,
pic_height_in_map_units_minus1 of the H.264 bitstream SPS should be 11.
As a rule, pic_height_in_map_units_minus1 should not be modified within a program; broadcasters may operate it
in a semi-fixed state. However, during the period of simultaneous broadcasting with analog
transmission, when programs with 4:3 and 16:9 aspect ratios are present, the aspect ratio indication may change for
each program.
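The value 11 follows from the 16 x 16 macroblock grid. The sketch below assumes progressive coding (frame_mbs_only_flag equal to 1), so that one map unit is one macroblock row, and a 16:9 QVGA picture of 320 x 180 pixels; the function name and the padded_rows key are illustrative.

```python
MACROBLOCK_SIZE = 16  # luma samples per macroblock side


def sps_picture_size_fields(width, height):
    """Derive the SPS size fields for a progressive picture of width x height."""
    width_mbs = -(-width // MACROBLOCK_SIZE)    # ceiling division
    height_mbs = -(-height // MACROBLOCK_SIZE)
    return {
        "pic_width_in_mbs_minus1": width_mbs - 1,
        "pic_height_in_map_units_minus1": height_mbs - 1,
        # Rows coded beyond the visible picture, to be removed by cropping.
        "padded_rows": height_mbs * MACROBLOCK_SIZE - height,
    }
```

A 320 x 180 picture needs 12 macroblock rows (192 coded rows, 12 of them padding), giving pic_height_in_map_units_minus1 = 11, whereas 4:3 QVGA (320 x 240) needs 15 rows and gives 14.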
When the delivered picture has a different aspect ratio from the actual picture source (pillarbox or letterbox),
the borders need not be displayed, depending on the display aspect ratio of the receiver unit, if the following
pan-scan parameters are configured:
when displaying a part of an original picture source (320x180) in a 16:9 full screen picture display area, and
when the delivered picture format is QVGA 4:3 (320x240) and the original picture source is 16:9 (letterbox);
when displaying a part of an original picture source (240x180) in a 4:3 full screen picture display area, and when
the delivered picture format is QVGA 16:9 (320x180) and the original picture source is 4:3 (pillarbox).
It is recommended that the center of the delivered image format and the center of the image source
coincide in both the horizontal and vertical directions.
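The centering recommendation fixes where the source picture sits inside the delivered picture. The sketch below computes that rectangle in luma pixel coordinates only; it does not produce the pan-scan syntax values, which are defined by Figure 3 and Table 24. The function name is illustrative.

```python
def centered_source_rect(delivered_w, delivered_h, source_w, source_h):
    """Return (left, top, right, bottom) of the source centered in the delivered picture."""
    if source_w > delivered_w or source_h > delivered_h:
        raise ValueError("source picture must fit inside the delivered picture")
    left = (delivered_w - source_w) // 2
    top = (delivered_h - source_h) // 2
    return (left, top, left + source_w, top + source_h)
```

For the letterbox case (320x180 source in a 320x240 picture) this gives 30-pixel bars above and below; for the pillarbox case (240x180 source in a 320x180 picture) it gives 40-pixel bars on each side.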
The value of each parameter is indicated in Figure 3, when the above-mentioned operation is carried out.
The restrictions for each parameter during the pan-scan operation are presented in Table 24.
Pan-scan can be turned on or off per coded video sequence. When the above pan-scan operation
is not carried out, the pan-scan rectangle SEI message should not be encoded (when the pan-scan
operation is carried out, the pan-scan rectangle SEI message is always included in the IDR-AU).
The recommended AAC sampling frequencies are 24 kHz, 22.05 kHz or 16 kHz, since the use of SBR
is mandatory in accordance with ABNT NBR 15602-2. The sampling rates of the decoded audio are therefore
48 kHz, 44.1 kHz and 32 kHz, respectively.
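The relationship between the AAC core rate and the decoded rate is a simple doubling when SBR is active, as the following sketch shows. The constant and function names are illustrative.

```python
# AAC core sampling frequencies recommended in the text, in Hz.
RECOMMENDED_CORE_RATES_HZ = (24_000, 22_050, 16_000)


def decoded_sample_rate_hz(core_rate_hz, sbr_enabled=True):
    """Output sampling rate: SBR doubles the rate of the AAC core codec."""
    return core_rate_hz * 2 if sbr_enabled else core_rate_hz
```

Applying the function to the three recommended core rates yields 48 000 Hz, 44 100 Hz and 32 000 Hz.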
All one-seg receivers compliant with ABNT NBR 15602-2 should be capable of adequately interpreting and
decoding the parametric stereo (PS) information when decoding stereo signals.
In regard to the range of audio coding rates (with sampling frequency 48 kHz), the following values may be used as
a guideline:
In regard to the range of audio coding rates (with sampling frequency 44.1 kHz), the following values may be used:
In regard to the range of audio coding rates (with sampling frequency 32 kHz), the following values may be used:
8.1 Random_access_indicator
When encoding, the random_access_indicator bit should be set whenever an RAP occurs in video streams in
accordance with ITU-T Recommendation H.264:2005, subsections 3.1 and 5.5.5.
When decoding, the random_access_indicator bit may be ignored by the decoder. However, it can be beneficially
utilized together with the elementary_stream_priority_indicator to identify a RAP.
8.2 Elementary_stream_priority_indicator
When encoding, the elementary_stream_priority_indicator bit should be set only when an access unit containing an
I or IDR picture (slice_type 0x02 or 0x07) is present in the video stream. Similarly,
the elementary_stream_priority_indicator should be set in the adaptation header of the transport packet that
contains the first slice start code of this I or IDR picture. This adaptation header may be in the transport stream
packet immediately after the packet containing the random_access_indicator.
When decoding, the elementary_stream_priority_indicator bit may be ignored by the receiver. It can be beneficially
utilized to support complex operational modes.
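Both indicators are single bits in the flags byte of the transport packet adaptation field defined in ISO/IEC 13818-1. The following sketch shows how a receiver that chooses to use them could read both; the function name and the returned dictionary are illustrative.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
RANDOM_ACCESS_INDICATOR = 0x40
ES_PRIORITY_INDICATOR = 0x20


def adaptation_flags(packet):
    """Return the two indicator bits, or None when the packet carries no flags."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a 188-byte transport stream packet")
    adaptation_field_control = (packet[3] >> 4) & 0x03
    if adaptation_field_control not in (2, 3):
        return None  # payload only, no adaptation field present
    if packet[4] == 0:
        return None  # zero-length adaptation field has no flags byte
    flags = packet[5]
    return {
        "random_access_indicator": bool(flags & RANDOM_ACCESS_INDICATOR),
        "elementary_stream_priority_indicator": bool(flags & ES_PRIORITY_INDICATOR),
    }
```

Because the two bits may arrive in consecutive packets of the same PID, a receiver using them to locate a RAP would need to keep the state of the previous packet; that bookkeeping is omitted here.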
Although ITU-T Recommendation H.264 allows multiple video pictures to be transmitted in a single PES packet,
this function should not be used by encoders.
When encoding, every PES header should contain the Presentation Time Stamp (PTS) and Decoding Time Stamp
(DTS) of the first access unit in the PES packet. The start of the first access unit should occur in the same
transport packet as the PES header or, if the data preceding the access unit start code forces the access unit
start code into the next transport packet, in the packet of the same PID immediately following the packet with
the PES header.
When a PES packet contains multiple access units, for any access units following the first access unit in the same
PES packet the syntax elements num_units_in_tick, time_scale, pic_struct (if present), and the value of the variables
TopFieldOrderCnt and BottomFieldOrderCnt of the access unit should allow the derivation of the PTS and DTS
of the access unit.
When decoding, if the PTS is available and the DTS is not available for the first access unit in the PES packet,
the decoder should set the DTS value equal to the PTS value. The PTS and the DTS of any access units following
the first access unit in the same PES packet should be derived using the syntax elements num_units_in_tick,
time_scale, pic_struct (if present), and the value of the variables TopFieldOrderCnt and BottomFieldOrderCnt.
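The decoder rule for a missing DTS can be sketched together with the 33-bit timestamp layout of the PES header defined in ISO/IEC 13818-1. The function names are illustrative, and the derivation of timestamps for later access units in the same PES packet is not covered.

```python
def decode_timestamp(field):
    """Decode a 33-bit PTS or DTS from its 5-byte PES header field."""
    if len(field) != 5:
        raise ValueError("a PES timestamp field is exactly 5 bytes long")
    return (
        (((field[0] >> 1) & 0x07) << 30)  # bits 32..30
        | (field[1] << 22)                # bits 29..22
        | ((field[2] >> 1) << 15)         # bits 21..15
        | (field[3] << 7)                 # bits 14..7
        | (field[4] >> 1)                 # bits 6..0
    )


def first_access_unit_times(pts, dts=None):
    """Apply the rule above: when only the PTS is present, DTS equals PTS."""
    return pts, (pts if dts is None else dts)
```

For instance, the field bytes 21 00 05 BF 21 (hexadecimal) decode to 90 000, that is, one second on the 90 kHz clock.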
All transmission and reception rules of the PSI Tables defined by ARIB TR-B14:2006, volume 2, section 30 should
be applied.
Since the complete implementation of PSI/SI depends exclusively on the receiver manufacturers, there is, strictly
speaking, no mandatory processing of PSI/SI information. However, some rules should be followed:
a) the basic selection functions should not be interrupted when any error occurs in the service information.
Viewers should always be able to select the desired service using PSI tables and the NIT. No service
information error should interrupt channel selection using the PSI only. Channel selection using PSI only
means that the viewers can choose at least the services including switching TS and that the default
components are present in the service. The SI is not always transmitted in the same layer as that of the
service; therefore, a part of SI may not be received when the service is received. Also, moving the portable
or mobile device may lead to SI information loss;
b) transmitted information should not be used for a purpose other than that intended by the transmission.
Each information item present in the SI is transmitted for a purpose. Since the presentation of SI information is
free, how to manipulate it and present it to viewers depends on the manufacturer of the receiver;
c) no malfunction should be observed when any field of the PSI area currently reserved for future extension
is used. This rule aims to ensure the gradual extension of the PSI/SI specifications. To comply with the rule, it is
recommended that the receiver ignore fields identified as ISO_639_language_code, reserved or reserved
for future use which it is unable to process. It is desirable that the receiver also consider invalid any
data outside the specified range of values. In these fields, it is recommended that the values
be set according to current transmission rules, although further rules may be established in the future.
That is, if the use of these fields is extended, it is presumed that receivers prior to the change will be incapable
of processing this extension.
With regard to PSI/SI sections received, section headers should be correctly interpreted. When the correct section
header is received, the analysis process for the internal section can begin.
However, where there are errors in receiving the header, the rules of ARIB TR-B14:2006, volume 2, section 5
should be observed.
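As an illustration of section header interpretation, the sketch below parses the common PSI/SI section header fields laid out in ISO/IEC 13818-1 and simply skips the reserved bits, in line with rule c) above. The function name and the returned dictionary are illustrative; error handling according to ARIB TR-B14 is not modeled.

```python
def parse_section_header(section):
    """Parse the fixed header fields of a PSI/SI section, ignoring reserved bits."""
    if len(section) < 3:
        raise ValueError("section too short for a header")
    header = {
        "table_id": section[0],
        "section_syntax_indicator": bool(section[1] & 0x80),
        "section_length": ((section[1] & 0x0F) << 8) | section[2],
    }
    if header["section_syntax_indicator"]:
        if len(section) < 8:
            raise ValueError("section too short for the extended header")
        header.update(
            table_id_extension=(section[3] << 8) | section[4],
            version_number=(section[5] >> 1) & 0x1F,
            current_next_indicator=bool(section[5] & 0x01),
            section_number=section[6],
            last_section_number=section[7],
        )
    return header
```

Only after these fields have been read consistently would a receiver go on to analyse the body of the section.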