SRS For Speech Recognizer and Synthesizer

SAR (Synthesize and Recognize) Version 1.0
Software Requirements Specification 16-09-2014
Mighty Midgets

Mighty Midgets | IIMET

Mighty Midgets

SAR
Technology That Listens

Software Requirements Specification

Version 1.0

Team Guide: Ms. Swati Saxena / Mr. Pankaj Jain

Members: Hitesh Khandelwal, Ojasvita Sharma, Himanshi Gupta, Ipseema Ved

College Name: International Institute of Management, Engineering and Technology

Department: Computer Science

State: Rajasthan


Revision History

Date        Version  Description  Author
12-09-2014  1.0      Synopsis     Mighty Midgets
19-09-2014  2.0      SRS          Mighty Midgets

Table of Contents

Description

1.0 Introduction

1.1 Purpose

1.2 Scope

1.3 Definitions, Acronyms, and Abbreviations

1.4 References

1.5 Technologies to be used

1.6 Overview

2.0 Overall Description

2.1 Product Perspective

2.2 Software Interface

2.3 Hardware Interface

2.4 Product Functions

2.5 User Characteristics

2.6 Constraints

2.7 Architecture Design

2.8 Use Case Model Description

2.9 Sequence Diagrams

2.10 Assumptions and Dependencies

3.0 Specific Requirements

3.1 Use Case Reports

3.2 Supplementary Requirements


Software Requirements Specification

1.0 Introduction:

1.1 Purpose:

The purpose of this document is to present a detailed description of the
requirements for the desktop application "Speech Synthesizer and
Recognizer". It explains the purpose and features of the system, its
interfaces, what the system will do, the constraints under which it must
operate and how it will react to external stimuli. This document is primarily
intended to be submitted to the Computer Science Department, IIMET, for
approval, and to serve the development team as a reference for developing the
first version of the system.

1.2 Scope:

The "Speech Synthesizer and Recognizer" is a desktop application that gives
users a natural way to interact with the computer without any training. It
largely eliminates the need for a keyboard, mouse or other input device. The
software serves two purposes:

i. Speech Synthesizer: This part of the application converts normal
language text into speech, irrespective of the file format in which the
text is stored. It is the artificial production of human speech. It can be
used as a screen reader by people with visual impairment, and also by
people with dyslexia and other reading difficulties as well as by
pre-literate children.

ii. Speech Recognizer: Speech recognition can be defined as the
independent, computer-driven transcription of spoken language into a
computer-readable form. This part of the application allows the computer
to identify the words that a person speaks into a microphone, recognize
the command given and perform the operation requested by the user. It can
be used in hands-free computing, car-based systems and health-care
systems.

This software is platform dependent and requires the Windows operating system
to function. The synthesizer part of the software makes use of the MBROLA
speech synthesizer. A microphone is needed so that the words spoken by the
user are captured clearly enough for recognition. The application maintains a
JSGF file, a grammar file that defines the words and phrases the recognizer
should accept; the language is expected to be US English. The software also
includes an XML file that is loaded whenever the program runs. This is a
configuration file which declares the packages used for recognition of the
commands. The application can read text from different kinds of files, such as
Word, PDF and plain-text (txt) files. The recognizer can open or close any
application through voice recognition.
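
The file-format handling described above could begin with a small dispatch
step that decides which text extractor to use. The following is a minimal
sketch only: the class name DocumentTypes, the enum and the list of extensions
are assumptions made for illustration, and the actual text extraction
(presumably done with the PDF and Word libraries listed in the references) is
not shown.

```java
import java.util.Locale;

/** Hypothetical helper: picks a document type from a file name. */
final class DocumentTypes {

    /** File kinds named in the scope; UNSUPPORTED covers everything else. */
    enum DocumentType { WORD, PDF, PLAIN_TEXT, UNSUPPORTED }

    /** Returns the document type implied by the file extension. */
    static DocumentType detect(String fileName) {
        if (fileName == null) {
            return DocumentType.UNSUPPORTED;
        }
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return DocumentType.UNSUPPORTED; // no extension at all
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.US);
        if (ext.equals("doc") || ext.equals("docx")) {
            return DocumentType.WORD;
        }
        if (ext.equals("pdf")) {
            return DocumentType.PDF;
        }
        if (ext.equals("txt")) {
            return DocumentType.PLAIN_TEXT;
        }
        return DocumentType.UNSUPPORTED;
    }

    public static void main(String[] args) {
        // Assumed sample file names, for demonstration only.
        System.out.println(detect("notes.txt"));
        System.out.println(detect("Report.PDF"));
        System.out.println(detect("picture.png"));
    }
}
```

Keeping the detection separate from the extraction means a further format can
be added later by extending the enum and adding one more branch.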

1.3 Definitions, Acronyms, and Abbreviations:

XML (Extensible Markup Language): A markup language designed to
transport and store data.
JDK (Java Development Kit): An implementation of one of the Java SE,
Java EE or Java ME platforms, released in the form of a binary
product.
JSGF (Java Speech Grammar Format): A textual representation of
grammars for use in speech recognition.
API (Application Programming Interface): A specification of a software
component in terms of its operations, their inputs and outputs, and the
underlying types.

1.4 References:

IEEE Software Engineering Standards Committee, IEEE Std 830-1998,
IEEE Recommended Practice for Software Requirements Specifications,
October 20, 1998.

Software Engineering, Sixth Edition, McGraw-Hill Education

Wikipedia http://www.wikipedia.com

Java - docs.oracle.com, Complete Reference, www.stackoverflow.com

Speech Recognizer APIs - cmusphinx.sourceforge.net

Speech Synthesizer APIs - www.github.com, www.itextpdf.com,
poi.apache.org

1.5 Technologies to be used:

JAVA: Application architecture.

XML: Extensible Markup Language.

Localization: One language (English)

1.6 Overview:

The SRS will include two sections, namely:

Overall Description: This section will describe major components
of the system, interconnections, and external interfaces.

Specific Requirements: This section will describe the functions of
actors, their roles in the system and the constraints faced by the system.

2.0 Overall Description:

2.1 Product Perspective:

This system will consist of two parts: a speech synthesizer and a speech
recognizer. The speech synthesizer converts text into speech, while the speech
recognizer converts the commands spoken by the user into a form that the
computer can interpret in order to produce the proper output.

A speech synthesizer is a speech engine that converts text to speech. The
javax.speech.synthesis package defines the Synthesizer interface to support
Speech Synthesis plus a set of supporting classes and interfaces. As a type of
speech engine, much of the functionality of a Synthesizer is inherited from the
Engine interface in the javax.speech package and from other classes and
interfaces in that package.

The speech recognizer is a speaker-independent application that can recognize
speech from any native speaker. Speaker independence is obtained by
pre-training the recognition system with a large number of speakers, so that a
new speaker can expect to fall within the already trained or modelled voice
patterns. Unlike older software, this software can operate on dictated
continuous speech. A large vocabulary is provided to increase the probability
of correct recognition, and a constrained syntax helps recognize words by
disambiguating similar sounds.

Since this is a data-centric product, a gram (grammar) file is required to
store the entire vocabulary in a form that the system can understand. The
speech recognizer consults this gram file in order to interpret the commands
given by the user. The user speaks an instruction into the microphone, and the
decoder converts it into a string. This string is then compared with the
entries in the gram file, and the matching operation is performed.
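
The comparison step described above can be pictured as a simple lookup of the
decoded string against the phrases loaded from the gram file. The following is
a minimal sketch under stated assumptions: the class CommandMatcher, the
action names and the sample phrases are hypothetical, and in the real system
the phrases would come from the gram file rather than being registered by
hand.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical lookup of a decoded utterance against known command phrases. */
final class CommandMatcher {

    // Phrase -> action name. Both are illustrative placeholders.
    private final Map<String, String> commands = new LinkedHashMap<String, String>();

    /** Registers one phrase together with the action it should trigger. */
    void register(String phrase, String action) {
        commands.put(normalize(phrase), action);
    }

    /** Returns the action for the decoded string, or null if nothing matches. */
    String match(String decoded) {
        if (decoded == null) {
            return null;
        }
        return commands.get(normalize(decoded));
    }

    /** Lower-cases the text and collapses runs of whitespace. */
    static String normalize(String s) {
        return s.trim().toLowerCase(Locale.US).replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        CommandMatcher matcher = new CommandMatcher();
        matcher.register("open notepad", "OPEN_NOTEPAD");
        matcher.register("close notepad", "CLOSE_NOTEPAD");

        System.out.println(matcher.match("  Open   Notepad "));
        // An unknown phrase yields null, so the system would not react.
        System.out.println(matcher.match("play music"));
    }
}
```

Returning null for an unmatched string mirrors the stated behaviour that the
system does not react to an instruction it cannot interpret.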

Both the Speech Recognizer and the Synthesizer are platform dependent and can
run only on the Windows operating system. The system must also have Java
version 1.6 or above installed. If the user does not speak the commands
clearly, the system will not react to the instruction.

2.2 Software Interface:

Front End Client: Java 1.6 and above
Operating System: Windows 7, 8, 8.1

2.3 Hardware Interface:
Client Side: Microphone

2.4 Product Functions:

This product performs two functions: that of a speech synthesizer and that of
a speech recognizer. In this software, a speech recognition module transcribes
the user's speech into a word stream. This stream is then processed by a
language engine dealing with syntax and semantics, and finally by the back-end
application program. A speech synthesizer converts the resulting answers
(strings of characters) into speech for the user.


2.5 User Characteristics:

Physically Disabled User: People with a wide range of disabilities can use
this application to overcome various environmental barriers. The
longest-established application is the screen reader for people with visual
impairment. The application can also be used by people with dyslexia and
other reading difficulties as well as by pre-literate children, and it can
be employed to aid those with severe speech impairment, usually through a
dedicated voice output communication aid.

Normal User: This application can be used by any user who has basic
knowledge of the system on which he/she is operating. The only requirement
is that the user must be able to understand and speak English (US English
is preferred).

2.6 Constraints:

Operating System: This software is platform dependent, that is, it cannot
run on any operating system other than Windows. It can operate on any
version of Windows.
Runtime Environment: The system must have Java version 1.6 or above, that
is, JDK 1.6 or above.
Clarity in Speech: The user must speak clear, correct English. If the system
fails to understand the instruction given, it will not respond at all.
Lack of Hardware: A microphone is essential for the proper working of this
software.
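
The constraints above could be verified when the application starts. The
following is a minimal sketch, not the team's implementation: the class
EnvironmentCheck and its method names are hypothetical, the minimum Java
version is taken as 1.6 inclusive, and the 16 kHz, 16-bit mono capture format
used to probe for a microphone is an assumption.

```java
import java.util.Locale;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;

/** Hypothetical start-up check for the operating constraints. */
final class EnvironmentCheck {

    /** True if the given os.name value denotes a Windows system. */
    static boolean isWindows(String osName) {
        return osName != null
                && osName.toLowerCase(Locale.US).startsWith("windows");
    }

    /** Maps "1.6" or "1.8" to 6 or 8, and "11" or "17" to 11 or 17. */
    static int javaFeatureVersion(String specVersion) {
        String[] parts = specVersion.trim().split("\\.");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    /** True if the version is Java 1.6 or newer (assumed to be inclusive). */
    static boolean isSupportedJava(String specVersion) {
        return javaFeatureVersion(specVersion) >= 6;
    }

    /** True if some audio capture line supports the assumed format. */
    static boolean hasMicrophone() {
        AudioFormat format = new AudioFormat(16000f, 16, 1, true, false);
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
        return AudioSystem.isLineSupported(info);
    }

    public static void main(String[] args) {
        String os = System.getProperty("os.name");
        String java = System.getProperty("java.specification.version");
        System.out.println("Windows: " + isWindows(os));
        System.out.println("Java 1.6 or above: " + isSupportedJava(java));
        System.out.println("Microphone available: " + hasMicrophone());
    }
}
```

A capture-line probe only shows that an input device is present; it cannot
tell whether the user is actually speaking into it.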

2.7 Architecture Design:


2.8 Use Case Diagram:

Speech Synthesizer


Speech Recognizer

2.9 Sequence Diagrams:

2.10 Assumptions and Dependencies:

I. Assumptions:

It is assumed that the user has basic knowledge of the system being used.
The user can understand and speak English well (US English is preferred).
Lack of clarity may result in inefficient working of the software.
The operating system on which the software is installed is Windows. If this
operating system is not available, certain amendments will be needed in the
SRS.
JDK 1.6 or above is present in the system; otherwise, changes to the
features and code will be required.
It is assumed that the user is making use of a microphone.

II. Dependencies:

Hardware Dependency:

The working of the software depends on the hardware components present in
the system. For instance, a microphone is used for the speech recognition
part of the software. The user is expected to give commands through the
microphone so that they are captured clearly.

Software Dependency:

This software requires the presence of a particular operating system, that
is, Windows. Another software dependency is that the system on which the
software is installed must have Java version 1.6 or above.

External Dependency:

This software is not suited to people who do not have a proper
understanding of the English language. Since the Synthesizer reads any file
in US English, the user is expected to understand that language. The
recognizer likewise accepts only words spoken in proper English.

3.0 Specific Requirements:

3.1 Use Case Reports:

3.1.1 Speech Synthesizer:
User: The user supplies the text in a file. The file may be of any
format.

System: The system analyzes the text provided by the user and then
performs linguistic analysis to generate a waveform, which the user
hears through the speakers.
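
One plausible first step of the text analysis described above is to split the
input into sentences before passing them to the synthesizer one at a time.
This is a sketch under that assumption only: the requirements do not state
how the text is analyzed, and the class name SentenceSplitter is
hypothetical. It relies on the standard java.text.BreakIterator class.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical pre-processing step: breaks text into sentences. */
final class SentenceSplitter {

    /** Returns the trimmed, non-empty sentences found in the text. */
    static List<String> split(String text) {
        List<String> sentences = new ArrayList<String>();
        // US English is the language assumed throughout this document.
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.US);
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE;
                start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Each sentence would be handed to the synthesizer in turn.
        for (String s : split("Hello world. How are you?")) {
            System.out.println(s);
        }
    }
}
```

Working sentence by sentence lets speech output begin before the whole file
has been processed.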

3.1.2 Speech recognizer:
User: The user gives commands to the system through a
microphone.

System: The system recognizes the commands and performs the
corresponding task.

3.2 Supplementary Requirements:

Microphone
Windows Operating System
JDK v1.6 and above
