
ADITYA COLLEGE OF ENGINEERING

PUNGANUR ROAD, MADANAPALLE-517325


IV-B.Tech (R13) II Sem- II Internal Examinations April-2017 (Descriptive) (CODE A)
(13A05802) NATURAL LANGUAGE PROCESSING (Computer Science & Engineering)
Time: 90 min Max Marks: 30

Part A
(Compulsory)
1. Answer the following questions.
a. Explain meaning representation of natural language using FOPC.
We can express the meaning in natural language. In order to analyze and manipulate the meaning of sentences, we will
transform the sentences into a meaning representation language. The simplest form is propositional logic, but it is not
powerful enough for our purposes. Predicate logic combines predicates and their arguments. Predicate logic has simple
rules of inference, such as modus ponens. There are many other issues which we may need to address in our meaning
representation language:
generalized quantifiers (some, most, ...)
tense and aspect
modality and belief (need to allow formulas as arguments: John believes Fred likes Mary =
believe(John,like(Fred,Mary)) )
presupposition (All the men on Mars drink Coca-Cola.)
fuzziness (The milk is warm.)
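As a small illustration (a hedged sketch using Python tuples; the encoding and names are assumptions, not part of the original answer), a nested FOPC-style term such as believe(John, like(Fred, Mary)) can be built and printed as follows:

# A minimal sketch: FOPC-style meaning representations as nested tuples
# of the form (predicate, arg1, arg2, ...). Names are illustrative only.
def pred(name, *args):
    # Build a predicate application such as like(Fred, Mary)
    return (name,) + args

# "John believes Fred likes Mary" -> believe(John, like(Fred, Mary))
lf = pred("believe", "John", pred("like", "Fred", "Mary"))

def render(term):
    # Render a nested term back into FOPC-like notation
    if isinstance(term, tuple):
        name, *args = term
        return name + "(" + ", ".join(render(a) for a in args) + ")"
    return str(term)

print(render(lf))   # believe(John, like(Fred, Mary))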
b. What are the contents of an iplan?
Since an agent's behavior does not always follow a well-formed plan, the plan recognition component must be able to recognize actions based on a more general model of intentional behavior. For this purpose we can use a structure called an iplan (intentional plan), which consists of:
1. A tree hierarchy of goals, with one leaf goal marked as the active goal.
2. A sequence of actions, where the final action achieves the active goal, and one action is marked as the current action.
The plan inference algorithm is adapted to model decomposition chaining, and other reasoning is used to fill out the goal hierarchy and plan the sequence of actions to achieve the active goal. The agent's behavior is defined in terms of a set of rational updates to the iplan, where each update is a relation between two iplans.
c. Explain about local discourse state.
Each discourse segment is associated with a local discourse state which consists of the following:
1. The sentences that are in the segment
2. The local discourse context, generated from the sentences in the segment.
3. The semantic content of the sentences in the segment together with the semantic relationships that make the
segment coherent.

d. Describe language as a multi-agent activity.


Language and all other forms of communication necessarily involve multiple agents. Communication cannot occur if one of the agents does not recognize the other's attempt to communicate. Communication can occur only when one agent intends to communicate and the other agent recognizes that intention. The crucial requirement for communication is that there be an agreed-upon set of conventions with agreed-upon meanings.
To illustrate the distinction between the physical act performed and the communicative act, three acts are distinguished:
The locutionary act - the act of uttering a sequence of words
The illocutionary act - the act that the speaker performs in saying the words
The perlocutionary act - the act that actually occurs as a result of the utterance

Language as a multi-agent activity requires some means for the agents to coordinate their communicative acts and to monitor whether they are being understood. In dialogue, agents use various mechanisms to signal that they understand each other.

e. What are the sources of intentions in a dialogue?


Intentions arise from a combination of the agent's beliefs and desires. New intentions arise from the decision to do something to change the way things will be if the agent continues on its present course. A particular conversational agent will have its desires specified as part of a single desirability function. The attentiveness constraint on the agent's behavior helps handle the case where a question is asked. The sincerity condition plays an important role in many approaches to speech acts. With the addition of the attentiveness constraint and a desire to be sincere, there are enough constraints to drive the behavior of a simple question-answering agent. To handle misconceptions in a dialogue, a shared-knowledge preference can be added to the desirability function: an agent prefers states that minimize the difference between what each agent believes is shared knowledge. The helpfulness preference favors responses that allow the other agent to achieve their intentions more efficiently. All other things being equal, the conciseness preference favors shorter responses.
Part-B
UNIT-III & IV
2.(a) Explain stochastic part of speech tagging using HMM and Viterbi algorithm with an example.
With transition probability network, we can compute the probability of any sequence of categories simply by
finding the path through the network indicated by the sequence and multiplying the transition probabilities
together.
The validity of this depends on the assumption that the probability of a category occurring depends only on
the immediately preceding category.

This assumption is referred to as the Markov assumption, and networks are called Markov Chains.

The network representation can now be extended to include the lexical generation probabilities as well: we
allow each node to have an output probability which gives a probability to each possible output that could
correspond to the node.

For instance, the node N in the figure would be associated with a probability table that gives, for each word, how
likely that word is to be selected if we randomly select a noun.

A network like that in Figure with output probabilities associated with each node is called a Hidden Markov
Model (HMM).

More generally, the approximate probability of generating a sentence w1, ..., wT together with the sequence of
tags C1, ..., CT is

∏ i=1..T  Pr(Ci | Ci−1) × Pr(wi | Ci)

Finding the Most Likely Tag Sequence

To find the most likely sequence, sweep forward through the words one at a time finding the most likely
sequence for each ending category.
In other words, you find the four best sequences for the two words Flies like: the best ending with like as a V,
the best as an N, the best as a P and the best as an ART.

You then use this information to find the four best sequences for the words flies like a, each one ending in
a different category.

This process is repeated until all the words are accounted for.

The algorithm is usually called the Viterbi algorithm.

For a problem involving T words and N lexical categories, the Viterbi algorithm is guaranteed to find the
most likely sequence using kTN² steps for some constant k.

The Viterbi Algorithm


Given word sequence w1, ..., wT, lexical categories L1, ..., LN, lexical probabilities Prob(wt | Li) and bigram probabilities
Prob(Li | Lj), find the most likely sequence of lexical categories C1, ..., CT for the word sequence.

Initialization Step
for lexcat = 1 to N do
    SeqScore(lexcat, 1) = Prob(w1 | Llexcat) × Prob(Llexcat | <start>)
    BackPtr(lexcat, 1) = 0

Iteration Step
for t = 2 to T do
    for lexcat = 1 to N do
        SeqScore(lexcat, t) = Maxj=1,N (SeqScore(j, t−1) × Prob(Llexcat | Lj)) × Prob(wt | Llexcat)
        BackPtr(lexcat, t) = index of the j that gave the max above

Sequence Identification Step
CT = the lexcat that maximizes SeqScore(lexcat, T)
for word = T−1 down to 1 do
    Cword = BackPtr(Cword+1, word+1)
1. It's not clear what happens to BackPtr(-,-) if there is a tie for the maximizer of SeqScore(lexcat, t).
2. If SeqScore(lexcat, t) = 0, this says the sequence is impossible, so there is no point in having a BackPtr in this
case, and there is no point in considering sequences with Ct = lexcat any further.

3. In the tables in the examples below, the name of the lexcat is used rather than the number.

lexcat      SeqScore(lexcat, 1)      BackPtr(lexcat, 1)
V           7.6 × 10^−6              0
N           0.00725                  0
P           0                        0
ART         0                        0
Sample computation in iteration step:

SeqScore(V, 2) = Max[SeqScore(N, 1) × Prob(V | N),
                     SeqScore(V, 1) × Prob(V | V)] × Prob(like | V)
               = Max[0.00725 × 0.43, 7.6 × 10^−6 × 0.0001] × 0.1
               = 3.12 × 10^−4

The category that maximizes SeqScore(V, 2) is N, so BackPtr(V, 2) is N.

lexcat      SeqScore(lexcat, 1)      SeqScore(lexcat, 2)      BackPtr(lexcat, 2)
V           7.6 × 10^−6              0.00031                  N
N           0.00725                  1.3 × 10^−5              N
P           0                        0.00022                  N
ART         0                        0                        -
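The following is a minimal Python sketch of the same computation (the probability tables and function names are illustrative assumptions, not values taken from the tables above):

# A minimal Viterbi sketch for HMM part-of-speech tagging.
def viterbi(words, tags, start_p, trans_p, emit_p):
    # start_p[t]       = Prob(t | <start>)
    # trans_p[t][prev] = Prob(t | prev)
    # emit_p[t][w]     = Prob(w | t)
    T = len(words)
    score = [dict() for _ in range(T)]
    back = [dict() for _ in range(T)]

    # Initialization step
    for t in tags:
        score[0][t] = start_p.get(t, 0.0) * emit_p[t].get(words[0], 0.0)
        back[0][t] = None

    # Iteration step
    for i in range(1, T):
        for t in tags:
            best_prev, best_val = None, 0.0
            for prev in tags:
                val = score[i - 1][prev] * trans_p[t].get(prev, 0.0)
                if val > best_val:
                    best_prev, best_val = prev, val
            score[i][t] = best_val * emit_p[t].get(words[i], 0.0)
            back[i][t] = best_prev

    # Sequence identification step: follow back pointers from the best final tag
    last = max(tags, key=lambda t: score[T - 1][t])
    seq = [last]
    for i in range(T - 1, 0, -1):
        seq.append(back[i][seq[-1]])
    return list(reversed(seq))

# e.g. viterbi(["flies", "like", "a", "flower"], ["N", "V", "ART", "P"], start_p, trans_p, emit_p)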

(b) Describe word senses and ambiguity.


To develop a theory of semantics and semantic interpretation, we need to develop a structural model of word meaning. Each of a word's dictionary definitions reflects a different sense of the word. Dictionaries often give synonyms for particular word senses. If every word has one or more senses, then we are looking at a very large number of senses, even given that
some words have synonymous senses. Fortunately, the different senses can be organized into a set of broad classes of
objects by which we classify the world. The set of different classes of objects in a representation is called its ontology.
To handle a natural language, we need a much broader ontology than commonly found in work on formal logic. Two
of the most influential classes are actions and events. Events are things that happen in the world and are important in
many semantic theories because they provide a structure for organizing the interpretation of sentences. Actions are
things that agents do, thus causing some event. Like all objects in the ontology, actions and events can be referred to
by pronouns, as in the discourse fragment We lifted the box. It was hard work. Here, the pronoun "it" refers to the
action of lifting the box. Another very influential category is the situation. A situation refers to some particular set of
circumstances and can be viewed as subsuming the notion of events. In many cases a situation may act like an
abstraction of the world over some location and time. For example, the sentence "We laughed and sang at the football
game" describes a set of activities performed at a particular time and location, described as the situation "the football
game".

Ambiguity is a serious problem during semantic interpretation. We can define a word as being semantically
ambiguous if it maps to more than one sense. But this is more complex than it might first seem, because we need to
have a way to determine what the allowable senses are. A few linguistic tests have been suggested to define the notion
of semantic ambiguity more precisely. One effective test exploits the property that certain syntactic constructs
typically require references to identical classes of objects. Virtually all senses involve some degree of vagueness, as
they might always allow some more precise specification. A similar ambiguity test can be constructed for verb senses
as well. In addition to lexical ambiguity, there is considerable structural ambiguity at the semantic level. Some forms
of ambiguity are parasitic on the underlying syntactic ambiguity. But other forms of structural ambiguity are truly
semantic and arise from a single syntactic structure. A very common example involves quantifier scoping. Quantifiers
also vary with respect to vagueness. The quantifier all is precise in specifying every member of some set, but a
quantifier such as many, as in Many people saw the accident, is vague as to how many people were involved.

A very important aspect of context-independent meaning is the co-occurrence constraints that arise between word
senses. Often the correct word sense can be identified because of the structure and meaning of the rest of the sentence.
One of the most important tasks of semantic interpretation is to utilize constraints such as this to help reduce the
number of possible senses for each word.
or
3. (a) Explain about Best First Parsing algorithms in detail.
Algorithms can be developed that attempt to explore the high-probability constituents first; these are called best-first parsing algorithms. The best parse can be found quickly and much of the search space, containing lower-rated possibilities, is
never explored. All the chart parsing algorithms can be modified fairly easily to consider the most likely constituents
first. The central idea is to make the agenda a priority queue - a structure where the highest rated elements are always
first in the queue. The parser then operates by always removing the highest-ranked constituent from the agenda and
adding it to the chart. With the modified algorithm, if the last word in the sentence has the highest score, it will be
added to the chart first. The problem this causes is that you cannot simply add active arcs to the chart (and depend on
later steps in the algorithm to extend them). In fact, the constituent needed to extend a particular active arc may
already be on the chart. Thus, whenever an active arc is added to the chart, you must check to see if it can be extended
immediately, given the current chart. Thus we need to modify the arc extension algorithm accordingly.
Adopting a best-first strategy makes a significant improvement in the efficiency of the parser; a small illustrative sketch of the priority-queue agenda is given below.
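The sketch below (in Python; the constituent encoding and scores are invented assumptions, and a full parser would also manage active arcs and the chart) shows how the agenda can be kept as a priority queue so the highest-rated constituent is always removed first:

import heapq

# A minimal sketch of a best-first agenda: a priority queue that always
# returns the highest-scored constituent first.
class Agenda:
    def __init__(self):
        self._heap = []
        self._count = 0                     # tie-breaker for equal scores

    def add(self, score, constituent):
        # heapq is a min-heap, so the score is negated to pop the best first
        heapq.heappush(self._heap, (-score, self._count, constituent))
        self._count += 1

    def pop_best(self):
        _, _, constituent = heapq.heappop(self._heap)
        return constituent

    def __bool__(self):
        return bool(self._heap)

agenda = Agenda()
agenda.add(0.00725, ("N", "flies"))
agenda.add(0.00031, ("V", "like"))
print(agenda.pop_best())    # ('N', 'flies') - the highest-rated constituent comes out first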
Write a short note on Probabilistic Context Free Grammar.
A probabilistic context-free grammar (PCFG) is a context-free grammar with a probability assigned to every rule of the grammar, such that the probabilities of all rules expanding the same non-terminal sum to 1. Non-terminals that expand in more than one way distribute their probability mass over their rules. The motivation behind augmenting CFGs with probabilities lies in the fact that in the real world phrases are not uniformly distributed. PCFG parsing takes advantage of probabilities by giving the most probable parse for a sentence, which makes for a more accurate natural language understander. Parsing can be done in O(n³) time using the CKY parsing algorithm or its variants. PCFGs are still not powerful enough to describe context-sensitive languages; they describe exactly the same languages as their non-stochastic counterparts.
Properties of PCFG:
1. A PCFG assigns a probability to each leftmost derivation, or parse tree, allowed by the underlying CFG.
2. Given a sentence S, let T(S) be the set of derivations for that sentence. The PCFG assigns a probability P(t) to each member t of T(S), so we now have a ranking of parses in order of probability.
3. The most likely parse tree for a sentence S is the arg max over t ∈ T(S) of P(t).
Example :
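As an illustration (a hedged sketch; the toy grammar, rule probabilities, and tree encoding below are invented assumptions rather than an example from the text), the probability of a parse tree is simply the product of the probabilities of the rules used in it:

from math import prod

# A minimal sketch: scoring one parse tree under a toy PCFG.
pcfg = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("time",)): 0.6,
    ("NP", ("flies",)): 0.4,
    ("VP", ("flies",)): 0.3,
    ("VP", ("V", "NP")): 0.7,
    ("V", ("like",)): 1.0,
}

def tree_prob(tree):
    # A tree is (label, child1, child2, ...) where leaves are plain strings.
    # P(tree) = product of the probabilities of all rules used in the tree.
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = pcfg[(label, rhs)]
    return p * prod(tree_prob(c) for c in children if not isinstance(c, str))

t = ("S", ("NP", "time"), ("VP", "flies"))
print(tree_prob(t))    # 1.0 * 0.6 * 0.3 = 0.18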

(b) Describe Semantic interpretation using Feature Unification.


Many systems do not explicitly use lambda expressions and perform semantic interpretation directly using feature
values and variables. The basic idea is to introduce new features for the argument positions that earlier would have
been filled using lambda reduction. For instance, instead of using the following rule

(S SEM (?semvp ?semnp)) -> (NP SEM ?semnp) (VP SEM ?semvp)

a new feature SUBJ is introduced, and the rule becomes

(S SEM ?semvp) -> (NP SEM ?semnp) (VP SUBJ ?semnp SEM ?semvp)

The SEM of the subject is passed into the VP constituent as the SUBJ feature and the SEM equations for the VP insert
the subject in the correct position. The new version of rule that does this is

(VP VAR ?v SUBJ ?semsubj SEM (?semv ?v ?semsubj ?semnp)) ->

(V[_none] SEM ?semv) (NP SEM ?semnp)


The figure shows how this rule builds the SEM of the sentence Jill saw the dog. Comparing this to the analysis built using the earlier grammar, the differences appear in the treatment of the VP. Here the SEM is the full proposition with the subject inserted, whereas before the SEM was a lambda expression that would be applied to the subject later in the rule that builds the S. The modified grammar looks as follows:

1. (S SEM ?semvp) ->

(NP SEM ?semsubj) (VP SUBJ ?semsubj SEM ?semvp)

2. (VP VAR ?v SUBJ ?semsubj SEM (?semv ?v ?semsubj)) ->

(V[_none] SEM ?semv)

3. (VP VAR ?v SUBJ ?semsubj SEM (?semv ?v ?semsubj ?semnp)) ->

(V[_np] SEM ?semv) (NP SEM ?semnp)

4. (NP VAR ?v SEM (PRO ?v ?sempro)) -> (PRO SEM ?sempro)

5. (NP VAR ?v SEM (NAME ?v ?semname)) -> (NAME SEM ?semname)

6. (NP VAR ?v SEM <?semart ?v ?semcnp>) ->

(ART SEM ?semart) (CNP SEM ?semcnp)

7. (CNP VAR ?v SEM (?semn ?v)) -> (N SEM ?semn)

Head features for S, VP, NP, CNP: VAR.
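A minimal sketch of the same idea in Python (the function names, SEM encodings, and the analysis of the example sentence are illustrative assumptions): the subject's SEM is passed into the VP as a SUBJ value and inserted directly, with no lambda reduction step.

# A minimal sketch of semantic interpretation with feature values instead of
# lambda reduction. Encodings are illustrative assumptions only.
def vp_rule(verb_sem, event_var, subj_sem, obj_sem):
    # Rule 3: (VP VAR ?v SUBJ ?semsubj SEM (?semv ?v ?semsubj ?semnp))
    return (verb_sem, event_var, subj_sem, obj_sem)

def s_rule(subj_sem, vp_builder):
    # Rule 1: the subject's SEM is passed into the VP as its SUBJ feature,
    # and the VP's SEM (with the subject inserted) becomes the S's SEM.
    return vp_builder(subj_sem)

# "Jill saw the dog"
subj = ("NAME", "j1", "Jill")
obj = ("THE", "d1", "DOG1")
vp = lambda semsubj: vp_rule("SEES1", "ev1", semsubj, obj)

print(s_rule(subj, vp))
# ('SEES1', 'ev1', ('NAME', 'j1', 'Jill'), ('THE', 'd1', 'DOG1'))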

UNIT- IV & V
4. (a) Explain Plan Inference algorithm and its limitations.
Whenever an action is added to a plan, all its preconditions and effects must be included with the appropriate
arcs linking them to the actions. The A-set, P-set, E-set must be updated with the new information. There are 3
classes of input depending on whether the new sentence describes
An action
A state
A goal.
The algorithm considers each case separately.
Case 1: when the input describes a goal G:
1. If E is the empty plan then
   1.1 If G describes an action, add the action and its preconditions and effects to the E-plan and mark the action as the goal.
   1.2 If G describes a state, then find all actions A1, A2, ..., An that could have G as an effect. Create a new E-plan for each Ai as described in step 1.1.
2. If E is not empty then
   2.1 Try to incorporate G into E.
   2.2 If 2.1 failed, then let Old G be the goal of E. Build the possible new E-plans with G as the goal. Then try to incorporate Old G into the new E-plans. For those that match successfully, add Old G and the old E-plan into the new E-plan.
Case 2: Incorporating action
If the E-plan is empty then incorporate the action using the incorporate goal algorithm.
If the E-plan is not empty then you can check for matches into the plan in 3 ways:
1. The action matches an action in the A-set.
2. The action has an effect that matches a state in the P-set.
3. The action has a precondition that matches a state in the E-set.
For each match found, a new E-plan can be generated by adding the action and the links appropriate to the matched part of the E-plan.
Whenever an action is added, all its preconditions and effects are added as well.
If no match is found using these techniques, then you may expand the actions in the E-plan and try again.

Case 3: The final class of input to be handled involves sentences that describe states but are not goal statements. Such statements are typically used to provide background or to describe the effects of actions in the plan. If the current E-plan is empty, then the state should be interpreted as background information. This is done by adding the state to the E-set and P-set of the plan. While it is not yet an effect or precondition of an action currently in the plan, this allows the state to be connected to actions that are incorporated later.
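A minimal sketch of the Case 2 matching step described above (the E-plan fields, action format, and example literals are illustrative assumptions only):

# A minimal, illustrative sketch of incorporating an action into a
# non-empty E-plan via the three match types.
def incorporate_action(eplan, action):
    matches = []
    if action["name"] in eplan["A_set"]:          # 1. matches an action in the A-set
        matches.append(("action-match", action["name"]))
    for eff in action["effects"]:
        if eff in eplan["P_set"]:                 # 2. an effect matches a state in the P-set
            matches.append(("effect-match", eff))
    for pre in action["preconds"]:
        if pre in eplan["E_set"]:                 # 3. a precondition matches a state in the E-set
            matches.append(("precond-match", pre))
    return matches   # each match would yield a new candidate E-plan

eplan = {"A_set": {"board(plane)"}, "P_set": {"have(ticket)"}, "E_set": {"at(gate)"}}
action = {"name": "buy(ticket)", "effects": ["have(ticket)"], "preconds": ["at(counter)"]}
print(incorporate_action(eplan, action))   # [('effect-match', 'have(ticket)')]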

Limitations:
1. It may fail to find the connection between two sentences in a discourse.
2. It cannot handle undesirable states or suggest goals.

(b) Explain Statistical Word sense Disambiguation in detail.


Selectional restrictions provide only a coarse classification of acceptable and unacceptable forms, so many cases of sense ambiguity cannot be resolved by them alone. Simple unigram statistics help to resolve such cases: for an ambiguous word, the sense that occurs most frequently in a sense-tagged corpus can be chosen as the interpretation.
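A minimal sketch of this unigram approach (the word, its senses, and the counts below are invented for illustration):

from collections import Counter

# Hypothetical counts of how often each sense of "bass" occurred in a tagged corpus
sense_counts = Counter({"bass/fish": 120, "bass/music": 480})

def most_likely_sense(counts):
    # Pick the sense with maximum unigram probability (relative frequency)
    total = sum(counts.values())
    probs = {sense: c / total for sense, c in counts.items()}
    return max(probs, key=probs.get), probs

sense, probs = most_likely_sense(sense_counts)
print(sense, probs)   # bass/music {'bass/fish': 0.2, 'bass/music': 0.8}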
or
5. Write short notes on the following
a. Lambda Calculus and Lambda Reduction
The lambda calculus provides a formalism to express logical forms with missing components. Although the lambda calculus has the power to represent all computable functions, its uncomplicated syntax and semantics provide an excellent vehicle for studying the meaning of programming language concepts. All the functional programming languages can be viewed as syntactic variations of the lambda calculus, so that both their semantics and implementation can be analyzed in the context of the lambda calculus. Denotational semantics, one of the foremost methods of formal specification of languages, grew out of research in the lambda calculus and expresses its definitions using the higher-order functions of the lambda calculus.
Church's lambda notation allows the definition of an anonymous function.
E.g. cube : Integer -> Integer where cube(n) = n³
λn. n³ defines the function that maps each n in the domain to n³. The number and order of the parameters to the function are specified between the λ symbol and the body expression.
Lambda notation resolves ambiguity by specifying the order of the parameters.
E.g. λn. λm. n² + m
β-reduction
Simplification of a λ-expression is called β-reduction.
Let P(X) be an expression involving the variable X. Read (λX. P(X))(a) as "λX. P(X) applied to a"; then β-reduction says that (λX. P(X))(a) = P(a), i.e. to apply λX. P(X) to a we replace every occurrence of X in P(X) by a.
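A minimal sketch relating these ideas to Python lambdas (the predicate name and arguments are illustrative assumptions):

# λn. n**3 : the anonymous cube function
cube = lambda n: n ** 3
print(cube(4))            # 64 - beta-reduction replaces n by 4 in n**3

# λn. λm. n**2 + m : parameter order resolves the ambiguity
f = lambda n: lambda m: n ** 2 + m
print(f(3)(10))           # 19 - first n := 3, then m := 10

# (λX. P(X))(a) = P(a): applying a lambda-wrapped predicate to an argument
P = lambda X: "happy(" + X + ")"
print(P("fido"))          # happy(fido)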

b. Structural Cue Phrases.


A cue phrase is a connective expression, such as now, meanwhile, anyway, or on the other hand, that links spans of discourse and signals semantic relations in a text. Each cue phrase can be classified in terms of three factors:
1. Whether it indicates the termination of a segment
2. The start of a new segment
3. The continuation of a segment
An individual cue phrase may indicate one or more of these factors. These phrases may have effects on inferential processing as well.

c. Discourse structure & reference.


A reference problem motivates having a hierarchical discourse structure. The attentional stack provides the mechanism that accounts for this problem. A hierarchical discourse model provides an elegant solution to the reference problem while retaining intuitions about the importance of recency. Cue phrases signal the segment boundaries; otherwise it becomes quite difficult to detect the boundaries when a discourse has no explicit signal.
The reference analysis itself may suggest the segmentation.
d. Thematic roles
Thematic roles are one of the oldest classes of constructs in linguistic theory. Thematic roles are used to indicate the role played by each entity in a sentence and range from very specific to very general. The entities that are labelled should have participated in an event. Some of the domain-specific roles are from-airport, to-airport, and depart-time. Some of the verb-specific roles are eater and eaten for the verb eat. Although there is no consensus on a definitive list of semantic roles, some basic semantic roles such as agent, instrument, etc. are used by all.

Examples of Thematic Roles

AGENT

The agent is the one who performs an action. AGENT is a label representing the role of an agent.
Joe played well and won the prize.
Here, Joe is the person who did the playing.

CAUSE

The cause is something that causes an event or is the reason for it happening.


Rain makes me happy.
Here, rain causes happiness and so is the cause.

EXPERIENCER

The one who experiences something is the experiencer.


Johan felt great pain when he heard of the sudden demise of his friend.
Here, Johan experienced the pain, so he is the experiencer.

BENEFICIARY

I prayed early in the morning for Susan.


Here Susan is the beneficiary.

LOCATION

Steve was swimming in the river.


Needless to say, the river is the location.

MANNER

Tom behaved very gently even when he was insulted.


Here gently should be labelled as manner.

INSTR

Tom broke the wooden box with the hammer.


Here hammer is the instrument used to break the wooden box

Specific Roles

FROM-LOC

means from location.

John received the prize from the President.

TO-LOC

means to location.

Susan threw a pen to John.


AT-LOC

This label means at location.

The box contains a ball.

AT-TIME

This label means at time.

I woke up at 5 o'clock to prepare for the examination.

e. Semantic Networks.
A semantic network or net is a graph structure for representing knowledge in patterns of interconnected nodes and
arcs. Computer implementations of semantic networks were first developed for artificial intelligence and machine
translation, but earlier versions have long been used in philosophy, psychology, and linguistics. The Giant Global
Graph of the Semantic Web is a large semantic network.
What is common to all semantic networks is a declarative graphic representation that can be used to represent
knowledge and support automated systems for reasoning about the knowledge. Some versions are highly informal, but
others are formally defined systems of logic. Following are six of the most common kinds of semantic networks:
1.Definitional networks
2.Assertional networks
3.Implicational networks
4.Executable networks
5.Learning networks
6.Hybrid networks
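A minimal sketch of a definitional semantic network as labelled edges (the nodes, relations, and inheritance helper below are illustrative assumptions):

semantic_net = [
    ("canary", "isa", "bird"),
    ("bird", "isa", "animal"),
    ("bird", "has-part", "wings"),
    ("canary", "color", "yellow"),
]

def isa_chain(node, net):
    # Follow 'isa' arcs upward to support simple inheritance reasoning
    chain = [node]
    while True:
        parents = [o for (s, r, o) in net if s == node and r == "isa"]
        if not parents:
            return chain
        node = parents[0]
        chain.append(node)

print(isa_chain("canary", semantic_net))   # ['canary', 'bird', 'animal']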
ADITYA COLLEGE OF ENGINEERING
PUNGANUR ROAD, MADANAPALLE-517325
IV-B.Tech (R13) II Sem- II Internal Examinations April-2017 (Objective) (CODE A)
(13A05802) NATURAL LANGUAGE PROCESSING (Computer Science & Engineering)
Name: Roll No:
Time: 20 min Max Marks: 10

Answer all the questions                                   5 × 1 = 5 M


1. Define Modal operators & Model structure.
A modal operator is an NLP term used to identify specific words that enable us to identify our rules. There are six types of modal operators; for example, modal operators of necessity include should, must, ought to, have to, and supposed to.
2. Define HMM.
The Hidden Markov Model (HMM) is a popular statistical tool for modelling a wide
range of time series data. In the context of natural language processing(NLP), HMMs have
been applied with great success to problems such as part-of-speech tagging and noun-phrase
chunking.
3. Define predicate.
In NLP terms, visual, auditory, kinesthetic and auditory digital words are called predicates. The predicates that a
person uses will provide you with an indication of the person's preferred representational system.
4. What is Discourse?
It deals with how the immediately preceding sentence can affect the interpretation of the next sentence.
5. Semantic Interpretation of Prepositional Phrases.
PPs can modify an NP or a VP, or may be subcategorized for by a head word, in which case the preposition
acts more as a flag for an argument than as an independent predicate.

Examples:
The man in the corner ate lunch.
PP modifies NP the man

The dog barked in the alley.


PP modifies VP barked

She is ready to take up the challenge.


up flags object of "take-up"

With a PP modifying an NP, the sem of the PP is a unary predicate to be applied to the sem of the
NP:

PP(sem(lambda(Y, ?semp(Y, ?semnp)))) -> P(sem(?semp)) NP(sem(?semnp))

Given in the corner if the sem of the in is at_loc1, and the sem of the NP is the<c1, corner1>, then
the sem of the PP would be the unary predicate
lambda(Y, at_loc1(Y, the<c1, corner1>))

In the context the man in the corner, we need a rule to attach the PP to the CNP (common noun
phrase) man:

CNP(sem(lambda(N1, &(?semcnp(N1), ?sempp(N1))))) -> CNP(sem(?semcnp)) PP(sem(?sempp))

[Here & means "and". It is used here as a prefix operator. Thus &(happy(fido), dog(fido)) would
mean "Fido is happy and Fido is a dog". &(man1(x), in_loc1(x, the<c1, corner1>)) means "x is a
man and (&) x is in the corner".]
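A minimal sketch of the PP-as-unary-predicate idea in Python (the predicate and constant names are illustrative assumptions):

# SEM of "in the corner": lambda Y. at_loc1(Y, the<c1, corner1>)
the_corner = ("the", "c1", "corner1")
pp_sem = lambda Y: ("at_loc1", Y, the_corner)

# Modifying the CNP "man": lambda x. &(man1(x), at_loc1(x, the corner))
cnp_sem = lambda x: ("&", ("man1", x), pp_sem(x))

print(cnp_sem("x"))
# ('&', ('man1', 'x'), ('at_loc1', 'x', ('the', 'c1', 'corner1')))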

Choose the correct answer from the following questions.                    10 × 1/2 = 5 M


1. Bayes theorem is based on
a) Conditional Probability b)Prior Probability c) Context Probability d)none [ a ]
2. ____is the use of ratios from the corpus as the probability to predict the interpretation of the new sentence. [ a ]
a) Maximum Likelihood Estimator b) Minimum Likelihood Estimator c) Medium Likelihood Estimator d)None
3. Viterbi algorithm uses ____ probability model.
a) Random b) Gaussian c) Bigram d) None [ c ]
4. The representation of context-independent meaning is called the logical form.
a) True b) False [ a ]
5. The set of different classes of objects in a representation is called its_______
a) passive form b) Ontology c) noun form d) All the above [ b ]
6. A word is being semantically ambiguous if it maps to more than one sense.
a) True b) False [ a ]
7. In __ you cannot substitute freely equal terms when they occur within the scope of a modal operator
a) PROLOG b) Failure of substitution c) Failure of ambiguity d) LOGIC [ b ]
8. Quasi logical form is an example of separate level of representation from the logical form.
a) True b) False [ a ]
9. Inverse to Lambda reduction is
a)Lambda Abstraction b) Lambda Deduction c) Lambda incrementation d) All the above. [ a ]
10.The amount of text examined for each word is called the
a) Window b) Collocation c) Mutual Information d) All of the mentioned [ a ]
