Personnel ID | First Name | Last Name | Position | Work phone | Home phone
The DEF company keeps track of all the work phone and home phone numbers
of its employees. Furthermore, the DEF company records the current position of
each employee. The DEF company uses the employee ID to identify an
employee, although most employees use the combination of first and last name,
even though that combination is not necessarily unique.
Step 1b: Split into elementary facts
We prefer to work with elementary or ‘stand-alone’ facts. Hence, we will split
up sentences 2, 6 and 10 into 2 ‘stand-alone’ sentences that each contain exactly
one fact. Later we will discuss a heuristic to formally check whether such a
splitting process leads to elementary facts.
The entity type of the entities student 126734 and student 123546 is student. The
entity type of the entities employee piet and employee jake is employee.
A name class is a set of names that can be used for identifying instances of a
given entity type (entities).
The name class of 10-digit codes (e.g. 0437885367, 0456356378) that can be
used to identify Dutch phone extensions.
The name class of student IDs (e.g. 997564, 995678) that can be used to
identify the UM students.
The name class of integers (1, 2, 3, …) that can be used to identify an amount.
We will now apply the qualifying procedure to our example. Once again, for
each variable part in the sentence type we ask:
By (names from) what name class are entities of the entity type identified?
For the second variable part in the example sentence group we ask the same
questions:
By (names from) what name class are entities of the entity type identified?
In case the name class and/or entity type is part of any of the fixed parts of the
sentence types, we relabel these parts as name class and/or entity type
respectively, until we have processed all variable parts in a sentence group. In
some initial verbalizations all entity types and name classes will be contained in
the fixed parts of the sentence groups. E.g. in the following sentence: The car
with license plate number 23-JK-56 is owned by the person with person id
234577, all entity types (car, person) and name classes (license plate number,
person id) are contained in the initial verbalization. In case the entity type and/or
name class designators are not contained in the initial verbalization, they will be
added (this is the process of qualification). The remaining parts of the fixed parts
are by definition labeled as verb, sometimes omitting semantically irrelevant fill-up
words. Applying this procedure to our example sentence groups yields:
Group 1:
Group 2:
Group 3:
Group 4 (see footnote 2):
Group 5:
The sentence format that contains all entity types and name classes explicitly will
be called a deep structure sentence.
Let's take a look again at the following classified and qualified sentences from
the example group 1:
Footnote 2: For sentence groups 4 and 5 we only give a name class for the 2nd variable part in the
fact type form, because it basically refers to an alternative name for the entity type of the
first variable part (i.e. Employee).
In figure 1 we will show the transformation template for the creation of diagrams
starting from those fully classified and qualified expressions (deep structure
sentences).
Figure 1 gives an exact mapping from the elements in the deep structure sentence
to the elements in the information diagram. Note that the individual names
from the variable parts of the sentence groups can be considered as being
elements of the fact type population. The fixed parts can be considered as being
elements of the fact type form.
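The relation between a fact type form (the fixed parts) and a fact type population (the individual names) can be sketched in a few lines of Python. This is only an illustration: the wording of the form below and the helper name `verbalize` are assumptions, not the form used in the course material.

```python
import re


def verbalize(fact_type_form: str, fact: dict) -> str:
    """Fill every <role> placeholder in the form with the name that plays that role."""
    return re.sub(r"<(\w+)>", lambda match: fact[match.group(1)], fact_type_form)


# Hypothetical fact type form: fixed parts plus one placeholder per role.
FORM = "The employee with employee id <R1> has the position with position name <R2>"

# Fact type population: one mapping of role -> individual name per fact.
POPULATION = [
    {"R1": "0012", "R2": "manager"},
    {"R1": "0013", "R2": "manager"},
]

# Each fact combined with the form gives back one deep structure sentence.
sentences = [verbalize(FORM, fact) for fact in POPULATION]
for sentence in sentences:
    print(sentence)
```

Keeping the form and the population separate mirrors the diagram: the form is stored once per fact type, while the population grows with every new fact.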
In figure 2 a template is given for the integration of 'stand-alone' fact types
into an integrated diagram.
Figure 2: Fact type integration template
In figure 3 the fact types and fact type forms (and the example fact population)
of the classified and qualified sentences for example 3 are given including the
relevant entity types and name classes or name types.
[Figure 3, diagram not reproduced; surviving labels: fact type FT4 with roles R7 and R8, name class First name]
Step 6: Adding uniqueness rules (NIAM/ORM CSDP step 4)
For each elementary (or atomic) fact type having N roles, exactly one of the
following rules applies:
For the derivation of uniqueness integrity rules it is convenient to refer to a sentence by
listing the individual names in the roles, as follows:
0012 manager (see footnote 3)
We will now create a second sentence in which the value of the first role is
changed:
0013 manager
The analyst will now ask the domain expert whether these two sentences are
allowed in combination.
The analyst can now conclude that this combination does not lead to a
uniqueness integrity rule.
We will now create a third sentence in which the value for the second role is
changed.
The analyst can now conclude that this combination leads to a uniqueness
integrity rule, graphically defined on role r1. This uniqueness integrity rule (uc1) is
the fact-based equivalent of domain rule 3.3. In figure 4 we have shown fact
type FT1 with uniqueness integrity rule uc1.
Footnote 3: It should be noted that the deep structure sentences can be easily recreated by filling in
the individual names into the fact type formulation (see figure 15).
Figure 4: Fact type FT1 with uniqueness integrity rule
We can now generalize the procedure for deriving uniqueness integrity rules to
any fact type of arity N. For each role in the fact type we create a
sentence instance in which the individual name occurrence for that role is
changed. Each combination of such a sentence with the first sentence that is
not allowed leads to a uniqueness integrity rule on the other N-1 roles.
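The generalized procedure can be sketched as a small Python function. This is a minimal sketch under stated assumptions: the function name, the 1-based role numbering and the simulated `domain_expert` answer (an employee cannot hold two different positions) are illustrative, and the changed name `clerk` for the second role is a hypothetical choice.

```python
from typing import Callable, Sequence


def derive_uniqueness_rules(
    first: tuple,
    changed_names: Sequence[str],
    allowed_together: Callable[[tuple, tuple], bool],
) -> list:
    """For each role, change its name and ask whether both sentences may coexist.

    A rejected combination yields a uniqueness rule on the other N-1 roles,
    returned as a tuple of 1-based role numbers.
    """
    rules = []
    for i, new_name in enumerate(changed_names):
        variant = first[:i] + (new_name,) + first[i + 1:]
        if not allowed_together(first, variant):
            rules.append(tuple(j + 1 for j in range(len(first)) if j != i))
    return rules


def domain_expert(a: tuple, b: tuple) -> bool:
    # Simulated analyst-user dialogue (assumption): the same employee
    # may not appear with two different positions.
    return not (a[0] == b[0] and a[1] != b[1])


rules = derive_uniqueness_rules(("0012", "manager"), ("0013", "clerk"), domain_expert)
print(rules)  # [(1,)] : a uniqueness rule on role 1 only
```

In practice the `allowed_together` callback stands for the question put to the domain expert; the function only records the consequences of the answers.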
Let’s now analyze fact type FT2. Consider one allowed sentence from the fact
type population e.g.
0012 3678
We will now create a second sentence in which the value of the first role is
changed:
0013 3678
The analyst will now ask the domain expert whether these two sentences are
allowed in combination.
The analyst can now conclude that this combination does not lead to a
uniqueness integrity rule. We will now create a third sentence in which the value
for the second role is changed.
Domain rule 3.10: Each employee has at most one company phone extension.
We have now illustrated how the fact-based uniqueness integrity rule protocol
can be used for ‘rule mining’, i.e. finding domain rules in a domain that are not
yet made explicit but that become visible when the domain experts are
confronted with combinations of fact instances that are not allowed to exist in
combination. The newly found domain rule 3.10 is mapped onto uniqueness
integrity rule uc2 in figure 5. Note that uniqueness integrity rule uc3 corresponds
with domain rule 3.4.
Applying the Fact-Based Modeling Protocol: The DEF phone directory example (part 2)
Let’s assume that the following explicitly documented domain rules hold in our example UoD.
Let’s extend our DEF Universe of Discourse with the following example:
Senior manager
Manager
Applying the fact-based modeling protocol to this example leads to the following verbalizations:
We can now initially classify this sentence group (6) into variable and fixed parts.
fixed 1 | variable 1 | fixed 2 |
Group 6:
We now notice that the extension of the UoD with this new example will lead to the addition of a unary
fact type (ft 6) to the conceptual schema for this UoD.
Step 7: Add other integrity rules:
7.1 Mandatory rule
In step 6 of the fact-based modeling methodology we derived the uniqueness rules for our knowledge
model. Another group of integrity rules consists of the mandatory (or non-empty) rules.
A mandatory rule is defined on one role played by an entity type.
Each instance of the entity type must play the role on which the mandatory rule
is defined.
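Checking a population against a mandatory rule amounts to a set difference. The sketch below assumes the employee population 0012 to 0016 and treats every employee as playing role r1; employee 0017 is a hypothetical addition used only to show a violation.

```python
def mandatory_rule_violations(entity_population, role_population):
    """Return the entities that do not play the role carrying the mandatory rule."""
    return set(entity_population) - set(role_population)


employees = {"0012", "0013", "0014", "0015", "0016"}
# Employee ids occurring in role r1 (the employee side of the position fact type).
r1_population = ["0012", "0013", "0014", "0015", "0016"]

print(mandatory_rule_violations(employees, r1_population))  # set()

# Hypothetical employee without a recorded position: violates the mandatory rule.
print(mandatory_rule_violations(employees | {"0017"}, r1_population))  # {'0017'}
```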
We will now derive the mandatory rule of the DEF phone book example. In this introductory skills
training we will only apply the definition of the mandatory rule on the specified business rules. In
Bollen (2011a, 2011b) the methodological support or protocol for the ‘mining’ of these mandatory rules
is provided. Domain rule 3.1 can be directly mapped as a mandatory role constraint mc2 defined on
role r3. Furthermore, domain rule 3.3 can be mapped onto a mandatory rule mc1 defined on role r1
(see figure 1).
Figure 1: FB conceptual model of DEF phone book example including mandatory rules.
7.2 Set comparison rules
Set comparison rules are defined on roles or role combinations in which the same (combination of)
entity types play a role. Furthermore, set comparison rules are only relevant in cases where an entity
type plays more than one role in the conceptual schema. The first type of set comparison rule
is the subset rule.
A subset rule is defined on two (ordered combinations of) roles having the same (ordered
combinations of) entity type(s). The subset rule enforces a subset relation between these role
(combination) populations.
In our DEF example we note that only some of the positions are entitled to a company car. This points
at the existence of a subset (set-comparison) rule sc1 in which the instance population of role R11 must
at all times be a subset of the instance population of role R2 (note that both roles are played by the entity
type Position; see figure 2).
An equality rule is defined on two (ordered combinations of) roles having the same (ordered
combinations of) entity types. The equality rule enforces the set equality of these role
(combination) populations.
Domain rule 3.8: Each employee has at most one first name.
Domain rule 3.9: Each employee has at most one last name.
Suppose that an additional domain requirement is that if a first name is known for an employee, a
last name must also be known, and the other way round (domain rule 3.12).
Domain rule 3.12: An employee has a first name and a last name OR no name at all.
We can now model this requirement as an equality rule ec1 between roles R7 and R9 (see figure 2).
An exclusion rule is defined on two (ordered combinations of) roles having the same
(ordered combinations of) entity types. The exclusion rule enforces that these role
(combination) populations are disjoint.
If we refer once again to the business rules 3.5 and 3.7 from the DEF phone book example we can
conclude that the populations of roles R4 and R6 will always exclude one another (see figure 2).
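The three set-comparison rules translate directly into set operations on role populations. The sketch below is illustrative: the function names are assumptions, the populations for R2, R11, R7 and R9 follow the example data of the DEF model, and the two populations used for the exclusion check are hypothetical.

```python
def subset_holds(sub_role, super_role):
    """Subset rule: every value in the first role also occurs in the second."""
    return set(sub_role) <= set(super_role)


def equality_holds(role_a, role_b):
    """Equality rule: both roles are played by exactly the same values."""
    return set(role_a) == set(role_b)


def exclusion_holds(role_a, role_b):
    """Exclusion rule: no value occurs in both roles."""
    return set(role_a).isdisjoint(role_b)


r2 = ["Manager", "Manager", "Clerk", "Secretary", "Senior manager"]  # positions held
r11 = ["Manager", "Senior manager"]                                  # entitled to a car
r7 = ["0012", "0013", "0014"]  # employees with a first name
r9 = ["0012", "0013", "0014"]  # employees with a last name

print(subset_holds(r11, r2))    # True: sc1 is satisfied
print(equality_holds(r7, r9))   # True: ec1 is satisfied

# Hypothetical populations for two mutually exclusive roles.
print(exclusion_holds(["0012", "0013"], ["0014"]))  # True
print(exclusion_holds(["0012", "0013"], ["0013"]))  # False: 0013 plays both roles
```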
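[Figure 2 (diagram not reproduced): conceptual model of the DEF example with rule labels uc4, uc5, uc6, mc1, mc2, ec1, sc1 and ex1. Example populations shown in the diagram:]
FT4, <R7> has <R8> (First name): 0012 Jim; 0013 Dave; 0014 Jim
Ft5, <R9> has <R10> (Last name): 0012 Jones; 0013 Leary; 0014 Jones
Employee and position: 0012 Manager; 0013 Manager; 0014 Clerk; 0015 Secretary; 0016 Senior manager
Ft6, <R11> is entitled to a company car: Manager; Senior manager
Employee and work phone extension: 0012 3678; 0013 3678; 0014 7896; 0015 1237; 0016 9434
Employee and home phone number: 0012 043-7869463; 0013 043-7869551; 0014 043-7869551; 0015 050-2758439
[Further diagram not reproduced: two fact types FT1 and FT2 with example populations (xxxx, yyy) over instances a1 and a2 of entity type A.]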
We can now summarize the algorithm as a decision table, in which a given combination of allowed
or non-allowed existence of each of the example extensions, as confirmed by the domain user
in an analyst-user dialogue, leads to the detection of (at most) one set-comparison rule (see table
2). We note that for the other 4 possible outcomes of the algorithm no set-comparison rule will
be derived.
                     1             2             3             4
EXT1                 Allowed       Not allowed   Not allowed   Allowed
EXT2                 Allowed       Allowed       Allowed       Not allowed
EXT3                 Not allowed   Allowed       Not allowed   Not allowed
Set-comparison       Subset 1      Subset 2      Equality      Exclusion
rule type            (FT2 ≤ FT1)   (FT1 ≤ FT2)
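The decision table can be encoded as a lookup from the three answers to a rule type. This is a sketch that copies the four columns of the table as given; the names `DECISION_TABLE` and `detect_rule` are assumptions, and the meaning of EXT1 to EXT3 is left to the preceding algorithm description.

```python
from typing import Optional

# Key: (EXT1 allowed, EXT2 allowed, EXT3 allowed), one entry per table column.
DECISION_TABLE = {
    (True, True, False): "subset (FT2 <= FT1)",
    (False, True, True): "subset (FT1 <= FT2)",
    (False, True, False): "equality",
    (True, False, False): "exclusion",
}


def detect_rule(ext1: bool, ext2: bool, ext3: bool) -> Optional[str]:
    """Return the set-comparison rule for the given answers, or None if there is none."""
    return DECISION_TABLE.get((ext1, ext2, ext3))


print(detect_rule(True, True, False))  # subset (FT2 <= FT1)
print(detect_rule(True, True, True))   # None: one of the 4 outcomes without a rule
```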
A value rule defined on an entity type constrains the possible occurrences of that entity type
to the values defined in the value rule.
Let’s review the domain rules again. The first domain rule that can lead to a value rule in the fact-
based conceptual schema is domain rule 3.5:
Domain rule 3.7: DEF currently knows senior manager, manager, clerk and secretary positions.
Contrary to value constraints vc1 and vc2, domain rule 3.7 must be mapped onto a value rule that is
defined on an entity type, rather than on a specific role. So domain rule 3.7 is now mapped as value rule
vc3 defined on the entity type Position (see figure 4).
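A value rule such as vc3 can be checked with a simple membership test. In this sketch the constant `VC3`, the function name and the case-insensitive comparison are assumptions; the position `Intern` is a hypothetical value used to show a violation.

```python
# Allowed values of the entity type Position according to domain rule 3.7.
VC3 = frozenset({"senior manager", "manager", "clerk", "secretary"})


def value_rule_violations(values, allowed=VC3):
    """Return the values that are not permitted by the value rule (case-insensitive)."""
    return [value for value in values if value.lower() not in allowed]


positions = ["Manager", "Manager", "Clerk", "Secretary", "Senior manager"]
print(value_rule_violations(positions))               # []
print(value_rule_violations(positions + ["Intern"]))  # ['Intern']
```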
7.4 Occurrence frequency rule
In this section we will introduce the occurrence frequency rule (sometimes called cardinality rule).
Definition 7.4:
An occurrence frequency rule defined on an individual role constrains the number of times that a
specific value (or name) may occur in the role population.
This rule limits the total number of times that a specific value (or name) of an entity type or a name
type is allowed in a given role; the limit can be a fixed number, a minimum, a maximum or a range. Note that the
special case where the occurrence frequency or cardinality is 1 results in a uniqueness rule defined on
that role.
When we inspect the domain rules for our DEF domain we run into domain rule 3.11:
Domain rule 3.11: At any moment in time a maximum of 15 people may be employed in a given
position.
This domain rule can be mapped onto occurrence frequency rule of1 defined on role R2.
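An occurrence frequency rule can be verified by counting how often each value occurs in the role population. The sketch below assumes the range [1..15] for of1 on role R2; the function name is an assumption and the population with 16 clerks is hypothetical.

```python
from collections import Counter


def frequency_violations(role_population, minimum, maximum):
    """Return {value: count} for every value whose count falls outside [minimum, maximum]."""
    counts = Counter(role_population)
    return {value: n for value, n in counts.items() if not minimum <= n <= maximum}


r2_population = ["Manager", "Manager", "Clerk", "Secretary", "Senior manager"]
print(frequency_violations(r2_population, 1, 15))  # {}

# Hypothetical population in which 16 employees hold the position Clerk.
print(frequency_violations(["Clerk"] * 16, 1, 15))  # {'Clerk': 16}
```

With `minimum` and `maximum` both set to 1, the same check behaves like a uniqueness rule on the role.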
[Diagram not reproduced. Rule labels visible in the diagram: mc1, mc2, mc3, mc4, ec1, sc1, ex1, uc5, uc6, vc3: [clerk, manager, secretary, senior manager], of1: [1..15].]
Figure 4: FB conceptual model of DEF example including value and occurrence frequency rules.
7.5 General rule
Consider the conceptual schema in figure 4 in which fact types and integrity rules on the domain of
tax collecting are given.
If a taxpayer has a work status of unemployed, the gross year income can never be more than
20.000,- guilders.
In figure 5 we have added this domain rule as a general rule, by connecting the general rule code to the
roles that are involved in this rule and have phrased this rule in natural language sentences. A more
formal way of representing this general rule in a declarative way is:
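One possible executable reading of this rule, offered only as a sketch: the function and parameter names are assumptions, not the notation used in the course material, and the income limit is taken from the rule as stated above.

```python
INCOME_LIMIT_UNEMPLOYED = 20000  # guilders, from the domain rule


def general_rule_holds(work_status: str, gross_year_income: float) -> bool:
    """Implication: if the work status is 'unemployed', the income must not exceed the limit."""
    return work_status != "unemployed" or gross_year_income <= INCOME_LIMIT_UNEMPLOYED


print(general_rule_holds("unemployed", 18000))  # True
print(general_rule_holds("unemployed", 25000))  # False
print(general_rule_holds("employed", 25000))    # True: the rule does not apply
```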
Step 8: Define concepts
We will now give definitions of all the concepts identified as the result of step 4 of the
CogNIAM modeling procedure in our example: Employee, Employee ID, Phone extension at home,
Phone extension at work, Number, Position, Position name, First name and Last name, plus the
concepts that were identified in the extension of our example UoD: Entitled to and Company car.
Concept: Definition
Employee: a person that works for the DEF company
Employee ID: a name class, instances of which can be used to identify an <employee> among the collection of <employees> that have ever been, are or will be working for the DEF company
Phone extension at home: a landline telephone that is located at the house of an <employee>
Phone extension at work: a landline telephone that is connected to the DEF company telephone network
Number: a name class, instances of which can be used to identify a <phone extension at home> among the union of <phone extensions at home> OR to identify a <phone extension at work> among the union of <phone extensions at work>
Position: the type of job that is performed by an <employee> in the DEF company
Position name: a name class, instances of which can be used to identify a <position> among the collection of <positions> at DEF company
First name: a name class
Last name: a name class
Entitled to: privilege of an <employee> based on <position>
Company car: a specific privilege for an <employee> in specified <position>s
Nesting, Objectified association, Nominalization
In figure 6 we have given a fact type in which a ‘flat’ relationship between the entity types Year, Item
and Amount is verbalized as the following fact type form:
The inventory cost in the Ulster production site in year <r4> for item <r5> is the amount <r6>
Let’s assume that in this UoD a concept exists called planningitem, which is expressed as the following
fact type form:
The planningitem referred to by year <R4> anno Domini and item with itemcode <R5>.
In that case it is possible to phrase the original fact type form as follows:
The inventory cost in the Ulster production site for planningitem <R7> will be the amount of <R8>
dollar/unit/year.
Figure 7: Diagrammization of nominalized fact type (forms)
In essence a nominalization is a compound naming convention. Instead of having a specific name class
that identifies entities of a given entity type, we have a compound name (year A.D. and itemcode) that
identifies an instance of the nominalized concept (planningitem). We should remark, however, that
nominalizations or objectifications (or nesting, as it is sometimes called) should only be applied when
the nested or nominalized concept exists in the domain. Splitting off objectifications only for the sake of
splitting off should not take place. In the example in figure 7 this nominalization is rooted in a domain
concept planningitem.
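The idea of a compound name can be sketched with an immutable Python dataclass that serves as a dictionary key. This is an illustration only: the class and field names are assumptions, and the year, item code and amount are hypothetical values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanningItem:
    """Nominalized concept identified by the compound name (year, itemcode)."""
    year: int
    itemcode: str


# Inventory cost per planningitem, in dollar/unit/year (hypothetical amount).
inventory_cost = {PlanningItem(2016, "A12"): 3.50}

# Two instances with the same compound name denote the same planningitem.
print(PlanningItem(2016, "A12") == PlanningItem(2016, "A12"))  # True
print(inventory_cost[PlanningItem(2016, "A12")])               # 3.5
```

The flat fact type would instead use the pair (year, itemcode) plus the amount in a single three-role fact; the nominalized version first names the pair and then states a binary fact about it.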
A Primer on Derivation Rules in Fact-Based Modeling
Peter Bollen
Department of Organization & Strategy
SBE
Maastricht University
HARDY'S RESTAURANT
0089 12-04-2016
beer $ 2,--
cheeseburger $ 2,50
-------
total $ 4,50
The input document is an order receipt for Hardy's Restaurant. We will further assume that we have a
human processor: the accountant of Hardy's Restaurant. Every month this human processor creates a
document of the type monthly sales sheet (see figure 2).
Derivation rules
The first derivation rule is the derivation rule in which the order total is derived from the amounts on
the individual order lines. In the terminology of the information grammar we can say that for a given
Receipt (R4) the total sales amount (R4) is equal to the sum (R3) over all order lines (R2) for the given
receipt (R1). We will use the following shorthand notation for this derivation rule DR1:
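The same derivation can be sketched in Python, using the order lines of receipt 0089 shown above. This is not the shorthand notation of the information grammar; the function name and the tuple layout (receipt number, item, amount) are assumptions.

```python
from decimal import Decimal

# Order lines as (receipt number, item, amount) taken from the example receipt.
order_lines = [
    ("0089", "beer", Decimal("2.00")),
    ("0089", "cheeseburger", Decimal("2.50")),
]


def order_total(receipt: str, lines) -> Decimal:
    """Derive the receipt total as the sum of the amounts on its order lines."""
    return sum((amount for r, _, amount in lines if r == receipt), Decimal("0"))


print(order_total("0089", order_lines))  # 4.50
```

`Decimal` is used instead of floating point so that monetary amounts add up exactly.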
Gc2: For every transaction the city in role R19 and the city in role R1 must be contained as an
instance of R7 and R8 or R8 and R7.