all.
But as a practical matter, the difference
between these two classes of operators is
not particularly important.
Okay.
So the takeaway here is that there's a
rich set of operators.
But if someone says relational algebra
to you, the first thing you should think
of is set operations plus selection,
projection, and join.
Okay.
All right.
So this notion of sets versus bags, the
duplicate question.
Well, so first of all, what is a set?
A set is a collection of objects where
there are no duplicates, and a bag is a
collection of objects where there can be
duplicates.
And so, right up here, you know, a is
not repeated at all in a set,
but it may be repeated in a bag.
And whether that's legal or illegal is
what gives you the semantics of a set
versus a bag.
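
To make that concrete, here is a minimal sketch in Python. The item names are made up for illustration, and using collections.Counter to stand in for a bag is my assumption, not something from the slide.

```python
from collections import Counter

# A collection where "a" shows up twice.
items = ["a", "b", "a", "c"]

# Set semantics: duplicates are not allowed, so "a" is kept once.
as_set = set(items)

# Bag (multiset) semantics: duplicates are allowed, tracked here as counts.
as_bag = Counter(items)

print(sorted(as_set))  # ['a', 'b', 'c']
print(as_bag["a"])     # 2
```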
So you can define the relational algebra
in terms of these two different
semantics.
You can define in terms of set or you can
define in terms of bag.
And this notion of an extended relational
algebra comes from the need to sort of
work with bags, as well as other things
like sorting, as I mentioned.
Okay, so the rule of thumb here, this is
the last time I'm really going to mention
this.
The rule of thumb here is that every
paper that you read, if you end
up reading some of the papers we talked
about in this course or beyond,
will, unless it says otherwise explicitly,
assume set semantics.
Okay, so be prepared for that.
While every implementation, you know, every
commercial database, will assume bag
semantics.
And we'll sort of see where that comes up
in the language.
Okay, so I just want to put that out
there up front: you know, I may play
fast and loose with the difference
between sets and bags, but it can
be important in practice.
Okay, so when lifting set operations, you
can define the union of two sets in the
standard way. The union of two relations
is natural given that a relation is a set
of tuples.
And in relational algebra notation, I write
it like this, and I can also write it in
SQL with the UNION keyword.
And here's where set semantics will come
up: by default, an unqualified UNION
does indeed remove duplicates, in which
case the union of this relation, with
a1, b1 and a2, b1 as tuples, and this
relation, with a1, b1 and a3, b4, is these
three tuples.
The duplicate of a1, b1 didn't get passed
through.
To express this with bag semantics, that
is, to make sure we do include duplicates,
you can say UNION ALL, and that would
include all four tuples.
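
Here is a small sketch of that behavior using Python's built-in sqlite3 module. The table names R1 and R2, the column names a and b, and the split of tuples between the two tables are my assumptions for illustration; the slide only shows the tuples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R1 (a TEXT, b TEXT);
    CREATE TABLE R2 (a TEXT, b TEXT);
    -- Assumed contents: (a1, b1) appears in both relations.
    INSERT INTO R1 VALUES ('a1', 'b1'), ('a2', 'b1');
    INSERT INTO R2 VALUES ('a1', 'b1'), ('a3', 'b4');
""")

# Unqualified UNION: duplicates are removed (set semantics).
union_rows = conn.execute(
    "SELECT a, b FROM R1 UNION SELECT a, b FROM R2"
).fetchall()

# UNION ALL: duplicates are kept (bag semantics).
union_all_rows = conn.execute(
    "SELECT a, b FROM R1 UNION ALL SELECT a, b FROM R2"
).fetchall()

print(sorted(union_rows))   # [('a1', 'b1'), ('a2', 'b1'), ('a3', 'b4')]
print(len(union_all_rows))  # 4
```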
Okay.
You can define the difference operation
the same way, in the sense that you're
lifting it from the natural definition
over sets: find every tuple in this set
and remove the tuples that also
appear in this set.
And we see a1, b1, as we saw
before, also appears in R2.
And so you get rid of it.
And all you're left with is this tuple.
Alright.
So why isn't this one in there?
Well, a3, b4 doesn't appear in R1, so we
know it's not in the set.
All we want is everything that's in R1,
removing things that also appear in R2.
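
As a quick sketch of the same thing with Python sets of tuples: the contents of R1 and R2 below are my assumption, reconstructed from the tuples mentioned in the example.

```python
# Assumed contents of the two relations, each a set of (a, b) tuples.
R1 = {("a1", "b1"), ("a2", "b1")}
R2 = {("a1", "b1"), ("a3", "b4")}

# Difference: everything in R1, minus whatever also appears in R2.
difference = R1 - R2

print(difference)  # {('a2', 'b1')}

# ('a3', 'b4') was never in R1, so it cannot be in the result.
print(("a3", "b4") in difference)  # False
```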
Okay.
Alright.
So what about intersection?
That's another set operation that we
could lift up.
You can indeed define intersection but
you don't necessarily need to have it as
a fundamental operator because you could
really express it in terms of difference.
Right, so if I want the intersection of R1
and R2, I can take everything in R1
that is not in R2.
And then I can take everything in R1 that
is not in that result.
So if you think about this for a second.
This expression returns everything
that is only in R1.
And then this expression overall removes
everything that is only in R1, leaving
things that are both in R1 and R2, and so
that's what intersection is.
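
You can check that identity, R1 minus (R1 minus R2), with a short sketch in Python. The contents of R1 and R2 are assumed for illustration.

```python
# Assumed contents of the two relations, each a set of (a, b) tuples.
R1 = {("a1", "b1"), ("a2", "b1")}
R2 = {("a1", "b1"), ("a3", "b4")}

# Inner expression: everything that is only in R1.
only_in_r1 = R1 - R2

# Outer expression: remove the "only in R1" tuples from R1,
# leaving the tuples that are in both R1 and R2.
intersection = R1 - only_in_r1

print(sorted(intersection))      # [('a1', 'b1')]
print(intersection == (R1 & R2))  # True
```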
Okay.
And we'll touch on this later, but you can
also express intersection in terms of
join.
So John has a salary less than 40,000 so
he can be removed from the set and so the
result of this expression is this table.
Right, we have tables in and tables out.
The result of this expression is this
table, same three columns and only two
tuples in it.
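
Here is a rough sketch of that kind of selection in Python. Everything except John having a salary under 40,000 is hypothetical: the other names, the department column, the salary values, and the exact condition (salary of at least 40,000) are assumptions for illustration only.

```python
# Hypothetical input relation: (name, dept, salary) tuples.
employees = [
    ("John", "Sales", 35000),
    ("Mary", "Engineering", 52000),
    ("Ana", "Marketing", 41000),
]

def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

# Assumed condition: salary >= 40000, so John is filtered out.
result = select(employees, lambda row: row[2] >= 40000)

# Tables in, tables out: same three columns, only two tuples.
print(result)  # [('Mary', 'Engineering', 52000), ('Ana', 'Marketing', 41000)]
```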
Okay, I guess sometimes I gesture here
I'm not sure you can see it when I, I
don't know, maybe you can.