Você está na página 1de 5

Assign default value for a dimension

Page 1 sur 5

Kimball Forum

Assign default value for a dimension


Assign default value for a dimension
MLaforge on Wed Apr 07, 2010 1:36 pm

Hi,
We have the following case:
For the product code in the source data, lookup the product natural key in the product
dimension and assign the surrogate key returned by the lookup to the fact. If the
lookup fails, assign the unknown product whose natural key is "Unknown". The
unknown product is entered initially in the dimension and should never change
afterwards, we use negative SID for default entries (-1 for "unknown", -2 for "not
applicable" and -3 for "not available").
In the ETL design, there is a debate on how to implement this, some want to hard code
the SID for all dimension with default values (ex: if the lookup fails put the value -1 in
the key field). However, I have doubts since the ETL has to validate at all times that
the DB integrity is not compromised (integrity is turned off in the DB). My opinion is
that the "Unknown" dimension entry should always be retrieved for a "default" lookup
in the ETL to verify this integrity. This would result in the following ETL:
1. Lookup the product SID in the product dimension using the natural key
2. If the product is found, assign the product SID to the fact
3. Else lookup the default SID using the "Unknown" natural key and assign the SID
retrieved. (a similarl lookup can be done for "not applicable" or "not available)
This method works for all dimensions and guarantees the integrity of the database.
This also protects against "hard coding" a surrogate key which doesn't sound too good
imho... Even if the default keys will always be -1, -2 and -3 we should not use this
values directly.
Any thoughts on this topic?
Thanks in advance,
Mathieu
MLaforge
Posts: 2
Join date: 2010-04-07

Re: Assign default value for a dimension


ngalemmo on Wed Apr 07, 2010 2:10 pm

http://forum.kimballgroup.com/t486-assign-default-value-for-a-dimension

2012-04-23

Assign default value for a dimension

Page 2 sur 5

For the product code in the source data, lookup the product natural key in the
product dimension and assign the surrogate key returned by the lookup to the fact.

You lookup the product natural key.... so how is it the natural key is unknown?
If you cannot find a natural key in a dimension table you insert a new row WITH THAT
NATURAL KEY. You do not throw it away and use some generic value. Now, you may
set attributes, such as product description, to 'Unknown', but you need to retain the
identity of the dimension row at all costs.
You need to ask yourself why would such a situation occur... assuming the application
system has a product master that it refers to in transactions? It usually happens
because something went wrong with the dimension update (job failed or wasn't run).
You need to retain the product identity in the dimensions so that if and when the
source system product master is updated, it would properly populate dimension
attributes for that row.

ngalemmo
Posts: 1687
Join date: 2009-05-15
Location: Los Angeles

Re: Assign default value for a dimension


MLaforge on Thu Apr 08, 2010 6:04 am

Thank you for your answer.


In that case, the source is not allowed to insert a new product. Products are
maintained by a separate master source and an invalid product will be the result of an
error in the source system for this ETL. We want to keep the record for the monetary
fields but a report will be sent to the source with the invalid product code and
necessary keys to identify the record.
I guess my questions could be summarized this way: "Are you in favor or not of hard
coding a surrogate key in an ETL?"
Regards,

http://forum.kimballgroup.com/t486-assign-default-value-for-a-dimension

2012-04-23

Assign default value for a dimension

Page 3 sur 5

Mathieu
MLaforge
Posts: 2
Join date: 2010-04-07

Re: Assign default value for a dimension


ngalemmo on Thu Apr 08, 2010 8:32 am

The only time you may hard-code a surrogate key reference is in the case of a NULL
natural key, but from an ETL standpoint it is easier to simply hard code a default
natural key value when you encounter nulls and allow the normal surrogate key lookup
process to handle it.
But the point I am trying to make is if you receive a transaction that deals with product
"XYZ" and you don't know what "XYZ" is (i.e. it is not in the dimension table), you need
to create an entry for "XYZ" so, at the very least, you know what product ID the
numbers refer to. Without that, how do you reconcile things if there are 10 different
facts for different unknown products all pointing to the same dimension row?

ngalemmo
Posts: 1687
Join date: 2009-05-15
Location: Los Angeles

Re: Assign default value for a dimension


Joy on Mon Apr 12, 2010 10:42 am

There is no "one size fits all" for solving this problem. But hard-coding a default value
(e.g. -1) is near the bottom of my list. Here is my list of potential solutions to the
problem of how to handle RI failures between fact and dimension:
Halt processing and abort. Hard to imagine this is a good idea.
Throw away the bad fact row. I can imagine a handful of scenarios where this
could be what you want to do, but rarely.
Divert the fact row to an error table. This is a common, simple technical solution.
It should be implemented ONLY if you develop policies and procedures for getting
the bad fact rows out of jail.

http://forum.kimballgroup.com/t486-assign-default-value-for-a-dimension

2012-04-23

Assign default value for a dimension

Page 4 sur 5

Add a placeholder row in the dimension table. Technically the hardest to


implement (but not very hard). A nice solution because once you do receive the
correct dimension attributes, you simply update the dimension row. The fact
table has the correct surrogate key and hence does not need any correction.
Avoiding fact table updates is good. Before implementing this strategy, I like to
understand why I might receive a fact row before its corresponding dimension
member.
Hard code to the "unknown member" (e.g. -1) This is commonly implemented
because it's easy. The problem is that you're throwing away the information
(transaction system key) needed to fix the problem. I really do not like this
solution, and cannot recommend it as a best practice.

Good luck with your project. -- Joy

Joy
Posts: 19
Join date: 2009-02-03
Location: Kimball
Group

Re: Assign default value for a dimension


sac587118 on Mon Dec 20, 2010 4:02 am

If you are not able to update dimension table, you can create one more reference table
which can be inserted and updated and will hold negative running number (like
surogate key) for each product which was not found in dimension table. You need to
follow below steps
1. lookup on existing product dimension table if not found then lookup on newly added
refence table, If lookup failed with respect to new reference table then insert one
record with negative running number and product information.
2. When next time same product will come ( which not found in dimension table) that
will be found in reference lookup table and negative running number will be passed to
fact table.
This solution has 3 benifits
1. Allows every data load into fact table.
2. No need to use single default value which leads confusion in aggregation.
3. No data integrity issues in fact table.
While reprots you can take union of product dimension and new reference lookup table
and then join with fact to get full data.

http://forum.kimballgroup.com/t486-assign-default-value-for-a-dimension

2012-04-23

Assign default value for a dimension

Page 5 sur 5

Cheers,
Sachin

sac587118
Posts: 2
Join date: 2010-12-10

Similar topics
Assign default value for a dimension
Dummy Row in Dimension Tables
SSAS Cube structure of SCD Type 2 dimension
Date Dimension
Dimension table with no business key?

PERMISSIONS IN THIS FORUM:


You cannot reply to topics in this forum

Home

Forum free | phpBB | Free forum support | Report an abuse | Free forums

http://forum.kimballgroup.com/t486-assign-default-value-for-a-dimension

2012-04-23

Você também pode gostar