14
Programming SQL Server Data Mining
The concept of data mining as a platform technology opens up the doors for the possibility of a new breed of intelligent applications. An intelligent application is one that does not need custom code to handle various circumstances; rather, it learns business rules directly from the data. Additionally, as business rules change, intelligent applications are updated automatically by reprocessing the models that represent the business logic. Examples of intelligent applications are cross-sales applications that provide insightful recommendations to your users, call center applications that show only customers with a reasonable chance of making a purchase, and order-entry systems that validate data as it is entered without any custom code. These are just the tip of the iceberg; the flexibility and extensibility of the SQL Server Data Mining programming model will excite the creativity of the developer, leading to the invention of even more types of intelligent applications.

In the last chapter, we demonstrated that the core communication protocol for Analysis Services is XML for Analysis (XMLA). This protocol provides a highly flexible, platform-independent method for accessing your data mining server. Everything that can be done between the client and the server can be done through XMLA. However, as is true in the rest of your life, just because you can do it the hard way doesn't mean that you have to. In this chapter, you review programming interfaces and object models that make it easy to write data mining applications using Analysis Services. You see examples in Visual Basic .NET, demonstrating how to implement typical data mining tasks using the appropriate interface for each task, and explore some special features of SQL Server data mining that you can use to exploit data mining programming to the fullest. The sample code, along with versions in Visual C# .NET, is available at wiley.com/tang/Chapter14.

In this chapter, you learn about:
• APIs and their application to data mining
• Using Analysis Services APIs
• Creating and managing data mining objects using AMO
• Data mining client programming with ADOMD.NET
• Writing server-side stored procedures with Server ADOMD.NET
API             Description
ADOMD.NET       ActiveX Data Objects (Multidimensional) for .NET
Server ADOMD    Server ActiveX Data Objects (Multidimensional)
AMO             Analysis Management Objects
DSO             Decision Support Objects
DMX             Data Mining Extensions
OLE DB/DM       OLE DB for Data Mining; introduces the concept of data mining models as database objects
XMLA            XML for Analysis; a communication protocol and XML format for communicating with an analytical server independent of any platform
ADO
ActiveX Data Objects (ADO) was created to assist the Visual Basic programmer in accessing data residing in databases. The ADO libraries wrap the OLE DB interfaces into objects that are easier to program against. Because OLE DB for Data Mining specifies that a data mining provider is first an OLE DB provider, ADO can be used to execute data mining queries just as it does relational database queries. ADO reduces the complexity of OLE DB interfaces to three essential objects: the connection, the command, and the record set. The connection object is used to connect to the server and to issue schema rowset queries. The command object is used to execute DMX statements and optionally retrieve their results, and the record set object contains the result of any data-returning queries.
ADO.NET
ADO.NET is the managed data access layer. It was created to allow managed languages, such as Visual Basic .NET and C#, to access data, much as ADO was created for native languages. The philosophy of ADO.NET is somewhat different from that of ADO in that ADO.NET is designed to work in a disconnected mode, where data can be accessed and manipulated without maintaining an active connection to the server. When work is completed, a connection can be established, and all the appropriate updates will be propagated to the server, provided that the server supports such behavior. ADO.NET is more modular than ADO. ADO works in one way and that way only, and contains special code to interact with the SQL Server provider better than other providers. ADO.NET provides generic objects that work with any OLE DB provider, but also allows providers to create their own managed providers for data interaction. For example, the SQL Server provider for ADO.NET contains objects optimized for interacting specifically with SQL Server, and similar managed providers can be written for any data source. Like ADO, ADO.NET contains connection and command objects. However, ADO.NET introduces the dataset object for data interaction. A dataset is a cache of the server data contained in a set of datatables that can be independently updated or archived as XML. Datasets are loaded using dataadapters, either the generic adapter that is supplied with ADO.NET or a provider-specific adapter such as the SqlDataAdapter. For direct data access, ADO.NET uses a datareader, which is similar in concept to the ADO record set, returned from its command object.
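The disconnected model described above can be tried without any server at all. The following is a minimal sketch, assuming only the standard System.Data classes; the table and column names echo the MovieClick examples used later in this chapter, while the row values and the module and function names are made up for illustration.

```vbnet
Imports System
Imports System.Data
Imports System.IO

Module DisconnectedDataSetSketch
    ' Builds an in-memory DataSet with two related tables.
    ' No connection is opened; the sample rows are illustrative only.
    Public Function BuildSurveyDataSet() As DataSet
        Dim dset As New DataSet("MovieClick")

        Dim customers As DataTable = dset.Tables.Add("Customers")
        customers.Columns.Add("SurveyTakenID", GetType(Integer))
        customers.Columns.Add("Age", GetType(Integer))
        customers.Rows.Add(1, 28)

        Dim channels As DataTable = dset.Tables.Add("Channels")
        channels.Columns.Add("SurveyTakenID", GetType(Integer))
        channels.Columns.Add("Channel", GetType(String))
        channels.Rows.Add(1, "HBO")
        channels.Rows.Add(1, "Showtime")

        ' Relate each customer row to its channel rows.
        dset.Relations.Add("CustomerChannels", _
            customers.Columns("SurveyTakenID"), _
            channels.Columns("SurveyTakenID"))
        Return dset
    End Function

    ' Serializes the cached data as XML, the archive form mentioned in the text.
    Public Function ToXml(ByVal dset As DataSet) As String
        Dim writer As New StringWriter()
        dset.WriteXml(writer)
        Return writer.ToString()
    End Function

    Sub Main()
        Dim dset As DataSet = BuildSurveyDataSet()
        Dim customer As DataRow = dset.Tables("Customers").Rows(0)
        ' Navigate the relation entirely on the client.
        Console.WriteLine(customer.GetChildRows("CustomerChannels").Length)
        Console.WriteLine(ToXml(dset))
    End Sub
End Module
```

In a real application the tables would be filled by a dataadapter rather than by hand; the point of the sketch is only that the relation and the XML output are handled on the client.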
ADOMD.NET
ADOMD.NET (ADO.NET Multidimensional) is a managed data provider implementing the dataadapter and datareader interfaces of ADO.NET specifically for Analysis Services, making it faster and more memory-efficient than the generic ADO.NET objects. In addition to the standard ADO.NET interfaces, ADOMD.NET contains data mining and OLAP-specific objects, making programming data mining client applications easier. The MiningStructure, MiningModel, and MiningColumn collections make it easy to extract the metadata describing the objects on the server. The MiningContentNode object allows for the programmatic browsing of mining models and can be accessed from the root of the content hierarchy or randomly from any node in the content.
NOTE There also exists a native version of ADOMD.NET, appropriately named ADOMD. This interface is maintained mostly for backward compatibility with SQL Server 2000 and does not contain any objects or interfaces for data mining programming.
Server ADOMD
Server ADOMD is an object model for accessing Analysis Server objects, both data mining and OLAP, directly on the server. It is intended for use in user-defined functions, described later in this chapter.
AMO
AMO, or Analysis Management Objects, is the main management interface for Analysis Services. It replaces the SQL Server 2000 interface, Decision Support Objects (DSO), which is still maintained for backward compatibility, but has not been updated to take advantage of all the new features of SQL Server 2005. Like ADOMD.NET, AMO contains the MiningStructures, MiningModels, and MiningColumns collections, and the like. However, whereas ADOMD.NET is for browsing and querying, AMO is for creating and managing. All the operations you perform in the user interfaces of the BI Workbench or SQL Workbench can be performed programmatically using AMO; in fact, the management operations of both user interfaces were written using AMO.
TIP You should use ADOMD.NET when writing data mining client applications except when .NET is not available. Otherwise, use ADO (or OLE DB) for Windows applications, or plain XMLA for thin client applications. For applications in which you will be creating new models or managing existing models, use AMO.
NOTE See Books Online for full documentation and samples of all APIs used by Analysis Services.
AMO Basics
AMO is a rather straightforward object model placed on top of the XML representation of Analysis Services objects. In addition to providing a convenient API, AMO also provides basic validation and methods to update, change, and monitor objects on the server.
NOTE To add AMO code to your project, you need to add references to two assemblies: Microsoft.AnalysisServices and Microsoft.DataWarehouse.Interfaces. To make your coding easier, you can add the following line of code to the top of your source files so that you don't have to specify the fully qualified name for every object.
VB.NET
    Imports Microsoft.AnalysisServices
C#
    using Microsoft.AnalysisServices;
Every object in AMO implements the NamedComponent interface, which supplies Name, ID, and Description properties and a Validate method. An object's ID is its immutable identifier that cannot be changed once set. This is useful, for instance, when developing user applications with fixed objects. It allows users to arbitrarily change object names for their own use, while providing a consistent way for your code to reference objects. MajorObject inherits NamedComponent and adds the Update and Refresh methods to update the server with local changes and to refresh the local model with the server contents, respectively. Additionally, MajorObject has methods to access referring and dependent objects and contains an Annotations collection for arbitrary user extensions. The Role object is an example of a MajorObject. ProcessableMajorObject inherits MajorObject, adding methods and properties to process the object and determine the processed state and last processed time. MiningStructure is an example of a ProcessableMajorObject.
Operation                  Required permission
Iterate objects            Access and Read Definition
View object definitions    Access and Read Definition
Modify objects             Administrator
Process objects            Access, Read Definition, and Process
Add or delete objects      Administrator
Set permissions            Administrator
Receive traces             Administrator
NOTE Some operations, such as iterating objects, require a higher level of permission using AMO than when using a command API, such as ADOMD.NET. This is because ADOMD.NET and other APIs use database schemas to access objects rather than metadata definitions.
TIP You can test security in your application by impersonating roles or specific users. Set the Effective Roles property in your connection string to a comma-delimited set of roles you want to impersonate, or set the Effective Username connection string property to the name of the user. Note that only server administrators can connect with these properties. For example:
    svr.Connect("location=localhost;" _
        & "Initial Catalog=MyDatabase;Effective Roles=LimitedAccessRole")
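When several roles need to be impersonated at once, the comma-delimited list can be assembled in code. The following is a small sketch rather than part of AMO: the helper name BuildImpersonationConnectionString is hypothetical, and the property names are simply the ones given in the tip above.

```vbnet
Imports System

Module EffectiveRolesSketch
    ' Hypothetical helper: joins the role names with commas and appends them
    ' to the connection string using the property name from the tip above.
    Public Function BuildImpersonationConnectionString(ByVal server As String, _
            ByVal database As String, _
            ByVal ParamArray roleNames() As String) As String
        Return "location=" & server & ";" & _
            "Initial Catalog=" & database & ";" & _
            "Effective Roles=" & String.Join(",", roleNames)
    End Function

    Sub Main()
        ' The role names here are examples only.
        Console.WriteLine(BuildImpersonationConnectionString( _
            "localhost", "MyDatabase", "LimitedAccessRole", "ModelReader"))
    End Sub
End Module
```

The resulting string can be passed to svr.Connect in the same way as the literal shown in the tip.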
Object Creation
To create mining models programmatically using AMO, you perform all the same steps you would do if you were creating and managing the models in the user interface. That is, as described in Chapter 3, create a database, data source, data source view, mining structure, and finally a mining model. To create any object on the server, you generally perform the following steps:
1. Instantiate the object.
2. Set the object's Name and ID properties.
3. Set object-specific properties.
4. Add the object to its parent container.
5. Call Update on the object or its parent.
For example, Listing 14.1 demonstrates how to connect to a local server and create a database.
    Sub CreateDatabase()
        Dim svr As Server
        Dim db As Database
        ' Create server object and connect
        svr = New Server()
        svr.Connect("location=localhost")
        ' Create database and set properties
        db = New Database()
        db.Name = "MovieClick"
        db.ID = "MovieClick"
        ' Add database and commit to server
        svr.Databases.Add(db)
        db.Update()
        ' Disconnect from server
        svr.Disconnect()
    End Sub

            "Initial Catalog=MovieClick;Integrated Security=True")
        ' Create data adapters from database tables and load schemas
        Dim daCustomers As New SqlDataAdapter("Select * from Survey", cn)
        daCustomers.FillSchema(dset, SchemaType.Mapped, "Customers")
        Dim daChannels As New SqlDataAdapter("Select * from Channels", cn)
        daChannels.FillSchema(dset, SchemaType.Mapped, "Channels")
        ' Add relationship between Customers and Channels
        Dim drCustomerChannels As New DataRelation("CustomerChannels", _
            dset.Tables("Customers").Columns("SurveyTakenID"), _
            dset.Tables("Channels").Columns("SurveyTakenID"))
        dset.Relations.Add(drCustomerChannels)
        ' Create the DSV, add the dataset, and add to the database
        Dim dsv As New DataSourceView("MovieClick", "MovieClick")
        dsv.DataSourceID = "MovieClick"
        dsv.Schema = dset.Clone()
        db.DataSourceViews.Add(dsv)
        ' Update the database to create the objects on the server.
        db.Update(UpdateOptions.ExpandFull)
    End Sub
The DSV of Listing 14.2 contains the customer table and the channels table, but the models you want to build need more specific information than is present in the raw data: the customers' generation and a list of only the premium movie channels they watch. To accomplish this, you need to modify the code to add a calculated column to the Customers table and swap out the Channels table with a named query returning only the limited set of channels you are interested in. Listing 14.3 contains CreateDataAccessObjects modified with a named calculation and named query.
    Sub CreateDataAccessObjects(ByVal db As Database)
        ' Create relational datasource
        Dim ds As New RelationalDataSource("MovieClick", "MovieClick")
        ds.ConnectionString = "Provider=SQLOLEDB;Data Source=localhost;" & _
            "Initial Catalog=MovieClick;Integrated Security=True"
        db.DataSources.Add(ds)
        ' Create connection to datasource to extract schema to dataset
        Dim dset As New DataSet()
        Dim cn As New SqlConnection("Data Source=localhost;" & _
            "Initial Catalog=MovieClick;Integrated Security=True")
        ' Create the customers data adapter with the
        ' calculated column appended
        Dim daCustomers As New SqlDataAdapter("SELECT *, " & _
            "(CASE WHEN (Age < 30) THEN 'GenY' " & _
            " WHEN (Age >= 30 AND Age < 40) THEN 'GenX' " & _
            " ELSE 'Baby Boomer' END) AS Generation " & _
            "FROM Customers", cn)
        daCustomers.FillSchema(dset, SchemaType.Mapped, "Customers")
        ' Add extended properties to the Generation column
        ' indicating to AnalysisServices that it is a
        ' calculated column.
        dset.Tables("Customers").Columns("Generation"). _
            ExtendedProperties.Add("DbColumnName", "Generation")
        dset.Tables("Customers").Columns("Generation"). _
            ExtendedProperties.Add("Description", _
            "Customer Generation")
        dset.Tables("Customers").Columns("Generation"). _
            ExtendedProperties.Add("IsLogical", "True")
        dset.Tables("Customers").Columns("Generation"). _
            ExtendedProperties.Add("ComputedColumnExpression", _
            "CASE WHEN (Age < 30) THEN 'GenY' " & _
            " WHEN (Age >= 30 AND Age < 40) THEN 'GenX' " & _
            " ELSE 'Baby Boomer' END")
        ' Create a 'pay channels' data adapter with a custom query
        ' for our named query.
        Dim daPayChannels As New SqlDataAdapter("SELECT * FROM Channels " & _
            "WHERE Channel IN ('Cinemax', 'Encore', 'HBO', 'Showtime', " & _
            "'STARZ!', 'The Movie Channel')", cn)
        daPayChannels.FillSchema(dset, SchemaType.Mapped, "PayChannels")
        ' Add extended properties to the PayChannels table
        ' indicating to AnalysisServices that it is a
        ' named query.
        dset.Tables("PayChannels"). _
            ExtendedProperties.Add("IsLogical", "True")
        dset.Tables("PayChannels"). _
            ExtendedProperties.Add("Description", _
            "Channels requiring an additional fee")
        dset.Tables("PayChannels"). _
            ExtendedProperties.Add("QueryDefinition", _
            "SELECT * FROM Channels WHERE Channel IN ('Cinemax', " & _
            "'Encore', 'HBO', 'Showtime', 'STARZ!', 'The Movie Channel')")
        dset.Tables("PayChannels"). _
            ExtendedProperties.Add("TableType", "View")
        ' Add relationship between Customers and PayChannels
        Dim drCustomerPayChannels As New DataRelation("CustomerPayChannels", _
            dset.Tables("Customers").Columns("SurveyTakenID"), _
            dset.Tables("PayChannels").Columns("SurveyTakenID"))
        dset.Relations.Add(drCustomerPayChannels)
        ' Create the DSV, add the dataset, and add to the database
        Dim dsv As New DataSourceView("MovieClick", "MovieClick")
        dsv.DataSourceID = "MovieClick"
        dsv.Schema = dset.Clone()
        db.DataSourceViews.Add(dsv)
        ' Update the database to create the objects on the server.
        db.Update(UpdateOptions.ExpandFull)
    End Sub
        ' Create the columns of the MiningStructure,
        ' setting the type, content, and data binding.
        ' UserID column
        Dim UserID As New ScalarMiningStructureColumn("UserId", "UserId")
        UserID.Type = MiningStructureColumnTypes.Long
        UserID.Content = MiningStructureColumnContents.Key
        UserID.IsKey = True
        ' Add data binding to the column.
        UserID.KeyColumns.Add("Customers", "UserId", OleDbType.Integer)
        ' Add the column to the MiningStructure
        ms.Columns.Add(UserID)
        ' Generation column
        Dim Generation As New ScalarMiningStructureColumn _
            ("Generation", "Generation")
        Generation.Type = MiningStructureColumnTypes.Text
        Generation.Content = MiningStructureColumnContents.Discrete
        Generation.KeyColumns.Add("Customers", "Generation", _
            OleDbType.VarChar)
        ' Add the column to the MiningStructure.
        ms.Columns.Add(Generation)
        ' Add Nested Table by creating a table column
        ' and adding a key column to the nested table.
        Dim PayChannels As New TableMiningStructureColumn _
            ("PayChannels", "PayChannels")
        Dim Channel As New ScalarMiningStructureColumn _
            ("Channel", "Channel")
        Channel.Type = MiningStructureColumnTypes.Text
        Channel.Content = MiningStructureColumnContents.Key
        Channel.IsKey = True
        Channel.KeyColumns.Add("PayChannels", "Channel", OleDbType.VarChar)
        PayChannels.Columns.Add(Channel)
        ms.Columns.Add(PayChannels)
        ' Add the MiningStructure to the database.
        db.MiningStructures.Add(ms)
        ms.Update()
    End Sub
NOTE You may wonder why you specify that the column content is Key and also have to set the IsKey property to True. This is due to the extensibility in the content types defined in the OLE DB for Data Mining specification. Currently Analysis Services supports three types of keys: Key, Key Time, and Key Sequence. Having a separate IsKey property allows you to take advantage of this extensibility in the future.
        ' algorithm and parameters.
        ClusterModel = ms.CreateMiningModel(True, _
            "Premium Generation Clusters")
        ClusterModel.Columns.Clear()
        ClusterModel.Algorithm = "Microsoft_Clustering"
        ClusterModel.AlgorithmParameters.Add("CLUSTER_COUNT", 0)
        ' Add the case key - every model must contain the case key.
        mmc = ClusterModel.Columns.Add("UserID")
        mmc.SourceColumnID = "UserID"
        mmc.Usage = "Key"
        ' Add the Generation column.
        mmc = ClusterModel.Columns.Add("Generation")
        mmc.SourceColumnID = "Generation"
        ' Add the nested table.
        mmc = ClusterModel.Columns.Add("PayChannels")
        mmc.SourceColumnID = "PayChannels"
        ' Add the nested key - required for nested tables
        mmc = mmc.Columns.Add("Channel")
        mmc.SourceColumnID = "Channel"
        mmc.Usage = "Key"
        ' Copy the cluster model and change the necessary properties
        ' to make it a tree model to predict Generation.
        TreeModel = ClusterModel.Clone()
        TreeModel.Name = "Generation Trees"
        TreeModel.ID = "Generation Trees"
        TreeModel.Algorithm = "Microsoft_Decision_Trees"
        TreeModel.AlgorithmParameters.Clear()
        TreeModel.Columns("Generation").Usage = "Predict"
        TreeModel.Columns("PayChannels").Usage = "Predict"
        ms.MiningModels.Add(TreeModel)
        ' Submit the models to the server.
        ClusterModel.Update()
        TreeModel.Update()
    End Sub
DETERMINING SERVER CAPABILITIES
When creating models on the server, it is useful to understand exactly what kinds of models you can create. The algorithm selection varies between Standard and Enterprise editions of SQL Server, plus there may be plug-in algorithms installed as well. Additionally, each algorithm supports a variety of parameters whose default values may vary depending on the server configuration. The MINING_SERVICES and MINING_SERVICE_PARAMETERS schema rowsets described in Chapter 2 contain descriptions of the available algorithms and their capabilities. You can use any client command API to access these schemas, or, even better, you can use the object model provided in ADOMD.NET to iterate quickly through the server's data mining capabilities. The following code demonstrates how to iterate through the mining services and their respective parameters.
    Sub DiscoverServices()
        Dim cn As New AdomdConnection("location=localhost")
        Dim ms As MiningService
        Dim mp As MiningServiceParameter
        cn.Open()
        For Each ms In cn.MiningServices
            Console.WriteLine("Service: " & ms.Name)
            For Each mp In ms.AvailableParameters
                Console.WriteLine(" Parameter: " & mp.Name & _
                    " Default: " & mp.DefaultValue)
            Next
        Next
        cn.Close()
    End Sub
        ' Create the trace object to trace progress reports
        ' and add the column containing the progress description.
        t = svr.Traces.Add()
        e = t.Events.Add(TraceEventClass.ProgressReportCurrent)
        e.Columns.Add(TraceColumn.TextData)
        t.Update()
        ' Add the handler for the trace event.
        AddHandler t.OnEvent, AddressOf ProgressReportHandler
        Try
            ' Start the trace, process the database, then stop it.
            t.Start()
            db.Process(ProcessType.ProcessFull)
            t.Stop()
        Catch ex As Exception
        End Try
        ' Remove the trace from the server.
        t.Drop()
    End Sub

    Sub ProgressReportHandler(ByVal sender As Object, _
            ByVal e As TraceEventArgs)
        lblProgress.Text = e(TraceColumn.TextData)
    End Sub
            "Initial Catalog=MovieClick")
        Try
            ' Export the model to a share on the destination server.
            Dim cmdExport As New AdomdCommand
            cmdExport.Connection = cnSource
            cmdExport.CommandText = "EXPORT MINING MODEL GenerationTree " & _
                "TO '\\ProductionServer\Transfer\GenerationTree.abk' " & _
                "WITH PASSWORD = 'MyPassword'"
            cnSource.Open()
            cmdExport.ExecuteNonQuery()
            ' Import the model into the current database on the
            ' destination server.
            Dim cmdImport As New AdomdCommand
            cmdImport.Connection = cnDest
            cmdImport.CommandText = "IMPORT FROM " & _
                " 'c:\Transfer\GenerationTree.abk' " & _
                " WITH PASSWORD = 'MyPassword' "
            cnDest.Open()
            ' Execute the import through the command, not the connection.
            cmdImport.ExecuteNonQuery()
        Catch ex As Exception
        End Try
        cnSource.Close()
        cnDest.Close()
    End Sub
In this example, you simply move one model between servers. The EXPORT command is flexible enough to export multiple models or entire mining structures as well. If you need to reprocess the models on the destination server, you can append INCLUDE DEPENDENCIES to the EXPORT command, and the necessary Datasource and DSV objects will be included in the export package.
NOTE Because OLAP objects do not support object-level importing and exporting, OLAP mining models cannot be exported using the EXPORT command.
access permissions of that role. Listing 14.8 demonstrates creating a role and assigning permissions.
    Sub SetModelPermissions(ByVal db As Database, ByVal mm As MiningModel)
        ' Create a new role and add members.
        Dim r As New Role("ModelReader", "ModelReader")
        r.Members.Add(New RoleMember("MOVIECLICK\Jamiemac"))
        r.Members.Add(New RoleMember("MOVIECLICK\Zhaotang"))
        ' Add the role to the database and update
        db.Roles.Add(r)
        r.Update()
        ' Create a permission object referring to the role.
        Dim mmp As New MiningModelPermission()
        mmp.Name = "ModelReader"
        mmp.ID = "ModelReader"
        mmp.RoleID = "ModelReader"
        ' Assign access rights to the permission.
        mmp.Read = ReadAccess.Allowed
        mmp.AllowBrowsing = True
        mmp.AllowDrillThrough = True
        mmp.AllowPredict = True
        ' Add permissions to the model and update
        mm.MiningModelPermissions.Add(mmp)
        mm.Update()
    End Sub
mining query prepared from the Prediction Query Builder, you can embed the results using the following steps.
1. Using the BI or SQL Workbench, create a data mining query in the Prediction Query Builder. Switch to SQL view and copy the generated query.
2. In Visual Studio, add an OleDbDataAdapter to a form.
3. In the Data Adapter Configuration Wizard, add a connection to your database using the provider entitled Microsoft OLE DB Provider for Analysis Server 9.0.
4. On the following page, select Use SQL Statements. Advance the wizard and paste your prediction query into the text box. Click the Advanced Options button and clear the Generate Insert, Update, and Delete Statements check box.
5. Finish the wizard, setting any other desired options and ignoring any warnings.
6. Generate a DataSet by right-clicking the OleDbDataAdapter you created and selecting Generate DataSet.
7. Add a DataGrid control to your Windows form. Set the DataSource property of the DataGrid to the table inside the DataSet you generated.
8. Double-click the form and add the following line of code to the form load event:
    OleDbDataAdapter1.Fill(DataSet11) ' Change names as appropriate
Build and run your application, and the result of your data mining query is loaded into the data grid on your form. This example may be trivial and of limited use, but it does demonstrate exactly how simple it is to embed data mining results into an arbitrary application.
ADO.NET will notice that the only differences between the APIs thus far are the names of the data access classes.
    Private Sub SingleResultQuery()
        ' Create connection and command objects.
        Dim cn As New AdomdConnection("location=localhost; " & _
            "Initial Catalog=MovieClick")
        Dim cmd As New AdomdCommand()
        ' Initialize command with query
        cmd.Connection = cn
        cmd.CommandText = "SELECT Predict(Generation) " & _
            "FROM [Generation Trees] NATURAL PREDICTION JOIN " & _
            "(SELECT (SELECT 'HBO' AS Channel UNION " & _
            "SELECT 'Showtime' AS Channel) as PayChannels) as t"
        ' Open connection and write result to debug window
        cn.Open()
        ' ExecuteScalar is not supported in the RTM version,
        ' so read the single value through a data reader.
        Dim reader As AdomdDataReader
        reader = cmd.ExecuteReader()
        reader.Read()
        Debug.WriteLine(reader.GetValue(0).ToString())
        ' Close reader and connection
        reader.Close()
        cn.Close()
    End Sub
Use ExecuteReader when executing queries returning multiple columns or rows as in Listing 14.10, which performs the same prediction as in Listing 14.9 but returns the flattened result of PredictHistogram, so you can see the likelihood of all possible prediction results.
    Private Sub MultipleRowQuery()
        ' Create connection and command objects.
        Dim cn As New AdomdConnection("location=localhost;" & _
            "Initial Catalog=MovieClick")
        Dim cmd As New AdomdCommand()
        ' Initialize command with query
        cmd.Connection = cn
        cmd.CommandText = "SELECT FLATTENED PredictHistogram(Generation) " & _
            "FROM [Generation Trees] NATURAL PREDICTION JOIN " & _
            "(SELECT (SELECT 'HBO' AS Channel UNION " & _
            "SELECT 'Showtime' AS Channel) as PayChannels) as t"
        ' Open connection and execute query
        Dim reader As AdomdDataReader
        cn.Open()
        reader = cmd.ExecuteReader()
        ' Write tab-separated field names to debug window
        Dim i As Integer
        For i = 0 To reader.FieldCount - 1
            Debug.Write(reader.GetName(i) & vbTab)
        Next
        Debug.WriteLine("")
        ' Iterate results to debug window
        While reader.Read
            For i = 0 To reader.FieldCount - 1
                Debug.Write(reader.GetValue(i).ToString() & vbTab)
            Next
            Debug.WriteLine("")
        End While
        ' Close reader and connection
        reader.Close()
        cn.Close()
    End Sub
In the last example, you flatten the results of a nested table query for ease of iteration. In some situations, however, flattening the results is not practical, for example when you have a query returning multiple nested tables, or even nested tables inside nested tables. Listing 14.11 demonstrates how to iterate the results of the previous example with the FLATTENED keyword removed.
        Dim nestedreader As AdomdDataReader
        While reader.Read()
            nestedreader = reader.GetReader(0)
            While nestedreader.Read()
                Debug.WriteLine(nestedreader.GetValue(0).ToString())
            End While
            nestedreader.Close() ' Be sure to close the nested readers!
        End While
Listing 14.11 Iterating the Attribute column of the nested PredictHistogram result
So far, everything you have done could have been done, albeit less efficiently, with ADO.NET. Next, you learn to expand your application's functionality by using a parameterized query to change the prediction input. ADO.NET does not support named parameters for providers other than the SQL Server relational engine. To use named parameters in your query, you are forced to use ADOMD.NET. Listing 14.12 demonstrates your data mining query using named parameters.
        ' Initialize command with parameterized query
        cmd.CommandText = "SELECT PredictHistogram(Generation) " & _
            "FROM [Generation Trees] NATURAL PREDICTION JOIN " & _
            "(SELECT (SELECT @Channel1 AS Channel UNION " & _
            "SELECT @Channel2 AS Channel) as PayChannels) as t"
        ' Initialize parameters and add to command
        Dim Channel1 As New AdomdParameter()
        Dim Channel2 As New AdomdParameter()
        Channel1.ParameterName = "@Channel1"
        Channel2.ParameterName = "@Channel2"
        cmd.Parameters.Add(Channel1)
        cmd.Parameters.Add(Channel2)
        ' Set parameter values
        cmd.Parameters("@Channel1").Value = "HBO"
        cmd.Parameters("@Channel2").Value = "Showtime"
Listing 14.12 assumes that you know that you only allow and require two channels to perform the prediction. Obviously, this is not always the case. ADOMD.NET allows you to use a parameter to pass an entire table as the input data source. This allows you to easily perform predictions using data that is on the client or otherwise unavailable to the server. Listing 14.13 demonstrates using shaped table parameters as prediction input.
        ' Create table for case
        Dim caseTable As New DataTable
        caseTable.Columns.Add("CustID", System.Type.GetType("System.Int32"))
        caseTable.Rows.Add(1)
        ' Create nested table
        Dim nestedTable As New DataTable
        nestedTable.Columns.Add("CustID", _
            System.Type.GetType("System.Int32"))
        nestedTable.Columns.Add("Channel", _
            System.Type.GetType("System.String"))
        nestedTable.Rows.Add(1, "HBO")
        nestedTable.Rows.Add(1, "Showtime")
        ' Initialize command with parameterized query
        cmd.CommandText = "SELECT PredictHistogram(Generation) " & _
            "FROM [Generation Trees] NATURAL PREDICTION JOIN " & _
            "SHAPE { @CaseTable } " & _
            "APPEND ({ @NestedTable } " & _
            "RELATE CustID to CustID) AS Channels " & _
            "as t"
        ' Initialize parameters and add to command
        Dim caseParam As New AdomdParameter()
        Dim nestedParam As New AdomdParameter()
        caseParam.ParameterName = "CaseTable"
        nestedParam.ParameterName = "NestedTable"
        cmd.Parameters.Add(caseParam)
        cmd.Parameters.Add(nestedParam)
        ' Set parameter values
        cmd.Parameters("CaseTable").Value = caseTable
        cmd.Parameters("NestedTable").Value = nestedTable
Browsing Models
As described in Chapter 2, all the model metadata and content are accessible through schema rowsets. However, using ADOMD.NET, you can browse the server and models using a rich object model instead. Figure 14.2 shows the major data mining objects of ADOMD.NET.
As you can see from the object model, you can simply connect to the server and iterate over any of the data mining objects without having to resort to schema queries. A nice benefit to application developers is that if a connected user does not have access to a particular object, that object will simply not appear in its collection, as if it didn't exist. The most interesting ability you gain by using the ADOMD.NET object model is the ability to iterate mining model content in a natural, hierarchical manner using objects instead of trying to unravel the flat schema rowset form. Using this object model makes it easy to write complex programs to explore or display the content to your users. For example, an interesting problem for the Microsoft Decision Trees algorithm is this: given an attribute, find all of the trees that contain a split on that attribute.
Listing 14.14 demonstrates using the content object model to explore trees to find splits on a specified attribute. First, you identify all child nodes of the root that represent trees and then recursively check the children of the trees to see whether their marginal rule contains the requested attribute. By looking at the node type rather than at the algorithm used, this function will work against any model containing trees, whether it uses the Microsoft Decision Trees algorithm, the Microsoft Time Series algorithm, or any third-party tree-based algorithms.
    ' Identify all the attributes that split
    ' on a specified attribute.
    Sub FindSplits(ByVal cn As AdomdConnection, _
            ByVal ModelName As String, ByVal AttributeName As String)
        ' Find the specified model.
        Dim model As MiningModel
        model = cn.MiningModels(ModelName)
        If model Is Nothing Then Return
        ' Look for the attribute in all model trees.
        Dim node As MiningContentNode
        For Each node In model.Content.Item(0).Children
            If node.Type = MiningNodeType.Tree Then
                FindSplits(node, AttributeName)
            End If
        Next
    End Sub

    ' Recursively search for the attribute among content nodes
    ' Return when children exhausted or attribute is found
    Sub FindSplits(ByVal node As MiningContentNode, _
            ByVal AttributeName As String)
        ' Check for the attribute in the MarginalRule.
        If node.MarginalRule.Contains(AttributeName) Then
            ' The attribute column contains the
            ' name of the tree.
            Debug.WriteLine(node.Attribute)
            Return
        End If
        ' Recurse over child nodes
        Dim childNode As MiningContentNode
        For Each childNode In node.Children
            FindSplits(childNode, AttributeName)
        Next
    End Sub
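The recursion in Listing 14.14 can be exercised without a server by substituting a plain class for the content node. The sketch below is an illustration only: SketchNode is a hypothetical stand-in, not an ADOMD.NET type, and the rule strings are invented placeholders rather than the format a real MarginalRule uses. It also varies the listing slightly by returning a Boolean, so each tree is reported once even if several of its branches split on the attribute.

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical stand-in for a mining content node; not an ADOMD.NET type.
Public Class SketchNode
    Public TreeAttribute As String   ' name of the tree the node belongs to
    Public MarginalRule As String    ' placeholder text for the split rule
    Public Children As New List(Of SketchNode)()

    Public Sub New(ByVal tree As String, ByVal rule As String)
        TreeAttribute = tree
        MarginalRule = rule
    End Sub
End Class

Module FindSplitsSketch
    ' Returns True as soon as any node in the subtree mentions the attribute.
    Public Function ContainsSplit(ByVal node As SketchNode, _
            ByVal attributeName As String) As Boolean
        If node.MarginalRule.Contains(attributeName) Then Return True
        For Each child As SketchNode In node.Children
            If ContainsSplit(child, attributeName) Then Return True
        Next
        Return False
    End Function

    ' Lists each tree once if any of its nodes splits on the attribute.
    Public Function TreesSplittingOn(ByVal trees As List(Of SketchNode), _
            ByVal attributeName As String) As List(Of String)
        Dim result As New List(Of String)()
        For Each tree As SketchNode In trees
            If ContainsSplit(tree, attributeName) Then result.Add(tree.TreeAttribute)
        Next
        Return result
    End Function

    ' Builds two small sample trees with made-up rules.
    Public Function BuildSampleTrees() As List(Of SketchNode)
        Dim generationTree As New SketchNode("Generation", "")
        Dim hboBranch As New SketchNode("Generation", "PayChannels(HBO) = Existing")
        hboBranch.Children.Add( _
            New SketchNode("Generation", "PayChannels(Showtime) = Missing"))
        generationTree.Children.Add(hboBranch)

        Dim encoreTree As New SketchNode("PayChannels(Encore)", "")
        encoreTree.Children.Add( _
            New SketchNode("PayChannels(Encore)", "Generation = 'GenX'"))

        Dim trees As New List(Of SketchNode)()
        trees.Add(generationTree)
        trees.Add(encoreTree)
        Return trees
    End Function

    Sub Main()
        For Each name As String In TreesSplittingOn(BuildSampleTrees(), "HBO")
            Console.WriteLine(name)
        Next
    End Sub
End Module
```

Stopping at the first match per tree keeps the output free of duplicates; whether that is preferable to reporting every matching branch depends on what the application displays.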
You can also use the content to find the reason for a prediction by using the PredictNodeId function. For example, you can use a query that calls PredictNodeId to retrieve the ID of the node used to generate the prediction, and feed the result into a function like that in Listing 14.15.
    Function GetPredictionReason(ByVal model As MiningModel, _
            ByVal NodeID As String) As String
        Dim node As MiningContentNode
        node = model.GetNodeFromUniqueName(NodeID)
        ' Test against Nothing; IsDBNull is never True for a node object.
        If node Is Nothing Then
            Throw New System.Exception("Node not found")
        End If
        Return node.Description
    End Function
Stored Procedures
ADOMD.NET provides an excellent object model for accessing server objects and browsing content. However, there are some major drawbacks. For the FindSplits method in Listing 14.14, you need to bring the entire content from the server to the client to determine the list. A model with 1,000 trees and 1,000 nodes per tree would require the marshaling of over 1,000,000 rows, even if only a handful of trees referenced the desired attribute. Also, in the GetPredictionReason function, even though you can access the desired node directly using GetNodeFromUniqueName, you still cause a round-trip to the server on each call; performing this operation in batch is not recommended. There is a solution to these problems. Analysis Services in SQL Server 2005 supports stored procedures that can be written in any managed language such as C#, VB.NET, or managed C++. The object model, ADOMD+, is almost identical to that of ADOMD.NET, making conversion between the two models simple. The clear advantage of ADOMD+ is that all of the content is available on the server, and you can return only the information you need to the client. You can call UDFs by themselves, using the CALL syntax, or as part of a DMX query. For example, the following query
    CALL MySprocs.TreeHelpers.FindSplits('Generation Trees','HBO')
calls a stored procedure directly and simply returns the result, whereas the query
    SELECT Predict(Generation),
        MySprocs.TreeHelpers.GetPredictionReason(PredictNodeId(Generation)) ...
calls a stored procedure for every row returned from the prediction query. In this case, the query will return the prediction result plus the explanation of the result for every row.
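The marshaling cost described at the start of this section can be made concrete with a little arithmetic. The Python sketch below is only an illustration: the function name is hypothetical, and the count of five matching trees is an assumed stand-in for "a handful".

```python
def rows_marshaled(trees: int, nodes_per_tree: int, matching_trees: int) -> dict:
    """Compare the rows sent to the client under the two approaches."""
    client_side = trees * nodes_per_tree   # the whole content crosses the wire
    server_side = matching_trees           # one result row per matching tree
    return {"client_side": client_side, "server_side": server_side}


# 1,000 trees with 1,000 nodes each; assume 5 trees reference the attribute.
cost = rows_marshaled(trees=1000, nodes_per_tree=1000, matching_trees=5)
print(cost)  # {'client_side': 1000000, 'server_side': 5}
```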
CALLING VBA AND EXCEL FUNCTIONS AS STORED PROCEDURES
If you have Microsoft Office installed on the same machine as your Analysis Services server, you can leverage the functions of Visual Basic for Applications (VBA) and Excel as stored procedures inside your DMX queries. For example, you can convert the prediction output to lowercase like this:
    SELECT LCase(Predict(MyModel.[Home Ownership])) FROM MyModel PREDICTION JOIN ...
If a function exists in both Excel and VBA, you need to prefix the function name with the name of the library, followed by an exclamation point. For example, to get the base 10 log of a prediction from Excel, and the natural log of the prediction from VBA, you would issue a query like this:
    SELECT Excel!Log(Predict(Sales)), VBA!Log(Predict(Sales)) FROM MyModel ...
If an Excel or VBA function also exists in MDX or DMX or contains a $ character, you need to escape the function name with square brackets ([ ]). For example, to format a prediction as a currency amount, you would issue a query like this:
    SELECT [Format](Predict(Sales), '$#.##') FROM MyModel ...
The supported functions from VBA and Excel are listed in Appendix B.
SENDING COMPLEX TYPES TO STORED PROCEDURES
If you need to send complex types, such as structures or arrays, to a stored procedure, you can serialize them using the XmlSerializer on the client and send them as an XML string. On the server side, deserialize the structure or array, and call an overloaded function using the complex types you are interested in. For example, you may have a function that requires an array of the following type:
    Public Structure MyType
        Public a As Integer
        Public b As String
    End Structure
You could write the following function to serialize the array into an XML string and send that string as a parameter to the stored procedure:
    Function SerializeMyType(ByVal MyArray As MyType()) As String
        Dim s As New System.Xml.Serialization.XmlSerializer(MyArray.GetType())
        Dim sw As New System.IO.StringWriter()
        s.Serialize(sw, MyArray)
        Return sw.ToString()
    End Function
On the server side, you would duplicate the type definition and write a stub function to deserialize the array and call the real function.
    Public Function MySproc(ByVal xmlString As String) As DataTable
        ' Build the serializer from the array type itself; the array variable
        ' is still Nothing at this point, so calling GetType() on it would fail.
        Dim s As New System.Xml.Serialization.XmlSerializer(GetType(MyType()))
        Dim sr As New System.IO.StringReader(xmlString)
        Dim MyArray() As MyType = CType(s.Deserialize(sr), MyType())
        Return MySproc(MyArray)
    End Function

    Function MySproc(ByVal MyArray As MyType()) As DataTable
        ... ' Function body
    End Function
This strategy will allow you to pass complex types and will prepare you for future versions that may allow naturally passing complex types.
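The round trip in this sidebar can be sketched independently of .NET. The Python example below uses the standard library's xml.etree.ElementTree rather than XmlSerializer. The element names, the function names, and the "real" function that sums field a are all assumptions made for illustration, so the XML it produces will not match XmlSerializer's actual output.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class MyType:
    a: int
    b: str


def serialize_my_type(items: List[MyType]) -> str:
    """Client side: flatten the array into a single XML string."""
    root = ET.Element("ArrayOfMyType")
    for item in items:
        elem = ET.SubElement(root, "MyType")
        ET.SubElement(elem, "a").text = str(item.a)
        ET.SubElement(elem, "b").text = item.b
    return ET.tostring(root, encoding="unicode")


def deserialize_my_type(xml_string: str) -> List[MyType]:
    """Server side: rebuild the array from the XML string."""
    root = ET.fromstring(xml_string)
    return [MyType(a=int(e.findtext("a")), b=e.findtext("b") or "")
            for e in root.findall("MyType")]


def my_sproc_impl(items: List[MyType]) -> int:
    """Hypothetical 'real' function: here it just sums field a."""
    return sum(item.a for item in items)


def my_sproc(xml_string: str) -> int:
    """Stub that accepts the string, then calls the real function."""
    return my_sproc_impl(deserialize_my_type(xml_string))


payload = serialize_my_type([MyType(1, "x"), MyType(2, "y & z")])
print(my_sproc(payload))  # 3
```

Keeping the stub and the real function separate means that only the stub has to change if a later version can accept the complex type directly.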
calling it won't have any undesirable side effects; you wouldn't want to create the same object twice, for instance. The Context object contains an ExecuteForPrepare property that you can check before performing any time-consuming operations in your procedure. If you are returning a DataTable or DataSet, you should fully define the objects and return them empty of data so the client will know the schema. In general, you should not raise errors during preparation, especially for missing objects, because the prepare call could be made during a batch query, and the objects may exist by the time the procedure is called to return a result. To indicate that your procedure does not have any unwanted side effects, you must add the custom attribute SafeToPrepare.
        ' Create the result table and add a column
        ' for the attribute.
        Dim tblResult As New DataTable()
        tblResult.Columns.Add("Attribute", _
            System.Type.GetType("System.String"))
        ' If this is a prepare statement, return the empty
        ' table for schema information.
        If Context.ExecuteForPrepare Then Return tblResult
        ' Access the model and throw an exception if not found.
        ' Error text will be propagated to the client.
        Dim model As MiningModel
        model = Context.MiningModels(ModelID)
        If model Is Nothing Then Throw _
            New System.Exception("Model not found")
        ' Look for the attribute in all model trees.
        If model.Content.Count > 0 Then
            Dim node As MiningContentNode
            For Each node In model.Content(0).Children
                If node.Type = MiningNodeType.Tree Then
                    FindSplits(node, AttributeName, tblResult)
                End If
            Next
        End If
        ' Return the table containing the result.
        Return tblResult
    End Function

    Private Function FindSplits(ByVal node As MiningContentNode, _
            ByVal AttributeName As String, _
            ByRef tblResult As DataTable) As Boolean
        ' Check for the attribute in the MarginalRule
        ' and add a row to the table if found.
        If node.MarginalRule.Contains(AttributeName) Then
            Dim row() As String = {node.Attribute.Name}
            tblResult.Rows.Add(row)
            Return True
        End If
        ' Recurse over child nodes.
        Dim childNode As MiningContentNode
        For Each childNode In node.Children
            If (FindSplits(childNode, AttributeName, tblResult)) Then
                Return True
            End If
        Next
        Return False
    End Function

    <SafeToPrepare(True)> _
    Public Function GetPredictionReason( _
            ByVal NodeID As String) As String
        ' Return immediately if executing for prepare.
        If Context.ExecuteForPrepare Then Return ""
        ' Return the node description.
        Return Context.CurrentMiningModel. _
            GetNodeFromUniqueName(NodeID).Description
    End Function
End Class
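The prepare-time behavior of the stored procedure above (return an empty but fully defined table, and postpone errors for missing objects until real execution) can be modeled in a few lines. This Python sketch is a simplified stand-in and not the server API: the dictionary-based table, the models catalog, and the execute_for_prepare flag are hypothetical substitutes for DataTable, Context.MiningModels, and Context.ExecuteForPrepare.

```python
from typing import Dict, List


def new_result_table() -> Dict[str, List[str]]:
    """Fully defined but empty table: one string column named Attribute."""
    return {"Attribute": []}


def find_splits_proc(models: Dict[str, List[str]], model_id: str,
                     execute_for_prepare: bool) -> Dict[str, List[str]]:
    table = new_result_table()
    # Prepare call: return the schema only, before any lookup that could fail.
    if execute_for_prepare:
        return table
    # Real call: a missing model is now reported as an error.
    if model_id not in models:
        raise LookupError("Model not found")
    table["Attribute"].extend(models[model_id])
    return table


models = {"Generation Trees": ["Generation"]}  # hypothetical catalog

# Preparing against a model that does not exist yet still succeeds.
print(find_splits_proc(models, "Not Created Yet", execute_for_prepare=True))
# {'Attribute': []}
print(find_splits_proc(models, "Generation Trees", execute_for_prepare=False))
# {'Attribute': ['Generation']}
```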
In this example, if you wanted to change the model that was performing the prediction, you would change the query inside the stored procedure, and you wouldn't have to change queries embedded inside your application. Of course, you can parameterize your query as demonstrated in Listing 14.12.
NOTE Stored procedures cannot be used to implement security in Analysis Services. The security context of the current user is used to determine the access to the objects inside the Analysis Services server. That is, any user who calls a procedure that queries a mining model, but who does not have read permission on that model, will receive a permission error. Similarly, a user who calls the GetPredictionReason UDF shown earlier without having browse permission on the model will also receive a permission error.
Summary
In this chapter, you learned about the variety of APIs that can be used to access functionality of Analysis Services programmatically. Although many APIs are supported, the two most important APIs are AMO and ADOMD.NET. AMO is used for programmatically creating, processing, and managing mining models, structures, and your servers. ADOMD.NET is the general client API for browsing and prediction queries. Using these APIs, you can create intelligent applications of your own. The logic of your application can involve dynamically creating mining models to solve user-defined problems. It can apply the predictive power of the data mining algorithms or examine the learned content of the mining models to provide new insights and new abilities to your users. And finally, you can leverage your server in your application by writing user-defined functions that have access to all of the server resources through a .NET programming model.