Data Science End-to-End Walkthrough
SQL Server 2016 and later
Set-ExecutionPolicy Unrestricted -Scope Process -Force
3. Run the following command to download the script files to a local directory. If you do not specify a different directory,
the folder C:\tempR is created by default and all files are saved there.
$source = 'https://raw.githubusercontent.com/Azure/Azure-MachineLearning-DataScience/master/Misc/RSQL/Download_Scripts_R_Walkthrough.ps1'
$ps1_dest = "$pwd\Download_Scripts_R_Walkthrough.ps1"
$wc = New-Object System.Net.WebClient
$wc.DownloadFile($source, $ps1_dest)
.\Download_Scripts_R_Walkthrough.ps1 -DestDir 'C:\tempR'
If you want to save the files in a different directory, edit the value of the DestDir parameter and specify a different folder
on your computer. If you type a folder name that does not exist, the PowerShell script creates the folder for you.
4. The Windows PowerShell command console should look like this after the download completes:
5. In the PowerShell console, you can run the command ls to view a list of the files that were downloaded to DestDir. For a
list and description of the files, see What's Included.
# Install required R libraries for this walkthrough if they are not installed.
if (!('ggmap' %in% rownames(installed.packages()))) {
  install.packages('ggmap')
}
if (!('mapproj' %in% rownames(installed.packages()))) {
  install.packages('mapproj')
}
if (!('ROCR' %in% rownames(installed.packages()))) {
  install.packages('ROCR')
}
if (!('RODBC' %in% rownames(installed.packages()))) {
  install.packages('RODBC')
}
install.packages("ggmap", lib = grep("Program Files", .libPaths(), value = TRUE)[1])
install.packages("mapproj", lib = grep("Program Files", .libPaths(), value = TRUE)[1])
install.packages("ROCR", lib = grep("Program Files", .libPaths(), value = TRUE)[1])
install.packages("RODBC", lib = grep("Program Files", .libPaths(), value = TRUE)[1])
Notes:
This example uses the R grep function to search the vector of available paths and find the one in Program Files. For
more information, see http://www.rdocumentation.org/packages/base/functions/grep.
If you think the packages are already installed, check the list of installed packages by using the R function,
installed.packages().
On the client, you can install to a user library if you cannot write to the main library in Program Files. However, when
installing packages to the SQL Server computer, you must install them in the default library used by SQL Server R Services.
Do not use a user library. For more information, see Installing New R Packages on SQL Server.
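The per-package checks shown earlier can be folded into a single helper. The following is a minimal sketch, not part of the walkthrough scripts: the function name missing_packages is hypothetical, and the install call is left commented out because it needs network access and write permission on the target library.

```r
# Hypothetical helper (not part of the walkthrough scripts): returns the
# subset of `pkgs` that does not appear in `installed`.
missing_packages <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

# The four packages this walkthrough asks for.
required <- c("ggmap", "mapproj", "ROCR", "RODBC")
to_install <- missing_packages(required)

# Install only what is missing. Uncomment to run.
# if (length(to_install) > 0) install.packages(to_install)
print(to_install)
```

On the SQL Server computer, the same call would also need the lib argument pointing at the default library, as in the install.packages lines above.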
.\RunSQL_R_Walkthrough.ps1
Troubleshooting
If you run into trouble, you can run all or any of the steps manually, using the lines of the PowerShell script as examples.
Make a note of the path to the downloaded data file and the file name where the data was saved. You will need the path to load
the data to the table using bcp.
The table schema was created but there is no data in the table
If the rest of the script ran without problems, you can upload the data to the table manually by calling bcp from the command
line as follows:
Using a SQL login
bcp TutorialDB.dbo.nyctaxi_sample in c:\tempR\nyctaxi1pct.csv -t ',' -S rtestserver.contoso.com -f C:\tempR\taxiimportfmt.xml -F 2 -C "RAW" -b 200000 -U <SQL login> -P <password>
Using Windows authentication
bcp TutorialDB.dbo.nyctaxi_sample in c:\tempR\nyctaxi1pct.csv -t ',' -S rtestserver.contoso.com -f C:\tempR\taxiimportfmt.xml -F 2 -C "RAW" -b 200000 -T
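If you prefer to drive the upload from R, the two bcp invocations differ only in their authentication flags. The sketch below assembles the argument vector for either case. The helper name bcp_args is hypothetical, and the server, paths, and credentials are placeholders to replace with your own values.

```r
# Hypothetical helper: assembles the argument vector for the bcp calls shown
# above. Server, paths, and credentials are placeholders.
bcp_args <- function(csv_path, format_path, server,
                     table = "TutorialDB.dbo.nyctaxi_sample",
                     user = NULL, password = NULL) {
  args <- c(table, "in", csv_path,
            "-t", ",",          # field terminator
            "-S", server,       # server to connect to
            "-f", format_path,  # XML format file
            "-F", "2",          # start loading at row 2
            "-C", "RAW",        # no code-page conversion
            "-b", "200000")     # rows per batch
  if (is.null(user)) {
    c(args, "-T")               # trusted connection (Windows authentication)
  } else {
    c(args, "-U", user, "-P", password)  # SQL login
  }
}

args <- bcp_args("C:\\tempR\\nyctaxi1pct.csv", "C:\\tempR\\taxiimportfmt.xml",
                 "rtestserver.contoso.com")
cat("bcp", args, "\n")
# To run the command (requires bcp on the PATH): system2("bcp", args)
```

Omitting user selects the trusted connection; supplying user and password produces the SQL login form.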
.\RunSQL_R_Walkthrough.ps1 -server <server address> -dbname <new db name> -u <user name> -p <password> -csvfilepath <path to csv file>
.\RunSQL_R_Walkthrough.ps1 -server MyServer.subnet.domain.com -dbname MyDB -u SqlUserName -p SqlUsersPassword -csvfilepath C:\temp\nyctaxi1pct.csv
Files
RunSQL_R_Walkthrough.ps1 You'll run this script first, using PowerShell. It calls the SQL scripts to load data into the
database.
taxiimportfmt.xml A format definition file that is used by the BCP utility to load data into the database.
RSQL_R_Walkthrough.R This is the core R script that is used in the rest of the lessons for data analysis and
modeling. It provides all the R code that you need to explore SQL Server data, build the classification model, and create
plots.
SQL Scripts
This PowerShell script executes multiple Transact-SQL scripts on the server. The following table lists the Transact-SQL script files.
What it does
Creates database and two tables:
nyctaxi_sample: Table that stores the training data, a one-percent sample of the NYC taxi data set. A
clustered columnstore index is added to the table to improve storage and query performance.
nyc_taxi_models: An empty table that you'll use later to save the trained classification model.
PredictTipBatchMode.sql Creates a stored procedure that calls a trained model to predict the labels for new observations. It accepts a
query as its input parameter.
PredictTipSingleMode.sql Creates a stored procedure that calls a trained classification model to predict the labels for new
observations. Variables of the new observations are passed in as inline parameters.
PersistModel.sql Creates a stored procedure that helps store the binary representation of the classification model in a table
in the database.
fnCalculateDistance.sql Creates a SQL scalar-valued function that calculates the direct distance between pickup and dropoff
locations.
fnEngineerFeatures.sql Creates a SQL table-valued function that creates features for training the classification model.
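To make the "direct distance" idea concrete, here is a minimal R sketch of one common way to compute it from latitude and longitude pairs. It assumes the haversine formula and a mean Earth radius of 3958.8 miles. The function name, the units, and the choice of formula are illustrative assumptions, so check fnCalculateDistance.sql for what the walkthrough actually does.

```r
# Great-circle ("direct") distance between two points given in decimal degrees.
# Assumption: haversine formula with a mean Earth radius of 3958.8 miles;
# the name and units are illustrative, not taken from the .sql file.
haversine_miles <- function(lat1, lon1, lat2, lon2, radius = 3958.8) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin guards against rounding pushing sqrt(a) slightly above 1.
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Example with made-up pickup and dropoff coordinates.
print(haversine_miles(40.7580, -73.9855, 40.7128, -74.0060))
```

Because the function uses only vectorized arithmetic, it also accepts whole columns of coordinates, which is convenient when you explore the data in R rather than in SQL.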
All the SQL queries that are used in this walkthrough have been tested and can be run as-is in your R code. However, if you want
to experiment further or develop your own solution using SQL queries, we recommend that you use a development
environment such as SQL Server Management Studio to test and tune your queries first, before adding them to your R code.
Next Lesson
Lesson 2: View and Explore the Data (Data Science End-to-End Walkthrough)
Previous Lesson