(a.k.a. Coefficient of Determination) is a well-known metric that measures
the goodness fit of your regression model. Its value lies between 0 and 1. The
closer R-Squared is to 1, the better your model is with respect to the training
can be calculated as:
R-Squared = 1 – RSS / TSS
where RSS is the
residual sum of squares, and TSS is the total sum of squares.
interesting measure is Adjusted
R-Squared, which adjusts the R-Squared according to the number of
explanatory terms in a model.
article shows you how to compute both these measures for Microsoft mining
models using our favorite mechanism – stored procedures!
with a stored procedure
have put together an accompanying
package for this article that contains the following:
- A sample dataset that you
should attach to your SQL Server 2005 Database Engine instance.
- A sample SQL Server Analysis
Services project that helps you build a linear
regression model. You need to deploy the project to your Analysis Services
2005 server instance.
- A stored procedure to calculate
R-Squared for data mining models.
- A Windows application that
demonstrates how to call the stored procedure. To build this project, you
need to add a reference to Microsoft.AnalysisServices.AdomdClient.dll
located in “%ProgramFiles% \Microsoft.NET\ADOMD.NET\90”.
Let's walk you through the process of using the pieces in the
package to explore the R-Squared sample.
build and deploy the stored procedure (note that this version only
supports models that allow drill through):
Open the project in the
Visual Studio 2005.
Add a reference to the
msmgdsrv.dll (Microsoft.AnalysisServices.AdomdServer.dll) class library
This class library is available in the location where Analysis Services
2005 is installed By default, this location has the form:
C:\Program Files\Microsoft SQL Server\MSSQL.2\OLAP\bin
Please note that, depending on your installation options, MSSQL.2 might be
MSSQL.1 or MSSQL.3 or so
Build the project
Deploy the stored procedure on
Server Management Studio
the target Analysis Services server instance
In the Object
Explorer, select the 'Assemblies' node at the server level
and select 'New Assembly'
FileName, field, use the
Browse (...) button to navigate to the location where the stored
procedure was built. A file named RSquareHelper.dll should be available
at that location, under the bin\debug or bin\release folder. Select that
should be automatically filled with
permission of this assembly to unrestricted.
Select OK and
deploy the stored procedure
call the stored procedure, you can execute the following command in SQL Server
RSquareHelper.RSquareHelper.RSquare.CalculateRSquareFromDrillThroughModel (<modelName>, <targetAttributeName>)
<modelName> is the name of the target model
<targetAttributeName> is the name of the target
Now, go ahead and deploy the "RSquareLinearRegression"
Analysis Services project. This project creates a simple linear
regression model with two integer attributes x and y, where x is used as input
and y as output.
Finally, you can build and run the C# application CalculateRSquare in
the package. This application demonstrates how to call the stored procedure to calculatethe R-Squared metric.
following figure shows the interface of this application:
the interface, you can specify a server name and press the Connect to server
button to establish the connection to the Analysis Services server. The
application will display a list of valid catalogs, models and target attributes.
After selecting the desired model and target attribute, you can then click Calculate
R-Squared button to check the result. The application calls the stored
procedure to calculate the statistics. After fetching the result,
it shows the statistics in the text boxes at the bottom of the screen.