SQLServerCentral - www.sqlservercentral.com

A community of more than 1,600,000 database professionals and growing

Featured Contents

The Voice of the DBA

Normal Data

This editorial was originally published on Jan 28, 2015. It is being republished as Steve is out of the office.

I thought this was an interesting post: Your data is not normal. It talks about the fact that we often assume some normal distribution when working with data. We seem to think we can extrapolate, often in a linear fashion, based on past experience. We often use functions and write queries that tend to think of normality as having some standard distribution.

However that's not often the case. It's not necessarily even the best way to begin looking at data for patterns. We might need to apply a number of different types of algorithms to a set of data to see what types of analysis might be appropriate. Not sure what I mean? Let me give you a few examples.

As a speaker, I often get feedback from my sessions as a series of forms where people have rated the session on a scale and possibly included comments. Many speakers receive the data in the same format, but I've often seen speakers get one form that rates them poorly, or has a complaint. The speaker will then view that one item as somehow representative of the session. This despite the fact that they might have ten forms with high ratings. They're somehow assigning more weight to the negative rating then the all the others. There's perhaps some value here in doing so, since there might be a legitimate complaint, but often we need to discard this one form as an outlier.

As another example, I worked with an import firm that was trying to estimate future purchases based on past ones for ordering purposes. Initially they looked at an average of the last three months (rolling) to place future orders. However I pointed out that we had some seasonal fluctuations in our business. There were a few times a year that the rolling average would dramatically under or over estimate the product we needed. As we dug deeper into our historical data, we learned that a better estimate required some level of complexity in our formulas, but we also realized we needed to allow for human overrides when other information was available. That was a fun project, but one that was quite complex when we analyzed the data.

There are numerous other examples I could give, but the point is this. Learn about your data. Dig into patterns, and don't assume that normal for one set of data looks anything like normal for other sets.

Steve Jones from SQLServerCentral.com

Join the debate, and respond to today's editorial on the forums

ADVERTISEMENT
SITC

SQL in the City Summits - New York, London & Chicago

This October, Redgate are inviting those interested in learning how their business can benefit from implementing Compliant Database DevOps to attend one of our SQL in the City Summits. If you manage SQL Server databases, or manage a team of people who do so,  Redgate’s SQL in the City Summit is the conference for you. Find out who’s presenting and register for a Summit near you today.
Register now

SQL Provision

NEW SQL Provision: Create, protect, & manage SQL Server database copies for compliant DevOps

Create and manage database copies effortless and keeps compliance central to the process. With SQL Provisions virtual cloning technology, databases can be created in seconds using just MB of storage, enabling business to move faster. Sensitive data can be anonymized or replaced with realistic data to ensure data is protected as it moves between environments. Download your free trial

Featured Contents

 

Stairway to Biml 9 – Refactoring a Simple Custom Biml Framework

Andy Leonard from SQLServerCentral.com

This level of the Stairway to Biml examines how to refactor your Biml into an easier to maintain format. More »


 

Free eBook: Performance Tuning with SQL Server Dynamic Management Views

Additional Articles from Redgate

Dynamic Management Views (DMVs) are a significant and valuable addition to the DBA's troubleshooting armory, laying bare previously unavailable information regarding the under-the-covers activity of your database sessions and transactions. More »


 

Get SQL Server Row Count and Space Used with the sp_rows

Additional Articles from MSSQLTips.com

Check out this stored procedure sp_rows to get a listing of space used and rows counts for your SQL Server database tables. More »


 

From the SQLServerCentral Blogs - Checking Trace Flags with dbachecks, online docs and PSPowerHour

Rob Sewell from SQLServerCentral Blogs

It’s been a few weeks since i have blogged as I have been busy with a lot of other things.... More »


 

From the SQLServerCentral Blogs - Adaptive Joins and Join Hints

Grant Fritchey from SQLServerCentral Blogs

At a recent all-day seminar on query performance tuning I was asked a question that I didn’t know the answer... More »

Question of the Day

Today's Question (by Steve Jones):

I have a data frame that contains the top ten home run  hitters in baseball. The data frame looks like this:

 > HR.hitters rank players Hr 1 1 Barry Bonds 762 2 2 Hank Aaron 755 3 3 Babe Ruth 714 4 4 Alex Rodriguez 696 5 5 Willie Mays 660 6 6 Ken Griffey, Jr. 630 7 7 Albert Pujols 619 8 8 Jim Thome 612 9 9 Sammy Sosa 609 10 10 Frank Robinson 586 

I want to get just the all time ranking and the number of home runs. How can I do this in R to return this data?

 rank Hr 1 1 762 2 2 755 3 3 714 4 4 696 5 5 660 6 6 630 7 7 619 8 8 612 9 9 609 10 10 586

Think you know the answer? Click here, and find out if you are right.


We keep track of your score to give you bragging rights against your peers.
This question is worth 1 point in this category: R Language.

We'd love to give you credit for your own question and answer.
To submit a QOTD, simply log in to the Contribution Center.

ADVERTISEMENT

Exam Ref 70-765 Provisioning SQL Databases

Prepare for Microsoft Exam 70-765–and help demonstrate your real-world mastery of provisioning SQL Server databases both on premise and in SQL Azure. Designed for experienced IT professionals ready to advance their status, Exam Ref focuses on the critical thinking and decision-making acumen needed for success at the MCSA level. Get your copy from Amazon today.

Yesterday's Question of the Day

Yesterday's Question (by Steve Jones):

I have this table:

 CREATE TABLE UserConfig ( UserConfigKey INT IDENTITY(1,1) NOT NULL CONSTRAINT UserConfigPK PRIMARY KEY , UserID INT , IsActive BIT SPARSE , IsSubscriber BIT SPARSE , DefaultQuantity INT SPARSE , Options XML COLUMN_SET FOR ALL_SPARSE_COLUMNS ) GO 

I decide to add data like this:

 INSERT dbo.UserConfig ( UserID , Options ) VALUES (8, '<IsActive>0</IsActive><IsSubscriber>0</IsSubscriber><DefaultQuantity>0</DefaultQuantity>') 

What happens?

Answer: The row is inserted

Explanation:

You can insert data into the sparse columns or the column set.

Ref: Use Column Sets - click here


» Discuss this question and answer on the forums

Database Pros Who Need Your Help

Here's a few of the new posts today on the forums. To see more, visit the forums.

SQL Server 2017 : SQL Server 2017 - Development

Init-cap the characters in SQL server - Hi Sir, How to Init-cap the characters in SQL server in SQL query without creating any function? create table abc (Month_Name nvarchar(10)) [code...

Finding when date changes for customer in table - In table UserData, it accumulates rows for users every time there is some sort of transaction.  Looking to pull in this...


SQL Server 2016 : SQL Server 2016 - Administration

Connectivity issue with a multisubnet AlwaysOn - Hello, I have a problem with a multisubnet that I can't figure out how to resolve. I have a 2 replicas AlwaysOn...

Not able to see SSIS packages when SQL Instance and Integration Services are on two separate nodes. - I have a three nodes in Failover Clustered Instance 2016 SQL Server. Also I have SSIS installed on all three...

MDS: Source Control and CI - Is there anyone out there who has a slick way of managing MDS Models & is prepared to share their experience? What...


SQL Server 2016 : SQL Server 2016 - Development and T-SQL

Given table names generate a Select template these tables. - At the moment I have some time on my hands. So I want to extend a Generation procedure with some 'extra'...

Difference between two join method - Hi guys, Can anybody tell me the difference between the two below examples and why I would use one over the...

how to loop through table2 with column value in table 1 - Hi All, I have two tables table 1 and table 2. Table1 : ID Type StartDate  EndDate Units AAA 1  4/3/2018  4/7/2018 1 AAA 1  4/8/2018  4/21/2018 1 AAA 1  1/8/2017  2/6/2017 2 AAA...

Select Query Assistance (Comparison) - I am modifying a Query I have.  (I am using dummy data below as the set i have is quite...

Understanding the numeric and decimal data types - I am testing various values to better understand the numeric and decimal data types.  First I believe they are the...

Algorithm to combine 2 integer into one value and then disintegrate back into two - Hi, I may not be the first one to ask this, but I'm looking to find a way to combine two...

Importing from XML to SQL 2014 - This is my 4th day on SQL so I am a newbie. I created a query to import from an...


SQL Server 2014 : Administration - SQL Server 2014

MS SQL Server 2014 and TLS 1.2 settings - Dear Gurus, I have a situation here and need your kind help. 1)  ONE DB SERVER with following details: OS : MS Windows 2012...


SQL Server 2014 : Development - SQL Server 2014

optimize sql stored proc for better performance - Hello, I have this stored proc I am trying to determine if it can be optimized for better performance. If anyone...


SQL Server 2012 : SQL 2012 - General

Database Management vs Database Administration Meaning - I have been thinking about this terminology and just wanted to get some opinions. My first effort is below:  Database Management...


SQL Server 2008 : SQL Server 2008 - General

Information from ALL SQL Servers - Hi   I would need to grab all in


SQL Server 2008 : T-SQL (SS2K8)

sp_RefreshSQLModule run on a view completely changed the view definition - SQL Version: Microsoft SQL Server 2008 R2 (SP3) - 10.50.6000.34 (X64)     Aug 19 2014 12:21:34     Copyright (c) Microsoft Corporation     Standard Edition (64-bit) on...


Cloud Computing : SQL Azure - Development

Got an error about Staffold-DbContext - I'm going through an training course titled, "Building Web APIswith ASP.NET Core"on Lynda.com.  I tried issuing the Stafford-DbContext command, followingthe instructions...


Reporting Services : Reporting Services

select on .. - Hi, how do I give a login access to query ReportServer.dbo.ExecutionLog3 without granting RSExecRole? grant select on .. to   does...


Data Warehousing : Integration Services

Extract middle part of the string in SSIS - Hi, I need to extract a part of the string which is the date. For. e.g. -  to extract '310818' from the...

This email has been sent to newsletter@newslettercollector.com. To be removed from this list, please click here.
If you have any problems leaving the list, please contact the webmaster@sqlservercentral.com.
This newsletter was sent to you because you signed up at SQLServerCentral.com.
Feel free to forward this to any colleagues that you think might be interested.
If you have received this email from a colleague, you can register to receive it here.
This transmission is ©2018 Redgate Software Ltd, Newnham House, Cambridge Business Park, Cambridge, CB4 0WZ, United Kingdom. All rights reserved.
Contact: webmaster@sqlservercentral.com