SQLServerCentral - www.sqlservercentral.com

A community of more than 1,600,000 database professionals and growing

Featured Contents

Featured Script

The Voice of the DBA

Building Test Data

One of the debates I've seen over the last few months is about test data in development environments. As I've been preparing for and learning more about the GDPR, it seems that many companies are concerned about holding sensitive data in their development systems. I think it's a valid concern, and I've often had to deal with this issue in the past, before any regulation impacted my work.

In one of my early jobs, we stored emails from customers in a table. We also had an email feature for our application. Needless to say, we needed to test that, though we didn't want to obviously send emails to real customers when we were testing some sort of feature. I've done that, sent those emails accidentally, and usually it resulted in a complaint and some scolding of the development staff. As a result, I learned to ensure that anytime we restored production to our QA system, we ran a script that would either change all emails to invalid values, or reset them to something we could use in a test system. In some cases, we'd reset them all to a specific address that we could check to see if the emails actually were sent.

In talking to many people, they often don't build test data for development systems because the data isn't valid. What a developer thinks about, or what might be randomly generated by some utility often isn't seen as valuable. Most developers want to see real data, perhaps because they can then better relate their work or a specific feature to the actual live system. Maybe it's easier to see actual customers, products, accounts, etc. when working with clients or testers, but I do think that certainly in light of the GDPR and other regulation, there are risks here.

While many people want to just restore production to refresh environments, I do think it's a poor idea to use actual sensitive data. Even if you trust your developers, there have been no shortage of attacks against development systems, loss of laptops or other files with production data that were intended for developers. We just don't secure test and development environments like production, and that means we are making a fundamental error in how work habits.

I've gone through different views on this topic across the last couple decades, and now I want some real data, some not real data. What I'd really want is a bunch of random data that is close to production, mimicking the shape and skew of production, but without any sensitive data. Then what I'd like is a set of known cases of data that are the types of data that we need to ensure works in our system. Various cases of transactions and values designed to cover the functional edges that we support. 

Of course building these sets isn't' always easy, and it's never going to be done. As long as we write software, we will need to maintain tests and data alongside the code. I do think this is a worthwhile investment in regulated industries, and likely worth doing in all industries. The thing is, it's not interesting or fun, and likely not to be ever be done in most organizations. I'd like to change that, and I hope you do as well.

Steve Jones from SQLServerCentral.com

Join the debate, and respond to today's editorial on the forums


The Voice of the DBA Podcast

Listen to the MP3 Audio ( 4.3MB) podcast or subscribe to the feed at iTunes and Libsyn. feed

The Voice of the DBA podcast features music by Everyday Jones. No relation, but I stumbled on to them and really like the music.

ADVERTISEMENT
SQL Monitor

What’s the top challenge faced by SQL Server professionals in 2018?

Learn how 626 SQL Server professionals monitor their estates in our new report on the State of SQL Server Monitoring. Discover the challenges currently facing the industry, and what is coming next. Download your free copy of the report

SQL Prompt

Become a more efficient SQL developer with SQL Prompt

Learn how to write SQL faster and more efficiently with these exclusive short videos from Data Platform MVPs and SQL Server experts. With SQL Prompt you can strip out the repetition of coding and write SQL 50% faster. Check out the tips

Featured Contents

 

SQL Server From .Net: Reading SQL Server Metadata in ADO.Net

Tim Wellman from SQLServerCentral.com

A basic introduction for developers (or anyone) about reading the metadata of a SQL Server database from a .Net application. More »


 

Why you should automate your release pipeline

As the trend in development shifts toward more agile delivery, and customers clamor for continuous deployment, this model of monolithic code releases begins to strain. Automating your release pipeline can cut down on your release cycle time and alleviate many of the Release Day issues that arise from still trying to manually deploy code. More »


 

Using Regular Expressions With T-SQL: From Beginner To Advanced

Additional Articles from MSSQLTips.com

Its possible to use some comparable expressions to a full regular expression library for matching certain patterns with T-SQL using the like operator. This tutorial uses these expressions for filtering price phrases involving alphabetic, numeric, and special characters More »


 

From the SQLServerCentral Blogs - Resend an email sent from SQL Server using TSQL

david.fowler 42596 from SQLServerCentral Blogs

Sometimes emails from SQL Server go missing, especially when you share an inbox with colleagues. On most occasions it doesn’t... More »


 

From the SQLServerCentral Blogs - What queries are getting blocked?

Kenneth Fisher from SQLServerCentral Blogs

Monday Grant Fritchey (b/t) put up a great post EXTENDED EVENTS, THE SYSTEM_HEALTH SESSION, AND WAITS that talks about how... More »

Question of the Day

Today's Question (by Steve Jones):

I have a SQL Server 2017 RTM instance that isn't very busy. I have the minimum server memory set to 8GB, but after restarting the host OS, I find that SQL Server is only using about 2GB of RAM. Why is this?

Think you know the answer? Click here, and find out if you are right.


We keep track of your score to give you bragging rights against your peers.
This question is worth 1 point in this category: Memory.

We'd love to give you credit for your own question and answer.
To submit a QOTD, simply log in to the Contribution Center.

ADVERTISEMENT

Professional Microsoft SQL Server 2014 Integration Services

The 2014 release of Microsoft's SQL Server Integration Services provides enhancements for managing extraction, transformation, and load operations, plus expanded in-memory capabilities, improved disaster recovery, increased scalability, and much more. The increased functionality will streamline your ETL processes and smooth out your workflow, but the catch is that your workflow must change. New tools come with new best practices, and Professional Microsoft SQL Server 2014 Integration Services will keep you ahead of the curve. SQL Server MVP Brian Knight is the most respected name in the business, and your ultimate guide to navigating the changes to use Microsoft SQL Server Integration Services 2014 to your utmost advantage. Get your copy from Amazon today.

Yesterday's Question of the Day

Yesterday's Question

I have a set of data in this variable:

 > positions = c("QB", "HB", "FB", "WR", "WR", "T", "G", "C", "G", "T", "TE")

If I run this code, what is returned?

 > duplicated(positions) 

Answer: [1] FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE TRUE TRUE FALSE

Explanation:

The duplicated() function will return a vector that shows which elements are duplicates of others. This is represented by a TRUE or FALSE in each position.

Ref: duplicated - click here


» Discuss this question and answer on the forums

Featured Script

Entering random data into a table

Junior Galvão - MVP from SQLServerCentral.com

This block of code is intended to demonstrate how we can in Microsoft SQL Server from version 2008. Use any user database to perform a execution of this code block and table storage RandomDataTable.
You can use this block of code to test and study environments, with the purpose of generating and populate a table with random data.
These random data are produced using the 130 characters declared in the @Texto variable, during command execution While.
The variable @Posicao is used to control and position the use of characters declared in the @Texto variable.
The @RowCount variable has the function to control the number of rows stored during execution with comado While.
After the amount of random rows being inserted the Select command should display the mass of data created and inserted into the RandomDataTabela table.

More »

Database Pros Who Need Your Help

Here's a few of the new posts today on the forums. To see more, visit the forums.

SQL Server 2017 : SQL Server 2017 - Development

store a query instead of repeating it - If I have a complicated SQL query with joins and unions and with statement, and I would like to use...

Dates provide as string YYYYMM: Is best practice for storing as date (adding 01 as day) or as varchar(6) then manipulate to do date math? - I am in the enviable position (no sarcasm) of being able to start from scratch with designing tables for SSRS...

Transaction Replication With Temporal Tables - Hi All We are setting up transaction replication between our Source to Staging Server (In Dev environment  Both are on same...


SQL Server 2016 : SQL Server 2016 - Administration

sp_executesql - turn off - Hi,  I wondered if someone could answer a question for me please Is it possible to turn off sp_executesql on a server? or...

Solving a puzzle for RO AG node - Trying to solve a puzzle. I have a 2 node AG with OLTP on one and the other is used...

A lot of THREADPOOL waits, find worker thread blocking the scheduler - Hello,  I have a very strange problem. From time to time on a production servers the number of THREADPOOL waits increases...

Lost admin rights on SQL Server Tabular... - I am admin on that windows server but have lost access to the SQL Server Analysis Services Tabular server. Is there...

SQL Server 2016 - Tran log growing huge - Forum, We recently deployed a new server SQL 2016 (first)and built an Availability group, now the problem is the TRN log...


SQL Server 2014 : Administration - SQL Server 2014

ALWAYS ON Secondary Database Unable to Join - Hi All, Im trying to join a secondary database to ALWAYS on AG but its getting failure with message The Connection...

The server principal is not able to access the database under the current security context. URGENT - This is an error which seems to happen a lot but I cannot really get an answer We have a user...


SQL Server 2012 : SQL Server 2012 - T-SQL

Fast ways to load data from SQL server to SQL server - Hi  I have a table in SQL server which has over 5 million records and around 12 columns and wish to...

XML nodes help - The XML in the attached file is the XMLTEXT field in my CDRecords table. What code would i use to cross...

Deadlock issue - Hi All, Need some help in fixing the deadlock. Will an Intent exclusive (IX) lock can cause deadlock?? We are seeing...

carving out bussiness hours. - so i'm newish to SQL and SSRS but i'm working on a report that allows me to calculate average times...


SQL Server 2008 : SQL Server 2008 - General

Primary key incrementation - Hello, I know from Microsoft Access when you enter any value in any column it automatically adds a new row and...

Affinity Mask... - Hey Folks, Appreciate if somebody can share knowledge on factors one has to take into account before setting processor affinity mask for SQL instances....


SQL Server 2008 : T-SQL (SS2K8)

Return row as null instead of empty - Hi, below the sample structure, Declare @sample table(id int, name varchar, datetime); select * from @sample this will return empty row. but i...


Reporting Services : Reporting Services 2005 Administration

SSRS Report on Local Server works fine but not through Remote server - Hi, I deployed a SSRS (2005) report on my Report Server and is working fine, but when tried through Remote Server...


Career : Events

The SQL Saturday Thread - As popular as SQL Saturday is, I'm surprised that nobody created a thread dedicated to SQL Saturday, so I created...


Career : Presentations and Speaking

Potential presentation idea: Networking -- an interactive how-to - I have yet another presentation idea.  What do you folks think of this? My working title is "Networking: it isn't just...

This email has been sent to newsletter@newslettercollector.com. To be removed from this list, please click here.
If you have any problems leaving the list, please contact the webmaster@sqlservercentral.com.
This newsletter was sent to you because you signed up at SQLServerCentral.com.
Feel free to forward this to any colleagues that you think might be interested.
If you have received this email from a colleague, you can register to receive it here.
This transmission is ©2018 Redgate Software Ltd, Newnham House, Cambridge Business Park, Cambridge, CB4 0WZ, United Kingdom. All rights reserved.
Contact: webmaster@sqlservercentral.com