| A community of more than 1,600,000 database professionals and growing |
| | Self-Healing ETL The process of building ETL flows is cumbersome and time consuming. Many consultants have made a lot of money by building ETL processes that manage the movement of data in and out of databases, including to data warehouses. There is a lot of importance placed on consistent, quick, and accurate data loading. In fact, one of the most popular job needs these days is still for ETL developers. Even as data science grows and becomes popular, quite a bit of the work in data science is data preparation, which often involves ETL, or ELT. One of the struggles with an ETL process is the tedious nature of building lots of flows that are very similar. There may be a wide variety of ways to get data to move, especially as we have a crazy set of inconsistent formats in which data appears, but once we have the flow, we often repeat that for many different files or tables. I think Biml makes this better, but it's still not simple, and there can be plenty of issues that still arise as the source inputs change. There was an interesting article that describes the use of AI to assist in ETL flows. While there is some basic matching up of source and target data right now, what if AI would add a few things. The idea of looking at the data you've matched and suggesting alterations that can potentially help with data quality is interesting. The system would have to learn what issues occur over time and what good data looks like, but perhaps this could even extend when the source data gets slightly mangled in formatting. Perhaps detecting fixed width fields that have grown and shifted all columns would be possible, and even adjust the data flow. I am actually more interested in detecting PII data and alerting developers, or even operations people that the ETL process needs to be changed or secured in some way. AI should be able as well to help Operations people with scheduling, even pausing or moving ETL work to other machines to better scale the flow. My view would be that any AI system ought to be looking for anomalies and detecting those, as pattern recognition is one thing AI is good at. I'd think this would be more of an alert capability as I'm not sure I trust an AI system to actually adjust data. ETL flows are often too important to have data mangled in an unexpected way. Perhaps that's too cautious, as that's what developers do all the time. They mandle data, and we find ways to recover. Ultimately, I think that having lots of logging on what a developer or AI system does is more important. This might be especially true if voice commands are used to build the flows. I could see lots of ambiguities and mistakes from business analysts that are trying to describe how to build an ETL flow. Steve Jones from SQLServerCentral.comJoin the debate, and respond to today's editorial on the forums |
| The Voice of the DBA Podcast Listen to the MP3 Audio ( 4.2MB) podcast or subscribe to the feed at iTunes and Libsyn. The Voice of the DBA podcast features music by Everyday Jones. No relation, but I stumbled on to them and really like the music. | |
|
|
| ADVERTISEMENT | | Redgate University Self-paced online training courses with easy to follow classes on getting started, exploring advanced features, and making the most of Redgate products. Learn the fundamentals, best practices, and top tips from the experts - Redgate’s Microsoft Data Platform MVPs and engineers. Start Learning |
| | Benchmark your Database DevOps maturity level Get a better understanding of how advanced your current processes are, receive recommendations for improvements, and see how your maturity level compares with that of your peers. Complete the Database DevOps Maturity Assessment |
|
|
|
| | | Additional Articles from SimpleTalk Now that the GDPR has gone into effect, many are wondering what will happen in the United States. California is the first state to enact similar legislation, called the California Consumer Privacy Act of 2018 (CCPA). In this article, William Brewer explains the history of the law, what it means for companies doing business with California residents, and how it compares to the GDPR. More » |
| Additional Articles from MSSQLTips.com In this tip we look at how to collect data for Elastic Database Jobs and easily store the data in a central location for analysis and review. More » |
| david.fowler 42596 from SQLServerCentral Blogs I recently came across a really odd issue with the SQL Agent, there were two agent jobs attached to one... More » |
| Bert Wagner from SQLServerCentral Blogs This week I had the opportunity to learn about dashboards, reports, and datasets in Power BI using the best kind... More » |
|
|
| | Today's Question (by Evgeny Garaev): What is the new property for the COLUMNPROPERTY() function that appeared in SQL Server 2016? |
Think you know the answer? Click here, and find out if you are right. We keep track of your score to give you bragging rights against your peers. This question is worth 1 point in this category: T-SQL. We'd love to give you credit for your own question and answer. To submit a QOTD, simply log in to the Contribution Center. |
|
|
| |
ADVERTISEMENT | The company's new IT initiative, code named Phoenix Project, is critical to the future of Parts Unlimited, but the project is massively over budget and very late. The CEO wants Bill to report directly to him and fix the mess in ninety days or else Bill's entire department will be outsourced. Get your copy from Amazon today. | | |
|
|
|
|
|
| Yesterday's Question of the Day |
| Yesterday's Question (by Steve Jones): I have a data frame that contains the top ten home run hitters in baseball. The data frame looks like this: > HR.hitters rank players Hr 1 1 Barry Bonds 762 2 2 Hank Aaron 755 3 3 Babe Ruth 714 4 4 Alex Rodriguez 696 5 5 Willie Mays 660 6 6 Ken Griffey, Jr. 630 7 7 Albert Pujols 619 8 8 Jim Thome 612 9 9 Sammy Sosa 609 10 10 Frank Robinson 586 I want to get just the all time ranking and the number of home runs. How can I do this in R to return this data? rank Hr 1 1 762 2 2 755 3 3 714 4 4 696 5 5 660 6 6 630 7 7 619 8 8 612 9 9 609 10 10 586 Answer: HR.hitters[1-3] Explanation: I can use brackets and number the columns, separated by a dash. I could actually use HR.hitters[1-2-3] to return all three columns. Ref: Extract part of a data frame - click here » Discuss this question and answer on the forums |
|
|
| Database Pros Who Need Your Help |
| Here's a few of the new posts today on the forums. To see more, visit the forums. TempDB keeps growing for no reason. - Hi All, Thank you for looking into my question. I have SQL Server 2017 Enterprise server FailOver cluster (No Shared storage though)... How else can I view last queries run by anyone on an instance of SQL Server or a particular DB - Is there a way / TSQL to see last <N> queries run by anyone on an instance of SQL Server or... SQL Server 2017, 2012 Express Edition Insert Select Performance Issue with very less data. - Command 1: (Get Count from View1) More than one check constraint per column? - In the documentation for CREATE TABLE (Transact-SQL), under check constraints, I see: A column can have any number of CHECK constraints,... Init-cap the characters in SQL server - Hi Sir, How to Init-cap the characters in SQL server in SQL query without creating any function? create table abc (Month_Name nvarchar(10)) [code... Windows Reboot time in sys.dm_os_sys_info does not match systeminfo - I've got an unusual situation. to get the Operating System Reboottime (NOT the SQL Service restart time) i've used this query... Connectivity issue with a multisubnet AlwaysOn - Hello, I have a problem with a multisubnet that I can't figure out how to resolve. I have a 2 replicas AlwaysOn... Not able to see SSIS packages when SQL Instance and Integration Services are on two separate nodes. - I have a three nodes in Failover Clustered Instance 2016 SQL Server. Also I have SSIS installed on all three... Need help in converting Nvarchar to Datetime - Hi Experts, Please help me in Nvarchar to Datetime, my scenario is as below I'm getting data in NVARCHAR and format is... What are some really dumb things you could write in a SQL Query - We all know there are plenty of recommendations as far as what to avoid when writing T-SQL. For example, it's... Difference between two join method - Hi guys, Can anybody tell me the difference between the two below examples and why I would use one over the... Select Query Assistance (Comparison) - I am modifying a Query I have. (I am using dummy data below as the set i have is quite... Algorithm to combine 2 integer into one value and then disintegrate back into two - Hi, I may not be the first one to ask this, but I'm looking to find a way to combine two... Importing from XML to SQL 2014 - This is my 4th day on SQL so I am a newbie. I created a query to import from an... Showing all records that appear more than once - Hi everyone I'm trying to write some SQL that will allow me to pull all records where the NUMBER column appears... How to flip tables in sql server 2014 - Hello, I have a requirement wherein there are 2 tables (Staging & Target) in the same database. Everytime data is first loaded in... Database Management vs Database Administration Meaning - I have been thinking about this terminology and just wanted to get some opinions. My first effort is below: Database Management... HOW TO GET FIRST DATE - Dear all, I have data like: NAME SPONSOR_NAME DATE_OF_CALL A FI Information from ALL SQL Servers - Hi I would need to grab all in Evolution of career from perm/consulting to freelancing? - Hi, I am working as DBA for 17+years. permanent -> consultant-> permanent -> consultant ... at some point I start thinking about... |
|
| This email has been sent to newsletter@newslettercollector.com. To be removed from this list, please click here. If you have any problems leaving the list, please contact the webmaster@sqlservercentral.com. | This newsletter was sent to you because you signed up at SQLServerCentral.com. Feel free to forward this to any colleagues that you think might be interested. If you have received this email from a colleague, you can register to receive it here. | This transmission is ©2018 Redgate Software Ltd, Newnham House, Cambridge Business Park, Cambridge, CB4 0WZ, United Kingdom. All rights reserved. Contact: webmaster@sqlservercentral.com |
|
|