Documentation. The Under Recognised Skill

Today we have a guest editorial from David Poole, as Steve is out on holiday.

I have a confession to make. I enjoy writing documentation and do so for a number of reasons.

- Writing is a tool I find effective in helping me learn.
- I enjoy capturing how the different parts of a system work and how they relate to each other.
- I enjoy sharing information, which is a key part of my role. Writing is a soft skill that enables this.
- I enjoy the craftsmanship of writing and the information architecture supporting it.

Many of you find the task of documenting systems to be a sin bin task. You didn’t start a career in IT to be a scribe. That is OK; everyone has different passions, though I find that people give many reasons to avoid the task rather than say “sorry, I would rather do almost anything other than spend my time writing documentation”.

Yet many of the skills and thought processes that go into a good database design have comparable counterparts in producing good documentation. As with a good database design, you don’t notice the good until you have to deal with the bad.

What does good look like?

Let’s take a look at some Microsoft documentation to see some examples of the information architect’s skill.

Information architecture for Microsoft official documentation for sp_column_privileges

The screenshot of the sp_column_privileges documentation above also demonstrates some of the principles behind information mapping.

| Principle demonstrated | Details |
|---|---|
| Chunking | The information on system stored procedures is broken up into discrete pages per stored procedure. The pages themselves are broken up under discrete headings. |
| Relevance | Our page is focussed entirely on one particular stored procedure. |
| Labelling and consistency | The right-hand “In this article” bookmarks label the relevant sections in a consistent way across the different stored procedures. The structure of the documentation page is also consistent. The consistency of the layout also reduces the cognitive load of trying to ingest the information on the page. There are some variations, in that some stored procedures omit examples whereas others have an additional and explicit Azure Synapse Analytics and PDW set of examples. |
| Accessible detail | We can use the right-hand “In this article” panel to skip to the part of the documentation that is relevant to us. |

The other aspects on the page that are important are as follows.

- We see the document date.
- As authors and readers we have an indicator as to how much time it will take to read the documentation.
- We see the supported versions of SQL Server clearly.
- The left-hand table of contents is carefully categorised to group relevant information together.
- We use the hyperlinked breadcrumb trail at the top of the page to keep track of where we are in the documentation set.
- Although not demonstrated on the sp_column_privileges page, where a stored procedure is marked for deprecation Microsoft emphasises the fact by placing a shaded warning block around the disclaimer text.
- The facility for the reader to say whether or not the page is helpful to them gives valuable feedback to the authors. Documentation is written for the reader, not the writer, after all.
- Although not visible to the reader, should you look at the page source for the documentation you will see a rich set of metadata categorising the content. Search engines can use this to make the content easier to find.

In short, we probably don’t consider the thought, design and hard work that has gone into the Microsoft online documentation. We can see that similar thought processes have gone into the way that Redgate SQLDoc and Data Catalog present their information.

What else supports good information husbandry?

From the example above we can see two elements of good documentation that are vital for its success.

- A clear template and pattern for documenting stored procedures and other SQL Server artefacts
- A carefully designed taxonomy and information architecture pattern to follow

The user community contributes to Microsoft documentation, which is important for the following reasons.

- Shared ownership of the documentation
- A peer review and approval process
- Librarianship to keep the documentation current, catalogued and relevant. Data quality benefits from data stewards and governance, and so too does information husbandry.

Without these three things any documentation will quickly degrade into an unnavigable swamp full of pages of uncertain provenance. When people doubt the provenance they lose trust, and when they lose trust the hard work of assembling the documentation is wasted through disuse.

Where else I think we can improve

By asking a few basic questions we can focus our efforts where they may be needed.

- Do we have a business model measuring the cost of delay for not having sufficient documentation against the cost of producing that documentation? I could waste a couple of hours (or more) Googling and reading Stack Overflow posts, then trying to assimilate that information into a form useful for what I am actually trying to do. As our teams repeat this, the cost soon mounts up. At the extreme end, an absence of information may lead to system change being seen as too risky. Because of this the system is considered legacy and in need of an expensive rewrite.
- The reader determines the effectiveness of documentation, so we must provide a mechanism for feedback. Do we encourage that feedback, and do we have a process for acknowledging and acting on it?
- Do the readers themselves have a means to upgrade the documentation? Do they have shared ownership?
- We devote considerable effort to measuring and optimising the effectiveness of our customer-facing websites. Could we make better use of our SEO (Search Engine Optimisation) skills for our internal documentation?
- We must retire information that is no longer relevant so that it does not pollute our remaining documentation. Do we seek to improve and refactor documentation as we find better ways to express what it is trying to communicate?
- Authors must get clear guidance as to what should and should not be documented. Where should they publish that documentation?
- Do we provide training and tooling to help our would-be authors? Do we adapt to our authors’ needs?

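The first question above, weighing the cost of delay against the cost of producing documentation, comes down to simple arithmetic. The Python below is a minimal sketch of one way to frame it. The class, the function names and every figure are hypothetical, chosen purely for illustration, and are not drawn from any real measurement.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DocumentationCase:
    """Hypothetical inputs; every field here is an assumption to be replaced."""
    hours_lost_per_lookup: float        # time spent searching when no docs exist
    lookups_per_month: int              # how often the team repeats that search
    hours_to_write: float               # one-off effort to produce the docs
    hours_to_maintain_per_month: float  # ongoing upkeep of the docs
    hourly_rate: float                  # blended cost of one hour of work


def monthly_cost_of_delay(case: DocumentationCase) -> float:
    """Cost per month of NOT having the documentation."""
    return case.hours_lost_per_lookup * case.lookups_per_month * case.hourly_rate


def monthly_net_saving(case: DocumentationCase) -> float:
    """Cost of delay avoided each month, less the cost of maintaining the docs."""
    upkeep = case.hours_to_maintain_per_month * case.hourly_rate
    return monthly_cost_of_delay(case) - upkeep


def months_to_break_even(case: DocumentationCase) -> Optional[float]:
    """Months until savings repay the writing effort, or None if they never do."""
    saving = monthly_net_saving(case)
    if saving <= 0:
        return None
    return (case.hours_to_write * case.hourly_rate) / saving


if __name__ == "__main__":
    # Illustrative figures only: 2 hours lost, 10 times a month, at 50 per hour.
    example = DocumentationCase(
        hours_lost_per_lookup=2.0,
        lookups_per_month=10,
        hours_to_write=36.0,
        hours_to_maintain_per_month=2.0,
        hourly_rate=50.0,
    )
    print(monthly_cost_of_delay(example))   # 1000.0
    print(months_to_break_even(example))    # 2.0
```

A model this crude ignores the harder-to-price effects mentioned above, such as a system being written off as legacy, but even rough numbers give the conversation a starting point.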
Concluding thoughts

We find documentation "necessary" when it makes our life easier. That should be measurable in some way.

- Pace of change statistics
- Reduction in interruptions
- New starter onboarding
- System longevity
- ...etc

In an office environment I can simply interrupt a colleague to get the internal information I need. However, this isn't practical in a distributed working environment such as we have during this pandemic. I feel the absence of suitable documentation more keenly, whether I need to give or ask for help.

I believe that businesses are slow to recognise the value of the support a suitably documented system can bring, in much the same way as businesses are slow to recognise the value in the work described by Dr. Forsgren in her book Accelerate. Does the organisation see the way we do things today as simply the cost of doing business, not an optimisation opportunity? Hopefully, increasing regulation and scrutiny of the data world will change that perception.

David.Poole