One of the things I've learned the hard way is to include a checklist of data-specific criteria in your Data Requirements document.
We have all heard the question come in from a developer asking about formatting of numeric values. "How many decimal places?" or "Does this value have a leading dollar sign?" The Data Requirements section of most Requirements Documents lists the name, description, data type and other characteristics of specific data elements. It defines whether each data element is a character (and its length) or numeric, and whether NULL's are allowed.
But there's a lot more that developers will need to know about data elements, and one would be surprised how often these items get overlooked and cause questions in development, defects in System Test, or even worse – errors in production due to data falling outside of boundaries, text that is formatted incorrectly, or NULL values that slip through the cracks. A checklist forces one to ask the question and then document the results.
Adding these detailed items also helps teach and enforce detailed thinking. I again harken back to a Senior BSA who kept beating the "You can never have too much detail" into the junior BSA's on his team. By looking at the deep details over and over again it becomes second nature; corner-cutting becomes less and less likely, and actively discouraged when uncovered. What seems like extra, unnecessary work often saves time overall, and improves the quality of the deliverable.
Here are some of the items that need to be included in data requirements definition:
Description of the Data Element
- Describes the data element in detail and how is it used. And no, using "The ID of the Customer" as the description for CustomerID is not enough. See fellow CapTech'er Tom Krieger's excellent discussion here about data defintion.
- All Upper Case? How will the data be used? Is all upper case appropriate as a display format in this specific situation?
- Upper Case for the First Letter? I got an advertisement once addressed to "david elliott". What impact would that have on a potential customer?
- What about case for multiple words in a field? How do you handle a city like "San Diego"?
- Is an empty string or spaces valid? For delimited files, there is a difference between ,, and , ,. One can equate to a NULL, while the other likely cannot.
- Are leading or trailing spaces valid? Should the data be trimmed?
- Are special characters valid? And which special characters? We allow "@" for some fields such as an email address, but what about "=" and "+"?
- Are there any downstream impacts? If data is exported, are there any pre-existing searches or formatting that are case-sensitive?
- Is zero valid? Can the price of an item be zero?
- Is a negative number valid? Can I pay my credit card with a negative value?
- How many decimal places should a value have? Financial data is presented in dollars and cents, but what about the calculations or values behind the scenes?
- If currency, is a leading dollar sign required? Are other currencies such as the Euro or Peso allowed?
- If it's a numeric value in a fixed length field, is the value zero padded?
- If it's a numeric value in a fixed length field, is the value right or left justified?
- If it's a numeric value in a fixed length field, is the decimal point implied or included?
- Are percentages expressed as a decimal or a whole number? A 200% increase is a lot more than 2%
- Is there a range of values or other rule that applies to a numeric field? Can the price of an item be less than the cost?
- What is the format? mm/dd/yyyy (USA) or yyyy-mm-dd (European), etc?
- If time is included, to what degree? Minutes? Seconds? Tenths of seconds?
- Are there any ranges or other restrictions that apply? Can a date be in the future?
And once the project is implemented, the detailed information about each element should then flow easily into a Metadata document for the next set of developers to review (and thankfully find detailed, accurate information!).
As noted above, often this effort seems to be overkill. But instead, the relatively small amount of additional time invested it makes a better Requirements Document for everyone, reducing questions, test defects, change requests, and production errors.
Once put into practice, the detailed thinking becomes the rule rather than the exception. And that increase in detail can only improve our results.
This is one of a series or articles in which CapTech Data Management and Business Intelligence (DM/BI) consultants explain the tips and tricks that make them successful on their projects.