One common theme in recent tectonic shifts in information technology is data management. Analyzing customer responses may require combing through unstructured emails and tweets. Timely analysis of web interactions may demand a big data solution. Deployment of data visualization tools to users may dictate redesign of warehouses and marts. The data architect is a key player in harnessing and capitalizing on new data technologies.
According to the Data Management Body of Knowledge, the data architect "provides a standard common business vocabulary, expresses strategic data requirements, outlines high level integrated designs to meet these requirements, and aligns with enterprise strategy and related business architecture." Based on the job descriptions I've seen, this role definition is generally accepted. The required skills vary widely from one posting to the next, though, with data modeling being the single common denominator.
A while ago Joe McKendrick, posting on the Informatica blog site, described personal qualities that the role requires. In effect, Mr. McKendrick's data architect is likeable, influential, respected, persuasive, and enthusiastic, and also "needs to be well-versed in the architectural and modeling principles that support data management." Expanding on that last capability, here's my list of required skills of the data architect:
- Foundation in systems development: The data architect should understand the system development life cycle; software project management approaches; and requirements, design, and test techniques. The data architect is asked to conceptualize and influence application and interface projects, and therefore must understand what advice to give and where to step in to steer toward desirable outcomes.
- Depth in data modeling and database design: This is the core skill of the data architect, and the most requested in data architect job descriptions. The effective data architect is sound across all phases of data modeling, from conceptualization to database optimization. In my experience, this skill extends to SQL development and perhaps database administration.
- Breadth in established and emerging data technologies: In addition to depth in established data management and reporting technologies, the data architect is either experienced or conversant in emerging areas such as columnar and NoSQL databases, predictive analytics, data visualization, and unstructured data. The data architect need not be deep in all of these technologies, but should ideally be experienced in one or more, and must understand them well enough to guide the organization in evaluating and adopting them.
- Ability to conceive and portray the big data picture: When the data architect initiates, evaluates, and influences projects, he or she does so from the perspective of the entire organization. The data architect maps the systems and interfaces used to manage data, sets standards for data management, analyzes the current state, conceives the desired future state, and defines the projects needed to close the gap between the two.
- Ability to astutely operate in the organization: Here we arrive at Mr. McKendrick's five key characteristics, which point to the data architect's ability to operate politically in the organization. I'll refer you to his article for specifics, but his keywords are these:
  - Well respected and influential
  - Able to emphasize methodology, modeling, and governance
  - Technologically and politically neutral
  - Articulate, persuasive, and a good salesperson
To me, the emphasis that many placement ads put on modeling skills is too narrow given the vast change underway in data management. Hiring a data architect who lacks some of the skills outlined here can restrict an organization's ability to capitalize on the opportunities that new data management techniques bring.