No AI without data, and no data without EAM

Self-driving transport vehicles, self-organizing logistics chains and warehouses, automated translation in real time, personalized advertising, computer-aided online dating services: the important thing to keep in mind about all of these is that they already exist today. Nevertheless, we are only at the beginning, not the end, of the developments that made them possible.

by Dr. Karsten Schweichhart 

More than anything else, these developments are based on a breakthrough in the field of artificial intelligence. AI in turn requires the availability of virtually unlimited amounts of data, which can and must be provided by well-organized enterprise architecture systems. In other words, no AI without data, and no data without EAM.

New qualities in AI

After spending some 20 years in “hibernation,” AI algorithms are beginning to deliver what their proponents have always promised they would. Machine learning, deep learning, ever more complex neural networks: both the computing power and the amount of training data available for AI have now passed the critical thresholds these methods need in order to work. Just about anything has seemed possible ever since the world’s number one human Go player was easily defeated by a computer.

The important thing in terms of practical application is to ensure that algorithms like the one in that computer can in fact be used to develop all different types of services, bots, and analyses that directly help improve processes, procedures, relationships with customers, and even business models. This in turn requires the availability of sufficient amounts of the right kind of data.

New qualities in data

Data is the world’s new raw material – and its most valuable one as well. Data is the new gold, the new oil, as they say. Four powerful strategic trends have led to this situation:

1. Sensors installed just about everywhere now produce unlimited amounts of data on all things, processes, landscapes, and living organisms.

2. These sensors are being networked, which makes all of their data available for analysis and processing.

3. Cloud and edge computing: Data storage and processing are becoming more and more effective and widespread as the associated systems become cheaper, faster, and more compact. All of this data, including data on people, is available in the cloud and at the network edge.

4. Algorithms and AI: Enabled by massive amounts of data and computing capacity that go beyond the so-called critical limits, algorithms and AI have become the most powerful tools of modern technology.

An incalculable number of unpredictable possibilities are now being created. The one thing they all have in common is that data is the central raw material that drives them. History has shown that whenever human beings discover or create a new raw material, they tend to exploit it in three stages: piecemeal exploitation, sharing via trading, and the establishment of an economic framework for the raw material in question. That’s the way it was with gold and electricity, and that’s the way things are with oil at the moment.

The gold rush that began at the Klondike River in 1896 firmly established gold as a measure of value throughout the modern world. Up until the First World War, gold coins were used as a means of payment all around the globe (examples include the goldmark, the sovereign, and the vreneli), and banks and exchanges became the trading platform for this raw material. These institutions are still involved in the gold trade today, and governments maintain gold reserves stored in vaults to guarantee the stability of their currencies and economies.

Electricity was a disruptive invention that replaced windmills, water mills, and steam engines, and it also led to the creation of a completely new type of economy. Electricity was first produced in a piecemeal fashion, for example with a factory’s own generators. However, this setup soon gave way to the construction of power plants. These initially faced public opposition and distrust, even as the distribution networks being developed at the same time offered the promise of a reliable supply of electricity for industry and society at large. The mindset from that time is very similar to today’s attitude toward the huge amounts of data created over the last few years: the IT community is only now really beginning to turn the management of this data over to “central power plants” – i.e. the cloud.

Oil is the fundamental raw material of the global economy today. Ever since it first began to be exploited as a fuel and energy source in the 1850s, oil has been used as the basis for the creation of a variety of substances, a development driven especially by the petrochemical industry after the Second World War. These substances were in turn used to produce plastics, various types of materials, fertilizers, and even medications. Some of these products had previously been considered impossible to produce, and all of them have made their way into every area of our lives and every sector of the economy.

So, if data is indeed the new gold or oil, then the following four statements can by no means be considered unreasonable:

1. Data must be mined, stored, processed, and traded like a raw material.

2. We are still in the piecemeal exploitation stage: Everyone keeps their relevant data raw materials to themselves. No system for data trading exists yet; there are no “data banks” or “data exchanges.” But there will be.

3. A lot of data is still collected and managed locally, although the trend toward the use of “data power plants” (the cloud) is growing. The cloud, the edge, and their associated networks will eventually form a reliable and resilient overall data grid.

4. The “refineries” for data are the algorithms, and the “petrochemical experts” are the data scientists and engineers. They will combine huge amounts of data to create an understanding and insight previously considered impossible to achieve.

The essential thing here is that all of these points must come together – i.e. all four points are conditional upon one another. And it all starts with the extraction (mining) of the raw material.

Data raw material extraction platform: Enterprise architecture

Companies that have long since begun to address and utilize enterprise architecture (EA) enjoy a competitive edge today. Such companies have a domain model for their business and have organized the data in their IT landscape using master data and data stewards. They also participate in associations like CBA Lab in order to promote the further development of enterprise architecture. What companies need to do now is extract all the data from processes and IT systems in order to make it available in a targeted manner for cross-sector use within or even outside the organization.

If one accepts for a moment the tremendous potential harbored by the development described above, one would then have to view the provision and utilization of data as THE critical factor of success in business today. In addition, if one also assumes the possibility of a data economy, then the exploitation of only one’s own data will certainly not be enough to achieve success in that economy.

EA suddenly has new tasks to perform:

1. Organization and provision of a company’s own data as a raw material for internal use (as has previously been the case)

2. Provision of a company’s own data at the edge of the organization for the purpose of sharing or trading data

3. Research and acquisition of external data for internal use (“petro(data)chemistry”)
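To make the three tasks a little more concrete, here is a minimal sketch of how they might show up in a simple data catalog. It is purely illustrative: the names DataCatalog, DataSet, and Scope, as well as the sample data sets and stewards, are assumptions made for this example and do not come from CBA Lab or any particular EAM tool.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    INTERNAL = "internal"  # task 1: own data for internal use
    SHARED = "shared"      # task 2: own data offered at the edge of the organization
    EXTERNAL = "external"  # task 3: data acquired from outside for internal use


@dataclass(frozen=True)
class DataSet:
    name: str
    domain: str    # business domain taken from the company's domain model
    steward: str   # data steward responsible for this data set
    scope: Scope


class DataCatalog:
    """Hypothetical registry of data sets, keyed by name."""

    def __init__(self) -> None:
        self._entries: dict[str, DataSet] = {}

    def register(self, data_set: DataSet) -> None:
        if data_set.name in self._entries:
            raise ValueError(f"duplicate data set: {data_set.name}")
        self._entries[data_set.name] = data_set

    def visible_to_insiders(self) -> list[str]:
        # Internal users can work with everything in the catalog,
        # including externally acquired data.
        return sorted(self._entries)

    def visible_to_partners(self) -> list[str]:
        # Only data explicitly released for sharing or trading is offered outside.
        return sorted(
            name for name, entry in self._entries.items()
            if entry.scope is Scope.SHARED
        )


# Invented sample entries, one per task.
catalog = DataCatalog()
catalog.register(DataSet("machine_telemetry", "production", "alice", Scope.INTERNAL))
catalog.register(DataSet("delivery_times", "logistics", "bob", Scope.SHARED))
catalog.register(DataSet("weather_feed", "logistics", "carol", Scope.EXTERNAL))

print(catalog.visible_to_insiders())
print(catalog.visible_to_partners())
```

The point of the sketch is only that each data set carries its domain, its steward, and an explicit scope, so that the decision about what is offered at the edge of the organization is recorded as data rather than made case by case.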

The established methods of enterprise architecture management will change slightly here. They will also take on new meaning and carry more weight, as they will become the structural technology for extracting data, just as oil is extracted with other technologies. The “Data-Driven Business Models” workstream that was completed at CBA Lab in 2018 gets right to the point here by calling for the systematic strengthening and utilization of the following EAM capabilities:

1. Data strategy development

2. Data model management

3. Data provision management

4. Data governance management

5. Data (IT/reference) architecture management


Enterprise architects – we can do all of this, and more!

Dr. Karsten Schweichhart, Member of the Board
