How Leading Data Organizations Achieve Success: Prioritize People, Process, and Product

Technology is not the biggest challenge anymore when building data platforms

Jonny Daenen
datamindedbe

Many data organizations have reached a stage where technology is not the biggest hurdle anymore. As data technology is maturing, challenges are shifting from scaling data processing to materializing a sustainable data platform. Our Data Maturity Test confirms this shift and indicates organizations today struggle more with the People, Process, and Product aspects.

To evolve into a top data organization, we recommend an approach that shifts away from a technology-heavy focus:

  • Treat your data platform as a product (not a project) to enable its users.
  • Make sure you have a team of people with the right skill set.
  • Put processes in place that enable the people.

Successfully scaling your data efforts requires increasing your maturity

After building an initial data or AI product, many organizations struggle with building new ones. These initial products most likely have a negative return on investment (ROI) due to heavy investments in so-called “data foundations.” The pyramid below shows the relative effort that goes into the different levels of building data products. The bottom, or foundational, part takes the most effort, so reaching a positive ROI can take a while. No wonder Gartner predicted that 85% of AI projects would fail through 2022.

When developing a data product, most efforts go into the foundational parts: infrastructure and operations. The real business value is at the top of the pyramid, so it can take a while before reaching a positive ROI.

Organizations almost always need multiple data products. The ambitions of an organization often translate directly into data needs, potentially spanning multiple departments. Even ambitions that seem unrelated to data, such as “increasing the adoption of our product” or “making our team 30% more efficient,” require data processing to track their success. These data needs must be covered by multiple data products, each with the pyramid shape depicted above. So, how do we avoid all the “overhead” needed to build these data products?

An obvious approach to reducing overhead is constructing a data platform with common components in infrastructure and operations (for example, a data catalog). The idea is to build these foundational parts once and share them across data products, so that the ROI turns positive sooner.

As organizations typically need many data products, a good split between data platform and data product can help you reach a positive ROI.
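
The arithmetic behind this argument can be sketched in a few lines of Python. This is a minimal illustration, not a costing model: the `Costs` class, both functions, and all figures are hypothetical and chosen only to show the shape of the effect. It also assumes the shared foundation is paid for once and ignores ongoing platform maintenance.

```python
from dataclasses import dataclass


@dataclass
class Costs:
    """Hypothetical effort and value figures, in arbitrary units."""
    foundation: float        # infrastructure and operations (bottom of the pyramid)
    product_specific: float  # effort unique to one data product (top of the pyramid)
    value: float             # business value delivered by one data product


def roi_without_platform(c: Costs, n_products: int) -> float:
    """Every data product rebuilds its own foundation."""
    cost = n_products * (c.foundation + c.product_specific)
    return (n_products * c.value - cost) / cost


def roi_with_platform(c: Costs, n_products: int) -> float:
    """The foundation is built once and shared by all data products."""
    cost = c.foundation + n_products * c.product_specific
    return (n_products * c.value - cost) / cost


# Assumed numbers: the foundation dominates the effort of a single product.
costs = Costs(foundation=100.0, product_specific=20.0, value=60.0)
for n in (1, 2, 3, 5, 10):
    print(n, round(roi_without_platform(costs, n), 2), round(roi_with_platform(costs, n), 2))
```

With these assumed numbers, the ROI without a shared platform stays negative no matter how many data products are added, while the ROI with a shared platform turns positive from the third data product onward.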

However, at Data Minded, we have seen many organizations struggle to successfully deliver both data products and platforms. Based on our experience, we have collected several key best practices and bundled them into our Data Maturity Test. This test allows organizations to identify whether they can reach their data ambitions.

Our Data Maturity Test indicates whether an organization can reach its data ambitions.

Data maturity is the level of sophistication an organization achieves in using its data assets. Our Data Maturity Whitepaper elaborates on the test and provides insights into six key dimensions for building a sustainable data platform. Learn more about data maturity in our previous blog posts, “Why grow in data maturity?” and “The 6 pillars of data maturity.”

Towards sustainable data platforms

In the field, we see many organizations struggle with building data platforms. Making sure we can process “Big Data” is generally not the issue anymore: many modern technologies come with horizontal scalability that lets you add power when your load increases. Combined with the power of the cloud (on-demand storage and compute), this allows for a seemingly short time-to-market. The result, the ability to scale fast, looks like the silver bullet for building data platforms, right? In practice, it is not: the challenge has shifted from scalability to sustainability.

The challenge has shifted from scalability to sustainability.

In an ideal world, when you add more use cases (data products), you don’t need additional work to keep a platform healthy. Your platform remains up-to-date, usable, secure, self-serviceable, scalable, flexible, cost-efficient, etc. Most of all, it should remain in place for many years to come. We must therefore consider the continuous investments in the data platform, which makes ROI relevant again, this time at the level of the platform itself. However we build it, it needs to be sustainable.

For us at Data Minded, sustainability means that a data platform is future-proof. The image below shows two organizations that have scalable data platforms. The organization on the left needs to add more people whenever it adds new data products, because each new data product might need new functionality or data. As platform engineers and data engineers are difficult to find, this often translates into people being moved around, resulting in a build-up of technical debt in some other part of the organization. Even if you do manage to add more people, this approach will, over time, lead to significant coordination overhead.

On the left, we see an organization that scales with people, who quickly become a bottleneck. On the right, the organization employs a product mindset, combining the right processes and people to scale without needing more engineers. Only the latter approach is sustainable for building a future-proof data platform.

The organization on the right follows a different approach. It makes sure to build a platform that can easily deal with new data products being added. We believe this requires the right tools, but especially a focus on the people, the processes, and employing a product mindset. In this scenario, when you do add more people, you add them at the top of the pyramid, where the biggest value resides.

Shift from Technology to People, Process & Product

The graph below shows the results of our ongoing Data Maturity Test. We see the share of respondents that reach different maturity levels for each dimension. It’s immediately clear that the highest scores are reached in the Technology dimension, while all other dimensions are lagging behind.

More organizations are reaching a high data maturity level in the Technology dimension. This indicates that technology is not the biggest challenge anymore in becoming a data-mature organization.

Why is it more difficult to reach high levels of data maturity for the non-technology dimensions? When we look at the percentage of organizations implementing the foundational best practices for every dimension, we find that many organizations still struggle with these. The lack of a solid foundation could explain the difficulties in reaching higher maturity levels in these dimensions.

When we look at the two foundational best practices per dimension, the ones for Technology have the highest adoption. The lack of a solid foundation could explain the difficulties in reaching higher maturity levels.

Technology is clearly in the best shape, followed by the Organization dimension. The high Technology score is explained by the many organizations exploring modern (cloud) technologies to accelerate their data efforts. Today, many technologies come with scalability built-in, and one-click setups make deployments look (too) easy. However, combining technologies into a scalable and sustainable data platform is more complex, especially if it needs to implement specific requirements and you need to run it for a long time.

The relatively high Organization score stems from the business side of organizations, which is often pushing to do more with data. This indicates a desire for more data and AI efforts and confirms the findings from our previous post, “Why grow in data maturity?”

Instead of a technology-heavy focus, turn your attention to People, Process, and Product.

The other dimensions score significantly lower. As data is organization-specific, we turn our attention to People, Process, and Product. To build a sustainable data platform and organization, we recommend improving the foundational data maturity in these areas instead of a technology-heavy focus. This can be done by focusing on the following three key aspects:

  • Product: treat your data platform as a product (not a project) to enable its users.
  • People: make sure you have a team of people with the right skill set.
  • Process: put processes in place that enable the people.

Treat your data platform as a product (not a project) to enable its users

Your data platform is not a “project” — it must remain operational after its initial release. While some data products might have a limited lifespan (for example, a one-off yearly report), data platforms themselves are typically developed to remain in place for a long time. At the same time, you don’t want to hold off creating value from your data. So, as with any product, you’ll have to develop a platform that balances time-to-market, cost, and features.

Without users, there is no data platform.

Keep in mind that without users, there is no data platform. Hence, a user-centric design of your data platform is key. A data platform should enable its users to do more with data. However, this does not mean purely delivering features is the way to go. A focus on self-service and automation is paramount, as is keeping your platform consistent and usable. Wrong decisions in this area can quickly lead to a data organization becoming a bottleneck.

On top of the feature (or functional) part, you’ll need to have a clear view of the non-functional part: governance, versioning, updates, SLAs, recovery plans, documentation, security, etc., play an important role in the usability of your platform. Furthermore, these aspects will determine the operating model of the platform.

As your platform evolves, technical debt will pile up and must be controlled. Think of manual activation scripts that were never automated, legacy dependencies that are flagged as security risks, and so on. We often see components of a platform left in a non-industrialized state, waiting to pop back up at the worst possible moments.

In summary, we recommend employing a product mindset that balances features, operations, and technical debt from the start, while the platform is still being built. This is a challenging task, as new features are more immediately visible to stakeholders and users, but it is the only way to create a sustainable data platform.

If you’re interested in scaling data organizations, make sure to read up on Data Mesh, a methodology that attempts to solve the data team bottleneck.

Make sure you have a team of people with the right skill set

To build a sustainable data platform and data products, you need people with the right skills. Ideally, you have a decent balance between technical and business-specific skill sets. But even for the technical skills, it’s more complex than just looking at “junior” and “senior” roles.

Teams should be able to deliver and operate their product end-to-end.

When looking at data-related skill sets, we see a more fine-grained role definition. It has recently become clearer what the roles of ML engineer, data engineer, platform engineer, data scientist, and data analyst entail. However, we often focus on the development part when building data products, while aspects such as industrialization and automation are equally, if not more, important when considering sustainability.

Acknowledging these different skills allows teams to more naturally add complementary profiles to help deliver and operate their product end-to-end. Furthermore, this will enable them to evolve faster towards the product mindset described above.

Put processes in place that enable the people

A great product enables its users and requires a diverse team with the right skill set. Some minimal process best practices need to be in place to ensure the team can succeed. These include testing your code, limiting work in progress, sharing knowledge in time, and performing high-quality code reviews (no “rubber stamping”).

We’re not advocating for highly bureaucratic processes here; it’s merely about having the right checks in place to ensure your data platform and product are, and will remain, in good shape. Once you have the basics right, you can work towards a process that allows you to deliver to production many times a day, which brings you to the same level as elite software development teams.
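
As one illustration of the “testing your code” practice, here is a minimal sketch of a data transformation with an accompanying unit test in Python. The function `deduplicate_latest`, the record fields, and the sample data are all hypothetical and not tied to any specific platform or tool. The point is only that even a small transformation can carry a check that runs automatically before changes reach production.

```python
from typing import Iterable


def deduplicate_latest(records: Iterable[dict]) -> list[dict]:
    """Keep only the most recent record per customer_id (hypothetical fields)."""
    latest: dict[str, dict] = {}
    for record in records:
        key = record["customer_id"]
        # ISO-formatted date strings compare correctly as plain strings.
        if key not in latest or record["updated_at"] > latest[key]["updated_at"]:
            latest[key] = record
    # Sort for a deterministic output order, which keeps tests stable.
    return sorted(latest.values(), key=lambda r: r["customer_id"])


def test_deduplicate_latest_keeps_newest_record() -> None:
    records = [
        {"customer_id": "a", "updated_at": "2023-01-01", "city": "Leuven"},
        {"customer_id": "a", "updated_at": "2023-03-01", "city": "Ghent"},
        {"customer_id": "b", "updated_at": "2023-02-01", "city": "Antwerp"},
    ]
    result = deduplicate_latest(records)
    assert [r["city"] for r in result] == ["Ghent", "Antwerp"]


if __name__ == "__main__":
    test_deduplicate_latest_keeps_newest_record()
    print("all checks passed")
```

A test like this could run on every change as part of a code review, which gives reviewers something concrete to check instead of rubber stamping.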

Test your maturity!

In this post, we looked at why data maturity is relevant to reaching your (data) ambitions: ensuring you have a scalable and especially a sustainable way of delivering data products is key. A big risk is building an organization where your data efforts need people to scale, as this will hinder your data products’ development and, more importantly, their long-term viability.

The results of our Data Maturity Test confirmed that as data technology matures, the challenges are shifting toward People, Process, and Product dimensions. The only way to build a sustainable data platform is to increase maturity in these areas. For each of the three underdeveloped dimensions, we devised a high-level direction that is complemented by the specific improvement tracks in our whitepaper.

If you’re interested in benchmarking your own organization, we recommend completing our free Data Maturity Test. After a few minutes, you’ll know how you measure up against others and immediately receive specific recommendations to improve your data maturity.

Example report of the Data Maturity Test by Data Minded.

Thanks to Frederic Vanderveken, Michelle Gybels, Nathan Derave, Kris Peeters, Maikel Peeters and Cedric Mingneau for providing valuable input and feedback on this post.


Jonny Daenen
datamindedbe

Data Engineer @ Data Minded, AI Coach @ PXL Next - Unleashing insights through data, clouds, AI, and visualization.