The build vs buy debate has existed as long as packaged software itself. Any serious discussion quickly concludes that there’s no one right answer and that the real question is when to do one or the other. That discussion, in turn, usually leads to a recommendation that companies build software that will create unique competitive advantage and otherwise buy when a satisfactory option exists. The implicit assumption behind that recommendation is that buying is cheaper than building. This isn’t always true, but it applies in most cases, especially when cost calculations include staff cost, ongoing maintenance and feature updates, the risk of project failure or under-performance, and the opportunity cost of not using scarce developers to build other systems that do create unique advantages.
But the discussion changes when Customer Data Platforms are involved.* I’ve recently heard pro-build arguments that raise other valid issues. The two big ones are:
- most of the work in deploying a CDP is in data collection, which includes identifying source systems, understanding their contents, deciding which elements to include, and defining transformations to make the data usable. This work is the same whether you’re building or buying. Since it accounts for the bulk of the project cost, the cost to build or buy can’t be very different.
- many companies (especially big ones) already have systems that do much of what a CDP is intended to offer. In these situations, the incremental cost to extend existing systems will be less than the cost of adding a separate CDP, which will unnecessarily (and expensively) duplicate many existing functions.
Both arguments attack the “buying is cheaper” assumption. Neither should be summarily dismissed. Rather, let’s fit them into a larger framework that looks at more factors to consider in a CDP build vs buy decision. To make things manageable, this framework identifies items that are the same for both, favor build, and favor buy, and groups them based on whether they apply to data collection, processing, or outputs. The table below summarizes my list; I’m sure there are others.
| |same for both|build advantage|buy advantage|
|---|---|---|---|
|data collection|source analysis|existing connectors|prebuilt connectors|
|processing|define requirements|less redundancy, custom features for …| |
|outputs|define requirements|existing connectors|prebuilt connectors|
Exploring this in more detail:
- Data Collection/Same: as already mentioned, much of the work in assembling a CDP is understanding source data. This is required regardless of whether the CDP is built or bought. If the data is well understood, a purchased CDP benefits as much as a built system – so long as IT staff who understand the data are available to the project.
- Data Collection/Build Advantage: a built CDP will take advantage of whatever connectors have already been created to feed existing systems. Note that any relative advantage is diminished if a purchased CDP can also use these connectors, either to create direct feeds or by reading data the connectors have pulled into existing data lakes or warehouses. That should be true in most cases.
- Data Collection/Buy Advantage: a purchased CDP will have prebuilt connectors for many source systems and often a standard API for creating new connectors. The value of this depends on how many of your company’s systems are covered, which is likely to depend on how modern they are.
- Processing/Same: both built and purchased systems depend on effective requirement definition, another critical task that can consume substantial project resources. There may be some additional work bringing vendor staff up to speed for a purchased system, compared with having a system built by internal staff who already understand the business. But this is probably balanced by CDP vendor staff having deeper experience with CDP-specific issues.
- Processing/Build Advantage: extending existing systems may mean that existing data stores can be expanded, rather than copying data into a separate CDP database. This is especially important for companies with massive data volumes. But the actual advantage depends on the technical details, since the CDP often needs data placed into a different format from existing systems, in which case both build and buy solutions will require a new data store.
A built system will only add features that are not already available in existing systems, so there’s less potential redundancy compared with a purchased system. This could mean lower operating costs but, again, it depends on how much of what the CDP does is really new. And there’s a reasonable chance a purchased CDP will actually reduce operating costs by enabling the company to sunset some existing systems or processes.
A built system can include features that are not available in a purchased CDP, creating unique competitive advantage. How much this matters will depend on how unique the company’s requirements really are, and whether a purchased CDP can also be extended to meet them. CDP vendors would argue that their systems are extremely flexible and extensible.
- Processing/Buy Advantage: a purchased CDP will have core CDP processes already built, saving the cost of custom development. This is probably the strongest argument for a purchased system. Of course, it depends on how many new processes are needed and how hard it would be for the company to build them on its own.
A purchased system will also include advanced features that wouldn’t be delivered in early versions of a built system, which will inevitably focus on meeting basic requirements as quickly as possible. It could easily take years for the in-house system to catch up with the refinements of a mature purchased product, and the purchased product will also be improving during that time. There’s a reasonable argument that a purchased CDP is likely to add features even before any particular company knows it needs them, in which case it would be impossible for the built CDP to ever meet user needs as quickly as the purchased product. One caution: purchased CDPs themselves vary greatly in their maturity, so this will apply more to some than others.
Build and buy choices both have their risks. There’s always a chance that a purchased system won’t perform as expected, won’t evolve to meet future needs, or will be discontinued if its developer runs into business problems. But these risks can be limited through careful vendor selection and contracts. By contrast, development failures for custom software are almost the norm: industry lore is filled with high-priority projects that ran over time and over budget and still failed to meet expectations. The risk is greater for systems like CDPs, which have requirements that are less familiar to many corporate IT groups than operational systems like order processing or CRM. So, on balance, I think it’s fair to say that built solutions are higher risk – even though I realize that many in-house IT teams would disagree. Perhaps we can all agree that this is something to be assessed on a case-by-case basis.
In making all these assessments, it’s important to look at the full scope of long-term CDP requirements. It might be relatively easy to extend existing systems to meet a handful of initial requirements, but there would then be a backlog of further enhancements that would each require additional investments. A purchased CDP should deliver a much broader set of features from the start and within its original purchase price.
- Outputs/Same: again, the work to define output requirements will be pretty much the same whether a system is built or bought.
- Outputs/Build Advantage: existing systems may have connectors in place to deliver outputs to company reporting, marketing, messaging, analytical, and other systems. This is especially helpful if the targets are legacy systems that are difficult to work with. As with inputs, a purchased CDP should also be able to take advantage of many existing connectors, so the net advantage for built systems is limited.
- Outputs/Buy Advantage: as with data collection connectors, purchased CDPs will have a library of prebuilt connectors for output systems. This could save considerable effort, especially if the CDP project requires large numbers of connections that don’t already exist and the CDP vendor can provide them.
Summing all this up: some issues, such as data preparation, are less relevant to the build/buy choice than it might seem. The main factor driving the decision is the incremental work needed to build and maintain an internal solution compared with the cost of adding a purchased system. If existing systems can meet CDP requirements with relatively few changes, a built solution makes sense. If a major development project is needed, it’s probably better to buy. Because CDPs are inherently flexible, it’s unlikely that a built solution will truly provide any competitive advantage that a purchased CDP cannot duplicate with the same or less development effort.
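To make that comparison concrete, here is a minimal Python sketch of the arithmetic. It is an illustration under stated assumptions, not a costing method: the `Option` class, its field names, and every number below are hypothetical, and a real assessment would weigh many factors that don’t reduce to a single figure. The point it shows is that shared work such as source analysis appears on both sides and cancels out, so the result turns on incremental development, running costs, and risk.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """Hypothetical cost inputs for one option, in any consistent currency unit."""
    shared_work: float          # source analysis, requirements: same for build and buy
    upfront: float              # incremental development, or purchase and implementation
    annual_running: float       # maintenance, enhancements, subscription fees
    failure_probability: float  # chance the project fails or under-performs (0 to 1)
    failure_cost: float         # cost incurred if it does

    def expected_cost(self, years: int) -> float:
        """Expected total cost over the planning horizon."""
        return (
            self.shared_work
            + self.upfront
            + self.annual_running * years
            + self.failure_probability * self.failure_cost
        )


def cheaper_option(build: Option, buy: Option, years: int) -> str:
    """Return 'build', 'buy', or 'tie' based on expected cost alone."""
    build_cost = build.expected_cost(years)
    buy_cost = buy.expected_cost(years)
    if build_cost == buy_cost:
        return "tie"
    return "build" if build_cost < buy_cost else "buy"


# All figures below are invented for illustration only.
extend_existing = Option(shared_work=500, upfront=100, annual_running=40,
                         failure_probability=0.25, failure_cost=200)
major_project = Option(shared_work=500, upfront=600, annual_running=100,
                       failure_probability=0.5, failure_cost=500)
purchased_cdp = Option(shared_work=500, upfront=150, annual_running=60,
                       failure_probability=0.125, failure_cost=200)

print(cheaper_option(extend_existing, purchased_cdp, years=3))
print(cheaper_option(major_project, purchased_cdp, years=3))
```

With these made-up inputs, a small extension to existing systems comes out ahead of buying, while a major development project does not. Changing `shared_work` by the same amount on both sides never changes the answer, which is why data preparation matters less to the choice than it first appears.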
One important caveat to all this is that build vs buy is less a choice than a continuum. Even built solutions rely heavily on purchased components, such as data storage platforms, function libraries, and external services (e.g. third-party identity resolution). Many purchased CDPs use exactly the same tools. To the degree that builders can rely on purchased tools, they get the same benefits from pre-built components that they would get from a purchased CDP.
The tools available for purchase continue to improve: platforms like Google Cloud keep adding new CDP-supporting services; databases like Snowflake make it easier to manage CDP data structures; applications like Rudderstack and Informatica provide complex process flows. But assembling a functional CDP will never be as simple as snapping these together like the proverbial Lego Blocks. Then again, deploying a CDP also takes more than just plugging it in.
What matters is that the tools keep getting better, meaning the cost of building is reduced. At the margin, this shifts the balance towards built systems, at least for companies with the resources to use the tools effectively. But in many cases – perhaps the vast majority – a purchased CDP still makes the most sense.
In other words: it depends.
* The CDP Institute defines a CDP as “packaged software that creates a persistent, unified customer database that is accessible to other systems.” This means that, strictly speaking, there is no such thing as a custom-built CDP. But we’ll use “CDP” here to refer to any system that performs the functions listed in the definition: “creates a persistent, unified customer database that is accessible to other systems.” In practice, many home-built “CDP” systems won’t be fully accessible to other systems, either. We’ll ignore that here but note that ease of connecting with new systems is one of the advantages of buying rather than building.