Building a Defensible Data Moat in an Analog Industry: Why Real Estate Still Runs on Excel and How We're Changing It
Real estate runs on Excel. This is not a criticism — it's a structural observation about how a highly fragmented, relationship-driven industry with irregular transaction volume and no standardized data formats organizes information. Excel is flexible, portable, and universally understood. For deal-by-deal work with custom inputs, it's a reasonable tool.
It's also where data goes to die.
The knowledge embedded in a development team's collection of Excel models — which sites were evaluated, why they advanced or didn't, what capital stack structures worked, how feasibility varied across markets and deal types — is largely inaccessible for organizational learning, aggregate analysis, or knowledge transfer. Every model is a document, not a record. Every analyst who leaves takes their institutional knowledge with them.
Building a defensible data moat in an analog industry isn't about scraping public records faster than competitors. It's about replacing Excel as the system of record with something that captures the same decisions and analysis in a structured, queryable, accumulating form.
Why the real estate data moat is harder than it looks
In industries that have already digitized, such as financial services, healthcare, and logistics, the data moat question is relatively well understood. Data accumulates in systems of record; incumbents with the most data have an advantage that's hard to overcome; new entrants have to find gaps or underserved segments where they can accumulate data faster than the incumbents can defend them.
Real estate hasn't digitized in this way. The systems of record, to the extent they exist, are property records, title insurance, and MLS data — all public or semi-public, all backward-looking, none of them capturing the decision-making process that produces real estate outcomes.
This means the data moat in real estate hasn't been built yet. Not because the opportunity wasn't there, but because the workflow hasn't been digitized in a way that would generate the data.
The pre-development gap
The digitization deficit is most acute at the pre-development stage. Transaction data — what properties sold for, when, to whom — is reasonably available. Post-transaction data — how properties perform as assets — is increasingly available as institutional real estate has expanded.
The gap is in the middle: the process by which development opportunities are identified, evaluated, structured, and advanced to the point where they become transactions. This is where the most consequential decisions in real estate are made, and it's where the data infrastructure is thinnest.
In affordable housing specifically, this gap is extreme. The subsidy program landscape, the competitive allocation environment, and the availability of local soft debt all determine whether a deal is viable, yet they are evaluated manually, documented inconsistently, and rarely captured in a form that accumulates into organizational knowledge.
What a real data moat looks like
A genuine data moat in affordable housing pre-development would capture: which sites were evaluated and what their characteristics were; which program combinations were modeled and what they produced; which deals advanced and which were abandoned, and why; which capital structures closed and at what terms; how feasibility varied across markets, deal types, and team types.
This data doesn't exist in any public source. It's not scrapable. It can't be assembled from property records or transaction data. It exists only in the workflow of development teams actively doing the work.
Software embedded in that workflow generates this data as a byproduct. Not by asking users to enter data into a database — that's a friction-laden approach that fails in practice — but by being the tool in which the work actually gets done. The site evaluation happens in the product. The capital stack modeling happens in the product. The go/no-go decision gets recorded in the product. The data accumulates without requiring any additional effort from the user.
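To make the "byproduct" idea concrete, here is a minimal sketch of how workflow actions could write structured records as a side effect of doing the work. Everything in it is hypothetical and for illustration only: the `SiteEvaluation` and `DealLog` names, the field names, the program labels, and the sample site are assumptions, not a description of any actual product's data model.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class SiteEvaluation:
    """One structured record per evaluated site (hypothetical schema)."""
    site_id: str
    market: str
    deal_type: str
    programs_modeled: List[str] = field(default_factory=list)
    decision: Optional[str] = None          # "advance" or "abandon"
    decision_reason: Optional[str] = None
    decided_on: Optional[date] = None


class DealLog:
    """Each workflow step updates the record; there is no separate data-entry step."""

    def __init__(self) -> None:
        self._records: Dict[str, SiteEvaluation] = {}

    def evaluate_site(self, site_id: str, market: str, deal_type: str) -> SiteEvaluation:
        # Starting an evaluation creates the record.
        record = SiteEvaluation(site_id, market, deal_type)
        self._records[site_id] = record
        return record

    def model_capital_stack(self, site_id: str, programs: List[str]) -> None:
        # Modeling a program combination is captured as it happens.
        self._records[site_id].programs_modeled.extend(programs)

    def record_decision(self, site_id: str, decision: str, reason: str, decided_on: date) -> None:
        # The go/no-go decision is stored together with its reason.
        if decision not in ("advance", "abandon"):
            raise ValueError("decision must be 'advance' or 'abandon'")
        record = self._records[site_id]
        record.decision = decision
        record.decision_reason = reason
        record.decided_on = decided_on

    def records(self) -> List[SiteEvaluation]:
        return list(self._records.values())


# Illustrative usage with made-up values.
log = DealLog()
log.evaluate_site("site-001", market="Example Market", deal_type="new construction")
log.model_capital_stack("site-001", ["tax credit", "local soft debt"])
log.record_decision("site-001", "abandon", "financing gap too large", date(2024, 1, 15))
```

The point of the sketch is the shape, not the specifics: the user only evaluates, models, and decides, and the queryable record falls out of those actions.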
Why Excel can't defend this position
Excel doesn't lose to a more capable spreadsheet. It loses to software that does things spreadsheets fundamentally can't: accumulate structured data across deals and users, surface patterns from that accumulated data, and improve over time as more decisions flow through the system.
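As a rough illustration of "surfacing patterns from accumulated data," the sketch below computes the share of evaluated sites that advanced in each market from a flat list of decisions. The function name, the tuple layout, and the sample rows are all assumptions made up for this example; the same question is very hard to answer when each decision lives in a separate spreadsheet.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def advance_rate_by_market(decisions: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return the fraction of evaluated sites that advanced, per market.

    `decisions` is an iterable of (market, decision) pairs, where decision
    is "advance" or "abandon" (hypothetical labels).
    """
    totals: Dict[str, int] = defaultdict(int)
    advanced: Dict[str, int] = defaultdict(int)
    for market, decision in decisions:
        totals[market] += 1
        if decision == "advance":
            advanced[market] += 1
    return {market: advanced[market] / totals[market] for market in totals}


# Made-up sample data, for illustration only.
sample_decisions = [
    ("Market A", "advance"),
    ("Market A", "abandon"),
    ("Market B", "abandon"),
    ("Market B", "abandon"),
]
rates = advance_rate_by_market(sample_decisions)
```

With real accumulated records, the same grouping could be applied by deal type, program combination, or abandonment reason.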
The organizations that have successfully replaced Excel in other industries, including loan origination, supply chain management, and healthcare operations, didn't do it by building a better spreadsheet. They built workflow software that did things Excel couldn't, which made Excel irrelevant for that specific use case.
In affordable housing pre-development, the workflow is complex enough, the information sources fragmented enough, and the decision logic specialized enough that a well-built product has a real chance to replace Excel as the primary tool for site evaluation and deal management. Not for every user immediately, but for enough users to establish the data flywheel.
The compounding advantage
Once a product is established as the system through which real decisions get made, its data advantage compounds in a way that pure data products can't replicate. A data product that aggregates public records is always vulnerable to a competitor who can aggregate the same records. A workflow product that has accumulated three years of real feasibility decisions across hundreds of development teams and thousands of sites has an asset that can't be replicated without years of adoption.
That's what a defensible data moat looks like in an analog industry. Not a better scrape of public records. A better record of the decisions that public records don't capture.
Alpha Deal is building the workflow system that replaces Excel for affordable housing pre-development — capturing the decisions that public data misses and building the data moat from the inside out.