Vanderbilt Owen MBA · Spring 2026
Session 4: Knowledge Graph Design
From conversations to structure. Entities, facts, triples.
Wednesday, March 18, 2026
Oliver Luckett
BA '96 · French Renaissance Literature
🔗 WHAT Layer · 🧠 Entity Resolution · ★ Guest: Jeff Jonas · 🛠️ 100-Triple Challenge
Today's Session

Agenda

🔄
WHO → WHAT Transition
You've built the social layer. Now we structure what you know. The Knowledge Graph is where opinion becomes fact.
Guest Speaker — Jeff Jonas
Former IBM Fellow. National Geographic's "Wizard of Big Data." Founder of Senzing. The creator of entity resolution.
🔺
Entities, Facts, Triples
The building blocks of knowledge. How to decompose any domain into structured, queryable, composable units.
🛠️
100-Triple Challenge
Hands-on workshop: take your project and build a domain knowledge graph with at least 100 triples.
What We Don't Know
The most important graph is the one that maps your ignorance. What's missing? Where are the gaps?
Guest Speaker
Jeff Jonas
Data Scientist · Former IBM Fellow · Founder, Senzing
The creator of entity resolution
14 patents · National Geographic feature · IBM Fellow · EFF Advisory Board
⭐ Jeff Jonas

The Wizard of Big Data

🎰 The Origin Story
In the 1990s, Jeff built NORA — Non-Obvious Relationship Awareness — for Las Vegas casinos. The problem: clever people using different names, Social Security numbers, and birth dates to cheat the system. His system found the hidden connections that no single database could reveal.
📋 Career Highlights
• Created NORA for casino fraud detection (1990s)
• Sold company to IBM (2005)
• IBM Fellow — led Context Computing
• Modernized US voter registration with Pew
• Built Singapore maritime domain awareness
• Founded Senzing — democratizing entity resolution
• 14 patents · National Geographic feature
🏛️ Boards & Affiliations
USGIF — US Geospatial Intelligence Foundation
EPIC — Electronic Privacy Information Center
EFF — Electronic Frontier Foundation (advisory)
CSIS — Senior Associate
Why This Matters
NORA parallels awareness refraction: finding non-obvious connections is exactly what the Trinity Graph does.

Privacy + power tension — Jeff sits on boards of both intelligence AND privacy orgs.

Every Inkwell venture needs entity resolution — BackyardOne, Block BMOS, Artiquity all face this problem.
⭐ Entity Resolution

The Problem Jeff Solved

Entity resolution answers one question: "Is this the same thing?"
Without Entity Resolution
Record 1: Jon Smith, 123 Main St
Record 2: Jonathan Smith, 123 Main Street
Record 3: J. Smith, 123 Main St, Apt 2
→ System sees 3 different people
3 records, 3 invoices, fraud goes undetected
With Entity Resolution
Record 1: Jon Smith, 123 Main St
Record 2: Jonathan Smith, 123 Main Street
Record 3: J. Smith, 123 Main St, Apt 2
→ System sees 1 person, 3 records
Complete picture, one graph node, patterns visible
For Your Projects
Is "123 Main St" = "123 Main Street" in BackyardOne?  ·  Is "The Weeknd" = "Abel Tesfaye" in Block BMOS?  ·  Is the same collector listed under two gallery names in Artiquity? Entity resolution is the invisible foundation of every knowledge graph.
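The three-record example can be sketched in a few lines of Python. This is a deliberately naive illustration, not how NORA or Senzing work: it assumes that two records match when they share a last name, a first initial, and a normalized street address. The function names and the abbreviation table are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table; a real system would need far more rules.
STREET_ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road"}


def normalize_address(address: str) -> str:
    """Keep the street line only, lowercase it, and expand abbreviations."""
    street_line = address.split(",")[0]  # drops unit parts such as "Apt 2"
    tokens = re.sub(r"[^\w\s]", "", street_line.lower()).split()
    return " ".join(STREET_ABBREVIATIONS.get(token, token) for token in tokens)


def name_key(name: str) -> str:
    """Reduce a name to 'first initial + last name' (a crude matching rule)."""
    tokens = re.sub(r"[^\w\s]", "", name.lower()).split()
    return f"{tokens[0][0]} {tokens[-1]}"


def resolve(records):
    """Group records that share the same name key and normalized address."""
    entities = defaultdict(list)
    for name, address in records:
        entities[(name_key(name), normalize_address(address))].append((name, address))
    return dict(entities)


records = [
    ("Jon Smith", "123 Main St"),
    ("Jonathan Smith", "123 Main Street"),
    ("J. Smith", "123 Main St, Apt 2"),
]
entities = resolve(records)
```

Here all three records land under a single key, so the system sees one entity with three records. The same rule would also merge Jane Smith and Jon Smith at one address, which is why real entity resolution weighs many more features than this sketch does.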
🔄 The Transition
WHO → WHAT
Week 1 was about people and connections.
Week 2 is about what those people know — and what they don't.
Two Graph Layers

Social Graph vs. Knowledge Graph

🟢
Social Graph (WHO)
Nodes: People, organizations, teams
Edges: knows, works_with, reports_to, trusts
Question: "Who is connected to whom?"
You built this in Weeks 1–3 ✓
🔵
Knowledge Graph (WHAT)
Nodes: Entities, concepts, facts, documents
Edges: is_a, has_property, causes, requires, contradicts
Question: "What do we know — and how sure are we?"
We build this today →
💡 The power comes from connecting them. When a person in your social graph is linked to a fact in your knowledge graph, you know not just WHAT is true but WHO knows it — and how much you trust that source.
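One way to picture the link between the two layers is a small lookup from each fact to the people who asserted it. The sketch below is an assumption about how that could be modeled, not a description of VanderBot: the people, the trust scores, and the function names are all hypothetical.

```python
# Hypothetical people and trust scores (0.0 to 1.0); not from the course data.
TRUST = {"Alice": 0.9, "Bob": 0.4}

# A fact from the knowledge graph, written as (subject, predicate, object).
PERMIT_FACT = ("LADBS", "publishes", "Permit Data")

# Cross-layer edges: which person in the social graph asserted which fact.
ASSERTED_BY = {PERMIT_FACT: ["Bob", "Alice"]}


def sources_for(fact):
    """Return (person, trust) pairs for a fact, most trusted first."""
    people = ASSERTED_BY.get(fact, [])
    return sorted(((p, TRUST[p]) for p in people), key=lambda pair: pair[1], reverse=True)


def confidence(fact):
    """Trust of the most trusted source, or 0.0 when nobody vouches for the fact."""
    ranked = sources_for(fact)
    return ranked[0][1] if ranked else 0.0
```

Taking the most trusted source as the confidence is only one possible design choice; averaging or requiring two independent sources would be equally defensible.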
The Building Block

What Is a Triple?

Every fact in a knowledge graph is stored as a triple: a subject, a predicate (relationship), and an object.
Subject — predicate → Object
BackyardOne — solves → LA Property Research Fragmentation
LADBS — publishes → Permit Data
Permit Data — requires → Socrata API Access
BackyardOne — competes_with → ZIMAS Direct Access
LA Developer — needs → Zoning + Permit Cross-Reference
💡 Five triples. Already you can traverse: BackyardOne → solves → fragmentation, and the same developer who needs zoning data also needs permits — which come from LADBS via Socrata. The graph connects what spreadsheets can't.
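The five triples above can be held in plain Python to show what "traversing" means. This is a minimal sketch: the `Triple` type and the helpers `facts_about` and `walk` are illustrative names, and `walk` simply follows the first outgoing edge at each step.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


TRIPLES = [
    Triple("BackyardOne", "solves", "LA Property Research Fragmentation"),
    Triple("LADBS", "publishes", "Permit Data"),
    Triple("Permit Data", "requires", "Socrata API Access"),
    Triple("BackyardOne", "competes_with", "ZIMAS Direct Access"),
    Triple("LA Developer", "needs", "Zoning + Permit Cross-Reference"),
]


def facts_about(subject, triples=TRIPLES):
    """List every (predicate, object) pair stored for one subject."""
    return [(t.predicate, t.obj) for t in triples if t.subject == subject]


def walk(start, triples=TRIPLES):
    """Follow the first outgoing edge from each node until a dead end or a cycle."""
    path, current, seen = [], start, set()
    while current not in seen:
        seen.add(current)
        step = next((t for t in triples if t.subject == current), None)
        if step is None:
            break
        path.append(step)
        current = step.obj
    return path
```

Calling `walk("LADBS")` follows LADBS → publishes → Permit Data → requires → Socrata API Access, a two-hop chain that no single row in a spreadsheet contains.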
Taxonomy

Entity Types for Your Project

Every domain has the same core entity categories. Map yours:
👤
Actors
People, orgs, systems that DO things
📦
Assets
Products, data, content, IP
Events
Things that happen with timestamps
📐
Concepts
Abstract ideas, frameworks, categories
📏
Constraints
Rules, regulations, limits, dependencies
Unknowns
Gaps, assumptions, open questions
💡 The Unknowns category is the most important. A graph that only maps what you know is dangerous. A graph that maps what you don't know is strategic.
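A quick coverage check makes the point about Unknowns concrete. In this sketch the six categories come from the slide, while the tagging of each node and the helper names are assumptions made for illustration.

```python
from collections import Counter
from enum import Enum


class EntityType(Enum):
    ACTOR = "Actors"
    ASSET = "Assets"
    EVENT = "Events"
    CONCEPT = "Concepts"
    CONSTRAINT = "Constraints"
    UNKNOWN = "Unknowns"


# Hypothetical tagging of a few nodes; your own project will differ.
nodes = {
    "LA Developer": EntityType.ACTOR,
    "LADBS": EntityType.ACTOR,
    "Permit Data": EntityType.ASSET,
    "Socrata API Access": EntityType.CONSTRAINT,
}


def coverage(nodes):
    """Count how many nodes fall into each of the six categories."""
    counts = Counter(nodes.values())
    return {entity_type: counts.get(entity_type, 0) for entity_type in EntityType}


def missing_types(nodes):
    """Categories with no nodes at all, in the order the slide lists them."""
    return [entity_type for entity_type, n in coverage(nodes).items() if n == 0]
```

For the sample above, `missing_types(nodes)` reports Events, Concepts, and Unknowns: the graph maps only what is already known.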
In Practice

What This Looks Like in Neo4j

Your VanderBot stores every triple in a Neo4j graph database. Here is what that looks like in Cypher, Neo4j's query language:
// Create entities
CREATE (b:Venture {name: "BackyardOne", stage: "PMF"})
CREATE (l:DataSource {name: "LADBS", api: "Socrata"})
CREATE (z:DataSource {name: "ZIMAS", type: "Zoning"})
CREATE (d:Actor {name: "LA Developer", segment: "Target User"})

// Create relationships (triples)
CREATE (b)-[:INGESTS]->(l)
CREATE (b)-[:INGESTS]->(z)
CREATE (d)-[:NEEDS]->(b)
CREATE (l)-[:PUBLISHES]->(p:Asset {name: "Permit Records"});

// Query (run as a separate statement): What does a developer need?
MATCH (d:Actor)-[:NEEDS]->(v)-[:INGESTS]->(s)
RETURN d.name, v.name, collect(s.name)
You don't need to write Cypher. Your VanderBot does this for you. But understanding the structure helps you ask better questions — and catch when the graph is wrong.
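To see how a triple could become a Cypher statement, here is a hedged sketch in Python. It is not VanderBot's actual implementation: the generic `Entity` label and the function names are assumptions. It uses `MERGE` rather than `CREATE` so that repeating a triple does not duplicate nodes, and it passes names as `$parameters` while building the relationship type into the query text after validating it.

```python
import re


def relationship_type(predicate: str) -> str:
    """Turn a predicate such as 'competes_with' into a safe relationship type."""
    rel = re.sub(r"\W+", "_", predicate.strip()).upper()
    if not re.fullmatch(r"[A-Z_][A-Z0-9_]*", rel):
        raise ValueError(f"cannot use {predicate!r} as a relationship type")
    return rel


def triple_to_cypher(subject: str, predicate: str, obj: str):
    """Return (query, parameters) that would store one triple.

    The 'Entity' label is a placeholder; a real graph would use typed labels
    such as Venture, DataSource, or Actor.
    """
    query = (
        "MERGE (s:Entity {name: $subject}) "
        "MERGE (o:Entity {name: $object}) "
        f"MERGE (s)-[:{relationship_type(predicate)}]->(o)"
    )
    return query, {"subject": subject, "object": obj}


query, params = triple_to_cypher("LADBS", "publishes", "Permit Data")
```

The resulting `query` and `params` are what a Neo4j client would send to the database; nothing in this sketch connects to a server.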
🛠️ Workshop
🛠️
Build Your
100-Triple Graph
Open your VanderBot. You have 30 minutes.
Build the knowledge graph for your project.
Workshop — 30 Minutes

The 100-Triple Challenge

📋 Instructions
1. Open VanderBot — tell it: "Build a knowledge graph for [project]"
2. Map your Actors — users, partners, competitors, regulators
3. Map your Assets — data, products, IP you're building
4. Map your Constraints — regulations, dependencies, limits
5. Map your Unknowns — assumptions, open questions
6. Connect everything — edges matter more than nodes
🎯 Tip: Don't be comprehensive; be specific. "BackyardOne needs data" is useless. "BackyardOne requires Socrata API rate limit of 1000 req/hr" is a real triple.
🎯 Triple Targets
20 · You've listed some things
50 · You have a real structure
100 · You have a queryable knowledge base
200+ · You have an intelligence system
✅ Good Triple Examples
BackyardOne → requires → LADBS API access
Permit processing → takes → 4–6 months (LADBS)
ADU market → unknown → conversion rate to paid
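The triple targets above amount to a small threshold table. The sketch below counts distinct triples and reports which target has been reached; the function names and the "Keep going" message for counts under 20 are assumptions.

```python
# Thresholds and labels taken from the Triple Targets list, highest first.
MILESTONES = [
    (200, "You have an intelligence system"),
    (100, "You have a queryable knowledge base"),
    (50, "You have a real structure"),
    (20, "You've listed some things"),
]


def count_triples(triples):
    """Count distinct (subject, predicate, object) tuples; duplicates do not add up."""
    return len(set(triples))


def milestone(triple_count: int) -> str:
    """Return the label of the highest target reached so far."""
    for threshold, label in MILESTONES:
        if triple_count >= threshold:
            return label
    return "Keep going"  # below the first target; this wording is an assumption


sample = [
    ("BackyardOne", "requires", "LADBS API access"),
    ("BackyardOne", "requires", "LADBS API access"),  # duplicate, counted once
    ("ADU market", "unknown", "conversion rate to paid"),
]
```

Deduplicating before counting matters in practice: restating the same fact twice should not move a project closer to 100.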
❓ The Hard Part
What We
Don't Know
The most valuable part of your knowledge graph
is the map of your ignorance.
Risk Analysis

Mapping Your Unknowns

For every project, there are four quadrants of knowledge:
✅ Known Knowns
Facts in your graph. Verified, sourced.
"LADBS publishes permit data via Socrata API"
🔵 Known Unknowns
Questions you know to ask. Gaps you can name.
"We don't know the conversion rate for property data → paid subscription"
🟠 Unknown Knowns
Things you know but haven't structured. Tribal knowledge, intuition.
"The team knows permit expediters use workarounds, but it's not documented"
🔴 Unknown Unknowns
Risks you haven't imagined. This is where failure lives.
"What if LADBS changes their API policy next month?"
💡 Assignment: Your "What We Don't Know" risk analysis is about surfacing the blue and orange quadrants — and imagining the red. The graph should make ignorance visible, not hide it.
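The four quadrants can be kept as a simple tagged list, which also makes it easy to check a draft against the assignment minimums (at least 5 Known Unknowns and 3 Unknown Unknowns). This is a sketch with illustrative names; the four sample statements are the ones on the slide.

```python
from dataclasses import dataclass
from enum import Enum


class Quadrant(Enum):
    KNOWN_KNOWN = "Known Knowns"
    KNOWN_UNKNOWN = "Known Unknowns"
    UNKNOWN_KNOWN = "Unknown Knowns"
    UNKNOWN_UNKNOWN = "Unknown Unknowns"


@dataclass(frozen=True)
class Item:
    statement: str
    quadrant: Quadrant


ITEMS = [
    Item("LADBS publishes permit data via Socrata API", Quadrant.KNOWN_KNOWN),
    Item("Conversion rate for property data to paid subscription", Quadrant.KNOWN_UNKNOWN),
    Item("Permit expediters use undocumented workarounds", Quadrant.UNKNOWN_KNOWN),
    Item("LADBS changes their API policy next month", Quadrant.UNKNOWN_UNKNOWN),
]


def by_quadrant(items):
    """Group statements under each of the four quadrants."""
    grouped = {quadrant: [] for quadrant in Quadrant}
    for item in items:
        grouped[item.quadrant].append(item.statement)
    return grouped


def meets_assignment(items, min_known_unknowns=5, min_unknown_unknowns=3):
    """Check the risk-analysis minimums stated in the deliverables."""
    grouped = by_quadrant(items)
    return (
        len(grouped[Quadrant.KNOWN_UNKNOWN]) >= min_known_unknowns
        and len(grouped[Quadrant.UNKNOWN_UNKNOWN]) >= min_unknown_unknowns
    )
```

With only the four slide examples, `meets_assignment(ITEMS)` is false, which is the expected starting point for the risk analysis.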
Pod Status

Quick Check-In: Where Are You?

Each pod: 2 minutes. Share with the class:
📋
Project
What are you working on?
📊
Graph Status
How many nodes/triples have you created?
Biggest Unknown
What's the one thing you most need to figure out?
🤝
Help Needed
Is there a connection, skill, or resource another pod might have?
💡 Listen to other pods. The graph is shared. If BackyardOne's data sources overlap with your project, that's a connection worth making.
Deliverables

Due Before Session 5 (Monday)

📐
Knowledge Architecture Document
2–3 pages. Map your project's entity types, key relationships, and data sources. Include a visual graph diagram (hand-drawn is fine, or export from VanderBot). Submit to Brightspace.
🔴
"What We Don't Know" Risk Analysis
1–2 pages. Four-quadrant analysis for your project. At least 5 Known Unknowns and 3 possible Unknown Unknowns. This is the hard one — and the most valuable.
💬
VanderBot: 100-Triple Milestone
Your project's knowledge graph should hit 100 triples by Monday. Ask your VanderBot: "How many triples do I have?" If you're under 100, keep going.
📚
Read: Survey of Knowledge Graph Embedding Models
Available on Brightspace. Focus on: what are embeddings, why do they matter for search and reasoning, how do they connect to the WHAT IF layer we're building toward.
"A fact without context is trivia.
A fact in a graph is intelligence."
Session 4 · Knowledge Graph Design
Monday: Knowledge Graph Verification — we test what you've built