BloodHound - Generating OpenGraphs in Python with bhopengraph

Introduction

When SpecterOps released BloodHound 8.0, they introduced a feature I had been waiting for: OpenGraph. It is now the officially supported way to inject arbitrary nodes and edges into BloodHound. Any data source you can model as a graph — IoT inventories, SaaS permissions, custom infrastructure — can live alongside your Active Directory and Azure environments.

I immediately wanted to use it during a pentest to visualise relationships between assets that BloodHound does not natively collect. My first attempt was hand-crafting the JSON. That worked for two nodes and one edge. It stopped being fun around the fifteenth node, when I mixed up displayname and name for the third time and the upload silently dropped half my data.

So I wrote bhopengraph: a small Python library that builds valid OpenGraph JSON files in a few lines of code. The goal was simple — never hand-write that JSON again.

Understanding the OpenGraph schema

Before writing any code, I needed to understand what BloodHound actually expects. The OpenGraph JSON format has a predictable structure: a top-level graph object that holds the nodes and edges arrays, plus a metadata object:

{
  "graph": {
    "nodes": [ ... ],
    "edges": [ ... ]
  },
  "metadata": {
    "source_kind": "YourTag"
  }
}

A node has the following fields:

Field       Type              Description
id          string            Unique identifier within the file
kinds       array of strings  Primary kind first (drives icon/visual)
properties  object            Flat key-value pairs only (no nested objects)

An edge links two nodes together:

Field       Type    Description
kind        string  Relationship name (e.g. "Knows", "AdminTo")
start       object  Contains value, match_by ("id" or "name"), and optional kind filter
end         object  Same structure as start
properties  object  Optional flat key-value pairs

The official docs include a Minimal Working JSON with two Person nodes and one Knows edge. That is a useful sanity check when you are debugging an upload that fails silently.
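The two tables above can also be turned into a quick structural check to run before an upload. This is a minimal sketch using only the standard library: the function name check_structure is hypothetical, and it assumes that kinds must be non-empty and that a missing match_by means "id". It only covers the fields described in this post, not the full official schema.

```python
def check_structure(doc):
    """Return a list of problems; an empty list means the shape looks right."""
    graph = doc.get("graph")
    if not isinstance(graph, dict):
        return ["missing top-level 'graph' object"]

    problems = []
    for i, node in enumerate(graph.get("nodes", [])):
        if not isinstance(node.get("id"), str):
            problems.append(f"node {i}: 'id' must be a string")
        kinds = node.get("kinds")
        # Assumption: at least one kind is required, since the first is the primary kind.
        if not (isinstance(kinds, list) and kinds and all(isinstance(k, str) for k in kinds)):
            problems.append(f"node {i}: 'kinds' must be a non-empty list of strings")

    for i, edge in enumerate(graph.get("edges", [])):
        if not isinstance(edge.get("kind"), str):
            problems.append(f"edge {i}: 'kind' must be a string")
        # Assumption: a missing match_by is treated as "id".
        for side in ("start", "end"):
            endpoint = edge.get(side)
            if not isinstance(endpoint, dict) or "value" not in endpoint:
                problems.append(f"edge {i}: '{side}' needs a 'value'")
            elif endpoint.get("match_by", "id") not in ("id", "name"):
                problems.append(f"edge {i}: '{side}.match_by' must be 'id' or 'name'")
    return problems


minimal = {
    "graph": {
        "nodes": [
            {"id": "1", "kinds": ["Person"], "properties": {"name": "BOB"}},
            {"id": "2", "kinds": ["Person"], "properties": {"name": "ALICE"}},
        ],
        "edges": [
            {"kind": "Knows",
             "start": {"value": "1", "match_by": "id"},
             "end": {"value": "2", "match_by": "id"}},
        ],
    },
    "metadata": {"source_kind": "Base"},
}
print(check_structure(minimal))  # prints []
```

Returning a list of messages rather than raising on the first problem makes it easier to fix a whole file in one pass.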

First attempt - Hand-crafting JSON

My first approach was writing the JSON by hand in a Python script using json.dump(). It looked something like this:

import json

data = {
    "graph": {
        "nodes": [
            {"id": "1", "kinds": ["Person", "Base"], "properties": {"displayname": "bob", "objectid": "1", "name": "BOB"}},
            {"id": "2", "kinds": ["Person", "Base"], "properties": {"displayname": "alice", "objectid": "2", "name": "ALICE"}},
        ],
        "edges": [
            {"kind": "Knows", "start": {"value": "1", "match_by": "id"}, "end": {"value": "2", "match_by": "id"}}
        ]
    },
    "metadata": {"source_kind": "Base"}
}

with open("output.json", "w") as f:
    json.dump(data, f, indent=2)

The problem with this approach is that it does not scale. Every node requires repeating the same boilerplate structure. Typos in field names are invisible — BloodHound silently ignores properties it does not recognise. I spent more time debugging missing objectid fields than actually analysing data.
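One way to make those typos visible is to compare each property key against the names you expect and flag near misses. The sketch below uses difflib from the standard library. The EXPECTED_KEYS set and the suspect_keys helper are assumptions for illustration: the set lists only the property names used in this post, and the 0.8 cutoff is an arbitrary starting point.

```python
import difflib

# Assumption: the property names this post relies on; extend with your own.
EXPECTED_KEYS = {"displayname", "name", "objectid"}


def suspect_keys(properties, expected=EXPECTED_KEYS):
    """Map each unexpected key to the expected key it most resembles, if any."""
    suspects = {}
    candidates = sorted(expected)
    for key in properties:
        if key in expected:
            continue
        # Compare case-insensitively so "objectId" is caught as well.
        close = difflib.get_close_matches(key.lower(), candidates, n=1, cutoff=0.8)
        if close:
            suspects[key] = close[0]
    return suspects


props = {"display_name": "bob", "objectId": "1", "name": "BOB", "os": "linux"}
for wrong, right in suspect_keys(props).items():
    print(f"{wrong!r} looks like a typo for {right!r}")
```

Keys that resemble nothing in the expected set, such as os here, are left alone, so custom properties do not trigger false alarms.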

I needed a proper abstraction.

Designing the bhopengraph library

I wanted the API to be small enough to memorise. Four classes, no magic:

  • OpenGraph(source_kind=None) — the container that holds nodes and edges and exports to JSON.
  • Node(id, kinds, properties) — represents a single node. Kinds are always explicit.
  • Edge(start_node_id, end_node_id, kind, properties=None) — a relationship between two nodes, referenced by ID.
  • Properties(...) — a convenience wrapper for the common property fields (displayname, name, objectid, etc.).

The key design decisions were:

  • Explicit kinds — no auto-detection. You always declare what a node is. This keeps the data model clean and predictable.
  • Stable IDs — so re-ingestion is idempotent. Upload the same file twice and nothing breaks.
  • Match by ID by default — edges reference node IDs, not names. This avoids collisions when two nodes share the same display name.

These defaults align with SpecterOps’ best practices documentation.
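To make those three decisions concrete, here is a stripped-down sketch of how they could translate into code. This is not the bhopengraph source: the Sketch-prefixed classes and their method names are hypothetical and exist only to illustrate the design.

```python
from dataclasses import dataclass, field


@dataclass
class SketchNode:
    id: str
    kinds: list  # explicit kinds: the caller always declares what the node is
    properties: dict = field(default_factory=dict)

    def to_dict(self):
        return {"id": self.id, "kinds": self.kinds, "properties": self.properties}


@dataclass
class SketchEdge:
    start_node_id: str
    end_node_id: str
    kind: str

    def to_dict(self):
        # Match by ID by default: endpoints never depend on display names.
        return {
            "kind": self.kind,
            "start": {"value": self.start_node_id, "match_by": "id"},
            "end": {"value": self.end_node_id, "match_by": "id"},
        }


class SketchGraph:
    def __init__(self, source_kind=None):
        self.source_kind = source_kind
        self.nodes = {}  # keyed by ID, so adding the same node twice is harmless
        self.edges = []

    def add_node(self, node):
        self.nodes[node.id] = node

    def add_edge(self, edge):
        self.edges.append(edge)

    def to_dict(self):
        doc = {"graph": {"nodes": [n.to_dict() for n in self.nodes.values()],
                         "edges": [e.to_dict() for e in self.edges]}}
        if self.source_kind:
            doc["metadata"] = {"source_kind": self.source_kind}
        return doc


g = SketchGraph(source_kind="Base")
bob = SketchNode("123", ["Person", "Base"], {"name": "BOB"})
g.add_node(bob)
g.add_node(bob)  # second add overwrites the first instead of duplicating
g.add_node(SketchNode("234", ["Person", "Base"], {"name": "ALICE"}))
g.add_edge(SketchEdge("123", "234", "Knows"))
result = g.to_dict()
```

Storing nodes in a dictionary keyed by ID is one simple way to get the idempotent behaviour described above within a single file.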

Installing bhopengraph

The library is available on PyPI. To install it, run:

$ pip install bhopengraph

Minimal example - Bob knows Alice

Now that the library is installed, here is the simplest possible example: two people and a relationship between them.

from bhopengraph.OpenGraph import OpenGraph
from bhopengraph.Node import Node
from bhopengraph.Edge import Edge
from bhopengraph.Properties import Properties

graph = OpenGraph(source_kind="Base")

bob = Node(
    id="123",
    kinds=["Person", "Base"],
    properties=Properties(
        displayname="bob",
        property="a",
        objectid="123",
        name="BOB",
    ),
)

alice = Node(
    id="234",
    kinds=["Person", "Base"],
    properties=Properties(
        displayname="alice",
        property="b",
        objectid="234",
        name="ALICE",
    ),
)

graph.addNode(bob)
graph.addNode(alice)

# Bob knows Alice: the edge starts at bob and ends at alice.
knows = Edge(
    start_node_id=bob.id,
    end_node_id=alice.id,
    kind="Knows",
)

graph.addEdge(knows)
graph.exportToFile("minimal_example.json")

Running this script generates a valid OpenGraph JSON file:

$ python minimal_example.py
$ jq . minimal_example.json

The output looks like this:

{
  "graph": {
    "nodes": [
      { "id": "123", "kinds": ["Person", "Base"], "properties": { "displayname": "bob", "property": "a", "objectid": "123", "name": "BOB" } },
      { "id": "234", "kinds": ["Person", "Base"], "properties": { "displayname": "alice", "property": "b", "objectid": "234", "name": "ALICE" } }
    ],
    "edges": [
      {
        "kind": "Knows",
        "start": { "value": "123", "match_by": "id" },
        "end":   { "value": "234", "match_by": "id" }
      }
    ]
  },
  "metadata": { "source_kind": "Base" }
}
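Before uploading, it can be worth confirming that every edge endpoint resolves to a node in the same file. The sketch below assumes the document has been loaded into a dictionary (for example with json.load) and that endpoints match by ID as in the output above. The helper name dangling_endpoints is hypothetical and not part of bhopengraph.

```python
def dangling_endpoints(doc):
    """List (edge kind, side, value) for ID endpoints with no node in this file."""
    node_ids = {node["id"] for node in doc["graph"]["nodes"]}
    missing = []
    for edge in doc["graph"]["edges"]:
        for side in ("start", "end"):
            endpoint = edge[side]
            # Only ID matches can be checked locally; name matches are skipped.
            if endpoint.get("match_by", "id") == "id" and endpoint["value"] not in node_ids:
                missing.append((edge["kind"], side, endpoint["value"]))
    return missing


doc = {
    "graph": {
        "nodes": [{"id": "123", "kinds": ["Person"], "properties": {}}],
        "edges": [{"kind": "Knows",
                   "start": {"value": "123", "match_by": "id"},
                   "end": {"value": "999", "match_by": "id"}}],
    }
}
print(dangling_endpoints(doc))  # prints [('Knows', 'end', '999')]
```

An endpoint reported here may still be fine if it is meant to match a node that already exists in BloodHound, so treat the result as a list of things to double-check rather than hard errors.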

After uploading this file through the BloodHound interface, the graph renders immediately.

Validating the ingestion in BloodHound

Before scaling up, I always validate by round-tripping a small dataset into a local BloodHound CE instance. The quickest way to confirm your data landed correctly is a Cypher query.

This query returns all Knows relationships in the graph:

MATCH p=()-[:Knows]-() RETURN p

If the query returns nothing, the most common causes are:

  • A mismatch between the kinds array and what BloodHound expects.
  • A missing objectid in properties — BloodHound uses this as a deduplication key.
  • Nested objects in properties — the schema only accepts flat primitives and arrays.

Once the shapes and properties are correct on a small dataset, scaling up to hundreds or thousands of nodes is safe.
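The second and third causes in the list above can be caught locally before the upload. Here is a minimal sketch under two assumptions: objectid is treated as mandatory, as described in this post, and arrays may only contain strings, numbers, and booleans. The helper name property_problems is hypothetical.

```python
PRIMITIVES = (str, int, float, bool)


def property_problems(node):
    """Return the property issues found on a single node dictionary."""
    props = node.get("properties", {})
    problems = []
    if "objectid" not in props:
        problems.append("missing objectid")
    for key, value in props.items():
        if isinstance(value, dict):
            problems.append(f"{key}: nested object")
        elif isinstance(value, list) and not all(isinstance(v, PRIMITIVES) for v in value):
            problems.append(f"{key}: array contains non-primitive values")
    return problems


node = {
    "id": "123",
    "kinds": ["Person", "Base"],
    "properties": {
        "name": "BOB",
        "address": {"city": "Paris"},     # nested object, not allowed
        "tags": ["admin", {"level": 1}],  # array with a nested object
    },
}
for problem in property_problems(node):
    print(f"node {node['id']}: {problem}")
```

Running this over every node in the export takes a fraction of a second and is much faster than discovering the problem through an empty query result.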

Lessons learned - Extending the model

After using bhopengraph on several engagements, here are the patterns that worked best:

  • Use stable id values. UUIDs or deterministic hashes make re-ingestion idempotent. Avoid sequential integers — they will collide across separate imports.
  • Keep kinds short and meaningful. Two or three kinds per node is plenty. The first kind in the array drives the icon in the BloodHound UI.
  • Flatten all properties. No nested dictionaries. If you need structured data, serialise it as a string or split it into separate properties.
  • Tag your imports with metadata.source_kind. This lets you filter or delete an entire ingest later without touching other data.
  • Add custom icons for custom kinds. BloodHound supports custom icons that make non-standard node types immediately recognisable in the graph view.
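The first and third points can be sketched with the standard library alone. Both helpers below are hypothetical and not part of bhopengraph: stable_id derives an identifier from a source tag plus a natural key (the 32-character truncation is an arbitrary choice), and flatten joins nested keys with an underscore.

```python
import hashlib


def stable_id(source_kind, natural_key):
    """Derive the same ID from the same inputs on every run."""
    digest = hashlib.sha256(f"{source_kind}:{natural_key}".encode("utf-8")).hexdigest()
    return digest[:32]


def flatten(props, prefix=""):
    """Collapse nested dictionaries into a single level of prefixed keys."""
    flat = {}
    for key, value in props.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


raw = {"os": {"name": "linux", "version": {"major": 6}}, "tags": ["iot"]}
print(flatten(raw))
# prints {'os_name': 'linux', 'os_version_major': 6, 'tags': ['iot']}
print(stable_id("IoT", "sensor-01") == stable_id("IoT", "sensor-01"))  # prints True
```

Including the source tag in the hash keeps IDs from different data sources apart even when their natural keys overlap. Note that flatten can overwrite a key if a flattened name collides with an existing top-level property, so pick a separator that does not appear in your own key names.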

References