What to remember:

  • OKF is a directory of Markdown files with a YAML header, no SDK required, no proprietary runtime: any agent can read it, anyone can produce it.
  • The format formalizes the “LLM-Wiki pattern” described by Andrej Karpathy: a living knowledge base, maintained by the agents themselves, organized into interrelated concepts.
  • For SEO and GEO, this standard represents a major shift: it is no longer just a question of being found by search engines, but of making knowledge usable by agents.
  • Google has already updated its Knowledge Catalog to ingest OKF and serve it to its own agents, giving the format immediate credibility.

The problem OKF seeks to solve

In most organizations, the knowledge that models need is fragmented across dozens of incompatible systems: metadata catalogs with their own APIs, internal wikis, code comments, documentation on shared drives, and tacit knowledge in the heads of a few senior experts.

When an agent has to answer a question like “How do we calculate our weekly active users from our events feed?”, it must assemble the answer from mutually incompatible platforms. Each catalog vendor reinvents the same data models, and the knowledge remains trapped in the surface that created it.

The result: each team that builds an agent solves the same context assembly problem from scratch, in a bespoke way, with no possible interoperability.

What OKF actually is

The Open Knowledge Format intends to respond to this problem with a deliberately minimalist approach. An OKF bundle is a directory of Markdown files. Each file represents a concept: a database table, a business metric, a runbook, a procedure, a deprecated API. The file path corresponds to the concept identity.

Each file starts with a YAML block with a small set of structured fields: type, title, description, resource, tags, timestamp. Only the type field is mandatory. Everything else, including the Markdown body structure, is left to the discretion of the producer.

This is what a minimal OKF document looks like, as provided in the official Google specification:

---
type: BigQuery Table
title: Orders
description: One row per completed customer order.
resource: https://console.cloud.google.com/bigquery?p=acme&d=sales&t=orders
tags: [sales, revenue]
timestamp: 2026-05-28T14:30:00Z
---

# Schema

| Column      | Type   | Description                              |
|-------------|--------|------------------------------------------|
| order_id    | STRING | Globally unique order identifier.        |
| customer_id | STRING | FK to [customers](/tables/customers.md). |

# Joins

Joined with [customers](/tables/customers.md) on `customer_id`.
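To make the structure concrete, here is a minimal sketch of how a consumer might split a document like the one above into its header fields and Markdown body, and enforce the one mandatory field. It is an illustration only: the function name `parse_okf` is hypothetical, and the hand-rolled parsing is a deliberate simplification that handles only flat `key: value` lines and inline lists. A real consumer would more likely rely on a full YAML parser.

```python
from typing import Dict, List, Tuple, Union

FieldValue = Union[str, List[str]]


def parse_okf(text: str) -> Tuple[Dict[str, FieldValue], str]:
    """Split an OKF document into (header fields, Markdown body).

    Simplification (assumption of this sketch): only flat `key: value`
    pairs and inline lists such as `[a, b]` are supported.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("document must start with a '---' line")
    try:
        # Index of the line that closes the YAML header.
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        raise ValueError("YAML header is never closed") from None

    fields: Dict[str, FieldValue] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        # Split on the first colon only: URLs and timestamps contain more.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            fields[key.strip()] = [v.strip() for v in value[1:-1].split(",")
                                   if v.strip()]
        else:
            fields[key.strip()] = value

    # The only field the format makes mandatory.
    if not fields.get("type"):
        raise ValueError("missing mandatory 'type' field")
    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


doc = """---
type: BigQuery Table
title: Orders
tags: [sales, revenue]
timestamp: 2026-05-28T14:30:00Z
---
# Joins
Joined with [customers](/tables/customers.md) on `customer_id`.
"""
fields, body = parse_okf(doc)
print(fields["type"], fields["tags"])  # BigQuery Table ['sales', 'revenue']
```

Because only `type` is checked, a document with nothing but that field and a free-form body passes, which mirrors how little the format asks of producers.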

Concepts link together via standard Markdown links, transforming the directory into a relationship graph. Bundles can also include index.md files for hierarchical navigation, and log.md files for chronological change history.
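As a rough illustration of how links turn a directory into a graph, the sketch below extracts Markdown link targets from each file and builds an adjacency map. The names `build_graph` and `LINK_RE` are hypothetical, the bundle is represented as an in-memory dictionary rather than files on disk, and the sketch assumes that links starting with `/` resolve from the bundle root, as the example document suggests. Relative links and external URLs are simply ignored here.

```python
import re
from typing import Dict, Set

# Matches inline Markdown links and captures the target: [label](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def build_graph(bundle: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each document path to the set of bundle paths it links to.

    `bundle` maps root-relative paths (e.g. "/tables/orders.md") to file
    contents. Assumption of this sketch: only links that start with "/"
    and end in ".md" count as edges.
    """
    graph: Dict[str, Set[str]] = {}
    for path, text in bundle.items():
        targets: Set[str] = set()
        for target in LINK_RE.findall(text):
            target = target.split("#", 1)[0]  # drop any heading anchor
            if (target.startswith("/") and target.endswith(".md")
                    and target != path):
                targets.add(target)
        graph[path] = targets
    return graph


bundle = {
    "/tables/orders.md":
        "FK to [customers](/tables/customers.md). See [docs](https://example.com).",
    "/tables/customers.md":
        "Referenced by [orders](/tables/orders.md#joins).",
}
graph = build_graph(bundle)
print(sorted(graph["/tables/orders.md"]))  # ['/tables/customers.md']
```

An agent or viewer could walk this map to find the neighbors of a concept before deciding which files to read in full.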

What Google insists on: no complex compression scheme, no new runtime, no mandatory SDK. An OKF bundle is simply “Markdown, files and YAML frontmatter”. It can be versioned in Git, hosted in any repository, read directly on GitHub, and indexed by any search tool.

The three design principles

Google organizes the design around three principles.

  1. Minimalism: OKF imposes only one thing on each document, a type field. What the types are, what other fields to include, what structure to adopt in the body: all of this remains at the discretion of the producer. The specification defines the interoperability surface, not the content model.
  2. Producer/consumer independence: a bundle handwritten by a human can be used by an AI agent. A bundle generated by a metadata export pipeline can be browsed in a viewer. A bundle synthesized by one LLM can be queried by another. The format is the contract; the tools at either end can be swapped independently.
  3. A format, not a platform: OKF is not tied to any cloud, database, model provider, or agent framework. It will never require a proprietary account or SDK to read, write, or serve bundles. Google releases the specification as open source explicitly because the value of a knowledge format comes from how many parties adopt it, not from who owns it.

The relationship with the “LLM-Wiki pattern”

OKF explicitly formalizes a pattern that had emerged in the agent developer community, theorized in particular by Andrej Karpathy in a gist published on GitHub. The basic idea is this: rather than sending agents to search the same documents for the same facts over and over, we give them a shared Markdown library that grows in usefulness over time.

Karpathy puts it this way: LLMs don’t get bored, don’t forget to update a cross-reference, and can modify fifteen files in a single pass. The maintenance burden that pushes humans to abandon their personal wikis is precisely what LLMs are good at.

This pattern appears in various forms: Obsidian vaults connected to coding agents, AGENTS.md or CLAUDE.md file conventions, repositories with index.md and log.md files that agents consult before any real work. Each instance is custom made. OKF provides the layer of standardization that allows these wikis to cooperate with each other.

What Google delivers with the specification

Beyond the spec itself, Google is publishing several concrete elements to initiate the ecosystem:

  • An enrichment agent that iterates through a BigQuery dataset, generates an OKF document for each table and view, then performs a second LLM pass that enriches each concept with citations, schemas, and join paths.
  • A static HTML viewer which turns any OKF bundle into an interactive graphical view in a self-contained HTML file, no backend, no installation, no data leaving the page.
  • Three example bundles ready to browse, based on public BigQuery datasets (GA4 e-commerce, Stack Overflow, Bitcoin), produced by the reference agent and committed to the repository as living examples of compliant OKF.
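The two-pass shape of the enrichment agent described above can be sketched as follows. This is not Google's reference implementation: the function names (`render_table_doc`, `build_bundle`, `stub_enrich`) are hypothetical, the table metadata is a hard-coded dictionary instead of a real BigQuery dataset, and the second pass is a plain string heuristic standing in for the LLM call.

```python
from pathlib import PurePosixPath
from typing import Callable, Dict, List, Tuple

Column = Tuple[str, str, str]            # (name, type, description)
TableMeta = Tuple[str, List[Column]]     # (table description, columns)
Enricher = Callable[[str, str, Dict[str, str]], str]


def render_table_doc(name: str, description: str, columns: List[Column]) -> str:
    """Pass 1: turn raw table metadata into a minimal OKF document."""
    rows = "\n".join(f"| {n} | {t} | {d} |" for n, t, d in columns)
    return (
        "---\n"
        "type: BigQuery Table\n"
        f"title: {name}\n"
        f"description: {description}\n"
        "---\n"
        "# Schema\n\n"
        "| Column | Type | Description |\n"
        "|--------|------|-------------|\n"
        f"{rows}\n"
    )


def build_bundle(tables: Dict[str, TableMeta], enrich: Enricher) -> Dict[str, str]:
    """Run both passes and return a mapping of bundle path to document text."""
    first_pass = {
        f"/tables/{name}.md": render_table_doc(name, desc, cols)
        for name, (desc, cols) in tables.items()
    }
    # Pass 2: the enricher sees the whole first-pass bundle so it can cross-link.
    return {path: enrich(path, text, first_pass)
            for path, text in first_pass.items()}


def stub_enrich(path: str, text: str, bundle: Dict[str, str]) -> str:
    """Stand-in for the LLM pass: link to other tables mentioned by name."""
    links = []
    for other in sorted(bundle):
        table = PurePosixPath(other).stem
        if other != path and table in text:
            links.append(f"Mentions [{table}]({other}).")
    return text + ("\n# Joins\n\n" + "\n".join(links) + "\n" if links else "")


tables = {
    "orders": ("One row per completed customer order.", [
        ("order_id", "STRING", "Globally unique order identifier."),
        ("customer_id", "STRING", "References the customers table."),
    ]),
    "customers": ("One row per customer.", [
        ("customer_id", "STRING", "Unique customer identifier."),
    ]),
}
bundle = build_bundle(tables, stub_enrich)
print(bundle["/tables/orders.md"])
```

Passing the enricher in as a parameter keeps the producer independent of any particular model, so the stub can be replaced by a real LLM call without touching the first pass.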

Google also updated its Knowledge Catalog to ingest OKF and serve it to its agents, which anchors the format in real production use from launch.

Implications for SEO and Agent Visibility

Marie Haynes, a renowned SEO consultant, makes a central observation about what this paradigm shift represents: we are moving from a job that consists of being found by search engines to one that consists of making a company's knowledge usable by agents to accomplish tasks.

This evolution is profound. Until now, GEO (Generative Engine Optimization) consisted of optimizing content so that generative models would cite it in their responses. With OKF, the question is different: how should an organization's knowledge be structured so that an agent can pick it up, navigate it, and act on it?

Marie Haynes emphasizes that building a quality OKF bundle for a company will require in-depth work: thoroughly understanding the concepts an organization has knowledge of, documenting its processes, and mapping the relationships between its data. It’s not just converting web pages to Markdown. It’s building the structured brain of an organization.

She also notes an emerging business opportunity: the ability to sell OKF bundles of expert knowledge. A lawyer, an accountant, an SEO consultant could sell a bundle of their proprietary processes, which other organizations could integrate directly into their own knowledge system to make it accessible to their agents.

First practical feedback

Haynes documents her first attempts at creating an OKF bundle based on her own traffic drop evaluations. She used a tool to extract key concepts from multiple documents and store them as separate Markdown files, then visualized them as a graph in which each node represents a concept and the edges express the relationships between them. She then queried this bundle through Gemini 2.0 Flash.

She specifies that her test covered only three training documents, and that the system has considerable room for improvement. But the principle is validated: a functional OKF knowledge base can be built today with accessible tools.

Tools for converting web pages into OKF bundles already exist, such as the one developed by Suganthan Mohanadasan. But Haynes insists that the real value lies in creating a bespoke OKF bundle, not in mechanically transposing existing content.

An open standard designed to evolve

Google explicitly presents OKF v0.1 as a starting point, not as a completed standard. The specification fits on one page. The format will evolve as producers and consumers emerge, and as the community collectively learns what knowledge representations agents actually need in practice.

Publishing as open source from day one is a deliberate choice: the value of a knowledge format comes from the number of parties that adopt it. The next steps Google encourages: read the spec, write producers for different data sources, write consumers (viewers, search indexes, agents), test the reference implementation on your own data, and contribute to the GitHub repository.