When you click on”Accept all cookies“click, you agree to the storage of cookies on your device to improve website navigation, analyze site usage, and support our marketing efforts. For more information, see our privacy policy.

How different wiki systems approach AI — and why it matters for regulated organisations

For regulated organisations, the key question is not whether a wiki has AI features, but whether they can control which LLM processes their knowledge, where that processing happens, and whether their data is used for vendor AI training.
Stefan Haller
June 19, 2026

AI has arrived in the wiki market. Every major platform now offers some form of AI-assisted writing, semantic search, or conversational knowledge retrieval. For teams in unregulated environments, the choice is largely about features and price. For organisations operating in regulated industries, public administration, or anywhere with sensitive data, the choice has a more consequential dimension: who controls what the AI does with your knowledge, and under whose laws.

The wiki platforms available today do not all take the same approach to AI. Understanding the architectural models — and what each implies for data governance — is more useful than comparing feature lists, which change frequently. The architecture changes rarely.

The questions that matter for regulated buyers

When a wiki platform adds AI capabilities, several questions follow immediately:

  • Which LLM processes your content? Is it a vendor-selected model you have no control over, or can you choose?
  • Where does the processing happen? US-based infrastructure means US jurisdiction, regardless of where data is stored.
  • Is your content used to train AI for others? There is a difference between a vendor processing your data to serve you, and using it to improve their product for all customers.
  • Can you opt out? Can specific content — sensitive documents, compliance records — be kept away from LLM processing entirely?
  • Is the AI production-ready? A beta feature cannot be relied upon for mission-critical use.

The answers depend almost entirely on which of the three architectural models a platform has adopted.

Model 1: No native AI

Some wiki platforms ship no AI capabilities at all. Search is keyword-based. There is no content generation, no semantic retrieval, no LLM integration of any kind. AI functionality, if needed, requires custom engineering: a separate vector database, an external LLM API, a retrieval and writing layer to build and maintain.

From a sovereignty perspective, this model is clean by default. The platform introduces no third-party AI processing, no external data flows, no vendor-imposed LLM. What the organisation does with AI is entirely its own choice and its own responsibility.

The practical limitation is the other side of that coin. Building a production-quality AI layer on top of a platform that was not designed for it is a significant project — one that most organisations will defer indefinitely. And “no AI today” increasingly means “no path to AI tomorrow” without a platform change. Several lightweight open-source wikis — including well-known tools like BookStack and DokuWiki — sit in this category.

Model 2: Vendor-managed AI

The majority of major commercial wiki platforms have adopted a model where AI is built in, powered by one or more LLMs the vendor has selected, and delivered as part of the platform subscription. The customer gets the AI features; the vendor controls the AI stack.

The capabilities in this model are often impressive — in-editor generation, semantic search, conversational retrieval, autonomous agents. The sovereignty implications are structural. The LLM providers are typically major US-based AI companies. The processing happens on infrastructure outside the customer’s control. The customer cannot substitute a different model, route processing to a jurisdiction they have assessed, or keep specific content away from the AI pipeline without losing access to the features entirely.

A further risk in this model is the vendor’s ability to change the terms of AI use unilaterally. Several major platforms have recently moved to use customer content — in de-identified form — to improve their AI for all users. De-identification does not protect the substance of strategic documentation, compliance records, or sensitive business content from being used as training signal. Opt-out is typically available only on the most expensive tiers, and the default is opt-in.

Example: content-for-training terms changes in 2026

In May 2026, one major enterprise wiki vendor announced a terms change effective August 2026: customer metadata and in-app content — including wiki page content, project descriptions, and chat conversations — will be used to improve AI features for all customers on the platform. Opt-out is restricted to the Enterprise tier and certain compliance-qualified accounts. Free and Standard plan customers have no opt-out for metadata contribution.

This is not a unique or isolated development. It reflects the economic logic of vendor-managed AI: the vendor bears the infrastructure cost of the LLM and recovers it through a combination of pricing and data. Customers who cannot negotiate Enterprise terms are accepting conditions they may not have reviewed carefully.

Platforms in this category include the leading commercial wiki and knowledge-base products, including Confluence, Notion, Microsoft Loop, and most other US-headquartered SaaS knowledge platforms. The specific terms vary; the structural position — vendor controls the AI stack, customer cannot substitute — is consistent across the category.

Model 3: Bring your own LLM (BYOLLM)

A third model separates the AI features from the AI infrastructure. The platform provides the capabilities — semantic search, content generation, conversational retrieval — but connects them to whichever LLM endpoint the organisation supplies. The customer chooses the model: a commercial provider they have assessed and contracted with directly, a model deployed in their own infrastructure, or an on-premise model with no external connectivity at all.

This model gives the organisation meaningful control over the sovereignty questions. The jurisdiction of AI processing is determined by the LLM the organisation selects, not by the platform vendor. Training on customer data is governed by the organisation’s own agreement with its chosen LLM provider, not by the platform’s standard terms. Sensitive areas of the knowledge base can be excluded from AI processing entirely by simply not connecting them to the LLM endpoint.

This architecture is the natural home for European sovereign wiki platforms — those built and operated outside US jurisdiction, with data sovereignty as a design principle rather than a compliance add-on. Several platforms in this category are actively building BYOLLM capabilities; the maturity of those implementations varies.

Phonemos — BYOLLM in production

Phonemos implements the BYOLLM model with production-ready capabilities. Three AI features are live:

  • AI-powered semantic search — finds content by meaning across wiki pages, Word and PDF files, and OCR-extracted image text. The default embedding model is Mistral AI; any OpenAI-compatible endpoint can be substituted.
  • MCP server — exposes the Phonemos knowledge base as grounded context for any enterprise AI platform or agent that supports the Model Context Protocol, without the data leaving Phonemos.
  • Bring your own LLM — connect any OpenAI-compatible endpoint: a commercial provider, a model deployed in your own cloud, or a fully on-premise model. Per-site opt-in means sensitive areas can be kept entirely outside LLM processing.

Phonemos is developed and operated in Bern, Switzerland. No US companies are involved in the platform or its default infrastructure. The CLOUD Act does not apply. Customer content is never used to train or improve Phonemos AI for other customers. AI processing happens within the LLM the organisation selects, under the terms that organisation has agreed with its chosen provider.

Summary comparison

What this means for regulated organisations

The wiki market now covers the full spectrum on AI governance: no capability, vendor-managed AI with fixed providers and US jurisdiction, and BYOLLM platforms that give the organisation meaningful control. Where a platform sits on that spectrum is increasingly a procurement criterion as significant as price or feature set.

Platforms with no native AI sidestep the sovereignty question today — but at the cost of an increasingly wide capability gap and no clear path to closing it without significant custom engineering or a platform change.

Vendor-managed AI platforms offer capable, production-ready features. The trade-off is structural: the organisation accepts the vendor’s choice of LLM providers, the vendor’s jurisdiction, and increasingly the vendor’s right to use content as training data. For organisations with strict data governance requirements, these are not configuration details — they are procurement constraints.

BYOLLM platforms resolve the governance question architecturally. Among European sovereign wiki platforms, this architecture is well established as the right direction — the question for any given platform is how far along the implementation is. For organisations evaluating now, production readiness matters as much as architectural intent.

The broader trend is clear: AI is becoming a core layer of how knowledge platforms work, and vendors are beginning to assert corresponding rights over the data that flows through it. Organisations that have not yet defined their AI governance position — which LLMs are approved, under what conditions, with what data — will face these questions from their platform vendor whether they are ready or not.

Kostenlos ausprobieren?

Das geht! Falls du in der Zwischenzeit Fragen haben solltest, darfst du dich gerne via Chat bei uns melden.
Jetzt testen