Every word you speak
has cousins across the world.

Search any word or concept. The atlas returns every documented form across language families, scored for structural similarity and traced to shared origins.

400+ Corpus Entries
12 Language Families
3 Analytical Axes
0 Contributions
Featured connections
High-confidence cognates — words that share structure across unrelated families.
The core proposition
Words diverge at different rates. Bodily and kinship terms remain near-identical across all documented families. Abstract and cultural terms drift furthest. The path of divergence follows recoverable rules.
Conf = w₁·Scos(VL₁,VL₂) + w₂·P(C₁→C₂) + w₃·Overlap(PV₁,PV₂)
Add your language
Know this word in a language not listed? Submit it — the engine scores it and it enters the review queue.

Dictionary

Cross-referenced across all documented families. Filter by language or browse the full corpus. Annual print editions published under the PUO imprint.

Contribute to the corpus

The atlas grows through community knowledge. Submit words, roots, and connections from your language. All entries are scored by the PUO engine and enter a review queue before publication.

Submit a connection
Know a word that connects across families? Start with the concept.
Your contributions
Every submission is timestamped and attributed to your account.
No submissions yet.
Corpus statistics
400+ Core entries
0 Community submissions
5 Bantu languages
7 Other families

The PUO Framework

A predictive, falsifiable model for mapping cognacy across language families. Three analytical axes form a theorem that generates testable hypotheses about structural relationships.

Axis A
Vowel Signatures
Vowels function as semantic anchors. The AEI core is stable across Bantu roots. O and U nominalise and modulate.
Vowel  Function                 Example
A      Base / Action / Root     thusa, dira
E      Subjunctive / Negation   thuse, ga a bue
I      Agentive / Continuous    mothusi, moriti
O      Nominalization           thuso, puo, tiro
U      Progressive / Plural     ubuntu, ma pundu
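
As a rough illustration of the Axis A table, the sketch below looks up the function label for a word's final vowel. The labels are copied from the table; the helper name `final_vowel_function` and the choice to key on the final vowel only are assumptions made for this example, not a description of the PUO engine.

```python
from typing import Optional

# Labels copied from the Axis A table. Keying on the final vowel is an
# assumption of this sketch, not a documented rule of the engine.
VOWEL_FUNCTIONS = {
    "a": "Base / Action / Root",
    "e": "Subjunctive / Negation",
    "i": "Agentive / Continuous",
    "o": "Nominalization",
    "u": "Progressive / Plural",
}


def final_vowel_function(word: str) -> Optional[str]:
    """Return the Axis A label for the word's last letter, or None
    when the word is empty or does not end in one of A, E, I, O, U."""
    last = word.strip().lower()[-1:]  # slicing keeps this safe for ""
    return VOWEL_FUNCTIONS.get(last)


if __name__ == "__main__":
    for w in ("thusa", "thuse", "mothusi", "thuso", "ubuntu"):
        print(w, "->", final_vowel_function(w))
```

A single-letter lookup ignores prefixes and multi-word examples such as "ga a bue"; a fuller treatment would need the Axis C packaging features.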
Axis B
Consonant Rotation
Sound shifts follow probability matrices — not random drift. The D↔T↔R↔L chain is the most documented rotation across all families.
Shift   Example         P
D↔T     dira → tiro     0.70
B↔P↔M   Botho → Motho   0.60
S↔F↔Z   mFana ↔ mSana   0.55
R↔L     Tura ↔ Dula     0.50
G↔K↔H   neGus → Kgosi   0.45
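
The rotation table can be read as a lookup from a consonant pair to a probability. The sketch below is one hedged way to encode it. The probabilities are copied from the table; the function name `rotation_probability`, the symmetric treatment of each chain, and the values returned for identical or unlisted pairs are all assumptions of this example.

```python
# Probabilities copied from the Axis B table. Treating every member of a
# chain as interchangeable with every other member is an assumption.
ROTATION_GROUPS = [
    ({"d", "t"}, 0.70),
    ({"b", "p", "m"}, 0.60),
    ({"s", "f", "z"}, 0.55),
    ({"r", "l"}, 0.50),
    ({"g", "k", "h"}, 0.45),
]


def rotation_probability(c1: str, c2: str) -> float:
    """Return P(C1 -> C2) for two single consonants."""
    c1, c2 = c1.lower(), c2.lower()
    if c1 == c2:
        return 1.0  # assumption: an unchanged consonant is a certain match
    for group, p in ROTATION_GROUPS:
        if c1 in group and c2 in group:
            return p
    return 0.0  # assumption: pairs outside the table score zero


if __name__ == "__main__":
    print(rotation_probability("d", "t"))
    print(rotation_probability("B", "M"))
    print(rotation_probability("d", "k"))
```

Because the lookup is symmetric, `rotation_probability("t", "d")` returns the same value as `rotation_probability("d", "t")`. A directional matrix would be needed if the engine distinguishes the two.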
Axis C
Morphological Packaging
Prefix intensity, phoneme inventory, and suffix logic determine how roots are realised and how they map across families.
Feature    Description
PI         Prefix intensity (0–1)
MD         Morphological depth
‑ela       Action for / toward others
S‑Series   Remove S → base root revealed
PIF        Phoneme inventory flags
Confidence Score Formula
Conf = w₁ · Scos(VL₁,VL₂)  +  w₂ · P(C₁→C₂)  +  w₃ · Overlap(PV₁,PV₂)
VL = vowel vector · P(C) = consonant rotation probability · PV = morphological packaging vector
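
To make the formula concrete, here is a minimal sketch of how the three terms might be combined. Several pieces are assumptions, since the page does not define them: Scos is read as cosine similarity, the vowel vector is taken to be a count of A, E, I, O, U in the word, Overlap is implemented as Jaccard overlap on sets of packaging flags, and the weights (0.4, 0.4, 0.2) are placeholders. All function names are hypothetical.

```python
import math


def vowel_vector(word):
    """Assumed VL: counts of a, e, i, o, u in the word."""
    w = word.lower()
    return [w.count(v) for v in "aeiou"]


def cosine_similarity(v1, v2):
    """Assumed reading of Scos; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(b * b for b in v2))
    if n1 == 0 or n2 == 0:
        return 0.0
    return dot / (n1 * n2)


def overlap(pv1, pv2):
    """Assumed Overlap: Jaccard index over two collections of flags."""
    s1, s2 = set(pv1), set(pv2)
    if not s1 and not s2:
        return 0.0  # assumption: no features means no evidence
    return len(s1 & s2) / len(s1 | s2)


def confidence(vl1, vl2, p_rotation, pv1, pv2, weights=(0.4, 0.4, 0.2)):
    """Conf = w1*Scos(VL1,VL2) + w2*P(C1->C2) + w3*Overlap(PV1,PV2).
    The default weights are placeholders, not values from the source."""
    w1, w2, w3 = weights
    return (w1 * cosine_similarity(vl1, vl2)
            + w2 * p_rotation
            + w3 * overlap(pv1, pv2))


if __name__ == "__main__":
    # dira -> tiro, with P(D<->T) = 0.70 from the Axis B table.
    # The packaging flags below are invented for illustration.
    score = confidence(vowel_vector("dira"), vowel_vector("tiro"),
                       0.70, {"no_prefix"}, {"no_prefix"})
    print(round(score, 2))
```

With weights that sum to 1 and each term bounded by 0 and 1, the score also stays between 0 and 1, which keeps scores comparable across word pairs.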
The Deviation Principle
Words diverge at different rates depending on how universal the experience they encode is. Bodily and kinship terms, the first things named, show near-zero deviation across all families. Abstract, cultural, and traded concepts show the highest drift.

One origin. Shared base. Rule-governed deviation. Recoverable structure.
Open Research Questions
The framework is explicit about what it still requires.
— Corpus-derived vowel vectors (target: 200–300 cognates)
— Formalized prefix packaging metrics (PI, MD, PIF)
— Extension to Afro-Asiatic and Austronesian families
— Exceptions registry — stress-tests, not weaknesses
— Peer-reviewed publication and institutional validation

Research Portal

For academic institutions, universities, and professional researchers. Partner access opens the full corpus, raw scoring APIs, and collaborative annotation tools.

The PUO corpus and theorem are in active development. Research partners contribute to corpus expansion, provide peer review, and gain full API access to the scoring engine. Institutional partnerships include co-authorship on published findings and curriculum integration support.

Ministries of Education
Curriculum Integration
Pilot programme endorsement, teacher certification pathways, and assessment design. Starting with Botswana — structured for continental adoption.
Universities & Research Institutes
Academic Partnership
Co-develop the corpus, contribute peer review, design joint research programmes. Linguistics, African studies, and computational departments are primary partners.
Technology Partners
ASR & Computational Tools
Development of automatic speech recognition for low-resource African languages — a commercially significant application built directly from the corpus.
Corpus sample — available to research partners
Root    Language           Family        A      E       I           O          U          Notes
THUS-   Setswana           Bantu         thusa  thuse   mothusi     thuso      -          help base root
BUA-    Setswana           Bantu         bua    bue     mmui/sebui  puo        -          B↔P mapping
PUNDA-  OtjiHerero         Bantu         punda  punde   -           (mu)pundo  (ma)pundu  progressive
DIR-    Setswana           Bantu         dira   dire    ?dirile     tiro       -          D↔T rotation
DUL-    Tswana/Herero/Eng  Cross-family  dula   dule    -           (bo)dulo   -          dwell (English) cognate
KIT-    Setswana           Bantu         -      ikitse  -           kitso      -          knowledge / self-awareness
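
For readers who want to work with rows like these programmatically, the sketch below shows one possible in-memory shape. It is not the format of the PUO corpus or its API: the class name `RootEntry`, its fields, and the helper `nominal_forms` are hypothetical, and only three rows from the sample are included.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RootEntry:
    """Hypothetical record for one corpus row."""
    root: str
    language: str
    family: str
    # Maps a vowel slot ("A", "E", "I", "O", "U") to the listed form.
    forms: Dict[str, str] = field(default_factory=dict)
    notes: str = ""

    def form(self, vowel: str) -> Optional[str]:
        """Return the form in the given vowel slot, or None if empty."""
        return self.forms.get(vowel.upper())


# Three rows transcribed from the corpus sample.
SAMPLE: List[RootEntry] = [
    RootEntry("THUS-", "Setswana", "Bantu",
              {"A": "thusa", "E": "thuse", "I": "mothusi", "O": "thuso"},
              "help base root"),
    RootEntry("BUA-", "Setswana", "Bantu",
              {"A": "bua", "E": "bue", "I": "mmui/sebui", "O": "puo"},
              "B↔P mapping"),
    RootEntry("KIT-", "Setswana", "Bantu",
              {"E": "ikitse", "O": "kitso"},
              "knowledge / self-awareness"),
]


def nominal_forms(entries: List[RootEntry]) -> Dict[str, str]:
    """Collect the O-slot (nominalised) form for each root that has one."""
    return {e.root: e.forms["O"] for e in entries if "O" in e.forms}


if __name__ == "__main__":
    print(nominal_forms(SAMPLE))
```

Storing forms in a dictionary keyed by vowel slot lets empty cells be absent instead of being filled with placeholder strings.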
Request Research Access
Institutional and academic access by application.

My Profile

Guest
No languages set
Member since today · 0 contributions to the corpus
Your linguistic identity
Languages you speak
Your contributions
No contributions yet. Every word you add expands the atlas.