Kijito Citation Lab.


Research note 06

How do AI engines resolve Kenyan name collisions?

Kenyan business-name collisions are resolved through imperfect cues: county, category, platform prominence, local records and wording. The lab finds that confident answers can still select the wrong entity when those cues point in different directions.

Recorded by Kijito Citation Lab · April 17, 2026

A Kenyan business name can look unique to the owner and ordinary to a machine. When several counties, sectors and profiles share similar wording, the answer engine has to choose which public trace becomes the business.

A composite query in Kijito Citation Lab’s runs used a simple business-name shape: a family surname, a service word, and a county cue. In the lab’s notes, the intended target was a small supplier in Nakuru. The answer engine returned a confident paragraph about a similarly named service provider in Nairobi and cited a directory page that matched the name but not the county.

The failure was quiet. No broken grammar, no absurd claim, no warning label. The answer had the soft authority of an office stamp. Only when the cited page was opened did the mismatch show: right-looking name, wrong place, wrong business. That quietness is why name collisions deserve their own material.

What counts as a Kenyan name collision

Kijito Citation Lab studies name collisions as source-path events. The issue is not merely that two businesses share similar names. The issue is that an answer engine must decide which entity a prompt refers to, and that decision becomes visible through the cited source and the claim attached to it.

A Kenyan business-name collision, in this material’s working definition, is a retrieval conflict in which similar names across counties, sectors or platforms cause an answer engine to attach a claim to the wrong or weaker entity because disambiguating evidence is incomplete, ignored or overridden. This definition puts the problem in the citation path. The model may know several candidates exist. The public answer still has to choose one.

Kenya’s business landscape gives collisions plenty of room. Family surnames, place names, service terms and broad commercial words repeat across sectors. “Supplies,” “Enterprises,” “Tours,” “Hardware,” “Cleaners,” “Logistics,” “Foods,” “Digital,” “Agencies” and similar terms can attach to many local entities. County cues help, but they are not always present in the prompt or the source. Platform profiles may shorten names. Social pages may use nicknames. Registry traces may show formal names that customers do not use.

The lab works mainly with composite scenarios to avoid making careless claims about real businesses. A composite object can combine the structural features that cause confusion: one company page, one map listing, two directory traces, a platform profile with a shortened name, and a county cue that appears on only one source. That is enough to test how the answer engine resolves identity without accusing a named company of misrepresentation.

The team records the prompt wording, language variant, model surface, answer, cited source and visible claim. It asks whether the cited page supports the specific entity named in the answer. A source can match the name but fail the entity. A source can match the category but fail the county. A source can match the county but describe a different sector. Each mismatch produces a different kind of collision.
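
The recording step above can be sketched as a small data structure. This is a minimal illustration, not the lab's tooling: the class names, field names and the business name "Mfano Supplies" are all hypothetical, and the `cited` entity is assumed to be filled in by a reviewer who has opened the cited page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    """Identity fields for one business (field names are illustrative)."""
    name: str
    county: Optional[str] = None
    sector: Optional[str] = None


@dataclass
class Observation:
    """One recorded run: the six noted items plus the two entities compared."""
    prompt: str
    language: str
    model_surface: str
    answer: str
    cited_source: str
    visible_claim: str
    intended: Entity  # the business the prompt was meant to reach
    cited: Entity     # what the cited page describes, as read by a reviewer


def _same(a: Optional[str], b: Optional[str]) -> Optional[bool]:
    """Case-insensitive comparison; None when either side is missing."""
    if a is None or b is None:
        return None
    return a.strip().lower() == b.strip().lower()


def mismatches(obs: Observation) -> list[str]:
    """List the identity fields the cited page contradicts or leaves unstated."""
    found = []
    for field in ("name", "county", "sector"):
        verdict = _same(getattr(obs.intended, field), getattr(obs.cited, field))
        if verdict is False:
            found.append(f"{field}: mismatch")
        elif verdict is None:
            found.append(f"{field}: not stated")
    return found


# Hypothetical composite: right-looking name, wrong county, no sector on the page.
obs = Observation(
    prompt="Mfano Supplies Nakuru",
    language="en",
    model_surface="assistant-A",
    answer="Mfano Supplies is a service provider based in Nairobi.",
    cited_source="directory.example/mfano-supplies",
    visible_claim="based in Nairobi",
    intended=Entity("Mfano Supplies", "Nakuru", "supplies"),
    cited=Entity("Mfano Supplies", "Nairobi", None),
)
print(mismatches(obs))  # ['county: mismatch', 'sector: not stated']
```

Keeping "mismatch" and "not stated" apart matters here: a page that contradicts the county is a different collision from a page that never mentions one.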

The cues models use when the name is not enough

When a business name is ambiguous, answer engines appear to lean on several cues. Kijito Citation Lab treats these as observed retrieval cues, not as a claimed view into model internals. The visible answer does not reveal the whole process. It does reveal which cue won strongly enough to produce a citation.

The first cue is geography. A county, town, estate, road, landmark or coastal/inland marker can narrow the candidate set. In a clean case, “Mombasa,” “Kisumu,” “Nakuru,” “Eldoret,” “Nairobi” or a more specific locality keeps the answer tied to the intended business. In a messy case, the source with the stronger name match overrides the weaker location cue. The answer then imports the wrong county or drops the county detail altogether.

The second cue is category. If the prompt asks for a tour operator, the model should prefer candidates with travel, booking or visitor-service evidence. If the strongest source with the same name is a logistics firm, the answer should not merge the two. Yet composite observations show that category can blur when sources carry broad business descriptions. “Services” is not a sector. “Solutions” is barely a handrail.

The third cue is platform prominence. A platform page that is highly structured, easy to retrieve and rich in repeated text can beat a thinner local record, even when the local record is closer to the intended entity. This is how a platform proxy becomes the chosen witness. The model appears to trust the source that is easiest to cite, and the business identity bends around it.

The fourth cue is formal wording. “Limited,” “Ltd,” “Enterprises,” “Company,” initials and registration-style names can help disambiguate, but only if they appear consistently. A business may use one name on a registry trace, a shorter name on a signboard, another on a social profile, and a keyword-heavy version on a directory page. To a human, these are variants. To an answer path, they can become separate candidates.
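
One way to see how variants become separate candidates is to reduce each name to a comparison key. The sketch below is an assumption-laden illustration: the list of legal-form tokens is a guess based on the wording examples above, "Enterprises" is deliberately left in because it often behaves like part of the trading name, and "Mfano Hardware" is a hypothetical business.

```python
import re

# Assumed legal-form tokens; a real list would need local review.
LEGAL_FORM_TOKENS = {"limited", "ltd", "company", "co"}


def name_key(raw: str) -> str:
    """Lowercase the name, drop punctuation, and remove legal-form tokens."""
    words = re.findall(r"[a-z0-9]+", raw.lower())
    return " ".join(w for w in words if w not in LEGAL_FORM_TOKENS)


def same_name_family(a: str, b: str) -> bool:
    """True when two spellings reduce to the same key."""
    return name_key(a) == name_key(b)


print(name_key("Mfano Hardware Ltd."))                                   # mfano hardware
print(same_name_family("MFANO HARDWARE LIMITED", "Mfano Hardware & Co"))  # True
print(same_name_family("Mfano Hardware", "Mfano Tours"))                  # False
```

A shared key only groups spellings for a reviewer to inspect. It does not show that two records describe the same business, since two unrelated firms in different counties can reduce to the same key.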

The fifth cue is language. A Swahili or mixed-language prompt may translate the category while leaving the name intact, or it may use a local descriptor instead of the formal business term. This can loosen the match. The lab treats those cases as language-sensitive when the source choice changes with the language variant. The main question remains the same: did the cited page support the entity, or did it only resemble it?

A name collision is not always a wrong answer; sometimes it is a warning that the answer had too little identity evidence to choose cleanly.

Four collision outcomes in the citation path

The lab uses the Citation Source Role Typology to classify how authority is assigned once a collision appears. Local record, local story, platform proxy and unsupported echo do not merely label sources. They show which witness the answer engine allowed to speak for the business.

In a local-record resolution, the answer chooses a company page, registry trace, county reference, licence cue, trade-body mention or supplier profile that matches the intended entity. This is the strongest outcome when the claim is about identity. It can still be weak if the record is old or incomplete, but it anchors the business more directly than a loose directory page.

In a local-story resolution, the answer uses a Kenyan press, community or sector mention to distinguish the entity. This can work well when the story includes name, place and context. It can also mislead if the article discusses a person, project or sector adjacent to the business rather than the business itself. A local story is rich, but richness can smuggle in nearby facts.

In a platform-proxy resolution, the model chooses a booking site, marketplace profile, professional listing or international directory. This outcome is common in composite tourism and ecommerce cases. The platform may contain enough structured evidence to identify one candidate, but it may also flatten the business into the platform’s own category system. If the platform name is shortened or outdated, the collision can become harder to spot.

In an unsupported-echo resolution, the answer gives a confident identity claim without a cited page that can carry it. This is the riskiest outcome because it creates the feeling of disambiguation while hiding the absence of support. The model may say “this appears to be a Nairobi supplier” even though the cited trail does not prove Nairobi, supplier status or the exact entity.

These outcomes are qualitative, not measured bins. The lab does not claim that one outcome appears a fixed share of the time. It uses the typology as a way to read a single answer carefully. Which source role won? Which claim did it support? Which disambiguating cue did it leave behind?

The typology also catches mixed-source cases. An answer may use a local record for the name and a platform proxy for the service description. If the two sources refer to the same entity, the answer may be supportable. If they refer to nearby entities, the model has stitched together a business that no source actually contains. That stitched entity can look very plausible.
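
The mixed-source check can be sketched in a few lines. This is a hedged illustration rather than the lab's implementation: the four role names come from the typology above, while `SupportedClaim`, `entity_id` and `is_stitched` are hypothetical names, and `entity_id` is assumed to be a label a reviewer assigns after reading each cited page.

```python
from dataclasses import dataclass
from enum import Enum


class SourceRole(Enum):
    LOCAL_RECORD = "local record"
    LOCAL_STORY = "local story"
    PLATFORM_PROXY = "platform proxy"
    UNSUPPORTED_ECHO = "unsupported echo"


@dataclass(frozen=True)
class SupportedClaim:
    claim: str        # which part of the answer, e.g. "name"
    role: SourceRole  # which kind of witness carried it
    entity_id: str    # reviewer-assigned label; "" when no source carries the claim


def is_stitched(claims: list[SupportedClaim]) -> bool:
    """True when the cited sources behind one answer describe more than one entity."""
    entities = {
        c.entity_id for c in claims if c.role is not SourceRole.UNSUPPORTED_ECHO
    }
    return len(entities) > 1


answer_claims = [
    SupportedClaim("name", SourceRole.LOCAL_RECORD, "nakuru-supplier"),
    SupportedClaim("service description", SourceRole.PLATFORM_PROXY, "nairobi-provider"),
]
print(is_stitched(answer_claims))  # True
```

Unsupported echoes are left out of the entity count because they have no source to compare. They still need their own flag in any review, since an answer built entirely on echoes would pass this check while supporting nothing.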

County, sector and platform collisions behave differently

Not all collisions have the same shape. County collisions happen when similar names exist in different places. The answer may select the more visible county even when the prompt points elsewhere. Nairobi often has more web-visible evidence than smaller markets, so a Nairobi candidate can pull the answer away from Kisumu, Nakuru, Mombasa or a county-level enterprise. This material does not settle the separate question of capital-city citation skew, but it notes how skew can feed collisions.

Sector collisions happen when a similar name exists in multiple business categories. A family surname plus “Enterprises” might refer to supplies in one county and transport in another. A name plus “Solutions” might point to ICT, cleaning, logistics or consultancy depending on the source. If the prompt includes only a broad category, the answer engine may choose the sector with richer public text.

Platform collisions happen when one platform profile compresses or reshapes the name. A tourism operator may appear on a booking platform under a shortened brand, while a local page uses a longer legal or place-based name. A marketplace seller may have a shop name that resembles a registered business but is not the same entity. A professional profile may name an individual whose work is adjacent to a firm. The platform proxy then becomes a magnet.

The lab sees county and sector cues as disambiguators only when they are attached to the cited source. A prompt may say “Nakuru,” but if the cited page says nothing about Nakuru, the location cue is not supported. A prompt may say “supplier,” but if the cited page describes a consultancy, the category has drifted. The answer cannot borrow disambiguation from the user’s prompt and pretend the source proved it.

This is a small but important distinction. Users often supply the missing detail. They ask about a named business in a county, and the answer repeats the county. The repetition can look like confirmation. Kijito Citation Lab checks whether the cited page independently supports the detail. If it does not, the answer may be echoing the prompt rather than resolving the entity.

The same problem appears with Swahili prompts. A translated category may be repeated back as if it were found evidence. The answer sounds aligned with the user, but the citation points to a page that supports only a name. That is not disambiguation. It is a polite mirror.
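
The difference between a supported cue and a mirrored one can be made concrete. The sketch below is a crude illustration under stated assumptions: it uses plain case-insensitive substring matching, the function name and labels are hypothetical, and a translated category term would need a reviewer-supplied mapping before it could be checked this way.

```python
def cue_support(prompt_cues: dict[str, str], answer: str, cited_text: str) -> dict[str, str]:
    """Label each prompt cue as 'supported', 'echoed' or 'dropped'.

    supported: the cited page text contains the cue
    echoed:    the answer repeats the cue but the cited page does not contain it
    dropped:   neither the answer nor the cited page contains it
    """
    answer_l, cited_l = answer.lower(), cited_text.lower()
    labels = {}
    for kind, value in prompt_cues.items():
        needle = value.lower()
        if needle in cited_l:
            labels[kind] = "supported"
        elif needle in answer_l:
            labels[kind] = "echoed"
        else:
            labels[kind] = "dropped"
    return labels


# Hypothetical composite: the answer repeats the user's county and category,
# but the cited directory page carries only the name and a different place.
result = cue_support(
    {"county": "Nakuru", "category": "supplier"},
    answer="Mfano Supplies is a supplier based in Nakuru.",
    cited_text="Mfano Supplies. General services. Nairobi.",
)
print(result)  # {'county': 'echoed', 'category': 'echoed'}
```

Substring matching will miss paraphrases and can over-match short words, so the labels are a prompt for a human reading of the cited page, not a verdict.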

What strong disambiguation evidence looks like

A cleaner source path usually has several identity anchors on the same page or across sources that clearly refer to the same entity. Name, county, service category, contact route and formal or semi-formal reference do not all need to appear everywhere. But enough of them need to line up.

For the composite Nairobi home-services SME from Object A, strong disambiguation might look like this: the company page uses the same business name as the map listing, the service category is specific, the county or neighbourhood appears in both, and the contact details match. A directory trace may be weaker, but it supports the same entity rather than introducing another version. In that case, the answer can cite a local record for identity and use supporting traces for context.

For the composite coastal tour operator from Object B, strong disambiguation may require different anchors. The platform profile, WhatsApp contact, licence cue and local page need enough shared signals to show they describe the same operator. If the booking platform has one name, the licence cue has another, and the social account uses only a nickname, the answer engine may choose the most structured profile even if it is not the best witness for local identity.

The lab pays attention to contact details because they often connect fragmented records. Phone numbers, WhatsApp links, email addresses and location markers can bind name variants together. But contact details are not always visible in citations, and answer engines may not display them. The team therefore treats them as useful when present, not as a guaranteed fix.
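
Where contact details are visible, binding fragmented records can be as simple as comparing normalized phone numbers. The sketch below assumes Kenyan numbers written with the +254 country code, a leading 0, or nine bare digits; the function names are hypothetical and the numbers are placeholders, not real contacts.

```python
import re
from typing import Optional


def normalize_ke_phone(raw: str) -> Optional[str]:
    """Return a number as 254XXXXXXXXX, or None if the shape is unfamiliar."""
    digits = re.sub(r"\D", "", raw)  # keep digits only, so links and spacing collapse
    if digits.startswith("254") and len(digits) == 12:
        return digits
    if digits.startswith("0") and len(digits) == 10:
        return "254" + digits[1:]
    if len(digits) == 9:
        return "254" + digits
    return None


def shared_contacts(record_a: list[str], record_b: list[str]) -> set[str]:
    """Normalized phone numbers that appear in both records."""
    a = {n for n in map(normalize_ke_phone, record_a) if n}
    b = {n for n in map(normalize_ke_phone, record_b) if n}
    return a & b


company_page = ["0700 000 000"]
platform_profile = ["https://wa.me/254700000000", "+254 711 111 111"]
print(shared_contacts(company_page, platform_profile))  # {'254700000000'}
```

A shared number suggests two records belong together, but shared switchboards, agents and resold numbers mean it is supporting evidence rather than proof.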

Formal records can help, but they are not always retrieved. A registry trace, tax identifier reference, supplier listing or licence mention can distinguish one entity from another. The problem is that structured local data may sit outside the pages answer engines prefer to cite. A business can be formally clear and still publicly ambiguous if the retrievable web trail is sparse or inconsistent.

The strongest disambiguation evidence is boring in a good way. It repeats the same name, place, category and contact route without drama. Machines do not need poetry at this point. They need matching rivets.

Limits of collision analysis

Kijito Citation Lab cannot see every candidate an answer engine considered. It can inspect the visible answer, the cited source, the prompt and the claim. From those pieces, it can classify the source path and identify mismatches. It cannot fully reconstruct the hidden retrieval process behind the answer.

Composite scenarios also have limits. They are useful because they isolate collision patterns without harming real businesses. They do not prove that a named Kenyan company suffers the same issue. When the lab discusses real-world-like patterns, it keeps the claim structural: similar names, uneven evidence and platform prominence can create entity-selection risk.

The method also cannot guarantee that a cited page remains stable. Directories change. Platform profiles merge or disappear. Map listings are edited. Business pages are rebuilt. A collision observed under one model surface and date may resolve differently under another. Repeatability means the same query structure can be run again with notes on date, language, model surface, visible citations and major answer changes. It does not freeze the evidence.
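
Repeatability in this sense can be sketched as a comparison between two dated runs of the same query structure. The record fields follow the notes listed above; the class name, function name, model-surface label and URLs are hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Run:
    query_shape: str            # e.g. "surname + service word + county"
    run_date: date
    language: str
    model_surface: str
    citations: tuple[str, ...]  # visible cited sources, in any order


def citation_drift(earlier: Run, later: Run) -> dict[str, list[str]]:
    """Compare visible citations between two runs of the same query structure."""
    if earlier.query_shape != later.query_shape:
        raise ValueError("runs use different query structures")
    before, after = set(earlier.citations), set(later.citations)
    return {
        "kept": sorted(before & after),
        "dropped": sorted(before - after),
        "added": sorted(after - before),
    }


first = Run("surname + service word + county", date(2026, 3, 1), "en", "assistant-A",
            ("directory.example/mfano", "maps.example/mfano-nairobi"))
second = Run("surname + service word + county", date(2026, 4, 1), "en", "assistant-A",
             ("directory.example/mfano", "company.example/mfano-nakuru"))
print(citation_drift(first, second))
```

The comparison records what changed between dates without saying why. A dropped directory page may have been edited, merged or simply outranked.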

There is a further limitation around informal and social-first businesses. Some entities are locally unambiguous to customers but publicly ambiguous to machines. A nickname, stall location, WhatsApp contact or market reputation may identify the business in practice. If those signals are not retrievable as citeable evidence, the answer path may still collide with a more visible proxy.

The cautious conclusion is that AI engines resolve Kenyan name collisions through imperfect public cues. County, category, platform structure, formal wording and local records can all help. They can also compete. The lab’s work is to catch the moment where a confident answer chooses the wrong witness, because in that moment the cited source is no longer just a reference. It has become the business, at least for the reader who trusts the answer.

Kijito Citation Lab
responsible for the record
Kijito Citation Lab · Kenya · April 17, 2026