The question is not whether informal businesses exist. They plainly do. The harder question is whether an answer engine can find enough public evidence to name one, place it, describe it, and cite a source without borrowing from a weaker proxy.
A workshop can be visible from the road and nearly invisible to a machine. In one composite run used by Kijito Citation Lab, the prompt asked for a jua kali metal fabricator near an industrial edge of Nairobi. The answer produced a tidy description: gates, grills, repairs, custom work. The cited trail, though, was mostly a map listing, an old directory trace and a social post that mentioned a phone number but not the full business name.
The team saw a familiar awkwardness. The model could describe the category better than the specific enterprise. It knew what such a business might do, how customers might search for it, and which area names were plausible. But the source path thinned out when the answer had to prove this operator, in this place, with this service. One sentence sounded like local knowledge. The citation did not quite carry it.
Visibility starts before citation
Jua kali businesses often live in public, but not always in documents. A fundi may be known through a stall, a yard, a WhatsApp number, a painted sign, customer referrals, map pins, supplier relationships or photos circulating through social platforms. Some traces are machine-readable. Some are half-readable. Some are visible only to people who already know where to look.
The lab treats this as a source-dependency problem. An answer engine does not meet the business in the physical lane. It meets whatever public evidence can be retrieved, connected and cited. For a formal SME, that evidence may include a company page, registration traces, trade references and press snippets. For a jua kali operator, the public trail may be thinner and more uneven: a map listing with reviews, a Facebook post with workshop photos, a marketplace advert, a county reference to an artisan cluster, or a directory page copied from another directory.
A jua kali business becomes citeable only when a public trace can support a specific claim about the entity. That means a source must carry at least one of a name, a place cue, an activity cue or a contact cue strongly enough to connect the answer to the business being described. This is not a high bar in human terms. In machine terms, missing it can be enough to break the trail.
The lab’s early observations suggest a blunt pattern: informal businesses are easier for models to describe as categories than to cite as individual entities. “Jua kali furniture makers in Nairobi” may produce fluent general text. “A specific jua kali furniture maker in a named estate” demands evidence that many small operators never publish in a stable form.
That difference matters. A business can be economically real and locally trusted while still being weakly represented in answer engines. The absence is not proof that the business lacks customers, reputation or competence. It may only show that its public evidence is hard for the engine to isolate.
The composite workshop problem
Object A in the research plan is a composite Nairobi home-services SME with a simple company page, a Google Business listing, scattered directory traces and occasional social posts. For this material, the lab uses a rougher cousin of that scenario: a typical jua kali workshop with map evidence, social traces and a thin or absent website. It is not a real named case. It combines patterns the team expects to inspect through repeatable prompt runs.
The prompt shape is simple. Ask for a type of informal service business in a Kenyan location. Then vary the wording: English, Swahili-facing phrasing, category-first, location-first, service-first, and sometimes a prompt that asks for sources. The lab records the visible answer, cited page, claim and language variant. The result is not a ranking. It is a small source-path map.
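A minimal sketch of that prompt shape, assuming a handful of English templates and a flat record per observation. The template wording, the class names and the `build_variants` helper are all hypothetical; the lab's actual prompts and schema are not given here, and Swahili-facing phrasing would need its own templates written by a fluent speaker.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical English templates; the lab's real wording is not given in the text.
TEMPLATES = {
    "category-first": "{category} in {location} offering {service}",
    "location-first": "In {location}, which {category} offers {service}?",
    "service-first": "I need {service}. Which {category} in {location} can help?",
}


@dataclass(frozen=True)
class PromptVariant:
    language: str           # tag for the phrasing, e.g. "en" or "sw"
    shape: str              # category-first, location-first or service-first
    asks_for_sources: bool
    text: str


@dataclass
class Observation:
    """One row of the source-path map: what was asked and what came back."""
    variant: PromptVariant
    visible_answer: str
    cited_page: str
    claim: str


def build_variants(category, location, service, language="en", templates=TEMPLATES):
    """Cross every template shape with and without a request for sources."""
    variants = []
    for shape, asks in product(templates, (False, True)):
        text = templates[shape].format(
            category=category, location=location, service=service
        )
        if asks:
            text += " Please cite your sources."
        variants.append(PromptVariant(language, shape, asks, text))
    return variants


variants = build_variants("jua kali metal fabricator", "Nairobi", "gate repair")
for v in variants:
    print(v.shape, v.asks_for_sources, "|", v.text)
```

Each answer would then be stored as one `Observation` per claim, so the same cited page can appear against several claims without being collapsed into a single score.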
A typical answer begins with category fluency. It may describe welders, carpenters, motorbike mechanics, sign painters or furniture makers with reasonable surface detail. The trouble arrives when the answer names a specific business. The cited page might be a platform proxy, such as a directory or marketplace page that repeats the business name but gives little evidence of activity. It might be a map listing with enough location support but weak detail about services. It might be a local story about a jua kali cluster rather than the individual operator named in the answer.
This is where the lab’s Citation Source Role Typology becomes useful. The same answer can contain a local record, a local story, a platform proxy and an unsupported echo in one paragraph. A map listing may function as a local record for location. A county article about an artisan cluster may function as a local story for context. A copied directory page may act as a platform proxy. A claim about years of experience, best quality or exact service range may be an unsupported echo if no cited page backs it.
The typology keeps the team from treating visibility as one clean event. A jua kali business can be partly visible. It may be citeable for location, weakly supported for category and unsupported for reputation. That partial shape is the point.
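One way to hold that partial shape in data is sketched below. The four role names come from the typology; the `Claim` fields, the `visibility_shape` helper and the example URLs are assumptions made for illustration, and the role on each claim is expected to be assigned by a human reviewer, not detected automatically.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SourceRole(Enum):
    LOCAL_RECORD = "local record"
    LOCAL_STORY = "local story"
    PLATFORM_PROXY = "platform proxy"
    UNSUPPORTED_ECHO = "unsupported echo"


@dataclass
class Claim:
    aspect: str                        # e.g. "location", "category", "reputation"
    text: str
    cited_page: Optional[str] = None   # None when no cited page backs the claim
    role: Optional[SourceRole] = None  # reviewer-assigned role of the cited page


def visibility_shape(claims):
    """Map each aspect of the answer to the source roles that back it."""
    shape = {}
    for claim in claims:
        backed = claim.cited_page is not None and claim.role is not None
        role = claim.role if backed else SourceRole.UNSUPPORTED_ECHO
        shape.setdefault(claim.aspect, []).append(role.value)
    return shape


# Hypothetical claims from one answer; the URLs are placeholders.
claims = [
    Claim("location", "The workshop is near the industrial area.",
          "https://maps.example/listing", SourceRole.LOCAL_RECORD),
    Claim("category", "It belongs to a metalwork cluster.",
          "https://county.example/artisans", SourceRole.LOCAL_STORY),
    Claim("reputation", "It offers the best quality gates."),
]
shape = visibility_shape(claims)
print(shape)
```

The result is a per-aspect picture, not a single visible or invisible verdict: here location and category each have a backing role, while the reputation claim falls through to unsupported echo.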
Minimal signals that make an informal business legible
The lab is careful with the word “minimal.” It does not mean the smallest amount of evidence a business should publish. It means the smallest public trace that appears to let an answer engine connect a business to a claim. In early source-path reviews, four kinds of trace tend to matter: a stable name, a place cue, an activity cue and a retrievable page that can be cited.
A stable name is harder than it sounds. A jua kali operator may be known by a person’s name, a workshop nickname, a stall description or a phone number saved in customers’ contacts. A map listing might use one spelling. A social profile might use another. A directory might abbreviate the name. The answer engine has to decide whether these fragments belong to one entity or several. If the name is too fluid, the model may retreat to the category and avoid the entity.
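The spelling problem can be sketched with Python's standard `difflib`. This is an illustration of the matching decision, not a description of how any answer engine resolves entities; the business names are invented and the 0.85 threshold is an arbitrary assumption.

```python
import re
from difflib import SequenceMatcher


def normalise(name):
    """Lower-case the name, turn punctuation into spaces and trim the ends."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", name.lower())
    return cleaned.strip()


def name_similarity(a, b):
    """Similarity between 0.0 and 1.0 on the normalised spellings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def likely_same_entity(a, b, threshold=0.85):
    # The threshold is an illustrative value, not a calibrated one.
    return name_similarity(a, b) >= threshold


# Hypothetical spellings of one workshop across a map listing and a directory.
print(likely_same_entity("Example Welding Works", "EXAMPLE WELDING-WORKS"))
print(likely_same_entity("Example Welding Works", "Lakeside Sign Painters"))
```

String similarity only handles spelling drift. It cannot tell that a person's name, a workshop nickname and a saved phone number all point to the same operator, which is the harder case described above.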
The place cue is usually stronger. Estate names, road names, markets, industrial areas and town centres help answer engines narrow the match. Yet place can also mislead. A directory may attach a broad county label. A map pin may sit near a cluster where many similar operators work. A local story may describe the cluster but not the individual business. The engine may then produce a confident answer that feels locally plausible while resting on a source that speaks only for the area.
The activity cue matters because “jua kali” is too broad to carry a specific recommendation. The answer must know whether the operator repairs tools, fabricates gates, builds furniture, paints signs, fixes motorbikes or sells spare parts. Social posts often help here because photos and captions may show the work more directly than directories do. But captions are not always stable, searchable or citeable. A post can make the business visible to people and still be awkward evidence for an answer engine.
The fourth signal is the most mechanical: a retrievable page. A source may exist, but if it is buried inside a platform, blocked from indexing, poorly titled or detached from the business name, it may not serve as cited evidence. The answer engine may know through retrieval that a stronger trace exists, yet cite a weaker page simply because that page is easier to surface.
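The four signals can be read as a simple checklist over the traces found for one business. The sketch below is one possible reading, under the assumption that only retrievable traces count and that a name is stable when every retrievable trace spells it the same way; the `Trace` fields and example values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trace:
    url: str
    retrievable: bool                   # can the page be fetched and cited?
    name: Optional[str] = None
    place_cue: Optional[str] = None
    activity_cue: Optional[str] = None


def missing_signals(traces):
    """List which of the four signals the retrievable traces fail to supply."""
    usable = [t for t in traces if t.retrievable]
    names = {t.name.strip().lower() for t in usable if t.name}
    missing = []
    if not usable:
        missing.append("retrievable page")
    if len(names) != 1:  # no name at all, or conflicting spellings
        missing.append("stable name")
    if not any(t.place_cue for t in usable):
        missing.append("place cue")
    if not any(t.activity_cue for t in usable):
        missing.append("activity cue")
    return missing


# Hypothetical traces: a citeable map listing and a social post that is not retrievable.
traces = [
    Trace("https://maps.example/listing", True,
          name="Example Welding", place_cue="industrial area"),
    Trace("https://social.example/post", False,
          name="example welding", activity_cue="gates and grills"),
]
print(missing_signals(traces))
```

In this example the activity cue exists in public, but only inside a post the engine cannot cite, so it is reported as missing. That is the gap between being visible to people and being usable as evidence.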
When the answer borrows a stronger voice
The most common danger is not total disappearance. It is borrowed authority. An answer names or describes an informal business, then relies on a source that speaks for a platform, a cluster, a category or a similar entity. The surface answer looks helpful. The evidence has shifted sideways.
For a coastal operator in Object B, the platform proxy problem is easier to see because international booking or travel pages can dominate the trail. For jua kali businesses, the weaker supporting trail may be less polished: copied directories, marketplace posts, map fragments or supplier pages. Under the typology, copied directories and marketplace pages usually act as platform proxies; map listings and supplier pages may instead be local records or partial traces, depending on who controls them and what claim they support.
A source path is weak when the cited page supports the general category but not the named business. It is mixed when one source supports the location and another suggests the service, but no source connects both cleanly. It is language-sensitive when English and Swahili prompts retrieve different traces or change whether the answer names an entity at all. The canon uses these labels because they prevent a false binary. The business is not simply present or absent.
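The labels can be expressed as a small decision rule, sketched here under stated assumptions. "weak" and "mixed" follow the definitions above; "connected" and "unlabelled" are placeholder names added so the function always returns something, and the three boolean fields on `CitedPage` are a simplification of what a reviewer would record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CitedPage:
    url: str
    names_business: bool
    supports_location: bool
    supports_service: bool


def label_source_path(pages):
    """Label one answer's cited pages; one possible reading of the definitions."""
    if not any(p.names_business for p in pages):
        return "weak"        # category support only, nothing names the business
    if any(p.names_business and p.supports_location and p.supports_service
           for p in pages):
        return "connected"   # assumed label: one page ties name, place and service
    if any(p.supports_location for p in pages) and any(p.supports_service for p in pages):
        return "mixed"       # location and service are backed by different pages
    return "unlabelled"      # assumed fallback, not a label from the canon


def is_language_sensitive(pages_by_language, names_entity_by_language):
    """True when the cited traces or the naming behaviour differ across languages."""
    url_sets = {frozenset(p.url for p in pages) for pages in pages_by_language.values()}
    return len(url_sets) > 1 or len(set(names_entity_by_language.values())) > 1


# Hypothetical pages with placeholder URLs.
listing = CitedPage("https://maps.example/listing", True, True, False)
post = CitedPage("https://social.example/post", False, False, True)
directory = CitedPage("https://directory.example/metalwork", False, False, False)

print(label_source_path([listing, post]))
print(label_source_path([directory]))
print(is_language_sensitive({"en": [listing, post], "sw": [listing]},
                            {"en": True, "sw": True}))
```

Language sensitivity is kept as a separate check because it compares runs with each other, while weak and mixed describe a single answer.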
The lab’s working definition is deliberately plain: informal AI visibility is the condition where an answer engine can connect a non-formalized business to citeable public traces because name, place and activity cues line up. If one of those cues breaks, the answer may still speak, but it starts to lean on inference.
This is the small crack the lab watches. A model may infer that a workshop offering welding also makes gates. That may be true. It may also be a guess drawn from category knowledge. Without a cited page that backs the specific service claim, the answer has moved from evidence to plausible fill-in.
What a repeatable run can show
A repeatable run cannot measure the whole informal economy. Kijito Citation Lab does not pretend otherwise. It can, however, show whether the same query structure tends to produce entity names, category-only answers, platform proxies or unsupported echoes under comparable conditions.
For work-item 10, the useful run is not “find all jua kali businesses.” That is too broad and would encourage false precision. The better run asks a narrower question: when a prompt describes a category, location and service need, does the answer engine surface individual informal businesses, and what source role supports each claim? The team can run the same structure across Nairobi, Mombasa, Kisumu, Nakuru or county-level phrasing, then compare the source paths without turning the outcome into a percentage.
The date of each run matters because answer engines change. The model surface matters because different answer engines expose citations differently. The language matters because Swahili wording may shift the category, local intent or evidence path. A prompt asking for “jua kali welders” may not behave the same way as a prompt using a more local service phrase. The lab records the difference rather than smoothing it away.
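A run log that keeps those three conditions attached to every outcome might look like the sketch below. The `RunKey` fields, the surface name "engine-a" and the date are placeholders, not recorded lab data; the outcome labels are the four kinds of result named earlier in this section.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RunKey:
    run_date: date
    surface: str    # which answer engine surface was queried
    language: str   # language tag of the prompt wording
    location: str


def outcomes_by_location(log):
    """Group outcomes per location, keeping date, surface and language visible."""
    grouped = defaultdict(list)
    for key, outcome in log:
        grouped[key.location].append(
            (key.run_date.isoformat(), key.surface, key.language, outcome)
        )
    return {location: sorted(rows) for location, rows in grouped.items()}


# Placeholder entries only; the date and surface name are not real observations.
log = [
    (RunKey(date(2000, 1, 1), "engine-a", "sw", "Nairobi"), "entity name"),
    (RunKey(date(2000, 1, 1), "engine-a", "en", "Nairobi"), "category only"),
    (RunKey(date(2000, 1, 1), "engine-a", "en", "Kisumu"), "platform proxy"),
]
result = outcomes_by_location(log)
for location, rows in result.items():
    print(location, rows)
```

The rows stay as labelled observations side by side. Nothing is averaged or turned into a rate, so a difference between the English and Swahili-tagged run remains visible as a difference.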
The team also watches what the answer refuses to do. A model may avoid naming specific operators if sources are thin. That restraint can be good. A category answer with no invented names is safer than a confident list built on mismatched pages. The lab does not score silence as failure automatically. Sometimes silence is the cleaner result.
Still, there is a practical implication for SMEs and trade bodies. If informal operators want to appear as citeable entities, their public traces need to line up. A stable name, location, service description and locally controlled page can reduce the engine’s need to borrow from a proxy. That is a research interpretation, not a promise of appearance. The engine may still ignore the evidence.
Limits of the jua kali evidence trail
This material cannot say how many jua kali businesses appear in AI answers across Kenya. The lab is not running a national census, and the canon rejects invented precise counts. The question here is structural: what happens to an informal business when the answer engine needs evidence it can retrieve and cite?
There are also limits in the word “jua kali” itself. It can name a sector, a style of work, a place cluster, a business identity or a social category depending on context. A prompt may pull the model toward informal manufacturing, repairs, craft, roadside services or general small enterprise. Swahili and English phrasing can alter that intent. The lab marks those cases as language-sensitive when the wording changes the source path.
Citations can be partial. A map listing may support existence and location but not claims about service quality. A social post may show a product but not prove the business is currently active. A directory page may carry a name but not distinguish one operator from another. A local story about a cluster may explain the setting without validating a named recommendation.
The fairest current reading is modest: jua kali businesses can appear in AI answers when their public traces are stable enough to connect. Where the traces are scattered, the answer often shifts upward to category description or sideways to proxy evidence. That is not a verdict on the business. It is a view of what the machine can cite.