If you build a beautiful AI system that only works on a high-speed fibre connection and a big GPU, you haven’t built for Kenyan primary care.
Designing Afya-Yangu AI means designing for the edge: low-power devices, intermittent connectivity, and busy clinics where every second counts.
The Constraints We Have to Respect
In many level 2 and 3 facilities, the technical reality looks like this:
- Power cuts are common; backup options are limited.
- Connectivity is via 3G/4G bundles or shared Wi-Fi, not dedicated fibre.
- Available hardware may be:
  - A modest desktop with 4–8 GB RAM
  - A low-end server donated years ago
  - A rugged tablet shared across several rooms
If our AI assistant depends on a large cloud model and stable internet, clinicians will use it once, get frustrated, and never open it again.
So we made two key design decisions:
- Use a small but capable model (MedGemma-based SLM).
- Perform retrieval and inference locally using FAISS and on-device compute (sketched just below).
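Here is roughly what that loop looks like in code. This is a minimal sketch, not the project's implementation: `all-MiniLM-L6-v2` is a stand-in for whichever embedding model actually ships, and `generate()` is a placeholder for the on-device MedGemma-based SLM call.

```python
# Minimal retrieve-then-generate sketch. The embedder is a stand-in and
# generate() is a stub, not the project's actual MedGemma interface.
import faiss
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small, CPU-friendly, 384-dim

def generate(prompt: str) -> str:
    # Placeholder for the on-device MedGemma-based SLM call.
    return "[model answer would go here]"

# Embed the guideline chunks once and index them with FAISS.
chunks = [
    "Guideline excerpt: first-line management of ...",
    "Guideline excerpt: referral criteria for ...",
]
vectors = embedder.encode(chunks, convert_to_numpy=True).astype("float32")
faiss.normalize_L2(vectors)                  # cosine similarity via inner product
index = faiss.IndexFlatIP(vectors.shape[1])
index.add(vectors)

def answer(question: str, k: int = 4) -> str:
    """Retrieve a handful of relevant chunks, then ask the small model."""
    q = embedder.encode([question], convert_to_numpy=True).astype("float32")
    faiss.normalize_L2(q)
    _, ids = index.search(q, min(k, len(chunks)))
    context = "\n\n".join(chunks[i] for i in ids[0])
    return generate(f"Guideline excerpts:\n{context}\n\nQuestion: {question}")
```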
Why Small Models Matter
Large language models are powerful, but they come with trade-offs:
- They need more compute → slower responses on small machines.
- They’re harder to run offline → more dependence on cloud.
A small model, carefully chosen and fine-tuned, gives us:
- Lower latency – answers in seconds, not minutes.
- Feasibility on local hardware – no expensive GPUs required.
- Better control – easier to package, ship, and update.
We’re not chasing flashy benchmarks. We’re optimising for “Does this work reliably in a busy clinic on Tuesday morning?”
FAISS for Fast Local Search
FAISS helps us store our guideline knowledge base in a way that’s:
- Compact enough for local disks.
- Fast enough for real-time search.
Because we only retrieve a handful of relevant chunks for each query, we keep memory and compute usage low—which is exactly what we need on the edge.
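To make “compact enough for local disks” concrete: a flat FAISS index stores the raw vectors and nothing else, so its footprint is easy to predict. A sketch with random stand-in vectors (the file name and corpus size are illustrative):

```python
# Back-of-envelope footprint check with stand-in vectors.
import faiss
import numpy as np

dim = 384
vectors = np.random.rand(10_000, dim).astype("float32")  # ~10k guideline chunks

index = faiss.IndexFlatIP(dim)  # exact search, no training step
index.add(vectors)

# Raw vectors dominate the file: 10_000 * 384 * 4 bytes ≈ 15 MB,
# comfortably within the disk budget of a modest clinic machine.
faiss.write_index(index, "guidelines.faiss")

# At startup, load once and keep in memory; each query then costs a
# single scan over the index, which is fast at this scale.
index = faiss.read_index("guidelines.faiss")
query = np.random.rand(1, dim).astype("float32")
scores, ids = index.search(query, 5)  # top-5 nearest chunks
```

At a few tens of thousands of chunks, exact flat search is typically fast enough that approximate indexes (IVF, HNSW) and their tuning overhead can wait until the corpus genuinely outgrows this.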
Trade-offs: Latency vs Accuracy, Size vs Coverage
Every design choice involves compromise. Some of the trade-offs we’re navigating:
- Smaller vs larger model:
  - Smaller → faster, cheaper, easier to deploy.
  - Larger → potentially more nuanced language understanding.
  Our bias is towards “small enough to run everywhere, good enough to be safe and useful.”
- Latency vs complexity:
  - More retrieval steps and checks could improve answer quality.
  - But each extra step adds time.
  We aim for answers in <5 seconds under normal loads (see the timing sketch after this list).
- On-device vs cloud hybrid:
  - Full offline mode is essential for many sites.
  - But when connectivity exists, we might allow optional cloud enhancements (e.g. syncing logs, model updates).
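The timing sketch mentioned above: one way to keep the latency target honest is to measure it on every call. The decorator, logger name, and threshold below are our own illustration, not a shipped feature.

```python
# Illustrative latency guard: time the answer path and flag responses
# that exceed the 5-second budget, so regressions show up in the logs.
import logging
import time
from functools import wraps

LATENCY_BUDGET_S = 5.0
log = logging.getLogger("afya.latency")

def within_budget(fn):
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        elapsed = time.perf_counter() - start
        if elapsed > LATENCY_BUDGET_S:
            log.warning("%s took %.1fs (budget %.1fs)",
                        fn.__name__, elapsed, LATENCY_BUDGET_S)
        return result
    return wrapper

@within_budget
def answer(question: str) -> str:
    return "stubbed answer"  # retrieval + generation as sketched earlier
```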
Possible Edge Architectures
We’re exploring a few deployment patterns:
- Local Server in the Facility
  - A small box in the records or IT room.
  - Multiple devices on the local network can connect to it via a web interface.
- Rugged Tablet or “Clinic Box”
  - An all-in-one device with the model, FAISS index, and UI.
  - Ideal for facilities without any other computer infrastructure.
- Hybrid Mode
  - Primary inference on-device.
  - Occasional sync with the cloud for updates, analytics, and backup.
The goal is to avoid a brittle system that dies when the internet drops. Afya-Yangu AI should feel like part of the clinic, not a remote service.
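One way to keep all three patterns in a single codebase is to treat the deployment pattern as configuration. The schema below is hypothetical, meant only to show the shape of the idea; none of these field names are finalised.

```python
# Hypothetical deployment configuration covering all three patterns.
from dataclasses import dataclass
from enum import Enum

class Mode(Enum):
    LOCAL_SERVER = "local_server"  # shared box on the clinic LAN
    CLINIC_BOX = "clinic_box"      # all-in-one tablet or device
    HYBRID = "hybrid"              # on-device inference, opportunistic sync

@dataclass
class EdgeConfig:
    mode: Mode
    model_path: str                 # on-disk SLM weights
    index_path: str                 # FAISS index location
    sync_when_online: bool = False  # only meaningful in HYBRID mode

# Example: a facility with one shared server and no reliable internet.
config = EdgeConfig(
    mode=Mode.LOCAL_SERVER,
    model_path="/opt/afya/models/medgemma-slm",
    index_path="/opt/afya/index/guidelines.faiss",
)
```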
Keeping the System Up-to-Date
Offline doesn’t mean frozen.
We’re designing an update pathway where:
- New or revised guidelines are packaged into update bundles.
- These can be:
  - Downloaded when connectivity is available, or
  - Physically distributed on USB drives, if needed.
- The local system:
  - Updates the guideline corpus.
  - Rebuilds the FAISS index.
  - Logs what changed, so we can trace behaviour.
That way, clinicians keep the benefits of offline reliability while staying aligned with evolving national guidance.
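A sketch of what applying one of these bundles could look like, assuming a bundle is simply a directory of revised guideline files plus a JSON manifest. The layout, file names, and `rebuild_index` hook are all illustrative assumptions.

```python
# Illustrative update step: copy revised guidelines in, log what changed,
# then rebuild the index. Bundle layout is an assumption, not a spec.
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path

def apply_bundle(bundle_dir: Path, corpus_dir: Path, log_path: Path) -> None:
    """Apply an update bundle to the local guideline corpus."""
    manifest = json.loads((bundle_dir / "manifest.json").read_text())
    for name in manifest["files"]:
        shutil.copy2(bundle_dir / name, corpus_dir / name)
    entry = {
        "applied_at": datetime.now(timezone.utc).isoformat(),
        "bundle_version": manifest.get("version"),
        "files": manifest["files"],
    }
    with log_path.open("a") as log:  # append-only change log for traceability
        log.write(json.dumps(entry) + "\n")
    # rebuild_index(corpus_dir)  # re-embed the corpus and rewrite the FAISS index
```

Because bundles are plain files, the same path works whether they arrive over a 4G connection or on a USB stick.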
Afya-Yangu AI at the edge is a work in progress—but the principle is clear:
If it can’t run where patients are seen, it doesn’t count as “real” clinical AI.

