Author: Salton Massally, CTO, iDT Labs
West Africans generally regard open-source software, which is freely available to use and modify, as inferior to commercial alternatives. This is a consequence of both a lack of understanding of the open-source model of software development and a general consensus that whatever is free is usually of little or no value, as is the case with most commodities in West Africa. Compared to ordinary users, the governments in the region are even more distrustful of open-source software, since they erroneously believe that the closed nature of commercial software offers them increased security.
While we were planning the digitization of hazard payments to Ebola Response Workers (ERWs) in Sierra Leone, a project funded by UNDP, some of the major criteria set for us were as follows:
- The solution had to be relatively inexpensive
- There was great emphasis on quick development of the system; we had two weeks to plan, develop, and deploy the system
- The solution had to be stable considering the consequences of major downtime or data corruption
With these criteria in mind, we set out to evaluate open-source projects that would not only help us meet them, but would also be adaptable enough for our functional requirements.
Realizing that a ready-made, holistic solution meeting all of our specifications was improbable, we divided our scope into the following components and evaluated various options for each:
- A core database to house ERW records, generate pay lists, and record payment history and payment issues
- An SMS interface through which alerts and status updates could be sent to ERWs and queries received and processed from them.
- Deduplication of ERW records based on their profile
- Deduplication of ERW records based on facial recognition
- A process for field-based data collection
Odoo, formerly known as OpenERP, is a suite of open-source enterprise management applications that easily emerged as the best fit for our core database requirements. Its major attraction was the ease with which one could develop addons, while its large user base and browser-based user interface made it even more appealing. With Odoo and its PostgreSQL database engine forming the core of our solution, it was relatively simple to develop our extensions, and its web-service architecture made integration with external components via its excellent XML-RPC interface fairly straightforward. Another advantage was that Odoo already had a human resources addon whose design was close to what we wanted.
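To give a feel for why the XML-RPC interface makes integration straightforward, here is a minimal sketch of an external component querying worker records. It is an illustration under assumptions rather than the code used in the project: the server URL, database name, and credentials are placeholders, and it assumes ERW records live in the human resources addon's `hr.employee` model with the `name` and `mobile_phone` fields.

```python
import xmlrpc.client


def connect(url, db, username, password):
    """Authenticate against Odoo's external API and return (uid, models proxy)."""
    common = xmlrpc.client.ServerProxy(f"{url}/xmlrpc/2/common")
    uid = common.authenticate(db, username, password, {})
    if not uid:
        raise PermissionError("Odoo rejected the credentials")
    models = xmlrpc.client.ServerProxy(f"{url}/xmlrpc/2/object")
    return uid, models


def find_workers(models, db, uid, password, name_fragment, limit=10):
    """Return worker records whose name contains name_fragment.

    Assumption: workers are stored in the HR addon's hr.employee model.
    """
    domain = [["name", "ilike", name_fragment]]
    return models.execute_kw(
        db, uid, password,
        "hr.employee", "search_read",
        [domain],
        {"fields": ["name", "mobile_phone"], "limit": limit},
    )


# Hypothetical usage (placeholder server and credentials):
# uid, models = connect("https://odoo.example.org", "erw_db", "admin", "secret")
# for worker in find_workers(models, "erw_db", uid, "secret", "Kamara"):
#     print(worker["name"], worker["mobile_phone"])
```

Because every model is reachable through the same `execute_kw` call, an SMS handler or a deduplication job can read and write records without touching the database directly.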
For SMS communications with the system, we decided on using the robust Kannel SMS gateway. We already had extensive experience with Kannel and so didn’t experience the configuration nightmare that first-time users typically go through.
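For readers unfamiliar with Kannel, applications usually hand it outgoing messages through its HTTP `sendsms` interface. The sketch below shows roughly what sending a status update could look like. The host, port, account name, password, and phone number are all placeholders, and the port 13013 used in the usage comment is only the value commonly seen in sample configurations, not necessarily what any given deployment uses.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_sendsms_url(host, port, username, password, to, text):
    """Build a request URL for Kannel's HTTP sendsms interface."""
    query = urlencode({
        "username": username,  # send-SMS user defined in the Kannel config
        "password": password,
        "to": to,              # recipient number
        "text": text,          # message body, URL-encoded by urlencode
    })
    return f"http://{host}:{port}/cgi-bin/sendsms?{query}"


def send_sms(url, timeout=10):
    """Submit the message and return Kannel's HTTP status and reply text."""
    with urlopen(url, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", "replace")


# Hypothetical usage (placeholder gateway and credentials):
# url = build_sendsms_url("localhost", 13013, "tester", "secret",
#                         "+23276000000", "Your hazard payment is ready")
# status, reply = send_sms(url)
```

Keeping URL construction separate from the network call makes the message formatting easy to check without a running gateway.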
Data deduplication was to be an integral part of the system. With tens of thousands of records supplied to us via Excel spreadsheets, we had to plan for a considerable amount of duplication. Deduplicating data was particularly tricky because of the unlimited ways a person's data can be represented. No convention existed for recording ERW data, and so important fields such as names, telephone numbers, and addresses that we could have used to deduplicate our dataset were absent. Factor in the possibility of spelling mistakes and largely incomplete data, and the magnitude of the deduplication nightmare increased. For this problem of extracting, matching, and resolving entities, we had to turn to machine learning, natural language processing, and statistical techniques. This ruled out using the power of PostgreSQL alone, as relational databases are not meant to handle complex entity resolution. Instead, we turned to yet another open-source project, Dedupe.
Initially we flirted with using Elasticsearch, an open-source search engine. However, we quickly realized that this solution would take much more than the two weeks we had, so we settled on Dedupe, a Python library that uses machine learning to quickly perform deduplication and entity resolution on structured data. After adding some custom code, we got it firing exactly the way we wanted it to.
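To illustrate why exact matching is not enough for this kind of data, the sketch below flags likely duplicates with plain string similarity from Python's standard library. It is a deliberately simplified stand-in, not Dedupe and not the custom code mentioned above: the field names, the sample records, and the 0.85 threshold are invented for illustration, and the brute-force pairwise comparison would be too slow for tens of thousands of records, which is one reason a library with blocking and learned field weights is preferable.

```python
from difflib import SequenceMatcher
from itertools import combinations

FIELDS = ("name", "phone")  # hypothetical field names


def normalise(field, value):
    """Lower-case and tidy a value; keep only digits for phone numbers."""
    value = (value or "").strip().lower()
    if field == "phone":
        return "".join(ch for ch in value if ch.isdigit())
    return " ".join(value.split())


def similarity(rec_a, rec_b):
    """Average string similarity over the fields present in both records."""
    scores = []
    for field in FIELDS:
        a = normalise(field, rec_a.get(field))
        b = normalise(field, rec_b.get(field))
        if a and b:  # skip fields missing from either record
            scores.append(SequenceMatcher(None, a, b).ratio())
    return sum(scores) / len(scores) if scores else 0.0


def candidate_duplicates(records, threshold=0.85):
    """Compare every pair of records and return those scoring above threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((i, j, round(score, 3)))
    return pairs


# Invented sample data: the first two rows describe the same person.
records = [
    {"name": "Mohamed Kamara", "phone": "076-123456"},
    {"name": "mohammed  kamara", "phone": "076 123456"},
    {"name": "Fatmata Sesay", "phone": ""},
]
duplicates = candidate_duplicates(records)
```

Here the spelling variation and the differently formatted phone number would defeat an exact SQL comparison, while a similarity score still pairs the first two records for review.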
Deduplication using the passport pictures of ERWs, when available, was done with the help of OpenBR and OpenCV, a facial recognition library and computer vision project respectively.
A requirement of the project was the ability for data to be collected in the field using smartphones. For this we leveraged the excellent Open Data Kit project (ODK), a suite of tools that allows data collection using mobile devices and data submission to an online server, even without an internet connection or mobile carrier service at the time of data collection.
Adopting open-source projects in our solution ensured that we finished well within our rather stringent timeframe and budget, putting together a robust solution that would have taken us years to develop from scratch. The work of thousands of excellent programmers in the open-source space ensured that Sierra Leone effectively and efficiently solved what was the most complex and volatile aspect of the Ebola response: the distribution of hazard payments. The only component we had to pay for directly was the SSL certificate used to secure communication with the server.
Seeing how successfully open-source software was deployed in our context, we hope to see this ideology embraced by West Africans. With billions of dollars being wasted by both governments and the international development community operating in our sub-region on low-quality, non-functional software that is shelved after a few months due to poor quality or low user adoption, it is imperative that we turn to the open-source landscape for our software requirements. The billions being poured into commercial software licensing can easily be redirected to more pressing needs, such as poverty alleviation and HIV and malaria prevention, to name a few.