The project started with data sets of patient and health records spanning three years from 8 NHS institutions. Medical staff can upload additional updates through a secure portal, or updates can be retrieved automatically via secure batch transfers. The platform caters for a variety of institutions and scenarios, and new data import methods or institutions can be added quickly.
One of the main challenges is how to combine data from a number of sources, often involving manual entry, into one coherent dataset without ‘manipulating’ test data. Data transformations and quality checks structure the incoming data, which is then stored in a single repository. Validation rules are applied in a way that allows for fault tolerance, privacy settings are incorporated, and a logical, canonical model is implemented for retrieval, statistical analysis or, in future, intelligent applications such as treatment prediction.
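To make the transformation and validation steps more concrete, the following is a minimal Python sketch of how records from differently structured sources might be mapped onto one canonical shape. Everything in it is an assumption made for illustration: the `CanonicalRecord` fields, the `FIELD_MAPS` for two made-up institutions, the date format and the two validation rules are hypothetical and do not describe the platform's actual model. It shows one possible reading of fault tolerance, in which a row that fails parsing or validation is set aside with its reasons instead of stopping the whole import, and the measured value is parsed but never adjusted.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class CanonicalRecord:
    """Hypothetical canonical shape; the real model is not described in the text."""
    patient_id: str
    institution: str
    test_name: str
    value: float
    taken_on: date


# Hypothetical per-institution mappings: source column -> canonical field.
FIELD_MAPS = {
    "hospital_a": {"pid": "patient_id", "test": "test_name",
                   "result": "value", "date": "taken_on"},
    "hospital_b": {"PatientNo": "patient_id", "TestCode": "test_name",
                   "Val": "value", "Taken": "taken_on"},
}


def to_canonical(institution: str, raw: dict) -> CanonicalRecord:
    """Rename fields and parse types; the measured value is not altered."""
    mapping = FIELD_MAPS[institution]
    fields = {canonical: raw[source] for source, canonical in mapping.items()}
    return CanonicalRecord(
        patient_id=str(fields["patient_id"]).strip(),
        institution=institution,
        test_name=str(fields["test_name"]).strip().lower(),
        value=float(fields["value"]),
        # Assumes ISO dates (YYYY-MM-DD) purely for this example.
        taken_on=datetime.strptime(fields["taken_on"], "%Y-%m-%d").date(),
    )


def validate(record: CanonicalRecord) -> list[str]:
    """Return a list of rule violations; an empty list means the record passes."""
    errors = []
    if not record.patient_id:
        errors.append("missing patient_id")
    if record.taken_on > date.today():
        errors.append("test date is in the future")
    return errors


def ingest(institution: str, rows: list[dict]):
    """Split rows into accepted canonical records and rejected raw rows."""
    accepted, rejected = [], []
    for raw in rows:
        try:
            record = to_canonical(institution, raw)
            errors = validate(record)
        except (KeyError, ValueError, TypeError) as exc:
            # A bad row is recorded with its reason; the batch continues.
            errors = [f"could not parse: {exc!r}"]
        if errors:
            rejected.append((raw, errors))
        else:
            accepted.append(record)
    return accepted, rejected
```

In this sketch, calling `ingest("hospital_a", rows)` returns the accepted canonical records alongside the untouched raw rows that failed, so rejected entries could be reviewed by staff instead of being silently corrected. Adding a new institution would, under these assumptions, amount to adding one more entry to `FIELD_MAPS`.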